Closed Bug 1333206 Opened 8 years ago Closed 8 years ago

Deploy testpilot to parquet output on the DWL

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: trink, Assigned: whd)

References

Details

(Whiteboard: [SvcOps])

No description provided.
-- configuration file
filename = "s3_parquet.lua"
message_matcher = "Type == 'telemetry' && Fields[docType] == 'testpilot'"
ticker_interval = 60
preserve_data = false

parquet_schema = [=[
message testpilot {
    required binary id;
    optional binary clientId;
    required group metadata {
        required int64 Timestamp;
        required binary submissionDate;
        optional binary Date;
        optional binary normalizedChannel;
        optional binary geoCountry;
        optional binary geoCity;
    }
    optional group application {
        optional binary name;
    }
    optional group environment {
        optional group system {
            optional group os {
                optional binary name;
                optional binary version;
            }
        }
    }
    optional group payload {
        optional binary version;
        optional binary test;
        repeated group events {
            optional int64 timestamp;
            optional binary event;
            optional binary object;
        }
    }
}
]=]

metadata_group = "metadata"
json_objects = {"Fields[submission]", "Fields[environment.system]"}
s3_path_dimensions = {
    {name = "_submission_date", source = "Fields[submissionDate]"},
}
batch_dir = "/var/tmp/parquet"
max_writers = 100
max_rowgroup_size = 10000
max_file_size = 1024 * 1024 * 300
max_file_age = 60 * 60
hive_compatible = true
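As a rough sketch (not the actual plugin code), the `s3_path_dimensions` setting combined with `hive_compatible = true` would be expected to yield Hive-style `name=value` partition directories in the output path, keyed off the message fields it names:

```python
# Rough sketch, assuming the plugin builds Hive-style partition directories
# from s3_path_dimensions; names and behavior here are illustrative only.
def partition_path(base, dimensions, fields):
    """Join a base prefix with name=value partition segments."""
    parts = ["{}={}".format(d["name"], fields[d["source"]]) for d in dimensions]
    return "/".join([base] + parts)

dims = [{"name": "_submission_date", "source": "Fields[submissionDate]"}]
fields = {"Fields[submissionDate]": "20170124"}
print(partition_path("telemetry-testpilot-parquet", dims, fields))
# telemetry-testpilot-parquet/_submission_date=20170124
```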
Blocks: 1330047
Blocks: 1255755
No longer blocks: 1330047
I've updated the CEP with hsadmin 0.0.6 and the parquet modules, and configured it to write parquet test data out to s3://net-mozaws-prod-us-west-2-pipeline-analysis, e.g. s3://net-mozaws-prod-us-west-2-pipeline-analysis/whd/output.whd.s3_parquet.parquet. People should be able to test parquet generation on the prod CEP now.

I see occasional warnings such as:

output.whd.s3_parquet process_message returned: -1 column 'version' data type mismatch (integer)

when running the testpilot config above, which is probably an issue with the source data.

I patched https://github.com/mozilla-services/lua_sandbox_extensions/blob/master/parquet/sandboxes/heka/output/s3_parquet.lua#L163 to append ".done" so the s3 upload script works correctly.

Right now it's using the username as the first s3 namespace (instead of prepending something like "cep-output"), which seems fine to me, but we can add another prefix if desired.

It appears users can only run one parquet output at a time, so I switched my test parquet generation to the core-pings one from bug #1333203. If this looks good I'll deploy to the DWL.
Assignee: nobody → whd
Points: --- → 1
Priority: -- → P1
Whiteboard: [SvcOps]
https://github.com/mozilla-services/puppet-config/compare/acc75820a7abca88db844a11f055e9636342f8be...new_telemetry

I removed the s3 output from the CEP since it wasn't intended to function that way.

I've deployed this to the DWL, and parquet output is available at s3://net-mozaws-prod-us-west-2-pipeline-data/telemetry-testpilot-parquet

I've added an entry (telemetry-testpilot-parquet) to the metadata in s3://net-mozaws-prod-us-west-2-pipeline-metadata/sources.json but left the metadata_prefix empty. We should decide whether we want to add s3_path_dimensions and/or parquet_schema in some format to the metadata.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
While deploying bug #1339611 I noticed that roughly 90% of testpilot pings are failing parquet validation like so:

output.telemetry_testpilot_parquet_s3 process_message returned: -1 column 'event' data type mismatch (integer)

This data isn't being used yet, but we may need to update the schema, if possible. I'm not sure what we can do, though, as it appears the client is sending two different data types for the same field.
Examples of each:

"events":[{"timestamp":851080,"event":0,"object":"daily"}]
"events":[{"timestamp":225319,"event":"disabled","object":"blok@mozilla.org"}]
Status: RESOLVED → REOPENED
Flags: needinfo?(ssuh)
Resolution: FIXED → ---
This was a bug in testpilot -- I filed a bug and there's a fix in review already: https://github.com/mozilla/testpilot/issues/2206
Flags: needinfo?(ssuh)
Closed: the current schema is correct and already deployed.
Status: REOPENED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
-- latest configuration file
filename = "s3_parquet.lua"
message_matcher = "Type == 'telemetry' && Fields[docType] == 'testpilot'"
ticker_interval = 60
preserve_data = false

parquet_schema = [=[
message testpilot {
    required binary id (UTF8);
    optional binary clientId (UTF8);
    required group metadata {
        required int64 Timestamp;
        required binary submissionDate (UTF8);
        optional binary Date (UTF8);
        optional binary normalizedChannel (UTF8);
        optional binary geoCountry (UTF8);
        optional binary geoCity (UTF8);
    }
    optional group application {
        optional binary name (UTF8);
    }
    optional group environment {
        optional group system {
            optional group os {
                optional binary name (UTF8);
                optional binary version (UTF8);
            }
        }
    }
    optional group payload {
        optional binary version (UTF8);
        optional binary test (UTF8);
        repeated group events {
            optional int64 timestamp;
            optional binary event (UTF8);
            optional binary object (UTF8);
        }
    }
}
]=]

metadata_group = "metadata"
json_objects = {"Fields[submission]", "Fields[environment.system]"}
s3_path_dimensions = {
    {name = "_submission_date", source = "Fields[submissionDate]"},
}
batch_dir = "/var/tmp/parquet"
max_writers = 100
max_rowgroup_size = 10000
max_file_size = 1024 * 1024 * 300
max_file_age = 60 * 60
hive_compatible = true
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
https://github.com/mozilla-services/puppet-config/commit/d65bcc3419ed2184545a4da3d4d28b6231e42250

This is a schema change, so I don't know how things will be affected downstream (well, there are no downstream consumers, so that's not a problem). In the future we should probably come up with a versioning scheme.
Status: REOPENED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
The versioning of our other parquet files is s3://bucket/prefix/v<N>/<table_name>; can we replicate that for direct-to-parquet? Also, we will need to decide which schema changes are and are not backwards compatible.
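The convention above can be sketched as a small helper; the function and example values here are hypothetical, not part of the actual pipeline code:

```python
# Hypothetical helper illustrating the s3://bucket/prefix/v<N>/<table_name>
# versioning convention; bucket and table names below are examples only.
def versioned_prefix(bucket, prefix, version, table_name):
    """Build a versioned parquet output prefix."""
    return "s3://{}/{}/v{}/{}".format(bucket, prefix, version, table_name)

print(versioned_prefix(
    "net-mozaws-prod-us-west-2-pipeline-data",
    "telemetry-testpilot-parquet", 1, "testpilot"))
# s3://net-mozaws-prod-us-west-2-pipeline-data/telemetry-testpilot-parquet/v1/testpilot
```

Bumping <N> on every backwards-incompatible schema change would let readers keep consuming the old prefix until they migrate.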
Product: Cloud Services → Cloud Services Graveyard