Closed
Bug 1065567
Opened 10 years ago
Closed 7 years ago
Use Pulse for creation of hg resultsets
Categories
(Tree Management :: Treeherder: Data Ingestion, defect, P5)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: camd)
References
Details
Attachments
(2 files)
Broken out from bug 1048043. If/when bug 1022701 is fixed, we should see if it's viable to use Pulse to detect new pushes, rather than having to poll json-pushes. We may still want to poll json-pushes periodically to catch any missed Pulse notifications, but it would at least mean we can ingest pushes more quickly most of the time.
Reporter | ||
Updated•10 years ago
|
Component: Treeherder → Treeherder: Data Ingestion
Reporter | ||
Updated•10 years ago
|
Priority: P4 → P5
Comment hidden (typo) |
Comment hidden (typo) |
Assignee | ||
Comment 4•9 years ago
|
||
Ahh shoot, I didn't see this one when I searched. Thanks for clearing it up. :)
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → cdawson
Assignee | ||
Comment 5•8 years ago
|
||
Removing the dependency on the old bug because we don't need it fixed to move forward on this.
No longer depends on: 1022701
Assignee | ||
Comment 6•8 years ago
|
||
Here are the docs for the new Pulse exchange: https://mozilla-version-control-tools.readthedocs.io/en/latest/hgmo/notifications.html
Assignee | ||
Updated•8 years ago
|
Summary: Use pulse for quicker pushlog ingestion rather than polling json-pushes → Use Pulse for creation of hg resultsets
Comment 7•8 years ago
|
||
Assignee | ||
Comment 8•8 years ago
|
||
Comment on attachment 8861190 [details]
[treeherder] mozilla:hg-pulse-resultsets > mozilla:master
Hey Buddy-- I rebased this old branch, and it seems like time to enable this. After this is rolled out and working, I will submit a PR for bug 1359246 to remove the celery beat and ingestion mechanism for the old way.
Attachment #8861190 -
Flags: review?(emorley)
Assignee | ||
Comment 9•8 years ago
|
||
When deploying this, add this to PULSE_RESULTSET_SOURCES:
{ "exchange": "exchange/hgpushes/v1", "routing_keys": [ "#" ] }
After deploy, wait a few minutes to be 100% sure the celery beat and Pulse mechanisms are overlapping. Then turn off the celery beat pushlog mechanism by changing the dyno setting for ``worker_pushlog`` to 0 dynos.
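For anyone scripting the deploy step above, here is a minimal sketch of appending the new source to the existing value. It assumes PULSE_RESULTSET_SOURCES holds a JSON list of source objects; the helper name ``add_source`` and the example GitHub exchange name are hypothetical and not taken from the Treeherder code.

```python
import json

# The source from the comment above.
HG_PUSHES_SOURCE = {"exchange": "exchange/hgpushes/v1", "routing_keys": ["#"]}


def add_source(existing_value, new_source):
    """Return the env var value with new_source appended.

    Assumption: the variable is a JSON list of {"exchange", "routing_keys"}
    objects. An exchange that is already listed is not added twice.
    """
    sources = json.loads(existing_value) if existing_value else []
    if not any(s.get("exchange") == new_source["exchange"] for s in sources):
        sources.append(new_source)
    return json.dumps(sources)


# Hypothetical current value with a single (made-up) GitHub source.
current = '[{"exchange": "exchange/example-github/v1", "routing_keys": ["#"]}]'
updated = add_source(current, HG_PUSHES_SOURCE)
print(updated)
```

Running the helper a second time on its own output leaves the list unchanged, so the step is safe to repeat during a deploy.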
Reporter | ||
Comment 10•8 years ago
|
||
Comment on attachment 8861190 [details]
[treeherder] mozilla:hg-pulse-resultsets > mozilla:master
I've left some comments on the PR :-)
Attachment #8861190 -
Flags: review?(emorley)
Assignee | ||
Comment 11•8 years ago
|
||
Comment on attachment 8861190 [details]
[treeherder] mozilla:hg-pulse-resultsets > mozilla:master
Thanks for catching those items. Apologies that I didn't. I guess I rushed it.
Attachment #8861190 -
Flags: review?(emorley)
Reporter | ||
Comment 12•8 years ago
|
||
Comment on attachment 8861190 [details]
[treeherder] mozilla:hg-pulse-resultsets > mozilla:master
Awesome! :-)
Attachment #8861190 -
Flags: review?(emorley) → review+
Comment 13•8 years ago
|
||
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/23c40414c0c9dec6160628e3e577d076db642128
Bug 1065567 - Add ability to ingest Mercurial Pushes via Pulse (#2420)
This uses the same mechanism we use for ingesting GitHub pushes. This adds an additional Transformer for HG pushes, and requires adding the Pulse exchange of ``exchange/hgpushes/v1`` to the existing PULSE_RESULTSET_SOURCES environment variable.
Assignee | ||
Comment 14•8 years ago
|
||
The next PR will remove the proc and code for the old hg push ingestion mechanism and unify the Push ingestion for ``ingest_push`` and ``resultset_loader``.
Reporter | ||
Comment 15•8 years ago
|
||
Stage is showing some KeyError exceptions:
https://rpm.newrelic.com/accounts/677903/applications/14179733/filterable_errors#/show/c2ab2c9e-2f50-11e7-931c-0242ac110012_0_4526/stack_trace?top_facet=transactionUiName&primary_facet=error.class&barchart=barchart&_k=nu64vc
File "/app/treeherder/etl/tasks/pulse_tasks.py", line 30, in store_pulse_resultsets
File "/app/treeherder/etl/resultset_loader.py", line 35, in process
exceptions:KeyError: 'details'
I also got a `Queue total messages alarm: treeherder-stage pushlog`, which I'm guessing is due to turning down the pushlog dyno count to zero on stage. Before we do that we have to stop celery scheduling pushlog tasks, otherwise it causes queue alerts :-)
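One way to guard against this kind of failure is sketched below. The traceback only shows that some message lacked a ``details`` key; which messages and why is not stated here. The sketch therefore assumes the loader reads a top-level ``details`` key from the message body, and the names ``extract_details`` and ``process`` are hypothetical stand-ins, not the actual ``resultset_loader`` code.

```python
import logging

logger = logging.getLogger(__name__)


def extract_details(message_body):
    """Return the 'details' mapping, or None if the message has none."""
    details = message_body.get("details")
    if not isinstance(details, dict):
        # Log the keys that are present so the unexpected shape can be diagnosed.
        logger.warning(
            "Skipping Pulse message without 'details': keys=%s",
            sorted(message_body),
        )
        return None
    return details


def process(message_body, store):
    """Append the message details to store; return False if the message was skipped."""
    details = extract_details(message_body)
    if details is None:
        return False
    store.append(details)
    return True
```

Whether to skip or raise is a judgment call: skipping keeps the queue draining, but it only avoids data loss if something else (such as the pushlog polling) later picks up the skipped push.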
Comment 16•8 years ago
|
||
(In reply to Cameron Dawson [:camd] from comment #14)
> The next PR will remove the proc and code for the old hg push ingestion
> mechanism and unify the Push ingestion for ``ingest_push`` and
> ``resultset_loader``.
Pulse is not a reliable delivery channel. hg.mozilla.org has some robustness so it guarantees at least once delivery to Pulse. But Pulse itself can lose messages. So unless you are OK with data loss in the event that Pulse suffers a failure (which happened a few months ago), you should continue to periodically poll the pushlog to make sure you didn't miss any messages. However, the polling interval can be significantly reduced (to say every 5 minutes) because Pulse works 99.9+% of the time.
If you want to get out of the polling pushlog game completely, we could potentially look into a durable SNS topic or similar. The notification mechanism on hg.mozilla.org can ensure at least once delivery and it can write to pretty much anything (including directly to treeherder if we wanted to go that route).
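The "Pulse first, polling as a safety net" approach described above can be sketched roughly as follows. This is only an illustration under assumptions: ``PushStore``, ``on_pulse_message`` and ``reconcile`` are hypothetical names, the in-memory dict stands in for the database, and the polled data is assumed to be a mapping of push ID to push data. The key property is that storing a push is idempotent, so at-least-once delivery and overlapping polls do no harm.

```python
class PushStore:
    """In-memory stand-in for wherever pushes are persisted (hypothetical)."""

    def __init__(self):
        self._pushes = {}

    def store(self, push_id, push):
        """Store a push once; return True only if it was not already known."""
        if push_id in self._pushes:
            return False
        self._pushes[push_id] = push
        return True


def on_pulse_message(store, message):
    """Fast path: ingest a push as soon as its notification arrives."""
    return store.store(message["push_id"], message)


def reconcile(store, polled_pushes):
    """Slow path: ingest anything the periodic poll sees that Pulse missed.

    Returns the IDs of the pushes that had to be recovered.
    """
    return [
        push_id
        for push_id, push in sorted(polled_pushes.items())
        if store.store(push_id, push)
    ]


store = PushStore()
on_pulse_message(store, {"push_id": 1, "changesets": ["a"]})
on_pulse_message(store, {"push_id": 3, "changesets": ["c"]})  # push 2 was lost

polled = {
    1: {"changesets": ["a"]},
    2: {"changesets": ["b"]},
    3: {"changesets": ["c"]},
}
recovered = reconcile(store, polled)
print(recovered)
```

Counting the recovered IDs per poll would also give a rough measure of how often Pulse actually drops a message.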
Comment 17•7 years ago
|
||
Hey Cam, I was looking at this in the context of the missing resultsets yesterday. Should we resolve this bug? Per comment 16 it doesn't sound like we want to disable pushlog polling any time soon.
Flags: needinfo?(cdawson)
Comment 18•7 years ago
|
||
Assignee | ||
Comment 19•7 years ago
|
||
Comment on attachment 8891471 [details]
[treeherder] mozilla:pushlog-longer-interval > mozilla:master
Hey Will-- Since Ed is on PTO, would you be up for reviewing this micro-PR? :)
Flags: needinfo?(cdawson)
Attachment #8891471 -
Flags: review?(wlachance)
Updated•7 years ago
|
Attachment #8891471 -
Flags: review?(wlachance) → review+
Comment 20•7 years ago
|
||
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/12af9cbd01d73fb0cfa91b2e84b172729e4ed829
Bug 1065567 - Decrease pushlog interval to 5 minutes (#2667)
Now that we are also ingesting HG pushes via Pulse, this celery beat fetch interval is just a failsafe. So we can decrease the interval from every minute to every 5 minutes.
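For reference, a change like this usually amounts to one value in the celery beat schedule. The sketch below is not the actual Treeherder settings: the entry name, the task name ``fetch-push-logs`` and the ``pushlog`` queue are assumptions for illustration, and ``polls_per_day`` is a hypothetical helper that only shows the effect on request volume.

```python
from datetime import timedelta

# Assumed shape of a celery beat schedule entry; names are illustrative only.
CELERYBEAT_SCHEDULE = {
    "fetch-push-logs-every-5-minutes": {
        "task": "fetch-push-logs",
        "schedule": timedelta(minutes=5),  # was timedelta(minutes=1)
        "options": {"queue": "pushlog"},
    },
}


def polls_per_day(interval):
    """Number of scheduled pushlog polls in 24 hours for a given interval."""
    return timedelta(days=1) // interval


before = polls_per_day(timedelta(minutes=1))
after = polls_per_day(CELERYBEAT_SCHEDULE["fetch-push-logs-every-5-minutes"]["schedule"])
print(before, after)
```

The trade-off is that a push whose Pulse notification is lost can now take up to five minutes to appear instead of one.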
Assignee | ||
Comment 21•7 years ago
|
||
I think we're done here. We've kept the json-pushes polling, but now it's just a backup to the Pulse ingestion.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED