Closed Bug 1297269 Opened 8 years ago Closed 8 years ago

[PulseGuardian] Polling queues for current bindings is very slow

Categories

(Webtools :: Pulse, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mcote, Assigned: camd)

Bug 1021495 added queue bindings to the information PulseGuardian tracks. However, it did so by making an API call for each queue on every call to monitor_queues(). With 7000+ queues on Pulse, this causes monitor_queues(), and hence an iteration of guard(), to take a very long time to complete. This means that queue sizes aren't updated very often, which in turn causes various problems, such as overgrowing queues being deleted without a warning message even if they are growing relatively slowly. Ideally we can get this information in a single not-too-slow API call per loop. If we can't, we need to rethink this feature's requirements and/or design. Cam, do you mind looking into this?
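A minimal sketch of the "single API call per loop" idea, not the fix that was eventually deployed. It assumes the broker's management API can return every binding in one listing (RabbitMQ's management plugin exposes `GET /api/bindings` for this) and that each record carries `vhost`, `source`, `destination`, `destination_type` and `routing_key` fields. The function name and sample data are hypothetical; the listing would be fetched once per loop and then grouped locally, so no per-queue request is needed.

```python
from collections import defaultdict


def group_bindings_by_queue(bindings):
    """Group binding records from one bulk listing by (vhost, queue name).

    `bindings` is a list of dicts shaped like the records a bulk
    bindings listing is assumed to return.
    """
    by_queue = defaultdict(list)
    for binding in bindings:
        # Exchange-to-exchange bindings are not attached to a queue.
        if binding.get("destination_type") != "queue":
            continue
        # An empty source is treated here as the implicit default-exchange
        # binding and skipped; that is a choice made for this sketch.
        if not binding.get("source"):
            continue
        key = (binding["vhost"], binding["destination"])
        by_queue[key].append((binding["source"], binding["routing_key"]))
    return dict(by_queue)


# Hypothetical sample data standing in for one bulk API response.
sample = [
    {"vhost": "/", "source": "", "destination": "queue/a/jobs",
     "destination_type": "queue", "routing_key": "queue/a/jobs"},
    {"vhost": "/", "source": "exchange/build", "destination": "queue/a/jobs",
     "destination_type": "queue", "routing_key": "#"},
    {"vhost": "/", "source": "exchange/build", "destination": "queue/b/logs",
     "destination_type": "queue", "routing_key": "build.*"},
    {"vhost": "/", "source": "exchange/build", "destination": "exchange/copy",
     "destination_type": "exchange", "routing_key": "#"},
]

grouped = group_bindings_by_queue(sample)
```

With 7000+ queues, this turns thousands of requests per call to monitor_queues() into one request plus an in-memory pass over the result.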
Sure, I can look into this. I'm pretty booked at the moment, though. What's the urgency on this? Is this situation happening quite a bit?
Per IRC:
<mcote> camd: it's kinda urgent. it has already caused a couple problems.
<mcote> camd: I can also back out your patch, but your stuff is probably relying on it now, right?
The fix was deployed yesterday: https://github.com/mozilla/pulseguardian/commit/c35979e13e70e6ccae4976a2afeaf987ce8491a0 Yesterday it seemed like the loop only went down to 20 minutes (still much faster than before, when it was 1-2 hours!), but now it appears to be chugging along at around 4 minutes. This is still longer than we originally intended, but it's much, much better. I'm guessing the remaining slowness is database-related, since there are very few API calls now. We could undoubtedly do better there; there are probably some no-op writes, and we could be batching calls into single transactions.
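To illustrate the batching idea mentioned above, here is a rough sketch using Python's standard-library `sqlite3` as a stand-in database. PulseGuardian's real storage layer is not shown in this bug, so the `queues` table, its columns, and the `update_queue_sizes` helper are all hypothetical. The point is only that every size update for one loop iteration goes through a single transaction instead of one commit per queue.

```python
import sqlite3


def update_queue_sizes(conn, sizes):
    """Write all queue sizes in one transaction.

    `sizes` maps queue name -> current size. The table and column
    names are assumptions made for this sketch.
    """
    # Using the connection as a context manager commits once on
    # success and rolls back if any statement raises.
    with conn:
        conn.executemany(
            "UPDATE queues SET size = ? WHERE name = ?",
            [(size, name) for name, size in sizes.items()],
        )


# Stand-in schema and data so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE queues (name TEXT PRIMARY KEY, size INTEGER)")
conn.executemany("INSERT INTO queues VALUES (?, ?)", [("q1", 0), ("q2", 0)])
conn.commit()

update_queue_sizes(conn, {"q1": 10, "q2": 25})
rows = conn.execute("SELECT name, size FROM queues ORDER BY name").fetchall()
```

Skipping rows whose size has not changed before building the batch would also cut out the no-op writes.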
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED