Closed
Bug 645704
Opened 14 years ago
Closed 8 years ago
rein in memory usage on crash_analysis scripts
Categories
(Socorro :: General, task)
Tracking
(Not tracked)
RESOLVED
INVALID
People
(Reporter: rhelmer, Unassigned)
The first casualty from Fx4 being set full-throttle (643661) was bug 645530; the crash analysis scripts seem to be using up all available memory and sending sp-admin01 into swap.
sp-admin01 has 8GB of RAM, so as a temporary measure jabba moved the cron job (cron_libraries.sh) to sp-processor10, which has 24GB of RAM, and the job seems to complete there.
I took 2-second samples from ps (SIZE) while this was running. Here is the peak per-process memory usage during that period:
pid    size(kb)  cmd
16614  5824204   python /data/crash-data-tools/per-crash-interesting-modules.py -p Firefox -r 4.0 -f /tmp/Firefox_4.0.tar
5912   1288136   python /data/socorro/application/socorro/storage/hbaseClient.py -h socorro-thrift1.zlb.phx1.mozilla.com export_jsonz_tarball_for_ooids /tmp /tmp/Firefox_4.0.tar
We should determine if using this much memory is necessary.
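For anyone wanting to repeat the measurement, here is a minimal sketch of how peak per-process SIZE could be derived from periodic ps samples. It is not the procedure actually used above. The function names (`parse_sample`, `peak_sizes`, `collect`) are hypothetical, and `collect` assumes a Linux procps `ps` that supports the `size` output column.

```python
import subprocess
import time


def parse_sample(text):
    """Parse one ps sample where each line is 'pid size_kb command...'."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        # Skip header lines and anything that does not start with two integers.
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue
        rows.append((int(parts[0]), int(parts[1]), parts[2].strip()))
    return rows


def peak_sizes(samples):
    """Return {pid: (peak_size_kb, cmd)} across a sequence of ps samples."""
    peaks = {}
    for text in samples:
        for pid, size_kb, cmd in parse_sample(text):
            if pid not in peaks or size_kb > peaks[pid][0]:
                peaks[pid] = (size_kb, cmd)
    return peaks


def collect(duration_s, interval_s=2.0):
    """Hypothetical collector: assumes Linux procps ps with the 'size' column."""
    samples = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        out = subprocess.run(
            ["ps", "-eo", "pid=,size=,args="],
            capture_output=True, text=True, check=True,
        ).stdout
        samples.append(out)
        time.sleep(interval_s)
    return samples
```

Calling `peak_sizes(collect(600))` while the cron job runs would give a table in the same shape as the one above, sorted however you like.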
Comment 1•14 years ago
the other solution to the problem is just to sample a subset of the data for any given release.
it seems like the script ran fine up to the point where we had about 11 million active daily users (ADUs) on Firefox 4.0 reporting crashes.
if we get a release with more ADUs than that,
or the volume of any one crash gets out of control,
or the number of modules that we are tracking increases,
or the number of versions of those modules increases,
any of these could lead to high memory use.
something to think about for all reports, as we move to processing 100% of report submissions, would be to reduce the window of data we look at (24 hours in this case), or to sample within that window.
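One way the sampling idea could look, as a rough sketch only: cap the number of crashes kept per signature using reservoir sampling, so a single runaway signature cannot dominate memory. The function name `sample_per_signature`, the `cap` parameter, and the `(signature, crash)` input shape are assumptions for illustration, not anything the existing scripts provide.

```python
import random


def sample_per_signature(crashes, cap, seed=None):
    """Keep at most `cap` crashes per signature (reservoir sampling, Algorithm R).

    `crashes` is any iterable of (signature, crash) pairs, so it can be a
    generator and the full data set never has to sit in memory.
    """
    rng = random.Random(seed)
    reservoirs = {}  # signature -> list of kept crashes
    seen = {}        # signature -> total crashes seen for that signature
    for signature, crash in crashes:
        n = seen.get(signature, 0) + 1
        seen[signature] = n
        kept = reservoirs.setdefault(signature, [])
        if len(kept) < cap:
            kept.append(crash)
        else:
            # Replace a random slot with probability cap / n.
            j = rng.randrange(n)
            if j < cap:
                kept[j] = crash
    return reservoirs, seen
```

Memory is then bounded by `cap` times the number of distinct signatures, and the `seen` counts let a report scale sampled results back up to the true volume.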
Comment 2•14 years ago
the 11 million unthrottled ADUs is under what we might expect in crash volume from 150 million users throttled at 10% (roughly 15 million users' worth of reports).
it's probably something more like trying to process the module correlations for the top crash on 4.0:
25680 crashes per day
signature: mozalloc_abort(char const* const) | NS_DebugBreak_P | nsCycleCollectingAutoRefCnt::decr(nsISupports*)
the bug tracking that signature is bug 633445
25k crashes per day probably exceeds anything we have ever seen for a single signature by a wide margin.
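To make the cost concrete, here is a hedged sketch of a module correlation computed in a single streaming pass. How per-crash-interesting-modules.py actually works is not described in this bug, so this is an illustration under assumptions: the `(signature, modules)` record shape and the name `module_correlations` are hypothetical.

```python
from collections import Counter


def module_correlations(records, signature):
    """Compare how often each module appears in one signature vs. all crashes.

    `records` is an iterable of (signature, modules) pairs. Only counters are
    kept, so memory grows with the number of distinct modules rather than
    with the number of crashes.
    """
    total = 0
    sig_total = 0
    all_counts = Counter()
    sig_counts = Counter()
    for sig, modules in records:
        unique = set(modules)  # count each module once per crash
        total += 1
        all_counts.update(unique)
        if sig == signature:
            sig_total += 1
            sig_counts.update(unique)
    results = []
    for module, n in sig_counts.items():
        in_sig = n / sig_total
        overall = all_counts[module] / total
        results.append((module, in_sig, overall))
    # Modules most over-represented in the signature come first.
    results.sort(key=lambda r: r[1] - r[2], reverse=True)
    return results
```

With counters like these, a 25k-per-day signature costs about the same memory as a 25-per-day one. Whether the real script could be restructured this way is an open question.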
Updated•13 years ago
Component: Socorro → General
Product: Webtools → Socorro
Comment 3•8 years ago
We don't support the correlation script on crash-analysis any more.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INVALID