Closed Bug 738888 Opened 13 years ago Closed 13 years ago

Talos crashes end with "error executing: ... minidump_stackwalk ....dmp ../symbols"

Categories

(Testing :: Talos, defect)

defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: philor, Unassigned)

References

Details

(Keywords: intermittent-failure)

Lots of moving parts that could be at fault, but inbound was still okay while crashing as of https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=66223f04fb55 at 16:30, then merged the Talos zip update seven pushes later, and then eight and nine pushes after that got the next crashes and was

https://tbpl.mozilla.org/php/getParsedLog.php?id=10338558&tree=Mozilla-Inbound
Rev3 Fedora 12 mozilla-inbound talos tpr_responsiveness on 2012-03-23 19:49:08 PDT for push 9c463a882b6f

NOISE: Found crashdump: /tmp/tmpK5shjy/profile/minidumps/73097215-3e75-40b1-6da3e474-423ef3e2.dmp
Failed tp5r: Stopped Fri, 23 Mar 2012 20:51:56
FAIL: Busted: tp5r
FAIL: error executing: '/home/cltbld/talos-slave/talos-data/talos/breakpad/linux/minidump_stackwalk /tmp/tmpK5shjy/profile/minidumps/73097215-3e75-40b1-6da3e474-423ef3e2.dmp ../symbols'
Traceback (most recent call last):
  File "run_tests.py", line 681, in <module>
    main()
  File "run_tests.py", line 678, in main
    test_file(arg, options, parser.parsed)
  File "run_tests.py", line 619, in test_file
    raise e
utils.talosError: "error executing: '/home/cltbld/talos-slave/talos-data/talos/breakpad/linux/minidump_stackwalk /tmp/tmpK5shjy/profile/minidumps/73097215-3e75-40b1-6da3e474-423ef3e2.dmp ../symbols'"
program finished with exit code 1

https://tbpl.mozilla.org/php/getParsedLog.php?id=10337944&tree=Mozilla-Inbound
Rev3 Fedora 12x64 mozilla-inbound talos dirty on 2012-03-23 20:00:10 PDT for push 6bbe864b5162

NOISE: Found crashdump: /tmp/tmpuu9Rf4/profile/minidumps/3c2b99ec-d0aa-34b0-0ca64a8d-2a17987a.dmp
Failed ts_places_generated_med: Stopped Fri, 23 Mar 2012 20:28:25
FAIL: Busted: ts_places_generated_med
FAIL: error executing: '/home/cltbld/talos-slave/talos-data/talos/breakpad/linux64/minidump_stackwalk /tmp/tmpuu9Rf4/profile/minidumps/3c2b99ec-d0aa-34b0-0ca64a8d-2a17987a.dmp ../symbols'
Traceback (most recent call last):
  File "run_tests.py", line 681, in <module>
    main()
  File "run_tests.py", line 678, in main
    test_file(arg, options, parser.parsed)
  File "run_tests.py", line 619, in test_file
    raise e
utils.talosError: "error executing: '/home/cltbld/talos-slave/talos-data/talos/breakpad/linux64/minidump_stackwalk /tmp/tmpuu9Rf4/profile/minidumps/3c2b99ec-d0aa-34b0-0ca64a8d-2a17987a.dmp ../symbols'"
program finished with exit code 1

Could be Linux-only, could be that we just crash a lot more in Talos on Linux than elsewhere, and we haven't gotten around to crashing on any other platform yet.
Not sure if this is normal or not but I see a lot of these [1]:

NOISE: Could not read chrome manifest 'file:///home/cltbld/talos-slave/talos-data/firefox/extensions/%7B972ce4c6-7e08-4474-a285-3208198ce6fd%7D/chrome.manifest'.
NOISE: [JavaScript Warning: "Use of enablePrivilege is deprecated. Please use code that runs with the system principal (e.g. an extension) instead." {file: "file:///home/cltbld/talos-slave/talos-data/talos/startup_test/startup_test.html?begin=1332558487108" line: 0}]
NOISE: __startTimestamp1332558487717__endTimestamp
NOISE: __startBeforeLaunchTimestamp1332558487108__endBeforeLaunchTimestamp
NOISE: __startAfterTerminationTimestamp1332558488184__endAfterTerminationTimestamp

Just for the record, talos.zip had not changed since Mar. 3rd (bug 732835).

[1] http://mxr.mozilla.org/mozilla-central/source/xpcom/components/nsComponentManager.cpp#512
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #7)
> Not sure if this is normal or not but I see a lot of these [1]:
> NOISE: Could not read chrome manifest
> 'file:///home/cltbld/talos-slave/talos-data/firefox/extensions/%7B972ce4c6-
> 7e08-4474-a285-3208198ce6fd%7D/chrome.manifest'.
> NOISE: [JavaScript Warning: "Use of enablePrivilege is deprecated. Please
> use code that runs with the system principal (e.g. an extension) instead."
> {file:
> "file:///home/cltbld/talos-slave/talos-data/talos/startup_test/startup_test.
> html?begin=1332558487108" line: 0}]
> NOISE: __startTimestamp1332558487717__endTimestamp
> NOISE: __startBeforeLaunchTimestamp1332558487108__endBeforeLaunchTimestamp
> NOISE:
> __startAfterTerminationTimestamp1332558488184__endAfterTerminationTimestamp
>
> Just for the record, talos.zip had not changed since Mar. 3rd (bug 732835).
>
> [1]
> http://mxr.mozilla.org/mozilla-central/source/xpcom/components/
> nsComponentManager.cpp#512

I also get a lot of these when testing locally. We should fix the enablePrivilege bug. I actually don't understand the chrome.manifest bug (though I haven't looked into it). That said, I see it all the time without this failure.

It looks like the underlying binary fails (/home/cltbld/talos-slave/talos-data/talos/breakpad/linux/minidump_stackwalk), though I don't know why. If https://bugzilla.mozilla.org/show_bug.cgi?id=734163 got fixed we could at least see where it was actually failing and hopefully instrument this.

I haven't seen this error in practice, so I don't have much of a feeling for why the failure occurs. I would blindly guess that the resultant stack dumps are somehow corrupt (not flushed to disk?), but it's purely a guess.
Does that "program finished with exit code 1" mean that minidump_stackwalk finished with exit code 1, or is that the run_tests.py script finishing with exit code 1?
(In reply to Ted Mielczarek [:ted] from comment #12)
> Does that "program finished with exit code 1" mean that minidump_stackwalk
> finished with exit code 1, or is that the run_tests.py script finishing with
> exit code 1?

The traceback seems to imply the latter.
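For illustration only: the ambiguity discussed above (did the helper binary exit non-zero, or did it never start?) can be made visible at the call site. The sketch below is not the Talos code; `describe_run` and `fake_stackwalk` are hypothetical names, and it assumes a POSIX system with a current Python 3 rather than the Python used on the slaves at the time.

```python
import os
import subprocess
import sys
import tempfile


def describe_run(argv):
    """Run argv and say whether it failed to start or exited non-zero."""
    try:
        proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    except OSError as exc:
        # Raised when the program cannot be launched at all,
        # e.g. a missing file or a file without execute permission.
        return "could not start %s: %s" % (argv[0], exc.strerror)
    if proc.returncode != 0:
        return "%s exited with code %d" % (argv[0], proc.returncode)
    return "ok"


# Case 1: the child starts and then exits with status 1.
exit_one = describe_run([sys.executable, "-c", "raise SystemExit(1)"])

# Case 2: the file exists but has no execute bit, so it never starts.
with tempfile.TemporaryDirectory() as tmp:
    tool = os.path.join(tmp, "fake_stackwalk")  # hypothetical stand-in binary
    with open(tool, "w") as fh:
        fh.write("#!/bin/sh\nexit 0\n")
    os.chmod(tool, 0o644)
    not_started = describe_run([tool])

print(exit_one)
print(not_started)
```

Reporting the two cases separately would have pointed at the launch step rather than at the contents of the minidump.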
jmaher gave me a copy of this talos.zip, and unzipping it on my Linux system shows that the minidump_stackwalk executables are apparently not zipped with executable permissions. That could definitely cause this error.
So the issue is found: running create_talos_zip.py with Python 2.4 (as is on people) will not preserve permissions properly. Running on 2.7 will. Casual googling has not told me when exactly this changed, but for now create_talos_zip.py should be run on Python 2.7.
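As a rough sketch of how an archive could be checked for this problem before it is deployed: a zip entry created on Unix stores its permission bits in the upper 16 bits of `external_attr`, so a missing execute bit can be detected without unpacking. The helpers `add_file` and `entries_missing_exec_bit` are hypothetical and are not part of create_talos_zip.py; the example assumes Python 3 and the standard `zipfile` module, and builds a small in-memory archive rather than the real talos.zip.

```python
import io
import stat
import zipfile


def add_file(zf, arcname, data, mode):
    """Store data under arcname, recording a Unix permission mode explicitly."""
    info = zipfile.ZipInfo(arcname)
    info.create_system = 3  # 3 = Unix: upper 16 bits of external_attr hold st_mode
    info.external_attr = (stat.S_IFREG | mode) << 16
    zf.writestr(info, data)


def entries_missing_exec_bit(zf, suffix):
    """Return names ending in suffix whose stored mode has no execute bit."""
    missing = []
    for info in zf.infolist():
        mode = info.external_attr >> 16
        if info.filename.endswith(suffix) and not mode & 0o111:
            missing.append(info.filename)
    return missing


# Build a tiny archive: one entry stored as 0755, one stored as 0644.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    add_file(zf, "talos/breakpad/linux/minidump_stackwalk", b"\x7fELF", 0o755)
    add_file(zf, "talos/breakpad/linux64/minidump_stackwalk", b"\x7fELF", 0o644)

with zipfile.ZipFile(buf) as zf:
    bad = entries_missing_exec_bit(zf, "minidump_stackwalk")

print(bad)  # ['talos/breakpad/linux64/minidump_stackwalk']
```

A check like this inspects the stored metadata only; whether the bits survive extraction still depends on the tool used to unzip.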
Not that the stack is useful, but we fixed the problem as described: https://tbpl.mozilla.org/php/getParsedLog.php?id=10410744&tree=Mozilla-Inbound&full=1
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Whiteboard: [orange]