Closed Bug 1043485 Opened 10 years ago Closed 10 years ago

Android robocop crashes no longer pass the correct test name to mozcrash

Categories

(Testing :: General, defect)

ARM
Android
defect
Not set
major

Tracking

(firefox33 fixed, firefox34 fixed)

RESOLVED FIXED
mozilla34
Tracking Status
firefox33 --- fixed
firefox34 --- fixed

People

(Reporter: emorley, Assigned: akachkach)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

eg:
https://tbpl.mozilla.org/php/getParsedLog.php?id=44525801&tree=B2g-Inbound
09:24:21  WARNING -  PROCESS-CRASH | remoteautomation.py | application crashed [@ js::jit::ICSetProp_Native::Compiler::getStub(js::jit::ICStubSpace*)]

Should have been:
09:24:21  WARNING -  PROCESS-CRASH | testImportFromAndroid | application crashed [@ js::jit::ICSetProp_Native::Compiler::getStub(js::jit::ICStubSpace*)]

And:
https://tbpl.mozilla.org/php/getParsedLog.php?id=44524261&tree=Mozilla-Inbound
08:57:23  WARNING -  PROCESS-CRASH | remoteautomation.py | application crashed [@ CustomElf::GetSymbol(char const*, unsigned long) const]

Should have been:
08:57:23  WARNING -  PROCESS-CRASH | testDistribution | application crashed [@ CustomElf::GetSymbol(char const*, unsigned long) const]

The string printed there is test_name passed to mozcrash:
http://mxr.mozilla.org/mozilla-central/source/testing/mozbase/mozcrash/mozcrash/mozcrash.py#25

Via lastTestSeen:
http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/runtests.py#1447
1447       crashed = mozcrash.check_for_crashes(minidump_path,
1448                                            symbolsPath,
1449                                            test_name=self.lastTestSeen)

I'm guessing this might be fallout from bug 886570?
I'm presuming this change doesn't play nicely with robocop:

-    def record_last_test(self, line):
-      """record last test on harness"""
-      if "TEST-START" in line and "|" in line:
-        self.harness.lastTestSeen = line.split("|")[1].strip()
-      return line
+    def record_last_test(self, message):
+      """record last test on harness"""
+      if message['action'] == 'test_start':
+        self.harness.lastTestSeen = message['test']
+      return message

Does robocop not send a test_start event?

(Note mozcrash has a fallback, so it's not that:
message['test'] = remoteautomation.py
...but instead:
message['action'] != 'test_start'

and thus we're hitting the fallback:
http://mxr.mozilla.org/mozilla-central/source/testing/mozbase/mozcrash/mozcrash/mozcrash.py#61
Oh well. It's way worse than that actually: Robocop isn't doing structured logs for now, as I thought passing unstructured logs as log messages would be enough while we figure this out. Here 

I started working on a patch to convert the Java code to structured logging (bug 1034236 and bug 1034267), so I can either try to finish the Java refactor (it should be almost done... the problem is that I can't get fennec/robocop to run locally, so it's hard to test) or add conditions on unstructured logs (if message['action'] == 'log' and 'TEST-START' in message['message']). Not sure which is preferable.
Depends on: 1034267
A temporary workaround until bug 1034236 / bug 1034267 are finished might be best, if that's ok? :-)
I believe this would fix the problem (and a couple other handlers) but we won't have 100% compatibility until we land the structured robocop logs. Hopefully that won't take too long (see bug 1034267), just have to make sure the output looks the same as before.
Attachment #8462735 - Flags: review?(emorley)
Attachment #8462735 - Flags: review?(emorley) → review+
Thank you :-)
Assignee: nobody → akachkach
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/00b25727072e
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Unfortunately I'm still seeing a missing name for "Shutdown", eg:
https://tbpl.mozilla.org/php/getParsedLog.php?id=44811243&tree=Mozilla-Inbound

{
09:48:16     INFO -  1372 INFO TEST-PASS | testNativeCrypto | Expected byte array length matches key length - 20 should equal 20
09:48:16     INFO -  1373 INFO TEST-PASS | testNativeCrypto | Expected byte array matches key byte array -
09:48:16     INFO -  1374 INFO TEST-PASS | testNativeCrypto | MessageDigest hash is the same as NativeCrypto SHA-1 hash -
09:48:16     INFO -  1375 INFO TEST-PASS | testNativeCrypto | MessageDigest hash is the same as NativeCrypto SHA-1 hash -
09:48:16     INFO -  1376 INFO TEST-PASS | testNativeCrypto | MessageDigest hash is the same as NativeCrypto SHA-1 hash -
09:48:16     INFO -  1377 INFO TEST-END | testNativeCrypto | finished in 40471ms
09:48:16     INFO -  1378 INFO TEST-START | Shutdown
09:48:16     INFO -  1379 INFO Passed: 46
09:48:16     INFO -  1380 INFO Failed: 0
09:48:16     INFO -  1381 INFO Todo: 0
09:48:16     INFO -  1382 INFO SimpleTest FINISHED
09:48:16     INFO -  INFO | automation.py | Application ran for: 0:01:16.247024
09:48:16     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmp3hGV4_pidlog
09:48:16     INFO -  Contents of /data/anr/traces.txt:
09:48:16     INFO -  
09:48:16     INFO -  
09:48:16     INFO -  mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android/1406647240/fennec-34.0a1.en-US.android-arm.crashreporter-symbols.zip
09:48:16  WARNING -  PROCESS-CRASH | remoteautomation.py | application crashed [@ libui.so + 0x1befe]
}

The above crash is intermittent failure bug 986738 - previous logs for that failure mode looked like:

https://tbpl.mozilla.org/php/getParsedLog.php?id=44089292&tree=Mozilla-Inbound

{
01:46:01     INFO -  15 INFO TEST-PASS | testAddonManager | Given message occurred for registered event: {"parentId":0,"delayLoad":false,"title":"http:\/\/mochi.test:8888\/tests\/robocop\/robocop_blank_01.html","selected":true,"isPrivate":false,"stub":false,"external":false,"desktopMode":false,"tabIndex":-1,"tabID":2,"type":"Tab:Added","uri":"http:\/\/mochi.test:8888\/tests\/robocop\/robocop_blank_01.html"} - Tab:Added should equal Tab:Added
01:46:01     INFO -  16 INFO TEST-PASS | testAddonManager | Given message occurred for registered event: {"errorType":"","bgColor":"transparent","type":"DOMContentLoaded","tabID":2} - DOMContentLoaded should equal DOMContentLoaded
01:46:01     INFO -  EventExpecter: no longer listening for Tab:Added
01:46:01     INFO -  EventExpecter: no longer listening for DOMContentLoaded
01:46:01     INFO -  17 INFO TEST-PASS | testAddonManager | The correct number of tabs are opened - 2 should equal 2
01:46:01     INFO -  18 INFO TEST-PASS | testAddonManager | Page title is correct - Browser Blank Page 01 should equal Browser Blank Page 01
01:46:01     INFO -  waitForText timeout on ^Add-ons$
01:46:01     INFO -  19 INFO TEST-PASS | testAddonManager | The correct number of tabs are opened - 2 should equal 2
01:46:01     INFO -  20 INFO TEST-END | testAddonManager | finished in 152371ms
01:46:01     INFO -  21 INFO TEST-START | Shutdown
01:46:01     INFO -  22 INFO Passed: 18
01:46:01     INFO -  23 INFO Failed: 0
01:46:01     INFO -  24 INFO Todo: 0
01:46:01     INFO -  25 INFO SimpleTest FINISHED
01:46:01     INFO -  
01:46:01     INFO -  INFO | automation.py | Application ran for: 0:03:10.208340
01:46:01     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmpqFoFyfpidlog
01:46:01     INFO -  Contents of /data/anr/traces.txt:
01:46:01     INFO -  
01:46:01     INFO -  
01:46:01     INFO -  mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android/1405666711/fennec-33.0a1.en-US.android-arm.crashreporter-symbols.zip
01:46:01  WARNING -  PROCESS-CRASH | Shutdown | application crashed [@ libui.so + 0x1befe]
}

In fact looking at the comments in that bug, you can see the test name change around 2014-07-18, which is when bug 886570 landed.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
In fact it seems like this isn't working for !Shutdown too, eg:

(this push was after the fix here merged around)

https://tbpl.mozilla.org/php/getParsedLog.php?id=44769830&tree=Fx-Team

21:24:47     INFO -  11 INFO TEST-START | testLoad
21:24:47     INFO -  INFO | automation.py | Application ran for: 0:00:15.764747
21:24:47     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmptLyQVepidlog
21:24:47     INFO -  Contents of /data/anr/traces.txt:
21:24:49     INFO -  mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/fx-team-android/1406604281/fennec-34.0a1.en-US.android-arm.crashreporter-symbols.zip
21:24:53  WARNING -  PROCESS-CRASH | remoteautomation.py | application crashed [@ CustomElf::GetSymbol(char const*, unsigned long) const]
Can we wait for bug 1034267 to land? Using actual structured logs in Robocop might solve this.

Actually, I'm starting to wonder if this isn't caused by the changes to remoteautomation.py

http://dxr.mozilla.org/mozilla-central/source/build/mobile/remoteautomation.py#251
After further inspecting remoteautomation.py, it seems like this bug is caused because we still pass a structured logger to RemoteAutomation, which in turn processes the logs like structured logs (looking for a "test_start" action where it will only get logs).

I think bug 1034267 will fix this. What's an easy way to create a PROCESS-CRASH in robocop tests?
Pushed the bug 1034267 patch with a MOZ_CRASH to see if it's fixed: https://hg.mozilla.org/try/rev/3a3d80f713df
With the patch applied, here are the logs we get on a crash:

16:30:12     INFO -  10 INFO SimpleTest START
16:30:12     INFO -  11 INFO TEST-START | testFlingCorrectness
16:30:12     INFO -  12 INFO TEST-PASS | testFlingCorrectness | Given message occurred for registered event: {"type":"Gecko:Ready"} - Gecko:Ready should equal Gecko:Ready
16:30:12     INFO -  13 INFO EventExpecter: no longer listening for Gecko:Ready
16:30:12     INFO -  INFO | automation.py | Application ran for: 0:00:45.686828
16:30:12     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmpUX5pqapidlog
16:30:12     INFO -  Contents of /data/anr/traces.txt:
16:30:12     INFO -  
16:30:12     INFO -  
16:30:12     INFO -  mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/try-builds/ahmed.kachkach@gmail.com-3a3d80f713df/try-android/fennec-34.0a1.en-US.android-arm.crashreporter-symbols.zip
16:30:12  WARNING -  PROCESS-CRASH | testFlingCorrectness | application crashed [@ nsWindow::OnGlobalAndroidEvent(mozilla::AndroidGeckoEvent*)]
16:30:12     INFO -  Crash dump filename: /tmp/tmpB1OHgg/0b330c82-e0b9-20be-4b047e8a-10985a44.dmp

So I guess we should just wait for bug 1034267 to land.
Flags: needinfo?(emorley)
Sounds good to me, thank you :-)
Flags: needinfo?(emorley)
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: