Closed Bug 906793 Opened 11 years ago Closed 10 years ago

B2G mochitests intermittently just stop after "IMediaResourceManagerDeathNotifier: media server died" and/or "ServiceManager: service 'media.resource_manager' died"

Categories

(Firefox OS Graveyard :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: RyanVM, Unassigned)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 obsolete file)

Worst of all, the error only shows up in the dump, so TBPL can't suggest anything for it. https://tbpl.mozilla.org/php/getParsedLog.php?id=26722508&tree=B2g-Inbound b2g_emulator_vm b2g-inbound opt test mochitest-3 on 2013-08-19 08:37:33 PDT for push 24fbe2665d27 slave: tst-linux64-ec2-108 09:15:48 INFO - 08-19 16:08:39.423 699 699 I GeckoDump: 15720 INFO TEST-PASS | /tests/content/smil/test/test_smilMappedAttrFromTo.xhtml | image-rendering: checking that 'from' value is set at start of animation 09:15:48 INFO - 08-19 16:08:39.433 699 699 I GeckoDump: 15721 INFO TEST-PASS | /tests/content/smil/test/test_smilMappedAttrFromTo.xhtml | image-rendering: checking value halfway through animation 09:15:48 INFO - 08-19 16:14:09.692 699 1015 W IMediaResourceManagerDeathNotifier: media server died 09:15:48 INFO - 08-19 16:14:09.692 33 33 I ServiceManager: service 'media.resource_manager' died
Summary: B2G mochitests intermittently just stop (after "IMediaResourceManagerDeathNotifier: media server died") → B2G mochitests intermittently just stop after "IMediaResourceManagerDeathNotifier: media server died" and/or "ServiceManager: service 'media.resource_manager' died"
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #0) > 09:15:48 INFO - 08-19 16:14:09.692 699 1015 W > IMediaResourceManagerDeathNotifier: media server died > 09:15:48 INFO - 08-19 16:14:09.692 33 33 I ServiceManager: > service 'media.resource_manager' died The above says that b2g process is died by some reason during when video codec is instantiating. On emulator, sw video codec is created.
Sotaro, can you please suggest an assignee for this bug? https://tbpl.mozilla.org/php/getParsedLog.php?id=26959050&full=1&branch=b2g-inbound
Flags: needinfo?(sotaro.ikeda.g)
Depends on: 907745
Flags: needinfo?(sotaro.ikeda.g)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #15) > Sotaro, can you please suggest an assignee for this bug? > > https://tbpl.mozilla.org/php/getParsedLog.php?id=26959050&full=1&branch=b2g- > inbound I take this bug. Before investigating the problem, Bug 907745 needs to be fixed. The bug seems to be fixed in near future.
Assignee: nobody → sotaro.ikeda.g
Depends on: 917252
I still haven't been able to figure out the interleaving, but figured I might as well upload the fix to the timeout error message while I was thinking about it.
Attachment #809412 - Flags: review?(jhammel)
Comment on attachment 809412 [details] [diff] [review] Patch 1.0 - fix format errors in the timeout error message Beh, sorry. I meant to upload this to bug 917252
Attachment #809412 - Attachment is obsolete: true
Attachment #809412 - Flags: review?(jhammel)
https://tbpl.mozilla.org/php/getParsedLog.php?id=30163533&tree=Mozilla-B2g26-v1.2 Andrew, this is probably the most frequent B2G emulator mochitest failure we hit and there hasn't been any action on it in months. Can we please get someone to look at this?
Flags: needinfo?(overholt)
It looks like mochitests may be randomly running out of memory. In each of the failing lots, I see lots of lines like: log : OOM Message Logger Started
Bug 929005 might be related to this bug. but it is not clear.
Sotaro's the best person to investigate here. Let's hope the work in bug 929005 helps.
Flags: needinfo?(overholt)
Depends on: 929005
Bug 935981 might be related. But it is not clear.
Depends on: 935981
The following log just say b2g process was died by some reason. It has basically no relation ship to media. > ServiceManager: service 'media.resource_manager' died
If there is no crash log and b2g process was killed by some reason, a possible cause is 'killed by low memory killer' or 'killed by out of memory killer'.
Many of the logs being pasted here are now no longer related the the issue in the summary. I've filed bug 937684 for ensuring the "TEST-UNEXPECTED-FAIL | /foo/bar/baz.html | application timed out after 330.0 seconds with no output" lines don't get interleaved, and thus we have more to star against, and this hopefully avoid false positives here.
Depends on: 937684
https://tbpl.mozilla.org/php/getParsedLog.php?id=30843693&tree=Mozilla-Inbound&full=1 Sotaro, it seems unclear to me where we go next. Any ideas?
Flags: needinfo?(sotaro.ikeda.g)
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #95) > https://tbpl.mozilla.org/php/getParsedLog.php?id=30843693&tree=Mozilla- > Inbound&full=1 > > Sotaro, it seems unclear to me where we go next. Any ideas? Need to make clear if Comment 82 happens.
Flags: needinfo?(sotaro.ikeda.g)
dhylands, is it possible to output "adb shell dmesg" log?
Flags: needinfo?(dhylands)
So we recently added oom-msg-logger for on-device which will look for OOM related messages. These will then get captured and placed into the logcat log. Currently, this only works on phones which our init.b2g.rc gets included with the kernel and we flash the kernel. It looks like the OOM Message Logger is, in fact, trying to run. I see these near the end of the log: 10:55:35 INFO - 11-08 18:54:24.896 1288 1288 I log : OOM Message Logger Started We discovered that it really needs to run as root, and this was fixed just recently in Bug 939370 Looking at the more recent logs (after Nov 15) I see that the OOM Message Logger is running, but I'm not seeing any output from it, so it would appear that the media server isn't dying due to the low memory OOM killer. Looking at the logs: 02:06:10 INFO - 11-25 10:04:23.585 702 702 I GeckoDump: 18058 INFO TEST-START | Shutdown 02:06:10 INFO - 11-25 10:04:23.595 702 702 I GeckoDump: 18059 INFO Passed: 31704 02:06:10 INFO - 11-25 10:04:23.595 702 702 I GeckoDump: 18060 INFO Failed: 0 02:06:10 INFO - 11-25 10:04:23.610 702 702 I GeckoDump: 18061 INFO Todo: 578 02:06:10 INFO - 11-25 10:04:23.625 702 702 I GeckoDump: 18062 INFO Slowest: 164922ms - /tests/dom/tests/mochitest/ajax/jquery/test_jQuery.html 02:06:10 INFO - 11-25 10:04:23.655 702 702 I GeckoDump: 18063 INFO SimpleTest FINISHED 02:06:10 INFO - 11-25 10:04:23.675 702 702 I GeckoDump: 18064 INFO TEST-INFO | Ran 1 Loops 02:06:10 INFO - 11-25 10:04:23.675 702 702 I GeckoDump: 18065 INFO SimpleTest FINISHED 02:06:10 INFO - 11-25 10:04:26.505 662 662 I Gecko:ProcessPriorityManager: [Mochitest, child-id=1, pid=702] Scheduling reset timer to fire in 1000ms. 02:06:10 INFO - 11-25 10:04:26.535 662 683 E Profiler: BPUnw: [1 total] thread_unregister_for_profiling(me=0x22b10) (NOT REGISTERED) 02:06:10 INFO - 11-25 10:04:26.625 662 827 E Profiler: BPUnw: [1 total] thread_unregister_for_profiling(me=0x22fe0) (NOT REGISTERED) 02:06:10 INFO - 11-25 10:04:26.745 662 662 I Gecko:ProcessPriorityManager: [Mochitest, child-id=1, pid=702] ScheduleResetPriority bailing; the timer is already running. 02:06:10 INFO - 11-25 10:04:26.755 662 662 I Gecko:ProcessPriorityManager: Making background LRU pool with size(31) 02:06:10 INFO - 11-25 10:04:26.755 662 662 I Gecko:ProcessPriorityManager: [child-id=1, pid=-1] Destroying ParticularProcessPriorityManager. 02:06:10 INFO - 11-25 10:04:28.626 33 33 I ServiceManager: service 'media.resource_manager' died It looks like the tests all completed successfully, and that mochitest was in the the process of shutting down when the mediaserver died. So maybe something was opened but not closed and that then caused the media server to die?
Flags: needinfo?(dhylands)
Thanks for the comment! 'media.resource_manager' are not media server. It is instantiated in b2g process. Do you know if b2g process is closed at these timing? I recognized that these services Shutdown do not implemented yet. If b2g process is shutdown, but service manager is still in active, the service manager detects 'media.resource_manager' died.
In the most recent instances of all these failures, the tests are passing and completing successfully, but for some reason the mozharness script isn't reporting this correctly. This looks like a stdout buffering/threading problem, e.g., 13:55:41 INFO - 39652 INFO TEST-START | Shutdown 13:55:41 INFO - 39653 INFO Passed: 37433 13:56:52 INFO - 39654 Mochitest INFO | runtestsb2g.py | Running tests: end. 13:57:53 INFO - INFO Failed: 0 13:57:53 INFO - 39655 INFO Todo: 157 13:57:53 INFO - 39656 INFO Slowest: 57602ms - /tests/content/html/content/test/forms/test_input_typing_sanitization.html 13:57:53 INFO - 39657 INFO SimpleTest FINISHED 13:57:53 INFO - 39658 INFO TEST-INFO | Ran 1 Loops 13:57:53 INFO - 39659 INFO SimpleTest FINISHED
Assignee: sotaro.ikeda.g → nobody
We haven't seen this in a while. I think this message indicated a crash, which is now being reported correctly, and so we can close this bug.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: