Closed
Bug 810719
Opened 12 years ago
Closed 12 years ago
B2G: turn on jsloader.reuseGlobal on Beta
Categories
(Firefox OS Graveyard :: General, defect, P1)
Tracking
(blocking-basecamp:+, firefox18 fixed, firefox19 unaffected)
| | Tracking | Status |
|---|---|---|
| firefox18 | --- | fixed |
| firefox19 | --- | unaffected |
People
(Reporter: khuey, Assigned: khuey)
References
Details
(Whiteboard: [MemShrink:P1])
Attachments
(1 file)
(deleted), patch | justin.lebar+bug: review+ | justin.lebar+bug: approval-mozilla-beta+
+++ This bug was initially created as a clone of Bug #807478 +++
We still need to do this on Aurora
Updated•12 years ago
Whiteboard: [MemShrink]
Updated•12 years ago
Assignee: nobody → khuey
Updated•12 years ago
Summary: B2G: turn on jsloader.reuseGlobal on Aurora → B2G: turn on jsloader.reuseGlobal on Aurora and Beta
Comment 1•12 years ago
Well, the good news is we don't need this on Aurora anymore... :-/
Summary: B2G: turn on jsloader.reuseGlobal on Aurora and Beta → B2G: turn on jsloader.reuseGlobal on Beta
This is a MemShrink:P1 right? It would be an indescribable shame for all this work to die on the vine ...
Comment 3•12 years ago
(In reply to Chris Jones [:cjones] [:warhammer] from comment #2)
> This is a MemShrink:P1 right? It would be an indescribable shame for all
> this work to die on the vine ...
Yes, absolutely. It's not triaged because we usually triage bugs only at our bi-weekly meetings. In cases like this, we hold off triaging not because we expect there to be any disagreement about the priority, but because we use triage as an opportunity to discuss bugs. If we mark this as P1 right now, this bug won't be on our list of issues to talk about.
I doubt the missing MemShrink:P1 marking is keeping Kyle from working on this.
As before when we were counting beans over memshrink priorities, my only goal is to ensure that this bug is prioritized above the other slim-fast work on the memshrink radar.
Assignee | ||
Comment 5•12 years ago
Kyle is working on this, and Kyle works faster when he doesn't have to read bugmail about how Kyle is not working on stuff :-P
I didn't mean to imply that Kyle wasn't working on this :). However, I suspect Kyle also works faster when he doesn't have to read bugmail about how Kyle *is* working on stuff, so let's move on.
Comment 7•12 years ago
This bug has been called out as likely having risk to non-B2G platforms. Given that, marking as P1, and moving into the C2 milestone. We should prioritize this landing to mozilla-beta as soon as possible, to prevent late-breaking regressions to other platforms.
Priority: -- → P1
Target Milestone: --- → B2G C2 (20nov-10dec)
Assignee | ||
Comment 8•12 years ago
This bug has no risk to non-B2G platforms. Bugs that this depends on might, but flipping the pref will only affect b2g.
Comment 9•12 years ago
> Bugs that this depends on might
It's for this reason that I brought this bug to akeybl's attention.
\o/
Updated•12 years ago
Whiteboard: [MemShrink] → [MemShrink:P1]
Assignee | ||
Comment 11•12 years ago
Comment 12•12 years ago
Comment on attachment 686178 [details] [diff] [review]
Patch
r+a=me
Attachment #686178 -
Flags: review+
Attachment #686178 -
Flags: approval-mozilla-beta+
Assignee | ||
Comment 13•12 years ago
I tried to land this today but I had to back it out because it causes b2g to crash :-(
tl;dr
- an app process is SIGBUS'ing around the time we push new files
- this might be racy and the app process might be seeing inconsistent .so's
- b2g dies and is relaunched twice (?) after that
- on the fourth relaunching (fourth time's a charm!) we get a legitimate-looking error
14:41:08 INFO - E/GeckoConsole( 342): [JavaScript Error: "resource://specialpowers/MockPermissionPrompt.jsm - EXPORTED_SYMBOLS is not an array." {file: "chrome://specialpowers/content/specialpowersAPI.js" line: 13}]
- things seem to proceed normally but then the harness errors out
- Kyle and I can't seem to reproduce
More precisely, in https://tbpl.mozilla.org/php/getParsedLog.php?id=17426792&full=1&branch=mozilla-beta at least
14:41:08 INFO - installing gecko binaries...
...
14:41:08 INFO - restarting B2G
14:41:08 INFO - 'cp' not found, but 'dd' was found as a replacement
14:41:08 INFO - Traceback (most recent call last):
...
14:41:08 INFO - File "/usr/lib/python2.6/telnetlib.py", line 395, in read_very_lazy
14:41:08 INFO - raise EOFError, 'telnet connection closed'
14:41:08 INFO - EOFError: telnet connection closed
14:41:08 ERROR - Return code: 1
The telnet connection seems to be interrupted when we push the new gecko onto the emulator. That's strange. Continuing on
...
14:41:08 INFO - E/GeckoConsole( 43): [JavaScript Error: "NS_ERROR_ILLEGAL_VALUE: Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIXPCComponents_Utils.import]" {file: "jar:file:///system/b2g/omni.ja!/components/HealthReportService.js" line: 10}]
14:41:08 INFO - I/Gecko ( 43): 1354142124088 Marionette INFO MarionetteComponent loaded
14:41:08 INFO - I/Gecko ( 43): 1354142124139 Marionette INFO marionette enabled
The restarted b2g process comes up, and marionette seems to be happy. Then
14:41:08 INFO - F/libc ( 182): Fatal signal 7 (SIGBUS) at 0x41390cf4 (code=2)
...
14:41:08 INFO - I/DEBUG ( 35): pid: 182, tid: 182 >>> /system/b2g/plugin-container <<<
...
14:41:08 INFO - E/GeckoConsole( 43): Content JS INFO at app://system.gaiamobile.org/js/window_manager.js:996 in createFrame: %%%%% Launching FTU as remote (OOP)
14:41:08 INFO - E/GeckoConsole( 43): Content JS INFO at app://system.gaiamobile.org/js/window_manager.js:996 in createFrame: %%%%% Launching Homescreen as remote (OOP)
...
14:41:08 INFO - I/Gecko ( 43): [Parent 43] WARNING: pipe error (105): Connection reset by peer: file /data/jenkins/jobs/b2g-build/workspace/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 431
14:41:08 INFO - I/Gecko ( 43): [Parent 43] WARNING: pipe error (110): Connection reset by peer: file /data/jenkins/jobs/b2g-build/workspace/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 431
14:41:08 INFO - I/Gecko ( 43): [Parent 43] WARNING: pipe error (112): Connection reset by peer: file /data/jenkins/jobs/b2g-build/workspace/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 431
14:41:08 INFO - F/libc ( 43): Fatal signal 11 (SIGSEGV) at 0x00000047 (code=1)
This is a little hard to interpret, but basically what's happening is
- b2g process launches FTU and homescreen apps
- one of those two crashes with SIGBUS (this is backwards in logcat, confusingly)
- the other doesn't show a crash, but it appears to die too
- the main b2g process segfaults
This is where things are obviously going wrong.
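For anyone who wants to check this ordering against the full log, here is a minimal sketch of pulling the libc fatal-signal lines out of a logcat dump. The function name and the regex are my own and are not part of the harness. The regex only assumes the `F/libc ( pid): Fatal signal N (SIGNAME)` shape seen in the lines quoted above.

```python
import re

# Matches Android libc crash lines as quoted above, e.g.
# "F/libc ( 182): Fatal signal 7 (SIGBUS) at 0x41390cf4 (code=2)"
FATAL_RE = re.compile(
    r"F/libc\s*\(\s*(?P<pid>\d+)\):\s*Fatal signal \d+ \((?P<name>SIG\w+)\)"
)


def fatal_signals(lines):
    """Return (pid, signal_name) for each libc fatal-signal line, in log order."""
    found = []
    for line in lines:
        match = FATAL_RE.search(line)
        if match:
            found.append((int(match.group("pid")), match.group("name")))
    return found


# Sample lines copied from the log excerpt above (pipe error line shortened).
sample = [
    "14:41:08 INFO - F/libc ( 182): Fatal signal 7 (SIGBUS) at 0x41390cf4 (code=2)",
    "14:41:08 INFO - I/Gecko ( 43): [Parent 43] WARNING: pipe error (105): Connection reset by peer",
    "14:41:08 INFO - F/libc ( 43): Fatal signal 11 (SIGSEGV) at 0x00000047 (code=1)",
]

print(fatal_signals(sample))  # [(182, 'SIGBUS'), (43, 'SIGSEGV')]
```

Here pid 182 is the plugin-container (app) process and pid 43 is the parent b2g process, so the output matches the reading that the app dies with SIGBUS and the parent segfaults.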
After that,
14:41:08 INFO - E/profiler( 267): Registering start signal
14:41:08 INFO - I/Gecko ( 267): 1354142210634 Marionette INFO MarionetteComponent loaded
14:41:08 INFO - I/Gecko ( 267): 1354142210665 Marionette INFO marionette enabled
b2g is relaunched by the services manager and marionette comes up again.
14:41:08 INFO - E/profiler( 342): Registering start signal
14:41:08 INFO - I/Gecko ( 342): 1354142229425 Marionette INFO MarionetteComponent loaded
14:41:08 INFO - I/Gecko ( 342): 1354142229443 Marionette INFO marionette enabled
Hm, so the third b2g dies and a fourth is launched. That's weird because there are no crashes logged.
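A rough way to count the relaunches is to collect the distinct pids that log "marionette enabled". This is only a sketch: the helper name and regex are mine, and it assumes every b2g parent process logs that line exactly as in the excerpts above. It can only see processes whose lines appear in the input.

```python
import re

# Matches e.g. "I/Gecko ( 267): 1354142210665 Marionette INFO marionette enabled"
MARIONETTE_RE = re.compile(
    r"I/Gecko\s*\(\s*(\d+)\):.*Marionette INFO marionette enabled"
)


def b2g_launch_pids(lines):
    """Return the distinct pids that reported 'marionette enabled', in log order."""
    pids = []
    for line in lines:
        match = MARIONETTE_RE.search(line)
        if match:
            pid = int(match.group(1))
            if pid not in pids:
                pids.append(pid)
    return pids


# Sample lines copied from the log excerpts above.
sample = [
    "14:41:08 INFO - I/Gecko ( 43): 1354142124139 Marionette INFO marionette enabled",
    "14:41:08 INFO - I/Gecko ( 267): 1354142210634 Marionette INFO MarionetteComponent loaded",
    "14:41:08 INFO - I/Gecko ( 267): 1354142210665 Marionette INFO marionette enabled",
    "14:41:08 INFO - I/Gecko ( 342): 1354142229443 Marionette INFO marionette enabled",
]

print(b2g_launch_pids(sample))  # [43, 267, 342]
```

Each new pid in that list is another b2g parent coming up after the push, whether or not a crash was logged for the previous one.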
14:41:08 INFO - E/GeckoConsole( 342): [JavaScript Error: "resource://specialpowers/MockPermissionPrompt.jsm - EXPORTED_SYMBOLS is not an array." {file: "chrome://specialpowers/content/specialpowersAPI.js" line: 13}]
That looks relevant at least!
14:41:08 INFO - E/profiler( 380): Registering start signal
14:41:08 INFO - E/GeckoConsole( 342): Content JS INFO at app://system.gaiamobile.org/js/window_manager.js:996 in createFrame: %%%%% Launching FTU as remote (OOP)
14:41:08 INFO - E/GeckoConsole( 342): Content JS INFO at app://system.gaiamobile.org/js/window_manager.js:996 in createFrame: %%%%% Launching Homescreen as remote (OOP)
14:41:08 INFO - E/profiler( 405): Registering start signal
This time, with the fourth b2g process, FTU and homescreen seem to launch fine. But next
14:41:08 WARNING - # TBPL WARNING #
14:41:08 WARNING - The mochitest suite: mochitest-1 ran with return status: WARNING
and we're done.
I'm not able to reproduce anything like this locally with ./test.sh, although |./test.sh mochitest| is broken. I can't start up the emulator with ./run-emulator.sh. I don't think Kyle has been able to get anything running.
jgriffin, how can we most closely duplicate the tbpl environment locally?
Flags: needinfo?(jgriffin)
Try push to rule out ... some class of weirdness, but I'm not holding my breath for anything meaningful
https://tbpl.mozilla.org/?tree=Try&rev=4eea13894e1d
I see basically the same thing looking through a failing reftest log,
https://tbpl.mozilla.org/php/getParsedLog.php?id=17425077&full=1&branch=mozilla-beta#error0
I may be misinterpreting these log statements
14:41:08 INFO - waiting for system-message-listener-ready...
14:41:08 INFO - done
14:41:08 INFO - installing gecko binaries...
14:41:08 INFO - pushing /system/b2g/crashreporter-override.ini (attempt 1 of 10)
...
14:41:08 INFO - pushing /system/b2g/components/binary.manifest (attempt 1 of 10)
14:41:08 INFO - restarting B2G
but they make it appear like we're pushing the updated build while the previous b2g service is running. If so, that would explain two of the weird things in comment 14
- SIGBUS on a build that only has a pref change
- b2g process dying and restarting without error messages (this happens on SIGKILL, which is what |adb shell stop b2g| delivers)
To (mostly) fix this problem, we need to stop b2g before pushing new code to the emulator.
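The ordering I have in mind looks roughly like the sketch below. It is not the harness's actual code: `push_plan` is a hypothetical helper that only builds the command list, and I'm assuming the plain adb CLI (`adb shell stop b2g`, `adb push`, `adb shell start b2g`) rather than whatever device manager the harness really uses.

```python
def push_plan(files, dest="/system/b2g"):
    """Build the adb command sequence: stop b2g, push every file, then start b2g.

    `files` are paths relative to a local build directory (hypothetical layout);
    each one is pushed to the same relative path under `dest` on the device.
    """
    plan = [["adb", "shell", "stop", "b2g"]]  # nothing is running while we push
    for path in files:
        plan.append(["adb", "push", path, dest + "/" + path])
    plan.append(["adb", "shell", "start", "b2g"])  # relaunch only after all files landed
    return plan


# File names taken from the "pushing ..." lines in the log above.
for command in push_plan(["crashreporter-override.ini", "components/binary.manifest"]):
    print(" ".join(command))
```

This only prints the commands. Running each one with something like `subprocess.check_call(command)` would execute the plan, with the point being that no process can map a half-updated .so while the push is in progress.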
I also forgot to mention above, I don't see any issues when running with this pref change on an otoro.
Sigh. I was testing with a .userconfig pointing at mozilla-inbound :/. Here we go again ...
Another interpretation of these logs is
[some weird and possibly bad, but irrelevant, stuff happens on startup]
...
14:41:08 INFO - E/GeckoConsole( 342): [JavaScript Error: "resource://specialpowers/MockPermissionPrompt.jsm - EXPORTED_SYMBOLS is not an array." {file: "chrome://specialpowers/content/specialpowersAPI.js" line: 13}]
...
[harness doesn't start up properly and we fail the tests]
Occam's Razor favors this. Build almost done, will see.
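If it helps with triaging logs like this, here is a small sketch for pulling the GeckoConsole JavaScript errors out of a log so they can be read without the startup noise. The helper name and regex are my own, and the regex assumes the `[JavaScript Error: "..." {file: "..." line: N}]` shape of the lines quoted in this bug.

```python
import re

# Matches e.g.
# E/GeckoConsole( 342): [JavaScript Error: "msg" {file: "url" line: 13}]
JS_ERROR_RE = re.compile(
    r'E/GeckoConsole\(\s*(?P<pid>\d+)\):\s*\[JavaScript Error: "(?P<msg>.*?)" '
    r'\{file: "(?P<file>[^"]*)" line: (?P<line>\d+)\}\]'
)


def js_errors(lines):
    """Return (pid, file, line, message) for each GeckoConsole JavaScript error."""
    errors = []
    for text in lines:
        match = JS_ERROR_RE.search(text)
        if match:
            errors.append((
                int(match.group("pid")),
                match.group("file"),
                int(match.group("line")),
                match.group("msg"),
            ))
    return errors


# Sample line copied from the log excerpt above.
sample = [
    '14:41:08 INFO - E/GeckoConsole( 342): [JavaScript Error: '
    '"resource://specialpowers/MockPermissionPrompt.jsm - EXPORTED_SYMBOLS is not an array." '
    '{file: "chrome://specialpowers/content/specialpowersAPI.js" line: 13}]',
    "14:41:08 INFO - E/profiler( 380): Registering start signal",
]

for pid, source, line_number, message in js_errors(sample):
    print(pid, source, line_number, message)
```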
I can reproduce locally. Simpler explanation seems to be correct.
Bug 807478 seems not to have been uplifted hard enough.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated•12 years ago
jgriffin, leaving needinfo? outstanding. Last time we diagnosed these test failures, we discussed the right incantations to run the suites like CI does, but I totally forgot them :). We should document that somewhere.
Updated•12 years ago
Flags: needinfo?(jgriffin)