Closed Bug 1187873 Opened 9 years ago Closed 9 years ago

adb-helper crashes firefox on Devices.emit("fastboot-start-polling") on mac

Categories

(DevTools :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: daleharvey, Assigned: gerard-majax)

References

Details

Attachments

(9 files, 5 obsolete files)

(deleted), image/png
Details
(deleted), text/x-github-pull-request
daleharvey
: review+
Details
(deleted), image/png
Details
(deleted), text/plain
Details
(deleted), text/plain
Details
(deleted), text/x-github-pull-request
ochameau
: review+
Details
(deleted), text/x-github-pull-request
daleharvey
: review+
Details
(deleted), application/x-xpinstall
daleharvey
: feedback+
Details
(deleted), application/x-xpinstall
daleharvey
: feedback+
Details
Will post the crash log in a second
So I am working on b2g-installer, the flashing step was failing for me, now I changed the code to addEventListener("load", function load() { removeEventListener("load", load, false); Devices.emit("fastboot-start-polling"); and OSX gives me a crash dialog for Firefox (although the running instance seems to work fine), if I dismiss the dialog it shows up again seconds later until I quit firefox
taking a stab in the dark that this is because of a space in the binary name? $ /Users/daleharvey/Library/Application Support/Firefox/Profiles/enc2qivd.test/extensions/adbhelper@mozilla.org/mac64/fastboot devices -bash: /Users/daleharvey/Library/Application: No such file or directory
Heh I tried let callPayload = { command: '"' + binary + '"', name:"NS_ERROR_FILE_UNRECOGNIZED_PATH"
and just a note, if I comment out |subprocess.call(callPayload)| I get no crash
(In reply to Dale Harvey (:daleharvey) from comment #2) > taking a stab in the dark that this is because of a space in the binary name? > > $ /Users/daleharvey/Library/Application > Support/Firefox/Profiles/enc2qivd.test/extensions/adbhelper@mozilla.org/ > mac64/fastboot devices > -bash: /Users/daleharvey/Library/Application: No such file or directory That was working for sure.
Yup, its not a space issue, I tested by moving it to somewhere else without a space and still crashed Also from advice from ochameau I tried environment = ["TMPDIR=" + Services.dirsvc.get("TmpD", Ci.nsIFile).path]; and still crashing
No problem on current nightly on Linux.
So I just tried subprocess.call({ command: '/Users/daleharvey/hello', stdout: function(data) { console.log(data); }, done: function() { console.log('DONE!'); } }); Where hello is #!/bin/bash echo "Hello" and got the same crash
Hey Alexandre, just pinging in case you know anyone familiar with subprocess who may have an idea about this? Cheers
Flags: needinfo?(poirot.alex)
You may try to reach Patrick, the most "recent" contributor of this code. I took the code from this repo: http://hg.mozilla.org/ipccode/
Flags: needinfo?(poirot.alex)
Just another data point, tried running the code on a completely different osx machine, same failure
So Alexandre suggested trying older builds Nightly - 42.0a1 (2015-07-05) No crash, nothing works, it sits in a loop calling fastboot
And getting a new error now on nightly ReferenceError: fastboot is not defined: onFastbootStart@resource://gre/modules/commonjs/toolkit/loader.js -> resource://adbhelperatmozilla.org/main.js:72:3 EventEmitter_emit@resource://gre/modules/devtools/event-emitter.js:147:11 flashStep@chrome://b2g-installer/content/about.js:905:3 step/<@chrome://b2g-installer/content/about.js:1010:12 promise callback*install@chrome://b2g-installer/content/about.js:1051:3 EventListener.handleEvent*load/<@chrome://b2g-installer/content/about.js:1127:1 promise callback*load@chrome://b2g-installer/content/about.js:1121:3 EventListener.handleEvent*@chrome://b2g-installer/content/about.js:1118:1
Attached image Screen Shot 2015-08-12 at 12.16.02.png (deleted) —
Tested this morning on a brand new Mac OSX 10.10.4. No issue with current Nightly :(
Hey Myk, Janx mhenretty said one of you may be able to help out with this, its taken me 2 weeks of debugging, I am getting various errors and it looks like Alexandre isnt able to reproduce. Basically any attempt to use the fastboot part of adb-helper is broken for me on mac osx, usually with a crash in subprocess.js http://junk.arandomurl.com/ has the addons, to see the bug if you are using macos, install the plugin, visit about:b2g-installer and plugin an aries device and attempt to flash it.
Flags: needinfo?(myk)
Flags: needinfo?(janx)
Im also going wip another machine and try upgrading (on 10.9.5 here)
So I completely wiped my machine, it was clean install of 10.10.4 on the machine with nothing else except todays firefox nightly and I got the same crash
Oh and an easier and faster way to attempt to reproduce the crash as Alexandre pointed out, you dont need to try and flash the phone, just open a console and type |Devices.emit('fastboot-start-polling');|
Attached file PR (deleted) —
So after some debugging with Vosky, he ran into similar issues of Fastboot not working as expected but on Linux. The attached PR did make fastboot work on his side, so we would be facing some nasty race between fastboot polling and fastboot polling stop. Let's hope it fixes for you too.
Assignee: nobody → lissyx+mozillians
Status: NEW → ASSIGNED
Attachment #8648476 - Flags: review?(dale)
Hopefully Alex's pull request also fixes the OSX polling-start issue. I don't currently have access to an OSX machine to verify.
Flags: needinfo?(janx)
Comment on attachment 8648476 [details] PR This doesnt fix my specific issue, still reproducing the crash, since I can get the crash by pretty much any subprocess call I am not sure its a race. However looks like a good change to make specially if its fixing another race.
Attachment #8648476 - Flags: review?(dale) → review+
Attached image screenshot of crash dialog (deleted) —
(In reply to Dale Harvey (:daleharvey) from comment #19) > Oh and an easier and faster way to attempt to reproduce the crash as > Alexandre pointed out, you dont need to try and flash the phone, just open a > console and type |Devices.emit('fastboot-start-polling');| I don't get an immediate crash (Mac OS X 10.10.5) after evaluating that code, but a minute or so later I get this dialog that "firefox" has crashed (although the browser itself doesn't crash, nor does it tell me that the about:b2g-installer tab has crashed). Is this the crash in question? (In reply to Dale Harvey (:daleharvey) from comment #0) > Will post the crash log in a second Do you have that available? I have one from the OS Problem Reporter, but it'd be useful to see one from the Firefox Crash Reporter, and it didn't come up for me.
Flags: needinfo?(myk)
Attached file crash from bug 1199513 (deleted) —
Comment on attachment 8654491 [details] crash from bug 1199513 Could we be exposing a bug in JS interpreter?
Attachment #8654491 - Flags: feedback?(nicolas.b.pierron)
(In reply to Alexandre LISSY :gerard-majax from comment #27) > Comment on attachment 8654491 [details] > crash from bug 1199513 > > Could we be exposing a bug in JS interpreter? A Bug in the JS interpreter sounds unlikely, the backtrace reports a problem under ObjectGroup::allocationSiteGroup (cc-ing bhackett in case he has any idea). Also the stack size is of 55MB (what?) which sounds wrong, but you might want to see if bumping the default stack size in js/xpconnect/src/XPCJSRuntime.cpp can help. Based on the stack pointer I do not think this would be the issue. Also, are the 3 switches of compartment justified?
(In reply to Nicolas B. Pierron [:nbp] from comment #28) > (In reply to Alexandre LISSY :gerard-majax from comment #27) > > Comment on attachment 8654491 [details] > > crash from bug 1199513 > > > > Could we be exposing a bug in JS interpreter? > > A Bug in the JS interpreter sounds unlikely, the backtrace reports a problem > under ObjectGroup::allocationSiteGroup (cc-ing bhackett in case he has any > idea). > > Also the stack size is of 55MB (what?) which sounds wrong, but you might > want to see if bumping the default stack size in > js/xpconnect/src/XPCJSRuntime.cpp can help. Based on the stack pointer I > do not think this would be the issue. > > Also, are the 3 switches of compartment justified? I think so: fastboot lives in the ADB Helper addon, this is being registered against Gecko's Devices.jsm and we are making use of this from B2G Installer addon code, from the "about:" page.
(In reply to Jan Keromnes [:janx] from comment #21) > Hopefully Alex's pull request also fixes the OSX polling-start issue. I > don't currently have access to an OSX machine to verify. I have one but I don't reproduce the issue ... Dale and Naoki do reproduce it constantly. We checked and Dale and myself have the same OSX, Nightly, and addons versions ... We are obviously missing something but no idea what, and this is really blocking us now.
Flags: needinfo?(janx)
Ryan, Alex can't reproduce this issue on his machine. Could you please see if you can?
Flags: needinfo?(janx) → needinfo?(jryans)
(In reply to Alexandre LISSY :gerard-majax from comment #27) > Comment on attachment 8654491 [details] > crash from bug 1199513 > > Could we be exposing a bug in JS interpreter? Naveed, we need some JS help here.
Flags: needinfo?(nihsanullah)
Oh nbp is already helping out here.
Flags: needinfo?(nihsanullah)
I tested Nightly (2015-09-03) on OS X 10.10.5 with ADB Helper 0.8.0 and B2G Installer 7ca68c1f5ddc05dd70630553b319feb973a4238b. Opening about:b2g-installer worked fine, and it detected my device successfully. Are there other steps I need to try to repro the crash?
Flags: needinfo?(jryans)
It wont trigger until fastboot polling has started, so opening / detecting the device expected to work, if you run |Devices.emit('fastboot-start-polling');| at the shell it should hopefully trigger it
(In reply to Dale Harvey (:daleharvey) from comment #35) > It wont trigger until fastboot polling has started, so opening / detecting > the device expected to work, if you run > |Devices.emit('fastboot-start-polling');| at the shell it should hopefully > trigger it For me, running this had no effect, no crash either.
(In reply to J. Ryan Stinnett [:jryans] (use ni?) from comment #36) > (In reply to Dale Harvey (:daleharvey) from comment #35) > > It wont trigger until fastboot polling has started, so opening / detecting > > the device expected to work, if you run > > |Devices.emit('fastboot-start-polling');| at the shell it should hopefully > > trigger it > > For me, running this had no effect, no crash either. So I just happened to reproduce. The trick was to ... let the fastboot polling run for enough time. After a few minutes, I start to get crash notifications on a regular basis.
Flags: needinfo?(jryans)
Attached file adbhelper-0.8.1pre-mac64.xpi (obsolete) (deleted) —
Can you try this addon? Remove the ADB Helper from about:addons, then install this one. You will get debug in JS Console (-jsconsole). The STR to produce attachment 8658178 [details]: - plug a device in fastboot - open about:b2g-installer - issue Devices.emit("fastboot-start-polling");
Flags: needinfo?(dale)
(In reply to Alexandre LISSY :gerard-majax from comment #37) > (In reply to J. Ryan Stinnett [:jryans] (use ni?) from comment #36) > > (In reply to Dale Harvey (:daleharvey) from comment #35) > > > It wont trigger until fastboot polling has started, so opening / detecting > > > the device expected to work, if you run > > > |Devices.emit('fastboot-start-polling');| at the shell it should hopefully > > > trigger it > > > > For me, running this had no effect, no crash either. > > So I just happened to reproduce. The trick was to ... let the fastboot > polling run for enough time. After a few minutes, I start to get crash > notifications on a regular basis. I tried for a few minutes, but I never saw a crash.
Flags: needinfo?(jryans)
Patrick, any idea ? This is not dependant on the binary itself at least.
Flags: needinfo?(patrick)
So dale retested: - crashing seems to occur at the end of the subprocess.call(), since we have |done| called but we don't have the |fastboot devices| output - using /bin/ls instead of the fastboot binary does the same
I'm not sure what you try to do, but looking at the code, you might want to wait for the subprocess to terminate before you do anything else? let p= subprocess.call({ command: '/Users/daleharvey/hello', stdout: function(data) { console.log(data); }, done: function() { console.log('DONE!'); } }); p.wait() In an case, looking at the stack trace, Firefox seems to crash at je_malloc (i.e. when allocating memory). The only functions that subprocess.jsm would call from libc are: read, write, poll, posix_spawn*, execve, dup2, close, exit, waitpid, kill. Everything else is purely JavaScript. If you point me to the code, I'm happy to look at it, but I'm quite convinced that this is a Firefox platform issue.
Flags: needinfo?(patrick)
(In reply to Patrick Brunschwig from comment #43) > I'm not sure what you try to do, but looking at the code, you might want to > wait for the subprocess to terminate before you do anything else? > > let p= subprocess.call({ > command: '/Users/daleharvey/hello', > stdout: function(data) { > console.log(data); > }, > done: function() { > console.log('DONE!'); > } > }); > > p.wait() > Well this is something we can try, but I thought that |subprocess.call()| would be able to run async ? > > In an case, looking at the stack trace, Firefox seems to crash at je_malloc > (i.e. when allocating memory). Yep. People I talked with are suspecting memory corruption being the reason for now. > > The only functions that subprocess.jsm would call from libc are: read, > write, poll, posix_spawn*, execve, dup2, close, exit, waitpid, kill. > Everything else is purely JavaScript. Well JS with ctypes. And I know we can have crashes, like in bug 1163956. > > If you point me to the code, I'm happy to look at it, but I'm quite > convinced that this is a Firefox platform issue. So the code lives in several places: - subprocess.call() is done from https://github.com/mozilla/adbhelper/blob/b4a6f3e78bc2fa5d6b7b2894ba709c7ebce016e1/fastboot.js#L102 - and it is triggered via https://github.com/mozilla-b2g/b2g-installer/blob/7ca68c1f5ddc05dd70630553b319feb973a4238b/about.js#L928 Right now I'm mostly stuck on where to analyze. If you have any idea what I should check for in subprocess that would already be a great step forward.
Right, so this code is also crashing: > return new Promise((resolve, reject) => { > [...] > let p = subprocess.call(callPayload); > p.wait(); > }):
Flags: needinfo?(patrick)
Increasing BufferSize in subprocess_worker_unix.js from 1024 to 1024*1024 reduces greatly the frequency of occurrence of the issue. Reducing it from 1024 to 32 makes it much more frequent.
That's an interesting observation. Increasing the BufferSize leads to less cycles in the for(...) loop. Given that it also crashes using /bin/ls, I'd assume that the crash happens in the readPipe() function in subprocess_worker_unix.js, more specifically around here: var line = new ReadBuffer(); // equals to ctypes.uint8_t.array(BufferSize); At this moment, we allocate BufferSize bytes from memory, and I believe this is what we see in the crash (jsmalloc). This seems to point to an error in the js-ctypes implementation. HTH
Flags: needinfo?(patrick)
Clearing needinfo since we talked about this on IRC, testing out the new version now
Flags: needinfo?(dale)
adding a couple of Mac people looking for help with this strange Mac crasher.
Attached file adbhelper-0.8.1pre-mac64.xpi (obsolete) (deleted) —
Attachment #8658195 - Attachment is obsolete: true
Flags: needinfo?(nhirata.bugzilla)
Flags: needinfo?(dale)
I am still getting the crash with the above attached :( https://gist.github.com/daleharvey/ce86892b8519a1b6d5da
Flags: needinfo?(dale)
(In reply to comment #49) Looking at the crash stack from bug 1199513, this more properly requires experts in jemalloc and/or JIT. And in any case I'm retiring at the end of this month, so I won't have time to work on this bug :-)
Gecko build from tree with and without jemalloc do not expose the issue ... But nightly from Mozilla does ...
Attached file adbhelper-0.8.1pre-mac64.xpi (obsolete) (deleted) —
with updated subprocess code from gecko
Attachment #8660442 - Attachment is obsolete: true
I am running attachment 8662937 [details] since a very long time now and cannot replicate the issue anymore. This is only diffenring by copying "new" (as of 2014) subprocess.js and subprocess_worker_unix.js from Gecko. It is creating process with |posix_spawn()| instead of |fork()+execve()|. It also has a couple of changes regarding fd's and some jsctypes types. If it's enough to fix the issue, I'll be happy ... Can you test on your side to confirm? Thanks!
Flags: needinfo?(nhirata.bugzilla) → needinfo?(dale)
(In reply to Alexandre LISSY :gerard-majax from comment #55) > I am running attachment 8662937 [details] since a very long time now and > cannot replicate the issue anymore. This is only diffenring by copying "new" > (as of 2014) subprocess.js and subprocess_worker_unix.js from Gecko. > > It is creating process with |posix_spawn()| instead of |fork()+execve()|. It > also has a couple of changes regarding fd's and some jsctypes types. If it's > enough to fix the issue, I'll be happy ... > > Can you test on your side to confirm? Thanks! If this fixes it, we should probably remove subprocess from ADB Helper and just use the SDK version from Gecko.
wooot, ran this locally and didnt see any crash :) r++++
Flags: needinfo?(dale)
(In reply to J. Ryan Stinnett [:jryans] (use ni?) from comment #56) > (In reply to Alexandre LISSY :gerard-majax from comment #55) > > I am running attachment 8662937 [details] since a very long time now and > > cannot replicate the issue anymore. This is only diffenring by copying "new" > > (as of 2014) subprocess.js and subprocess_worker_unix.js from Gecko. > > > > It is creating process with |posix_spawn()| instead of |fork()+execve()|. It > > also has a couple of changes regarding fd's and some jsctypes types. If it's > > enough to fix the issue, I'll be happy ... > > > > Can you test on your side to confirm? Thanks! > > If this fixes it, we should probably remove subprocess from ADB Helper and > just use the SDK version from Gecko. Yeah, but for now I would prefer just updating the copy, less risks.
Attachment #8662937 - Attachment is obsolete: true
Attached file adbhelper-0.8.1pre-mac64.xpi (obsolete) (deleted) —
Attachment #8663372 - Flags: feedback?(dale)
Attached file b2g-installer-0.3-mac64.xpi (obsolete) (deleted) —
Attachment #8663373 - Flags: feedback?(dale)
Attached file ADB Helper PR (deleted) —
Attachment #8663374 - Flags: review?(poirot.alex)
Attached file B2G Installer PR (deleted) —
Attachment #8663375 - Flags: review?(dale)
Attached file b2g-installer-0.3-mac64.xpi (deleted) —
Attachment #8663373 - Attachment is obsolete: true
Attachment #8663373 - Flags: feedback?(dale)
Attachment #8663376 - Flags: feedback?(dale)
Comment on attachment 8663375 [details] B2G Installer PR Ran through the full flash procedure with no crashes, awesome job
Attachment #8663375 - Flags: review?(dale) → review+
Comment on attachment 8663376 [details] b2g-installer-0.3-mac64.xpi Tested it and runs great, thanks
Attachment #8663376 - Flags: feedback?(dale) → feedback+
Comment on attachment 8663372 [details] adbhelper-0.8.1pre-mac64.xpi wooot
Attachment #8663372 - Flags: feedback?(dale) → feedback+
Comment on attachment 8663374 [details] ADB Helper PR Please remove subprocess from addons and fetch it via the following require instead: const Subprocess = require("sdk/system/child_process/subprocess"); Thanks!
Attachment #8663374 - Flags: review?(poirot.alex)
Attached file adbhelper-0.8.1pre-mac64.xpi (deleted) —
Removed and using subprocess from SDK. Looks ok on the Mac next to me.
Attachment #8663372 - Attachment is obsolete: true
Attachment #8663654 - Flags: feedback?(dale)
Attachment #8654491 - Flags: feedback?(nicolas.b.pierron)
Comment on attachment 8663654 [details] adbhelper-0.8.1pre-mac64.xpi Awesome, had problems rolling back but this works good for me, cheers
Attachment #8663654 - Flags: feedback?(dale) → feedback+
Comment on attachment 8663374 [details] ADB Helper PR Updated PR to just drop the subprocess copy. Works well on my side and on Dale's one.
Attachment #8663374 - Flags: review?(poirot.alex)
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: