Closed
Bug 1182197
Opened 9 years ago
Closed 9 years ago
crash in mozilla::dom::Promise::MaybeSettle or mozilla::dom::Promise::Settle
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
RESOLVED
FIXED
mozilla45
People
(Reporter: kairo, Assigned: bzbarsky)
References
Details
(Keywords: crash, topcrash-win)
Crash Data
Attachments
(7 files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
away
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
nsm
:
review+
ritu
:
approval-mozilla-release+
|
Details | Diff | Splinter Review |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
mccr8
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
lizzard
:
approval-mozilla-aurora+
lizzard
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
[Tracking Requested - why for this release]:
This bug was filed from the Socorro interface and is
report bp-7cf1e3e2-1448-46e0-99ce-e2b282150709.
=============================================================
Top Frames:
0 xul.dll mozilla::dom::Promise::MaybeSettle(JS::Handle<JS::Value>, mozilla::dom::Promise::PromiseState) dom/promise/Promise.cpp
1 xul.dll mozilla::dom::Promise::ResolveInternal(JSContext*, JS::Handle<JS::Value>) dom/promise/Promise.cpp
2 xul.dll mozilla::dom::Promise::JSCallback(JSContext*, unsigned int, JS::Value*) dom/promise/Promise.cpp
3 xul.dll js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) js/src/vm/Interpreter.cpp
4 xul.dll js::DirectProxyHandler::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) js/src/proxy/DirectProxyHandler.cpp
5 xul.dll js::CrossCompartmentWrapper::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) js/src/proxy/CrossCompartmentWrapper.cpp
6 xul.dll js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) js/src/vm/Interpreter.cpp
7 xul.dll Interpret js/src/vm/Interpreter.cpp
[...]
It looks like all of those have an address of 0x4, so this seems to be a near-null deref.
Those crashes are new to 40, and are 0.9% of overall 40.0b2 data.
Reporter | ||
Updated•9 years ago
|
Crash Signature: [@ mozilla::dom::Promise::MaybeSettle(JS::Handle<T>, mozilla::dom::Promise::PromiseState)] → [@ mozilla::dom::Promise::MaybeSettle(JS::Handle<T>, mozilla::dom::Promise::PromiseState)]
[@ mozilla::dom::Promise::MaybeSettle]
Flags: needinfo?(kairo)
Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(kairo)
Reporter | ||
Comment 2•9 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #1)
> Dup of bug 1178271?
If MaybeSettle and Settle are the same, then it might very well be - but the signature here has more volume in the wild.
Flags: needinfo?(kairo)
Assignee | ||
Comment 3•9 years ago
|
||
Here's the full implementation of MaybeSettle:
1383 Promise::MaybeSettle(JS::Handle<JS::Value> aValue,
1384 PromiseState aState)
1385 {
1386 // Promise.all() or Promise.race() implementations will repeatedly call
1387 // Resolve/RejectInternal rather than using the Maybe... forms. Stop SetState
1388 // from asserting.
1389 if (mState != Pending) {
1390 return;
1391 }
1392
1393 Settle(aValue, aState);
1394 }
so your signature just depends on what inlining happened.
Since bug 1178271 is the one with more information, marking dependent.
Depends on: 1178271
This was mentioned in the FF40 Beta stability issues in today's channel meeting.
Boris, KaiRo mentioned that we are hitting this crash on Beta channel at 1% which is still pretty high. Can you please help find an owner for this bug? I don't see much activity on bug 1178271 either and therefore requesting your help. Thanks!
Flags: needinfo?(bzbarsky)
Assignee | ||
Comment 6•9 years ago
|
||
It's really not clear what's going on here, unfortunately. I don't know who can own this because it's not clear where things are going wrong.
It's possibly our memory is just corrupt. It's possible there are bugs in the GC or CC...
Do we have any useful data on when this started? On whether it's correlated with anything?
Flags: needinfo?(bzbarsky)
On aurora, this was first seen in 20150515091916. The crash volume was pretty low at the time, so the issue may have been introduced a build or two before then. Here's a possible regression range: https://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=d14cf14709f4&tochange=b03c5f1a77a1 It's mostly Pocket changes.
On nightly this was first seen in 20150430030201 but the daily volume was even lower, so it's really hard to say.
If there's no obvious fix in sight, can you think of any RELEASE_ASSERTs or other diagnostics that might help point to the problem?
Assignee | ||
Comment 8•9 years ago
|
||
One thing that might be handy is the JS stack when we're resolving this cced promise. That would give us some idea of what code is involved that's keeping it around after it's been CCed, at least....
This stack would presumably contain URIs in it, though ideally only chrome:// ones, so I'm not sure what our policies are around collecting such a thing.
Comment 9•9 years ago
|
||
This bug is now wontfix for 40.
I see crash reports for every beta in the 40 cycle except beta8 and beta9. (beta9 was only released today.) Did we perhaps fix this in beta8?
https://crash-stats.mozilla.com/report/list?product=Firefox&signature=mozilla%3A%3Adom%3A%3APromise%3A%3AMaybeSettle%28JS%3A%3AHandle%3CT%3E%2C+mozilla%3A%3Adom%3A%3APromise%3A%3APromiseState%29
Comment 10•9 years ago
|
||
I experienced this crash (cfce7fcf-f66e-470f-9515-dc0dd2150802).
I was seeing black drawing over the UI & Page. Scrolling and such would resolve it, but eventually it got too frustrating and I went to shut down the browser at which point it crashed.
David, Boris, I am trying to follow up on 41 tracked bugs. This crash is showing up most predominantly on 40.0.2 followed up by 41.0b2. Do we have additional crash dumps that could be investigated to help root cause it further? I know you mentioned a regression range in comment 7 and I am wondering whether we may want to try out a speculative fix?
Flags: needinfo?(dmajor)
Flags: needinfo?(bzbarsky)
Comment 12•9 years ago
|
||
I defer to bz; I can't really speak to the next steps here without knowing the code.
Flags: needinfo?(dmajor)
Assignee | ||
Comment 13•9 years ago
|
||
We have no idea for a speculative fix so far. It _looks_ like someone is holding on to a Promise after unlink and then exposing it to JS, but I have no idea what would be doing that, exactly...
We could try to modify the Promise code to gather the information I suggest in comment 8, as long as we're OK with shipping that information out in crash dumps...
Flags: needinfo?(bzbarsky)
Reporter | ||
Comment 14•9 years ago
|
||
I personally would be very happy if we put something in there that makes the crashes easier to investigate. It sounds like this could happen more often with Promises getting used more and more.
Comment 15•9 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #8)
> One thing that might be handy is the JS stack when we're resolving this cced
> promise. That would give us some idea of what code is involved that's
> keeping it around after it's been CCed, at least....
>
> This stack would presumably contain URIs in it, though ideally only
> chrome:// ones, so I'm not sure what our policies are around collecting such
> a thing.
Flagging Benjamin to confirm, but AFAIK this is fine since all fields are restricted-access by default.
Flags: needinfo?(benjamin)
Comment 16•9 years ago
|
||
If somebody can guarantee that the stacks involved only contain chrome frames and never content frames, then this is absolutely fine (and we could probably expose this publicly).
If it's possible that this contains content frames, we need to talk more, since then it would probably needs to be controlled by the submit-URLs submission checkbox.
Flags: needinfo?(benjamin)
Assignee | ||
Comment 17•9 years ago
|
||
We can't guarantee that they only contain chrome frames, sadly. At least not with a simple patch.
Comment 18•9 years ago
|
||
Can we redact non-chrome frames?
Assignee | ||
Comment 19•9 years ago
|
||
Yes, with a not-simple patch. We'd have to figure out how to define "chrome frame" for our purposes and then find them in there...
Comment 20•9 years ago
|
||
So, for purposes of deciding between "redact non-chrome frames" and "make this annotation respect the submit-URL checkbox":
How often do you expect to see non-chrome frames? How much would your debugging be hindered by not seeing them? Would you be willing to go without them in exchange for a higher rate of user submissions? How much work is the not-simple patch?
Comment 21•9 years ago
|
||
Are there any common URLs for the pages these are crashing on? Maybe we can reproduce something locally.
Comment 22•9 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #21)
> Are there any common URLs for the pages these are crashing on? Maybe we can
> reproduce something locally.
It's basically a typical distribution of web usage in general (facebook, gmail, youtube, twitter).
Assignee | ||
Comment 23•9 years ago
|
||
> How often do you expect to see non-chrome frames?
I don't know. We _think_ this is correlated to Promise usage in Pocket, but maybe something else can trigger it too... We just don't have much info.
> How much would your debugging be hindered by not seeing them?
Probably not much.
> Would you be willing to go without them in exchange for a higher rate of user
> submissions?
I think that as long as we get _one_ user submission here we'll likely have as much data as we can get from multiple such submissions..
> How much work is the not-simple patch?
My estimate is that to do it right is 1-2 days of work. The simple patch is probably closer to 1-2 hours.
Assignee | ||
Comment 24•9 years ago
|
||
David, can we just get this into crash reports and see how things look? I'm not sure what's involved in getting the XPCOM stackString into a crash report, exactly...
Flags: needinfo?(dmajor)
Updated•9 years ago
|
Crash Signature: [@ mozilla::dom::Promise::MaybeSettle(JS::Handle<T>, mozilla::dom::Promise::PromiseState)]
[@ mozilla::dom::Promise::MaybeSettle] → [@ mozilla::dom::Promise::MaybeSettle(JS::Handle<T>, mozilla::dom::Promise::PromiseState)]
[@ mozilla::dom::Promise::MaybeSettle]
[@ mozilla::dom::Promise::Settle(JS::Handle<T>, mozilla::dom::Promise::PromiseState)]
[@ mozilla::dom::Promise::Settle]
Updated•9 years ago
|
Flags: needinfo?(continuation)
Comment 26•9 years ago
|
||
#ifdef MOZ_CRASHREPORTER
#include "nsExceptionHandler.h"
#endif
#ifdef MOZ_CRASHREPORTER
CrashReporter::AnnotateCrashReport(NS_LITERAL_CSTRING("xpcom_stack"), ...)
#endif
Flags: needinfo?(dmajor)
Assignee | ||
Comment 27•9 years ago
|
||
Attachment #8658358 -
Flags: review?(nsm.nikhil)
Attachment #8658358 -
Flags: review?(dmajor)
Attachment #8658358 -
Flags: review?(nsm.nikhil) → review+
Comment 28•9 years ago
|
||
Comment on attachment 8658358 [details] [diff] [review]
investigation patch. Log the stack at promise fulfillment on a CCed promise into the crash reporter data
Review of attachment 8658358 [details] [diff] [review]:
-----------------------------------------------------------------
The double conversion of the string is kind of unfortunate but since we're about to crash anyway, it seems fine.
I don't really know how those various JS objects on the stack work (I'll leave that to nsm). Is there any chance of the callees doing a null pointer dereference or some such, given that we're already in a weird state?
Attachment #8658358 -
Flags: review?(nsm.nikhil)
Attachment #8658358 -
Flags: review?(dmajor)
Attachment #8658358 -
Flags: review+
Comment 29•9 years ago
|
||
That was meant to be an r+ but apparently I mid-aired?
Assignee | ||
Comment 30•9 years ago
|
||
> The double conversion of the string is kind of unfortunate
Yeah... One of those conversions is just a straight copy in the common case, though. And yes, this is not exactly hot code!
> Is there any chance of the callees doing a null pointer dereference or some such, given
> that we're already in a weird state?
Chance, maybe. I mean, the whole JS heap could be bonkers. But none of those callees should use the promise or any of its members other than the stack member, at least...
Comment 31•9 years ago
|
||
Would it be worthwhile to change the null check of mGlobal in the ctor to be a RELEASE_ASSERT?
Also I noticed the regression range sort of lines up with bug 1058695, but I'm not sure how that would lead to this, so that's probably a red herring.
When we do these various PromiseCallback Calls, is mGlobal being unmark grayed? I forget how automatically that happens nowadays.
Comment 32•9 years ago
|
||
Assignee | ||
Comment 33•9 years ago
|
||
I assume we'll get some data from this on just nightly, right? So no need to uplift anywhere?
> Would it be worthwhile to change the null check of mGlobal in the ctor to be a
> RELEASE_ASSERT?
You mean the MOZ_ASSERT? The PRomise ctor is protected; the only caller should be Promise::Create. Promise::Create then immediately calls CreateWrapper, which tries to Init an AutoJSAPI with the global. If it were null, the Init() call would return false, CreateWrapper would throw on the ErrorResult, and Create() would return null and let the Promise just go away. So I'm not sure it's all that worthwhile.
> Also I noticed the regression range sort of lines up with bug 1058695
Well... That's interesting. That bug is what added the dereference of mGlobal to MaybeSettle, no? Before that, MaybeSettle didn't touch mGlobal. So the bug with mGlobal being null was preexisting, I bet, but the crash was in fact introduced by bug 1058695.
> When we do these various PromiseCallback Calls, is mGlobal being unmark grayed?
mGlobal is an nsIGlobalObject pointer, not a JS thing.
Blocks: 1058695
Whiteboard: [leave open]
Comment 34•9 years ago
|
||
I just noticed: should we get the stack prior to doing the reject, for symmetry?
Assignee | ||
Comment 35•9 years ago
|
||
We could, but we're not seeing crashes with the reject on the stack, right? Since I assume the testing patch will just get backed out once we have the data we want, I didn't worry too much about it.
Comment 36•9 years ago
|
||
Comment on attachment 8658358 [details] [diff] [review]
investigation patch. Log the stack at promise fulfillment on a CCed promise into the crash reporter data
Review of attachment 8658358 [details] [diff] [review]:
-----------------------------------------------------------------
I already r+ed
Attachment #8658358 -
Flags: review?(nsm.nikhil)
Comment 38•9 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #33)
> > When we do these various PromiseCallback Calls, is mGlobal being unmark grayed?
>
> mGlobal is an nsIGlobalObject pointer, not a JS thing.
Sorry, I meant the mGlobal fields of the PromiseCallback subclasses (WrapperPromiseCallback, ResolvePromiseCallback, RejectPromiseCallback) which have type JS::Heap<JSObject*>. I was wondering because I see WrapperPromiseCallback::Call() in many (but not all) of these crash stacks, but maybe that's just always the way we often invoke these promises.
Some promises are triggered from the observer service. Like this one:
https://crash-stats.mozilla.com/report/index/c72a5c7d-ec67-4046-9c48-bdf702150908
Note that this is spinning the event loop while we're notifying observers. It would be nice to know what is triggering those.
Some are being triggered from the storage CompletionNotifier:
https://crash-stats.mozilla.com/report/index/45c8ec4d-5a2c-4c8f-b8cc-0047c2150907
It doesn't seem too likely that there's something particular about the promise that is triggering this, but who knows. It does seem worth noting that this all appears to be chrome JS.
Comment 39•9 years ago
|
||
dmajor, could you look into the crash dumps for a few of these stacks? It would be nice to know what is triggering the observer service in these stacks:
https://crash-stats.mozilla.com/report/index/dce65f8b-18f7-4cba-abd5-1f0bc2150908
https://crash-stats.mozilla.com/report/index/c72a5c7d-ec67-4046-9c48-bdf702150908
https://crash-stats.mozilla.com/report/index/a48c08d2-9f0b-42ea-b584-2a2042150908
Also, it would be nice to get some more frames for these crashes that have CompletionNotifier near the end to see what is spinning the event loop:
https://crash-stats.mozilla.com/report/index/41a4f21f-9b63-4fbf-b81a-037c92150908
https://crash-stats.mozilla.com/report/index/81040734-b10d-448f-aa4c-f480b2150908
https://crash-stats.mozilla.com/report/index/36b1d90b-8f73-4765-bf4d-048b12150908
Thanks.
Flags: needinfo?(dmajor)
Assignee | ||
Comment 40•9 years ago
|
||
> Sorry, I meant the mGlobal fields of the PromiseCallback subclasses
Ah. Those might want to unmarkgray mGlobal in Call, indeed...
> but maybe that's just always the way we often invoke these promises.
Right. A promise chain will end up with those PromiseCallback things in the callstack.
Comment 41•9 years ago
|
||
dmajor looked at one of these observer crashes and the thing that was triggering the observer service was like this:
33 0022f908 515dca1e xul!nsObserverList::NotifyObservers+0x3e
34 0022f928 51715991 xul!nsObserverService::NotifyObservers+0xc6
35 0022f958 5146c4d2 xul!nsXREDirProvider::DoShutdown+0x72
36 0022f99c 5146837a xul!ScopedXPCOMStartup::~ScopedXPCOMStartup+0x51
So we're in shutdown, but not that late in shutdown.
Comment 42•9 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #39)
> dmajor, could you look into the crash dumps for a few of these stacks? It
> would be nice to know what is triggering the observer service in these
> stacks:
> https://crash-stats.mozilla.com/report/index/dce65f8b-18f7-4cba-abd5-
> 1f0bc2150908
> https://crash-stats.mozilla.com/report/index/c72a5c7d-ec67-4046-9c48-
> bdf702150908
> https://crash-stats.mozilla.com/report/index/a48c08d2-9f0b-42ea-b584-
> 2a2042150908
Those all have the same stack as the one you linked me to on IRC (in other words, comment 41).
>
> Also, it would be nice to get some more frames for these crashes that have
> CompletionNotifier near the end to see what is spinning the event loop:
> https://crash-stats.mozilla.com/report/index/41a4f21f-9b63-4fbf-b81a-
> 037c92150908
> https://crash-stats.mozilla.com/report/index/81040734-b10d-448f-aa4c-
> f480b2150908
> https://crash-stats.mozilla.com/report/index/36b1d90b-8f73-4765-bf4d-
> 048b12150908
>
> Thanks.
Below CompletionNotifier is:
10 0019f7bc 55429daf xul!mozilla::storage::`anonymous namespace'::CompletionNotifier::Run+0x17
11 0019f8cc 553cdc36 xul!nsThread::ProcessNextEvent+0x6e8
12 0019f8fc 553cf0ba xul!mozilla::ipc::MessagePump::Run+0x5f
13 0019f934 553ceac3 xul!MessageLoop::RunHandler+0x20
14 0019f954 554291ad xul!MessageLoop::Run+0x19
15 0019f960 55428cf4 xul!nsBaseAppShell::Run+0x32
16 0019f96c 55498db9 xul!nsAppShell::Run+0x1b
17 0019f97c 552e8a43 xul!nsAppStartup::Run+0x20
18 0019fb60 552e8c6c xul!XREMain::XRE_mainRun+0x4b9
Flags: needinfo?(dmajor)
Comment 43•9 years ago
|
||
(In reply to David Major [:dmajor] from comment #42)
> Those all have the same stack as the one you linked me to on IRC (in other
> words, comment 41).
Thanks, I'll try looking through things that listen for that shutdown stuff and see what might be triggering JS.
> Below CompletionNotifier is:
Thanks. It is good to know this is not nested event loop spinning.
Reporter | ||
Updated•9 years ago
|
Summary: crash in mozilla::dom::Promise::MaybeSettle(JS::Handle<T>, mozilla::dom::Promise::PromiseState) → crash in mozilla::dom::Promise::MaybeSettle or mozilla::dom::Promise::Settle
Hoping that the investigative patch in Nightly will help us find a possible cause and fix for 42. It is too late for 41.
Assignee | ||
Comment 45•9 years ago
|
||
So I was thinking about this a bit more... given that this is fallout from bug 1058695, we could in fact consider adding a null-check for mGlobal and treat null as IsDying(). That might be ok for 41 if there's still time for that; it's certainly better than crashing...
Flags: needinfo?(rkothari)
Boris, given that this crash is in the top 10 for 41 at 1.17%, we might be able to consider taking a fix in 41RC, depending on the complexity and risk associated. Would you be able to nominate a patch for uplift today? We gtb 41 RC Monday am PST.
Flags: needinfo?(rkothari) → needinfo?(bzbarsky)
Assignee | ||
Comment 47•9 years ago
|
||
I _think_ this is pretty safe, but it could use a second set of eyes.
Also, I'm out on Monday so if someone is going to uplift it then it won't be me...
Flags: needinfo?(bzbarsky)
Attachment #8660010 -
Flags: review?(nsm.nikhil)
Resetting the status because there is a good chance we might taking a likely fix for this in 41RC.
Comment on attachment 8660010 [details] [diff] [review]
Patch would look something like this
Review of attachment 8660010 [details] [diff] [review]:
-----------------------------------------------------------------
This looks fine to me, I only read the bug report, but it definitely doesn't do any more harm. Is the patch supposed to do anything else eventually considering the attachment description?
Attachment #8660010 -
Flags: review?(nsm.nikhil) → review+
Assignee | ||
Comment 50•9 years ago
|
||
No, that's it. I guess I'll try landing that on inbound and if it doesn't blow up the world people should consider uplifting it on Monday...
Comment 51•9 years ago
|
||
Comment 52•9 years ago
|
||
Nikhil, would you be able to nominate this patch for uplift to Beta? We will gtb 41 RC on Monday 9/14 by noon. Thanks!
Flags: needinfo?(nsm.nikhil)
Comment 54•9 years ago
|
||
Comment on attachment 8660010 [details] [diff] [review]
Patch would look something like this
Approval Request Comment
[Feature/regressing bug #]: bug 1058695
[User impact if declined]: crashes
[Describe test coverage new/current, TreeHerder]: at least some of this code runs regularly
[Risks and why]: this should mostly restore existing behavior, but we don't entirely understand what is happening here
[String/UUID change made/needed]: none
Flags: needinfo?(nsm.nikhil)
Attachment #8660010 -
Flags: approval-mozilla-beta?
Comment on attachment 8660010 [details] [diff] [review]
Patch would look something like this
[Triage Comment] This is another fix that we are taking in 41 RC. This fix is safe and low risk.
Attachment #8660010 -
Flags: approval-mozilla-beta? → approval-mozilla-release+
Also landing to beta at Ritu's request: https://hg.mozilla.org/releases/mozilla-beta/rev/7aaeff6d190d
Comment 58•9 years ago
|
||
While investigating something else, I happened to notice that a large number of these crashes are within the "OOM danger zone" for AvailableVirtualMemory.
Could it be related? Maybe high memory means more CC which leads to the unlink problem?
Assignee | ||
Comment 59•9 years ago
|
||
Could be...
Do we by any chance have any reports with the stack instrumentation we added at this point?
Comment 60•9 years ago
|
||
No, and it's going to be difficult, because:
- This crash is pretty rare on nightly to begin with, and
- With the null check workaround, we'll no longer get a crash report for this issue. We'd have to look for cced_promise_stack annotations on machines that go on to crash somewhere else later.
Assignee | ||
Comment 61•9 years ago
|
||
Hmm. I didn't realize this was rare on nightly. Should have pushed the investigation patch to all channels before landing the workaround...
Also, the null check hasn't landed on nightly and aurora, right? So there we would still get crashes (though as you say they might be rare).
Comment 62•9 years ago
|
||
The null-check landed on m-c in comment 52.
Perhaps you could undo the null check before merge day and let the logging go to aurora. We get some single-digit crashes per day there.
Assignee | ||
Comment 63•9 years ago
|
||
> The null-check landed on m-c in comment 52.
So it did. I clearly can't remember what I did.
Undoing the null check on m-c makes sense. I'll do that next time the tree is open.
Assignee | ||
Comment 64•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/28161d563685 backs out the null-check.
Comment 65•9 years ago
|
||
Got some data from aurora 43. Most of the stacks are:
cced_promise_stack PromiseSet.prototype.add/<@resource://gre/modules/AsyncShutdown.jsm:145:9 Handler.prototype.process@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:934:23 this.PromiseWalker.walkerLoop@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:813:7 Promise*this.PromiseWalker.scheduleWalkerLoop@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:744:11 this.PromiseWalker.schedulePromise@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:776:7 this.PromiseWalker.completePromise@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:711:7 Spinner.prototype.observe@resource://gre/modules/AsyncShutdown.jsm:523:9
But there are a couple others:
cced_promise_stack ConnectionData.prototype<._executeStatement/pending<.handleCompletion@resource://gre/modules/Sqlite.jsm:784:13
cced_promise_stack i@http://platform.twitter.com/widgets.js:9:3662 o@http://platform.twitter.com/widgets.js:9:3692 promise callback*o/</<@http://platform.twitter.com/widgets.js:9:3736 o/<@http://platform.twitter.com/widgets.js:9:3709 o@http://platform.twitter.com/widgets.js:9:3599 o@http://platform.twitter.com/widgets.js:9:2844 a@http://platform.twitter.com/widgets.js:9:3032 @http://platform.twitter.com/widgets.js:1:1785 a<@http://platform.twitter.com/widgets.js:8:9002 @http://platform.twitter.com/widgets.js:1:1713 e@http://platform.twitter.com/widgets.js:1:406 @http://platform.twitter.com/widgets.js:1:1379 @http://platform.twitter.com/widgets.js:1:1 @http://platform.twitter.com/widgets.js:1:1
By the way, it looks like we didn't land any changes on 42? If these stacks don't point to an obvious fix, could you land the bandaid on beta for now?
Flags: needinfo?(bzbarsky)
Comment 66•9 years ago
|
||
Bug 1203289 landed on 42. But I guess that did not fix it.
Assignee | ||
Comment 67•9 years ago
|
||
Hmm. The Promise-backend stuff is under a nested event loop spin from an observer notification.... but clearly twitter is not likely to be managing that (modulo sync XHR perhaps).
I'll think a bit about this over the weekend, then probably land the null-check on Monday. :(
Comment 68•9 years ago
|
||
Boris, any news on this?
It seems that the attachment 8660010 [details] [diff] [review] which landed only in 41 mitigated the crash. Most of the crashes are in beta (but still affect release).
Assignee | ||
Comment 69•9 years ago
|
||
So Andrew and I walked through some of this code just now, and we think we see a possible hole in the GC/CC stuff in Promise code. I filed bug 1213391 on it.
I will post a patch there and aim to get that onto Aurora and see whether that nixes the crashes, but in the meantime I have landed the bandaid on beta: https://hg.mozilla.org/releases/mozilla-beta/rev/e96fd3320ab4
Flags: needinfo?(bzbarsky)
Assignee | ||
Comment 70•9 years ago
|
||
The fix for bug 1213391 is now on m-c and aurora; we should see what crashstats says about this crash there....
Flags: needinfo?(dmajor)
Reporter | ||
Comment 71•9 years ago
|
||
bz, dmajor is on PTO, if you are looking for stats stuff, e.g. how crash volume might change, it's better to ni? myself, but this is so rare on Dev Edition and Nightly that those channels won't tell us much, I think only Beta will give us info if this has really been fixed.
Flags: needinfo?(bzbarsky)
Assignee | ||
Comment 72•9 years ago
|
||
OK. I guess we wait for 1213391 to ride the trains to beta, then....
Flags: needinfo?(bzbarsky)
Updated•9 years ago
|
status-firefox43:
--- → ?
status-firefox44:
--- → affected
OS: Windows NT → Windows
Hardware: x86 → All
Version: Trunk → 39 Branch
Comment hidden (off-topic) |
Comment hidden (off-topic) |
Comment hidden (offtopic) |
Comment 76•9 years ago
|
||
As we are going to build 42 rc today, this is too late for this bug. Wontfix.
Assignee | ||
Comment 77•9 years ago
|
||
42 is "fixed" by the change mentioned in comment 69.
The question is whether this crash signature is visible in 43.
Reporter | ||
Comment 78•9 years ago
|
||
[Tracking Requested - why for this release]:
(In reply to Boris Zbarsky [:bz] from comment #77)
> The question is whether this crash signature is visible in 43.
It is. This is 1.8% of 43.0b1 crashes.
tracking-firefox43:
--- → ?
Assignee | ||
Comment 79•9 years ago
|
||
OK, so bug 1213391 didn't fix this. :(
Andrew, should we just permanently add this null-check and give up? :(
Flags: needinfo?(continuation)
Comment 80•9 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #79)
> Andrew, should we just permanently add this null-check and give up? :(
Yeah, it is going to be hard to figure out what is actually going on here. Maybe have an assertion in there with the null check.
Flags: needinfo?(continuation)
Tracking for 43 and 44 as this is still fairly high volume.
tracking-firefox44:
--- → +
Assignee | ||
Comment 82•9 years ago
|
||
OK, let's just do this and then I guess backport to aurora/beta
Attachment #8686282 -
Flags: review?(continuation)
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → bzbarsky
Status: NEW → ASSIGNED
Comment 83•9 years ago
|
||
Comment on attachment 8686282 [details] [diff] [review]
workaround. Add a null-check for mGlobal in Promise::Settle
Review of attachment 8686282 [details] [diff] [review]:
-----------------------------------------------------------------
Can we maybe assert if mGlobal is null? Though maybe any such failures will be intermittent and thus not of use to anybody. It might be nice to throw in a comment about how we don't really expect mGlobal to be null, but it happens in the wild.
Attachment #8686282 -
Flags: review?(continuation) → review+
Assignee | ||
Comment 84•9 years ago
|
||
Comment 85•9 years ago
|
||
Assignee | ||
Comment 86•9 years ago
|
||
Comment on attachment 8686311 [details] [diff] [review]
workaround. Add a null-check for mGlobal in Promise::Settle
Approval Request Comment
[Feature/regressing bug #]: Unknown.
[User impact if declined]: The crashes continue until morale improves.
[Describe test coverage new/current, TreeHerder]: We have yet to reproduce this.
[Risks and why]: Low risk. We've been shipping this for several releases now in
practice.
[String/UUID change made/needed]: None.
Attachment #8686311 -
Flags: approval-mozilla-beta?
Attachment #8686311 -
Flags: approval-mozilla-aurora?
Comment 87•9 years ago
|
||
bugherder |
Comment on attachment 8686311 [details] [diff] [review]
workaround. Add a null-check for mGlobal in Promise::Settle
Possible crash fix, high volume crash, let's uplift it.
Attachment #8686311 -
Flags: approval-mozilla-beta?
Attachment #8686311 -
Flags: approval-mozilla-beta+
Attachment #8686311 -
Flags: approval-mozilla-aurora?
Attachment #8686311 -
Flags: approval-mozilla-aurora+
Updated•9 years ago
|
Keywords: topcrash-win
Assignee | ||
Comment 89•9 years ago
|
||
Fixed, see comment 87.
Reporter | ||
Updated•9 years ago
|
status-firefox45:
--- → fixed
Target Milestone: --- → mozilla45
Comment 90•9 years ago
|
||
bugherder uplift |
Comment 91•9 years ago
|
||
bugherder uplift |
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•