Crash in [@ mozilla::dom::ServiceWorkerRegistrationInfo::MaybeScheduleTimeCheckAndUpdate]
Categories
(Core :: DOM: Service Workers, defect, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox72 | --- | unaffected |
firefox73 | --- | disabled |
firefox74 | --- | fixed |
firefox75 | --- | fixed |
firefox76 | --- | fixed |
People
(Reporter: philipp, Assigned: perry)
References
Details
(Keywords: crash, regression)
Crash Data
Attachments
(3 files, 2 obsolete files)
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
pascalc
:
approval-mozilla-beta+
|
Details |
This bug is for crash report bp-3f7cd4e1-1124-45ef-a3d4-6dddc0200116.
Top 10 frames of crashing thread:
0 xul.dll mozilla::dom::ServiceWorkerRegistrationInfo::MaybeScheduleTimeCheckAndUpdate dom/serviceworkers/ServiceWorkerRegistrationInfo.cpp:460
1 xul.dll std::_Func_impl_no_alloc<`lambda at z:/task_1579054948/build/src/dom/serviceworkers/ServiceWorkerPrivateImpl.cpp:460:7', void, mozilla::dom::ServiceWorkerOpResult&&>::_Do_call
2 xul.dll mozilla::MozPromise<mozilla::dom::ServiceWorkerOpResult, mozilla::ipc::ResponseRejectReason, 1>::ThenValue<`lambda at z:/task_1579054948/build/src/dom/serviceworkers/ServiceWorkerPrivateImpl.cpp:840:7'>::DoResolveOrRejectInternal xpcom/threads/MozPromise.h:793
3 xul.dll mozilla::MozPromise<RefPtr<mozilla::AudioData>, mozilla::MediaResult, 1>::ThenValueBase::ResolveOrRejectRunnable::Run xpcom/threads/MozPromise.h:402
4 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1241
5 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:486
6 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:87
7 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:308
8 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:290
9 xul.dll nsBaseAppShell::Run widget/nsBaseAppShell.cpp:137
another serviceworker-related crash signature that's popping up during the 73.0b cycle.
Comment 1•5 years ago
|
||
This seems to be caused by a ServiceWorkerRegistrationInfo
nullptr. I see several calls to MaybeScheduleTimeCheckAndUpdate
:
dom/serviceworkers/FetchEventOpChild.cpp
165 mRegistration->MaybeScheduleTimeCheckAndUpdate(); // found in mozilla::dom::(anonymous namespace)::SynthesizeResponseWatcher::CancelInterception
470 mRegistration->MaybeScheduleTimeCheckAndUpdate(); // found in mozilla::dom::FetchEventOpChild::Recv__delete__
678 mRegistration->MaybeScheduleTimeCheckAndUpdate(); // found in mozilla::dom::FetchEventOpChild::MaybeScheduleRegistrationUpdate
dom/serviceworkers/ServiceWorkerPrivate.cpp
386 mRegistration->MaybeScheduleTimeCheckAndUpdate(); // found in mozilla::dom::(anonymous namespace)::RegistrationUpdateRunnable::Run
1573 registration->MaybeScheduleTimeCheckAndUpdate(); // found in mozilla::dom::ServiceWorkerPrivate::SendFetchEvent
dom/serviceworkers/ServiceWorkerPrivateImpl.cpp
463 registration->MaybeScheduleTimeCheckAndUpdate(); // found in mozilla::dom::ServiceWorkerPrivateImpl::SendPushEventInternal
466 registration->MaybeScheduleTimeCheckAndUpdate(); // found in mozilla::dom::ServiceWorkerPrivateImpl::SendPushEventInternal
in this case we are inside ServiceWorkerPrivateImpl
. None of these callers checks before, if mRegistration
is nullptr. But I see several points where it is assigned mRegistration = nullptr;
, so probably we need to understand, if we can do anything more meaningful than crash if mRegistration is nullptr (we could crash/assert only in debug builds?).
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 2•5 years ago
|
||
Based on stack frame 2 (or 1, depending on the report) in the crash report
mozilla::MozPromise<mozilla::dom::ServiceWorkerOpResult, mozilla::ipc::ResponseRejectReason, 1>::ThenValue<`lambda at z:/task_1579058349/build/src/dom/serviceworkers/ServiceWorkerPrivateImpl.cpp:840:7'>::DoResolveOrRejectInternal(mozilla::MozPromise<mozilla::dom::ServiceWorkerOpResult, mozilla::ipc::ResponseRejectReason, 1>::ResolveOrRejectValue&)
the null ServiceWorkerRegistrationInfo
is used in ExecServiceWorkerOp
's callback(s) (ServiceWorkerPrivateImpl.cpp:840:7 is ExecServiceWorkerOp
). Out of all calls to ExecServiceWorkerOp
, only SendPushEventInternal
does MaybeScheduleTimeCheckAndUpdate
in the callbacks.
Updated•5 years ago
|
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 3•5 years ago
|
||
Going to add some assertions to narrow down the crash, so marking leave-open.
Assignee | ||
Comment 4•5 years ago
|
||
Attempting to narrow down the crash of a null ServiceWorkerRegistrationInfo.
Assignee | ||
Comment 5•5 years ago
|
||
Many things may go wrong if mOrderedScopes isn't actually ordered. We can put
more of that responsibility on the STL.
Depends on D60326
Updated•5 years ago
|
Assignee | ||
Comment 7•5 years ago
|
||
Comment on attachment 9121667 [details]
Bug 1609665 - promote diagnostic assert to release assert r?asuth
Beta/Release Uplift Approval Request
- User impact if declined: We believe there will be no impact because this MOZ_RELEASE_ASSERT should catch the nullptr dereference before it actually happens and crash earlier instead. If so, then the patch won't change the amount of crashes. If not, then there will be a new assertion failure to fix.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Not risky because it is only changing a MOZ_DIAGNOSTIC_ASSERT to MOZ_RELEASE_ASSERT. This bug's crash is only showing up on beta, where MOZ_DIAGNOSTIC_ASSERTs are removed.
- String changes made/needed:
Comment 8•5 years ago
|
||
bugherder |
Comment 9•5 years ago
|
||
Comment on attachment 9121667 [details]
Bug 1609665 - promote diagnostic assert to release assert r?asuth
Shifts the crash from a nullptr crash to a release assert so we can hopefully get better diagnostics. Approved for 73.0b7.
Comment 10•5 years ago
|
||
bugherder uplift |
Updated•5 years ago
|
Comment 11•5 years ago
|
||
Comment on attachment 9121667 [details]
Bug 1609665 - promote diagnostic assert to release assert r?asuth
Clearing the approval flag to get this off the needs-uplift radar.
Comment 12•5 years ago
|
||
Backed out for build bustages on ServiceWorkerManager.cpp.
Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=285503789&repo=mozilla-beta&lineNumber=44201
Backout: https://hg.mozilla.org/releases/mozilla-beta/rev/fe6f34b2859f0706c88ec07199e30395041af258
There was a conflict on ServiceWorkerManager.cpp when patch was imported for uplift.
Assignee | ||
Comment 13•5 years ago
|
||
Assignee | ||
Comment 14•5 years ago
|
||
Comment on attachment 9122518 [details]
Bug 1609665 - promote diagnostic assert to release assert r?asuth
Beta/Release Uplift Approval Request
- User impact if declined: See https://bugzilla.mozilla.org/show_bug.cgi?id=1609665#c7. Patch modified to land cleanly on mozilla-beta.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): See https://bugzilla.mozilla.org/show_bug.cgi?id=1609665#c7
- String changes made/needed:
Comment 15•5 years ago
|
||
Comment on attachment 9122518 [details]
Bug 1609665 - promote diagnostic assert to release assert r?asuth
Same as comment 9. Approved for 73.0b9.
Comment 16•5 years ago
|
||
Comment on attachment 9122518 [details]
Bug 1609665 - promote diagnostic assert to release assert r?asuth
Pushed to Beta:
https://hg.mozilla.org/releases/mozilla-beta/rev/eca62a891f1d
Also clearing the approval flag to get this off the needs-uplift radar.
Updated•5 years ago
|
Comment 17•5 years ago
|
||
sw-e10s was disabled in 73.0b9
Assignee | ||
Comment 18•5 years ago
|
||
Assignee | ||
Comment 19•5 years ago
|
||
I'm working on this, but the crash is coming from a callback for push notifications and it's difficult to tell what the state is when the callback is registered. It's possible that the Service Worker's "ordered scopes" is buggy (which could explain the crash), so I'll upload a patch for that as a first step.
Updated•5 years ago
|
Comment 20•5 years ago
|
||
Comment 21•5 years ago
|
||
bugherder |
Reporter | ||
Comment 22•5 years ago
|
||
crashes with the assert are now showing up with the [@ mozilla::dom::ServiceWorkerManager::SendPushEvent ]
signature
Updated•5 years ago
|
Assignee | ||
Comment 23•5 years ago
|
||
Also adds ScopeToPrincipal function to convert scope nsIURI/strings to principals.
Comment 24•5 years ago
|
||
Comment 25•5 years ago
|
||
bugherder |
Assignee | ||
Comment 26•5 years ago
|
||
Comment on attachment 9128265 [details]
Bug 1609665 - push notifications should use registrations with an exact scope match r?#dom-workers-and-storage-reviewers
Beta/Release Uplift Approval Request
- User impact if declined: nullptr dereference when handling push notifications, which may crash.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Not risky because it just does an early-return when it sees a particular nullptr, instead of saving it and trying to de-reference later.
- String changes made/needed:
Comment 27•5 years ago
|
||
Comment on attachment 9128265 [details]
Bug 1609665 - push notifications should use registrations with an exact scope match r?#dom-workers-and-storage-reviewers
Service-worker crash fix, one of the dependencies to allow shipping sw-e10s ship in 74, uplift approved for 74.0b8, thanks.
Comment 28•5 years ago
|
||
bugherder uplift |
Assignee | ||
Updated•5 years ago
|
Updated•5 years ago
|
Comment 29•5 years ago
|
||
Perry, this seems to have spiked big again on Nightly. Any ideas why?
Comment 30•5 years ago
|
||
Comment 31•5 years ago
|
||
Bug 1407276 looks like the only plausible change in that range.
Comment 32•5 years ago
|
||
With any luck, the spike should go away in the next set of Nightlies with bug 1407276 backed out.
Comment 33•5 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #31)
Bug 1407276 looks like the only plausible change in that range.
Yes, it is caused by bug 1407276's patch. The core dump stack must be only produced by the patch.
Updated•5 years ago
|
Updated•5 years ago
|
Description
•