Closed
Bug 1441498
Opened 7 years ago
Closed 7 years ago
Intermittent usercss/usercss-moz-document.html | application terminated with exit code 11 and "mozilla::detail::MutexImpl::~MutexImpl: pthread_mutex_destroy failed: Device or resource busy"
Categories
(Core :: Graphics, defect, P5)
Core
Graphics
Tracking
()
RESOLVED
FIXED
mozilla60
Tracking | Status | |
---|---|---|
firefox60 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: sotaro)
References
Details
(Keywords: intermittent-failure)
Attachments
(2 files, 2 obsolete files)
(deleted),
patch
|
kats
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
Details | Diff | Splinter Review |
Filed by: aiakab [at] mozilla.com
https://treeherder.mozilla.org/logviewer.html#?job_id=164551776&repo=mozilla-inbound
https://queue.taskcluster.net/v1/task/JydRIJBjQSq7A5_vpZmZVA/runs/0/artifacts/public/logs/live_backing.log
https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://queue.taskcluster.net/v1/task/JydRIJBjQSq7A5_vpZmZVA/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1
Comment 1•7 years ago
|
||
The log's first symptom of an issue here is:
"mozilla::detail::MutexImpl::~MutexImpl: pthread_mutex_destroy failed: Device or resource busy"
Summary: Intermittent usercss/usercss-moz-document.html | application terminated with exit code 11 → Intermittent usercss/usercss-moz-document.html | application terminated with exit code 11 and "mozilla::detail::MutexImpl::~MutexImpl: pthread_mutex_destroy failed: Device or resource busy"
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → sotaro.ikeda.g
Assignee | ||
Comment 3•7 years ago
|
||
Assignee | ||
Comment 4•7 years ago
|
||
Attachment #8956327 -
Attachment is obsolete: true
Assignee | ||
Comment 5•7 years ago
|
||
Assignee | ||
Updated•7 years ago
|
Attachment #8956332 -
Attachment description: patch - Use lock for accessing mCompositorVsyncDispatcher → patch - Protect mCompositorVsyncDispatcher by lock
Assignee | ||
Comment 6•7 years ago
|
||
Attachment #8956332 -
Attachment is obsolete: true
Assignee | ||
Updated•7 years ago
|
Attachment #8956354 -
Flags: review?(bugmail)
Comment 7•7 years ago
|
||
Comment on attachment 8956354 [details] [diff] [review]
patch - Protect mCompositorVsyncDispatcher by lock
Review of attachment 8956354 [details] [diff] [review]:
-----------------------------------------------------------------
I don't see how this will help. The CompositorVsyncDispatcher implementation uses threadsafe refcounting [1] and the functions you're adding the lock to are all using proper RefPtr handling, as far as I can tell, so we should never end up with dangling pointers. In other words, I think that no matter when nsBaseWidget::GetCompositorVsyncDispatcher() is called, it should either return a strong reference to a valid CompositorVsyncDispatcher, or it should return a null pointer.
[1] https://searchfox.org/mozilla-central/rev/bffd3e0225b65943364be721881470590b9377c1/widget/VsyncDispatcher.h#48
Attachment #8956354 -
Flags: review?(bugmail) → review-
Assignee | ||
Comment 8•7 years ago
|
||
attachment 8956354 [details] [diff] [review] should help to address the problem. attachment 8956354 [details] [diff] [review] protects for accessing mCompositorVsyncDispatcher.
"Threadsafe refcounting" makes mRefCnt of refcounted object to be thread safe. It does not make RefPtr<CompositorVsyncDispatcher> to be thread safe. RefPtr::mRawPtr is not protected atomically.
Assignee | ||
Comment 9•7 years ago
|
||
Therefore if mCompositorVsyncDispatcher is update on main thread and its RefPtr is copied on Compositor Thread, mCompositorVsyncDispatcher needs to be protected by lock.
Assignee | ||
Comment 10•7 years ago
|
||
Assignee | ||
Comment 11•7 years ago
|
||
attachment 8956731 [details] [diff] [review] is a reduced test case by using gtest. By running the test, I could reproduced the similar crash. The gtest log was like the following.
--------------------------------------------------------
Running GTest tests...
Note: Google Test filter = RefPtr.*
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from RefPtr
[ RUN ] RefPtr.MultiThreadedTest
mozilla::detail::MutexImpl::lock: pthread_mutex_lock failed: Invalid argument
ExceptionHandler::GenerateDump cloned child 16556
ExceptionHandler::SendContinueSignalToChild sent continue signal to child
ExceptionHandler::WaitForContinueSignal waiting for continue signal...
Assignee | ||
Comment 12•7 years ago
|
||
Assignee | ||
Comment 13•7 years ago
|
||
I run attachment 8956731 [details] [diff] [review] on linux for testing with "./mach gtest RefPtr.*"
Assignee | ||
Updated•7 years ago
|
Component: Layout → Graphics
Comment 15•7 years ago
|
||
Aha, you're right. Thanks for making the gtest, that helped me understand the problem better!
Just to state my understanding: GetCompositorVsyncDispatcher() [1] runs dispatcher.assign_with_AddRef(mCompositorVsyncDispatcher.mRawPtr) as one of the steps in creating the RefPtr copy. And DestroyCompositor() [2] runs mCompositorVsyncDispatcher.assign_assuming_AddRef(nullptr) when it nulls out mCompositorVsyncDispatcher. If the first call gets interrupted before it actually executes any of the function, say at [3], and the second call runs at that point, it can release and free the object whose pointer was passed to the first call. So when the first call resumes it continues with what is now a bad pointer, on which it does an AddRef. But that doesn't change the fact that the object is already gone. So with this interleaving you can end up with a RefPtr that has a garbage mRawPtr in this scenario, and that would result in the mutex locking failures we've been seeing.
[1] https://searchfox.org/mozilla-central/rev/efce4ceddfbe5fe4e2d74f1aed93894bada9b273/widget/nsBaseWidget.cpp#1277
[2] https://searchfox.org/mozilla-central/rev/efce4ceddfbe5fe4e2d74f1aed93894bada9b273/widget/nsBaseWidget.cpp#267
[3] https://searchfox.org/mozilla-central/rev/efce4ceddfbe5fe4e2d74f1aed93894bada9b273/mfbt/RefPtr.h#53
Flags: needinfo?(bugmail)
Comment 16•7 years ago
|
||
Comment on attachment 8956354 [details] [diff] [review]
patch - Protect mCompositorVsyncDispatcher by lock
Review of attachment 8956354 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks, this patch should do the job then
::: widget/nsBaseWidget.h
@@ +7,5 @@
>
> #include "InputData.h"
> #include "mozilla/EventForwards.h"
> #include "mozilla/RefPtr.h"
> +#include "mozilla/Mutex.h"
Move this include up one line to keep it alphabetical
Attachment #8956354 -
Flags: review- → review+
Updated•7 years ago
|
Comment 17•7 years ago
|
||
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/76a8f710da41
Protect mCompositorVsyncDispatcher by lock. r=kats
Comment 18•7 years ago
|
||
^ Went ahead and landed it, with the change in comment 16 plus fixing a typo in the commit message.
Comment 19•7 years ago
|
||
Although... is there any particular reason you used a UniquePtr instead of just keeping the mutex as an instance variable on nsBaseWidget?
Comment 20•7 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox60:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Assignee | ||
Comment 21•7 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #19)
> Although... is there any particular reason you used a UniquePtr instead of
> just keeping the mutex as an instance variable on nsBaseWidget?
It is for reducing Mutex instances.
nsBaseWidget needs the Mutex only when the nsBaseWidget owns CompositorSession. nsBaseWidget is ancestor of PuppetWidget and nsWindow. PuppetWidget does not own CompositorSession, then the Mutex is not necessary. nsWindow not always own CompositorSession. Only nsWindow created by kCChildCID has a change to own CompositorSession like the following.
https://github.com/sotaroikeda/firefox-diagrams/blob/master/widget/widget_nsWindow_FirefoxOS_2_2.pdf
Comment 22•7 years ago
|
||
Makes sense, thanks!
You need to log in
before you can comment on or make changes to this bug.
Description
•