Closed
Bug 1203585
Opened 9 years ago
Closed 9 years ago
TSan: data race dom/media/GraphDriver.h:97 Switching()
Categories
(Core :: Audio/Video: MediaStreamGraph, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla46
Tracking | Status | |
---|---|---|
firefox46 | --- | fixed |
People
(Reporter: froydnj, Assigned: padenot)
References
(Blocks 1 open bug)
Details
(Whiteboard: [tsan])
Attachments
(8 files)
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
jesup
:
review-
|
Details | Diff | Splinter Review |
(deleted),
text/x-review-board-request
|
jesup
:
review+
|
Details |
(deleted),
text/x-review-board-request
|
jesup
:
review+
|
Details |
(deleted),
text/x-review-board-request
|
jesup
:
review+
|
Details |
(deleted),
text/x-review-board-request
|
jesup
:
review+
|
Details |
(deleted),
text/x-review-board-request
|
jesup
:
review+
|
Details |
(deleted),
text/x-review-board-request
|
jesup
:
review+
|
Details |
The attached logfile shows a thread/data race detected by TSan (ThreadSanitizer).
* Specific information about this bug
This is a straightforward race where we're writing GraphDriver::mPreviousDriver on an MSG thread (?) and we're reading it on a thread ALSA has set up for us. There does not seem to be any locking around the read or write.
* General information about TSan, data races, etc.
Typically, races reported by TSan are not false positives, but it is possible that the race is benign. Even in this case though, we should try to come up with a fix unless this would cause unacceptable performance issues. Also note that seemingly benign races can possibly be harmful (also depending on the compiler and the architecture) [1][2].
If the bug cannot be fixed, then this bug should be used to either make a compile-time annotation for blacklisting or add an entry to the runtime blacklist.
[1] http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
[2] _How to miscompile programs with "benign" data races_: https://www.usenix.org/legacy/events/hotpar11/tech/final_files/Boehm.pdf
Updated•9 years ago
|
Component: Audio/Video → Audio/Video: MSG/cubeb/GMP
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → padenot
Reporter | ||
Comment 1•9 years ago
|
||
Paul pointed out things work better when log files actually get attached.
Assignee | ||
Comment 2•9 years ago
|
||
Attachment #8659936 -
Flags: review?(rjesup)
Comment 3•9 years ago
|
||
Comment on attachment 8659936 [details] [diff] [review]
Lock around access to mNextDriver and mPreviousDriver. r=
r-: NextDriver() accesses mNextDriver without the monitor, and it's not clear it's called with the monitor locked - if so, it should be documented (and checked) and perhaps asserted.
This points out that the documentation of what's covered by the monitor and what's not is inconsistent. mPromisesForOperation appears to need the monitor, but doesn't have "This is synchronized by the graph's Monitor". Apparently mNextDriver and mPreviousDriver need that statement too. SwitchAtNextIteration() should document it needs the monitor too (and appears to always have it).
Most uses of mNextDriver are tied to mGraphImpl->SetCurrentDriver(), which likely does need the monitor. It's not clear that all cases do, especially calls to Switching(). This makes me wonder if there are lower-cost alternatives (Atomics?), though the need for Switching to check mNextDriver and mPreviousDriver implies some form of lock/semaphore/monitor is needed - but it also appears there may be more unlocked references to mPreviousDriver.
So I think more cleanup and especially more documentation/checking is needed. Perhaps it just needs documentation and/or an assert or two.
Attachment #8659936 -
Flags: review?(rjesup) → review-
Comment 4•9 years ago
|
||
Should we try to update and land this? Lack of locking on access to these could cause "interesting" problems even without help of TSAN compiler weirdness.
If we're sure it's safe outside of the compiler doing an evil TSAN trick, we can reduce the priority.
Rank: 19
Flags: needinfo?(padenot)
Priority: -- → P1
Assignee | ||
Comment 5•9 years ago
|
||
Bug 1203585 - Add a comment block on how MediaStreamGraph switch GraphDrivers. r?jesup
Attachment #8694180 -
Flags: review?(rjesup)
Assignee | ||
Comment 6•9 years ago
|
||
Bug 1203585 - Add new methods to GraphDriver to assert that locks are held. r?jesup
Attachment #8694181 -
Flags: review?(rjesup)
Assignee | ||
Comment 7•9 years ago
|
||
Bug 1203585 - Add threading assertions to GraphDriver switching methods. r?jesup
Attachment #8694182 -
Flags: review?(rjesup)
Assignee | ||
Comment 8•9 years ago
|
||
Bug 1203585 - Update the MediaStreamGraph code to lock properly. r?jesup
Attachment #8694183 -
Flags: review?(rjesup)
Assignee | ||
Comment 9•9 years ago
|
||
Bug 1203585 - Add comments about threading and locking on GraphDriver's members. r?jesup
Attachment #8694184 -
Flags: review?(rjesup)
Assignee | ||
Comment 10•9 years ago
|
||
Bug 1203585 - Remove some dead code in GraphDriver.cpp. r?jesup
Attachment #8694185 -
Flags: review?(rjesup)
Comment 11•9 years ago
|
||
Comment on attachment 8694180 [details]
MozReview Request: Bug 1203585 - Add a comment block on how MediaStreamGraph switch GraphDrivers. r?jesup
https://reviewboard.mozilla.org/r/26685/#review24305
::: dom/media/GraphDriver.h:66
(Diff revision 1)
> + * scenarii that can happe:
nit: happen
::: dom/media/GraphDriver.h:72
(Diff revision 1)
> + * - The thread T is posting a message to the main thread to terminate itself.
nit: Threat T posts a message
::: dom/media/GraphDriver.h:81
(Diff revision 1)
> + * - At the end of the MSG iteration, the SystemClockDriver transfers its timing
Is mNextDriver used here?
::: dom/media/GraphDriver.h:86
(Diff revision 1)
> + * thread to shut it down.
it -> the SystemClockDriver thread
::: dom/media/GraphDriver.h:93
(Diff revision 1)
> + * starts the SystemClockDriver. This starts a thread.
This starts a thread -> A new SystemClockDriver thread is started from the current audio thread.
It may make sense to scrub this to have consistent names for the threads.
::: dom/media/GraphDriver.h:101
(Diff revision 1)
> + * of the different attributes of drivers, and they access pattern is documented
nit: they->their
Attachment #8694180 -
Flags: review?(rjesup) → review+
Comment 12•9 years ago
|
||
Comment on attachment 8694181 [details]
MozReview Request: Bug 1203585 - Add new methods to GraphDriver to assert that locks are held. r?jesup
https://reviewboard.mozilla.org/r/26687/#review24309
::: dom/media/GraphDriver.cpp:199
(Diff revision 1)
> + GraphDriver* previousDriver = nullptr;
= nullptr is spurious here. up to you though
::: dom/media/GraphDriver.cpp:219
(Diff revision 1)
> + mDriver->SetPreviousDriver(nullptr);
Is there any chance something could run and want to mess with mDriver's PreviousDriver value while we're in this Run()? Would it make sense to simply hold the lock longer in this function?
Attachment #8694181 -
Flags: review?(rjesup) → review+
Comment 13•9 years ago
|
||
Comment on attachment 8694182 [details]
MozReview Request: Bug 1203585 - Add threading assertions to GraphDriver switching methods. r?jesup
https://reviewboard.mozilla.org/r/26689/#review24315
Attachment #8694182 -
Flags: review?(rjesup) → review+
Comment 14•9 years ago
|
||
Comment on attachment 8694183 [details]
MozReview Request: Bug 1203585 - Update the MediaStreamGraph code to lock properly. r?jesup
https://reviewboard.mozilla.org/r/26691/#review24317
Attachment #8694183 -
Flags: review?(rjesup) → review+
Comment 15•9 years ago
|
||
Comment on attachment 8694184 [details]
MozReview Request: Bug 1203585 - Add comments about threading and locking on GraphDriver's members. r?jesup
https://reviewboard.mozilla.org/r/26693/#review24319
::: dom/media/GraphDriver.h:251
(Diff revision 1)
> + // durining the initialization.
during
::: dom/media/GraphDriver.h:257
(Diff revision 1)
> + // This written to when changing driver, from the previous driver's thread, or
nit: This is written
::: dom/media/GraphDriver.h:259
(Diff revision 1)
> + // and read each time we need to check whether we're changing driver (in
"At this point and read each time" - reword
Attachment #8694184 -
Flags: review?(rjesup) → review+
Comment 16•9 years ago
|
||
Comment on attachment 8694185 [details]
MozReview Request: Bug 1203585 - Remove some dead code in GraphDriver.cpp. r?jesup
https://reviewboard.mozilla.org/r/26695/#review24321
Attachment #8694185 -
Flags: review?(rjesup) → review+
Comment 17•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/d276304ca35e
https://hg.mozilla.org/integration/mozilla-inbound/rev/25b10c010e66
https://hg.mozilla.org/integration/mozilla-inbound/rev/4c9fe5d2c979
https://hg.mozilla.org/integration/mozilla-inbound/rev/aeeb1c2cd5ec
https://hg.mozilla.org/integration/mozilla-inbound/rev/ff9d9dd7eb59
https://hg.mozilla.org/integration/mozilla-inbound/rev/40138116cc46
Comment 18•9 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/d276304ca35e
https://hg.mozilla.org/mozilla-central/rev/25b10c010e66
https://hg.mozilla.org/mozilla-central/rev/4c9fe5d2c979
https://hg.mozilla.org/mozilla-central/rev/aeeb1c2cd5ec
https://hg.mozilla.org/mozilla-central/rev/ff9d9dd7eb59
https://hg.mozilla.org/mozilla-central/rev/40138116cc46
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox46:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
Comment 20•6 years ago
|
||
AudioCallbackDriver::Start() didn't reset mPreviousDriver until after calling
Init().
https://hg.mozilla.org/mozilla-central/annotate/61079e59ef35/dom/media/GraphDriver.cpp#l627
Init() called(/calls) AudioCallbackDriver::StartStream(), which set(s) mStarted
after cubeb_stream_start().
cubeb_stream_start() triggered(/triggers) MediaStreamGraphImpl::Process().
MediaStreamGraphImpl::Process() was testing Switching(),
https://hg.mozilla.org/mozilla-central/annotate/61079e59ef35/dom/media/MediaStreamGraph.cpp#l1441
Switching() is/was mNextDriver || mPreviousDriver.
The race was later fixed when
https://hg.mozilla.org/mozilla-central/rev/ef761a32a969178df81a61bc2d0a376dde2eec03#l1.160
moved the reset of mPreviousDriver from after Init() to before Init().
I expect that made the monitor unnecessary (and I doubt the locking around the Switching()
call was helpful).
(https://hg.mozilla.org/mozilla-central/rev/ef761a32a969178df81a61bc2d0a376dde2eec03#l1.309
also moved the mixer RemoveCallback() from MediaStreamGraphImpl::Process() to
AudioCallbackDriver::DataCallback() et al, where the driver switch is
performed.)
You need to log in
before you can comment on or make changes to this bug.
Description
•