Closed
Bug 557586
Opened 15 years ago
Closed 15 years ago
XPCPerThreadData::~XPCPerThreadData race leading to possible crash
Categories
(Core :: XPConnect, defect)
Tracking
()
RESOLVED
FIXED
mozilla1.9.3a5
People
(Reporter: jseward, Assigned: jseward)
Details
Attachments
(1 file)
(deleted),
patch
|
timeless
:
review+
|
Details | Diff | Splinter Review |
There's a race in XPCPerThreadData::~XPCPerThreadData that can cause
XPCPerThreadData::gLock to be PR_Destroy'd and nulled out whilst a
second thread is holding it. In the worst case this could cause an
essentially unreproducible segfault at shutdown. All assuming the
following analysis is correct, of course.
What follows is a bit of background and an interleaving of execution
which I believe can lead to the failure.
All line numbers pertain to js/src/xpconnect/src/xpcthreadcontext.cpp
(m-c as of today).
BACKGROUND
This code appears to maintain a linked list of XPCPerThreadDatas.
Each such element has a field mNextThread which points to the
next element. The list is rooted by a global variable
325 XPCPerThreadData* XPCPerThreadData::gThreads = nsnull;
which is protected by this lock
324 PRLock* XPCPerThreadData::gLock = nsnull;
XPCPerThreadData::~XPCPerThreadData() is intended to remove itself
from the linked list |gThreads|. Then, if the list is empty, |gLock|
is destroyed and nulled out.
However, these actions are not carried out atomically, and it can be
that a second thread T2 entering ~XPCPerThreadData() just after a
previous thread T1 entered it, can cause T1 to destroy and null out
|gLock| while T2 still holds it.
FAILURE INTERLEAVING
Imagine there are two threads, T1 and T2, there are two entries in
|gThreads|, and |gLock| is non-NULL. Here's the relevant code:
394 // Unlink 'this' from the list of threads.
395 if(gLock)
396 {
397 nsAutoLock lock(gLock);
398 if(gThreads == this)
399 gThreads = mNextThread;
400 else
401 {
402 XPCPerThreadData* cur = gThreads;
403 while(cur)
404 {
405 if(cur->mNextThread == this)
406 {
407 cur->mNextThread = mNextThread;
408 break;
409 }
410 cur = cur->mNextThread;
411 }
412 }
413 }
414
415 if(gLock && !gThreads)
416 {
417 PR_DestroyLock(gLock);
418 gLock = nsnull;
419 }
T1: 395 (taken)
397 (T1 acquires lock)
398 .. 412 (T1 removes itself from list)
At this point |gThreads| is non-NULL (still
one entry in it), and we still hold the lock.
413 (T1 drops lock at end of scope)
414 stall
T2: 395 (taken)
397 (T2 acquires lock)
398 .. 412 (T2 removes itself from list)
|gThreads| is now null
412 stall, still holding the lock
T1: 415 (taken - since |gLock| is non-NULL and gThreads is NULL)
416 .. 419 destroy |gLock| and null it
T2: 413 (drop lock at end of scope).
Unfortunately it's now NULL.
Possible segfault.
This race was pointed out by Helgrind. It reports that the write of
|gThreads| at 399 conflicts with the read of it at 415:
Possible data race during write of size 8 at 0x6b23220 by thread #16
at 0x558138C: XPCPerThreadData::~XPCPerThreadData()
(xpcthreadcontext.cpp:399)
by 0x55813AD: xpc_ThreadDataDtorCB(void*) (xpcthreadcontext.cpp:427)
by 0x76DFA4E: _PR_DestroyThreadPrivate (prtpd.c:265)
by 0x76F5033: _pt_root (ptthread.c:270)
by 0x4C2A5D8: mythread_wrapper (hg_intercepts.c:213)
by 0x4E34A03: start_thread (pthread_create.c:300)
by 0xAB2680C: clone (clone.S:112)
This conflicts with a previous read of size 8 by thread #14
at 0x5581357: XPCPerThreadData::~XPCPerThreadData()
(xpcthreadcontext.cpp:415)
by 0x55813AD: xpc_ThreadDataDtorCB(void*) (xpcthreadcontext.cpp:427)
by 0x76DFA4E: _PR_DestroyThreadPrivate (prtpd.c:265)
by 0x76F5033: _pt_root (ptthread.c:270)
by 0x4C2A5D8: mythread_wrapper (hg_intercepts.c:213)
by 0x4E34A03: start_thread (pthread_create.c:300)
by 0xAB2680C: clone (clone.S:112)
Comment 1•15 years ago
|
||
This isn't a problem in practice, because we know the main thread outlives all other threads: we explicitly join all other threads (at least the ones which might run xpconnect) during shutdown well before we get around to destroying the XPCPerThreadData for the main thread.
Assignee | ||
Comment 2•15 years ago
|
||
Comment on attachment 437374 [details] [diff] [review]
patch to make update and delete-decision atomic
i can live with this.
Attachment #437374 -
Flags: review+
Assignee | ||
Comment 4•15 years ago
|
||
(In reply to comment #1)
> This isn't a problem in practice, [...]
It would be better if the code was "obviously correct" without having
to invoke non-local invariants, given that in this case it's easy to
make it so.
Assignee | ||
Updated•15 years ago
|
Keywords: checkin-needed
Updated•15 years ago
|
Assignee: nobody → jseward
Status: NEW → ASSIGNED
Comment 5•15 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.3a5
You need to log in
before you can comment on or make changes to this bug.
Description
•