Closed
Bug 383977
Opened 17 years ago
Closed 17 years ago
Crash on exit caused by thread local storage cleanup [@ PL_DHashTableOperate]
Categories
(NSPR :: NSPR, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
4.7
People
(Reporter: KaiE, Assigned: wtc)
References
Details
(Keywords: crash, Whiteboard: [sg:critical?] double free?)
Crash Data
Attachments
(1 file, 1 obsolete file)
(deleted),
patch
|
wtc
:
review+
|
Details | Diff | Splinter Review |
We recently changed NSPR to clean up its thread local storage on exit (bug 362768). This patch has not yet been picked up by Mozilla, but it will be soon.
We also fixed a bug in Mozilla XPCOM, in order to avoid a double free (bug 379550). This patch has not yet been picked up by Mozilla, but it will be soon.
We got reports that Mozilla software that uses both patches will crash on exit.
(Please see https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=242370 and potentially https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=238770 )
The call stack given in Red Hat bug 242370 suggests there might be a double free bug in XPConnect. See https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=242370#c11
Comment 1•17 years ago
|
||
Any relationship to bug 335018? I'd guess not; it seems like this stuff isn't something that's allowed to run this late in shutdown, so we need to make sure it's cleaned up earlier so that the TLS destructors can't run later.
Updated•17 years ago
|
Summary: Crash on exit caused by thread local storage cleanup → Crash on exit caused by thread local storage cleanup [@ PL_DHashTableOperate]
Assignee | ||
Comment 2•17 years ago
|
||
I'm now convinced that it is unsafe for NSPR's shared library
finalization function to call back to the thread private data
destructors. First, the libraries in which the destructors are
defined in may have been shut down or unloaded. Second, this
breaks backward compatibility, because NSPR did not call the
TPD destructors when it is unloaded before.
This patch adds a new version of _pt_thread_death, called
_pt_thread_death_no_tpd_destructors, that is the same as
_pt_thread_death except that it doesn't call the TPD
destructors. Note that this patch changes _pt_thread_death
to call _PR_DestroyThreadPrivate first, as opposed to after
removing the thread from the linked list in 'pt_book'. I
can't think of a reason why the ordering would matter.
Kai, please review and test this patch. Please also go
through the original test for which we added _PR_Fini in
the first place. Thanks!
Attachment #268128 -
Flags: review?(kengert)
Assignee | ||
Updated•17 years ago
|
Assignee: nobody → wtc
Component: XPConnect → NSPR
Product: Core → NSPR
QA Contact: xpconnect → nspr
Target Milestone: --- → 4.7
Version: 1.8 Branch → 4.7
Assignee | ||
Comment 3•17 years ago
|
||
If you can suggest a better name than _pt_thread_death_no_tpd_destructors,
I'd appreciate it.
Status: NEW → ASSIGNED
Reporter | ||
Comment 4•17 years ago
|
||
(In reply to comment #2)
> This patch adds a new version of _pt_thread_death, called
> _pt_thread_death_no_tpd_destructors, that is the same as
> _pt_thread_death except that it doesn't call the TPD
> destructors. Note that this patch changes _pt_thread_death
> to call _PR_DestroyThreadPrivate first, as opposed to after
> removing the thread from the linked list in 'pt_book'. I
> can't think of a reason why the ordering would matter.
This seems to be a very critical piece of code, and I propose we try our best to minimize risk. I propose we avoid the risk of reordering the cleanup steps.
I'm attaching a slightly different patch that will keep the order of things.
Reporter | ||
Comment 5•17 years ago
|
||
Wan-Teh, would you be ok with this slightly changed version of your patch?
I convert the existing implementation into a function that takes a parameter, whether or not to call TPD.
The existing function _pt_thread_death will pass PR_TRUE and therefore continue to call the TPD, while keeping the current order of events.
The _PR_Fini function will pass PR_FALSE and the TPD will not get called.
I will now continue with your other comments.
Reporter | ||
Comment 6•17 years ago
|
||
Wan-Teh, I talked to Ray Strode.
He is still able to reproduce the original bug.
He tested attachment 270044 [details] [diff] [review] and and it works for him.
Reporter | ||
Comment 7•17 years ago
|
||
Comment on attachment 268128 [details] [diff] [review]
Add a version of _pt_thread_death that doesn't call TPD destructors
So, I'm fine with this patch in general, but I would like to avoid the risk of re-ordering calls. This is the only reason I'm giving this patch a r-
I would like to request that we keep the order of calls, and I went ahead and implemented that in the other patch (which was tested to work).
Attachment #268128 -
Flags: review?(kengert) → review-
Reporter | ||
Comment 8•17 years ago
|
||
Comment on attachment 270044 [details] [diff] [review]
Patch v1
Wan-Teh, can you please review?
Attachment #270044 -
Flags: review?(wtc)
Assignee | ||
Comment 9•17 years ago
|
||
Comment on attachment 270044 [details] [diff] [review]
Patch v1
r=wtc.
Attachment #270044 -
Flags: review?(wtc) → review+
Reporter | ||
Comment 10•17 years ago
|
||
Checked in to nspr head:
Checking in pr/src/pthreads/ptthread.c;
/cvsroot/mozilla/nsprpub/pr/src/pthreads/ptthread.c,v <-- ptthread.c
new revision: 3.75; previous revision: 3.74
done
Checked in to NSPRPUB_PRE_4_2_CLIENT_BRANCH:
Checking in pr/src/pthreads/ptthread.c;
/cvsroot/mozilla/nsprpub/pr/src/pthreads/ptthread.c,v <-- ptthread.c
new revision: 3.42.2.21; previous revision: 3.42.2.20
done
Marking fixed
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Comment 11•17 years ago
|
||
This patch will need to land on the 1.8 stable branches as well, right?
Flags: wanted1.8.1.x?
Flags: wanted1.8.0.x?
Whiteboard: [sg:critical?] double free?
Assignee | ||
Comment 12•17 years ago
|
||
Dan, no, you don't need to worry about this bug for the Mozilla
1.8 stable branches because the patch we decided to use in the
end is an NSPR patch.
Reporter | ||
Comment 13•17 years ago
|
||
Dan, this bug does not apply to 1.8.1.x or 1.8.0.x, because they still use NSPR 4.6.
The original (bad) patch from bug 362768 was only added to NSPR 4.7.
This newer fix is only required for NSPR 4.7, only.
Only distributions who explicitly picked up the earlier patch require the new patch.
Only Mozilla 1.9 uses NSPR 4.7 and has already picked up this patch.
Comment 14•17 years ago
|
||
Will we ever require the 1.8.1 branch releases to upgrade to a newer NSS (thus dragging in a newer NSPR) like we did with FF1.5 going from NSS 3.10.x to 3.11.x?
Reporter | ||
Comment 15•17 years ago
|
||
If we ever apply NSPR 4.7 to an older branch, we will use a NSPR release. There is no NSPR 4.7 release yet.
Probably at the time when a NSS 3.12 release is made it will be accompanied by a NSPR 4.7 release.
The first NSPR 4.7 release will therefore contain the fix from this bug, and we will be fine.
Assignee | ||
Updated•17 years ago
|
Attachment #268128 -
Attachment is obsolete: true
Updated•17 years ago
|
Group: security
Comment 16•17 years ago
|
||
NSPR 4.7 has been released and includes this fix.
Updated•16 years ago
|
Flags: wanted1.8.1.x?
Flags: wanted1.8.1.x-
Flags: wanted1.8.0.x?
Updated•13 years ago
|
Crash Signature: [@ PL_DHashTableOperate]
You need to log in
before you can comment on or make changes to this bug.
Description
•