Open
Bug 1187421
Opened 9 years ago
Updated 2 years ago
Leak in error_get_my_stack
Categories
(NSPR :: NSPR, defect)
NSPR
NSPR
Tracking
(Not tracked)
NEW
People
(Reporter: mccr8, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: memory-leak)
Attachments
(2 files)
(deleted),
patch
|
KaiE
:
review-
|
Details | Diff | Splinter Review |
(deleted),
patch
|
ttaubert
:
review+
franziskus
:
checked-in+
|
Details | Diff | Splinter Review |
LeakSanitizer detected this small leak in a Firefox content process:
Direct leak of 72 byte(s) in 1 object(s).
#0 0x4750f1 in calloc /builds/slave/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:90
#1 0x7ff63c019890 in error_get_my_stack src/security/nss/lib/base/error.c:109
#2 0x7ff63c019c38 in nss_ClearErrorStack src/security/nss/lib/base/error.c:245
#3 0x7ff63c0182f8 in NSSArena_Create src/security/nss/lib/base/arena.c:352
#4 0x7ff63bff452b in NSSTrustDomain_Create src/security/nss/lib/pki/trustdomain.c:34
error_get_my_stack allocates a new stack and then calls PR_SetThreadPrivate() on it, so I'm guessing the thread private data isn't being freed somehow. It looks like that's supposed to happen in nss_DestroyErrorStack().
This happens with dom/media/tests/mochitest/ipc/test_ipc.html, and in e10s Mochitest-1. I haven't narrowed it down any further.
Comment 1•8 years ago
|
||
Still reproduces in dom/crypto:
https://treeherder.mozilla.org/logviewer.html#?job_id=29489030&repo=try
Comment 2•8 years ago
|
||
This is still pretty widespread across other mochitest suites as well.
Comment 3•8 years ago
|
||
Tim, is there any chance you'd be able to look at this? It's the only NSS LSAN suppression we've got at this point.
Flags: needinfo?(ttaubert)
Comment 4•8 years ago
|
||
I've seen this before. The error stack is allocated whenever we hit errors. Looks like we're not calling PR_Cleanup() or whatever again the function was that cleans this up.
Flags: needinfo?(ttaubert)
Comment 5•8 years ago
|
||
I'm not sure where the right place to call PR_Cleanup() would be? As long as Firefox is still using NSPR too we can't call it right after NSS_Shutdown().
Updated•8 years ago
|
Assignee: nobody → nobody
Component: Libraries → NSPR
Product: NSS → NSPR
Version: 3.15 → other
Comment 6•8 years ago
|
||
I ran into this bug as well. It looks to me like there are multiple leaks here. This patch fixes them. (And hopefully doesn't break anything.)
Attachment #8838442 -
Flags: review?(ted)
Comment 7•8 years ago
|
||
This is the NSS part of the leak fix.
Assignee: nobody → franziskuskiefer
Attachment #8838444 -
Flags: review?(ttaubert)
Comment 8•8 years ago
|
||
Comment on attachment 8838444 [details] [diff] [review]
fix-error-leak-nss.patch
Review of attachment 8838444 [details] [diff] [review]:
-----------------------------------------------------------------
::: lib/base/error.c
@@ +265,5 @@
> {
> if (INVALID_TPD_INDEX != error_stack_index) {
> PR_SetThreadPrivate(error_stack_index, NULL);
> + error_stack_index = INVALID_TPD_INDEX;
> + error_call_once = error_call_again; /* allow to init again */
I've seen memset(0) used to achieve the same, we do that in libssl at least. So maybe:
PORT_Memset(&error_call_once, 0, sizeof(error_call_once));
Attachment #8838444 -
Flags: review?(ttaubert) → review+
Reporter | ||
Comment 9•8 years ago
|
||
Please remove the leak suppression for this bug from build/sanitizers/lsan_suppressions.txt, assuming that an ASan run is green with your patches.
Comment 10•8 years ago
|
||
Comment on attachment 8838444 [details] [diff] [review]
fix-error-leak-nss.patch
landed as https://hg.mozilla.org/projects/nss/rev/430f78b2fa77a844e446593b8d9ba747301da770
Attachment #8838444 -
Flags: checked-in+
Comment 11•8 years ago
|
||
Comment 12•8 years ago
|
||
(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #11)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=d177f0889c53
Looks like this doesn't fix the Firefox problem :( Something else must be broken (probably not on the NSS side as I can't reproduce the leak with only NSS anymore).
Reporter | ||
Comment 13•8 years ago
|
||
Thanks for checking! Maybe the whitelist could be changed to NSSTrustDomain_Create, which sounds less generic. It looks like the problem is much less common than described in comment 0, so I think a bunch of stuff was fixed.
Reporter | ||
Updated•8 years ago
|
Reporter | ||
Comment 14•8 years ago
|
||
I filed bug 1341354 for the residual leaks.
Updated•8 years ago
|
Attachment #8838442 -
Flags: review?(ted) → review?(kaie)
Comment 15•8 years ago
|
||
Comment on attachment 8838442 [details] [diff] [review]
fix-nspr-leaks.patch
I suspect this might have been knowingly allowed to leak in the past.
Looking at _PR_DestroyThreadPrivate, it attempts to loop through TPD multiple times, and assumes that some items could have unpredictable side effects, and maybe even the TPD items could be set again, after the loop.
If there's anything pending that couldn't get destroyed, then destroying _pr_tpd_destructors could potentially result in a new shutdown crash, right?
If you want to try getting this cleaned up, please set the variables to NULL after freeing them (which the existing code already does for most other variables).
Comment 16•8 years ago
|
||
Additional comment:
You are using a mix of free and PR_Free. I checked if it matches the allocation.
$ grep -e "->name *=" pr/src/linking/prlink.c
This shows that it always gets allocated using strdup.
Please use "free" to destroy pr_exe_loadmap->name, not PR_Free.
pr_exe_loadmap is allocated using PR_NEWZAP, which uses PR_Calloc.
Using PR_Free to destroy it is correct.
_pr_tpd_destructors is allocated using PR_CALLOC.
Please use PR_Free to destroy it, not "free".
Updated•8 years ago
|
Attachment #8838442 -
Flags: review?(kaie) → review-
Comment 17•8 years ago
|
||
I researched when this code was added, it was more than 15 years ago:
https://hg.mozilla.org/projects/nspr/rev/9640a6cb2c22
Function _PR_ShutdownLinker is still in that original state.
Updated•7 years ago
|
Updated•7 years ago
|
Assignee: franziskuskiefer → nobody
Updated•2 years ago
|
Severity: normal → S3
You need to log in
before you can comment on or make changes to this bug.
Description
•