Closed
Bug 1299076
Opened 8 years ago
Closed 8 years ago
Massive Xpcshell failure in debug on all platforms: application crashed [@ nsWeakReference::QueryReferent(nsID const &,void * *)]
Categories
(Thunderbird :: General, defect)
Thunderbird
General
Tracking
(Not tracked)
RESOLVED
FIXED
Thunderbird 51.0
People
(Reporter: jorgk-bmo, Assigned: rkent)
References
Details
(Keywords: intermittent-failure)
First seen: Tue Aug 30, 2016, 5:05:18:
https://treeherder.mozilla.org/#/jobs?repo=comm-central&revision=8c7ddccf95c4168995fa6e57182133f22d21be03
Massive Xpcshell failure in debug on all platforms. Some examples:
TEST-UNEXPECTED-FAIL | mail/base/test/unit/test_emptyTrash_dbViewWrapper.js | xpcshell return code: 1
PROCESS-CRASH | mail/base/test/unit/test_emptyTrash_dbViewWrapper.js | application crashed [@ nsWeakReference::QueryReferent(nsID const &,void * *)]
TEST-UNEXPECTED-FAIL | mailnews/db/gloda/test/unit/test_index_junk_imap_offline.js | xpcshell return code: 1
PROCESS-CRASH | mailnews/db/gloda/test/unit/test_index_junk_imap_offline.js | application crashed [@ nsWeakReference::QueryReferent(nsID const &,void * *)]
Looks like some M-C landing added a MOZ_ASSERT somewhere. Bingo:
PROCESS | 312 | Hit MOZ_CRASH(nsWeakReference not thread-safe) at c:/builds/moz2_slave/tb-c-cen-w32-d-000000000000000/build/mozilla/xpcom/glue/nsWeakReference.cpp:144
Bug 956338:
https://hg.mozilla.org/mozilla-central/rev/e2fda83015d8#l4.18
Honza, this has caused mayhem to many/most/all of our Xpcshell tests in debug mode. Please advise.
Flags: needinfo?(honzab.moz)
Comment 1•8 years ago
|
||
Definitely result of added assertions in bug 956338, yes.
The patch was pushed several times to m-c based try, catching number of bugs (now all fixed), but I never attempted to push to comm-try (my bad).
First of all, the added assertions detected a critical bug!
What you see is a misuse of weak reference: the problem the added assertions in bug 956338 try to catch is an attempt to access weak ptr on one thread while the real object is just being destroyed on another thread. Hence the simple restriction to query weak ptrs only on a single thread. Dereference of a weak ptr to the same object on more then one thread (say A and B) is _LIKELY_ bad, at least it's a very bad programing architecture. If you expect the real object go away on either thread A or B or even some thread X, then you cannot avoid very dangerous race conditions when referrers, still alive, keep the weak ref and try to query it.
So, we need to find the offending code and fix it ASAP.
One I can see is query for a weak reference from nsImapProtocol::GetPassword, exactly here:
https://dxr.mozilla.org/comm-central/rev/8c7ddccf95c4168995fa6e57182133f22d21be03/mailnews/imap/src/nsImapProtocol.cpp#8237
I didn't look further if there are more, but I expect them (there were 4+ bugs to fix in firefox/gecko).
I can definitely help with these bugs.
Blocks: 956338
Flags: needinfo?(honzab.moz)
Comment 2•8 years ago
|
||
It looks like the fixup bugs on m-c are marked security-only - at least they are hidden. Should this be the case for the corresponding c-c bugs as well?
Comment 3•8 years ago
|
||
Should tihs be considered critical or blocker? Or only the bug fixes makred to block this?
/me wonders if this will hit some of the imap threading issues
Severity: normal → major
Comment 4•8 years ago
|
||
Not confined to TB. Seamonkey is also affected. Happens here:
>> nsCOMPtr<nsIMsgIncomingServer> server = do_QueryReferent(m_server, &rv)
Reporter | ||
Comment 5•8 years ago
|
||
Thanks for looking into this.
(In reply to Honza Bambas (:mayhemer) from comment #1)
> So, we need to find the offending code and fix it ASAP.
> One I can see is query for a weak reference from
> nsImapProtocol::GetPassword, exactly here:
Looking though
https://archive.mozilla.org/pub/thunderbird/tinderbox-builds/comm-central-win32-debug/1472526380/comm-central_win7_ix-debug_test-xpcshell-bm111-tests1-windows-build3.txt.gz
this string
nsImapProtocol::GetPassword(nsCString &,bool) [C:/builds/moz2_slave/tb-c-cen-w32-d-000000000000000/build/mailnews/imap/src/nsImapProtocol.cpp:8237
features a lot.
I see 108x "application crashed" in the log and 108x "#03: nsImapProtocol::GetPassword".
So with some luck, that is the only call site that needs to be fixed.
BTW, what is the replacement for
nsCOMPtr<nsIMsgIncomingServer> server = do_QueryReferent(m_server, &rv);
?
Flags: needinfo?(honzab.moz)
Comment 6•8 years ago
|
||
(In reply to aleth [:aleth] from comment #2)
> It looks like the fixup bugs on m-c are marked security-only - at least they
> are hidden. Should this be the case for the corresponding c-c bugs as well?
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #3)
> Should tihs be considered critical or blocker? Or only the bug fixes makred
> to block this?
Let's file individual bugs as critical + security sensitive, blocking bug 956338.
(In reply to Jorg K (GMT+2, PTO during summer) from comment #5)
> So with some luck, that is the only call site that needs to be fixed.
From my experience, fixing one bug uncovers another one. But we'll see :)
>
> BTW, what is the replacement for
> nsCOMPtr<nsIMsgIncomingServer> server = do_QueryReferent(m_server, &rv);
> ?
That's not that simple. Let's file a bug for this piece and talk there.
Flags: needinfo?(honzab.moz)
Reporter | ||
Comment 7•8 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #6)
> That's not that simple. Let's file a bug for this piece and talk there.
Done, bug 1299113.
Reporter | ||
Comment 8•8 years ago
|
||
Fixed by bug 1299116, thanks, Kent!
You need to log in
before you can comment on or make changes to this bug.
Description
•