Closed
Bug 855331
Opened 12 years ago
Closed 12 years ago
TransactionThreadPool doing sketchy things with mCompleteCallbacks array that can mutate
Categories
(Core :: Storage: IndexedDB, defect)
Core
Storage: IndexedDB
Tracking
()
RESOLVED
FIXED
mozilla23
Tracking | Status | |
---|---|---|
firefox20 | --- | unaffected |
firefox21 | --- | unaffected |
firefox22 | + | wontfix |
firefox23 | + | fixed |
firefox-esr17 | --- | unaffected |
b2g18 | --- | unaffected |
People
(Reporter: bent.mozilla, Assigned: janv)
References
Details
(Keywords: regression, sec-critical, Whiteboard: [regression from bug 767944?][don't uplift yet, see comment 23][adv-main23+])
Attachments
(1 file)
(deleted),
text/plain
|
Details |
See https://tbpl.mozilla.org/php/getParsedLog.php?id=21144311&tree=Mozilla-Inbound
TransactionThreadPool::FinishTransaction() calls MaybeFireCallback with a DatabasesCompleteCallback& reference that comes out of mCompleteCallbacks. It then calls Run() on the callback, which could do something like call WaitForAllDatabasesToComplete which appends an element to mCompleteCallbacks. If that causes reallocation then the DatabasesCompleteCallback& on our stack is now garbage.
Somehow the log says we're jumping to HasTransactionsForDatabase (line 472), but I can't see how that is happening. Maybe we're calling a virtual Run() on one of our callback pointers that is now garbage?
Updated•12 years ago
|
Keywords: sec-critical
Comment 1•12 years ago
|
||
When we figure this out, we'd like to know how far back it goes.
status-firefox22:
--- → affected
tracking-firefox22:
--- → +
Comment 2•12 years ago
|
||
bent says he's not working on bug 855276, which presumably means the same for this bug. Can we please get an owner? It's a frequent orange and s-s.
Updated•12 years ago
|
Comment 3•12 years ago
|
||
Kyle you lost our coin toss :) (Please find another owner if appropriate)
Assignee: nobody → khuey
Updated•12 years ago
|
Updated•12 years ago
|
Assignee: nobody → khuey
We suspect that this isn't actually a security issue, but we're not confident enough to open it up yet. In particular, comment 0 is wrong, and every instance of this we've seen is a null deref.
Whiteboard: work happening in bug 855276
Updated•12 years ago
|
Keywords: sec-critical → sec-other
We're pretty convinced this is a security bug again ;-)
Still haven't managed to reproduce.
Keywords: sec-other → sec-critical
I landed a patch to change from pass-by-reference to pass-by-value when calling this function and that didn't change anything, so this looks like actual heap corruption.
Comment 7•12 years ago
|
||
I'm going to assume this affects everything. Adjust as needed.
status-b2g18:
--- → affected
status-firefox20:
--- → wontfix
status-firefox21:
--- → affected
status-firefox-esr17:
--- → affected
Updated•12 years ago
|
Keywords: regression
I strongly suspect this is a regression for bug 767944.
Comment 9•12 years ago
|
||
Great, that would match up with the time this started happening.
Keywords: regression
Whiteboard: [regression from bug 767944?]
Yes that's my primary piece of evidence ;-)
I think we should consider backing out Bug 767944. I haven't managed to reproduce this locally :-/
Assignee | ||
Comment 12•12 years ago
|
||
Please give me a few days, I'll try to run IDB tests under valgrind.
Assignee | ||
Comment 13•12 years ago
|
||
Assignee: khuey → Jan.Varga
Comment 14•12 years ago
|
||
(In reply to Jan Varga [:janv] from comment #12)
> Please give me a few days, I'll try to run IDB tests under valgrind.
Where do we stand on this? Are we considering a backout at this point? We're nearing the end of the Fx23 cycle and I don't think we want to bring this over to Aurora.
Assignee | ||
Comment 15•12 years ago
|
||
The centralized quota manager is already on Aurora, but it seems the crash doesn't happen there so often.
Could the crash somehow relate to the fact that dom/indexedDB tests now run in mochitest-3 ? They used to run in mochitest-2.
Reporter | ||
Comment 16•12 years ago
|
||
It's important to keep in mind that this patch cannot be simply backed out. It changed the directory structure of idb databases so a complete backout would effectively lose all idb data. If we decide to back this out we'll need to handle that migration as well.
Assignee | ||
Comment 17•12 years ago
|
||
I'm testing a patch here:
https://tbpl.mozilla.org/?tree=Try&rev=01b6bfd06fd7
It looks promising so far ...
Assignee | ||
Comment 18•12 years ago
|
||
16 * 40 mochitest-3 runs, no crashes in MaybeFireCallback()
Should we land it on m-c and see if it really helps ?
Reporter | ||
Comment 19•12 years ago
|
||
Definitely.
Assignee | ||
Comment 20•12 years ago
|
||
Comment 21•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/093f7b379757
Looking very good to me! Please request Aurora uplift ASAP :)
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
Assignee | ||
Comment 22•12 years ago
|
||
Ben, it looks to me that we were firing a callback for some other database instance and that database instance didn't exist anymore.
Not sure how can that happen, but that's the only explanation I have for now.
Reporter | ||
Comment 23•12 years ago
|
||
We really need to understand why this happened before we move on or uplift or anything else here. Otherwise we risk simply hiding the problem or (worse) moving it somewhere else.
Reporter | ||
Updated•12 years ago
|
Whiteboard: [regression from bug 767944?] → [regression from bug 767944?][don't uplift yet, see comment 23]
(In reply to ben turner [:bent] from comment #23)
> We really need to understand why this happened before we move on or uplift
> or anything else here. Otherwise we risk simply hiding the problem or
> (worse) moving it somewhere else.
So are we going to do that ...?
We need to do something on beta.
Flags: needinfo?(bent.mozilla)
Assignee | ||
Comment 26•11 years ago
|
||
(In reply to ben turner [:bent] from comment #25)
> Jan, are you looking into this?
well, I can take a look ...
do you think it could be something like the thing described in comment 22 ?
Reporter | ||
Comment 27•11 years ago
|
||
I haven't been able to come up with any explanation of why this patch seems to work. Since you came up with it I'm hoping you might have a better idea ;)
Comment 28•11 years ago
|
||
There hasn't been enough traction here to get into Beta 22, and comment 27 doesn't make me feel good about getting this into our final two betas.
Comment 29•11 years ago
|
||
This is a "kungFuDeathGrip" type patch. The worst risk of uplifting it would seem to be memory leaks (but those should show up as unexpected leaks in tbpl and don't seem to be).
Blocks: 767944
Comment 30•11 years ago
|
||
I don't know how to appeal the "wontfix" because there's no attachment on which to request approval :-( but that is what the previous comment was attempting to be.
Comment 31•11 years ago
|
||
(In reply to ben turner [:bent] from comment #23)
> We really need to understand why this happened before we move on or uplift
> or anything else here. Otherwise we risk simply hiding the problem or
> (worse) moving it somewhere else.
Whether we uplift to Fx22 or not the above is true since the patch is on its way to release regardless.
Reporter | ||
Comment 32•11 years ago
|
||
Can we back this out of trunk before we branch? There's no reason to think that this patch is what we actually want.
Comment 33•11 years ago
|
||
Can we just back this out from beta after the merge rather than inflicting the nearly once per push crash back on the rest of the tree? This was one of our most frequent oranges when we were hitting it and the rate at which this and associated bugs has moved along doesn't leave me brimming with confidence that the pain of a trunk backout will be short-lived.
Assignee | ||
Comment 34•11 years ago
|
||
Ok, Ben and I suspect that those crashes we used to have before the "noone knows why it works" patch could be caused by the issue described in the bug 878703 and that the centralized manager somehow made the crashes to happen more often.
I'm going to backout the "noone knows why it works" patch on the try server this weekend. I need to retrigger the tests like hundreds of times to be sure that the crashes don't happen anymore, that's why I'll do it on the weekend, when the load is lower.
Updated•11 years ago
|
Assignee | ||
Comment 35•11 years ago
|
||
no, bug 878703 seems to be unrelated
when I backed out the hacky patch, crashes reappeared:
https://tbpl.mozilla.org/php/getParsedLog.php?id=24186335&tree=Try
Comment 36•11 years ago
|
||
Tagging this since we seem to be shipping this in two weeks.
Whiteboard: [regression from bug 767944?][don't uplift yet, see comment 23] → [regression from bug 767944?][don't uplift yet, see comment 23][adv-main23+]
Updated•11 years ago
|
Group: core-security
You need to log in
before you can comment on or make changes to this bug.
Description
•