Closed
Bug 802168
Opened 12 years ago
Closed 12 years ago
Assertion failure: (next == this) == (prev == this) discarding images in LinkedList
Categories
(Core :: Graphics: ImageLib, defect)
Core
Graphics: ImageLib
Tracking
()
RESOLVED
FIXED
mozilla19
People
(Reporter: bc, Assigned: joe)
References
()
Details
(Keywords: assertion, regression, sec-critical, Whiteboard: [adv-track-main17+][adv-track-esr17+])
Attachments
(5 files)
(deleted),
text/html
|
Details | |
(deleted),
text/html
|
Details | |
(deleted),
application/x-gzip
|
Details | |
(deleted),
patch
|
jrmuizel
:
review+
jdm
:
review+
dveditz
:
sec-approval+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
joe
:
review+
lsblakk
:
approval-mozilla-aurora+
lsblakk
:
approval-mozilla-beta+
akeybl
:
approval-mozilla-esr10+
dveditz
:
sec-approval+
|
Details | Diff | Splinter Review |
1. http://epic.vn/
Lots of other examples in automation.
2. Assertion failure: (next == this) == (prev == this), at c:\work\mozilla\builds\nightly\mozilla\firefox-debug\dist\include\mozilla/LinkedList.h:165
mozilla::LinkedListElement<mozilla::image::DiscardTracker::Node>::isInList() mozilla::image::RasterImage::DiscardingActive() mozilla::image::RasterImage::UnlockImage() imgRequestProxy::UnlockImage() nsDocument::RemoveImage(imgIRequest*, unsigned int)
Reproduced on Windows and Linux so far.
Also have seen Windows crashes at nsCOMPtr<nsIWeakReference>::~nsCOMPtr<nsIWeakReference>() | mozilla::image::RasterImage::~RasterImage() mozilla::image::RasterImage::`scalar deleting destructor'(unsigned int) + 0xe mozilla::image::RasterImage::Release() nsRefPtr<mozilla::image::Image>::~nsRefPtr<mozilla::image::Image>() imgRequest::~imgRequest()
that are rated low exploitable by breakpad.
There are a number of other stacks associated with this. Marking s-s.
Crashes OSX debug as well.
Reporter | ||
Comment 1•12 years ago
|
||
the OSX crash was at:
0x000000010391f650 in _moz_cairo_surface_flush (surface=0x6000000013381b29) at /work/mozilla/builds/nightly/mozilla/gfx/cairo/cairo/src/cairo-surface.c:1107
1107 if (surface->status)
Current language: auto; currently minimal
(gdb) bt
#0 0x000000010391f650 in _moz_cairo_surface_flush (surface=0x6000000013381b29) at /work/mozilla/builds/nightly/mozilla/gfx/cairo/cairo/src/cairo-surface.c:1107
#1 0x00000001036fbc50 in gfxASurface::Flush (this=0x13401af00) at /work/mozilla/builds/nightly/mozilla/gfx/thebes/gfxASurface.cpp:246
#2 0x00000001015cc68f in imgFrame::Optimize (this=0x13401aa90) at /work/mozilla/builds/nightly/mozilla/image/src/imgFrame.cpp:330
#3 0x00000001015b5472 in mozilla::image::RasterImage::DecodingComplete (this=0x11431aa90) at /work/mozilla/builds/nightly/mozilla/image/src/RasterImage.cpp:1429
#4 0x00000001015ac5bc in mozilla::image::Decoder::Finish (this=0x100c4aa00) at /work/mozilla/builds/nightly/mozilla/image/src/Decoder.cpp:123
#5 0x00000001015af709 in mozilla::image::RasterImage::ShutdownDecoder (this=0x11431aa90, aIntent=mozilla::image::RasterImage::eShutdownIntent_Interrupted) at /work/mozilla/builds/nightly/mozilla/image/src/RasterImage.cpp:2426
#6 0x00000001015af3dd in mozilla::image::RasterImage::~RasterImage (this=0x11431aa90) at /work/mozilla/builds/nightly/mozilla/image/src/RasterImage.cpp:279
#7 0x00000001015af215 in mozilla::image::RasterImage::~RasterImage (this=0x11431aa90) at /work/mozilla/builds/nightly/mozilla/image/src/RasterImage.cpp:252
#8 0x00000001015af1e9 in mozilla::image::RasterImage::~RasterImage (this=0x11431aa90) at /work/mozilla/builds/nightly/mozilla/image/src/RasterImage.cpp:252
#9 0x00000001015aecd9 in mozilla::image::RasterImage::Release (this=0x11431aa90) at /work/mozilla/builds/nightly/mozilla/image/src/RasterImage.cpp:208
Reporter | ||
Comment 3•12 years ago
|
||
I'll rebuild and recheck the urls. That would explain the wide variety of crash signatures/stacks I'm seeing.
Reporter | ||
Comment 4•12 years ago
|
||
Still running the full set of tests but so far I'm seeing
###!!! ABORT: Invalid frame index!: 'aFrameNum < mFrames.Length()', file c:/work/mozilla/builds/nightly/mozilla/image/src/RasterImage.cpp, line 1292
with new builds.
Comment 5•12 years ago
|
||
ok can you maybe reduce this or at least come up with a local testcase?
Reporter | ||
Comment 6•12 years ago
|
||
epic.vn doesn't reproduce locally but I have other candidates to try once the retest completes.
Comment 7•12 years ago
|
||
bug 802144 should be the proper fix here.
Reporter | ||
Comment 8•12 years ago
|
||
the original.htm lithium testcase where I saved the html page (but not complete). This reproduces on win7 about 90% of the time. This still accesses the network.
Reporter | ||
Comment 9•12 years ago
|
||
reduced to 34 lines with reproducibility of 30% still with network access. hopefully this helps. let me know if you need a better one and I'll try the other urls as well. PS this was from epic.vn.
Comment 10•12 years ago
|
||
Unfortunately I'm doubtful of my ability to turn these into some kind of automated test. The problem is clear, but it is extraordinarily timing-dependent in ways that can't easily be constructed artificially.
Comment 11•12 years ago
|
||
if bug 802144 is the fix then this looks sec-critical, although the timing-dependent nature may prevent a reliable exploit so I'll go with sec-high
Keywords: sec-high
Reporter | ||
Comment 12•12 years ago
|
||
This is not fixed. I have additional urls which reproduce the original assertion on the linked list. I have a number of urls I can retest. I'll check back in after I retest them.
Reporter | ||
Comment 13•12 years ago
|
||
http://alimama.vn/mua-ban/ban-theo-bo-set/?&p=2
==7234== Invalid write of size 8
==7234== at 0x197C2A51: gpusLoadTransformFeedbackBuffers (in /System/Library/PrivateFrameworks/GPUSupport.framework/Versions/A/Libraries/libGPUSupport.dylib)
==7234== Invalid write of size 8
==7234== at 0x8816460: mozilla::LinkedListElement<mozilla::image::DiscardTracker::Node>::setNextUnsafe(mozilla::image::DiscardTracker::Node*) (LinkedList.h:211)
==7234== Invalid read of size 1
==7234== at 0x881638C: mozilla::LinkedListElement<mozilla::image::DiscardTracker::Node>::asT() (LinkedList.h:189)
==7234== Invalid read of size 8
==7234== at 0x840F79C: mozilla::TimeStamp::IsNull() const (TimeStamp.h:188)
==7234== Invalid read of size 8
==7234== at 0x840F6E9: mozilla::TimeStamp::operator-(mozilla::TimeStamp const&) const (TimeStamp.h:202)
==7234== Invalid read of size 8
==7234== at 0x840F6F7: mozilla::TimeStamp::operator-(mozilla::TimeStamp const&) const (TimeStamp.h:204)
==7234== Invalid read of size 8
==7234== at 0x88156A4: mozilla::image::DiscardTracker::DiscardNow() (DiscardTracker.cpp:268)
==7234== Invalid read of size 1
==7234== at 0x881D241: mozilla::image::RasterImage::CanDiscard() (RasterImage.cpp:2311)
Reporter | ||
Comment 14•12 years ago
|
||
Note I am seeing assertions and crashes related to the above valgrind errors as well as a number of javascript and layout related assertions and crashes on this set of urls. They are all over the map. I'm calling is sec-critical.
Keywords: sec-high → sec-critical
Assignee | ||
Comment 15•12 years ago
|
||
Bob, can you do a --track-origins valgrind run while we get this into someone's queue?
Reporter | ||
Comment 16•12 years ago
|
||
the run was done with valgrind --dsymutil=yes --trace-children=yes --track-origins=yes --smc-check=all-non-file
Assignee | ||
Comment 17•12 years ago
|
||
Ahem, next time I will look at your log before asking you for things you've already done
Reporter | ||
Comment 18•12 years ago
|
||
(In reply to Bob Clary [:bc:] from comment #13)
> ==7234== Invalid write of size 8
> ==7234== at 0x197C2A51: gpusLoadTransformFeedbackBuffers (in
> /System/Library/PrivateFrameworks/GPUSupport.framework/Versions/A/Libraries/
> libGPUSupport.dylib)
Who can we report this to?
Assignee | ||
Comment 19•12 years ago
|
||
Apple, I guess.
Assignee | ||
Comment 21•12 years ago
|
||
Got a bit more of a stack by using --num-callers=20
==27880== Invalid write of size 8
==27880== at 0x698C1C5: mozilla::image::DiscardTracker::Reset(mozilla::image::DiscardTracker::Node*) (LinkedList.h:211)
==27880== by 0x6990038: mozilla::image::RasterImage::DecodingComplete() (RasterImage.cpp:1552)
==27880== by 0x698BB04: mozilla::image::Decoder::Finish() (Decoder.cpp:123)
==27880== by 0x698FCA5: mozilla::image::RasterImage::ShutdownDecoder(mozilla::image::RasterImage::eShutdownIntent) (RasterImage.cpp:2563)
==27880== by 0x699494C: mozilla::image::RasterImage::~RasterImage() (RasterImage.cpp:404)
==27880== by 0x6994B1B: mozilla::image::RasterImage::~RasterImage() (RasterImage.cpp:417)
==27880== by 0x698C5FF: mozilla::image::RasterImage::Release() (in /home/joe/src/mozilla-central/obj-ff-opt/toolkit/library/libxul.so)
==27880== by 0x69A633D: imgRequest::~imgRequest() (nsAutoPtr.h:874)
==27880== by 0x69A63B3: imgRequest::~imgRequest() (imgRequest.cpp:107)
==27880== by 0x69A3A98: imgRequest::Release() (in /home/joe/src/mozilla-central/obj-ff-opt/toolkit/library/libxul.so)
==27880== by 0x69A8EAF: imgRequestProxy::~imgRequestProxy() (imgRequestProxy.cpp:89)
==27880== by 0x69A73B7: imgRequestProxy::Release() (in /home/joe/src/mozilla-central/obj-ff-opt/toolkit/library/libxul.so)
==27880== by 0x6C7EDC6: nsImageLoadingContent::MakePendingRequestCurrent() (nsImageLoadingContent.cpp:972)
==27880== by 0x6C7F39F: nsImageLoadingContent::LoadImage(nsIURI*, bool, bool, nsIDocument*, unsigned int) (nsImageLoadingContent.cpp:678)
==27880== by 0x6C7F590: nsImageLoadingContent::LoadImage(nsAString_internal const&, bool, bool) (nsImageLoadingContent.cpp:578)
==27880== by 0x6D8236F: nsHTMLImageElement::SetAttr(int, nsIAtom*, nsIAtom*, nsAString_internal const&, bool) (nsHTMLImageElement.cpp:378)
==27880== by 0x6D59695: nsGenericHTMLElement::SetAttribute(nsAString_internal const&, nsAString_internal const&) (nsGenericHTMLElement.cpp:344)
==27880== by 0x724C28F: nsIDOMElement_SetAttribute(JSContext*, unsigned int, JS::Value*) (dom_quickstubs.cpp:2060)
==27880== by 0x1DE7F839: ???
==27880== by 0x7F867E5: js::mjit::EnterMethodJIT(JSContext*, js::StackFrame*, void*, JS::Value*, bool) (MethodJIT.cpp:1043)
==27880== Address 0x25f1c3b0 is 144 bytes inside a block of size 344 free'd
==27880== at 0x402B579: free (vg_replace_malloc.c:446)
==27880== by 0x698C5FF: mozilla::image::RasterImage::Release() (in /home/joe/src/mozilla-central/obj-ff-opt/toolkit/library/libxul.so)
==27880== by 0x69A633D: imgRequest::~imgRequest() (nsAutoPtr.h:874)
==27880== by 0x69A63B3: imgRequest::~imgRequest() (imgRequest.cpp:107)
==27880== by 0x69A3A98: imgRequest::Release() (in /home/joe/src/mozilla-central/obj-ff-opt/toolkit/library/libxul.so)
==27880== by 0x69A8EAF: imgRequestProxy::~imgRequestProxy() (imgRequestProxy.cpp:89)
==27880== by 0x69A73B7: imgRequestProxy::Release() (in /home/joe/src/mozilla-central/obj-ff-opt/toolkit/library/libxul.so)
==27880== by 0x6C7EDC6: nsImageLoadingContent::MakePendingRequestCurrent() (nsImageLoadingContent.cpp:972)
==27880== by 0x6C7F39F: nsImageLoadingContent::LoadImage(nsIURI*, bool, bool, nsIDocument*, unsigned int) (nsImageLoadingContent.cpp:678)
==27880== by 0x6C7F590: nsImageLoadingContent::LoadImage(nsAString_internal const&, bool, bool) (nsImageLoadingContent.cpp:578)
==27880== by 0x6D8236F: nsHTMLImageElement::SetAttr(int, nsIAtom*, nsIAtom*, nsAString_internal const&, bool) (nsHTMLImageElement.cpp:378)
==27880== by 0x6D59695: nsGenericHTMLElement::SetAttribute(nsAString_internal const&, nsAString_internal const&) (nsGenericHTMLElement.cpp:344)
==27880== by 0x724C28F: nsIDOMElement_SetAttribute(JSContext*, unsigned int, JS::Value*) (dom_quickstubs.cpp:2060)
==27880== by 0x1DE7F839: ???
==27880== by 0x7F867E5: js::mjit::EnterMethodJIT(JSContext*, js::StackFrame*, void*, JS::Value*, bool) (MethodJIT.cpp:1043)
==27880== by 0x7F86E04: js::mjit::JaegerShot(JSContext*, bool) (MethodJIT.cpp:1101)
==27880== by 0x7D82815: js::Interpret(JSContext*, js::StackFrame*, js::InterpMode) (jsinterp.cpp:2423)
==27880== by 0x7D864CE: js::RunScript(JSContext*, JS::Handle<JSScript*>, js::StackFrame*) (jsinterp.cpp:324)
==27880== by 0x7D8695B: js::InvokeKernel(JSContext*, JS::CallArgs, js::MaybeConstruct) (jsinterp.cpp:379)
==27880== by 0x7D2DEAE: js_fun_call(JSContext*, unsigned int, JS::Value*) (jsinterp.h:109)
Reporter | ||
Comment 22•12 years ago
|
||
fyi, i am seeing Assertion failure: (next == this) == (prev == this) on over 100 urls now. This does not include related crashes which may push the url count higher. The number and variety of crash signatures and other fatal assertion messages associated with these urls is quite staggering. In fact the noise from these crashes is overwhelming the signal from other bugs. It is difficult to tell them apart.
Assignee | ||
Comment 23•12 years ago
|
||
Shutting down a decoder from the destructor is a little too high-touch, because it's really meant for an image that's going to be sticking around for a while, and we emphatically aren't.
The real problem here is that we removed ourselves from the DiscardTracker, then stopped the decoder correctly, which said "Oh, since we've finished decoding, time to make ourselves discardable" and re-added us to the DiscardTracker. Then our memory was deleted. Good times.
Attachment #673364 -
Flags: review?(jmuizelaar)
Comment 24•12 years ago
|
||
Comment on attachment 673364 [details] [diff] [review]
don't shut down decoders in ~RasterImage
Review of attachment 673364 [details] [diff] [review]:
-----------------------------------------------------------------
::: image/src/RasterImage.cpp
@@ +402,5 @@
> if (mDecoder) {
> + // Kill off our decode request, if it's pending. (If not, this call is
> + // harmless.)
> + DecodeWorker::Singleton()->StopDecoding(this);
> + mDecoder = nullptr;
I hope this is safe.
Attachment #673364 -
Flags: review?(jmuizelaar) → review+
Assignee | ||
Comment 25•12 years ago
|
||
Comment on attachment 673364 [details] [diff] [review]
don't shut down decoders in ~RasterImage
[Security approval request comment]
How easily can the security issue be deduced from the patch?
Not easily. I'm not even sure how easily it can be reproduced.
Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
No.
Which older supported branches are affected by this flaw?
None, probably.
If not all supported branches, which bug introduced the flaw?
Very likely bug 505385.
Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
N/A
How likely is this patch to cause regressions; how much testing does it need?
It could cause issues in debug builds from old assertions firing, so needs a bunch of testing.
Attachment #673364 -
Flags: sec-approval?
Assignee | ||
Comment 26•12 years ago
|
||
Assignee | ||
Comment 27•12 years ago
|
||
Comment on attachment 673364 [details] [diff] [review]
don't shut down decoders in ~RasterImage
Josh, can you convince yourself whether this is correct or not?
Attachment #673364 -
Flags: review?(josh)
Updated•12 years ago
|
Blocks: 505385
status-firefox-esr10:
--- → unaffected
status-firefox18:
--- → unaffected
status-firefox19:
--- → affected
tracking-firefox19:
--- → +
Comment 28•12 years ago
|
||
Comment on attachment 673364 [details] [diff] [review]
don't shut down decoders in ~RasterImage
If this isn't affecting older branches we don't have to worry about the patch giving away too much (not that this one does!)
sec-approval+
Attachment #673364 -
Flags: sec-approval? → sec-approval+
Updated•12 years ago
|
Flags: in-testsuite?
Comment 29•12 years ago
|
||
Comment on attachment 673364 [details] [diff] [review]
don't shut down decoders in ~RasterImage
Review of attachment 673364 [details] [diff] [review]:
-----------------------------------------------------------------
This looks safe to me, given that each image should only be in the request list at most once.
Attachment #673364 -
Flags: review?(josh) → review+
Assignee | ||
Comment 30•12 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/4b3247d6507b
I was never able to get an automated test to work unfortunately :(
Flags: in-testsuite? → in-testsuite-
Target Milestone: --- → mozilla19
Comment 31•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 33•12 years ago
|
||
fyi, i retested all of the urls where automation had seen this assertion and could not reproduce after the patch landed. Thanks!
Comment 34•12 years ago
|
||
Bug 804041 was marked as a dupe of this and is tracking+ for 17 - should this patch land on Aurora and Beta as well? (And if so, should we track this for 17 and remove the dupe from tracking, as it shows up on the list of nonfixed trackers for that release?)
Comment 35•12 years ago
|
||
Something needs to land on Aurora + Beta, whether it's the code from bug 804041, bug 803688, or this bug. I'd guess that the first two options are safer. The first one is a bit smaller of a change, while the second one points a slightly larger arrow at the attack vector (but it's still not particularly obvious).
Assignee | ||
Comment 36•12 years ago
|
||
Actually bug 804041 would be a bit more of an arrow at the vector. Either of bug 803688 or this bug would be less obvious. I think, however, that bug 804041 would be a bit better for uplift.
I'll let the security team decide which to push, though.
Assignee | ||
Comment 37•12 years ago
|
||
[Security approval request comment]
How easily can the security issue be deduced from the patch? More easily than the other options; see comment 36.
Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No.
Which older supported branches are affected by this flaw? All, probably.
How likely is this patch to cause regressions; how much testing does it need? Not terribly likely to regress things.
Attachment #675369 -
Flags: sec-approval?
Attachment #675369 -
Flags: review+
Assignee | ||
Updated•12 years ago
|
Attachment #675369 -
Attachment description: less risky patch for branches → bug 804041 patch (jlebar)
Comment 38•12 years ago
|
||
Comment on attachment 675369 [details] [diff] [review]
bug 804041 patch (jlebar)
sec-approval+
This one looks good too, land which ever one fixes the problem(s) with the least risk of regressions on branches.
Attachment #675369 -
Flags: sec-approval? → sec-approval+
Assignee | ||
Comment 39•12 years ago
|
||
Comment on attachment 675369 [details] [diff] [review]
bug 804041 patch (jlebar)
[Approval Request Comment]
Bug caused by (feature/regressing bug #): not regression
User impact if declined: security bug
Testing completed (on m-c, etc.): A morally equivalent but riskier fix is on m-c.
Risk to taking this patch (and alternatives if risky): Less risky than the other alternatives.
String or UUID changes made by this patch: none
Attachment #675369 -
Flags: approval-mozilla-beta?
Attachment #675369 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 40•12 years ago
|
||
While this assertion doesn't happen on other branches, the underlying bug does.
Comment 41•12 years ago
|
||
Comment on attachment 675369 [details] [diff] [review]
bug 804041 patch (jlebar)
been on central 3 days, approving for branches.
Attachment #675369 -
Flags: approval-mozilla-beta?
Attachment #675369 -
Flags: approval-mozilla-beta+
Attachment #675369 -
Flags: approval-mozilla-aurora?
Attachment #675369 -
Flags: approval-mozilla-aurora+
Assignee | ||
Updated•12 years ago
|
Attachment #675369 -
Flags: approval-mozilla-esr10?
Assignee | ||
Comment 42•12 years ago
|
||
Updated•12 years ago
|
Attachment #675369 -
Flags: approval-mozilla-esr10? → approval-mozilla-esr10+
Updated•12 years ago
|
tracking-firefox-esr10:
--- → 17+
Assignee | ||
Comment 43•12 years ago
|
||
Updated•12 years ago
|
Whiteboard: [adv-track-main17+][adv-track-esr17+]
Comment 44•12 years ago
|
||
Kamil, can you please test to see if this is fixed? You should be able to reproduce this assertion on Windows using the debug Nightly from Oct 21st(1). Once reproduced, please test this is fixed using the latest debug Nightly(2), Aurora(3), Beta(4), and ESR10(5) builds.
1. ftp://ftp.mozilla.org/pub/firefox/nightly/2012-10-21-mozilla-central-debug/
2. ftp://ftp.mozilla.org/pub/firefox/nightly/2012-11-14-mozilla-central-debug/
3. ftp://ftp.mozilla.org/pub/firefox/nightly/2012-11-14-mozilla-aurora-debug/
4. ftp://ftp.mozilla.org/pub/firefox/nightly/2012-11-14-mozilla-beta-debug/
5. ftp://ftp.mozilla.org/pub/firefox/nightly/2012-11-14-mozilla-esr10-debug/
Please coordinate with my via email or on IRC (ashughes in #qa on irc.mozilla.org) if you run into problems. Thanks.
Comment 45•12 years ago
|
||
When going through the following issue, am I looking for actual crashes or just a assertion dialog box? Currently, I am just receiving crashes with both of the central-debug versions linked by Anthony (2012-10-21 & 2012-11-14)
Here's a crash report that occurred on the 2012-11-14 central-debug version:
https://crash-stats.mozilla.com/report/index/bp-e5e2664b-024d-492a-911f-4225c2121116
Comment 46•12 years ago
|
||
I'm not sure you will see a crash but you should at least see the assertion. To see the assertion you will need to run debug Firefox from a command line. The assertion should show in the command line output when running the testcase (you won't see anything in the Firefox window).
Reporter | ||
Comment 47•12 years ago
|
||
I ran the original test case locally over 100 times on Windows 7 32bit esxi vm and a debug Nightly from yesterday and could not reproduce a crash or assertion.
Kamil, in case you are seeing an artifact of previous testing, please reboot, install a fresh debug build and use a fresh profile.
Updated•12 years ago
|
Group: core-security
You need to log in
before you can comment on or make changes to this bug.
Description
•