Closed
Bug 1444537
Opened 7 years ago
Closed 7 years ago
100% CPU usage due to what looks like GIF image decoding stuck in a loop
Categories
(Core :: Layout: Images, Video, and HTML Frames, defect, P3)
Core
Layout: Images, Video, and HTML Frames
Tracking
()
RESOLVED
FIXED
mozilla61
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox59 | --- | unaffected |
firefox60 | --- | fixed |
firefox61 | --- | fixed |
People
(Reporter: mstange, Assigned: aosmond)
References
Details
(Keywords: power, regression)
Crash Data
Attachments
(4 files, 1 obsolete file)
(deleted),
patch
|
tnikkel
:
review+
jcristau
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
tnikkel
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
tnikkel
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
aosmond
:
review+
|
Details | Diff | Splinter Review |
Profile: https://perfht.ml/2GfPgbL
I only have one window open and was looking at a blank tab while taking this profile. My battery percentage had been going down at an alarming rate for at least 20 minutes before I started to investigate.
Assignee | ||
Comment 2•7 years ago
|
||
I'm investigating this. I suspect it is related to some recent shutdown crashes, e.g.:
https://crash-stats.mozilla.com/report/index/4c97739d-7eb7-4d9a-952d-c10be0180309#allthreads
My best guess at this point is that it never releases the thread, hence the shutdown hang / 100% CPU. This could in theory happen because of how advancing and discarding interact.
Assignee: nobody → aosmond
Crash Signature: [@ shutdownhang | nsThread::Shutdown | mozilla::image::DecodePoolImpl::Shutdown ]
Assignee | ||
Comment 3•7 years ago
|
||
Assignee | ||
Comment 4•7 years ago
|
||
I'm not convinced this is the correct solution yet... but in theory, it would make these crash reports go away if it is simply a matter of the refresh ticks advancing the animation.
Comment 5•7 years ago
|
||
Can we get a status here?
Assignee | ||
Comment 6•7 years ago
|
||
Comment on attachment 8958189 [details] [diff] [review]
0002-Bug-1444537-Part-1.-Ensure-discarding-of-animated-im.patch, v1
I haven't been able to figure out any other root cause besides the shutdown path. Let's just land and see if it happens again.
Flags: needinfo?(aosmond)
Attachment #8958189 -
Flags: review?(tnikkel)
Assignee | ||
Updated•7 years ago
|
Attachment #8958192 -
Flags: review?(tnikkel)
Updated•7 years ago
|
Attachment #8958189 -
Flags: review?(tnikkel) → review+
Updated•7 years ago
|
Attachment #8958192 -
Flags: review?(tnikkel) → review+
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4d66a4a931e4
Part 1. Ensure discarding of animated image frames is clear. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/1c5d4e965209
Part 2. Shutting down the decode pool should make animated decoders bail early. r=tnikkel
Comment 8•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/4d66a4a931e4
https://hg.mozilla.org/mozilla-central/rev/1c5d4e965209
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Assignee | ||
Comment 9•7 years ago
|
||
I want to see the crashes go away, will then request uplift.
Flags: needinfo?(aosmond)
Updated•7 years ago
|
status-firefox59:
--- → unaffected
status-firefox-esr52:
--- → unaffected
Assignee | ||
Updated•7 years ago
|
Crash Signature: [@ shutdownhang | nsThread::Shutdown | mozilla::image::DecodePoolImpl::Shutdown ] → [@ shutdownhang | nsThread::Shutdown | mozilla::image::DecodePoolImpl::Shutdown ]
[@ shutdownhang | mozilla::SpinEventLoopUntil<T> | nsThread::Shutdown | mozilla::image::DecodePoolImpl::Shutdown ]
[@ shutdownhang | __psynch_cvwait | <name omitted> | nsT…
Assignee | ||
Comment 10•7 years ago
|
||
Comment on attachment 8958189 [details] [diff] [review]
0002-Bug-1444537-Part-1.-Ensure-discarding-of-animated-im.patch, v1
Approval Request Comment
[Feature/Bug causing the regression]: Bug 523950
[User impact if declined]: May hang on shutdown due to stuck decoding an animated image.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: The reproduction rate was low on nightly, but it has not reoccurred since landing.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: Both patches in this bug (part 1 and part 2).
[Is the change risky?]: No.
[Why is the change risky/not risky?]: The changes are small and well understood. They should only have an functional impact during shutdown.
[String changes made/needed]: None.
Flags: needinfo?(aosmond)
Attachment #8958189 -
Flags: approval-mozilla-beta?
Comment 11•7 years ago
|
||
Comment on attachment 8958189 [details] [diff] [review]
0002-Bug-1444537-Part-1.-Ensure-discarding-of-animated-im.patch, v1
gotta have our gifs. approved for 60.0b8
Attachment #8958189 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 12•7 years ago
|
||
bugherder uplift |
Comment 13•7 years ago
|
||
(In reply to Andrew Osmond [:aosmond] from comment #10)
> [User impact if declined]: May hang on shutdown due to stuck decoding an
> animated image.
> [Is this code covered by automated tests?]: Yes.
> [Has the fix been verified in Nightly?]: The reproduction rate was low on
> nightly, but it has not reoccurred since landing.
> [Needs manual test from QE? If yes, steps to reproduce]: No.
Setting qe-verify- based on Andrew's assessment on manual testing needs and the fact that this fix has automated tests.
Flags: qe-verify-
Assignee | ||
Comment 14•7 years ago
|
||
This is still happening. I see a path I missed, ugh.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 15•7 years ago
|
||
In two places we should have also checked if we were shutting down:
https://searchfox.org/mozilla-central/rev/b55e1a1cbcaee34878e133fbac20c4c2af6e11b5/image/AnimationSurfaceProvider.cpp#275
https://searchfox.org/mozilla-central/rev/b55e1a1cbcaee34878e133fbac20c4c2af6e11b5/image/AnimationSurfaceProvider.cpp#332
But this begs the question, why are we so frequently in always in AnimationSurfaceProvider::FinishDecoding in the crash reports, or a similar teardown method? Most of the time during decode should be spent producing frames, not destroying the decoder. It doesn't seem like it is coincidence.
In theory we could hit an infinite loop if we recreate the decoder, and it terminates before producing a frame. It takes the early return at line 332 and loops again without giving up the thread context.
If a decoder doesn't produce the same number of frames on a redecode as the first decode, we should probably abort and stop the animation. We've already forgotten about the image at this point, so it is difficult for us to signal it and update the number of frames, get back in sync with FrameAnimator. Ugly.
Assignee | ||
Comment 16•7 years ago
|
||
Assignee | ||
Comment 17•7 years ago
|
||
Assignee | ||
Comment 18•7 years ago
|
||
Comment on attachment 8966978 [details] [diff] [review]
0002-Bug-1444537-Part-4.-Add-gtests-for-AnimatedFrameBuff.patch, v1
I realize the behaviour is less than ideal but we need something that can be uplifted to beta. The crash rate is high enough that I do worry what the release population will look like. Open to alternatives!
As we move towards generating full frames, and ditching FrameAnimator, I can promise things will get better :).
Attachment #8966978 -
Flags: review?(tnikkel)
Assignee | ||
Updated•7 years ago
|
Attachment #8966977 -
Flags: review?(tnikkel)
Updated•7 years ago
|
Attachment #8966977 -
Flags: review?(tnikkel) → review+
Updated•7 years ago
|
Attachment #8966978 -
Flags: review?(tnikkel) → review+
Comment 19•7 years ago
|
||
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/07f5d48b90cb
Part 3. Fix how redecode errors could cause animated image state inconsistencies. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/0d23d74448c8
Part 4. Add gtests for AnimatedFrameBuffer for redecode errors. r=tnikkel
Comment 20•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/07f5d48b90cb
https://hg.mozilla.org/mozilla-central/rev/0d23d74448c8
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Resolution: --- → FIXED
Comment 21•7 years ago
|
||
bugherder |
Comment 22•7 years ago
|
||
Backout by csabou@mozilla.com:
https://hg.mozilla.org/mozilla-central/rev/a0c455e036df
Backed out 2 changesets for causing crashes on test_discardFramesAnimatedImage.html. a=backout
Comment 23•7 years ago
|
||
Backed out for causing crashes on test_discardFramesAnimatedImage.html
Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-searchStr=android%20debug%20M(41)&fromchange=16c170eb322ba0e07ce412aa02c05b7dae624549&tochange=837a6f4efa3eaf3cff106349b735e718efb4d4a6&group_state=expanded&selectedJob=173704843
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=173704843&repo=mozilla-inbound&lineNumber=1999
Backout link: https://hg.mozilla.org/mozilla-central/rev/a0c455e036df741556d695cf46baeacc581c0c58
Status: RESOLVED → REOPENED
status-firefox61:
fixed → ---
Flags: needinfo?(aosmond)
Resolution: FIXED → ---
Target Milestone: mozilla61 → ---
Comment 24•7 years ago
|
||
thanks for backing this out, you can see many failures on bug 1454130 and some retriggers that show this is the root cause:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-searchStr=test-android-4.3-arm7-api-16%2Fdebug-mochitest-41&fromchange=6fd88252e9c240b07e195493aefebee9953dd90a&selectedJob=173704843
Assignee | ||
Updated•7 years ago
|
Assignee | ||
Comment 25•7 years ago
|
||
This took more time than I thought it would. The stack trace was a bit misleading, as it felt like a nullptr deref, but I believe it was actually an assertion failure. It should happen on other platforms besides Android however as I don't believe it to be a timing problem in general. It was a result of a missing continue I removed in error.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d0c46e142e953fae02989b2cd7939c3f62b57b3c
If the try is green, I will reland.
Attachment #8966977 -
Attachment is obsolete: true
Flags: needinfo?(aosmond)
Attachment #8970557 -
Flags: review+
Comment 26•7 years ago
|
||
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1173dccd114e
Part 3. Fix how redecode errors could cause animated image state inconsistencies. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/96a90528f1cd
Part 4. Add gtests for AnimatedFrameBuffer for redecode errors. r=tnikkel
Comment 27•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/1173dccd114e
https://hg.mozilla.org/mozilla-central/rev/96a90528f1cd
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
status-firefox61:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Updated•6 years ago
|
Product: Core → Core Graveyard
Updated•6 years ago
|
Product: Core Graveyard → Core
You need to log in
before you can comment on or make changes to this bug.
Description
•