Closed Bug 682593 Opened 13 years ago Closed 13 years ago

Crash, null pointer deref [@ nsBuiltinDecoderStateMachine::GetBuffered ]

Categories

(Core :: Audio/Video, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla9
Tracking Status
firefox8 - wontfix
firefox9 - ---

People

(Reporter: bjacob, Assigned: kinetik)

References

Details

(Keywords: crash, regression, topcrash, Whiteboard: [qa-])

Crash Data

Attachments

(1 file)

I was just looking at the top crashers for Firefox 9.0a1 on Linux. This is ranking 17th.

https://crash-stats.mozilla.com/report/index/b0c27c54-85d5-4b8f-8d7c-4f6612110826

This is a segfault at address 0, at this line of code:

http://hg.mozilla.org/mozilla-central/annotate/6009974c1e1c/content/media/nsBuiltinDecoderStateMachine.cpp#l1869

Which at first glance suggests that mDecoder is null; but the calling frame is

http://hg.mozilla.org/mozilla-central/annotate/6009974c1e1c/content/html/content/src/nsHTMLMediaElement.cpp#l2602

and we're inside of this if():

  if (mReadyState >= nsIDOMHTMLMediaElement::HAVE_CURRENT_DATA && mDecoder) {

So... mDecoder can't be null, as we just checked for it. So I'm clueless :-)
Severity: normal → critical
Keywords: crash, topcrash
https://crash-stats.mozilla.com/report/list?range_value=4&range_unit=weeks&signature=nsBuiltinDecoderStateMachine::GetBuffered says we have also been seeing this on Mac - and also on a 8.0a2 (aurora) build, so affects 8 as well, probably.
The real call chain is nsHTMLMediaElement->nsBuiltinDecoder->nsBuiltinDecoderStateMachine (the nsBuiltinDecoder step is not present in the Linux crash report).  nsBuiltinDecoderStateMachine assumes its mDecoder is never null, but can become null during shutdown, where mDecoder.forget() is called during the SHUTDOWN state in RunStateMachine.  This opens a window where mDecoder->mDecoderStateMachine is valid, but mDecoderStateMachine->mDecoder is not.

This code last changed in bug 592833.
Blocks: 592833
Keywords: regression
OS: Linux → All
Hardware: x86_64 → All
I just ran into this on http://lab.simurai.com/toy/monster/, with Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:8.0a2) Gecko/20110911 Firefox/8.0a2.
Assignee: nobody → kinetik
Status: NEW → ASSIGNED
Attached patch patch v0 (deleted) β€” β€” Splinter Review
Because of the assumption that an nsBuiltinDecoderStateMachine's mDecoder is never null, there are a number of similar null-deref crashes, such as attempting to set the media element's volume after a network error has occurred.

Avoid this by changing the shutdown dance slightly so that the decoder and state machine's pointers to each other are dropped at the same time.

Patch also removes unused Decoder() call on nsMediaStream discovered while auditing places where null-checks may have to be added.  Also moves mDecoder and mState decls on nsBuiltinDecoderStateMachine from public(!) to protected.
Attachment #561649 - Flags: review?(chris)
Comment on attachment 561649 [details] [diff] [review]
patch v0

Review of attachment 561649 [details] [diff] [review]:
-----------------------------------------------------------------

Great, thanks. Lets get this landed before the next m-c -> aurora merge on September 27.
Attachment #561649 - Flags: review?(chris) → review+
Comment on attachment 561649 [details] [diff] [review]
patch v0

http://hg.mozilla.org/integration/mozilla-inbound/rev/cf4a13b84474
Target Milestone: --- → mozilla9
https://hg.mozilla.org/mozilla-central/rev/cf4a13b84474
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Does this need tracking/status for Firefox 9?

What is the testcase to verify this fix? I see a test URL in comment 3 but it's not clear to me what is needed to test for this crash (apart from simply checking crashstats for the prevalence of this crash post-fix)
I tested it by loading a video from an HTTP server that returned a 400 error for every request after the first.  After loading the video, I would seek into an unbuffered area, which would fail due to the HTTP error and cause the decoder to be shutdown.  Then execute element.buffered.length or element.volume=0 and see if a crash occurs.
QA needs a minimized testcase to verify this fix -- given the requirements in comment 9.

qa- until a testcase can be produced.
Keywords: testcase-wanted
Whiteboard: [qa-]
Some Windows crashes showing up in Beta - https://crash-stats.mozilla.com/report/list?signature=nsBuiltinDecoderStateMachine::GetBuffered%28nsTimeRanges*%29. The signature is slightly different than this one so I added it to the crash signature field.

The volume in Beta is not extremely high - 125 crashes in Beta 1 and 37 in Beta 2. We hit over 1 million ADU in Beta 1 so in the scheme of things it is not high volume, but maybe a nice to have fix for 8? Adding the tracking flag so the triage team can weigh in.
Crash Signature: [@ nsBuiltinDecoderStateMachine::GetBuffered ] → [@ nsBuiltinDecoderStateMachine::GetBuffered ] [@ nsBuiltinDecoderStateMachine::GetBuffered(nsTimeRanges*) ]
---------------------------------[ Triage Comment ]---------------------------------

Because the volumes are not high we will not track this for Firefox 8. Please renominate if information changes or we missed something.

This landed in Firefox 9 and the above holds so we won't track it there either.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: