Closed
Bug 1280443
Opened 8 years ago
Closed 8 years ago
Crash in nr_socket_sendto
Categories
(Core :: WebRTC: Networking, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla50
backlog | webrtc/webaudio+ |
People
(Reporter: bwc, Assigned: bwc)
References
Details
(Keywords: csectype-race, csectype-uaf, sec-critical, Whiteboard: [adv-main48+])
Attachments
(1 file, 1 obsolete file)
(deleted),
patch
|
drno
:
review+
Sylvestre
:
approval-mozilla-aurora+
Sylvestre
:
approval-mozilla-beta+
Sylvestre
:
approval-mozilla-release+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
From crash-stats:
https://crash-stats.mozilla.com/signature/?date=%3E01%2F01%2F2016&signature=nr_socket_sendto&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&page=1#reports
As of 48, we are seeing crashes in nr_socket_sendto, some of which look like UAF. Here's the mtransport stuff that landed in 48:
https://bugzilla.mozilla.org/buglist.cgi?list_id=13071777&resolution=FIXED&query_format=advanced&component=WebRTC%3A%20Networking&target_milestone=mozilla48
It might have something to do with ICE restart, but it is hard to say yet. My preliminary analysis of the source did not turn anything up. It may be some bug elsewhere has crept in, but I looked over the timer stuff and I didn't see anything scary.
Assignee | ||
Updated•8 years ago
|
backlog: --- → webrtc/webaudio+
Rank: 10
Assignee | ||
Comment 1•8 years ago
|
||
Ok, after looking at the timer stuff some more, there's a bunch of racy stuff going on, particularly in nsTimerImpl::Release. I'm going to put together a diagnostic patch for this bug to see if the problem is timers failing to cancel properly.
Assignee | ||
Comment 2•8 years ago
|
||
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → docfaraday
Status: NEW → ASSIGNED
Assignee | ||
Comment 3•8 years ago
|
||
Comment 4•8 years ago
|
||
Hm crash-stats shows a single crash in 39 and then lots of crashes from 48 on. Looks like something has changed in the timing in 48.
Looks like only ICE restart really changed anything significantly in 48. The only idea which comes to mind would be double checking if we properly set, reset and clear all timers in case of rollbacks and if ICE restarts fail.
Assignee | ||
Comment 5•8 years ago
|
||
Yeah, I've looked it over, and ICE restart made very few modifications to the nICEr code, which is where all the timers are handled. It could be that we're running into a similar problem as found in bug 1279146, where we have some dangling pointers to nICEr stuff that we try to use, but so far I have not discovered a code-path that gets from there to this crash. It may be that bug 1279146 is corrupting the heap in a way that causes this bug, but that's a long shot.
Assignee | ||
Updated•8 years ago
|
Attachment #8763626 -
Flags: review?(drno)
Assignee | ||
Comment 6•8 years ago
|
||
Comment on attachment 8763626 [details] [diff] [review]
Make a timer assert more aggressive
[Security approval request comment]
How easily could an exploit be constructed based on the patch?
Very hard I think.
Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
Not really.
Which older supported branches are affected by this flaw?
Unknown; this patch should detect whether canceling an nsTimerImpl is not working in rare cases. If this ends up being the case, it is probably a long-standing bug.
How likely is this patch to cause regressions; how much testing does it need?
At most it will cause some null pointer crashes in rare cases.
Attachment #8763626 -
Flags: sec-approval?
Updated•8 years ago
|
Keywords: csectype-uaf
Comment 7•8 years ago
|
||
Comment on attachment 8763626 [details] [diff] [review]
Make a timer assert more aggressive
Review of attachment 8763626 [details] [diff] [review]:
-----------------------------------------------------------------
LGTM
Attachment #8763626 -
Flags: review?(drno) → review+
Updated•8 years ago
|
Group: core-security → media-core-security
Comment 8•8 years ago
|
||
Comment on attachment 8763626 [details] [diff] [review]
Make a timer assert more aggressive
sec-approval=dveditz
Attachment #8763626 -
Flags: sec-approval? → sec-approval+
Assignee | ||
Updated•8 years ago
|
Keywords: checkin-needed
Assignee | ||
Comment 9•8 years ago
|
||
Comment on attachment 8763626 [details] [diff] [review]
Make a timer assert more aggressive
Approval Request Comment
[Feature/regressing bug #]:
If the crash is being caused by flaws in the timer implementation, this problem has likely existed since webrtc landed.
[User impact if declined]:
This could convert a rare UAF crash into a slightly less rare null pointer crash.
[Describe test coverage new/current, TreeHerder]:
Given how rare this crash is, it would probably be hard to get coverage.
[Risks and why]:
It is possible that we will get null pointer crashes more often, instead of the pretty rare UAF crashes we're seeing.
[String/UUID change made/needed]:
None.
Attachment #8763626 -
Flags: approval-mozilla-beta?
Attachment #8763626 -
Flags: approval-mozilla-aurora?
Comment 10•8 years ago
|
||
Keywords: checkin-needed
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox50:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
Updated•8 years ago
|
status-firefox48:
--- → affected
status-firefox49:
--- → affected
Comment 12•8 years ago
|
||
Byron, it seems that esr45 is affected, right?
status-firefox-esr45:
--- → ?
Flags: needinfo?(docfaraday)
Comment 13•8 years ago
|
||
Comment on attachment 8763626 [details] [diff] [review]
Make a timer assert more aggressive
Taking it as it fixes a sec-high.
Should be in 48 beta 4.
Attachment #8763626 -
Flags: approval-mozilla-beta?
Attachment #8763626 -
Flags: approval-mozilla-beta+
Attachment #8763626 -
Flags: approval-mozilla-aurora?
Attachment #8763626 -
Flags: approval-mozilla-aurora+
Comment 14•8 years ago
|
||
Comment 15•8 years ago
|
||
Assignee | ||
Comment 16•8 years ago
|
||
I have not seen an esr 45 crash, but it is possible it is affected. Also, I should have marked this leave-open, since the patch might not actually fix the root cause. We have to wait and see.
Status: RESOLVED → REOPENED
Flags: needinfo?(docfaraday)
Resolution: FIXED → ---
Comment 17•8 years ago
|
||
I maybe see one crash since this landed on 48, and none on 49, for what that is worth.
Updated•8 years ago
|
Whiteboard: [adv-main48+]
Updated•8 years ago
|
Assignee | ||
Comment 18•8 years ago
|
||
.
Assignee | ||
Updated•8 years ago
|
Attachment #8763626 -
Attachment is obsolete: true
Assignee | ||
Updated•8 years ago
|
Attachment #8779833 -
Flags: review?(drno)
Assignee | ||
Comment 20•8 years ago
|
||
Comment 21•8 years ago
|
||
I've run the patch from Try for 40min on a machine where it insta-crashed, no problems. Also on Linux for me it would i-loop due to the same bug, also running for 40 min now.
Source of the bug was bug 1258753
Blocks: 1258753
status-firefox51:
--- → affected
Comment 22•8 years ago
|
||
Sec-critical since the dup shows you can insta-crash it with a call through a UAF vtbl
Rank: 10 → 5
Keywords: sec-moderate → sec-critical
Comment 23•8 years ago
|
||
Comment on attachment 8779833 [details] [diff] [review]
Copy this the hard way
Review of attachment 8779833 [details] [diff] [review]:
-----------------------------------------------------------------
LGTM
Attachment #8779833 -
Flags: review?(drno) → review+
Assignee | ||
Comment 24•8 years ago
|
||
Comment on attachment 8779833 [details] [diff] [review]
Copy this the hard way
Approval Request Comment
[Feature/regressing bug #]:
Bug 1258753
[User impact if declined]:
We're using a function pointer in freed memory using a fairly simple test-case, which means an arbitrary code execution exploit could be mounted.
[Describe test coverage new/current, TreeHerder]:
Getting the timing down has been hard; we've only been able to repro consistently on windows 10.
[Risks and why]:
Pretty low. It is a pretty simple change.
[String/UUID change made/needed]:
None.
[Security approval request comment]
How easily could an exploit be constructed based on the patch?
Based on the patch, not so easily, because it is hard to work out where the sec problem comes in.
Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
Not really. It's pretty subtle.
Which older supported branches are affected by this flaw?
48 and up.
If not all supported branches, which bug introduced the flaw?
Bug 1258753
Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
I bet this will apply cleanly, but if it does not it will be easy to make backports.
How likely is this patch to cause regressions; how much testing does it need?
Pretty unlikely.
Attachment #8779833 -
Flags: sec-approval?
Attachment #8779833 -
Flags: approval-mozilla-release?
Attachment #8779833 -
Flags: approval-mozilla-beta?
Attachment #8779833 -
Flags: approval-mozilla-aurora?
Comment 25•8 years ago
|
||
I'm confused by the re-opening of this bug. We fixed it in 48 and noted it in the roll up advisory. Did the fix not work correctly?
Flags: needinfo?(docfaraday)
Comment 26•8 years ago
|
||
We landed a patch which was described as "This could convert a rare UAF crash into a slightly less rare null pointer crash." Also, byron meant this to be leave-open I think (thus his reopening it right after it was closed: "Also, I should have marked this leave-open, since the patch might not actually fix the root cause. We have to wait and see.")
It did quiet down the crashes quite a bit, but not quite 100%.
Now that we have a reproducible testcase, we were able to track the root cause down. (Ironically the crash didn't exactly lead us to the source of the problem, it was the iloop on Linux that pointed directly at the cause.)
Comment 27•8 years ago
|
||
I'm going also second byron's evaluation: "We're using a function pointer in freed memory using a fairly simple test-case, which means an arbitrary code execution exploit could be mounted." Since it's now known to be easy to force the UAF, and it's a function-pointer UAF, I expect a practical attack could be constructed.
Assignee | ||
Updated•8 years ago
|
Flags: needinfo?(docfaraday)
Comment 28•8 years ago
|
||
I can take it in the 48.0.1 release but I need a sec approval today.
Comment 29•8 years ago
|
||
(In reply to Sylvestre Ledru [:sylvestre] from comment #28)
> I can take it in the 48.0.1 release but I need a sec approval today.
Al -- Just want to make sure this is at the top of your queue for this morning. Thanks.
Flags: needinfo?(abillings)
Comment 30•8 years ago
|
||
Comment on attachment 8779833 [details] [diff] [review]
Copy this the hard way
sec-approval for this then!
Flags: needinfo?(abillings)
Attachment #8779833 -
Flags: sec-approval? → sec-approval+
Assignee | ||
Comment 31•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/eb4a4b7bd0497ba8a4868ccbcd04036e63145a26
Bug 1280443: Copy this the hard way. r=drno
Comment 32•8 years ago
|
||
Comment 33•8 years ago
|
||
Comment on attachment 8779833 [details] [diff] [review]
Copy this the hard way
Let's take it in the 48.0.1 dot release
Attachment #8779833 -
Flags: approval-mozilla-release?
Attachment #8779833 -
Flags: approval-mozilla-release+
Attachment #8779833 -
Flags: approval-mozilla-beta?
Attachment #8779833 -
Flags: approval-mozilla-beta+
Attachment #8779833 -
Flags: approval-mozilla-aurora?
Attachment #8779833 -
Flags: approval-mozilla-aurora+
Status: REOPENED → RESOLVED
Closed: 8 years ago → 8 years ago
Resolution: --- → FIXED
Updated•8 years ago
|
Group: media-core-security → core-security-release
Comment 35•8 years ago
|
||
Added in the 48.0.1 release notes: "Fix a crash in WebRTC".
relnote-firefox:
--- → 48+
Updated•8 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•