Closed
Bug 1421963
Opened 7 years ago
Closed 7 years ago
Intermittent GECKO(3202) | ==3255==ERROR: AddressSanitizer: heap-use-after-free on address 0x61d000a582b4 at pc 0x7f1472493c22 bp 0x7f14688f2d30 sp 0x7f14688f2d28
Categories
(Core :: WebRTC: Audio/Video, defect, P1)
Core
WebRTC: Audio/Video
Tracking
()
RESOLVED
FIXED
mozilla59
People
(Reporter: intermittent-bug-filer, Assigned: jesup)
References
(Blocks 1 open bug)
Details
(Keywords: csectype-uaf, intermittent-failure, sec-high, Whiteboard: [post-critsmash-triage][adv-main59+])
Attachments
(1 file)
(deleted),
patch
|
drno
:
review+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
Comment 1•7 years ago
|
||
Nico, is this a dupe of the other one you're looking at?
Group: media-core-security
Component: Build Config → WebRTC: Audio/Video
Flags: needinfo?(na-g)
Comment 2•7 years ago
|
||
Ryan, this doesn't look related to me, this appears to be DataChannels.
Flags: needinfo?(na-g)
Comment 3•7 years ago
|
||
Should be fixed by backout of bug 1297418. Please reopen if you think otherwise.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE
Assignee | ||
Comment 4•7 years ago
|
||
Reopening to track UAF issues in the new library, which may also be relevant to the current impl and cause the existing security crashes.
(copied from an email):
I *finally* reproed it with ASAN + rr, so I could look into why it happens.
The memory was freed due to setsockopt() called from SendOutgoingStreamReset() in DataChannels.cpp. This is freed on "MainThread" aka thread 0.
The crash happens when it's accessed on thread 6 ("Socket Thread") in response to a packet input.
We appear to be getting a stream reset while we're in the middle of closing it from our end.
The ordering is apparently (per rr):
t0:
DataChannel::Close() {
MutexAutoLock lock(mLock)
....
setsockopt()
...
t6:
<packet input>
usrsctp_conninput() {
...
sctp_reset_out_streams() {
...
sctp_ulp_notify(SCTP_NOTIFY_STR_RESET_SEND,...) {
...
sctp_notify_stream_reset() {
...
sctp_add_to_readq() {
...
sctp_invoke_recv_callback() {
...
inp->recv_callback() {
...
mozilla::receive_cb() {
...
DataChannelConnection::ReceiveCallback() {
MutexAutoLock lock(mLock);
<wait on lock>
t0:
<finish freeing buffers>
<release lock in Close()>
....
t6:
<wake up with lock>
HandleStreamResetEvent()
finish sctp_reset_out_streams();
call sctp_reset_clear_pending() {
crash
Now to figure out what's the right thing to do here, and if this is in some what a hole in the library or in how Close() is implemented... Note that the sctp library doesn't lock access while it's invoking the recv callback or running reset_out_streams(); the setsockopt() does lock in sctp.
The lock in the ReceiveCallback just makes this more likely/repeatable; if you wrote code that called setsockopt() on one thread, and it interleaved with conninput() and you simply task-switched at the right points, you'd have the same problem. Grabbing the global lock before calling conninput instead of in ReceiveCallback() would be another option, though I worry about possible deadlocks.
Another solution would be to force Close() to proxy the setsockopt() to the Socket thread, though that begs the question of what else needs to be proxied.
Is it supposed to be allowed to call conninput() on a different thread than other calls like setsockopt()? (or is it allowed with external locking?) In general I thought the API was supposed to be threadsafe.
Phew... that wasn't easy (and solutions aren't obvious either)
Further notes: it appears moving the lock to SctpDtlsInput() (which calls usrsctp_conninput()) works (one minor mod to assert lock held in the threshold code instead of grabbing it). This might be worth doing in beta 58 if we believe that's safe deadlock-wise. it's tough to believe it wouldn't be safe in general; there are only a couple of callbacks into our code, and this acts as a giant lock around the library input code.
Status: RESOLVED → REOPENED
Flags: needinfo?(tuexen)
Flags: needinfo?(lennart.grahl)
Resolution: DUPLICATE → ---
Assignee | ||
Comment 5•7 years ago
|
||
sec-high since this may affect trunk
status-firefox57:
--- → ?
status-firefox58:
--- → ?
status-firefox59:
--- → disabled
status-firefox-esr52:
--- → ?
Keywords: csectype-uaf,
sec-high
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → rjesup
Assignee | ||
Updated•7 years ago
|
Rank: 7
Priority: P5 → P1
Assignee | ||
Comment 6•7 years ago
|
||
Lock around the entire input processing of the sctp library, since there appears to be incomplete locking within the library. Assert we own the lock on the callbacks from the library instead of taking the lock. Ran 1000 iterations of the failing test with no problems; before it would fail in 5-50 iterations. This may also solve bug 1400563 and related issues if applied on the old library, since that seems related to resets also.
Attachment #8934387 -
Flags: review?(tuexen)
Attachment #8934387 -
Flags: review?(drno)
Comment 7•7 years ago
|
||
Comment on attachment 8934387 [details] [diff] [review]
lock around SCTP input processing, not just the receive callback
Review of attachment 8934387 [details] [diff] [review]:
-----------------------------------------------------------------
In general the patch looks good to me.
I am worried though that our mochitests for data channels don't cover much of what (else) could go wrong. Dispatching from MainThread to STS seems to the safer option to me, but I understand it's probably harder to get right.
Michael should have the last verdict here I think.
Attachment #8934387 -
Flags: review?(drno) → review+
Updated•7 years ago
|
Flags: needinfo?(lennart.grahl)
Assignee | ||
Updated•7 years ago
|
Comment 8•7 years ago
|
||
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
Assignee | ||
Comment 10•7 years ago
|
||
beta, and 52esr - though I'm tempted to wait for confirmation in beta before uplifting to ESR. Signal is strong enough in beta to know.
Flags: needinfo?(rjesup)
Assignee | ||
Comment 11•7 years ago
|
||
Comment on attachment 8934387 [details] [diff] [review]
lock around SCTP input processing, not just the receive callback
Approval Request Comment
[Feature/Bug causing the regression]: NA
[User impact if declined]: Continued UAFs
[Is this code covered by automated tests?]: yes, but they don't hit it. requires simultaneous closes from both ends, plus luck in timing.
[Has the fix been verified in Nightly?]: no; will watch for crashes after merge date. volume is low but not 0 in nightly. easily verified in beta via crash-stats
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: Not very
[Why is the change risky/not risky?]: risk is only of deadlock. analysis indicates it shouldn't deadlock, and in tests it hasn't.
[String changes made/needed]: none
Attachment #8934387 -
Flags: review?(tuexen) → approval-mozilla-beta?
Updated•7 years ago
|
Tracking this and bug 1400563, sounds like this might fix both bugs.
Assignee | ||
Comment 14•7 years ago
|
||
(In reply to Julien Cristau [:jcristau] from comment #12)
> This still needs sec-approval, right?
This report was for a new checkin that was then backed out; we landed this fix with the new code. we're supposing that it will also fix a series of existing UAF crashes.
For completeness, even though this code is already in Nightly as part of the major SCTP update, I'll ask for s-a
Flags: needinfo?(rjesup)
Assignee | ||
Comment 15•7 years ago
|
||
Comment on attachment 8934387 [details] [diff] [review]
lock around SCTP input processing, not just the receive callback
[Security approval request comment]
How easily could an exploit be constructed based on the patch? Very hard
Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No. Clearly we move where the lock is grabbed, which gives a very indirect idea that there's a locking issue somewhere, but not what it might be or cause.
Which older supported branches are affected by this flaw? all; appears to be a bug in the imported library
If not all supported branches, which bug introduced the flaw?
Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? Trivial; should apply cleanly
How likely is this patch to cause regressions; how much testing does it need? Very unlikely to cause regressions, none reported yet (this is on nightly as part of a bigger sctp update). Only likely regression would be deadlocks
Attachment #8934387 -
Flags: sec-approval?
Assignee | ||
Comment 16•7 years ago
|
||
Asking for sec-approval and beta, but please don't land on beta until I do more checking of crash-stats to see if we see an improvement. The crashes in the field may be related to this, but not the same bug. Creashes in the field are (mostly) due to timeouts, while this bug was directly related to crossing internal-close and external-close-packet. This may point to the problem when a timeout occurs, however.
We may need a different patch to solve the crash-stats failures, or this patch plus another.
Comment 17•7 years ago
|
||
Comment on attachment 8934387 [details] [diff] [review]
lock around SCTP input processing, not just the receive callback
Sec-approval+ and beta approval.
Attachment #8934387 -
Flags: sec-approval?
Attachment #8934387 -
Flags: sec-approval+
Attachment #8934387 -
Flags: approval-mozilla-beta?
Attachment #8934387 -
Flags: approval-mozilla-beta+
Updated•7 years ago
|
Assignee | ||
Comment 18•7 years ago
|
||
Thanks - reminder we don't want to actually uplift this until I do more checks that this problem exists in the older sctp library.
Updated•7 years ago
|
Group: media-core-security → core-security-release
Assignee | ||
Comment 20•7 years ago
|
||
No noted regressions in 59; it doesn't fix the timeout in beta, but it should block the same issue if it exists in 58 (58 doesn't have the updated SCTP library, and sctp_add_to_readq() has been significantly rewritten. It *appears* that the same sort of issue could occur, but it's hard to be 100% sure. Given a lack of in-the-field crashes, I don't think we should risk it with the older SCTP library.
The other bug is worth trying to fix in 58 if we can.
Flags: needinfo?(rjesup)
Comment 21•7 years ago
|
||
I assume that means ESR52 is also wontfix.
Assignee | ||
Updated•7 years ago
|
tracking-firefox58:
+ → ---
Comment 22•7 years ago
|
||
Comment on attachment 8934387 [details] [diff] [review]
lock around SCTP input processing, not just the receive callback
this wasn't uplifted to 58 after all, clearing flag.
Attachment #8934387 -
Flags: approval-mozilla-beta+
Updated•7 years ago
|
Updated•7 years ago
|
Flags: needinfo?(tuexen)
Updated•7 years ago
|
Updated•7 years ago
|
Flags: qe-verify-
Whiteboard: [post-critsmash-triage]
Updated•7 years ago
|
Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][adv-main59+]
Updated•6 years ago
|
Group: core-security-release
Updated•5 years ago
|
Blocks: asan-maintenance
You need to log in
before you can comment on or make changes to this bug.
Description
•