Closed Bug 1191492 Opened 9 years ago Closed 9 years ago

AddressSanitizer: heap-buffer-overflow during incremental GC

Categories

(Core :: JavaScript: GC, defect)

x86_64
Linux
defect
Not set
normal

Tracking


RESOLVED FIXED
mozilla42
Tracking Status
firefox40 --- unaffected
firefox41 --- unaffected
firefox42 --- fixed
firefox-esr38 --- unaffected
b2g-v2.0 --- unaffected
b2g-v2.0M --- unaffected
b2g-v2.1 --- unaffected
b2g-v2.1S --- unaffected
b2g-v2.2 --- unaffected
b2g-v2.2r --- unaffected
b2g-master --- fixed

People

(Reporter: bc, Assigned: bzbarsky)

References

(Blocks 1 open bug)

Details

(Keywords: csectype-bounds, regression, sec-high, Whiteboard: [asan][post-critsmash-triage][b2g-adv-main2.5-])

Attachments

(3 files)

Attached file cnn-nightly-asan-no-flash.log (deleted) —
1. Use Spider to scan cnn.com or load ~20 urls from cnn.com manually with Flash disabled or click-to-play (see Bug 1191489). ;-) See the attached log for the urls loaded in this scan. The actual crashing url will change depending on the phase of the moon, etc.

2. AddressSanitizer: heap-buffer-overflow on address 0x6170005da608 at pc 0x7fa71f769ebb bp 0x7ffcfea05450 sp 0x7ffcfea05448
WRITE of size 8 at 0x6170005da608 thread T0
    #0 0x7fa71f769eba in remove /builds/slave/m-cen-l64-asan-000000000000000/build/src/obj-firefox/js/src/../../dist/include/mozilla/LinkedList.h:208
    #1 0x7fa71f769eba in unboxedLayout /builds/slave/m-cen-l64-asan-000000000000000/build/src/js/src/vm/UnboxedObject.cpp:278
    #2 0x7fa71f769eba in js::ObjectGroup::sweep(js::AutoClearTypeInferenceStateOnOOM*) /builds/slave/m-cen-l64-asan-000000000000000/build/src/js/src/vm/TypeInference.cpp:4119
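For readers less familiar with the code: frame #0 is the doubly-linked-list unlink in mozilla/LinkedList.h, which writes through the node's neighbours' pointers. Below is a minimal, self-contained sketch of that pattern (illustrative only; simplified stand-in types, not the real MFBT or UnboxedLayout code) showing why unlinking a node whose neighbour has already been freed, e.g. earlier in the same GC sweep, turns into an invalid 8-byte pointer store of the kind ASan flags here.

#include <cassert>
#include <cstdio>

// Simplified stand-in for an intrusive doubly-linked list node, loosely
// modeled on mozilla::LinkedListElement. Illustrative only; not Gecko code.
struct Node {
  Node* prev = nullptr;
  Node* next = nullptr;

  // Unlink this node. Note that this writes *through* the neighbouring
  // nodes: if either neighbour's memory was already reclaimed (say, its
  // owner was swept and freed before this node was unlinked), these two
  // pointer stores become invalid 8-byte writes on a 64-bit build --
  // consistent with the "WRITE of size 8" at LinkedList.h:208 above.
  void remove() {
    prev->next = next;
    next->prev = prev;
    prev = next = nullptr;
  }
};

// Circular list with a sentinel so prev/next are never null for live nodes.
struct List {
  Node sentinel;
  List() { sentinel.prev = sentinel.next = &sentinel; }
  void insertBack(Node* n) {
    n->prev = sentinel.prev;
    n->next = &sentinel;
    sentinel.prev->next = n;
    sentinel.prev = n;
  }
};

int main() {
  List layouts;   // hypothetical name; think "list of unboxed layouts"
  Node a, b;
  layouts.insertBack(&a);
  layouts.insertBack(&b);

  // Safe here: both of a's neighbours (the sentinel and b) are still alive
  // when a.remove() runs. The hazard during a sweep is the ordering where a
  // neighbour is freed first, so remove() then stores into dead heap memory.
  a.remove();
  assert(layouts.sentinel.next == &b);
  puts("unlinked a; neighbours were updated in place");
  return 0;
}

The real question, of course, is why the layout list ends up containing an entry whose neighbour was already freed; the sketch only illustrates why ASan reports the failure as a write inside LinkedList.h rather than at the point where the neighbouring object died.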
FYI, this was on Nightly only, not on Beta or Aurora.
bc: is this a result from the new Bughunter+ASAN combination?
Flags: needinfo?(bob)
Not yet. It is the result of my experiments with running ASan builds and trying to figure out the steps required to add support to Bughunter. I figured I'd file bugs as I see them during my experimenting rather than wait. One thing I've learned for sure is that Bughunter needs to load multiple urls in order to find GC-related issues, but that will have to wait for the moment.
Flags: needinfo?(bob)
Attachment #8643914 - Attachment mime type: text/x-log → text/plain
Bug 1191465 is another report of similar-sounding GC sweeping crashes.
Flags: needinfo?(terrence)
The stack in comment 0 involves unboxed objects, but of course that could be incidental.
This stack is right in the middle of UnboxedObject, which might implicate fdf5862a8c00 as well. Flagging Brian to see if he can spot anything.
Flags: needinfo?(bhackett1024)
I'm marking this sec-high because whatever is happening seems very discoverable, and the GC is involved, so who knows what sort of badness is happening.
Keywords: sec-high
Bob, would it be possible for you to bisect this a little using inbound builds? Thanks.
Flags: needinfo?(bob)
Attached file new error with today's cnn (deleted) —
FYI, I've started bisecting. I decided to make sure I could still reproduce with the same build I used yesterday (2015-08-05) and found this new error, which is still GC-sweep related but different from what I found yesterday. I should have an inbound range soon.
(In reply to Bob Clary [:bc:] from comment #9)
> Created attachment 8644315 [details]
> new error with today's cnn
>
> FYI, I've started bisecting. I decided to make sure I could still reproduce
> with the same build I used yesterday (2015-08-05) and found this new error,
> which is still GC-sweep related but different from what I found yesterday.
> I should have an inbound range soon.

\o/ With a signature that moves around this much, is hard to repro locally, and has no obvious candidates in the regression range, bisection is basically the only way we're ever going to find the cause.
Not a ton, but it's as good a candidate as anything else in our short list and /much/ more likely than the majority of patches that don't touch spidermonkey at all. I vote that we back out and see if the crashes go away on tip.
Flags: needinfo?(terrence)
Bug 1181908 has been backed out from m-c and I'll be triggering new nightlies shortly. Will hold off on resolving this bug until we know for sure that it's fixed.
Ryan, is http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux64-asan/1438882244/ the correct build? I can still reproduce the asan error with it.
Flags: needinfo?(ryanvm)
Looks like it
Flags: needinfo?(ryanvm)
OK, under the assumption that a bisection using a 1-level deep scan of cnn.com is insufficient to deterministically find the crash, I'll start a bisection using a two-level scan. This will take some time, however.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d66d293b4498, where I backed out http://hg.mozilla.org/mozilla-central/rev/fdf5862a8c00, did not help; it still reproduced the error. I'll continue the bisection.
(In reply to Bob Clary [:bc:] from comment #17)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=d66d293b4498, where
> I backed out http://hg.mozilla.org/mozilla-central/rev/fdf5862a8c00, did not
> help; it still reproduced the error. I'll continue the bisection.

I think I tested the wrong build here. I redownloaded and retested with a 1-level scan and *did not* reproduce the error. I'll do several more, then a 2-level scan, to confirm, but it does look like reverting http://hg.mozilla.org/mozilla-central/rev/fdf5862a8c00 fixed the issue.
Running the 1-level scan 10 times did not reproduce this error, though it did find another unrelated error for HashKey. Reverting http://hg.mozilla.org/mozilla-central/rev/fdf5862a8c00 does fix this bug.
I backed that out in https://hg.mozilla.org/mozilla-central/rev/d6ea652c5799. I'll leave it to you to determine if that's enough to close this bug. :)
Flags: needinfo?(bob)
(In reply to Bob Clary [:bc:] from comment #19)
> Running the 1-level scan 10 times did not reproduce this error, though it
> did find another unrelated error for HashKey. Reverting
> http://hg.mozilla.org/mozilla-central/rev/fdf5862a8c00 does fix this bug.

Yeah, that makes a lot more sense given the stacks.
FWIW, I tried and failed to get a local ASan build so I could bisect locally, and am going back to bisecting inbound ASan builds from tinderbox in the hope that I can get the real regressor. It will take a while though, so don't wait up.
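As an aside, for anyone who wants to retry the local build: a Linux x86_64 ASan build is mostly a matter of a clang toolchain plus the address-sanitizer configure switch. A rough .mozconfig sketch follows; it is assembled from memory of the ASan build documentation of this era rather than from anything in this bug, so treat the exact option set as an assumption:

# sketch of a .mozconfig for a local Linux x86_64 ASan build (option set is an assumption)
export CC=clang
export CXX=clang++

ac_add_options --enable-address-sanitizer
ac_add_options --disable-jemalloc          # let ASan own the allocator
ac_add_options --disable-crashreporter
ac_add_options --disable-elf-hack
ac_add_options --enable-debug-symbols
ac_add_options --enable-optimize="-O1 -fno-omit-frame-pointer"

mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-asan

Then ./mach build and run obj-asan/dist/bin/firefox against the url list.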
I confirmed the regression range by scanning cnn.com 1 level deep 10 times against each build. The regression range is:

Good: http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux64-asan/1438624016/
Bad: http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux64-asan/1438624017/

https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=8ad982618f06&tochange=2f16fb18314a -> Bug 1181908, which mccr8 confirmed in bug 1191465. I don't know why I failed to confirm this with try builds.
Flags: needinfo?(bob)
Regressor has been identified, so I'm clearing the ni? on bhackett. Jesse and Decoder, we had a massive regression (this and all of the many crashes in bug 1191465) from bug 1181908, apparently caused by content JS. It would be good to figure out why none of the fuzzers seemed to have detected this. That other bug messes with JS compiler options, so maybe something more needs to be exposed to the fuzzers? Of course, it will be easier to figure out once somebody knows why that bug caused these crashes.
Blocks: 1181908
Flags: needinfo?(bhackett1024)
Flags: needinfo?(choller)
Flags: needinfo?(jruderman)
I retested by scanning cnn.com 1 level deep with today's mozilla-central ASan build and could not reproduce the error. Fixed by the backout of Bug 1181908. We could probably unhide this and dupe it to Bug 1181908 if you want.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
I have found this crash with fuzzing as well and I remember trying to get a proper test for it. The problem was that the test was very intermittent and failed most reduction attempts. I have 29 reports for this particular bug in FuzzManager but none of them reproduced properly for me. Maybe we could improve this somehow if we figure out why it's so hard to hit.
Flags: needinfo?(choller)
DOMFuzz is partially offline at the moment. Did jsfunfuzz hit this, and did it create any reduced testcases?
Flags: needinfo?(jruderman) → needinfo?(gary)
(In reply to Jesse Ruderman from comment #28)
> DOMFuzz is partially offline at the moment. Did jsfunfuzz hit this, and did
> it create any reduced testcases?

Nope, certainly not on a large enough volume to generate notice.
Flags: needinfo?(gary)
Can we get this TSan'd? If it's that intermittent, it might be a race.
Flags: needinfo?(nfroyd)
For what it is worth, bug 1191628 is a decoder-reported intermittent test case that is almost certainly another version of this issue. I'm not sure how hard it would be to run TSan on the shell.
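For the record, running TSan on the standalone shell is mostly a matter of a clang build with -fsanitize=thread. A rough sketch of how one might build it (the configure invocation is from memory of this era's js/src build system and is an assumption, not something tested for this bug):

cd js/src
autoconf2.13              # or autoconf-2.13; this era's build still wants autoconf 2.13
mkdir build-tsan && cd build-tsan
CC=clang CXX=clang++ \
CFLAGS="-fsanitize=thread -g" CXXFLAGS="-fsanitize=thread -g" LDFLAGS="-fsanitize=thread" \
  ../configure --enable-debug --enable-optimize
make -j8
# the shell ends up at dist/bin/js; run the intermittent test under it and
# let TSan report any races it sees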
(In reply to Julian Seward [:jseward] from comment #30)
> Can we get this TSan'd? If it's that intermittent, it might be a race.

It looks like we know what's going on here without TSan's involvement.
Flags: needinfo?(nfroyd)
Group: core-security → core-security-release
Whiteboard: [asan] → [asan][post-critsmash-triage]
Group: core-security-release
Whiteboard: [asan][post-critsmash-triage] → [asan][post-critsmash-triage][b2g-adv-main2.5-]