Closed Bug 1197183 Opened 9 years ago Closed 9 years ago

crash in js::TenuringTracer::moveToTenured(JSObject*)

Categories

(Core :: JavaScript: GC, defect, P3)

ARM
Gonk (Firefox OS)
defect

Tracking

RESOLVED WORKSFORME
mozilla43
blocking-b2g 2.5?
Tracking Status
firefox43 --- fixed
b2g-master --- fixed

People

(Reporter: martijn.martijn, Assigned: terrence)

References

Details

(Keywords: crash, qablocker, Whiteboard: [platform])

Crash Data

Attachments

(1 file)

I got this crash on the Flame, using 319MB memory:
Build ID 20150821030209
Gaia Revision c6705f739fb605031eb2a0b943ba55c64bee5a03
Gaia Date 2015-08-20 14:36:40
Gecko Revision https://hg.mozilla.org/mozilla-central/rev/095988abdc560bf8ba07a94a425c6922a3e9bfd6
Gecko Version 43.0a1
Device Name flame
Firmware(Release) 4.4.2
Firmware(Incremental) eng.cltbld.20150727.063909
Firmware Date Mon Jul 27 06:39:20 EDT 2015
Bootloader L1TC000118D0

This happened while running the test_delete_contact.py test: http://mxr.mozilla.org/gaia/source/tests/python/gaia-ui-tests/gaiatest/tests/functional/contacts/test_delete_contact.py
(this command: adb forward tcp:2828 tcp:2828 && DEVICE_DEBUG=1 && NO_LOCK_SCREEN=1 && gaiatest --type=b2g+smoketest-dsds --address=localhost:2828 --testvars /Users/mwargers/B2G/testvars_home.json --gecko-log=/Users/mwargers/B2G/gecko.log --restart --log-mach=bug1099985.log --log-mach-level=debug tests/python/gaia-ui-tests/gaiatest/tests/functional/contacts/test_delete_contact.py )

I've now run the test again and this time it didn't cause a crash, so I guess it's intermittent.

This bug was filed from the Socorro interface and is report bp-8efb9818-35be-4d28-9c2d-0773b2150821.
=============================================================
0 libxul.so js::TenuringTracer::moveToTenured(JSObject*) js/src/gc/Heap.h
1 libxul.so js::gc::StoreBuffer::MonoTypeBuffer<js::gc::StoreBuffer::CellPtrEdge>::trace(js::gc::StoreBuffer*, js::TenuringTracer&) js/src/gc/Marking.cpp
2 libxul.so js::Nursery::collect(JSRuntime*, JS::gcreason::Reason, js::Vector<js::ObjectGroup*, 0u, js::SystemAllocPolicy>*) js/src/gc/StoreBuffer.h
3 libxul.so js::gc::GCRuntime::minorGCImpl(JS::gcreason::Reason, js::Vector<js::ObjectGroup*, 0u, js::SystemAllocPolicy>*) js/src/jsgc.cpp
4 libxul.so js::gc::GCRuntime::evictNursery(JS::gcreason::Reason) js/src/gc/GCRuntime.h
5 libxul.so js::gc::GCRuntime::gcCycle(bool, js::SliceBudget&, JS::gcreason::Reason) js/src/jsgc.cpp
6 libxul.so js::gc::GCRuntime::collect(bool, js::SliceBudget, JS::gcreason::Reason) js/src/jsgc.cpp
7 libxul.so js::gc::GCRuntime::gcSlice(JS::gcreason::Reason, long long) js/src/jsgc.cpp
8 libxul.so js::gc::GCRuntime::gcIfRequested(JSContext*) js/src/jsgc.cpp
9 libxul.so js::gc::GCRuntime::gcIfNeededPerAllocation(JSContext*) js/src/gc/Allocator.cpp
10 libxul.so js::Shape* js::Allocate<js::Shape, (js::AllowGC)1>(js::ExclusiveContext*) js/src/gc/Allocator.cpp
OK, I just saw it again, with this same stack trace, while running test_import_edit_export_contact.py multiple times. On Jenkins we also see quite a few crash report pages showing up in various tests. I'm pretty sure this is going to be a topcrash for the Flame device soon.
[Blocking Requested - why for this release]: I'm pretty sure this will become a topcrash.
blocking-b2g: --- → 2.5?
I guess this depends on a fix for bug 1196210.
Depends on: 1196210
Adding the qablocker keyword so we can track this on the daily bulletin page. With the latest build on Flame, automation fails to connect every time due to this bug.
Keywords: qablocker
I've stopped using the latest builds; they are causing too many failures because of this bug.
This prevents the smoketests from being executed properly.
blocking-b2g: 2.5? → 2.5+
Keywords: smoketest
Component: JavaScript Engine → JavaScript: GC
Should be fixed now by bug 1196210.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
The fix for bug 1196210 went in, but I don't think it has been picked up by the Flame builds yet. The latest build has this Gecko revision: https://hg.mozilla.org/mozilla-central/rev/f61c3cc0eb8b7533818e7379ccc063b611015d9d, which seems to be from just before the fix for bug 1196210.
Assignee: nobody → terrence
Target Milestone: --- → mozilla43
Looks like this wasn't fixed by bug 1196210. https://crash-stats.mozilla.com/report/index/13a5bdca-4a99-4978-b6ef-e72712150828
Build ID 20150827030202
Gaia Revision 14f32ddf49e9c1f2b30c391a26ba2dc867e948c1
Gaia Date 2015-08-26 23:28:10
Gecko Revision https://hg.mozilla.org/mozilla-central/rev/f8086bd3c84fc1a42c3625cf3cc2253f0a5e8cfd
Gecko Version 43.0a1
Device Name flame
Firmware(Release) 4.4.2
Firmware(Incremental) eng.cltbld.20150727.063909
Firmware Date Mon Jul 27 06:39:20 EDT 2015
Bootloader L1TC000118D0

I got this by running test_brick_verification.py on the Flame and repeating it 10 times. On the 10th repeat, I got this crash. Terrence, could you perhaps look at this?
Status: RESOLVED → REOPENED
Flags: needinfo?(terrence)
Resolution: FIXED → ---
Yeah, this is not a crash that bug 1196210 could have conceivably caused. Nothing much has landed in the GC in this timeframe, so the root cause is likely new heap corruption elsewhere that the GC is tripping over. I haven't touched my flame in almost a year, so I'll need to dust it off and update my toolchain. Even then, debugging these sorts of crashes is generally extremely difficult without rr. It's quite likely that bisection on inbound would be quicker and less painful, so I'd like to exhaust that first.
Yes, well, this is not happening a lot anymore, currently. Only occasionally. I got the impression it happened more often when I reported this. At this point I would say it's not a qablocker anymore. It's only interfering in a couple of tests (randomly), while doing a test run.
Still worth investigating, as there's clearly a bug somewhere and we can reproduce it fairly reliably.
This is happening more frequently on today's nightly build. It happens right after the restart in my case.
Build ID 20150901030202
Gaia Revision c80e8ff25425b007181fd6e3de0500a0358fab37
Gaia Date 2015-08-31 16:35:09
Gecko Revision https://hg.mozilla.org/mozilla-central/rev/cafb1c90f794a73100a8f0afb9fe3301df0f2bde
Gecko Version 43.0a1
Device Name flame
Firmware(Release) 4.4.2
Firmware(Incremental) eng.cltbld.20150901.083711
Firmware Date Tue Sep 1 08:37:24 EDT 2015
Bootloader L1TC000118D0

What's interesting is that now about 30+ crash reports are submitted at the same time, all with this error:
-rw-r----- root root 49 2015-09-01 11:37 bp-02a38fdf-893b-41f0-b399-1ad0e2150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-099edbfd-aa86-4a00-9392-3fdd82150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-1720fce3-1cdf-419e-bf88-540452150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-1bc6eb9b-41c0-4b8d-9336-4a45a2150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-2426ed37-e500-485b-ab5b-8d6502150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-2af3d8cd-42b5-47cf-a2e5-cf1912150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-35dec1cb-527d-4621-9dab-9486d2150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-4d67b990-9f91-44a7-b26b-5ac0d2150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-53585d4f-19a9-4895-8f0b-80d532150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-57024604-bdd2-48ca-a85c-783422150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-59ffd933-405b-4c03-98fa-5c9182150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-618d8864-a5d9-4cbc-b3b3-9c3bf2150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-74d095e9-0478-4bd8-9066-fcd642150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-79da94b3-ada4-4313-9220-9efd02150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-7f25b27a-12ec-4508-97e4-0bd552150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-82dd8cfb-02ff-4545-afcf-6f58f2150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-83606958-a1ca-4513-a441-3c2822150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-8f9f1fb2-6894-48f7-9968-5ac2c2150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-977bb201-93ba-4cd8-a450-7152a2150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-ada67033-fb41-462f-a068-2f4cf2150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-b5993371-58ff-4854-bb89-227ce2150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-bbdc2cc7-dd61-4676-af54-9ed662150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-be55a5ef-f1a7-4f0b-9fc2-149b02150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-c5478595-c060-4e51-b015-d17d12150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-cee50fcc-77d7-4453-a62e-2b4712150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-d1cc984a-0fba-4df2-aae4-34b342150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-d3c79147-bdd8-4f5b-95ca-b18222150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-d61f1c50-a77f-4f81-9083-6694a2150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-eab87639-898d-4b00-a1a8-9c25d2150901.txt
-rw-r----- root root 49 2015-09-01 11:37 bp-f55396b5-ff8c-4593-9d83-11cc82150901.txt
I hit this as well today, after a shallow flash on Flame and launching the Settings app.
OS: Android → Gonk (Firefox OS)
Priority: -- → P1
Hardware: Unspecified → ARM
QA Whiteboard: [QAnalyst-Triage+]
Keywords: smoketest
I got this today while streaming a video clip in private mode on Aries 2.5. https://crash-stats.mozilla.com/report/index/b949b876-27a1-4b34-a478-4b9c92150924
Device: Aries Master
Build ID: 20150924111215
Gaia: 4bb17b24620818cbda0ba0c0d69e0ce3f914e1b7
Gecko: 001942e4617b2324bfa6cdfb1155581cbc3f0cc4
Gonk: 2916e2368074b5383c80bf5a0fba3fc83ba310bd
Version: 44.0a1 (Master)
Firmware Version: D5803_23.1.A.1.28_NCB.ftf
User Agent: Mozilla/5.0 (Mobile; rv:44.0) Gecko/44.0 Firefox/44.0
Attached file logs.txt (deleted) —
Crash Signature: [@ js::TenuringTracer::moveToTenured(JSObject*)] → [@ js::TenuringTracer::moveToTenured(JSObject*)] [@ js::TenuringTracer::moveToTenured]
This bug hasn't had any movement for a couple of weeks now. Can we please get an updated status? This is a P1 blocker for our 2.5 release, which is less than 3 weeks from now. Thanks!
Crash Signature: [@ js::TenuringTracer::moveToTenured(JSObject*)] [@ js::TenuringTracer::moveToTenured] → [@ js::TenuringTracer::moveToTenured(JSObject*)]
Crash Signature: [@ js::TenuringTracer::moveToTenured(JSObject*)] → [@ js::TenuringTracer::moveToTenured(JSObject*)] [@ js::TenuringTracer::moveToTenured]
Blocks: TV_Gecko_P2
This is just an OOM. There's nothing we can do here.
Flags: needinfo?(terrence)
So this bug should be WONTFIX then?
Flags: needinfo?(terrence)
No, not WONTFIX as there's a real bug somewhere here -- we just need STR. I think it only got so highly prioritized because it started at the same time as another more frequent bug and we thought it was more frequent than it actually was.
Flags: needinfo?(terrence)
FWIW: I looked at https://crash-stats.mozilla.com/report/index/8f9f1fb2-6894-48f7-9968-5ac2c2150901

The crash address 0x3c looks to me like a null zone pointer. On my 64-bit system:
(gdb) p &((JS::Zone*)0)->arenas.freeLists.mArray.mArr[0]
$5 = (js::gc::FreeList *) 0x78

On 32-bit ARM, that would be half as much, and 0x78 / 2 == 0x3c. So this is consistent with src->zone() == nullptr, thingKind = 0 (which is an AllocKind of FUNCTION). I'm not sure this helps track it down any, though.
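The arithmetic in the comment above can be illustrated with a small sketch. It is not the real JS::Zone layout: `FakeZone`, its field names, and the count of 15 leading words are hypothetical, chosen only so that the 64-bit offset matches the 0x78 that gdb printed. It also bakes in the comment's implicit assumption that every field ahead of the free lists is pointer-sized, which is what makes the offset halve on a 32-bit target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a zone-like structure; NOT the real JS::Zone.
// Word models the pointer width: std::uint64_t for a 64-bit build,
// std::uint32_t for 32-bit ARM.
template <typename Word>
struct FakeZone {
    Word leadingFields[15];  // assumption: 15 pointer-sized fields come first
    Word freeLists[4];       // freeLists[0] stands in for the thingKind == 0 list
};

// Address touched when freeLists[0] is accessed through a null FakeZone
// pointer: the base address (0) plus the member's offset in the struct.
template <typename Word>
constexpr std::size_t faultAddressForNullBase() {
    return 0 + offsetof(FakeZone<Word>, freeLists);
}

// 15 words * 8 bytes = 0x78 on 64-bit; 15 words * 4 bytes = 0x3c on 32-bit.
static_assert(faultAddressForNullBase<std::uint64_t>() == 0x78,
              "64-bit offset matches the value gdb printed");
static_assert(faultAddressForNullBase<std::uint32_t>() == 0x3c,
              "32-bit offset matches the reported crash address");
```

Under these assumptions, a small non-zero crash address such as 0x3c reads as "null base pointer plus field offset" rather than a random wild pointer, which is the inference the comment draws. If the real structure contains fields that are not pointer-sized, the simple divide-by-two relationship would only hold approximately.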
Whiteboard: [platform]
Per the crash reports, some crashes with the same signature were reported on Oct 14 [1][2][3]. I agree with Terrence in comment 21 that we need STR for troubleshooting. If this is hard to reproduce, we could manage it for the v2.5 release. Martijn, could you try to provide STR? Thanks.
[1] https://crash-stats.mozilla.com/report/index/ad187378-f1dd-45ae-90bd-3bc0e2151014
[2] https://crash-stats.mozilla.com/report/index/77dc57ed-a2af-4f99-8a09-618ef2151014
[3] https://crash-stats.mozilla.com/report/index/b4befbcd-89ca-44da-b93d-010bf2151014
Flags: needinfo?(martijn.martijn)
Keywords: qawanted
No, currently, I'm not seeing this crash anymore in current builds.
Flags: needinfo?(martijn.martijn)
Bobby, those crashes all use builds from quite some time ago. The most recent build ID is from 9-20-2015 with the others being even earlier. I went through the 10 most recent crashes and none were from builds after 9-20. Removing the qawanted for now. If this occurs again with builds more recent than that please add the steps-wanted tag.
Flags: needinfo?(ktucker)
Flags: needinfo?(bchien)
Keywords: qawanted
Thanks, Jayme. I marked this bug as P3 so we can keep monitoring it for 2.5.
Flags: needinfo?(bchien)
Priority: P1 → P3
Flags: needinfo?(ktucker)
Via email, Terrence told me: "we need STR before we can do anything more than speculate about the proximate cause -- itself not likely to point to the actual, actionable root cause. That said, later comments appear to indicate that the problem has not recurred since late September. Are we even still seeing this?" Comment 6 seems to indicate this isn't happening anymore (or at least for the past month).
[Blocking Requested - why for this release]: QAnalysts, Can you confirm if this is still happening? Thanks
blocking-b2g: 2.5+ → 2.5?
Keywords: qawanted
I tried to reproduce this issue today with no luck on the latest Aries build. The crash hasn't occurred on a build more recent than September 21st so I believe that it no longer occurs.

Environmental Variables:
Device: Aries 2.5
BuildID: 20151026111709
Gaia: a677ddd3aa3a81058775938bd56008d96dbc78b0
Gecko: 5ca03a00d26823ce91ee0eaa2937bed605bd53c1
Gonk: 2916e2368074b5383c80bf5a0fba3fc83ba310bd
Version: 44.0a1 (2.5)
Firmware Version: D5803_23.1.A.1.28_NCB.ftf
User Agent: Mozilla/5.0 (Mobile; rv:44.0) Gecko/44.0 Firefox/44.0
Flags: needinfo?(mpotharaju)
Flags: needinfo?(ktucker)
Thanks for confirming this Jayme.
Flags: needinfo?(mpotharaju)
Bobby/Martijn, it's not being reproduced. Can you close this as WFM? Please re-open if you encounter this issue again. Thanks
Status: REOPENED → UNCONFIRMED
Ever confirmed: false
Flags: needinfo?(bchien)
Per comment 29, resolved as WORKSFORME.
Status: UNCONFIRMED → RESOLVED
Closed: 9 years ago
Flags: needinfo?(bchien)
Resolution: --- → WORKSFORME
Flags: needinfo?(ktucker)