Closed
Bug 1276669
Opened 8 years ago
Closed 8 years ago
Intermittent Main app process exited normally | application crashed [@ NS_ShutdownAtomTable()] after Assertion failure: gUnusedAtomCount == 0, at xpcom/ds/nsAtomTable.cpp:546 (sometimes reported during crashtest 1113005.html)
Categories
(Core :: General, defect, P3)
Core
General
Tracking
()
RESOLVED
FIXED
mozilla54
People
(Reporter: cbook, Assigned: froydnj)
References
(Depends on 1 open bug, )
Details
(Keywords: assertion, crash, intermittent-failure, Whiteboard: [stockwell fixed])
Attachments
(11 files, 1 obsolete file)
(deleted),
patch
|
hsivonen
:
review+
ritu
:
approval-mozilla-aurora+
jcristau
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bzbarsky
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
hsivonen
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
benjamin
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
erahm
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bzbarsky
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
erahm
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
erahm
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
erahm
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
erahm
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
erahm
:
review+
|
Details | Diff | Splinter Review |
https://treeherder.mozilla.org/logviewer.html#?job_id=29017674&repo=mozilla-inbound#L84278
04:15:27 ERROR - REFTEST PROCESS-CRASH | Main app process exited normally | application crashed [@ XUL + 0x96620]
Crash dump filename: /var/folders/t0/q6j3ybpx29q21vrxg9z4zx0800000w/T/tmpv_8bdS.mozrunner/minidumps/5319246F-CA7C-40D5-8C22-ADF9C3EFB716.dmp
04:15:27 INFO - Operating system: Mac OS X
04:15:27 INFO - 10.10.5 14F27
04:15:27 INFO - CPU: amd64
04:15:27 INFO - family 6 model 69 stepping 1
04:15:27 INFO - 4 CPUs
04:15:27 INFO - Crash reason: EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
04:15:27 INFO - Crash address: 0x0
04:15:27 INFO - Process uptime: 440 seconds
04:15:27 INFO - Thread 0 (crashed)
04:15:27 INFO - 0 XUL + 0x96620
04:15:27 INFO - rax = 0x0000000000000000 rdx = 0x00007fff737f91f8
04:15:27 INFO - rcx = 0x0000000000000000 rbx = 0x00007fff737f9c50
04:15:27 INFO - rsi = 0x01936c0001936c00 rdi = 0x0000000105967734
04:15:27 INFO - rbp = 0x00007fff5f23c300 rsp = 0x00007fff5f23c2f0
04:15:27 INFO - r8 = 0x00007fff5f23c2a0 r9 = 0x00007fff72734300
04:15:27 INFO - r10 = 0x00000001009c2ce0 r11 = 0x00000001009c2ca0
04:15:27 INFO - r12 = 0x000000010767a040 r13 = 0x00007fff5f23c328
04:15:27 INFO - r14 = 0x0000000107193d88 r15 = 0x00007fff5f23c318
04:15:27 INFO - rip = 0x0000000100e3b620
04:15:27 INFO - Found by: given as instruction pointer in context
04:15:27 INFO - 1 XUL + 0x121258
04:15:27 INFO - rbp = 0x00007fff5f23c370 rsp = 0x00007fff5f23c310
04:15:27 INFO - rip = 0x0000000100ec6258
04:15:27 INFO - Found by: previous frame's frame pointer
04:15:27 INFO - 2 XUL + 0x38b6b41
04:15:27 INFO - rbp = 0x00007fff5f23c390 rsp = 0x00007fff5f23c380
04:15:27 INFO - rip = 0x000000010465bb41
04:15:27 INFO - Found by: previous frame's frame pointer
04:15:27 INFO - 3 XUL + 0x66e4b3
04:15:27 INFO - rbp = 0x00007fff5f23c3b0 rsp = 0x00007fff5f23c3a0
04:15:27 INFO - rip = 0x00000001014134b3
04:15:27 INFO - Found by: previous frame's frame pointer
04:15:27 INFO - 4 XUL + 0x38b7686
04:15:27 INFO - rbp = 0x00007fff5f23d070 rsp = 0x00007fff5f23c3c0
04:15:27 INFO - rip = 0x000000010465c686
04:15:27 INFO - Found by: previous frame's frame pointer
04:15:27 INFO - 5 plugin-container + 0x12cc
04:15:27 INFO - rbp = 0x00007fff5f23d0b0 rsp = 0x00007fff5f23d080
04:15:27 INFO - rip = 0x00000001009c32cc
04:15:27 INFO - Found by: previous frame's frame pointer
04:15:27 INFO - 6 plugin-container + 0xcd4
04:15:27 INFO - rbp = 0x00007fff5f23d0c8 rsp = 0x00007fff5f23d0c0
04:15:27 INFO - rip = 0x00000001009c2cd4
04:15:27 INFO - Found by: previous frame's frame pointer
04:15:27 INFO - Thread 1
04:15:27 INFO - 0 libsystem_kernel.dylib!kevent64 + 0xa
04:15:27 INFO - rax = 0x0000000002000171 rdx = 0x0000000000000001
04:15:27 INFO - rcx = 0x00000001074985b8 rbx = 0x0000000000000000
04:15:27 INFO - rsi = 0x00007fff73f2c730 rdi = 0x0000000000000004
04:15:27 INFO - rbp = 0x0000000107498640 rsp = 0x00000001074985b8
04:15:27 INFO - r8 = 0x0000000000000001 r9 = 0x0000000000000000
04:15:27 INFO - r10 = 0x00000001074985d0 r11 = 0x0000000000000246
04:15:27 INFO - r12 = 0x00007fff8f0d7258 r13 = 0x00007fff73f2bec0
04:15:27 INFO - r14 = 0x00000001074985d0 r15 = 0x0000000107498600
04:15:27 INFO - rip = 0x00007fff913e0232
04:15:27 INFO - Found by: given as instruction pointer in context
04:15:27 INFO - 1 libdispatch.dylib!_dispatch_mgr_thread + 0x34
04:15:27 INFO - rbp = 0x0000000107498e50 rsp = 0x0000000107498650
04:15:27 INFO - rip = 0x00007fff8f0b7a6a
04:15:27 INFO - Found by: previous frame's frame pointer
04:15:27 INFO - Thread 2
04:15:27 INFO - 0 libsystem_kernel.dylib!mach_msg_trap + 0xa
04:15:27 INFO - rax = 0x000000000100001f rdx = 0x0000000000000000
04:15:27 INFO - rcx = 0x0000000109958df8 rbx = 0x0000000000000000
04:15:27 INFO - rsi = 0x0000000000000002 rdi = 0x0000000107618800
04:15:27 INFO - rbp = 0x0000000109958e40 rsp = 0x0000000109958df8
04:15:27 INFO - r8 = 0x0000000000002403 r9 = 0x0000000000000000
04:15:27 INFO - r10 = 0x000000000000041c r11 = 0x0000000000000202
04:15:27 INFO - r12 = 0x000000000000041c r13 = 0x0000000000000000
04:15:27 INFO - r14 = 0x0000000107618800 r15 = 0x0000000000002403
04:15:27 INFO - rip = 0x00007fff913da4de
04:15:27 INFO - Found by: given as instruction pointer in context
04:15:27 INFO - 1 XUL + 0x642f0d
04:15:27 INFO - rbp = 0x000000010
Reporter | ||
Comment 1•8 years ago
|
||
could this be a regression from bug 1275755 ?
Flags: needinfo?(nfroyd)
Assignee | ||
Comment 2•8 years ago
|
||
(In reply to Carsten Book [:Tomcat] from comment #1)
> could this be a regression from bug 1275755 ?
Looks like. How frequent is this?
Flags: needinfo?(nfroyd)
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Reporter | ||
Comment 5•8 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #2)
> (In reply to Carsten Book [:Tomcat] from comment #1)
> > could this be a regression from bug 1275755 ?
>
> Looks like. How frequent is this?
not super frequent but happens i would say
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 11•8 years ago
|
||
This looks really frequent on OSX debug AFAICT.
Flags: needinfo?(nfroyd)
Summary: Intermittent Main app process exited normally | application crashed [@ XUL + 0x96620] | 04:15:26 INFO - Assertion failure: gUnusedAtomCount == 0, at /builds/slave/m-in-m64-d-0000000000000000000/build/src/xpcom/ds/nsAtomTable.cpp:546 → Intermittent Main app process exited normally | application crashed [@ XUL + 0x96620] after Assertion failure: gUnusedAtomCount == 0, at /builds/slave/m-in-m64-d-0000000000000000000/build/src/xpcom/ds/nsAtomTable.cpp:546
Assignee | ||
Comment 12•8 years ago
|
||
In one particular testsuite, even.
I would take a patch that would provide more insight into what's going on with some diagnostic messages and such. Want to learn to wrote more C++? =D
Flags: needinfo?(nfroyd)
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 16•8 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #12)
> In one particular testsuite, even.
>
> I would take a patch that would provide more insight into what's going on
> with some diagnostic messages and such. Want to learn to wrote more C++? =D
Still happening, so I guess you've left me with no other choice.
Flags: needinfo?(nfroyd)
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 18•8 years ago
|
||
Flags: needinfo?(nfroyd)
Assignee | ||
Comment 19•8 years ago
|
||
Assignee | ||
Comment 20•8 years ago
|
||
Let's try one that's actually passed local compilation: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0a8da4d60aad
Assignee | ||
Comment 21•8 years ago
|
||
Assignee | ||
Comment 22•8 years ago
|
||
Even better: https://treeherder.mozilla.org/#/jobs?repo=try&revision=9da2f23ef8f8
This log:
https://treeherder.mozilla.org/logviewer.html#?job_id=26204790&repo=try#L84757
indicates that our counter tracking the number of unused atoms is inconsistent with the number of atoms actually in the table:
07:03:11 INFO - [Child 1735] ###!!! ASSERTION: removed 10 unused 11
07:03:11 INFO - : 'false', file /builds/slave/try-m64-d-00000000000000000000/build/src/xpcom/ds/nsAtomTable.cpp, line 472
07:03:11 INFO - #01: XRE_InitChildProcess (in XUL) + 2328
07:03:11 INFO - #02: 0x0009c560 (in XUL) + 112
07:03:11 INFO - #03: 0x0012651c (in XUL) + 1564
07:03:11 INFO - #04: XRE_TermEmbedding (in XUL) + 81
07:03:11 INFO - #05: 0x006699d3 (in XUL) + 19
07:03:11 INFO - #06: 0x0000000100001f0c (in plugin-container) + 236
But, at the same time, we removed all the known dynamic atoms from the table, and we (I think) haven't had any races with other threads releasing atoms while we're iterating over the table for the final GC. So I'm unsure what the problem actually is...
Comment 23•8 years ago
|
||
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
Updated•8 years ago
|
Summary: Intermittent Main app process exited normally | application crashed [@ XUL + 0x96620] after Assertion failure: gUnusedAtomCount == 0, at /builds/slave/m-in-m64-d-0000000000000000000/build/src/xpcom/ds/nsAtomTable.cpp:546 → Intermittent Main app process exited normally | application crashed [@ NS_ShutdownAtomTable()] after Assertion failure: gUnusedAtomCount == 0, at /builds/slave/m-in-m64-d-0000000000000000000/build/src/xpcom/ds/nsAtomTable.cpp:546
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 42•8 years ago
|
||
This has increased in frequency recently- I would like to come back and revisit this bug. I see both Bobby Holley and :froydnj on the bug- those are the common names on nsAtomTable.cpp.
While this happens on a variety of operating systems, this is particular to crashtests, e10s, debug, and primarily on osx. Looking over history on autoland, this is about 6% of the time.
We went from ~15 instances/week to >25/week around December 4th.
I would imagine we could push to try and bisect this down with 50 retriggers of crashtest- maybe break crashtests into a few chunks. Unfortunately osx isn't on taskcluster, so to do this we would need to hack the manifests each time.
Comment 43•8 years ago
|
||
here is me bisecting the ~3100 crashtests into 10 chunks and running them 50 times each:
https://treeherder.mozilla.org/#/jobs?repo=try&author=jmaher@mozilla.com&fromchange=7183f38b4fc65b92faa953572f3a9f74f029cc92&tochange=46c1edf84a5a77b19ec0a65e9dc80653c211f64d
ideally we will have data to point us at a chunk of ~310 tests, worse case either a couple chunks or no chunks.
Comment 44•8 years ago
|
||
I found the 3rd chunk to be the problem (10%+ of the time) :)
now to bisect the 312 tests down in 10 more chunks:
https://treeherder.mozilla.org/#/jobs?repo=try&author=jmaher@mozilla.com&fromchange=46b96c8fd1430d02227c5fd8836e211186ed2b18&tochange=b04ea377414ae0b6a31026c7a9fa72b90854223e
just like last time, this should give us a smaller chunk and we can bisect the ~31 tests down to a smaller range.
Comment 45•8 years ago
|
||
hmm, all smaller chunks of 31 passed for me, and I have verified I ran the same set of tests in the smaller chunks as per the original chunk of 311.
I am not sure if this is timing related or not running long enough, the directories that we run tests from are:
dom/svg/crashtests
dom/xbl/crashtests
dom/xml/crashtests
dom/xslt/crashtests
dom/xul/crashtests
dom/xul/templates/crashtests
editor/composer/crashtests
editor/libeditor/crashtests
editor/txmgr/tests/crashtests
gfx/tests/crashtests
Comment 46•8 years ago
|
||
trying a few more try pushes to split the 311 tests into 2 sets of 155 and see if this helps.
Comment 47•8 years ago
|
||
ok, I give up, splitting the 311 tests in 2 or 3 chunks do not reproduce, but trying the same 311 tests again yield failures- so as it stands, I think there is a set of tests in the directories from comment 45 which depend upon each other (or some timing issue).
I am not sure how to make this actionable other than running the tests in 12 chunks :) I think I will see if :froydnj has thoughts on this given the fact that we have narrowed the test case down 90%.
Flags: needinfo?(nfroyd)
Updated•8 years ago
|
Summary: Intermittent Main app process exited normally | application crashed [@ NS_ShutdownAtomTable()] after Assertion failure: gUnusedAtomCount == 0, at /builds/slave/m-in-m64-d-0000000000000000000/build/src/xpcom/ds/nsAtomTable.cpp:546 → Intermittent 1113005.html, Main app process exited normally | application crashed [@ NS_ShutdownAtomTable()] after Assertion failure: gUnusedAtomCount == 0, at /builds/slave/m-in-m64-d-0000000000000000000/build/src/xpcom/ds/nsAtomTable.cpp:546
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 49•8 years ago
|
||
(In reply to Joel Maher ( :jmaher) from comment #47)
> I am not sure how to make this actionable other than running the tests in 12
> chunks :) I think I will see if :froydnj has thoughts on this given the
> fact that we have narrowed the test case down 90%.
On the supposition that it's timing related, can you simply eliminate some tests and/or directories from the try runs and narrow it down that way?
The peculiar thing is that comment 22 says that we're racing between threads, I think, and comment 45 suggests that virtually all of the tests are dealing with on-the-main-thread things.
Flags: needinfo?(nfroyd)
Assignee | ||
Comment 50•8 years ago
|
||
OK, so I've been thinking about this intermittent and doing some try runs, and I have a plausible theory as to how we end up with this assertion. Why it happens much more frequently on Mac, I don't know.
Try runs indicate that we're failing like this:
12:29:24 INFO - [Child 2266] ###!!! ASSERTION: removed 15 unused 16
12:29:24 INFO - : 'false', file /builds/slave/try-m64-d-00000000000000000000/build/src/xpcom/ds/nsAtomTable.cpp, line 487
12:29:24 INFO - #01: NS_ShutdownAtomTable() [xpcom/glue/Mutex.h:169]
12:29:24 INFO - #02: mozilla::ShutdownXPCOM(nsIServiceManager*) [xpcom/build/XPCOMInit.cpp:1083]
12:29:24 INFO - #03: XRE_TermEmbedding [toolkit/xre/nsEmbedFunctions.cpp:217]
12:29:24 INFO - #04: mozilla::ipc::ScopedXREEmbed::Stop() [ipc/glue/ScopedXREEmbed.cpp:118]
12:29:24 INFO - #05: XRE_InitChildProcess [toolkit/xre/nsEmbedFunctions.cpp:761]
12:29:24 INFO - #06: content_process_main(int, char**) [ipc/contentproc/plugin-container.cpp:115]
That is, we've removed 15 atoms from the atom table at shutdown, but we think we have 16 unused atoms in the table, and so we fail. The numbers are not consistent between runs, but the difference is always more unused atoms than we have removed. A plausible scenario follows below, at some arbitrary time during the execution of the tests, we have:
T1: Starts to GC the atom table, for whatever reason:
http://searchfox.org/mozilla-central/source/xpcom/ds/nsAtomTable.cpp#403
T2: Release()'s a Dynamic atom:
http://searchfox.org/mozilla-central/source/xpcom/ds/nsAtomTable.cpp#453
After the reference count of the atom on T2 has been decremented (and so it is now zero and can be GC'd), but before it can execute any of the instructions involved in `++gUnusedAtomCount >= kAtomGCThreshold`, it is interrupted.
T1: Completes the GC of the atom table, GC'ing the atom from T2, and decrements gUnusedAtomCount.
T2: increments gUnusedAtomCount, finds that it's less than kAtomGCThreshold, and continues.
We are now in a situation where gUnusedAtomCount is *not* an accurate count of the unused atoms still in the atom table. We're also doing very dodgy things, since we're executing the destructor of an object that has just been delete'd...it's a very good thing we don't touch any more object state after this check!
The only possibilities I see are deleting the assertion or making atom addref/release slower (i.e. Mutex acquisition to ensure we don't race with the GC). But I haven't thought about it very hard. bholley, WDYT?
Flags: needinfo?(bobbyholley)
Comment 51•8 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #50)
> T1: Starts to GC the atom table, for whatever reason:
>
> http://searchfox.org/mozilla-central/source/xpcom/ds/nsAtomTable.cpp#403
>
> T2: Release()'s a Dynamic atom:
>
> http://searchfox.org/mozilla-central/source/xpcom/ds/nsAtomTable.cpp#453
>
> After the reference count of the atom on T2 has been decremented (and so it
> is now zero and can be GC'd), but before it can execute any of the
> instructions involved in `++gUnusedAtomCount >= kAtomGCThreshold`, it is
> interrupted.
>
> T1: Completes the GC of the atom table, GC'ing the atom from T2, and
> decrements gUnusedAtomCount.
> T2: increments gUnusedAtomCount, finds that it's less than kAtomGCThreshold,
> and continues.
But T1 decrements gUnusedAtomCount by the number of atoms that were actually removed, right? So in the race you're describing, T1 is sort of "spending on credit" when it decrements gUnusedAtomCount, but T2 will deposit the balance shortly and equalize things out. This could lead to integer underflow, but as long as things wrap appropriately (do they), it should all balance out, right? Or am I missing something?
Flags: needinfo?(bobbyholley) → needinfo?(nfroyd)
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 53•8 years ago
|
||
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #51)
> But T1 decrements gUnusedAtomCount by the number of atoms that were actually
> removed, right? So in the race you're describing, T1 is sort of "spending on
> credit" when it decrements gUnusedAtomCount, but T2 will deposit the balance
> shortly and equalize things out. This could lead to integer underflow, but
> as long as things wrap appropriately (do they), it should all balance out,
> right? Or am I missing something?
OK, that makes sense. But we're still ending up with an extra increment that's not reflected in the number of unused atoms in the atom table! Back to thinking hard, I guess...
Flags: needinfo?(nfroyd)
Comment 54•8 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #50)
> That is, we've removed 15 atoms from the atom table at shutdown, but we
> think we have 16 unused atoms in the table, and so we fail. The numbers are
> not consistent between runs, but the difference is always more unused atoms
> than we have removed.
Hm. How many atoms are actually in the table? Does it match the unused count when we GC? It seems like the final GC should assert that the unused count is equal to the entries in the table.
But I guess that runs up against the issue described in [1]. I think it might make sense to track down and fix those leaks, since it would allow us to have stronger invariants here, and make it easier to reason about this bug.
http://searchfox.org/mozilla-central/rev/8144fcc62054e278fe5db9666815cc13188c96fa/xpcom/ds/nsAtomTable.cpp#549
Comment 55•8 years ago
|
||
--> Adjusting bug summary a bit to clarify that this doesn't *actually* happen during crashtest 1113005.html, though Linux64 Debug test-runners report it as if it did. See bug 1329781 for more details (there's a logging bug here).
Summary: Intermittent 1113005.html, Main app process exited normally | application crashed [@ NS_ShutdownAtomTable()] after Assertion failure: gUnusedAtomCount == 0, at /builds/slave/m-in-m64-d-0000000000000000000/build/src/xpcom/ds/nsAtomTable.cpp:546 → Intermittent Main app process exited normally | application crashed [@ NS_ShutdownAtomTable()] after Assertion failure: gUnusedAtomCount == 0, at xpcom/ds/nsAtomTable.cpp:546 (sometimes reported during crashtest 1113005.html)
Assignee | ||
Comment 56•8 years ago
|
||
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #54)
> (In reply to Nathan Froyd [:froydnj] from comment #50)
> > That is, we've removed 15 atoms from the atom table at shutdown, but we
> > think we have 16 unused atoms in the table, and so we fail. The numbers are
> > not consistent between runs, but the difference is always more unused atoms
> > than we have removed.
>
> Hm. How many atoms are actually in the table? Does it match the unused count
> when we GC? It seems like the final GC should assert that the unused count
> is equal to the entries in the table.
That is what the current assertion (gUnusedAtomCount == 0) is essentially trying to do. I have a more complete set of diagnostic patches:
https://hg.mozilla.org/try/rev/61f17a9fa0314f3a6516c8193cf8a4bd469c08da
https://hg.mozilla.org/try/rev/3ba431dc5a0b54b1ed1895eb5ac47d7b5cd61321
that separates the GC into normal-usage GC and shutdown-GC. And we get the same flavor of assertion: at the final shutdown GC, the number of removed atoms from the table is not equal to the count of unused atoms that we had. And no, we don't appear to have missed any, or added things to the atom table during GC, or anything like that.
Comment 57•8 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #56)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #54)
> > (In reply to Nathan Froyd [:froydnj] from comment #50)
> > > That is, we've removed 15 atoms from the atom table at shutdown, but we
> > > think we have 16 unused atoms in the table, and so we fail. The numbers are
> > > not consistent between runs, but the difference is always more unused atoms
> > > than we have removed.
> >
> > Hm. How many atoms are actually in the table? Does it match the unused count
> > when we GC? It seems like the final GC should assert that the unused count
> > is equal to the entries in the table.
>
> That is what the current assertion (gUnusedAtomCount == 0) is essentially
> trying to do.
Not quite. IIUC the current assertion is trying to check whether we GCed all the unused atoms. That's different than asserting that all the atoms are unused (i.e. no atoms are leaked), which per the link in comment 54 isn't yet assertable. I'm suggesting that we see whether we can make that assertable without too much work, because it would give us another data point and help us figure out whether gUnusedAtomCount is wrong or one of the mRefCnt members is wrong.
Assignee | ||
Comment 58•8 years ago
|
||
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #57)
> (In reply to Nathan Froyd [:froydnj] from comment #56)
> > (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #54)
> > > (In reply to Nathan Froyd [:froydnj] from comment #50)
> > > > That is, we've removed 15 atoms from the atom table at shutdown, but we
> > > > think we have 16 unused atoms in the table, and so we fail. The numbers are
> > > > not consistent between runs, but the difference is always more unused atoms
> > > > than we have removed.
> > >
> > > Hm. How many atoms are actually in the table? Does it match the unused count
> > > when we GC? It seems like the final GC should assert that the unused count
> > > is equal to the entries in the table.
> >
> > That is what the current assertion (gUnusedAtomCount == 0) is essentially
> > trying to do.
>
> Not quite. IIUC the current assertion is trying to check whether we GCed all
> the unused atoms. That's different than asserting that all the atoms are
> unused (i.e. no atoms are leaked), which per the link in comment 54 isn't
> yet assertable. I'm suggesting that we see whether we can make that
> assertable without too much work, because it would give us another data
> point and help us figure out whether gUnusedAtomCount is wrong or one of the
> mRefCnt members is wrong.
The diagnostic patches explicitly check that the number of removed atoms is equal to gUnusedAtomCount, which is not the case. Those patches also check that there are no dynamic atoms in the atom table after we've finished the shutdown GC. So we're not leaking any atoms.
Ah, idea: dynamic atoms can get transmuted to static atoms. Maybe what's happening is that we're creating a dynamic atom *very* early on and then releasing it so that it's effectively unused. That would increment gUnusedAtomCount, but probably not trigger a GC. Then we register all the static atoms, and effectively change that (unused) dynamic atom into a static atom...and that atom will never get released. That would account for the extra increment to gUnusedAtomCount.
Comment 59•8 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #58)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #57)
> > (In reply to Nathan Froyd [:froydnj] from comment #56)
> > > (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #54)
> > > > (In reply to Nathan Froyd [:froydnj] from comment #50)
> > > > > That is, we've removed 15 atoms from the atom table at shutdown, but we
> > > > > think we have 16 unused atoms in the table, and so we fail. The numbers are
> > > > > not consistent between runs, but the difference is always more unused atoms
> > > > > than we have removed.
> > > >
> > > > Hm. How many atoms are actually in the table? Does it match the unused count
> > > > when we GC? It seems like the final GC should assert that the unused count
> > > > is equal to the entries in the table.
> > >
> > > That is what the current assertion (gUnusedAtomCount == 0) is essentially
> > > trying to do.
> >
> > Not quite. IIUC the current assertion is trying to check whether we GCed all
> > the unused atoms. That's different than asserting that all the atoms are
> > unused (i.e. no atoms are leaked), which per the link in comment 54 isn't
> > yet assertable. I'm suggesting that we see whether we can make that
> > assertable without too much work, because it would give us another data
> > point and help us figure out whether gUnusedAtomCount is wrong or one of the
> > mRefCnt members is wrong.
>
> The diagnostic patches explicitly check that the number of removed atoms is
> equal to gUnusedAtomCount, which is not the case. Those patches also check
> that there are no dynamic atoms in the atom table after we've finished the
> shutdown GC. So we're not leaking any atoms.
Oh interesting! If that's actually the case, then we can strengthen the assertion referred to at [1] right?
> Ah, idea: dynamic atoms can get transmuted to static atoms. Maybe what's
> happening is that we're creating a dynamic atom *very* early on and then
> releasing it so that it's effectively unused. That would increment
> gUnusedAtomCount, but probably not trigger a GC. Then we register all the
> static atoms, and effectively change that (unused) dynamic atom into a
> static atom...and that atom will never get released. That would account for
> the extra increment to gUnusedAtomCount.
Interesting! But why on earth do we allow this sketchy transmutation? Is it a holdover from the binary component days? It seems like we could just reorder things so that the static atom table is initialized, prepopulated, and sealed very early in startup.
[1] http://searchfox.org/mozilla-central/rev/8144fcc62054e278fe5db9666815cc13188c96fa/xpcom/ds/nsAtomTable.cpp#549
Assignee | ||
Comment 60•8 years ago
|
||
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #59)
> (In reply to Nathan Froyd [:froydnj] from comment #58)
> > (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #57)
> > > (In reply to Nathan Froyd [:froydnj] from comment #56)
> > > > (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #54)
> > > > > (In reply to Nathan Froyd [:froydnj] from comment #50)
> > > > > > That is, we've removed 15 atoms from the atom table at shutdown, but we
> > > > > > think we have 16 unused atoms in the table, and so we fail. The numbers are
> > > > > > not consistent between runs, but the difference is always more unused atoms
> > > > > > than we have removed.
> > > > >
> > > > > Hm. How many atoms are actually in the table? Does it match the unused count
> > > > > when we GC? It seems like the final GC should assert that the unused count
> > > > > is equal to the entries in the table.
> > > >
> > > > That is what the current assertion (gUnusedAtomCount == 0) is essentially
> > > > trying to do.
> > >
> > > Not quite. IIUC the current assertion is trying to check whether we GCed all
> > > the unused atoms. That's different than asserting that all the atoms are
> > > unused (i.e. no atoms are leaked), which per the link in comment 54 isn't
> > > yet assertable. I'm suggesting that we see whether we can make that
> > > assertable without too much work, because it would give us another data
> > > point and help us figure out whether gUnusedAtomCount is wrong or one of the
> > > mRefCnt members is wrong.
> >
> > The diagnostic patches explicitly check that the number of removed atoms is
> > equal to gUnusedAtomCount, which is not the case. Those patches also check
> > that there are no dynamic atoms in the atom table after we've finished the
> > shutdown GC. So we're not leaking any atoms.
>
> Oh interesting! If that's actually the case, then we can strengthen the
> assertion referred to at [1] right?
Well, once the previous, weaker assertion stops intermittently failing, yes. :)
> > Ah, idea: dynamic atoms can get transmuted to static atoms. Maybe what's
> > happening is that we're creating a dynamic atom *very* early on and then
> > releasing it so that it's effectively unused. That would increment
> > gUnusedAtomCount, but probably not trigger a GC. Then we register all the
> > static atoms, and effectively change that (unused) dynamic atom into a
> > static atom...and that atom will never get released. That would account for
> > the extra increment to gUnusedAtomCount.
>
> Interesting! But why on earth do we allow this sketchy transmutation? Is it
> a holdover from the binary component days? It seems like we could just
> reorder things so that the static atom table is initialized, prepopulated,
> and sealed very early in startup.
I don't know why we allow this; the code doing this dates from hg@1:
https://hg.mozilla.org/integration/mozilla-inbound/annotate/5903b0f685fc/xpcom/ds/nsAtomTable.cpp
with just minimal refactoring changes over the years. I guess we could stick a MOZ_ASSERT there and then try to figure out who's doing that. I would dearly love to overhaul how we do atoms...
The good news is that handling this transmutation case apparently makes the intermittent go away:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=e27bf84176b405879fceab7526cc58a2d35b26bb&selectedJob=67698606
Comment 61•8 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #60)
> with just minimal refactoring changes over the years. I guess we could
> stick a MOZ_ASSERT there and then try to figure out who's doing that. I
> would dearly love to overhaul how we do atoms...
I think it would actually be even easier than that.
There are 9 calls to NS_RegisterStaticAtoms [1]. The 4 main ones live at the top of nsLayoutStatics::Initialize [2]. But because of the 5 other miscellaneous ones, we delay sealing the atom table until the _bottom_ of nsLayoutStatics::Initialize [3], after all sorts of (presumably dynamic-atom-consuming) code has been run.
So my bet is we could do the following simple steps:
(1) Invoke nsTextServicesDocument::RegisterAtoms, nsHtml5Atoms::AddRefAtoms, nsHTMLTags::AddRefTable, NS_RegisterStaticAtoms(rdf_atoms), and NS_RegisterStaticAtoms(directory_atoms) at the top of nsLayoutStatics::Initialize.
(2) Move the NS_SealStaticAtomTable() call to immediately after that block.
(3) Initialize the dynamic atom table at the bottom of NS_SealStaticAtomTable(), and remove the lazy initialization.
(4) Remove the transmutation gunk.
(5) Assert that we're not leaking any atoms at shutdown.
WDYT?
[1] http://searchfox.org/mozilla-central/search?q=symbol:_Z22NS_RegisterStaticAtomsRAT__K12nsStaticAtom&redirect=false
[2] http://searchfox.org/mozilla-central/rev/225ab0637ed51b8b3f9f4ee2f9c339a37a65b626/layout/build/nsLayoutStatics.cpp#155
[3] http://searchfox.org/mozilla-central/rev/225ab0637ed51b8b3f9f4ee2f9c339a37a65b626/layout/build/nsLayoutStatics.cpp#270
Flags: needinfo?(nfroyd)
Comment 62•8 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #60)
> The good news is that handling this transmutation case apparently makes the
> intermittent go away:
>
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=e27bf84176b405879fceab7526cc58a2d35b26bb&selectedJob=6
> 7698606
Also: \o/ \o/ \o/
Assignee | ||
Comment 63•8 years ago
|
||
That is a much easier idea than my "figure out what needs to be moved through repeated try runs". :) I'll look into doing that.
Flags: needinfo?(nfroyd)
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 66•8 years ago
|
||
Moving the html5 atoms out into their own initialization phase makes the
initialization of atoms more explicit and avoids problems with trying to
move modules around so their atoms get initialized in the correct place.
As an aesthetic bonus, this change produces pleasing symmetry in
nsHtml5Module::{Initialize,Release}Statics.
Reviewed-by: Nathan Froyd <froydnj@gmail.com>
Attachment #8830339 -
Flags: review?(hsivonen)
Assignee | ||
Comment 67•8 years ago
|
||
Moving this call closer to the other atom initializations will enable us
to seal the static atom table sooner.
Attachment #8830340 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 68•8 years ago
|
||
We do this for much the same reason that we moved the nsHtml5Atoms
initialization. Otherwise, the nsHTMLTags atoms are lazily initialized
long after we've sealed the static atom table in nsLayoutStatics.
Attachment #8830341 -
Flags: review?(hsivonen)
Assignee | ||
Comment 69•8 years ago
|
||
This is not the cleanest code ever, but we need to move all static atom
initialization prior to NS_SealStaticAtomTable to avoid the possibility
of dynamic atoms being transmuted into static atoms. We therefore need
to open up an avenue for somebody else to initialize the atoms that RDF needs.
Attachment #8830343 -
Flags: review?(l10n)
Assignee | ||
Comment 70•8 years ago
|
||
There are several XPCOM tests that purport to call
NS_RegisterStaticAtoms. The tests located in the xpcom/tests/ directory
are unused, so we might as well just remove them. The gtests do get
run, but there's going to be no way to test NS_RegisterStaticAtoms once
sealing the atom table actually means forbidding new additions. So we
might as well remove the gtest too.
Attachment #8830344 -
Flags: review?(erahm)
Assignee | ||
Comment 71•8 years ago
|
||
We do this to ensure that everybody has registered all the static atoms
we'll care about, and to pave the way for asserting that nobody is
trying to create any static atoms past this point in the next patch.
Attachment #8830345 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 72•8 years ago
|
||
This change seems like an obvious thing we should have been doing, but
we weren't.
Attachment #8830346 -
Flags: review?(erahm)
Assignee | ||
Comment 73•8 years ago
|
||
This comment makes things slightly more greppable.
Attachment #8830347 -
Flags: review?(erahm)
Assignee | ||
Comment 74•8 years ago
|
||
We can do this now that we've shuffled static atom initialization around
appropriately.
Attachment #8830348 -
Flags: review?(erahm)
Assignee | ||
Comment 75•8 years ago
|
||
Good riddance to some sketchy code.
Attachment #8830349 -
Flags: review?(erahm)
Assignee | ||
Comment 76•8 years ago
|
||
This could have been done more simply, but the small amount of
refactoring that takes place in this comment enables better error
messages in the case where something does go wrong.
Attachment #8830350 -
Flags: review?(erahm)
Comment 77•8 years ago
|
||
Comment on attachment 8830343 [details] [diff] [review]
part 4 - initialize RDF atoms in nsLayoutStatics
Review of attachment 8830343 [details] [diff] [review]:
-----------------------------------------------------------------
Flinging this over from Axel to Benjamin, this is out of my comfort zone.
Attachment #8830343 -
Flags: review?(l10n) → review?(benjamin)
Comment 78•8 years ago
|
||
Comment on attachment 8830346 [details] [diff] [review]
part 7 - don't register static atoms after the table has been sealed
Review of attachment 8830346 [details] [diff] [review]:
-----------------------------------------------------------------
::: xpcom/ds/nsAtomTable.cpp
@@ +601,1 @@
> if (!gStaticAtomTable && !gStaticAtomTableSealed) {
There doesn't seem to be a whole lot of value in the gStaticAtomTableSealed here now.
Comment 79•8 years ago
|
||
Comment on attachment 8830348 [details] [diff] [review]
part 9 - forbid transmutation of dynamic atoms
Review of attachment 8830348 [details] [diff] [review]:
-----------------------------------------------------------------
::: xpcom/ds/nsAtomTable.cpp
@@ +619,5 @@
> if (atom) {
> + // Disallow creating a dynamic atom, and then later, while the
> + // dynamic atom is still alive, registering that name atom as a
> + // static atom. It causes subtle bugs, and we're programming in
> + // C++ here, not Smalltalk.
Not sure there's any point in jabbing at smalltalk here.
Comment 80•8 years ago
|
||
Comment on attachment 8830343 [details] [diff] [review]
part 4 - initialize RDF atoms in nsLayoutStatics
This makes me sad, both because RDF will be going away soon and because the layout module is a weird side-effecty way of doing this. But I can't think of a cleaner way, so r+ and I'll hopefully make this better hacking the startup sequence.
Attachment #8830343 -
Flags: review?(benjamin) → review+
Comment 81•8 years ago
|
||
Comment on attachment 8830340 [details] [diff] [review]
part 2 - move nsTextServicesDocument::RegisterAtoms call
r=me
Attachment #8830340 -
Flags: review?(bzbarsky) → review+
Comment 82•8 years ago
|
||
Comment on attachment 8830345 [details] [diff] [review]
part 6 - seal the static atom table sooner
r=me
Attachment #8830345 -
Flags: review?(bzbarsky) → review+
Comment 83•8 years ago
|
||
Comment on attachment 8830344 [details] [diff] [review]
part 5 - don't try to test NS_RegisterStaticAtoms
Review of attachment 8830344 [details] [diff] [review]:
-----------------------------------------------------------------
It would be nice if we had a way to test the atom table directly rather than being limited to the global one, but I guess this is what we've got. It might make sense to file a follow up to make the atom table unit testable.
::: xpcom/tests/gtest/TestAtoms.cpp
@@ -160,5 @@
>
> EXPECT_TRUE(thirdDynamic);
> EXPECT_EQ(NS_GetNumberOfAtoms(), count + 1);
> -
> - NS_RegisterStaticAtoms(sAtoms_info);
Can you add a note here that we can't test adding atoms because the table has already been sealed by the time xpcom initialization has completed?
Attachment #8830344 -
Flags: review?(erahm) → review+
Comment 84•8 years ago
|
||
Comment on attachment 8830346 [details] [diff] [review]
part 7 - don't register static atoms after the table has been sealed
Review of attachment 8830346 [details] [diff] [review]:
-----------------------------------------------------------------
::: xpcom/ds/nsAtomTable.cpp
@@ +595,5 @@
> RegisterStaticAtoms(const nsStaticAtom* aAtoms, uint32_t aAtomCount)
> {
> MutexAutoLock lock(*gAtomTableLock);
> +
> + MOZ_RELEASE_ASSERT(!gStaticAtomTableSealed, "too late!");
Perhaps: "Atom table has already been sealed!"
Attachment #8830346 -
Flags: review?(erahm) → review+
Updated•8 years ago
|
Attachment #8830347 -
Flags: review?(erahm) → review+
Comment 85•8 years ago
|
||
Comment on attachment 8830348 [details] [diff] [review]
part 9 - forbid transmutation of dynamic atoms
Review of attachment 8830348 [details] [diff] [review]:
-----------------------------------------------------------------
::: xpcom/ds/nsAtomTable.cpp
@@ +617,5 @@
>
> nsIAtom* atom = he->mAtom;
> if (atom) {
> + // Disallow creating a dynamic atom, and then later, while the
> + // dynamic atom is still alive, registering that name atom as a
nit: "registering that same atom"
@@ +624,2 @@
> if (!atom->IsStaticAtom()) {
> + MOZ_CRASH("static atom registration should be pushed back!");
Is there some way we can log the atom string here? Crashing without that isn't super helpful. Maybe w/ a crash annotation.
Attachment #8830348 -
Flags: review?(erahm) → feedback+
Comment 86•8 years ago
|
||
Comment on attachment 8830349 [details] [diff] [review]
part 10 - remove dynamic->static atom transmutation code
Review of attachment 8830349 [details] [diff] [review]:
-----------------------------------------------------------------
\o/
Attachment #8830349 -
Flags: review?(erahm) → review+
Updated•8 years ago
|
Attachment #8830350 -
Flags: review?(erahm) → review+
Comment 87•8 years ago
|
||
Lovely, esp. part 10.
Comment on attachment 8830339 [details] [diff] [review]
part 1 - make nsHtml5Atoms initialization explicit in nsLayoutStatics
Review of attachment 8830339 [details] [diff] [review]:
-----------------------------------------------------------------
r+, but I guess I should really put the HTML5 atoms into the normal atom bucket. I was advised to make them separate way back in 2009, IIRC, but the separation seems to just complicate things.
Attachment #8830339 -
Flags: review?(hsivonen) → review+
Attachment #8830341 -
Flags: review?(hsivonen) → review+
Assignee | ||
Comment 89•8 years ago
|
||
(In reply to Henri Sivonen (:hsivonen) from comment #88)
> r+, but I guess I should really put the HTML5 atoms into the normal atom
> bucket. I was advised to make them separate way back in 2009, IIRC, but the
> separation seems to just complicate things.
It complicates things and it would save a reasonable amount of space (tens of kilobytes) if the HTML parser would just use nsGkAtoms:: atoms instead. Is that feasible to do? I looked into it, but got a bit lost in having to modify the Java translator to make things go.
Flags: needinfo?(hsivonen)
Assignee | ||
Comment 90•8 years ago
|
||
Now with more crash annotation.
Attachment #8830805 -
Flags: review?(erahm)
Assignee | ||
Updated•8 years ago
|
Attachment #8830348 -
Attachment is obsolete: true
Comment 91•8 years ago
|
||
Comment on attachment 8830805 [details] [diff] [review]
part 9 - forbid transmutation of dynamic atoms
Review of attachment 8830805 [details] [diff] [review]:
-----------------------------------------------------------------
::: xpcom/ds/nsAtomTable.cpp
@@ +628,5 @@
> + nsPrintfCString msg("static atom registration for %s should be pushed back",
> + name.get());
> + static char sCrashReason[1024];
> + memcpy(sCrashReason, msg.get(), XPCOM_MIN(sizeof(sCrashReason),
> + size_t(msg.Length() + 1)));
We could probably just |snprintf| the buffer directly, but this way works fine too.
Attachment #8830805 -
Flags: review?(erahm) → review+
Comment 92•8 years ago
|
||
Comment on attachment 8830805 [details] [diff] [review]
part 9 - forbid transmutation of dynamic atoms
Review of attachment 8830805 [details] [diff] [review]:
-----------------------------------------------------------------
::: xpcom/ds/nsAtomTable.cpp
@@ +630,5 @@
> + static char sCrashReason[1024];
> + memcpy(sCrashReason, msg.get(), XPCOM_MIN(sizeof(sCrashReason),
> + size_t(msg.Length() + 1)));
> + MOZ_CRASH_ANNOTATE(sCrashReason);
> + MOZ_REALLY_CRASH();
Sorry one more note, you can just MOZ_CRASH(sCrashReason) I believe.
Comment 93•8 years ago
|
||
Pushed by nfroyd@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f0488821fbbe
part 1 - make nsHtml5Atoms initialization explicit in nsLayoutStatics; r=hsivonen
https://hg.mozilla.org/integration/mozilla-inbound/rev/d8652bf04c7b
part 2 - move nsTextServicesDocument::RegisterAtoms call; r=bz
https://hg.mozilla.org/integration/mozilla-inbound/rev/b7c1ee47ca9a
part 3 - split out nsHTMLTags atom initialization; r=hsivonen
https://hg.mozilla.org/integration/mozilla-inbound/rev/e6272736e0e6
part 4 - initialize RDF atoms in nsLayoutStatics; r=Pike
https://hg.mozilla.org/integration/mozilla-inbound/rev/ead7c6aa2342
part 5 - don't try to test NS_RegisterStaticAtoms; r=erahm
https://hg.mozilla.org/integration/mozilla-inbound/rev/c7456484433f
part 6 - seal the static atom table sooner; r=bz
https://hg.mozilla.org/integration/mozilla-inbound/rev/5a6fc9e02f9d
part 7 - don't register static atoms after the table has been sealed; r=erahm
https://hg.mozilla.org/integration/mozilla-inbound/rev/0fe884bfae3a
part 8 - add a comment in NS_InitXPCOM2 to make static atom initialization less cryptic; r=erahm
https://hg.mozilla.org/integration/mozilla-inbound/rev/2534a6de4a7a
part 9 - forbid transmutation of dynamic atoms; r=erahm
https://hg.mozilla.org/integration/mozilla-inbound/rev/5d5695501775
part 10 - remove dynamic->static atom transmutation code; r=erahm
https://hg.mozilla.org/integration/mozilla-inbound/rev/4ad4f181474c
part 11 - strengthen assertions for atom table shutdown GC; r=erahm
Reporter | ||
Comment 94•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/f0488821fbbe
https://hg.mozilla.org/mozilla-central/rev/d8652bf04c7b
https://hg.mozilla.org/mozilla-central/rev/b7c1ee47ca9a
https://hg.mozilla.org/mozilla-central/rev/e6272736e0e6
https://hg.mozilla.org/mozilla-central/rev/ead7c6aa2342
https://hg.mozilla.org/mozilla-central/rev/c7456484433f
https://hg.mozilla.org/mozilla-central/rev/5a6fc9e02f9d
https://hg.mozilla.org/mozilla-central/rev/0fe884bfae3a
https://hg.mozilla.org/mozilla-central/rev/2534a6de4a7a
https://hg.mozilla.org/mozilla-central/rev/5d5695501775
https://hg.mozilla.org/mozilla-central/rev/4ad4f181474c
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox54:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Comment 95•8 years ago
|
||
My hero <3
Comment hidden (Intermittent Failures Robot) |
(In reply to Nathan Froyd [:froydnj] from comment #89)
> (In reply to Henri Sivonen (:hsivonen) from comment #88)
> > r+, but I guess I should really put the HTML5 atoms into the normal atom
> > bucket. I was advised to make them separate way back in 2009, IIRC, but the
> > separation seems to just complicate things.
>
> It complicates things and it would save a reasonable amount of space (tens
> of kilobytes) if the HTML parser would just use nsGkAtoms:: atoms instead.
> Is that feasible to do
Totally.
> I looked into it, but got a bit lost in having to
> modify the Java translator to make things go.
Let's see if I can get it done this week...
Flags: needinfo?(hsivonen)
Assignee | ||
Comment 98•8 years ago
|
||
(In reply to Henri Sivonen (:hsivonen) from comment #97)
> (In reply to Nathan Froyd [:froydnj] from comment #89)
> > I looked into it, but got a bit lost in having to
> > modify the Java translator to make things go.
>
> Let's see if I can get it done this week...
That would be great, thank you!
Assignee | ||
Comment 99•8 years ago
|
||
RyanVM would dearly love to uplift these patches. I don't think there's much of a problem with uplifting them to 53, but given the TB fallout from these patches, I think uplifting them to 52 would be a bridge too far. For 52, I'd like to do something more like:
https://hg.mozilla.org/try/rev/e27bf84176b405879fceab7526cc58a2d35b26bb
and handle the problematic case directly, rather than architecturally redoing bits of how we do atoms.
Comment 100•8 years ago
|
||
I do think that if we uplift without the new assertion to 52 Thunderbird would be OK by just moving to dynamic atoms and backporting that to 52.
Whether we can have other callers of NS_RegisterStaticAtoms on 52 is less clear to me. On 53 we can't of course, since we killed XPCOM exports.
Comment 101•8 years ago
|
||
Shall we go ahead and proceed with the Aurora approvals on this?
Flags: needinfo?(nfroyd)
Assignee | ||
Comment 103•8 years ago
|
||
Comment on attachment 8830339 [details] [diff] [review]
part 1 - make nsHtml5Atoms initialization explicit in nsLayoutStatics
This uplift request is for all the patches in this bug.
Approval Request Comment
[Feature/Bug causing the regression]: bug 1275755
[User impact if declined]: None. Sheriffs will love not having to star intermittent failures for this on Aurora, though.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: Insofar as there haven't been any intermittent failures from this bug since the patches landed, yes.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: We're making the startup sequence of the browser more predictable. If there were problems, we'd have seen them via crashes at startup.
[String changes made/needed]: None.
Attachment #8830339 -
Flags: approval-mozilla-aurora?
Comment 104•8 years ago
|
||
Comment on attachment 8830339 [details] [diff] [review]
part 1 - make nsHtml5Atoms initialization explicit in nsLayoutStatics
Fixes an intermittent, let's uplift to Aurora53
Attachment #8830339 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 105•8 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/77ab0b397936
https://hg.mozilla.org/releases/mozilla-aurora/rev/9feaa1d4b820
https://hg.mozilla.org/releases/mozilla-aurora/rev/c922a29ad146
https://hg.mozilla.org/releases/mozilla-aurora/rev/a5abb9c78e53
https://hg.mozilla.org/releases/mozilla-aurora/rev/c1b5e94191ec
https://hg.mozilla.org/releases/mozilla-aurora/rev/d5ccf239c142
https://hg.mozilla.org/releases/mozilla-aurora/rev/829c02ed316a
https://hg.mozilla.org/releases/mozilla-aurora/rev/712c0d0da7dc
https://hg.mozilla.org/releases/mozilla-aurora/rev/2159ed149364
https://hg.mozilla.org/releases/mozilla-aurora/rev/8ee370bffd15
https://hg.mozilla.org/releases/mozilla-aurora/rev/b4ed5ea97d25
Flags: in-testsuite+
Comment hidden (Intermittent Failures Robot) |
Comment 107•8 years ago
|
||
Comment on attachment 8830339 [details] [diff] [review]
part 1 - make nsHtml5Atoms initialization explicit in nsLayoutStatics
See comment 103.
Attachment #8830339 -
Flags: approval-mozilla-beta?
Comment 108•8 years ago
|
||
Is this ok for TB in 52 after all?
Comment 109•8 years ago
|
||
Thanks for the consideration. We're still suffering from the fallout of this. We landed bug 1334558, bug 1334771, bug 1334874 to fix our bustage, but bug 1337865 is still open. This affects debug builds only. I guess we would survive is this was uplifted to M-C, perhaps next week since this week we're planning to build our TB 52 beta2 without this being uplifted.
Comment 110•8 years ago
|
||
Comment on attachment 8830339 [details] [diff] [review]
part 1 - make nsHtml5Atoms initialization explicit in nsLayoutStatics
let's get these into beta52 now.
Attachment #8830339 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Reporter | ||
Comment 111•8 years ago
|
||
seems this need rebasing for beta since hitting as example
grafting 387536:c922a29ad146 "Bug 1276669 - Part 3: Split out nsHTMLTags atom initialization. r=hsivonen, a=ritu"
merging layout/build/nsLayoutStatics.cpp
merging parser/htmlparser/nsHTMLTags.cpp
merging parser/htmlparser/nsHTMLTags.h
warning: conflicts while merging parser/htmlparser/nsHTMLTags.cpp! (edit, then use 'hg resolve --mark')
abort: unresolved conflicts, can't continue
(use 'hg resolve' and 'hg graft --continue')
Flags: needinfo?(nfroyd)
Assignee | ||
Comment 112•8 years ago
|
||
(In reply to Carsten Book [:Tomcat] from comment #111)
> seems this need rebasing for beta since hitting as example
>
> grafting 387536:c922a29ad146 "Bug 1276669 - Part 3: Split out nsHTMLTags
> atom initialization. r=hsivonen, a=ritu"
> merging layout/build/nsLayoutStatics.cpp
> merging parser/htmlparser/nsHTMLTags.cpp
> merging parser/htmlparser/nsHTMLTags.h
> warning: conflicts while merging parser/htmlparser/nsHTMLTags.cpp! (edit,
> then use 'hg resolve --mark')
> abort: unresolved conflicts, can't continue
> (use 'hg resolve' and 'hg graft --continue')
I think RyanVM was flexing his C++ knowledge and has rebased the patches onto Beta?
Flags: needinfo?(nfroyd) → needinfo?(ryanvm)
Comment 114•8 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/c2a2e606cf8d
https://hg.mozilla.org/releases/mozilla-beta/rev/1962102a1aaf
https://hg.mozilla.org/releases/mozilla-beta/rev/03ec073931a4
https://hg.mozilla.org/releases/mozilla-beta/rev/27e0795c40f5
https://hg.mozilla.org/releases/mozilla-beta/rev/72914fb1b733
https://hg.mozilla.org/releases/mozilla-beta/rev/275d01760b8c
https://hg.mozilla.org/releases/mozilla-beta/rev/12178b9ec166
https://hg.mozilla.org/releases/mozilla-beta/rev/61056266ab9e
https://hg.mozilla.org/releases/mozilla-beta/rev/7b7ca08168a0
https://hg.mozilla.org/releases/mozilla-beta/rev/345d7454ed89
https://hg.mozilla.org/releases/mozilla-beta/rev/49f969a728e9
Comment 115•8 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-esr52/rev/c2a2e606cf8d
https://hg.mozilla.org/releases/mozilla-esr52/rev/1962102a1aaf
https://hg.mozilla.org/releases/mozilla-esr52/rev/03ec073931a4
https://hg.mozilla.org/releases/mozilla-esr52/rev/27e0795c40f5
https://hg.mozilla.org/releases/mozilla-esr52/rev/72914fb1b733
https://hg.mozilla.org/releases/mozilla-esr52/rev/275d01760b8c
https://hg.mozilla.org/releases/mozilla-esr52/rev/12178b9ec166
https://hg.mozilla.org/releases/mozilla-esr52/rev/61056266ab9e
https://hg.mozilla.org/releases/mozilla-esr52/rev/7b7ca08168a0
https://hg.mozilla.org/releases/mozilla-esr52/rev/345d7454ed89
https://hg.mozilla.org/releases/mozilla-esr52/rev/49f969a728e9
status-firefox-esr52:
--- → fixed
Updated•8 years ago
|
Whiteboard: [stockwell fixed]
You need to log in
before you can comment on or make changes to this bug.
Description
•