Closed
Bug 912255
Opened 11 years ago
Closed 11 years ago
Ghost window on http://n-e-r-v-o-u-s.com/cellCycle/
Categories
(Core :: Graphics: CanvasWebGL, defect)
Tracking
()
RESOLVED
FIXED
mozilla26
Tracking | Status | |
---|---|---|
firefox24 | --- | unaffected |
firefox25 | + | verified |
firefox26 | + | verified |
People
(Reporter: guijoselito, Assigned: bjacob)
References
Details
(Keywords: memory-leak, regression, Whiteboard: [MemShrink:P2] webgl-test-needed)
Attachments
(1 file)
(deleted),
patch
|
smaug
:
review+
akeybl
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0 (Beta/Release)
Build ID: 20130903030201
Steps to reproduce:
Open http://n-e-r-v-o-u-s.com/cellCycle/ and then close it.
Actual results:
Opened about:memory and there was a
top(none)/ghost/window(http://n-e-r-v-o-u-s.com/cellCycle/)
Then opened about:compartments and there it was under "Ghost windows"
Expected results:
No Ghost Windows.
Updated•11 years ago
|
Whiteboard: [MemShrink]
Comment 1•11 years ago
|
||
I can reproduce this.
Updated•11 years ago
|
Whiteboard: [MemShrink] → [MemShrink:P2]
This leaks through shutdown.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: mlk
Interesting, if I minimize memory a couple times on ESR 17 it goes away eventually. It seems to stick around forever on nightly.
Alice, do you have time to bisect this?
Comment 4•11 years ago
|
||
Regression window(m-c)
Good:
http://hg.mozilla.org/mozilla-central/rev/8e3a124c9c1a
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20130627 Firefox/25.0 ID:20130627193322
Bad:
http://hg.mozilla.org/mozilla-central/rev/ec8ceaf4b03a
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20130628 Firefox/25.0 ID:20130628063224
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=8e3a124c9c1a&tochange=ec8ceaf4b03a
Regression window(m-c)
Good:
http://hg.mozilla.org/integration/mozilla-inbound/rev/c9bbec788caf
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20130627 Firefox/25.0 ID:20130627132145
Bad:
http://hg.mozilla.org/integration/mozilla-inbound/rev/88b65229c78c
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20130627 Firefox/25.0 ID:20130627140921
Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=c9bbec788caf&tochange=88b65229c78c
Regressed by: bug 738869
Comment 5•11 years ago
|
||
Indeed, that bug changed cycle collection of WebGLContext.
mDefaultVertexArray for instance isn't traversed/unlinked.
Updated•11 years ago
|
Component: General → Canvas: WebGL
Thanks Alice, you rock.
Keywords: regression
Comment 7•11 years ago
|
||
I should have a patch... once my build is ready.
Assignee: nobody → bugs
Assignee | ||
Comment 8•11 years ago
|
||
Indeed, this patch added mBoundVertexArray and mDefaultVertexArray but only traversed/unlinked the former.
Attachment #799470 -
Flags: review?(bugs)
Comment 9•11 years ago
|
||
Comment on attachment 799470 [details] [diff] [review]
traverse/unlink mDefaultVertexArray
Assuming you actually tested this helps :)
Attachment #799470 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 10•11 years ago
|
||
Sorry, didn't realize that you already had a patch.
Tested locally, the patch does fix the leak for me.
Assignee | ||
Comment 11•11 years ago
|
||
Assignee: bugs → bjacob
Target Milestone: --- → mozilla26
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
We should add a mochitest for this.
Flags: needinfo?(bjacob)
Flags: in-testsuite?
Assignee | ||
Comment 14•11 years ago
|
||
Yes, but in fact, I don't understand why this wasn't caught already. This kind of leak is automatically detected when it happens on a mochitest, right?
Looking at the code, the cycle between mDefaultVertexArray and its WebGLContext is always, unconditionally created right on the creation of any WebGL context. Furthermore, even if for whatever reason the leak only occured when a non-default vertex array was bound, that too is tested by an existing, already-running mochitest, namely this WebGL conformance test page (run by the WebGL mochitest): conformance/extensions/oes-vertex-array-object.html
So I'm a bit clueless as to why the leak wasn't automatically detected; understanding that blocks doing anything about that; but I don't have time to do anything about it short-term myself (b2g 1.2 rush...).
Flags: needinfo?(bjacob)
(In reply to Benoit Jacob [:bjacob] from comment #14)
> Yes, but in fact, I don't understand why this wasn't caught already. This
> kind of leak is automatically detected when it happens on a mochitest, right?
Yes. Since we leak through shutdown here, triggering it is enough to turn the tree orange.
> Looking at the code, the cycle between mDefaultVertexArray and its
> WebGLContext is always, unconditionally created right on the creation of any
> WebGL context.
Why do you believe that? Where is the edge from the vertex array to the context? I don't see it anywhere. WebGLBoundObject holds a weak reference to the context afaict.
Flags: needinfo?(bjacob)
Assignee | ||
Comment 16•11 years ago
|
||
Oh, you are completely right. Been a long time since I actually did much WebGL coding :-) we do intentionally not have a strong reference from the WebGLContextBoundObject back to the WebGLContext, to avoid creating unnecessary cycles. So in fact, a mochitest would be as simple as creating a vertex array object, and in javascript manually creating a reference from it to the webgl context.
Flags: needinfo?(bjacob)
Assignee | ||
Comment 17•11 years ago
|
||
This would actually be perfectly reasonable material to add to the existing, upstream WebGL conformance test,
https://www.khronos.org/registry/webgl/sdk/tests/conformance/extensions/oes-vertex-array-object.html
so we should just add it there (and then, we'd pick it up the next time that we update our copy of the conformance tests into our mochitest, which should happen more frequently than transits of Halley's comet).
Adding webgl-test-needed, which is how we remember to make needed contribution to the conformance tests...
Whiteboard: [MemShrink:P2] → [MemShrink:P2] webgl-test-needed
Comment 18•11 years ago
|
||
> Indeed, this patch added mBoundVertexArray and mDefaultVertexArray but only
> traversed/unlinked the former.
Is there any way to automate the detection of these kinds of problems?
Assignee | ||
Comment 19•11 years ago
|
||
(In reply to Benoit Jacob [:bjacob] from comment #16)
> So in fact, a mochitest would be as simple as creating a
> vertex array object, and in javascript manually creating a reference from it
> to the webgl context.
On second thought this is probably more subtle than that? The cycle that we forgot to break was between the WebGL context and its default vertex array, which AFAICT is not exposed to JS... so I don't understand how the present testcase managed to create that cycle. Need to minimize it?
Assignee | ||
Comment 20•11 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #18)
> > Indeed, this patch added mBoundVertexArray and mDefaultVertexArray but only
> > traversed/unlinked the former.
>
> Is there any way to automate the detection of these kinds of problems?
The 'refgraph' tool that I was working on in my spare time ( https://github.com/bjacob/mozilla-central/wiki/Refgraph ) does exactly that, but with all the b2g work lately, I haven't had time.
I've been thinking about how to move refgraph to "bjacob's side project on github" to something that could land in mozilla-central. I think there's a reasonable path to that that involves abandoning refgraph's custom malloc-replace library and instead porting it to use DMD's existing one, which is already in mozilla-central (maybe extending it a little bit along the way).
refgraph would still remain a non-default build option, like DMD is, but it could be useful to a lot more people in this way.
I'd like to convince people who have the ability to spend mozilla time on memory issues that this is a reasonable project :-) Justin and Andrew had expressed interest a while ago but Justin left, most unfortunately.
Comment 21•11 years ago
|
||
BTW, this was a great bug all round. Guilherme provided a nice, reproducible test case, which took advantage of some good built-in tools. I confirmed it reproduced, Kyle identified it was a regression, Alice bisected and found the regression, and Olli and Benoit fixed it. All in just over 24 hours! :)
Comment 22•11 years ago
|
||
Well, the refgraph tool is great for figuring out why a cycle is occurring, but it doesn't help with the issue here, which is that we are lacking a test case to create a cycle. Practically speaking, for these sorts of issues in DOM code we mostly depend on Jesse fuzzing up test cases that leak on shutdown, and because they are often weird looking leaks, they don't always get fixed quickly.
I spent some time on a static analysis to detect things missing from Traverse and Unlink, but it got bogged down in the ugly reality of Gecko code, and didn't really end up as a usable thing. Plus the underlying static analysis framework is now defunct. But the basic idea is that you take a list of fields in the object, filter out things that the CC should be told about, and then compare it to the list of things you do tell the CC about.
Assignee | ||
Comment 23•11 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #22)
> Well, the refgraph tool is great for figuring out why a cycle is occurring,
> but it doesn't help with the issue here, which is that we are lacking a test
> case to create a cycle.
If fact, Refgraph could automatically detect the underlying problem --- that a non-cycle-collected C++ strong reference exists between objects that are exposed to JS --- without even the need for actually creating the cycle.
Refgraph knows the full graph of strong references, and know what part of it is known to the CC. Refgraph can also know which vertices (i.e., objects) are exposed to JS. So it could automatically find out when there is a path of strong references from one JS-exposed object to another, that is not known to the CC. That was my next task when working on refgraph, but I had to stop working on it for lack of time. But if anyone is interested, it could be implemented very quickly at this point.
Comment 24•11 years ago
|
||
Oh, I see. That's pretty cool. Yeah, that would certainly catch the most common problems with undeclared stuff.
Assignee | ||
Comment 25•11 years ago
|
||
(Of course you have to create the one-way strong reference between C++ objects. But we already do that in the above-mentioned mochitest).
Updated•11 years ago
|
status-firefox24:
--- → unaffected
status-firefox25:
--- → affected
status-firefox26:
--- → affected
Assignee | ||
Comment 26•11 years ago
|
||
Comment on attachment 799470 [details] [diff] [review]
traverse/unlink mDefaultVertexArray
[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 738869
User impact if declined: large memory leak on certain WebGL-using pages
Testing completed (on m-c, etc.): m-c for a few days now
Risk to taking this patch (and alternatives if risky): no risk at all
String or IDL/UUID changes made by this patch: none
Attachment #799470 -
Flags: approval-mozilla-aurora?
Updated•11 years ago
|
Attachment #799470 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee | ||
Comment 27•11 years ago
|
||
Comment 28•11 years ago
|
||
Verified as fixed with Firefox 25 beta 7 (build ID: 20131010180222) and latest Aurora (build ID: 20131011004001).
Updated•11 years ago
|
QA Contact: manuela.muntean
You need to log in
before you can comment on or make changes to this bug.
Description
•