Store ArrayBuffer metadata in its own arena
Categories
(Core :: JavaScript Engine, enhancement, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox71 | --- | fixed |
People
(Reporter: mhowell, Assigned: tjr)
References
(Blocks 2 open bugs)
Details
Attachments
(1 file, 3 obsolete files)
(deleted),
text/x-phabricator-request
|
Details |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 3•6 years ago
|
||
Comment 4•6 years ago
|
||
Assignee | ||
Comment 5•6 years ago
|
||
Reporter | ||
Comment 6•6 years ago
|
||
Updated•6 years ago
|
Updated•6 years ago
|
Comment 7•6 years ago
|
||
Updated•6 years ago
|
Assignee | ||
Comment 8•5 years ago
|
||
I worked on this a bit. I updated the patch (only one needed now): patch
I ran Talos and the results weren't great: Try Run / Talos
Some 2-3% regressions. And yet, you'll find this was actually the best try run I had...
Then I tried to only apply the separation to non-System Principal: patch
Talos for that was significantly worse: Talos
Then I noticed that previously the data was able to be background-finalized but we had lost that capability, so I added it back: patch
That (plus the system principal patch) also did poorly on Talos
And then I ran it with background finalization but without the system principal exception: Try Talos
I was able to get a really nice win on tabswitch on windows (20%) but a bunch of regressions elsewhere - again more than the very first try run.
Next I'm going to try with fewer AllocKinds as suggested.
Assignee | ||
Comment 9•5 years ago
|
||
2 AllocKinds Talos got a lot of ~3% regressions across the board; but also a lot of 3% improvements on a11yr specifically.
There's also the now-typical large medium-confident regressions on perf_reftest but those have such huge std dev I'm not sure what to make of them...
Assignee | ||
Comment 10•5 years ago
|
||
Hey Steve, Jon - do you have any suggestions for things to try based on these Talos runs?
Comment 11•5 years ago
|
||
(In reply to Tom Ritter [:tjr] from comment #10)
Hey Steve, Jon - do you have any suggestions for things to try based on these Talos runs?
First of all, I think you should get some better baseline numbers. I've come to strongly distrust comparisons against recent mozilla-central (especially when the patch under test is based off an old mozilla-central; I don't think it takes your base into account, it just looks at the last 2 days.) So before making any decisions, can you make a dummy change to e596664275d5e3e2fdcb7fa8d1447289f99269c3 and push that to try with --rebuild 7
(or --rebuild-talos 7
, depending on which command you use). It looks like all of your pushes are based on that rev. Then you can change the comparisons to be based off of that. (And you'll have to create a dummy change. I usually append something to README.txt. It's annoying, but the decision job will just refuse to run anything otherwise.)
Partly why I'm skeptical is that I looked at the awsy memory results on that first talos link, and they didn't show much change, but did have /improvement/ of 2.3%, which I can't explain. It's also based on only 2 runs with your patch though, so in addition to adding in a proper baseline, you should probably do some more retriggers.
That said, I doubt we have enough ArrayBuffers to make up any significant percentage of total browser memory usage, so I don't expect awsy to show all that much. (And yet, if that's the case, then maybe there's no problem taking this...)
Anyway, it seems from the results you have so far, there's a perf blocker. I suspect with better measurement that particular issue will just disappear, and we can focus on the more relevant memory usage stuff.
Assignee | ||
Comment 12•5 years ago
|
||
(In reply to Steve Fink [:sfink] [:s:] from comment #11)
First of all, I think you should get some better baseline numbers.
Fair enough. Here's a try run of my base commit.
Base compared to initial patch - Very favorable! Only seemed to affect Windows. On x64: 6% improvement on tabpaint, 9% on tsvg_static. 6% regression on tsvgr_opacity, 10% on about_preferences_basic. 4-6.5% on time_to_session_store_window_restored_ms for Windows x86. 2.4% regression on Linux64 AWSY.
Base Compared to excluding system principal - Huge 32% regression on glvideo Mean tick time across 100 ticks on OSX. 8% regression on perf_reftest for Win10 x64, 2% and 7% regressions on Windows x86 tests, 6% improvement on OSX ts_paint_webext. No notable AWSY regressions.
Base compared to Background Finalization - No important Talos regressions or improvements. 2% Linux x64 AWSY regression.
Base compared to Background Finalization + System Principal Exclusion 6% improvement on tsvg_static for Win x64, 1 small regression for Linux and OSX each, 2 small for x86 Windows. No notable AWSY regressions.
Base compared to Two AllocKinds - 12% regression on time_to_session_store_window_restored_ms for Windows x86. AWSY: A 5% Winx86 regression and a 10% x86 improvement. A 2% OSX improvement.
This... doesn't really seem very bad.
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 13•5 years ago
|
||
The reason for doing this is to get ArrayBufferObjects allocated into their own arenas.
The specific enum values were chosen to avoid breaking assumptions about where
certain values fall in the list, such as OBJECT_FIRST == FUNCTION.
Assignee | ||
Comment 14•5 years ago
|
||
This improves performance.
Depends on D40227
Comment 15•5 years ago
|
||
(In reply to Tom Ritter [:tjr] from comment #14)
It looks like the first patch has the array buffer alloc kinds marked as background finalizable, so what does the second patch actually do? Array buffers are background finalized so I don't think we need foreground finalized versions of these arena kinds. Was there some bug working out the arena kind that meant we picked the wrong kind?
Assignee | ||
Comment 16•5 years ago
|
||
(In reply to Jon Coppeard (:jonco) from comment #15)
(In reply to Tom Ritter [:tjr] from comment #14)
It looks like the first patch has the array buffer alloc kinds marked as background finalizable, so what does the second patch actually do?
I hate to disagree with you on your own code; but that isn't how I read it. The boolean 'BGFinal' seems to say "Can this be Background Finalizable?' but I think it means "Is this already finalized?" (So a 'true' means you can't background finalize, and a 'false' means you can.)
If we follow ArrayBufferObject::create*
into NewObjectWithClassProto
and then into NewObjectWithClassProtoCommon it checks if (CanBeFinalizedInBackground(allocKind, clasp))
. CanBeFinalizedInBackground in turn says return !IsBackgroundFinalized(kind)
which just returns the bool from the initial table. So 'false' means "I can be background finalized". (I think.)
In the first patch I mark then as true and in the second patch I add the new _BG types and mark the original as false.
Array buffers are background finalized so I don't think we need foreground finalized versions of these arena kinds.
The reason I also have the foreground types is because of how the Background types are determined. In NewObjectWithClassProtoCommon
it calls GetBackgroundAllocKind which just adds 1 to the allockind. So to keep that codepath the same; I create foreground types that won't be used and background types that would.
However typing this all out; it seems like I could just return the BG types in my GetArrayBufferGCObjectKind
function added in the first patch and that ought to work too; eliminating the foreground types.
I'll do that if this all looks correct to you.
Comment 17•5 years ago
|
||
(In reply to Tom Ritter [:tjr] from comment #16)
The boolean 'BGFinal' seems to say "Can this be Background Finalizable?' but I think it means "Is this already finalized?" (So a 'true' means you can't background finalize, and a 'false' means you can.)
No, true means this kind is finalized in the background, false means it's finalized in the foreground (on the main thread).
If we follow
ArrayBufferObject::create*
intoNewObjectWithClassProto
and then into NewObjectWithClassProtoCommon it checksif (CanBeFinalizedInBackground(allocKind, clasp))
. CanBeFinalizedInBackground in turn saysreturn !IsBackgroundFinalized(kind)
which just returns the bool from the initial table. So 'false' means "I can be background finalized".
Oh yeah so reading it again this code is is really confusing.
What happens is that the caller passes in the foreground alloc kind and if possible we change it to the background alloc kind by adding one to it. This happens by | if (CanBeFinalizedInBackground()) { allocKind = GetBackgroundAllocKind() } | or similar logic. That's why CanBeFinalizedInBackground only returns true if the alloc kind it was passed is not already background alloc kind.
However typing this all out; it seems like I could just return the BG types in my
GetArrayBufferGCObjectKind
function added in the first patch and that ought to work too; eliminating the foreground types.
Yes I think that would be better if it works.
I'll file for cleaning up the background finalizable confusion.
Assignee | ||
Comment 18•5 years ago
|
||
(In reply to Jon Coppeard (:jonco) from comment #17)
However typing this all out; it seems like I could just return the BG types in my
GetArrayBufferGCObjectKind
function added in the first patch and that ought to work too; eliminating the foreground types.Yes I think that would be better if it works.
Ah okay; you're right then; my initial patch is all that's needed. It also means that my summary of Talos performance was slightly off in its descriptions.
(This is still accurate) Base compared to initial patch - Very favorable! Only seemed to affect Windows. On x64: 6% improvement on tabpaint, 9% on tsvg_static. 6% regression on tsvgr_opacity, 10% on about_preferences_basic. 4-6.5% on time_to_session_store_window_restored_ms for Windows x86. 2.4% regression on Linux64 AWSY.
(This is still accurate) Base Compared to excluding system principal - Huge 32% regression on glvideo Mean tick time across 100 ticks on OSX. 8% regression on perf_reftest for Win10 x64, 2% and 7% regressions on Windows x86 tests, 6% improvement on OSX ts_paint_webext. No notable AWSY regressions.
(This code is behaving the same way as 'Base compared to initial patch') Base compared to Background Finalization - No important Talos regressions or improvements. 2% Linux x64 AWSY regression.
(This is behaving the same way as 'Base Compared to excluding system principal') Base compared to Background Finalization + System Principal Exclusion 6% improvement on tsvg_static for Win x64, 1 small regression for Linux and OSX each, 2 small for x86 Windows. No notable AWSY regressions.
(This is still accurate) Base compared to Two AllocKinds - 12% regression on time_to_session_store_window_restored_ms for Windows x86. AWSY: A 5% Winx86 regression and a 10% x86 improvement. A 2% OSX improvement.
Updated•5 years ago
|
Comment 19•5 years ago
|
||
Sorry, I've been ignoring this.
Code-wise, the change looks good. And I don't anticipate any significant performance differences, and for the most part it seems like your tests confirm that.
So it really boils down to the question of memory usage. A 2% regression in exchange for security... it doesn't seem out of the question, but it doesn't seem like something we should obviously take, either.
I'm realizing I'm just not the right person to be deciding. I'd like to punt that decision over to people closer to memshrink & fission. The wheel of fate points at erahm.
That said, I am most interested in that final experiment, of the two AllocKind solution. Those numbers sound off, which makes me suspect there's some excess variance in there. You did a decent number of pushes, so I'm not sure what to say. I guess I can pawn that off on erahm as well, though you could also retrigger another dozen times or so to be sure.
Comment 20•5 years ago
|
||
These regressions are possibly in the noise, a minor regression on Linux is probably okay as we have some other larger wins in the pipeline (bug 1470591 for example). If we see larger regressions or regressions on Windows in particular we might want to consider backing out, but overall I'm fine with giving it a shot. Lets shoot for trying this out early in 71 so that we can get some more stable perf data and some time to bake on Nightly.
Updated•5 years ago
|
Comment 21•5 years ago
|
||
Comment 22•5 years ago
|
||
bugherder |
Assignee | ||
Comment 23•5 years ago
|
||
We were kind of expecting some sort of regression (memory, performance potentially) to show up from this. Talos was indicative but inconclusive - did we see anything?
Comment 24•5 years ago
|
||
(In reply to Tom Ritter [:tjr] from comment #23)
We were kind of expecting some sort of regression (memory, performance potentially) to show up from this. Talos was indicative but inconclusive - did we see anything?
Marian, have you seen any AWSY changes related to this bug?
Comment 25•5 years ago
|
||
Ionut, i checked the alerts from awsy but i see nothing related to this bug.
Description
•