Closed
Bug 920149
Opened 11 years ago
Closed 11 years ago
certain strings appear to be duplicated many times in tbpl
Categories
(Tree Management Graveyard :: TBPL, defect)
Tree Management Graveyard
TBPL
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: froydnj, Assigned: Swatinem)
References
Details
Attachments
(3 files, 2 obsolete files)
(deleted),
patch
|
emorley
:
review+
|
Details | Diff | Splinter Review |
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
emorley
:
review+
|
Details | Diff | Splinter Review |
about:memory current says for TBPL that these strings are "notable" (i.e. taking up an outsized amount of memory):
│ │ └───9.20 MB (00.81%) -- js-zone(0x7fd4481ab800)
│ │ ├──5.65 MB (00.50%) ── unused-gc-things
│ │ ├──3.20 MB (00.28%) -- strings
│ │ │ ├──2.41 MB (00.21%) ++ normal
│ │ │ ├──0.48 MB (00.04%) -- notable
│ │ │ │ ├──0.04 MB (00.00%) ── string(length=49, copies=1186, "https://tbpl.mozilla.org/php/getParsedLog.php?id=")/gc-heap
│ │ │ │ ├──0.04 MB (00.00%) ── string(length=50, copies=1186, "https://tbpl.mozilla.org/php/getLogExcerpt.php?id=")/gc-heap
│ │ │ │ ├──0.03 MB (00.00%) ── string(length=7, copies=1128, "success")/gc-heap
│ │ │ │ ├──0.02 MB (00.00%) ++ string(length=82, copies=123, "buildbot-master67.srv.releng.use1.mozilla.com:/builds/buildbot/tests1-linux/master")
I've included the "buildbot-..." string so you can see how many more copies of the URL and "success" there are. And this is for a close-to-freshly-loaded page; leaving tbpl open for several hours or even overnight, it wouldn't be unusual to have the URLs and "success" taking up half a megabyte or more.
This state of affairs seems unusual.
Reporter | ||
Comment 1•11 years ago
|
||
Nick, are there any tools that can help determine who's holding on to the copies of all those strings?
Flags: needinfo?(n.nethercote)
Comment 2•11 years ago
|
||
Get a JS heap dump (the easiest way to do that is to get a CC dump, and it will be produced at the same time: https://wiki.mozilla.org/Performance:Leak_Tools#Cycle_collector_heap_dump ). Then you can use my GC log analysis scripts at https://github.com/amccreight/heapgraph/tree/master/g stringy.py should give you a reasonable overview of what strings there are, then you can use --string-mode with find_roots.py to see what is holding it alive. It is a bit rough, but I've used it to successfully track down the cause of some string bloat in B2G.
Assignee | ||
Comment 3•11 years ago
|
||
Thats very interesting. Basically most of tbpl is just string concat + innerHTML.
Ideally nothing from script side should hold onto them, they should be embedded into the DOM as href, title and other kind of attributes or textnodes.
I’ve read a blog some time ago about string handling in spidermonkey, maybe someone can enlighten me further.
As these are just string fragments of stuff we concatenate, I guess they must be parts of rope strings. I just don’t know why they are duplicated. If they are indeed ropes, they should be shared somehow?
So what kind of best practices do you have to mitigate this problem? I mean its really a common problem, a *lot* of templating engines use string concat (or Array.join) internally.
Comment 4•11 years ago
|
||
I don't have anything notable to add, other than the amount taken up doesn't seem that high to me.
Reporter | ||
Comment 5•11 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #2)
> Get a JS heap dump (the easiest way to do that is to get a CC dump, and it
> will be produced at the same time:
> https://wiki.mozilla.org/Performance:Leak_Tools#Cycle_collector_heap_dump ).
> Then you can use my GC log analysis scripts at
> https://github.com/amccreight/heapgraph/tree/master/g stringy.py should
> give you a reasonable overview of what strings there are, then you can use
> --string-mode with find_roots.py to see what is holding it alive. It is a
> bit rough, but I've used it to successfully track down the cause of some
> string bloat in B2G.
What's the second argument to find_roots.py? The address of the string from the gc log?
Flags: needinfo?(n.nethercote) → needinfo?(continuation)
Comment 6•11 years ago
|
||
Looks like with --string-mode it is the string you want to look for.
Flags: needinfo?(continuation)
Reporter | ||
Comment 7•11 years ago
|
||
(In reply to Arpad Borsos (Swatinem) from comment #3)
> Thats very interesting. Basically most of tbpl is just string concat +
> innerHTML.
>
> Ideally nothing from script side should hold onto them, they should be
> embedded into the DOM as href, title and other kind of attributes or
> textnodes.
My guess is that the strings are actually generated hereabouts:
https://hg.mozilla.org/webtools/tbpl/file/ecbd85696df3/js/BuildbotDBUser.js#l61
If the JS engine is smart enough to concatenate constant strings in there, it ought to be smart enough to share the strings, too. The GC log never actually displays those strings, though, it just displays '<rope: length 49>' or '<rope: length 50>'. And the ropes are definitely shared. I'm wondering if about:memory is not somehow miscounting those strings. Nick?
Flags: needinfo?(n.nethercote)
Assignee | ||
Comment 8•11 years ago
|
||
Thanks for the info. So ye, we do save a lot of useless strings on objects.
Would it make sense to replace those by getters that output the string dynamically instead of predefining it?
Reporter | ||
Comment 9•11 years ago
|
||
(In reply to Arpad Borsos (Swatinem) from comment #8)
> Thanks for the info. So ye, we do save a lot of useless strings on objects.
> Would it make sense to replace those by getters that output the string
> dynamically instead of predefining it?
I think that's worth a shot, yes. AFAICS, you wouldn't normally need most of the strings anyway on a given tbpl page (you only want to look at a couple of jobs at a time), so I think you'd do less work by generating the strings on demand.
Not sure that I even have the right place; looks plausible, though.
Assignee | ||
Comment 11•11 years ago
|
||
The patch moves the predefined properties into dynamic getters (they are completely transparent)
Assignee | ||
Comment 12•11 years ago
|
||
So here is a verbose about:memory measurement.
STR:
* load the inbound tree
* click the down (load more) arrow
* minimize memory
* take measurement
most notable changes:
43,330,864 B (07.69%) -- top(https://tbpl.mozilla.org/?tree=Mozilla-Inbound, id=97)
vs
31,009,736 B (05.50%) -- top(file:///home/swatinem/Coding/tinderboxpushlog/index.html?tree=Mozilla-Inbound, id=95)
11,507,504 B (02.04%) -- strings
757,184 B (00.13%) -- notable
vs
4,079,856 B (00.72%) -- strings
222,560 B (00.04%) -- notable
Maybe someone with a lot heavier tbpl usage than me can prove these improvements?
Assignee | ||
Comment 13•11 years ago
|
||
Another question to the js engine pros:
The first measurement shows the whole UserInterface.js code as a notable string. Why? And why not the even bigger jquery.js?
Assignee | ||
Comment 14•11 years ago
|
||
MachineResult.runID was casted to string needlessly, also .tree is always the same.
Also defined .push and .order to keep with the best practice to "define all properties in the constructor and dont change shapes of objects on runtime".
I did the opposite to Machine.jobPartNumber: some memoization, since I saw that function a lot in profiles. There are not a lot of these structures around, so it should not matter for mem usage.
The "success" and "testfailed" string is actually saved literally in the DB and comes in via JSON.
Attachment #816330 -
Flags: review?(emorley)
Comment 15•11 years ago
|
||
With those two patches, viewing mozilla-central with 20 pushes (not exactly a heavy load, I typically have 6+ TBPL tabs open with 10-30 pushes shown in each)...
Before:
│ ├───49.77 MB (10.29%) -- top(https://tbpl.mozilla.org/, id=364)
│ │ ├──28.70 MB (05.93%) -- active/window(https://tbpl.mozilla.org/)
│ │ │ ├──11.68 MB (02.41%) -- dom
│ │ │ │ ├───9.74 MB (02.01%) ── element-nodes
│ │ │ │ └───1.94 MB (00.40%) ++ (4 tiny)
│ │ │ ├───9.47 MB (01.96%) -- layout
│ │ │ │ ├──5.23 MB (01.08%) ++ frames
│ │ │ │ └──4.24 MB (00.88%) ++ (6 tiny)
│ │ │ ├───7.48 MB (01.55%) -- js-compartment(https://tbpl.mozilla.org/)
│ │ │ │ ├──6.51 MB (01.35%) -- objects
│ │ │ │ │ ├──4.84 MB (01.00%) ++ gc-heap
│ │ │ │ │ └──1.67 MB (00.34%) ++ (2 tiny)
│ │ │ │ └──0.97 MB (00.20%) ++ (5 tiny)
│ │ │ └───0.07 MB (00.02%) ++ (2 tiny)
│ │ ├──20.52 MB (04.24%) -- js-zone(0x3045f000)
│ │ │ ├──14.04 MB (02.90%) -- strings
│ │ │ │ ├──12.93 MB (02.67%) -- normal
│ │ │ │ │ ├───9.87 MB (02.04%) ── malloc-heap
│ │ │ │ │ └───3.06 MB (00.63%) ── gc-heap
│ │ │ │ └───1.11 MB (00.23%) ++ (2 tiny)
│ │ │ ├───6.25 MB (01.29%) ── unused-gc-things
│ │ │ └───0.23 MB (00.05%) ++ (5 tiny)
│ │ └───0.55 MB (00.11%) ++ cached/window(about:home)
With patch:
│ ├───37.21 MB (07.69%) -- top(file:///C:/src/tbpl/index.html, id=354)
│ │ ├──26.95 MB (05.57%) -- active/window(file:///C:/src/tbpl/index.html)
│ │ │ ├──12.30 MB (02.54%) -- dom
│ │ │ │ ├──10.36 MB (02.14%) ── element-nodes
│ │ │ │ └───1.94 MB (00.40%) ++ (4 tiny)
│ │ │ ├───7.86 MB (01.63%) -- layout
│ │ │ │ ├──5.23 MB (01.08%) ++ frames
│ │ │ │ └──2.63 MB (00.54%) ++ (6 tiny)
│ │ │ ├───6.71 MB (01.39%) -- js-compartment(file:///C:/src/tbpl/index.html)
│ │ │ │ ├──5.79 MB (01.20%) ++ objects
│ │ │ │ └──0.92 MB (00.19%) ++ (5 tiny)
│ │ │ └───0.07 MB (00.02%) ++ (2 tiny)
│ │ ├───9.74 MB (02.01%) -- js-zone(0xbe29400)
│ │ │ ├──5.66 MB (01.17%) -- strings
│ │ │ │ ├──4.88 MB (01.01%) ++ normal
│ │ │ │ └──0.77 MB (00.16%) ++ (2 tiny)
│ │ │ └──4.09 MB (00.84%) ++ (6 tiny)
│ │ └───0.53 MB (00.11%) ++ cached/window(about:home)
-> 24% reduction... not to be sniffed at!
Awesome :-D
Updated•11 years ago
|
Attachment #816218 -
Flags: review?(emorley) → review+
Comment 16•11 years ago
|
||
Comment on attachment 816330 [details] [diff] [review]
slimfast.patch
Review of attachment 816330 [details] [diff] [review]:
-----------------------------------------------------------------
This is great - thank you! :-)
Attachment #816330 -
Flags: review?(emorley) → review+
Assignee | ||
Comment 17•11 years ago
|
||
(In reply to Arpad Borsos (Swatinem) from comment #14)
> MachineResult.runID was casted to string needlessly, also .tree is always
> the same.
Hm, id has type int in the database, but comes out as string in the json. So I cast it back to a number.
Also I got rid of those notable "success" strings by a clever hack. Maybe it is getting a bit ridiculous by now :-D
Attachment #816330 -
Attachment is obsolete: true
Attachment #816335 -
Flags: review?(emorley)
Assignee | ||
Comment 18•11 years ago
|
||
Just as a side note: The comparison is a little bit in favor of the in-production version, since it does not have the split to b2g desktop yet, with all the additional dom nodes and associated overhead. But anyway, there is still a clear win :-)
Comment 19•11 years ago
|
||
Comment on attachment 816335 [details] [diff] [review]
Other cleanups
Review of attachment 816335 [details] [diff] [review]:
-----------------------------------------------------------------
r- due to TypeError breaking job selection
::: js/BuildbotDBUser.js
@@ +45,5 @@
> + // a constant string. Saves us ~200k for ~5k results. \o/
> + _staticStrings: {
> + "success": "success",
> + "testfailed": "testfailed",
> + },
Please can you also add the other result states from:
https://hg.mozilla.org/webtools/tbpl/file/4b9bf20dad6a/dataimport/import-buildbot-data.py#l83
@@ +61,3 @@
> "machine": machine,
> "slave": run.slave,
> + "runID": parseInt(run._id, 10),
Unfortunately this causes a TypeError, since the runID can helpfully not actually be a integer - in the case of pending and running jobs, which are prefixed with "pending-" and "running-" appropriately, so we trip up against:
https://hg.mozilla.org/webtools/tbpl/file/4b9bf20dad6a/js/UserInterface.js#l1435
...when selecting a job in the main UI.
Attachment #816335 -
Flags: review?(emorley) → review-
Assignee | ||
Comment 20•11 years ago
|
||
(In reply to Ed Morley [:edmorley UTC+1] from comment #19)
> Please can you also add the other result states from:
> https://hg.mozilla.org/webtools/tbpl/file/4b9bf20dad6a/dataimport/import-
> buildbot-data.py#l83
Done
> > "machine": machine,
> > "slave": run.slave,
> > + "runID": parseInt(run._id, 10),
>
> Unfortunately this causes a TypeError, since the runID can helpfully not
> actually be a integer - in the case of pending and running jobs, which are
> prefixed with "pending-" and "running-" appropriately, so we trip up against:
> https://hg.mozilla.org/webtools/tbpl/file/4b9bf20dad6a/js/UserInterface.
> js#l1435
> ...when selecting a job in the main UI.
Oh, so the string cast was actually useful. I added it back as a .toString() so we don’t break when php suddenly decides to json_encode it as a real int.
Attachment #816335 -
Attachment is obsolete: true
Attachment #816345 -
Flags: review?(emorley)
Comment 21•11 years ago
|
||
Comment on attachment 816345 [details] [diff] [review]
Other cleanups
Review of attachment 816345 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good, thank you :-)
Attachment #816345 -
Flags: review?(emorley) → review+
Comment 22•11 years ago
|
||
In production :-)
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated•10 years ago
|
Product: Webtools → Tree Management
Updated•10 years ago
|
Product: Tree Management → Tree Management Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•