Closed Bug 1521701 Opened 6 years ago Closed 6 years ago

6.02% Heap Unclassified (windows7-32) regression on push 965622da5962b6cfd9e9e8b5332896e1634070e1 (Sun Jan 20 2019)

Categories

(Core :: Graphics, defect, P3)

RESOLVED WORKSFORME

People

(Reporter: Bebe, Unassigned)

Details

(Keywords: perf, regression)

We have detected an awsy regression from push:

https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=965622da5962b6cfd9e9e8b5332896e1634070e1

As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

6% Heap Unclassified windows7-32 opt stylo 32,977,661.34 -> 34,961,459.47

Improvements:

19% Images windows7-32 opt stylo 5,115,635.56 -> 4,147,833.83
19% Images osx-10-10 opt stylo 5,526,468.29 -> 4,490,395.57
18% Images windows7-32 pgo stylo 5,046,256.87 -> 4,134,271.14
18% Images windows10-64 pgo stylo 6,154,602.30 -> 5,077,514.01
17% Images linux64-stylo-sequential opt stylo-sequential 5,090,349.43 -> 4,207,002.83
17% Images linux64 opt stylo 5,037,764.76 -> 4,166,564.87

You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=18830

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/AWSY/Tests

Product: Testing → Core
Version: Version 3 → unspecified

:bas.schouten, this regression is caused by one of two bugs:
Bug 1521027 or Bug 1521008. Can you take a look and remove the unrelated one?

Flags: needinfo?(bas)

Hrm, both of these could have a small effect, they are very closely related. It seems most of these are improvements, I can theorize a little about why those happened. I could also see how memory could shift compared to before. I'm not entirely certain how Heap Unclassified is measured. What seems for all intents and purposes impossible is for these patches to actually -increase- total memory usage.

What they would cause us to do, though, is keep some surfaces alive slightly longer while they are shipped to the paint thread, where they are changed into 'proper' surfaces a lot of the time. I suspect the surfaces we're keeping alive longer are counted as 'heap unclassified' (that's probably something that should be fixed, but it's unrelated to my patches), and the ones we're creating are counted as 'Images'. If we now do a measurement in the timeslot between the image decoding and the paint thread, the numbers will have shifted (basically taken from 'Images' and dropped into 'unclassified'), even though the total amount of memory used is no different. A temporary increase in memory would be theoretically possible if the 'original' storage were somehow less efficient than the 'new' storage, but this would be a very short-term increase (on the order of milliseconds).

Flags: needinfo?(bas)

(In reply to Florin Strugariu [:Bebe] from comment #0)

Regressions:

6% Heap Unclassified windows7-32 opt stylo 32,977,661.34 -> 34,961,459.47

Improvements:

19% Images windows7-32 opt stylo 5,115,635.56 -> 4,147,833.83

Are we trading off ~1 MB of improvement for ~2 MB of a regression, though? (I don't know if I'm interpreting that correctly)
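
To check that reading, the deltas can be computed directly from the two windows7-32 opt lines quoted above. This is a minimal sketch, assuming the AWSY values are in bytes (which the ~1 MB / ~2 MB reading implies); the function and variable names are illustrative and not part of any AWSY tooling.

```python
def delta(before, after):
    """Return (absolute change, percent change) for one measurement."""
    change = after - before
    return change, 100.0 * change / before

# Values copied from the alert for windows7-32 opt stylo (assumed to be bytes).
heap_unclassified = delta(32_977_661.34, 34_961_459.47)
images = delta(5_115_635.56, 4_147_833.83)

MIB = 1024 * 1024
for name, (change, pct) in [("Heap Unclassified", heap_unclassified),
                            ("Images", images)]:
    print(f"{name}: {change / MIB:+.2f} MiB ({pct:+.2f}%)")

# Combined change across just these two categories.
net = heap_unclassified[0] + images[0]
print(f"Net of the two categories: {net / MIB:+.2f} MiB")
```

On these two lines alone, the Heap Unclassified increase (about 1.98 million bytes) is roughly twice the Images decrease (about 0.97 million bytes), so the two categories net to an increase of about 1 million bytes.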

This isn't a Core:General thing so I'm going to roll the dice and move it to Graphics (it almost came up ImageLib).

Component: General → Graphics
Priority: -- → P3

A cursory look indicates this is an across-the-board regression. If we look at the subtests for heap-unclassified, tabs open has regressed by a large amount (20-25%). Explicit appears to have regressed as well (though with lower confidence).

I retriggered the inbound push to get more complete numbers.

(In reply to Bas Schouten (:bas.schouten) from comment #2)

Hrm, both of these could have a small effect, they are very closely related. It seems most of these are improvements, I can theorize a little about why those happened. I could also see how memory could shift compared to before. I'm not entirely certain how Heap Unclassified is measured. What seems for all intents and purposes impossible is for these patches to actually -increase- total memory usage.

roughly heap-unclassified = heap-allocated - explicit

After the retriggers, it looks like the resident measurement may have improved by ~20MB for tabs open (it's pretty noisy and low confidence, but seems to be down across the board). Explicit appears to have increased by ~6MB, heap-unclassified by ~22MB. Images decreased by ~17MB.

Overall it feels like we moved a bunch of measured overhead (images) to unmeasured overhead (heap-unclassified) and possibly caused more heap allocation overall (explicit). Resident going down ~20MB is interesting and makes me less concerned overall. It would be helpful if you could dig into diffing some memory reports to get a finer grained idea of what happened.
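
The rough tabs-open figures above can be tallied to see how far the category shift accounts for the explicit increase. This is a minimal sketch using only the approximate MB changes quoted in this comment; the dictionary layout and names are illustrative, and comparing the two-category sum against explicit assumes both categories count toward it, which this thread does not confirm.

```python
# Approximate tabs-open changes in MB, as quoted in the comment above.
changes_mb = {
    "resident": -20,
    "explicit": +6,
    "heap-unclassified": +22,
    "images": -17,
}

# Net effect of moving memory out of 'images' and into 'heap-unclassified'.
shifted = changes_mb["images"] + changes_mb["heap-unclassified"]

# How much of the explicit change the shift leaves unexplained
# (assumption: both categories count toward explicit).
unexplained = changes_mb["explicit"] - shifted

print(f"images + heap-unclassified: {shifted:+d} MB")
print(f"explicit change not covered by that shift: {unexplained:+d} MB")
```

With these rounded figures the two categories net to +5 MB, close to the ~6 MB explicit increase; given the noise noted above, that is only suggestive.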

Whiteboard: [MemShrink]

We should definitely make sure this memory is reported, but as far as the regression goes, we think this is okay.

Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WORKSFORME