Closed Bug 1683301 Opened 4 years ago Closed 4 years ago

2.14 - 2.15% JS (linux1804-64-shippable-qr) regression on push 01bd308ee954d80ca25db2ca2cb8ad4e66f6f7a2 (Tue December 15 2020)

Categories

(Core :: JavaScript: GC, defect)

defect

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox-esr78 --- unaffected
firefox84 --- unaffected
firefox85 --- unaffected
firefox86 --- wontfix
firefox87 --- wontfix

People

(Reporter: aesanu, Unassigned)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Perfherder has detected an awsy performance regression from push 01bd308ee954d80ca25db2ca2cb8ad4e66f6f7a2. As the author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Suite Test Platform Options Absolute values (old vs new)
2% JS linux1804-64-shippable-qr 78,575,094.34 -> 80,266,439.43
2% JS linux1804-64-shippable-qr 78,599,591.62 -> 80,278,358.41

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(bob)

jmaher: The most relevant part of the patch is in process_perf_data.py

--- a/testing/awsy/awsy/process_perf_data.py
+++ b/testing/awsy/awsy/process_perf_data.py
@@ -155,17 +155,17 @@ def create_suite(
             # for simplicity (it's nice to be able to compare RSS of non-e10s
             # with RSS + USS of e10s).
             totals_rss = parse_about_memory.calculate_memory_report_values(
                 memory_report_path, node, ["Main"]
             )
             totals_uss = parse_about_memory.calculate_memory_report_values(
                 memory_report_path, "resident-unique"
             )
-            value = totals_rss.values()[0] + sum(
+            value = list(totals_rss.values())[0] + sum(
                 [v for k, v in totals_uss.iteritems() if "Main" not in k]
             )
 
         subtest = {
             "name": checkpoint["name"],
             "value": value,
             "lowerIsBetter": True,
             "unit": "bytes",

Perhaps we are getting new keys which do not include "Main". I'll run some tests locally but if you have any thoughts on this I would appreciate your input.
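The one-line fix in the diff above addresses a Python 2→3 behavior change: in Python 3, dict.values() returns a view object that cannot be indexed, so it must be wrapped in list() first. (Note that the adjacent .iteritems() call is Python 2 only; the Python 3 equivalent is .items().) A minimal sketch with hypothetical process names and byte counts, not taken from an actual memory report:

```python
# Hypothetical totals keyed by process name, illustrating the RSS + USS
# summation in create_suite(). All numbers are made up for the example.
totals_rss = {"Main": 78_575_094}
totals_uss = {
    "Main": 70_000_000,
    "WebExtensions": 1_200_000,
    "Privileged Content": 900_000,
}

# Python 2 allowed totals_rss.values()[0]; in Python 3, dict.values()
# is a non-indexable view, hence the list(...) wrapper in the patch.
rss_main = list(totals_rss.values())[0]

# Sum USS for every process except the main one. The patched code uses
# .iteritems(), which only exists in Python 2; .items() works in both.
uss_children = sum(v for k, v in totals_uss.items() if "Main" not in k)

value = rss_main + uss_children
print(value)  # 78575094 + 1200000 + 900000 = 80675094
```

If a content-process key unexpectedly contained "Main", its USS would be silently excluded from the sum, which is the kind of key change being speculated about here.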

Flags: needinfo?(jmaher)

Looking at the graph more closely, it seems to me that the regression occurred early this morning on Dec 18, while the patch landed on Dec 15. I don't think my awsy patch has anything to do with this. I checked the keys used before/after and see no differences.

Flags: needinfo?(bob)

after retriggering, this looks like a possible regression (retriggers show some lower values), but the push afterwards seems to be more of a regression:
https://hg.mozilla.org/integration/autoland/rev/03e46a1ed9c18dfee6002d37044d4a052cdeb956

if this is a regression, it should be classified as 'infra'

Flags: needinfo?(jmaher)

That seems like it could be a plausible cause for regression in JS memory.

Flags: needinfo?(jcoppeard)

Set release status flags based on info from the regressing bug 1681164

Component: AWSY → JavaScript: GC
Product: Testing → Core
Regressed by: 1677765
No longer regressed by: 1681164
Has Regression Range: --- → yes

That change should not have any effect on memory usage, it's just moving some work off the main thread.

There are two regressions here: one on the 15th (which this bug was filed for) and a larger one on the 18th (mentioned in comment 2).

Retriggers for the first regression do point at bug 1677765. I'll investigate these.

For the second regression, retriggers suggest a push containing bug 1682757. Since this was also found to be the root of a memory regression in bug 1683841 I think it's likely that it's the cause of this too.

Muddying the waters, bug 1681533 caused a small regression directly before that push but was later backed out due to bug 1683870.

(In reply to Jon Coppeard (:jonco) from comment #7)

For the second regression, retriggers suggest a push containing bug 1682757. Since this was also found to be the root of a memory regression in bug 1683841 I think it's likely that it's the cause of this too.

Yes, and that just reverted the Marionette code to no longer disable bfcache for all the navigations. The JS opt memory was also affected by that as you can see in the following graph:

https://treeherder.mozilla.org/perfherder/graphs?highlightAlerts=1&highlightChangelogData=1&series=autoland,2223844,1,4&timerange=5184000

On November 11th we landed the patch that enabled JSWindowActors by default in Marionette (bug 1669169), which introduced this regression with bfcache. Now that bug 1682757 has landed, it has been fixed. So this bug is indeed a wontfix.

I filed bug 1685128 to reduce memory used by the nursery when it's idle. I'm hopeful that landing this and bug 1681533 will cause the situation to return to normal.

Depends on: 1685128
Flags: needinfo?(jcoppeard)

Since both bug 1685128 and bug 1681533 seem to be fixed now, do we still see this situation?

(In reply to Steven DeTar [:sdetar] from comment #10)

Since both bug 1685128 and bug 1681533 seem to be fixed now, do we still see this situation?

Flags: needinfo?(jcoppeard)
Flags: needinfo?(aesanu)

Unfortunately those landed without improving the situation. At this point I'd like to just accept the 2% regression.

Flags: needinfo?(jcoppeard)

Marking this as WONTFIX.

Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(aesanu)
Resolution: --- → WONTFIX