Closed Bug 1713100 Opened 3 years ago Closed 3 years ago

8.01 - 6.69% Base Content Explicit / Base Content Explicit + 4 more (Linux, OSX, Windows) regression on Tue May 25 2021

Categories

(Core :: Memory Allocator, defect)

defect

Tracking

()

RESOLVED FIXED
91 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox88 --- unaffected
firefox89 --- unaffected
firefox90 + fixed
firefox91 --- fixed

People

(Reporter: aesanu, Assigned: toshi)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: perf, perf-alert, regression)

Attachments

(1 file)

Perfherder has detected a awsy performance regression from push 8bee937821e3725b922352a0493f53b5e431c3d0. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Suite Test Platform Options Absolute values (old vs new)
8% Base Content Explicit macosx1015-64-shippable 8,101,360.00 -> 8,750,576.00
8% Base Content Explicit linux1804-64-shippable 9,029,274.67 -> 9,746,928.00
8% Base Content Explicit linux1804-64-shippable 9,030,981.33 -> 9,745,050.67
7% Base Content Explicit linux1804-64-shippable-qr 9,087,472.00 -> 9,744,026.67
7% Base Content Resident Unique Memory windows10-64-shippable 9,442,645.33 -> 10,084,864.00
7% Base Content Explicit windows10-64-shippable 8,103,578.67 -> 8,645,616.00

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(tkikuchi)

Is this really caused by bug 1711610? I think bug 1711610 impacts only Windows.

Flags: needinfo?(tkikuchi)

Could this affect the behavior on other platforms when we trigger a manual minimize memory notification? The AWSY tests do that.

No, this reduces a chance to trigger memory-pressure events on Windows. There is no change on the behaviors after memory notifications are triggered.

Set release status flags based on info from the regressing bug 1711610

Andra, could you retriage the regression?

Flags: needinfo?(aesanu)

This is a pretty large regression so we should really figure out what is going on here.

(In reply to Marco Castelluccio [:marco] from comment #5)

Andra, could you retriage the regression?

I did some retriggers. The graph is pretty clear and the regression seems to be here.

Flags: needinfo?(aesanu)

[Tracking Requested - why for this release]: large regression in content process memory overhead, though it is possible that this is some kind of measurement artifact.

I downloaded the memory-report-TabsOpenForceGC-0.json.gz artifacts before and after the regression, then compared them using "load and diff" in about:memory. "web" is the relevant process.

On Linux, the regression looks like this:

4.84 MB (100.0%) -- explicit
├──4.88 MB (100.73%) -- heap-overhead
│ ├──4.91 MB (101.53%) ── page-cache [8]

OSX and Windows look similar.

Maybe there's some mechanism to purge the page-cache on memory minimization that got broken?

Ah, now I see the problem. An OS-independent part in my patch is I moved AvailableMemoryTracker::Init() from NS_InitXPCOM to XREMain::XRE_mainRun() to use prefs. This broke memory-pressure in the child processes entirely because XRE_mainRun() is only for the main process. Apparently this is my bad! I'll prepare a patch.

Bug 1711610 moved AvailableMemoryTracker::Init() from NS_InitXPCOM to XRE_mainRun,
but it caused memory degradation because AvailableMemoryTracker was no longer initialized
in the child processes.

I made that part for nsAvailableMemoryWatcher to cache the pref value in the earlier design,
but it's not needed at all in the current design because nsAvailableMemoryWatcher loads
a mirror value every time.

This patch reverts AvailableMemoryTracker::Init() back to NS_InitXPCOM.

Assignee: nobody → tkikuchi
Status: NEW → ASSIGNED

I started a Try job with a patch: https://treeherder.mozilla.org/jobs?repo=try&revision=0b2e8348281b59938e4ef471bf210918edb644c5. How can I kick performance tests and verify the patch before landing?

Flags: needinfo?(aesanu)

The base overhead tests show up as SY(ab). You can find them by doing "add new jobs (search)" and searching for "awsy base". Then you can use perf herder to compare the results to the existing ones.

Pushed by tkikuchi@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cee641108aa0
Move AvailableMemoryTracker::Init() back to NS_InitXPCOM.  r=gsvelto
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 91 Branch

The patch landed in nightly and beta is affected.
:toshi, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(tkikuchi)

== Change summary for alert #30310 (as of Mon, 07 Jun 2021 09:55:01 GMT) ==

Improvements:

Ratio Suite Test Platform Options Absolute values (old vs new)
7% Base Content Explicit linux1804-64-shippable 9,759,045.33 -> 9,049,584.00
7% Base Content Explicit windows10-64-shippable 8,662,000.00 -> 8,036,848.00
7% Base Content Explicit linux1804-64-shippable 9,760,922.67 -> 9,057,946.67
7% Base Content Explicit macosx1015-64-shippable 8,707,141.33 -> 8,087,536.00
7% Base Content Resident Unique Memory windows10-64-shippable 10,182,656.00 -> 9,488,213.33
7% Base Content Explicit linux1804-64-shippable-qr 9,758,192.00 -> 9,098,906.67

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=30310

Comment on attachment 9225004 [details]
Bug 1713100 - Move AvailableMemoryTracker::Init() back to NS_InitXPCOM. r=gsvelto

Beta/Release Uplift Approval Request

  • User impact if declined: Firefox consumes 7-8% more memory on all platforms because no memory-pressure events are issued in child processes.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This is a regression from bug 1711610 and the fix is to partially revert the regressing change into the original. No new logic was implemented by this patch.
  • String changes made/needed: None
Flags: needinfo?(tkikuchi)
Attachment #9225004 - Flags: approval-mozilla-beta?

Comment on attachment 9225004 [details]
Bug 1713100 - Move AvailableMemoryTracker::Init() back to NS_InitXPCOM. r=gsvelto

approved for 90.0b5

Attachment #9225004 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

== Change summary for alert #30363 (as of Mon, 14 Jun 2021 14:41:51 GMT) ==

Improvements:

Ratio Suite Test Platform Options Absolute values (old vs new)
8% Base Content Explicit linux1804-64-shippable 9,777,136.00 -> 8,982,853.33
7% Base Content Explicit windows10-64-shippable 8,595,269.33 -> 7,960,901.33
7% Base Content Explicit linux1804-64-shippable-qr 9,761,434.67 -> 9,041,904.00
6% Base Content Resident Unique Memory windows10-64-shippable 10,041,002.67 -> 9,402,368.00

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=30363

Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: