1439509 - [dt-leak] NetworkEvent actors leak on reload

Reporter

Description

•

7 years ago

Attached file memory-report-before-cc.json.gz (deleted) — Details

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0 Build ID: 20180215141455 Steps to reproduce: Open a page in a local server; Open the Developer Tool; Do a change in a JavaScript file; Reload the page; Actual results: The number of copies stored and the memory usage increase. Expected results: Do a garbage collection.

guirec.corbel

Reporter

Comment 1

•

7 years ago

I reproduced the bug with this HTML file : <script src='./bigfile.js' type="text/javascript" charset="utf-8"></script> And this js file : https://raw.githubusercontent.com/orangehill/jstree-bootstrap-theme/master/dist/jstree.js . Any big files should be the same. I didn't saw the bug with a small file. As you can see in the attachment, I can take 4go of RAM and 18go of "vsize". I had to closed Firefox and the CPU usage was very high. I had to kill the process to really stop Firefox. As you can see here "string(length=515416, copies=101, "/* jsTree default theme */" there is a lot of copies of the same files. I have to close the dev tool and run a Garbage Collection and Cycle Collection to release the memory. It took the CPU for a while when I run a Garbage Collection. The "Disable Cache" checkbox was checked in the network tab.

Alexandre Poirot [:ochameau]

Comment 2

•

7 years ago

Does it happen with every tools or were you using one in particular (debugger, console, inspector)?

Alexandre Poirot [:ochameau]

Comment 3

•

7 years ago

I see you are having LiveReload add-on via the memory profile you attached. Were you using any particular workflow/tool, or just a plain http server and F5? I imagine live-reload add-ons may make such memory leak even more present!

guirec.corbel

Reporter

Comment 4

•

7 years ago

I send an memory report with livereload disabled. I was on the console, I will test in other tabs in a few minutes. As you can see, there 7 copies of "/app/lib/assets/stylesheets/loader.scss" but I didn't touch this file. There is other files loaded multiple times too.

guirec.corbel

Reporter

Comment 5

•

7 years ago

Attached file memory-report.json.gz (deleted) — Details

guirec.corbel

Reporter

Comment 6

•

7 years ago

It is the same with other tabs. For information, I was modifying the file that starts with "Modernizer". I reloaded with F5.

u554753

Updated

•

7 years ago

Component: Untriaged → Developer Tools

Alexandre Poirot [:ochameau]

Comment 7

•

7 years ago

(In reply to guirec.corbel from comment #1) > I reproduced the bug with this HTML file : > > <script src='./bigfile.js' type="text/javascript" > charset="utf-8"></script> > > And this js file : > https://raw.githubusercontent.com/orangehill/jstree-bootstrap-theme/master/ > dist/jstree.js . I tried to reproduce that on today's nightly, without success. I used this html file: <script src='jquery.js' type="text/javascript" > charset="utf-8"></script> <script src='bigfile.js' type="text/javascript" > charset="utf-8"></script> With the following js files next to it: $ wget https://code.jquery.com/jquery-3.3.1.min.js -O jquery.js $ wget https://raw.githubusercontent.com/orangehill/jstree-bootstrap-theme/master/dist/jstree.js -O bigfile.js My STR was: * open DevTools on the console * press F5 to reload the page tons of times My result was that the memory is surely growing, that's expected. But nothing really bad, here is the memory after 30seconds of reloads: 224MB for parent process and 211MB for content process (Firefox started at: 172MB for parent and 53MB for content with the test document opened with the console) Also I do not see anything really special in about:memory. So, there must be something special in your usecase. Are you sure you opened only the console and directly the console? Did you reproduced with just a <script> tag in an empty html document?

guirec.corbel

Reporter

Comment 8

•

7 years ago

I wasn't able to reproduce the same bug with the same file I used when I created the issue. I was able to reproduce with a brand new ember app. I did just `ember new my-app && cd my-app && ember s` when ember cli is installed. You can install ember with `npm install -g ember-cli`. I attached a small video with a new profile. As you can see in the video, the memory is clean first. I open the console with Ctrl+Shift+k and refresh some times. After that, there is 4 copies of the same files. It's cleaned when I close the console. I don't need to click on "GC" and "CC" this time. I suppose this is because the memory is not loaded. It isn't a problem specific to Ember. I have it with other applications.

guirec.corbel

Reporter

Comment 9

•

7 years ago

Attached video Capture d’écran vidéo de 2018-02-28 13:03:47.webm (deleted) — Details

Alexandre Poirot [:ochameau]

Comment 10

•

7 years ago

Ah ok, I see. This is related to source map, this is why the simplest example doesn't work. 108.15 MB (25.29%) -- worker(resource://devtools/client/shared/source-map/worker.js, 0x7f8ca8b53800) │ │ ├──103.92 MB (24.30%) -- zone(0x7f8ca915f000) │ │ │ ├───53.03 MB (12.40%) -- strings │ │ │ │ ├──32.13 MB (07.51%) -- string(length=3368956, copies=5, "{"version":3,"sources":["vendor/ember So the source-map worker is loading the sources multiple times. If you force a GC, these strings go away, so it isn't a leak. It's just that the source map is allocating a lot of memory because of the JS content copies. There is multiple ways to improve that: * force GC on page reload * kill, respawn a source worker on reload, that may help releasing the worker objets sooner * optimize source map to allocate less memory? * try to detect when the resources don't change and do not re-request the source map worker to re-parse the files * ...

guirec.corbel

Reporter

Comment 11

•

7 years ago

On the reports I gave, the memory leaks dosen't seems to be related to the source map (or it's less obvious) as you can see here : 4,893.12 MB (100.0%) -- explicit ├──3,357.53 MB (68.62%) -- window-objects/top(http://app.local.com:4200/tests/#, id=6442450990) │ ├──2,301.10 MB (47.03%) ++ active │ └──1,056.43 MB (21.59%) ++ js-zone(0x7fa71f048000) ├────972.40 MB (19.87%) -- js-non-window │ ├──888.73 MB (18.16%) -- zones │ │ ├──871.52 MB (17.81%) -- zone(0x7fa74637d000) │ │ │ ├──831.51 MB (16.99%) -- strings │ │ │ │ ├──502.44 MB (10.27%) ++ (276 tiny) │ │ │ │ ├───99.42 MB (02.03%) -- string(length=515416, copies=101, "/* jsTree" (truncated)) │ │ │ │ │ ├──99.42 MB (02.03%) ── malloc-heap/two-byte │ │ │ │ │ └───0.00 MB (00.00%) ── gc-heap/two-byte │ │ │ │ ├───92.18 MB (01.88%) -- string(length=893446, copies=54, ""use strict" (truncated)) │ │ │ │ │ ├──92.18 MB (01.88%) ── malloc-heap/two-byte │ │ │ │ │ └───0.00 MB (00.00%) ── gc-heap/two-byte │ │ │ │ ├───81.35 MB (01.66%) -- string(length=1092231, copies=39, "/n/n(function() " (truncated)) │ │ │ │ │ ├──81.35 MB (01.66%) ── malloc-heap/two-byte │ │ │ │ │ └───0.00 MB (00.00%) ── gc-heap/two-byte │ │ │ │ └───56.11 MB (01.15%) -- string(length=1049559, copies=28, "window.EmberENV"(truncated)) │ │ │ │ ├──56.11 MB (01.15%) ── malloc-heap/two-byte It seems to be under the tab where the page is loaded. In the screencast, it seems to be the same than you.

guirec.corbel

Reporter

Comment 12

•

7 years ago

I just realized I forgot an important part yesterday (and you too). We have to do a small change in a JS file before to reload as I specified in the first comment.

guirec.corbel

Reporter

Comment 13

•

7 years ago

I still have the issue. It keeps multiple copies of JS files when we change a file and reload with the dev tool open.

Julien Wajsberg [:julienw]

Updated

•

7 years ago

Status: UNCONFIRMED → NEW

Ever confirmed: true

Patrick Brosset <:pbro>

Updated

•

7 years ago

Blocks: devtools-performance

Priority: -- → P2

BMO Automation

Updated

•

6 years ago

Product: Firefox → DevTools

guirec.corbel

Reporter

Comment 14

•

6 years ago

I tried again to reproduce the bug and it seems be a little bit different. Now, to reproduce the bug, I have to open the file in the debugger tab. It seems to keep copies of old JS files. I think it needs less memory than before but I suppose some garbage collection have to be performed. The file is still in memory if close the file in the debugger.

:Harald Kirschner :digitarald

Updated

•

6 years ago

Blocks: 1553178
No longer blocks: devtools-performance

:Harald Kirschner :digitarald

Updated

•

5 years ago

Blocks: dt-leak
No longer blocks: 1553178

Sorin Davidoi

Assignee

Comment 15

•

5 years ago

Attached file fix(devtools): Clear stored events on tab navigation (deleted) — Details

Don't keep old network events around after a tab navigation occurs.

Phabricator Automation

Updated

•

5 years ago

Assignee: nobody → sorin.davidoi

Alexandre Poirot [:ochameau]

Comment 16

•

5 years ago

Harald, Could you think about any existing telemetry which may help assert the impact of this patch?
I'm pretty confident it will address serious memory leaks from DevTools, and it would be great to take an overview of existing metrics we have to see if one highlights this fix. If one do pick this up, we may take it as good candidate for the performance initiative.

I was wondering for example if we have a metrics about OOM crashes? If yet, could we categorize the one related to DevTools?
Or telemetry about memory usage, also categorized by DevTools usages?
Or the number of restart of Firefox per day/week? Or Firefox uptime for DevTools users?

Flags: needinfo?(hkirschner)

:Harald Kirschner :digitarald

Comment 17

•

5 years ago

Not sure about OOM crashes as we only get very little information. Bug 1297525 was filed for DevTools because of OOM, but we don't have the means to track those reliably over time (afaik).

The best we can do without additional instrumentation is relying on existing probes. For leaks especially we can track our memory telemetry; but also GC related metrics (leak = long GC). Leaks slow specific parts of GC but also cause hangs with mostly GC.

Preliminary chat with sfink about this yields:

For DevTools users:

MEMORY_JS_GC_HEAP should improve.
GC_MS should improve.
GC_MARK_MS should improve.
GC_SWEEP_MS should not change much as leaks prevent sweeping (maybe)

As you said (and Sorin brought it up as well) open/close count for DevTools should improve if this has big impact, as we trained developers that DevTools leaks are fixed by closing the tools (also comes up in feedback).

Most of these we should be able to plot in stmo (with some help from data science).

Flags: needinfo?(hkirschner)

Alexandre Poirot [:ochameau]

Comment 18

•

5 years ago

It is a bit surprising, but DAMP doesn't highlight any improvement:
https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=try&newProject=try&newRevision=0f7e1a3c96ad5512e356006e3a19af287d75527c&originalSignature=1759151&newSignature=1759151&framework=12&originalRevision=f77921e426e801272780129a8cdff3121c3d6620
I would expect us to free some more memory between page reload while running the netmonitor test, leading to less GC pauses in the following tests. But nothing visible.
Note that netmonitor isn't having a stress test document. We only test it against bild.de, which may not trigger enough requests?

Sorin Davidoi

Assignee

Comment 19

•

5 years ago

Last time I checked the DAMP version of bild.de it seemed pretty lightweight by today's standards. That's why I chose Mastodon to test (it's pretty heavyweight with React, Redux, Immutable and tons of other libraries).

:Harald Kirschner :digitarald

Comment 20

•

5 years ago

Agreed. An SPA would probably best represent a modern heavy JS payload. Mastodon seems good – Riot, Firefox Profiler would also be open source alternatives. If we want to go towards content, we could have Discord and Reddit.

We can also continue that discussion in bug 1539195.

Patrick Brosset <:pbro>

Updated

•

5 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1612051

Alexandre Poirot [:ochameau]

Comment 21

•

5 years ago

Note that bug 1615283 might help improve test helpers. At least, when we navigate to a new URL, there is a shared helper that we can tweak.
But here, we may want to also tweak initNetmonitor all all netmonitor tests, using various wait methods for no clear reason.

Honza, could you please have a look at the comments on phabricator?
https://phabricator.services.mozilla.com/D54800#1815934
https://phabricator.services.mozilla.com/D54800#1906061
https://phabricator.services.mozilla.com/D54800#inline-377350

I believe that, in order to fix this issue and have tests passing, we would need to review all existing helpers around netmonitor.
We have way too many different wait mechanisms:

waitForAllNetworkUpdateEvents
waitForNetworkEvents
waitForAllRequestsFinished
waitForRequestsFinished
and probably some others...
This patch highlights that some are weak and waitForAllRequestsFinished looks slightly better.
Should we use this one everywhere? Would it work? Or is it weak against another edge case?
Why are we having so many different ways to wait for updates?

Then, once we can pick the best way to wait for updates, we can go over netmonitor test and use the right one.
This may involve tweaking initNetmonitor and/or navigateTo.
It looks like we are explicitely avoiding to check the behavior during toolbox opening and always reload the page.
This creates a lot of duplicated lines in each test, if we have to keep this, we should introduce an helper to do all of that in one function call.

Flags: needinfo?(odvarko)

Sorin Davidoi

Assignee

Comment 22

•

5 years ago

Could we maybe create another issue to track this refactoring of the tests? It seems to be a more systemic problem than what the revision tries to address.

Flags: needinfo?(poirot.alex)

Alexandre Poirot [:ochameau]

Updated

•

5 years ago

Depends on: 1622043

Alexandre Poirot [:ochameau]

Comment 23

•

5 years ago

(In reply to Sorin Davidoi from comment #22)

Could we maybe create another issue to track this refactoring of the tests? It seems to be a more systemic problem than what the revision tries to address.

Yes. Your patch raise a systemic issue which has to be addressed before being able to fix this leak.

I opened bug 1622043 and will start submitting patches to it.

Alexandre Poirot [:ochameau]

Updated

•

5 years ago

Flags: needinfo?(poirot.alex)

Sorin Davidoi

Assignee

Comment 24

•

5 years ago

@ochameau Since the test issues are fixed, can you take another look at the patch?

Flags: needinfo?(poirot.alex)

Jan Honza Odvarko [:Honza] (always need-info? me)

Comment 25

•

5 years ago

I was looking at this yesterday and pushed to try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=75a94dd42d446e96e21309e64deb8e7553ed17f2

I am seeing some test failures:

devtools/client/netmonitor/test/browser_net_block-context.js
devtools/client/netmonitor/test/browser_net_block-csp.js
devtools/client/netmonitor/test/browser_net_block.js
devtools/client/netmonitor/test/browser_net_cause_source_map.js
devtools/client/netmonitor/test/browser_net_charts-**.js
devtools/client/netmonitor/src/har/test/browser_net_har_copy_all_as_har.js
devtools/client/framework/test/browser_ignore_toolbox_network_requests.js

I was doing some analysis of the Network monitor backend and I am hoping to get also more time to work on this.

Honza

Flags: needinfo?(odvarko)

Sorin Davidoi

Assignee

Comment 26

•

4 years ago

I've rebased the patch. The test failure are still there, and I'm not sure how to fix them. I suspect that they are related to the test setup (probably similar issues to the ones @ochameau has fixed).

Julian Descottes [:jdescottes]

Comment 27

•

4 years ago

Cleaning up dt-leak dependencies

Summary: High RAM usage when the developer tool is open → [dt-leak] NetworkEvent actors leak on reload

Julian Descottes [:jdescottes]

Comment 30

•

4 years ago

I'm curious to know if the test failures in this bug highlighted a real issue with the fix or if they are only test artifacts.
I wonder if it would be acceptable to put this new behavior behind a pref so that we can toggle it off for the tests?

Julian Descottes [:jdescottes]

Updated

•

4 years ago

Blocks: dt-leak-reload

Alexandre Poirot [:ochameau]

Comment 31

•

3 years ago

Attached file Bug 1439509 - [devtools] Clear network events when their related WindowGlobal is destroyed. (deleted) — Details

The main challenge here is to still support persist-logs feature of the netmonitor.
When this feature is enabled we have to keep around all the Network Events as before this patch.

Fission refactoring and the migration to server side NETWORK_EVENT watchers made
this fix significantly easier as we can now more easily reach out the parent process
code which is keeping things allocated.

Note that with the current patch, we still leak one request. The navigation request,
which is slightly special as it is flaged with previous document innerWindowId
and is meant to be inspectable from the next document load.

Alexandre Poirot [:ochameau]

Comment 32

•

3 years ago

DAMP reports various small improvements (3% to 6%) on many tests:
https://treeherder.mozilla.org/perfherder/comparesubtest?originalProject=mozilla-central&newProject=try&newRevision=be53546902553b0e585b42f253d6ceb0f67876c8&originalSignature=3148563&newSignature=3130913&framework=12&selectedTimeRange=172800&page=1&showOnlyImportant=1
The most important improvements impact tests others than netmonitor one, which probably means that the leaked objects were slowing down the following tests.

Flags: needinfo?(poirot.alex)

Alexandre Poirot [:ochameau]

Updated

•

3 years ago

Whiteboard: dt-perf-stability-triage

Alexandre Poirot [:ochameau]

Updated

•

3 years ago

Depends on: 1728737

Julian Descottes [:jdescottes]

Updated

•

3 years ago

See Also: → https://bugzilla.mozilla.org/show_bug.cgi?id=1660854

Alexandre Poirot [:ochameau]

Updated

•

3 years ago

Blocks: 1729473

Pulsebot

Comment 33

•

3 years ago

Pushed by apoirot@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c1ba43b65415 [devtools] Clear network events when their related WindowGlobal is destroyed. r=bomsy,devtools-backward-compat-reviewers,jdescottes

Alexandre Poirot [:ochameau]

Comment 34

•

3 years ago

Surprisingly, I no longer see notable improvement reported by DAMP:
https://treeherder.mozilla.org/perfherder/comparesubtest?originalProject=try&newProject=try&newRevision=f08cb89717e5fb10f940ffc0ef6e84d596646fd6&originalSignature=3130913&newSignature=3130913&framework=12&originalRevision=37234b235e647e5814d4eb24fe19d873ad284f9f&page=1

Let's see if it impacts the metrics on mozilla-central.

But it does still fix browser_allocation_reload.js, which sees its number of leaks go down to 0 objects leaked with stack (with all the target fronts and markupview fixes [still in flight]).

Cosmin Sabou [:CosminS]

Comment 35

•

3 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/c1ba43b65415

Status: NEW → RESOLVED

Closed: 3 years ago

status-firefox94: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 94 Branch

Simona Badau, Desktop QA

Updated

•

3 years ago

Regressions: 1740116

memory-report-before-cc.json.gz 7 years ago guirec.corbel (deleted), application/gzip		Details
memory-report.json.gz 7 years ago guirec.corbel (deleted), application/gzip		Details
Capture d’écran vidéo de 2018-02-28 13:03:47.webm 7 years ago guirec.corbel (deleted), video/webm		Details
fix(devtools): Clear stored events on tab navigation 5 years ago Sorin Davidoi (deleted), text/x-phabricator-request		Details
Bug 1439509 - [devtools] Clear network events when their related WindowGlobal is destroyed. 3 years ago Alexandre Poirot [:ochameau] (deleted), text/x-phabricator-request		Details