Bug 1159003 (Closed): The buffer for user timing api should not have an upper limitation
Opened 10 years ago · Closed 7 years ago
Categories: Core :: DOM: Core & HTML (defect)
Tracking
Status: RESOLVED FIXED
Target Milestone: mozilla56
firefox56: fixed
People
(Reporter: hiro, Assigned: baku)
References
(Blocks 1 open bug)
Details
Attachments
(2 files, 2 obsolete files)
patch (deleted), bzbarsky: review+
patch (deleted), bzbarsky: review+
The current implementation uses an upper bound on the buffer for the resource timing API, but the spec does not describe any such limitation.
Comment 1•10 years ago
Maybe there should be? Otherwise it seems like an easy way to leak memory....
Reporter
Comment 2•10 years ago
I thought users were expected to manage the buffer themselves, using clearMarks and clearMeasures, in the case of the User Timing API.
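The pattern described here, where the page itself keeps the buffer bounded, can be sketched as follows. This is ordinary User Timing usage, not code from the bug; the mark and measure names are illustrative.

```javascript
// Record a pair of marks and a measure, consume the result, then
// clear the entries so they never accumulate in the shared buffer.
performance.mark('task-start');
// ... the work being timed would happen here ...
performance.mark('task-end');
performance.measure('task', 'task-start', 'task-end');

// Read the measurement before discarding it.
const [taskMeasure] = performance.getEntriesByName('task', 'measure');
console.log(taskMeasure.duration);

// Free the buffer entries now that they have been consumed.
performance.clearMarks('task-start');
performance.clearMarks('task-end');
performance.clearMeasures('task');
```

Pages that follow this discipline never grow the buffer, which is the behavior the original limit implicitly assumed.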
Comment 3•9 years ago
User Timing is using the same buffer as resource timing [1][2]. Raptor relies on receiving marks from the homescreen in order to relaunch applications, and because of this shared buffer, application launches after the 144th will not be marked (144 launches plus 5 homescreen-specific marks makes 149). See bug 1175230.
[1] https://dxr.mozilla.org/mozilla-central/source/dom/base/nsPerformance.cpp#877
[2] https://dxr.mozilla.org/mozilla-central/source/dom/base/nsPerformance.h#358
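The truncation described above can be shown with a small diagnostic: record more marks than the shared 150-entry buffer holds and count how many survive. This is a sketch; in an engine without a limit (Node.js, or Firefox once this bug was fixed) all entries are retained.

```javascript
// Record 200 marks, then count how many the engine actually kept.
performance.clearMarks(); // start from an empty User Timing buffer
for (let i = 0; i < 200; i++) {
  performance.mark(`launch-${i}`);
}
const kept = performance.getEntriesByType('mark').length;
// In affected Firefox versions `kept` stopped short of 200 because
// the shared buffer filled up; engines without a limit report 200.
console.log(kept);
```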
Comment 4•7 years ago
(In reply to Boris Zbarsky [:bz] (work week until 5/26) (if a patch has no decent message, automatic r-) from comment #1)
> Maybe there should be? Otherwise it seems like an easy way to leak
> memory....
Authors have all kinds of ways to "leak memory" if they want... I don't think that's a convincing reason to impose such a limitation.
Comment 5•7 years ago
> Authors have all kind of ways to "leak memory" if they want...
The difference is between "leak memory if you want, but you kinda have to try" and "leak memory if you use the API the way it's meant to be used".
Comment 6•7 years ago
(In reply to Boris Zbarsky [:bz] (work week until 5/26) (if a patch has no decent message, automatic r-) from comment #5)
> The difference is between "leak memory if you want, but you kinda have to
> try" and "leak memory if you use the API the way it's meant to be used".
What is the way it's meant to be used? Recording forever? Gecko isn't really implemented for that anyway: we only record the first several entries and stop once the buffer is full, which is not very helpful.
Anyway, the spec doesn't agree with you [1], and we have different behavior than other browsers, which forces authors to add workarounds for Firefox [2]. It seems authors think the behavior from the spec (and other browsers) is better.
[1] https://github.com/w3c/resource-timing/issues/89#issuecomment-283463782
[2] e.g. https://github.com/tootsuite/mastodon/commit/33d73387793674d955d8ec244ca099ba5a9ef97e
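Workarounds of the kind referenced in [2] typically wrap performance.mark so entries with the same name never pile up in the fixed-size buffer. A sketch of that pattern (safeMark is an illustrative name, not taken from the Mastodon commit):

```javascript
// Clear any previous entry with this name before recording a new one,
// so repeated marking cannot exhaust a fixed-size buffer.
function safeMark(name) {
  if (typeof performance === 'undefined' || !performance.mark) {
    return; // environment without User Timing support
  }
  performance.clearMarks(name);
  performance.mark(name);
}

safeMark('app-start');
safeMark('app-start'); // re-marking replaces, rather than appends
```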
Comment 8•7 years ago
> Recording forever?
Measuring things one after another, probably on every user action or off timers, etc.
If the intent is that consumers of the API are supposed to do clearMarks and clearMeasures consistently, then my question is: do they actually do this?
I understand that our current limit doesn't really make this work very well; this is mostly because the limit is quite low. I don't have a problem with raising it quite a bit, in case that wasn't clear.
I also think that the fact that setResourceTimingBufferSize affects this limit is clearly buggy; it shouldn't do that.
Comment 9•7 years ago
And again, if people have really thought this through and the "no limit" behavior is reasonable, I'm OK with that too. I just want people to have really thought this through, not assumed that just because a spec (especially a webperf spec) says something that automatically means the spec is correct.
Comment 10•7 years ago
Is the concern about memory leaks in production or in development? I'd be ok with a solution that increases the default limit (since 150 is really low) assuming we also allowed DevTools to set the limit to Infinity once DevTools was opened. In general some kind of solution that by default works better and then for developers it works as expected with other browsers and yet the limit protects non-developers from poor production code.
Comment 11•7 years ago
The concern would be production. Conditioning this on devtools usage would be a bit weird, since these APIs are meant to be used for telemetry in the wild, when devtools aren't open at all.
Again, just to be clear, I have no problem with us removing the limit, as long as we're doing that in a considered way, not just because everyone else is doing it.
Comment 12•7 years ago
Related: Bug 1373086 is making the API faster, which will make raising the limit easier.
Comment 13•7 years ago
I think we need to remove this limit.
We should be concerned about the production memory leaks that can occur without a limit. However, weighing that against the webcompat issue, where sites won't get reliable perf telemetry from Firefox, makes me think we need to change. Many sites are already using more than our limit allows; see: https://github.com/w3c/resource-timing/issues/89#issuecomment-278214137
To combat the memory issues that come along with this, DevTools can try to ensure people use clearMarks and clearMeasures consistently. I've filed bug 1374775 to handle this. We can also do better docs and DevRel outreach, but I think the tools are likely the best place to warn people.
We may want to add some telemetry to understand the number of marks / measures users are seeing (if we don't already) so we can better quantify the perf / memory targets we need to test for.
Also, setResourceTimingBufferSize shouldn't affect User Timing; I've filed bug 1374777 for that issue.
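A page-side approximation of the quantity such telemetry would need to capture; this is an assumption about what would be counted, not Gecko's actual telemetry code.

```javascript
// Count how many User Timing entries have accumulated so far.
performance.clearMarks();
performance.clearMeasures();
performance.mark('nav-start');
performance.mark('nav-end');
performance.measure('nav', 'nav-start', 'nav-end');

const markCount = performance.getEntriesByType('mark').length;
const measureCount = performance.getEntriesByType('measure').length;
console.log(markCount, measureCount); // 2 1
```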
Assignee
Comment 15•7 years ago
GetAsISupports() exists just because nsISupports is an ambiguous base class for PerformanceMainThread. I don't do an extra QI.
Assignee: hikezoe → amarchesini
Attachment #8887626 - Flags: review?(bzbarsky)
Comment 16•7 years ago
Comment on attachment 8887626 [details] [diff] [review]
perfomance.patch
I guess it's not clear to me why we cast to nsIObserver* to get to nsISupports* instead of casting to Performance*... If there is a reason, please document, else probably better to cast to Performance* both places.
>@@ -2001,17 +2001,20 @@ nsGlobalWindow::CleanUp()
What ensures that GetPerformance() will not be called after this, or at least not create a performance object?
r=me with that ensured.
Attachment #8887626 - Flags: review?(bzbarsky) → review+
Comment 17•7 years ago
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b7d81fea0b33
setResourceTimingBufferSize shouldn't affect user timing, but we should clean user markers if we have memory pressure, r=bz
Comment 18•7 years ago
OK, so the patch that landed actually makes QI and GetAsISupports() return _different_ nsISupports pointers. Why? That seems broken to me and really needs to be fixed.
Also, why was the boolean not placed next to the other boolean members to pack better?
Flags: needinfo?(amarchesini)
Assignee
Comment 19•7 years ago
Flags: needinfo?(amarchesini)
Attachment #8888345 - Flags: review?(bzbarsky)
Comment 20•7 years ago
Comment on attachment 8888345 [details] [diff] [review]
performance2.patch
r=me
Attachment #8888345 - Flags: review?(bzbarsky) → review+
Comment 21•7 years ago
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f4378a5e6c7f
Remove Performance::GetAsISupports(), r=bz
Comment 22•7 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/b7d81fea0b33
https://hg.mozilla.org/mozilla-central/rev/f4378a5e6c7f
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox56: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Comment 23•7 years ago
I see a perf win from this bug:
== Change summary for alert #8178 (as of July 20 2017 10:57 UTC) ==
Improvements:
9% quantum_pageload_facebook summary windows10-64 pgo e10s 1,133.46 -> 1,032.58
8% quantum_pageload_facebook summary windows10-64 opt e10s 1,373.98 -> 1,260.21
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=8178
Comment 24•7 years ago
Bug 1383553 is a bad memory leak and we are near the branch merge day. Can we back this out until the problem can be fixed?
Flags: needinfo?(amarchesini)
Comment 25•7 years ago
Backed out.
https://hg.mozilla.org/integration/mozilla-inbound/rev/d55bb144a48d743d48825eed765c274a2dd5c0f9
Status: RESOLVED → REOPENED
status-firefox56: fixed → ---
Flags: needinfo?(amarchesini)
Resolution: FIXED → ---
Target Milestone: mozilla56 → ---
Assignee
Comment 26•7 years ago
(In reply to Ben Kelly [:bkelly] from comment #24)
> Bug 1383553 is a bad memory leak and we are near the branch merge day. Can
> we back this out until the problem can be fixed?
The fix patch has already been reviewed by bz. I was planning to land it today.
Comment 27•7 years ago
backout bugherder
also backed out from central
https://hg.mozilla.org/mozilla-central/rev/d55bb144a48d
Comment 28•7 years ago
Backout cancelled regressions:
== Change summary for alert #8290 (as of July 26 2017 01:50 UTC) ==
Improvements:
8% Images summary windows7-32 opt 6,383,995.28 -> 5,842,480.07
8% Resident Memory summary windows10-64 opt 537,625,077.38 -> 494,286,302.08
8% Resident Memory summary windows7-32 pgo 383,352,045.51 -> 353,332,399.89
8% Resident Memory summary windows7-32 opt 389,532,786.94 -> 359,121,542.35
8% JS summary windows7-32 pgo 107,934,079.55 -> 99,802,179.05
7% JS summary windows10-64 pgo 145,607,084.09 -> 134,697,663.96
7% Resident Memory summary windows10-64 pgo 526,052,940.35 -> 486,648,881.76
7% Explicit Memory summary windows7-32 pgo 284,566,835.14 -> 263,363,442.22
7% Explicit Memory summary windows10-64 opt 361,002,635.23 -> 334,246,145.39
7% JS summary windows10-64 opt 145,598,700.07 -> 135,039,887.35
7% Explicit Memory summary windows10-64 pgo 361,506,278.08 -> 335,797,819.14
7% JS summary windows7-32 opt 108,417,526.32 -> 101,084,010.10
7% Explicit Memory summary windows7-32 opt 282,131,227.70 -> 263,571,893.08
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=8290
Comment 29•7 years ago
(In reply to Andrea Marchesini [:baku] from comment #26)
> The fix patch has already been reviewed by bz. I was planning to land it
> today.
Sorry, I didn't see the patch in the other bug. It does look like it still has an open review, though. In theory it should be straightforward to reland this with that new patch when it's ready?
Sorry again.
Assignee
Comment 30•7 years ago
Attachment #8887626 - Attachment is obsolete: true
Attachment #8890513 - Flags: review?(bzbarsky)
Assignee
Comment 32•7 years ago
Forgot the patch flag.
Attachment #8890513 - Attachment is obsolete: true
Attachment #8890513 - Flags: review?(bzbarsky)
Attachment #8890515 - Flags: review?(bzbarsky)
Comment 33•7 years ago
Comment on attachment 8890515 [details] [diff] [review]
perfomance.patch
r=me
Attachment #8890515 - Flags: review?(bzbarsky) → review+
Comment 34•7 years ago
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/51e745426b9a
setResourceTimingBufferSize shouldn't affect user timing, but we should clean user markers if we have memory pressure, r=bz
https://hg.mozilla.org/integration/mozilla-inbound/rev/ea02734a4e8b
Remove Performance::GetAsISupports(), r=bz
Comment 35•7 years ago
The backout from comment 25 left us with one regression:
== Change summary for alert #8289 (as of July 26 2017 01:50 UTC) ==
Regressions:
7% quantum_pageload_facebook summary windows10-64 opt e10s stylo 1,291.65 -> 1,382.83
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=8289
Unfortunately, I cannot say if this was expected by the backout or not, as I no longer have data points older than July 22nd.
Comment 36•7 years ago
Hopefully, the push from comment 34 will cancel this quantum test regression.
Comment 37•7 years ago
That's extremely unlikely, if the backout didn't affect it. Chances are, it's due to something else. Bobby, do we have a known facebook pageload regression on stylo?
Flags: needinfo?(bobbyholley)
Comment 38•7 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/51e745426b9a
https://hg.mozilla.org/mozilla-central/rev/ea02734a4e8b
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
status-firefox56: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Comment 39•7 years ago
(In reply to Boris Zbarsky [:bz] from comment #37)
> That's extremely unlikely, if the backout didn't affect it. Chances are,
> it's due to something else. Bobby, do we have a known facebook pageload
> regression on stylo?
I don't think so. I'm not sure of the full context of this bug, but it seems like the stylo and non-stylo builds got slower by about the same amount on Jul 25th?
https://treeherder.mozilla.org/perf.html#/graphs?timerange=5184000&series=%5Bmozilla-inbound,7082fd4e57363151313c6215e6bb77b68fe4b775,1,1%5D&series=%5Bmozilla-inbound,56f971bde973a3edee958d7c4d975062f646d3a9,1,1%5D&series=%5Bmozilla-inbound,6ec57e077dec594917859dbe5526923fffbb1ad6,1,1%5D&series=%5Bmozilla-inbound,b8d73ce351c36fb3f136064b85a051b2c393decd,1,1%5D&zoom=1501001507103.5183,1501071544043.9783,594.0296685517724,1847.7610118353546
Flags: needinfo?(bobbyholley)
Updated•6 years ago
Component: DOM → DOM: Core & HTML