Bug 1159003 (Closed): The buffer for user timing api should not have an upper limitation
Opened 10 years ago · Closed 7 years ago
Categories: Core :: DOM: Core & HTML (defect)
Tracking
Status: RESOLVED FIXED
Target Milestone: mozilla56
firefox56: fixed
People
(Reporter: hiro, Assigned: baku)
References
(Blocks 1 open bug)
Details
Attachments
(2 files, 2 obsolete files)
patch (deleted), bzbarsky: review+
patch (deleted), bzbarsky: review+
The current implementation uses an upper bound on the buffer for the resource timing API, but the spec does not describe any such limitation.
Comment 1•10 years ago
Maybe there should be? Otherwise it seems like an easy way to leak memory....
Reporter
Comment 2•10 years ago
I thought users were expected to manage the buffer themselves, using clearMarks and clearMeasures, in the case of the User Timing API.
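The pattern described here, where the page itself keeps the buffer bounded, can be sketched as follows. This is ordinary User Timing usage, not code from the bug; the mark and measure names are illustrative.

```javascript
// Record a pair of marks and a measure, consume the result, then
// clear the entries so they never accumulate in the shared buffer.
performance.mark('task-start');
// ... the work being timed would happen here ...
performance.mark('task-end');
performance.measure('task', 'task-start', 'task-end');

// Read the measurement before discarding it.
const [taskMeasure] = performance.getEntriesByName('task', 'measure');
console.log(taskMeasure.duration);

// Free the buffer entries now that they have been consumed.
performance.clearMarks('task-start');
performance.clearMarks('task-end');
performance.clearMeasures('task');
```

Pages that follow this discipline never grow the buffer, which is the behavior the original limit implicitly assumed.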
Comment 3•9 years ago
User Timing is using the same buffer as resource timing [1][2]. Raptor relies on receiving marks from the homescreen in order to relaunch applications, and because of this shared buffer, application launches after the 144th will not be marked (144 launches plus 5 homescreen-specific marks makes 149). See bug 1175230.
[1] https://dxr.mozilla.org/mozilla-central/source/dom/base/nsPerformance.cpp#877
[2] https://dxr.mozilla.org/mozilla-central/source/dom/base/nsPerformance.h#358
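The truncation described above can be shown with a small diagnostic: record more marks than the shared 150-entry buffer holds and count how many survive. This is a sketch; in an engine without a limit (Node.js, or Firefox once this bug was fixed) all entries are retained.

```javascript
// Record 200 marks, then count how many the engine actually kept.
performance.clearMarks(); // start from an empty User Timing buffer
for (let i = 0; i < 200; i++) {
  performance.mark(`launch-${i}`);
}
const kept = performance.getEntriesByType('mark').length;
// In affected Firefox versions `kept` stopped short of 200 because
// the shared buffer filled up; engines without a limit report 200.
console.log(kept);
```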
Comment 4•7 years ago
(In reply to Boris Zbarsky [:bz] (work week until 5/26) (if a patch has no decent message, automatic r-) from comment #1)
> Maybe there should be? Otherwise it seems like an easy way to leak
> memory....
Authors have all kinds of ways to "leak memory" if they want... I don't think that's a convincing reason to impose such a limitation.
Comment 5•7 years ago
> Authors have all kind of ways to "leak memory" if they want...
The difference is between "leak memory if you want, but you kinda have to try" and "leak memory if you use the API the way it's meant to be used".
Comment 6•7 years ago
(In reply to Boris Zbarsky [:bz] (work week until 5/26) (if a patch has no decent message, automatic r-) from comment #5)
> The difference is between "leak memory if you want, but you kinda have to
> try" and "leak memory if you use the API the way it's meant to be used".
What is the way it's meant to be used? Recording forever? Gecko isn't really implemented for that anyway: we only record the first several entries and stop once the buffer is full, which is not very helpful.
Anyway, the spec doesn't agree with you [1], and we have different behavior than other browsers, which forces authors to add workarounds for Firefox [2]. It seems authors think the behavior from the spec (and other browsers) is better.
[1] https://github.com/w3c/resource-timing/issues/89#issuecomment-283463782
[2] e.g. https://github.com/tootsuite/mastodon/commit/33d73387793674d955d8ec244ca099ba5a9ef97e
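Workarounds of the kind referenced in [2] typically wrap performance.mark so entries with the same name never pile up in the fixed-size buffer. A sketch of that pattern (safeMark is an illustrative name, not taken from the Mastodon commit):

```javascript
// Clear any previous entry with this name before recording a new one,
// so repeated marking cannot exhaust a fixed-size buffer.
function safeMark(name) {
  if (typeof performance === 'undefined' || !performance.mark) {
    return; // environment without User Timing support
  }
  performance.clearMarks(name);
  performance.mark(name);
}

safeMark('app-start');
safeMark('app-start'); // re-marking replaces, rather than appends
```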
Comment 8•7 years ago
> Recording forever?
Measuring things one after another, probably on every user action or off timers, etc.
If the intent is that consumers of the API are supposed to do clearMarks and clearMeasures consistently, then my question is: do they actually do this?
I understand that our current limit doesn't really make this work very well; this is mostly because the limit is quite low. I don't have a problem with raising it quite a bit, in case that wasn't clear.
I also think that the fact that setResourceTimingBufferSize affects this limit is clearly buggy; it shouldn't do that.
Comment 9•7 years ago
And again, if people have really thought this through and the "no limit" behavior is reasonable, I'm OK with that too. I just want people to have really thought this through, not assumed that just because a spec (especially a webperf spec) says something that automatically means the spec is correct.
Comment 10•7 years ago
Is the concern about memory leaks in production or in development? I'd be ok with a solution that increases the default limit (since 150 is really low) assuming we also allowed DevTools to set the limit to Infinity once DevTools was opened. In general some kind of solution that by default works better and then for developers it works as expected with other browsers and yet the limit protects non-developers from poor production code.
Comment 11•7 years ago
The concern would be production. Conditioning this on devtools usage would be a bit weird, since these APIs are meant to be used for telemetry in the wild, when devtools aren't open at all.
Again, just to be clear, I have no problem with us removing the limit, as long as we're doing that in a considered way, not just because everyone else is doing it.
Comment 12•7 years ago
Related: Bug 1373086 is making the API faster, which will make raising the limit easier.
Comment 13•7 years ago
I think we need to remove this limit.
We should be concerned about the production memory leaks that can occur without a limit. However, weighing that against the webcompat issue, where sites won't get reliable perf telemetry from Firefox, makes me think we need to change. Many sites are already using more than our limit allows; see: https://github.com/w3c/resource-timing/issues/89#issuecomment-278214137
To combat the memory issues that come along with this, DevTools can try to ensure people use clearMarks and clearMeasures consistently. I've filed bug 1374775 to handle this. We can also do better docs and DevRel outreach, but I think the tools are likely the best place to warn people.
We may want to add some telemetry to understand the number of marks / measures users are seeing (if we don't already) so we can better quantify the perf / memory targets we need to test for.
Also, setResourceTimingBufferSize shouldn't affect User Timing; I've filed bug 1374777 for that issue.
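A page-side approximation of the quantity such telemetry would need to capture; this is an assumption about what would be counted, not Gecko's actual telemetry code.

```javascript
// Count how many User Timing entries have accumulated so far.
performance.clearMarks();
performance.clearMeasures();
performance.mark('nav-start');
performance.mark('nav-end');
performance.measure('nav', 'nav-start', 'nav-end');

const markCount = performance.getEntriesByType('mark').length;
const measureCount = performance.getEntriesByType('measure').length;
console.log(markCount, measureCount); // 2 1
```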
Assignee
Comment 15•7 years ago
GetAsISupports() exists just because nsISupports is an ambiguous base class for PerformanceMainThread. I don't do an extra QI.
Assignee: hikezoe → amarchesini
Attachment #8887626 - Flags: review?(bzbarsky)
Comment 16•7 years ago
Comment on attachment 8887626 [details] [diff] [review]
perfomance.patch
I guess it's not clear to me why we cast to nsIObserver* to get to nsISupports* instead of casting to Performance*... If there is a reason, please document, else probably better to cast to Performance* both places.
>@@ -2001,17 +2001,20 @@ nsGlobalWindow::CleanUp()
What ensures that GetPerformance() will not be called after this, or at least not create a performance object?
r=me with that ensured.
Attachment #8887626 - Flags: review?(bzbarsky) → review+
Comment 17•7 years ago
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b7d81fea0b33
setResourceTimingBufferSize shouldn't affect user timing, but we should clean user markers if we have memory pressure, r=bz
Comment 18•7 years ago
OK, so the patch that landed actually makes QI and GetAsISupports() return _different_ nsISupports pointers. Why? That seems broken to me and really needs to be fixed.
Also, why was the boolean not placed next to the other boolean members to pack better?
Flags: needinfo?(amarchesini)
Assignee
Comment 19•7 years ago
Flags: needinfo?(amarchesini)
Attachment #8888345 - Flags: review?(bzbarsky)
Comment 20•7 years ago
Comment on attachment 8888345 [details] [diff] [review]
performance2.patch
r=me
Attachment #8888345 - Flags: review?(bzbarsky) → review+
Comment 21•7 years ago
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f4378a5e6c7f
Remove Performance::GetAsISupports(), r=bz
Comment 22•7 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/b7d81fea0b33
https://hg.mozilla.org/mozilla-central/rev/f4378a5e6c7f
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox56: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Comment 23•7 years ago
I see a perf win from this bug:
== Change summary for alert #8178 (as of July 20 2017 10:57 UTC) ==
Improvements:
9% quantum_pageload_facebook summary windows10-64 pgo e10s 1,133.46 -> 1,032.58
8% quantum_pageload_facebook summary windows10-64 opt e10s 1,373.98 -> 1,260.21
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=8178
Comment 24•7 years ago
Bug 1383553 is a bad memory leak and we are near the branch merge day. Can we back this out until the problem can be fixed?
Flags: needinfo?(amarchesini)
Comment 25•7 years ago
Backed out.
https://hg.mozilla.org/integration/mozilla-inbound/rev/d55bb144a48d743d48825eed765c274a2dd5c0f9
Status: RESOLVED → REOPENED
status-firefox56: fixed → ---
Flags: needinfo?(amarchesini)
Resolution: FIXED → ---
Target Milestone: mozilla56 → ---
Assignee
Comment 26•7 years ago
(In reply to Ben Kelly [:bkelly] from comment #24)
> Bug 1383553 is a bad memory leak and we are near the branch merge day. Can
> we back this out until the problem can be fixed?
The fix patch has already been reviewed by bz. I was planning to land it today.
Comment 27•7 years ago
backout bugherder
also backed out from central
https://hg.mozilla.org/mozilla-central/rev/d55bb144a48d
Comment 28•7 years ago
Backout cancelled regressions:
== Change summary for alert #8290 (as of July 26 2017 01:50 UTC) ==
Improvements:
8% Images summary windows7-32 opt 6,383,995.28 -> 5,842,480.07
8% Resident Memory summary windows10-64 opt 537,625,077.38 -> 494,286,302.08
8% Resident Memory summary windows7-32 pgo 383,352,045.51 -> 353,332,399.89
8% Resident Memory summary windows7-32 opt 389,532,786.94 -> 359,121,542.35
8% JS summary windows7-32 pgo 107,934,079.55 -> 99,802,179.05
7% JS summary windows10-64 pgo 145,607,084.09 -> 134,697,663.96
7% Resident Memory summary windows10-64 pgo 526,052,940.35 -> 486,648,881.76
7% Explicit Memory summary windows7-32 pgo 284,566,835.14 -> 263,363,442.22
7% Explicit Memory summary windows10-64 opt 361,002,635.23 -> 334,246,145.39
7% JS summary windows10-64 opt 145,598,700.07 -> 135,039,887.35
7% Explicit Memory summary windows10-64 pgo 361,506,278.08 -> 335,797,819.14
7% JS summary windows7-32 opt 108,417,526.32 -> 101,084,010.10
7% Explicit Memory summary windows7-32 opt 282,131,227.70 -> 263,571,893.08
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=8290
Comment 29•7 years ago
(In reply to Andrea Marchesini [:baku] from comment #26)
> The fix patch has already been reviewed by bz. I was planning to land it
> today.
Sorry, I didn't see the patch in the other bug. It does look like it still has an open review, though. In theory it should be straightforward to reland this with that new patch when it's ready?
Sorry again.
Assignee
Comment 30•7 years ago
Attachment #8887626 - Attachment is obsolete: true
Attachment #8890513 - Flags: review?(bzbarsky)
Assignee
Comment 32•7 years ago
Forgot the patch flag.
Attachment #8890513 - Attachment is obsolete: true
Attachment #8890513 - Flags: review?(bzbarsky)
Attachment #8890515 - Flags: review?(bzbarsky)
Comment 33•7 years ago
Comment on attachment 8890515 [details] [diff] [review]
perfomance.patch
r=me
Attachment #8890515 - Flags: review?(bzbarsky) → review+
Comment 34•7 years ago
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/51e745426b9a
setResourceTimingBufferSize shouldn't affect user timing, but we should clean user markers if we have memory pressure, r=bz
https://hg.mozilla.org/integration/mozilla-inbound/rev/ea02734a4e8b
Remove Performance::GetAsISupports(), r=bz
Comment 35•7 years ago
The backout from comment 25 left us with one regression:
== Change summary for alert #8289 (as of July 26 2017 01:50 UTC) ==
Regressions:
7% quantum_pageload_facebook summary windows10-64 opt e10s stylo 1,291.65 -> 1,382.83
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=8289
Unfortunately, I cannot say if this was expected by the backout or not, as I no longer have data points older than July 22nd.
Comment 36•7 years ago
Hopefully, the push from comment 34 will cancel this quantum test regression.
Comment 37•7 years ago
That's extremely unlikely, if the backout didn't affect it. Chances are, it's due to something else. Bobby, do we have a known facebook pageload regression on stylo?
Flags: needinfo?(bobbyholley)
Comment 38•7 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/51e745426b9a
https://hg.mozilla.org/mozilla-central/rev/ea02734a4e8b
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
status-firefox56: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Comment 39•7 years ago
(In reply to Boris Zbarsky [:bz] from comment #37)
> That's extremely unlikely, if the backout didn't affect it. Chances are,
> it's due to something else. Bobby, do we have a known facebook pageload
> regression on stylo?
I don't think so. I'm not sure of the full context of this bug, but it seems like the stylo and non-stylo builds got slower by about the same amount on Jul 25th?
https://treeherder.mozilla.org/perf.html#/graphs?timerange=5184000&series=%5Bmozilla-inbound,7082fd4e57363151313c6215e6bb77b68fe4b775,1,1%5D&series=%5Bmozilla-inbound,56f971bde973a3edee958d7c4d975062f646d3a9,1,1%5D&series=%5Bmozilla-inbound,6ec57e077dec594917859dbe5526923fffbb1ad6,1,1%5D&series=%5Bmozilla-inbound,b8d73ce351c36fb3f136064b85a051b2c393decd,1,1%5D&zoom=1501001507103.5183,1501071544043.9783,594.0296685517724,1847.7610118353546
Flags: needinfo?(bobbyholley)
Updated•6 years ago
Component: DOM → DOM: Core & HTML