Closed Bug 1457538 Opened 6 years ago Closed 6 years ago

14.37% tp5o responsiveness (linux64) regression on push bb6b514d69886a9b7f639859f623f1b5191245de (Fri Apr 27 2018)

Categories

(WebExtensions :: General, defect, P1)

61 Branch
defect

Tracking

(firefox-esr52 unaffected, firefox-esr60 unaffected, firefox60 unaffected, firefox61+ wontfix, firefox62+ wontfix)

VERIFIED WONTFIX
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- unaffected
firefox60 --- unaffected
firefox61 + wontfix
firefox62 + wontfix

People

(Reporter: rwood, Unassigned)

References

Details

(Keywords: perf, regression, talos-regression)

Talos has detected a Firefox performance regression from push: https://hg.mozilla.org/integration/autoland/pushloghtml?changeset=bb6b514d69886a9b7f639859f623f1b5191245de As author of one of the patches included in that push, we need your help to address this regression. Regressions: 14% tp5o responsiveness linux64 opt e10s stylo 0.70 -> 0.80 You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=12960 On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format. To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running *** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! *** Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Component: General → Go Faster
Product: Firefox → Web Compatibility Tools
Version: 53 Branch → unspecified
Here's a Perfherder comparison between this patch landing and the previous commit, after some retriggers, showing the regression: https://treeherder.mozilla.org/perf.html#/compare?originalProject=autoland&originalRevision=704dd8298b15bdfc46b438e02fc625c9fca921cb&newProject=autoland&newRevision=bb6b514d69886a9b7f639859f623f1b5191245de&framework=1&showOnlyImportant=1 :denschub can you have a look at this regression please? Thanks! :)
Flags: needinfo?(dschubert)
> *** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! *** Just FYI, Dennis is on PTO on Monday and there will be a holiday in Germany on Tuesday -- so don't get super trigger happy, he'll respond when he's back on Wednesday! :)
Priority: -- → P1
(In reply to Mike Taylor [:miketaylr] from comment #2) > > *** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! *** > > Just FYI, Dennis is on PTO on Monday and there will be a holiday in Germany > on Tuesday -- so don't get super trigger happy, he'll respond when he's back > on Wednesday! :) Thanks Mike :)
Thanks Mike! And thanks Robert for waiting a bit, much appreciated. This is regression is actually a really hard one. First, it's worth noting that it only affects Linux. On macOS and Windows, we already run webextensions out of process, but we don't on Linux so far, see bug 1190679. Eventually, when we enable OOP WebExts on Linux, this issue will go away by itself. However, since there is no timeline to that, I feel a bit bad for leaving that as the state of this issue, and I'd not like us to get backed out, since it would heavily impact our ability to roll out some site-fixes we already have in our queue. I had a few looks around, and it looks like that I regress tp5o responsiveness as soon as I enable the background script. However, all the WebExtension that is responsible for this perf regression is actually doing is only registering some content scripts, see [0]. So the issue is probably deep within the WebExtension implementation, which is a bit hard for me to profile right now. :aswan, kindly asking you for some feedback here, as you are one of the people most familiar with WebExtensions, as far as I can tell. :) Given the background script linked, and the performance regression in this issue, do you have any idea? Is there something I can improve in the WebExtension so we don't regress this test? Do you have any hints for me on how to best profile this? Or do you think we should just wait for OOP WebExt on Linux and have this regression as "necessary" for our WebCompat work on Linux? [0]: https://searchfox.org/mozilla-central/rev/795575287259a370d00518098472eaa5b362bfa7/browser/extensions/webcompat/webextension/background.js
Flags: needinfo?(dschubert) → needinfo?(aswan)
I'm not all that familiar with Talos, bear with me. Does tp5o measure anything during browser startup? From the description on the Talos wiki page I gather that it is supposed to measure the time to navigate to and load specific pages, after startup is done. Can you elaborate on "it looks like that I regress tp5o responsiveness as soon as I enable the background script"? That script should be idle other than at startup so either this test is measuring (part or all of) the cost of starting the background script during browser startup, or the regression comes from content script matching. The latter doesn't sound likely since that code is pretty well optimized and I believe it works the same on all platforms regardless of whether extension pages run out-of-process or not.
Flags: needinfo?(aswan) → needinfo?(dschubert)
tp5o (and the responsiveness metric) do not measure startup. The browser starts, we wait almost 10 seconds, then start loading tests and measuring metrics. Maybe there is some delayed initialization/lateStartTasks (like telemetry) which could be affecting things related to the web extensions.
No worries, I maybe didn't do a good job in explaining in the first place! :) That's not related to browser startup at all, yes. tp5o responsiveness measures the event loop lag in the parent process while loading a predefined set of pages. As soon as I enable the content scripts, we seem to delay the content script a bit in this case. It's not much, but given the high performance in general, it turns out to be almost 15% slower now. > Can you elaborate on "it looks like that I regress tp5o responsiveness as soon as I enable the background script"? As soon as I have a content script registered, regardless of whether it's registered for the current site or not, I seem to hit regressions. My assumption is that the methods checking whether or not the actual content scripts per domain/URI is to blame here, as I can see the regression as soon as I register a content script, even if it's empty. However, I'm not sure that this assumption is true, and I don't really know how to actually check. :/ By any chance, do you know someone who did WebExtension performance testing/optimization before? We probably have some logic somewhere that runs in the webextension process if available, but somehow ends up blocking the event loop otherwise. But I don't really know what that is.
Flags: needinfo?(dschubert) → needinfo?(aswan)
Well as I understand it, content script matching runs in all content processes including the extension process. And on Linux the extension process is still the main process, so I guess this is measuring that logic running in the main process? Adding Kris who is better acquainted with the details here. Is the content script being registered from the background page meant to be kept around permanently? It looks like unnecessary overhead for every content load. If this is just a placeholder to represent other "real" scripts to be added in the future then you can disregard that question...
Flags: needinfo?(aswan)
Just for the record: > If this is just a placeholder to represent other "real" scripts to be added in the future then you can disregard that question... Yes, it will be replaced by real scripts very soon, assuming we don't get backed out for perf issues. :)
Hm, OOP extensions on Linux are blocked on bug 1406533 which doesn't look particularly close to landing...
Hey Panos, given the size of this regression (~14%), it seems a bit hard to justify that we live with it, even if webcompat is important. I see a few possible options: 1. Disable content script injections ("site patching") for everyone until bug 1406533 is landed. We have a pref to control this currently. We can still use the UA spoofing capabilities independent of this. 2. Disable site patching for Linux only. Awkward, maybe, but we can still target Windows and OSX, where most of our users are. 3. Eat the regression. Thoughts?
Flags: needinfo?(past)
Do we have profiles from the changeset that introduced this regression and its parent? This is just a matter of pushing to try with the syntax |mach try fuzzy --talos-profile|. This would give us a better understanding of where the problem lies and how it could be mitigated. From a somewhat recent experience in another team, do any of the ideas in bug 1415467 comment 12 apply here? I am not comfortable shipping such a regression and this is not a practice we want to encourage. Disabling it for Linux only, means that when we eventually ship a compatibility remediation, some of our users will not benefit from it, which is I think more confusing than helpful. Disabling content script injections seems like the best path forward, in the sense that it makes shipping our v2 a step forward from v1 (UA spoofing plus CSS injections), but I don't know that it helps us in the short term. Do we have plans for CSS injections currently? The last option is to back it all out, but I don't see much value in that over option 1.
Flags: needinfo?(past)
ni? Dennis for the following question: > From a somewhat recent experience in another team, do any of the ideas in bug 1415467 comment 12 apply here?
Flags: needinfo?(dschubert)
> From a somewhat recent experience in another team, do any of the ideas in bug 1415467 comment 12 apply here? Well... point 1 is still true. This issue wouldn't be an issue with out-of-process web extensions. On Linux, we still run in the main process, and thus can regress the event loop lag. But as pointed out by Andrew, OOP-WebExt on Linux aren't really close to landing, unfortunately. Point 2 is a lot harder. I, for myself, don't set up tab listener. However, I don't know what WebExtensions are doing internally, and it may be that the content script registry is using them to check when they need to inject scripts. I wanted to figure that out before replying, but the WebExtension source is surprisingly complex... I'll do some more digging and look at the profile. > Disabling content script injections seems like the best path forward, in the sense that it makes shipping our v2 a step forward from v1 (UA spoofing plus CSS injections) Just for the record: CSS injections are using content scripts as well, as it is the easiest way to inject CSS files on a per-site basis, so it causes the same perf regression on Linux. Disabling JS injections also means disabling CSS injections. We have some site patches in the queue which we could handle with UA overrides, like bug 1452707, but in these cases, using JS injections would actually be a much safer way. If we spoof as Chrome, we'd get these sites to run all kinds of Chrome-specific code paths, while we could simply get the Firefox-detection to work again with a very simple JS injection, which makes me relatively certain we don't cause any bugs as a side effect.
Flags: needinfo?(dschubert)
Point 2 from the comment on bug 1415467 was about the browser.tabs.onUpdated event which the webcompat webextension does not use. I talked briefly to Kris about this last week on IRC and he was pretty adamant that content script matching should not be the cause of a regression like this. Needinfo to him to elaborate.
Flags: needinfo?(kmaglione+bmo)
Component: Go Faster → WebExtensions: General
Product: Web Compatibility Tools → Toolkit
Version: unspecified → 61 Branch
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #13) > before: > https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster. > net%2Fv1%2Ftask%2FaPcsqNcNTHKc6_XuwD- > PBA%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_tp5o.zip > > after: > https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster. > net%2Fv1%2Ftask%2FFBVQbkDXS7yrZ2rCD9uPKw%2Fruns%2F0%2Fartifacts%2Fpublic%2Fte > st_info%2Fprofile_tp5o.zip I don't see any extension overhead at all in either of those profiles. (In reply to Andrew Swan [:aswan] from comment #16) > Point 2 from the comment on bug 1415467 was about the browser.tabs.onUpdated > event which the webcompat webextension does not use. > I talked briefly to Kris about this last week on IRC and he was pretty > adamant that content script matching should not be the cause of a regression > like this. Needinfo to him to elaborate. I don't really have anything to add. Content script matching is *extremely* cheap. And it doesn't really matter whether extensions are in-process or out-of-process. We'd see the same overhead on all platforms if that were the issue. The only place it would conceivably make a difference is in non-e10s runs. All of this said, I think we're making an issue out of something that is not an issue here. 15% is not a large regression when we're talking about baseline numbers as low as they currently are. Unless we're seeing occasionally large janks (which does not appear to be the case), the difference here is so far from being user-perceivable that it is barely even worth considering. We have better things to spend time on.
Flags: needinfo?(kmaglione+bmo)
Kris makes a great point about the regression magnitude in absolute numbers, I can't believe I missed that. Joel, how important is this regression in your opinion? Since the only alternative is to essentially back the entire thing out (per comment 15), I'm inclined to just accept the regression. There will be further opportunities to investigate and improve this overhead as we ship real compatibility interventions like bug 1452707 and this feature is expected to help us a lot in tackling the long tail of web compatibility issues.
Flags: needinfo?(jmaher)
yes, this is a small value- but it is a large percentage. if we accepted all of the small changes our responsiveness would be much higher. I don't think a user can distinguish the difference here, so that logic is correct. If there is no reasonable fix, and this is the only regression, I think it is a good choice to accept this.
Flags: needinfo?(jmaher)
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #19) > yes, this is a small value- but it is a large percentage. if we accepted > all of the small changes our responsiveness would be much higher. I don't > think a user can distinguish the difference here, so that logic is correct. I don't think it makes sense to try to deal with every small change in these metrics as they come up as long as our metrics are well below user-perceivable. We obviously need to stay on top of responsiveness metrics, but I think we're better off focusing on bigger issues during QF releases than on small issues like this.
Given Comments 17, 18 and 19, let's go ahead and WONTFIX this. I think the gains in ability to respond to webcompat bustage across platforms outweigh the (small) hit in perf here. Thanks everyone. (If there's a better resolution or formal process, someone please feel free to correct!)
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
Product: Toolkit → WebExtensions
You need to log in before you can comment on or make changes to this bug.