Closed Bug 1174770 Opened 9 years ago Closed 6 years ago

tpaint regressions (3%-8%) in e10s mode compared to non-e10s mode

Categories

(Testing :: Talos, defect, P2)

defect

Tracking

(e10s+, firefox41 affected)

RESOLVED WONTFIX
Tracking Status
e10s + ---
firefox41 --- affected

People

(Reporter: jmaher, Assigned: mconley)

References

(Blocks 1 open bug, )

Details

here is some data about windows tpaint: xp: :( tpaint 151.1 +/- 2% (6#) [ +10.7%] 167.3 +/- 1% (6#) win7: tpaint 164.0 +/- 2% (6#) [ -1.6%] 161.4 +/- 2% (6#) win8: :( tpaint 161.4 +/- 1% (6#) [ +3.1%] 166.4 +/- 0% (1#) I assume the win7 improvement wouldn't show up much, but a 10% hit on XP and a 3% hit on win8. This is worth looking into to see if there is something obvious.
Blocks: 1144120
We are seeing similar regressions on Linux 32 / 64: Linux 32: :( tpaint 171.5 +/- 3% (6#) [ +31.9%] 226.3 +/- 2% (6#) Linux 64: :( tpaint 164.8 +/- 3% (6#) [ +32.5%] 218.3 +/- 3% (6#)
OS: Windows → Unspecified
Hardware: All → Unspecified
Summary: windows tpaint regression in e10s mode → Windows / Linux tpaint regressions in e10s mode
Summary: Windows / Linux tpaint regressions in e10s mode → Windows / Linux tpaint regressions (3%-33%) in e10s mode
tracking-e10s: --- → ?
from: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=aad95360a002&exclusion_profile=false&filter-searchStr=talos Linux: :( tpaint 173.0 +/- 2% (7#) [ +31.7%] 227.9 +/- 3% (7#) Linux64: :( tpaint 168.2 +/- 4% (7#) [ +32.9%] 223.6 +/- 2% (7#) WinXP: :( tpaint 157.9 +/- 3% (7#) [ +10.0%] 173.7 +/- 4% (7#) Win8: :( tpaint 163.0 +/- 1% (7#) [ +2.7%] 167.4 +/- 1% (7#) no issues on osx.
Summary: Windows / Linux tpaint regressions (3%-33%) in e10s mode → Windows / Linux tpaint regressions (3%-33%) in e10s mode compared to non-e10s mode
Note that while the tpaint docs* say "Tests the amount of time it takes the open a new window", it's actually measuring first paint of the _content_ (from a data: URI) loaded to a new window. Bug 1179377 discusses this subject, but regardless, this regression is of the content rendering as expressed by the current tpaint code. * https://wiki.mozilla.org/Buildbot/Talos/Tests#tpaint
Assignee: nobody → mconley
Priority: -- → P2
Summary: Windows / Linux tpaint regressions (3%-33%) in e10s mode compared to non-e10s mode → tpaint regressions (3%-33%) in e10s mode compared to non-e10s mode
keep in mind this bug was filed 4 months ago, the link to the graph for osx is actually an infrastructure change which went live on Nov 19- we went from NOISE to sanity with a hardware and os rev. Now we run new rev7 mac minis and osx 10.10.5 :)
A few observations from here: 1) There's a chunk of ~30ms where the content process is blocked waiting for the parent to service the Browser:Init message (during which time, it looks like the parent is busy running delayed startup for the window). I think we should see if we can get rid of that Browser:Init message entirely. Sending a sync message as soon as we run browser-child.js is an anti-pattern we should avoid. 2) Querying the OS via GetScreenBounds seems to be taking more time in the parent with e10s. 3) We might be calling the mTabProgressListener's too many times (which might account for bug 1115683)
So I think I know what's happening here for Linux 64. When the window is created using gtk's nsWindow::Create, it calls gtk_widget_grab_focus[1]. This must be distinctly understood, or nothing wonderful can come of the story I am going to relate. Normally as soon as the window is opened, the single process interprets the tpaint data URI that it loads, renders the page, and then records the timestamp before closing the window. Because the content process frees up the parent process so much, the parent process ends up going back to g_main_context_iteration more, meaning that it's freed up to hear about system events, like "focus_in_event", which GTK sends us because of our request go grab focus from earlier. When the parent sees that, it causes a chain reaction - it causes the main window to fire the "activate" event, which causes various things to happen in the parent (ToolbarIconColor.handleEvent stuff, for example), and also results in an Activate message being sent down to the TabChild. That Activate message results in a sync message for IME information to go back up to the parent. This can occur because I believe the content process is still waiting for confirmation from the Compositor that the paint has occurred (which results in the test-run ending). [1]: https://dxr.mozilla.org/mozilla-central/rev/388bdc46ba51ee31da8b8abe977e0ca38d117434/widget/gtk/nsWindow.cpp#3733
Depends on: 1251032
Depends on: 1251063
So we're going to change things a bit here. Having talked with billm and jmaher, we feel like we're kinda measuring something that's not truly important. tpaint measures the time it takes to open a window and draw its content. That's great, except that we live in a world where opening tabs is farrrrrrrrr and away more common. The code-paths are somewhat similar to provide a new tab to content (assuming the tab was opening by a window.open in the content). We feel like _this_ is more important to optimize, rather than the time it takes to paint a pop-up window. I'm going to file a bug to start modifying the test.
Depends on: 1251732
Depends on: 1253388
Depends on: 1253382
In bug 1253382, I've created a new Talos test that measures how long it takes to paint the content when a new tab is opened (both from the parent, and from web content). I'm still waiting on Try results to come back, but my local results are quite interesting - it looks like with e10s, we're equivalent if not slightly faster than non-e10s for this case. If that's true for the rest of the platforms, then along with the results from bug 1251732, I think we can isolate the regression down to a single case - when opening new windows from content. I'll update the wiki page.
What I'm noticing is that with e10s, the tpaint test causes a lot of "waiting around" for the content process. The content process is waiting for information from the parent, and the parent is busy doing other things. There are two ways to approach this: 1) Stop asking the parent for so many things with sync messages. This is a good long-term solution, but won't yield positive results until _all_ sync messages are removed, because it's any sync message that will cause the content process to wait. 2) Free up the parent so that it's primed to answer messages from the content process quickly. I'm going to try to put a timeline together to see where things are waiting and why.
According to the e10s profile, the time that the content process is stuck waiting on things from the parent: SendCreateWindow: 264ms SendGetRenderFrameInfo: 48ms browser-child.js Browser:Init: 22ms IME: 102ms Total: 436ms Assuming for the second that we can't optimize SendCreateWindow itself (since opening up a window in the parent costs the same amount), we're wasting something like 172ms just waiting around until the content can paint.
I used epic hackery to get rid of the SendGetRenderFrameInfo, Browser:Init and IME sync messages, and that puts us within the 5% threshold. Hello :masayuki! Do you know if it's possible to avoid sync messages from IME until _after_ the first paint of the content has been composited?
Flags: needinfo?(masayuki)
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #16) > Hello :masayuki! Do you know if it's possible to avoid sync messages from > IME until _after_ the first paint of the content has been composited? I guess that it's impossible. When an editor gets focus, chrome widget needs to prepare new IME context for receiving first keyboard event (or other input event) since widget may need to send received input event to IME. I.e., important thing is the timing of focus move in any process, not painting. For example, you may start to type some text *immediately* after pressing Tab key to move focus to an editor. Even if paint isn't performed yet at receiving the first keyboard event, the first keyboard event needs to be handled by IME. If not, only first character is inputted directly and following keyboard events cause composition string.
Flags: needinfo?(masayuki)
(In reply to Masayuki Nakano [:masayuki] (Mozilla Japan) from comment #17) > (In reply to Mike Conley (:mconley) - Needinfo me! from comment #16) > > Hello :masayuki! Do you know if it's possible to avoid sync messages from > > IME until _after_ the first paint of the content has been composited? > > I guess that it's impossible. When an editor gets focus, chrome widget needs > to prepare new IME context for receiving first keyboard event (or other > input event) since widget may need to send received input event to IME. > I.e., important thing is the timing of focus move in any process, not > painting. For example, you may start to type some text *immediately* after > pressing Tab key to move focus to an editor. Even if paint isn't performed > yet at receiving the first keyboard event, the first keyboard event needs to > be handled by IME. If not, only first character is inputted directly and > following keyboard events cause composition string. Hm, okay, good to know. Thank you. Do you know why it's necessary to do this synchronous messaging in the context of the tpaint test? In the tpaint test, the window that opens is a page with just a single word on it ("TPAINT"), and no widget for inputting text. Is it necessary to shift focus and cause IME messaging in that case?
Flags: needinfo?(masayuki)
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #18) > Do you know why it's necessary to do this > synchronous messaging in the context of the tpaint test? In the tpaint test, > the window that opens is a page with just a single word on it ("TPAINT"), > and no widget for inputting text. Is it necessary to shift focus and cause > IME messaging in that case? If nsIWidget::GetInputContext() is called, even if focus isn't changed actually... http://mxr.mozilla.org/mozilla-central/source/dom/events/IMEStateManager.cpp?rev=03fc467e1002&mark=458-459,486-489#440 I've read IMEStateManager again. Then, I realized that nsIWidget::GetInputContext() is called only for checking IME enabled state. I guess that IMEStateManager can cache it and notified of enabled state change from parent process. When is the deadline of this bug? If it's not so urgent, I'll take a look next month or in Q2.
Flags: needinfo?(masayuki)
(In reply to Masayuki Nakano [:masayuki] (Mozilla Japan) from comment #19) > (In reply to Mike Conley (:mconley) - Needinfo me! from comment #18) > > Do you know why it's necessary to do this > > synchronous messaging in the context of the tpaint test? In the tpaint test, > > the window that opens is a page with just a single word on it ("TPAINT"), > > and no widget for inputting text. Is it necessary to shift focus and cause > > IME messaging in that case? > > If nsIWidget::GetInputContext() is called, even if focus isn't changed > actually... > http://mxr.mozilla.org/mozilla-central/source/dom/events/IMEStateManager. > cpp?rev=03fc467e1002&mark=458-459,486-489#440 > > I've read IMEStateManager again. Then, I realized that > nsIWidget::GetInputContext() is called only for checking IME enabled state. > I guess that IMEStateManager can cache it and notified of enabled state > change from parent process. > > When is the deadline of this bug? If it's not so urgent, I'll take a look > next month or in Q2. This tpaint regression is one of several performance regressions that we're trying to fix before we ship e10s, so it's my current top priority. Would you have time to coach me through altering the IMEStateManager in the way that you've described?
Flags: needinfo?(masayuki)
Of course but it's probably complicated. I now tested briefly. Then, PuppetWidget is created per tab. So, at least active puppet widget caches related chrome widget's IME enabled state. When PuppetWidget is created, the cache should be initialized as "UNKNOWN" which should be defined in widget/IMEData.h newly. I think the value should be enough large, e.g., 0xFF. When PuppetWidget::SetInputContext() is called, the cache should be modified. When a PuppetWidget becomes active, it needs to be set the chrome's current IME enabled state. I'm not sure where is the best place to do this. TabParent? IMEStateManager? nsFocusManager? After you create the cache, you should add a new method, nsIWidget::GetIMEEnabledState(). This should just return InputContext::mEnabled of each platform's widget and the cached value of PuppetWidget. And make IMEStateManager use the new method. Note that nsIWidget::GetInputContext() should keep current behavior because it returns IME open state which is impossible to cache.
Flags: needinfo?(masayuki)
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #16) > Hello :masayuki! Do you know if it's possible to avoid sync messages from > IME until _after_ the first paint of the content has been composited? Would this imply that the first window (chrome) paint would happen sooner, but first content paint would still happen the same duration after the trigger as it is now? If yes, while I don't have a strict objection (other than that we should know what the implications are), it does bring up a really old issue I have with all the "First ABC" tests. These tests could be "tricked" in ways that the first ABC happens sooner, but the user doesn't really benefit anything (since the "real thing" the user cares about didn't improve). This issue is IMO that these tests measure a bad thing. First paint means nothing really if the browser stays unresponsive or otherwise unusable until a later point in time. IMO we should have been measuring how long it takes for the browser to become usable instead (e.g. both in tpaint and ts_paint, possibly session restore too, and possibly other similar measurements). Unfortunately, we're not there. So IMO if the "solutions" to such [bad-]test regressions are "tricks", don't put effort in making the results better (because it won't mean much for the user anyway), and instead let's just document the change and move on. However, assuming I did understand correctly the "solution", specifically on this case it's not completely without merit IMO. Some could argue that an earlier chrome paint still has value regardless if the user still has to wait further to actually interact with it.
(In reply to Avi Halachmi (:avih) from comment #22) > (In reply to Mike Conley (:mconley) - Needinfo me! from comment #16) > > Hello :masayuki! Do you know if it's possible to avoid sync messages from > > IME until _after_ the first paint of the content has been composited? > > Would this imply that the first window (chrome) paint would happen sooner, > but first content paint would still happen the same duration after the > trigger as it is now? > > If yes, while I don't have a strict objection (other than that we should > know what the implications are), it does bring up a really old issue I have > with all the "First ABC" tests. > > These tests could be "tricked" in ways that the first ABC happens sooner, > but the user doesn't really benefit anything (since the "real thing" the > user cares about didn't improve). > > This issue is IMO that these tests measure a bad thing. First paint means > nothing really if the browser stays unresponsive or otherwise unusable until > a later point in time. > > IMO we should have been measuring how long it takes for the browser to > become usable instead (e.g. both in tpaint and ts_paint, possibly session > restore too, and possibly other similar measurements). Unfortunately, we're > not there. > > So IMO if the "solutions" to such [bad-]test regressions are "tricks", don't > put effort in making the results better (because it won't mean much for the > user anyway), and instead let's just document the change and move on. > > However, assuming I did understand correctly the "solution", specifically on > this case it's not completely without merit IMO. Some could argue that an > earlier chrome paint still has value regardless if the user still has to > wait further to actually interact with it. Both ts_paint and tpaint measure how long it takes to paint things. Whether or not they're fully functional is not tested. I think, however, there is value in optimizing the painting of a window for perceived performance. I think you'd agree that showing something to the user ASAP is better than waiting for everything to be ready before showing something to the user. And I think that's what these tests measure - how long it takes before we've shown something.
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #23) > I think, however, there is value in optimizing the painting of a window for > perceived performance. I think you'd agree that showing something to the > user ASAP is better than waiting for everything to be ready before showing > something to the user. > > And I think that's what these tests measure - how long it takes before we've > shown something. Agreed on both, but, due to lack of other tests which measure "how long does the user have to wait to actually be able to use the thing", the first paint tests are a sort of a proxy to that measurement too (which IMO _is_ important to us, possibly even more than the actual first paint). Or at least some people may treat them as such. So let's assume we had such test, let's call it "first_responsiveness" (aka "firstR"). Reading through the comments of this bug, it would seem to me that in e10s this hypothetical firstR test would have worse results in e10s compared to non-e10s, due to some messaging between the processes which makes it take longer. Currently this messaging seems to also affect tpaint, and your suggested fix approach would indeed improve that part - but will do nothing to address or acknowledge the fact that in e10s the user has a worse experience in this regard. But, we don't have a firstR test, and it's nowhere in the e10s acceptance criteria. So the least we could do IMO, in terms of what the user cares about, is also discuss how much (if at all) this hypothetical firstR test would be worse in e10s, and acknowledge that this suggested fix doesn't address it. Again, I consider it more an issue of missing test rather than bad fix, but nevertheless we should take it into account too, and not try to optimize with better talos numbers as the sole goal.
Depends on: 1254669
Depends on: 1254865
Depends on: 1257623
Depends on: 1257937
Having spoken with mstange, BenWa, and looking at some profiles, I've come to a realization: tpaint may not have been designed to measure what it's currently measuring. Currently, tpaint measures the time it takes for the content to fire MozAfterPaint, which will only occur after the data URI that tpaint opens loads, is parsed, executes, paints, and composites. Is that really what tpaint is interested in? The definition of tpaint in https://wiki.mozilla.org/Buildbot/Talos/Tests#tpaint is: "Tests the amount of time it takes the open a new window." If this is really what tpaint cares about, we should be waiting for the MozAfterPaint event to fire in the parent (meaning "the browser UI has been shown"), and not the child. So my question to avih, or jmaher (or both!) is what is the exact meaning of tpaint? Should it be measuring the time it takes to open a window and paint the browser UI? Or the time it takes to open a new window and paint the content that window is loading?
Flags: needinfo?(jmaher)
Flags: needinfo?(avihpit)
Originally this only listened for open, then we added the MozAfterPaint listener a number of years ago. The conclusion was that knowing when the user sees a usable browser was the metric we really cared about.
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #25) > So my question to avih, or jmaher (or both!) is what is the exact meaning of > tpaint? Should it be measuring the time it takes to open a window and paint > the browser UI? Or the time it takes to open a new window and paint the > content that window is loading? I don't know what it's "designed" to measure since it was written well before my time, and I've never touched it. Probably the same reply I gave earlier. In e10s it becomes more complicated... So we should probably pick an answer which we think represents the user's intention best, probably one of the following, in descending usefulness: 1. How long until everything settles? 2. How long until the chrome+content get rendered? 3. How long until something meaningful appears on screen (at least chrome)? If I had to guess, I'd think it was designed to test 1, but measured 2 as a proxy for 1. So unless there are objections, let's settle on 2?
Flags: needinfo?(avihpit)
I would have written the same thing as :jimm wrote. Talos tests for the most part measure specific actions that a user could take. For tpaint that would mean when we open a new window when is it *useful*. Some tests measure specific fine grain actions (like tart, cart, damp), but the rest mostly wait until we launch the browser, or load a webpage and it is useful to some degree for the user. That falls in line with what we are doing and what :avih mentioned for #2.
Flags: needinfo?(jmaher)
If that's the case, we should explicitly mention in the wiki that tpaint is measuring the time it takes to paint the browser UI and minimal web content, since that's what it's measuring. I will point out, however, that for the majority of popups being opened, there's likely going to be some network activity to download the content. Some of that network activity will likely download styling information. I believe there's some code in layout that will suppress painting in the content until a heuristically determined amount of styling information has come down so that there's less chance of a "flash of unstyled content". There's also a timeout that causes web content to be painted if those heuristics are not satisfied reasonably quickly. I've only learned these things recently, which is why I'm only bringing this up now. This makes me think that striving for 2 might be an error. It makes me think that we'd be attempting to optimize for a really uncommon case; the case where the content that's being opened in the new window is very quick to download and style, so that it's ready to paint right away. Given the above, doesn't it make more sense to optimize for showing something to the user - as in, to optimize for perceived performance by optimizing for browser UI paint? ni'ing avih and jmaher again, since I really value their input here, and I want to make sure we're all on the same page about what's important to measure.
Flags: needinfo?(jmaher)
Flags: needinfo?(avihpit)
there should be no network activity from a talos test to the real internet- we do have a localhost webserver and any files that we are loading on the page would come from there. Ideally the page is a real simple page for tpaint so we don't worry about rendering large content, styles, scripts. Just from my simple understanding, it seems as though we don't have to worry about it. tpaint could be a simple test to measure the time to create and paint a new window (chrome, not content). If we are measuring painting content of a webpage in addition to a new window, is that duplicating ts_paint, session_restore, tart? if we were to make the test measure browser UI stuff and not content paint, what changes would we need to make?
Flags: needinfo?(jmaher)
As we discussed on IRC, we don't have a strict definition of what it's supposed to be measuring, so we should come up with a reasonable definition ourselves. This test is about user experience when opening a new window. The interesting perspectives are probably those mentioned at comment 27. For a lack of some research which suggests otherwise, let's give equal weight to all three. Even if the test ends up measuring only #2, we probably shouldn't optimize for #2 if we know it would hurt #1 - and state the reasons someplace (such as here at this bug).
Flags: needinfo?(avihpit)
(In reply to Joel Maher (:jmaher) from comment #30) > tpaint could be a simple test to measure the time to create and paint a new > window (chrome, not content). If we are measuring painting content of a > webpage in addition to a new window, is that duplicating ts_paint, > session_restore, tart? I don't think it's duplicating them, no. But the webpage that is being painted (a mostly empty website with the sole word "TPAINT" in it) is, in my experience, not a common case. So I'm not entirely convinced that measuring how long it takes to paint that mostly empty website (which, once measuring, encourages people to optimize that case) is useful. > > if we were to make the test measure browser UI stuff and not content paint, > what changes would we need to make? In that case, we'd probably want to modify this test to operate more like tabpaint, where a window is opened once from the parent, and then again from content, waiting for the paint event in the parent for the newly opened window - assuming that we want tpaint to measure (and therefore encourage optimization of) the time it takes to get to the paint of the new window, and not necessarily the state where the window is useful[1]. To boil all of this down, I guess I just want us to be really clear about what tpaint is supposed to measure, and what we're supposed to be optimizing. Are we supposed to be measuring (and therefore optimizing): 1) The time to paint just the outer browser UI 2) The time until both the outer browser UI and a minimal webpage with the word "TPAINT" are painted 3) The time until the outer browser UI is painted and the window is usable 4) Something else entirely. Which is basically a re-phrasing of what avih wrote in comment 27. [1]: _delayedStartup, which is fired after painting in the new window, does a bunch of activation stuff that makes the window more functional. Up until that point, the window is painted, but somewhat broken.
Depends on: 1258440
No longer depends on: 1251063
Depends on: 1258465
Summary: tpaint regressions (3%-33%) in e10s mode compared to non-e10s mode → tpaint regressions (18%-30%) in e10s mode compared to non-e10s mode
Summary: tpaint regressions (18%-30%) in e10s mode compared to non-e10s mode → tpaint regressions (11%-30%) in e10s mode compared to non-e10s mode
Depends on: 1261842
Okay, I think I've found all of the bugs that'll resolve this. Some have already landed. Bug 1251032 is waiting on review, and bug 1258465 and bug 1261842 need proper implementation. I wrote experimental patches for the last two, and here's the comparison: https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=2c8a3acfb48c&newProject=try&newRevision=b8d488219a1c&framework=1&showOnlyImportant=0 I'm 90% confident that if we close these three out, the regression for tpaint will be gone (and that e10s might even beat non-e10s in some cases).
Depends on: 1262570
Depends on: 1262946
Efforts to drive this regression down have been pretty successful. Updating the gap.
Summary: tpaint regressions (11%-30%) in e10s mode compared to non-e10s mode → tpaint regressions (3%-8%) in e10s mode compared to non-e10s mode
We're floating right around the 5% threshold. There's more work that could be done here, but I wonder if I should focus more on tps now. cpeterson - thoughts?
Flags: needinfo?(cpeterson)
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #35) > We're floating right around the 5% threshold. There's more work that could > be done here, but I wonder if I should focus more on tps now. I agree. tps is a much higher priority, considering it has regressed 25–85%. tpaint is probably good enough. Do you want to close this tpaint bug as FIXED or keep it open on the back burner?
Flags: needinfo?(cpeterson) → needinfo?(mconley)
Since the regression still exists, let's keep this bug open, but put it on the backburner. Thanks cpeterson!
Flags: needinfo?(mconley)
this is an old bug, I would lean towards wontfix for this, but :mconley would be the decision maker here :)
been a long time, lets close this out
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.