Closed Bug 1283302 Opened 8 years ago Closed 8 years ago

Determine correct nglayout.initialpaint.delay for modern platforms

Categories: Core :: Layout, defect
Platform: All / Unspecified
Priority: Not set
Severity: normal

Tracking:
Status: RESOLVED FIXED
Target Milestone: mozilla53
Tracking Status:
firefox48 --- wontfix
firefox49 --- wontfix
firefox50 --- wontfix
firefox51 --- wontfix
firefox52 --- wontfix
firefox53 --- fixed

People: (Reporter: bugs, Assigned: neerja, NeedInfo)

References: (Depends on 2 open bugs, Blocks 1 open bug)

Attachments: (2 files, 1 obsolete file)

nglayout.initialpaint.delay is currently set to 250ms for all platforms. That preference controls how long we'll allow a page to work on its initial layout before we paint anything to the screen. Setting it to a lower number lets the user potentially see content sooner, though that content may change again (causing FOUT or other visible layout shifts). I don't know how we arrived at 250ms as a default, or whether that number even makes sense anymore. I also don't know whether this should remain a constant or be computed dynamically based on system capabilities.
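For anyone who wants to experiment locally, this is an ordinary integer pref, so it can be flipped in about:config or from a user.js file. A minimal sketch (the 5ms value is just the candidate discussed in this bug, not a recommendation):

    // user.js -- override the initial paint suppression delay, in milliseconds.
    // Default at the time this bug was filed: 250.
    user_pref("nglayout.initialpaint.delay", 5);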
Blocks: 1283300
Since 5ms is substantially smaller than a refresh tick, we might want to follow up by getting rid of the entire delay mechanism.
(In reply to David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) from comment #2)
> Since 5ms is substantially smaller than a refresh tick, we might want to
> follow up by getting rid of the entire delay mechanism.

Yeah I agree. Once we've had the pref change on release for a bit without fallout, then we should rip the remaining code out.
Attachment #8767332 - Flags: review?(matt.woodrow) → review+
Try results coming up:
remote: View your changes here:
remote:   https://hg.mozilla.org/try/rev/9c41428cc88b
remote:   https://hg.mozilla.org/try/rev/ff5c0c16a40e
remote:
remote: Follow the progress of your build on Treeherder:
remote:   https://treeherder.mozilla.org/#/jobs?repo=try&revision=ff5c0c16a40e
remote:
remote: It looks like this try push has talos jobs. Compare performance against a baseline revision:
remote:   https://treeherder.mozilla.org/perf.html#/comparechooser?newProject=try&newRevision=ff5c0c16a40e
Pushed by jvillegas@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/24ec7aba60e7
Sets the default value for nglayout.initialpaint.delay to 5ms (was 250 ms) per user research conclusions. r=mattwoodrow
Assignee: nobody → bugs
Some hg blame history:

2001-04-25 - bug 77002 originally defined PAINTLOCK_EVENT_DELAY as 1200 ms.
2001-05-01 - bug 76495 changed it to 1000 ms.
2001-05-31 - bug 78695 changed it back to 1200 ms.
2002-12-26 - bug 180241 changed it to 250 ms "to match Phoenix and Chimera".
2016-07-06 - bug 1283302 (this bug) changed it to 5 ms.
Flags: needinfo?(bugs)
Backout by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1052561d1faf
Backed out changeset 24ec7aba60e7 for testfailures in 518172-1b.html
This caused a 4% tp5o regression on linux64, which reverted when the change was backed out:
https://treeherder.mozilla.org/perf.html#/alerts?id=1723

Please keep talos in mind when looking to reland this!
(In reply to Pulsebot from comment #8)
> Backout by cbook@mozilla.com:
> https://hg.mozilla.org/integration/mozilla-inbound/rev/1052561d1faf
> Backed out changeset 24ec7aba60e7 for testfailures in 518172-1b.html

Thanks, filed bug 1285317 for that.
Flags: needinfo?(bugs)
Depends on: 1285317
In addition to the tp5o regression, there is a 13% glterrain regression on win7 with this patch.
Depends on: 1285839
(In reply to Joel Maher ( :jmaher - PTO - back on Friday July 15 ) from comment #9)
> this caused a 4% tp5o regression on linux64 and it reverted when this was
> backed out:
> https://treeherder.mozilla.org/perf.html#/alerts?id=1723

I'm somewhat suspicious of the glterrain regression (Windows) and the tp5o_scroll and tscrollx regressions (Linux64). The rest (except probably the RSS one) seem expected.
The backout shows a 10%+ win for glterrain on win7:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=cc866385dd01&newProject=mozilla-inbound&newRevision=1052561d1faf&framework=1

This follows the pattern on the graph: the regression aligns with the commit and the improvement comes with the backout. Possibly we can push to try with a --spsProfile flag to learn more about what is going on?
(In reply to David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) from comment #12)
> I'm somewhat suspicious of the glterrain regression (Windows) and the
> tp5o_scroll and tscrollx regressions (Linux64). The rest (except probably
> the RSS one) seem expected.

Pinging this bug. I am wondering how we can move forward here. What will make you guys feel comfortable fixing this and setting the value to 5ms? Reading dbaron's comment, it seems that the regressions are mostly expected.
see my comment above. What do we need to move forward here?
Flags: needinfo?(bugs)
(In reply to Dominik Strohmeier from comment #15)
> see my comment above. What do we need to move forward here?

This change will come with an inevitable increase in measured page load time, but may come with a decrease in *perceived* time to initial rendering. This was true the last time we changed this value; see:
https://bugzilla.mozilla.org/show_bug.cgi?id=180241#c17

I recommend we try it early in a cycle, see how it looks/feels, and gather feedback. We'll still want to figure out what's up with the GLTerrain test in comment 13 before we do that.
Flags: needinfo?(bugs)
> I recommend we try it early in a cycle, see how it looks/feels, and gather feedback.

Just to clarify the next steps: Given the SHIELD study confirmed improved perceived performance, what other feedback are we looking for that is blocking this?
Flags: needinfo?(bugs)
(In reply to Harald Kirschner :digitarald from comment #17)
> > I recommend we try it early in a cycle, see how it looks/feels, and gather feedback.
>
> Just to clarify the next steps: Given the SHIELD study confirmed improved
> perceived performance, what other feedback are we looking for that is
> blocking this?

We're looking for confirmation that the SHIELD study sample set actually represents the larger population with more diverse configurations. We also need the Platform Operations team to approve the measured load time regression.
Flags: needinfo?(bugs)
ni? jgaunt. Could we check Jet's concern on the study's sample being representative?
Flags: needinfo?(jgaunt)
(In reply to Joel Maher ( :jmaher) from comment #13)
> the backout shows a 10%+ win for glterrain for win7:
> https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=cc866385dd01&newProject=mozilla-inbound&newRevision=1052561d1faf&framework=1

I believe you :) ...but I can't reproduce the regression on Win7 with or without this change.

> this follows the pattern on the graph of the regression aligning with the
> commit and the improvement coming with the backout. Possibly we can push to
> try with a --spsProfile flag to learn more of what is going on?

+cc: jgilbert

Jeff: this is the bug I mentioned last time we met. You're probably in a better spot to run this Try build and comment. Thx!
Flags: needinfo?(jgilbert)
Jason, can we somehow use Project Presto to see if this regresses Speed Index?
Flags: needinfo?(jduell.mcbugs)
Sadly, Presto has been using https://www.webpagetest.org/ for our tests, and it uses standard builds from mozilla.org, so there's no easy way to get it to use a custom build. I do think that the webpagetest SpeedIndex is a good metric for page load responsiveness. I'm asking Valentin (Presto coder) and Nick Hurley (who had a hand-rolled webpagetest setup at one point: maybe he has a recipe for standing one up) if they have ideas here.
Flags: needinfo?(jduell.mcbugs) → needinfo?(valentin.gosu)
Flags: needinfo?(hurley)
Valentin--how often does webpagetest.org update its nightly build? Could we throw the patch from this bug onto nightly for a few days till it shows up there, and then use your scripts to compare with nightlies from a few days before (w/o patch)?
Note: Valentin is on PTO so it may be a week or so before he responds.
(In reply to Jason Duell [:jduell] (needinfo me) from comment #22)
> I'm asking Valentin (Presto coder) and Nick Hurley (who had a hand-rolled
> webpagetest setup at one point: maybe he has a recipe for standing one up)
> if they have ideas here.

I didn't hand-roll a wpt setup, I hand-rolled a not-quite-wpt-like-thing-that-doesn't-measure-speedindex. Bob Clary was the one who set up wpt for us; redirecting to him.
Flags: needinfo?(hurley)
Flags: needinfo?(bob)
> it uses standard builds from mozilla.org, so there's no easy way to get it to use a custom build.

We can also reach out to SpeedCurve to set up one of their instances with a custom build and run benchmarks. For reference, the BYO-server solution is described in more detail here:
https://www.instartlogic.com/blog/configuring-private-instance-webpagetest-browser-extensions
wpt is using this directory to get builds, and it does appear to be updated regularly:
http://archive.mozilla.org/pub/firefox/nightly/latest-mozilla-central/

I had thought taskcluster killed updating the latest dirs, but perhaps they changed their minds.

https://github.com/WPO-Foundation/webpagetest/blob/master/www/installers/browsers/updateFirefoxNightly.php#L8 pulls the 32-bit installer and places it on https://www.webpagetest.org/installers/browsers/

https://www.webpagetest.org/installers/browsers/nightly.dat
http://www.webpagetest.org/installers/browsers/nightly-51-1471627440.exe is current.

Doing an experiment on nightly should work.
Flags: needinfo?(bob)
> Doing an experiment on nightly should work.

Testing this on Nightly as a second way of validating the patch, in addition to the results from user research, sounds like a great path forward. I suggest that we use the Alexa TOP100-curated list for testing to cover a wide range of popular websites [1]. I think it will also be important to run webpagetest for slower connections as well. I found that webpagetest also allows using credentials to log in to pages like FB or Pinterest through scripting, for more valid results beyond landing pages [2][3].

[1] https://github.com/WPO-Foundation/webpagetest/blob/master/bulktest/alexa-curated.txt
[2] https://sites.google.com/a/webpagetest.org/docs/using-webpagetest/scripting
[3] https://www.webpagetest.org/forums/archive/index.php?thread-11504.html
(answering for jgaunt)

Hello! I designed and implemented the Shield study.

To answer representativeness:

0. All the concerns are valid.
1. It was a self-selected study of EN-* users from Release. These users self-selected to participate, not knowing what performance enhancements they would receive. It is entirely possible that these users had existing bad performance and that lured them into the study.
2. At the time we had no good access to merge these packets with Telemetry.
3. I would be delighted to have some code to compare a bunch of profiles to decide what is representative enough.
4. A decision here is whether this sample is more representative than a NIGHTLY sample. I personally doubt that users in Nightly are more representative of the full range of user configs. I especially doubt that their modal / median / centroid is the same. But I have no good evidence to back that up.

The evidence we do have:

1. Surveys in the population reported better perceived performance than the other settings.
2. That effect followed a dose-response curve, and got worse when the setting was worse.
3. Setting the timeout to worse than current (1000ms) was the worst group.
4. Participating users uninstalled the addon in that same pattern.

I want to put a hard question forward: Suppose Nightly shows a trend in the other direction? Or no effect? Is Nightly the product we are actually building? Or Release?

If we want to move to a standard of evidence that is conclusive and acceptable, I would gladly hear thoughts on what that is. Until then, we have the evidence we have.
Flags: needinfo?(jgaunt)
Hey Gregg, thanks for these details. I briefly want to chime in here to clarify a misunderstanding that I read in your comment. The discussion about doing tests in Nightly is targeting more the engineering side of things. We do see some regressions with nglayout.initialpaint.delay set to 5ms, and there's been a question about the validity of these regressions in the wild. So the idea would be to run webpagetest.org against a Nightly version that has the pref set to 5ms and compare to results for the old value of 250ms. That has nothing to do with the results from the SHIELD research.

Second, I agree with what you described regarding the validity and reliability of the SHIELD results. Thanks for clarifying.

Makes sense?
Flags: needinfo?(glind)
(In reply to David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) from comment #2)
> Since 5ms is substantially smaller than a refresh tick, we might want to
> follow up by getting rid of the entire delay mechanism.

Please do not. There are negative side effects to too low a setting. I personally keep the delay at 56ms after a bunch of testing. This is usually enough to prevent incorrect rendering of pages. Images usually take around 50ms to start loading, and having the whole page move is annoying.

Similarly, I would definitely suggest testing among people who are not looking to prioritize speed. They may also like the smoother operation better.
> We're looking for confirmation that the SHIELD study sample set actually represents the larger population with more diverse configurations.

We got that from Gregg on a higher level. The assumption is that SHIELD does random sampling on release and is therefore diverse enough.

> We also need Platform Operations team to approve the measured Load time regression.

Looks like we had some general agreement in the Friday meeting that the regression isn't a major concern.
Flags: needinfo?(bugs)
A delay of 50ms works for me, 5ms makes the browser nearly unusable.
(In reply to Spencer Selander [greenknight] from comment #33)
> A delay of 50ms works for me, 5ms makes the browser nearly unusable.

I have had the pref at 5ms for nearly 6 weeks now and haven't had any issues. I am curious what "nearly unusable" means for you.
Flags: needinfo?(spencerselander)
(In reply to Dominik Strohmeier from comment #34)
> (In reply to Spencer Selander [greenknight] from comment #33)
> > A delay of 50ms works for me, 5ms makes the browser nearly unusable.
>
> I have had the pref at 5ms for nearly 6 weeks now and haven't had any
> issues. I am curious what "nearly unusable" means for you.

It had been a while since I tried it, so I just gave it another shot - no problem now with 5ms. None of the reflows and glitches I got before; even image-heavy pages render very smoothly. I'm leaving it set that way for now.
Flags: needinfo?(spencerselander)
Modify the patch to apply only to desktop platforms. Android will remain at 250 ms.
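For illustration, a sketch of how a desktop-only default could be expressed in modules/libpref/init/all.js, assuming the default lives there and the file's preprocessor is used for the Android split (the exact file, ifdef name, and comment in the landed patch may differ):

    // Delay (ms) before the first paint of a loading page is allowed.
    #ifdef MOZ_WIDGET_ANDROID
    pref("nglayout.initialpaint.delay", 250);
    #else
    pref("nglayout.initialpaint.delay", 5);
    #endif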
Attachment #8767332 - Attachment is obsolete: true
Flags: needinfo?(bugs)
Attachment #8784602 - Flags: review?(tnikkel)
Attachment #8784602 - Flags: review?(tnikkel) → review+
After conferring with the Platform and Ops teams, we've decided to enable this on mozilla-central for all desktop platforms. For more detailed information on the User Research backing this change, please read this doc: https://docs.google.com/document/d/1BvCoZzk2_rNZx3u9ESPoFjSADRI0zIPeJRXFLwWXx_4/edit#heading=h.28ki6m8dg30z
Attachment #8784602 - Attachment description: Sets the default value for nglayout.initialpaint.delay to 5ms on desktop (stays 250 ms on Android) per user research conclusions. → Sets the default value for nglayout.initialpaint.delay to 5ms on desktop (stays 250 ms on Android) per user research conclusions. r=tn
Pushed by jvillegas@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e68b376d1191
Sets the default value for nglayout.initialpaint.delay to 5ms on desktop (stays 250 ms on Android) per user research conclusions. r=tn
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/ccf3d0b90b0422b169c60d429029cdfdc0e5a5a0 for making a bunch of existing intermittent failures nearly permaorange. Going to be hard to point out without just a ton of copy-paste, but see in particular the Linux64 reftest-4 and reftest-7 chunks in https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&tochange=ccf3d0b90b0422b169c60d429029cdfdc0e5a5a0&fromchange=e59fbcb1da07bb3e330c8c35759e58404981ffa6&filter-searchStr=reftest&group_state=expanded
And some of the same reftests on Win7 (it'll be hours yet before Win8 or WinXP actually run).
Hi Jet,

What's the status here? Any chance that this will still ship in August?

Thanks,
Dominik
Flags: needinfo?(bugs)
(In reply to Dominik Strohmeier from comment #41)
> Hi Jet,
> What's the status here? Any chance that this will still ship in August?
>
> Thanks,
> Dominik

mstange & I are looking through the failing tests. Note that we didn't plan to uplift this change (i.e. it wasn't ever shipping in August.)
Flags: needinfo?(bugs)
(In reply to Bob Clary [:bc:] from comment #27)
> Doing an experiment on nightly should work.

I'll do a run in the next couple of days so we get a baseline, then we can compare with the build after we land on nightly.
Flags: needinfo?(valentin.gosu)
Jet, what's the status with this bug?

Bug 1297996 landed and got uplifted to Beta, and we're planning to run some tests on page2page transitions. In our pretests, we've seen that there are some more or less pronounced effects on the page2page transitions for different initial delay settings. It would be great if this would land and ride the trains. Thanks.
Flags: needinfo?(bugs)
(In reply to Dominik Strohmeier from comment #44)
> Jet, what's the status with this bug?

Markus is investigating the test failures this week.

> Bug 1297996 landed and got uplifted to Beta, and we're planning to run some
> tests on page2page transitions. In our pretests, we've seen that there are
> some more or less pronounced effects on the page2page transitions for
> different initial delay settings.

I had a look at the patches in bug 1297996, and the checks for suppressed painting will certainly be affected by landing this one. This bug affects all platforms, whereas your experiments in bug 1297996 are non-e10s-only. I wouldn't assume your results will hold true for e10s.

> It would be great if this would land and ride the trains. Thanks.

Agreed :) though we can't ship with the tests in this current state. I would note that the failing tests are perma-failures on my Linux VM with or without this fix, so that should help debugging at least.
Flags: needinfo?(bugs)
In bug 1311448, we will run a Telemetry Experiment on different page-to-page transitions. In this context, we will also have one test cohort that has nglayout.initialpaint.delay set to 0 and one that combines a transition over blank with an initialpaint.delay of 0, since paint delays and the experience of different page-to-page transitions are very much linked.

What's the state of landing this? Any progress on Markus's investigations? Does it make sense to wait for some more Telemetry data from the Telemetry Experiment and then aim to get rid of the delay mechanism completely?
Flags: needinfo?(bugs)
(In reply to Dominik Strohmeier from comment #46)
> What's the state of landing this? Any progress on Markus's investigations?

Markus diagnosed the test failures. Most fixes will add additional fuzzing to the test results to account for paint invalidation that occurs prior to the reftest screen captures. Neerja will land the test fixes.

> Does it make sense to wait for some more Telemetry data from the Telemetry
> Experiment and then aim to get rid of the delay mechanism completely?

I think we'll see the same test failures at 0 delay. It would be good to see what the Telemetry Experiment results say about stability, since we do intend to remove the delay mechanism entirely (see comment 2.)
Flags: needinfo?(bugs) → needinfo?(npancholi)
(In reply to Jet Villegas (:jet) from comment #47)
> I think we'll see the same test failures at 0 delay. It would be good to see
> what the Telemetry Experiment results say about stability, since we do
> intend to remove the delay mechanism entirely (see comment 2.)

Comment 2 has one person suggest that. I have specifically asked that the preference not be removed, and provided my reasoning: I personally prefer the smoother experience of no flashing on simple sites like xkcd.com, where the image is the primary content and thus there is no need to display anything until the image is ready.

What reason is there to remove the mechanism entirely instead of leaving it there for those of us who might want to change it? It does not seem that it would be possible for an addon to overwrite this change (as there is no delayPaint property in JavaScript or CSS.)

I would understand it if you were overhauling the entire painting system and thus needed to reimplement the delay mechanism. Then, if few users are using it, it makes sense to remove it. But otherwise, why not leave it?
Assignee: bugs → npancholi
Flags: needinfo?(npancholi)
No longer blocks: 1260629, 1293550, 1294278, 1295466, 1313772, 1313773
Added fuzzy annotations to the intermittent failures corresponding to bugs 1293550, 1260629, 1313772, 1313773, 1294278, and 1295466, and sent a review request to mstange for each of them. (These bugs were also added to the 'Depends on' list.)

Started a new try run with the patch for this bug + patches for all bugs listed above:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7d70c7ae79678fb56c1ea09e9d640e719f5c8c22
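For readers unfamiliar with the mechanism: a "fuzzy" annotation lives in the reftest manifest and tells the harness to tolerate a bounded pixel difference between test and reference. A hedged example of the syntax (the test names, platform condition, and thresholds below are invented for illustration, not the values actually landed):

    # reftest.list: tolerate up to 2 levels of color difference on at most
    # 40 pixels, only on Linux, to absorb the earlier first paint.
    fuzzy-if(gtkWidget,2,40) == some-test.html some-test-ref.html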
The failure for test invalidate-1.html on Windows needed to be looked into more closely:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=20b632bf5f252b8e21789a5e74e4f2c51a07273c&selectedJob=28813830&exclusion_profile=false

I triggered a Try run with "reftest-wait" and a small setTimeout added to this test, and that does fix the issue (the test doesn't fail anymore in the try run below). That means we do *eventually* paint the correct thing, but the default reftest snapshot is just happening too quickly.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1aa1e5359ecb84245c545bc9758a58474771fa3c

So the explanation is probably that we've got a bug in one of the following two places:

1. Since delaying the snapshot taken by the test harness by 100ms does make the test pass, there might be a bug in the paint stabilization in the test harness, where it doesn't correctly wait for the second div in the testcase to be painted.

2. Or the paint stabilization is correct, but the -moz-element invalidation is too delayed for it to catch for some reason. E.g. -moz-element references one frame too late: we paint the first div, then we discover that we need to invalidate the other div, and we trigger another paint which then paints the other div. So maybe we must modify the code for -moz-element so that the invalidation happens in the first paint itself?

mattwoodrow, do you have any thoughts on what the explanation might be? (Since you've worked on -moz-element as well as paint stabilization.)

(Thanks very much mstange and dholbert for your help with this!)
Flags: needinfo?(matt.woodrow)
(Slight tangent: IMO, we should feel free to paper over the invalidate-1.html failure using a small setTimeout (since that does fix it), in the short-term interests of getting this bug landed. And then we can investigate comment 50 more closely in bug 1178202 or a new bug. A small setTimeout would be basically equivalent to the paint delay that we're already giving this test in current trunk, so in practice it wouldn't be reducing the sensitivity of that test.)
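For context, the "reftest-wait + small setTimeout" approach from comments 50-51 looks roughly like this; the harness delays its snapshot until the reftest-wait class is removed from the root element. (The div structure and the 100ms value below are placeholders, not the actual contents of invalidate-1.html.)

    <!-- hypothetical reftest using -moz-element, snapshot deferred via reftest-wait -->
    <html class="reftest-wait">
    <body onload="setTimeout(finish, 100)">
      <div id="source" style="width: 100px; height: 100px; background: url(image.png)"></div>
      <div style="width: 100px; height: 100px; background: -moz-element(#source)"></div>
      <script>
        function finish() {
          // Removing the class signals the harness that it may take the snapshot now.
          document.documentElement.classList.remove("reftest-wait");
        }
      </script>
    </body>
    </html>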
I did some digging, and I strongly suspect that this is bug 1209751 coming back to bite us.

My guess is that when we go to take the reftest snapshot of the page, the background image hasn't loaded yet.

Reftest snapshots force synchronous image decoding (by not passing DRAWWINDOW_ASYNC_DECODE_IMAGES to drawWindow), so we block (during painting) when we try to draw the first div until the image has loaded, and then draw it correctly.

We then try painting the div with -moz-element (which didn't have the intrinsic image size available during layout, as the image was only ready during painting) and then that fails.

Seth's comment suggests that this should only be a problem if the element's size depends on the image's intrinsic size, which shouldn't be the case here (we specify a size ourselves). There could easily be something else depending on the intrinsic size, though, which causes painting to fail.

It looks like this is reproducible, so it shouldn't be too hard to confirm.

I'm not sure what the best fix is here. I think we need to detect that the image size has changed and schedule another paint so it can be drawn with the correct layout. The reftest harness should detect this scheduled paint and wait for it.
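To make the flag concrete: drawWindow is a chrome-only canvas API that the reftest harness uses to capture snapshots. A rough sketch of the kind of call involved, with a hypothetical flag set rather than the harness's exact one:

    // Chrome-privileged JS: snapshot a window into a canvas.
    let ctx = canvas.getContext("2d");
    let flags = ctx.DRAWWINDOW_DRAW_VIEW | ctx.DRAWWINDOW_USE_WIDGET_LAYERS;
    // DRAWWINDOW_ASYNC_DECODE_IMAGES is deliberately *not* included, so drawWindow
    // decodes any still-pending images synchronously while painting the snapshot.
    ctx.drawWindow(win, 0, 0, canvas.width, canvas.height, "rgb(255,255,255)", flags);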
Flags: needinfo?(matt.woodrow)
Depends on: 1178202
No longer depends on: 1260629, 1293550, 1294278, 1295466, 1313772, 1313773
No longer depends on: 1178202
(In reply to Matt Woodrow (:mattwoodrow) from comment #52)
> I'm not sure what the best fix is here. I think we need to detect that the
> image size has changed and schedule another paint so it can be drawn with the
> correct layout. The reftest harness should detect this scheduled paint and
> wait for it.

Thanks for looking into this Matt!
List of all bugs that had to be modified in some way to prevent the paint delay from causing a test failure (note that some of these bugs will remain open since we've simply papered over the failure with a "fuzzy" annotation -- that's why they're not marked as blocking this bug):

Bug 1293550, Bug 1260629, Bug 1313772, Bug 1313773, Bug 1294278, Bug 1295466, Bug 1178202, Bug 1315834, Bug 1313253, Bug 1312951, Bug 1315846

Clean try run:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0268d8e0c8def2152d1b9adb390c7687e8b4608c
Flags: needinfo?(bugs)
Just spoke with jet -- I'm going to land this on inbound tomorrow, after the test-tweaks (on autoland) have been merged around and have made it to inbound. --> transferring jet's needinfo to myself (I think bug 1316430 is the last test-tweak that needed landing, so once that's on inbound, we should be OK to land.)
Flags: needinfo?(bugs) → needinfo?(dholbert)
Pushed by dholbert@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ff60dd49591d
Sets the default value for nglayout.initialpaint.delay to 5ms on desktop (stays 250 ms on Android) per user research conclusions. r=tn
Flags: needinfo?(dholbert)
Depends on: 1316762
Thanks -- I've just landed a fix for that particular issue on bug 1316762. Did you see any others that seem to have been caused by this push? If so, please mention them here, so we can fix them before the next time we re-land. :) (Per comment 54, we thought we'd caught all the newly-fuzzy tests, but apparently there were a few more hiding.) I'll hopefully re-land this later tonight or tomorrow morning.
Flags: needinfo?(dholbert) → needinfo?(wkocher)
Flags: needinfo?(dholbert)
Flags: needinfo?(jgilbert)
Pushed by dholbert@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b2b359340a84
Sets the default value for nglayout.initialpaint.delay to 5ms on desktop (stays 250 ms on Android) per user research conclusions. r=tn
(Hoping it sticks this time around. If there are any new intermittents that look suspiciously tied to this, please ping me before backing this out again -- we can hopefully just land any remaining targeted tweaks as followups, as-needed.)
Flags: needinfo?(wkocher)
Flags: needinfo?(dholbert)
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
@ Gregg Lind - Half a year has passed (bug #180241 comment 73) and there are still no results from Shield Study 1 & 2.
I think we need to back this out until we better understand the glterrain regression (and possibly modify that test and tsvgx so this change doesn't regress/noisify them). See bug 1317048 comments 8 and 9 for more. But for now, I'll be backing this out.
Attached patch backout patch (deleted) — Splinter Review
Not sure if I need aurora approval to land the backout on Aurora, but here goes just in case.

Approval Request Comment
[Feature/regressing bug #]: This bug. (I'd like to back it out for now.)
[User impact if declined]: Talos regressions; see previous comment.
[Describe test coverage new/current, TreeHerder]: N/A
[Risks and why]: Extremely low risk. Just reverting a tweak to a paint-delay timeout value that *just* landed last week. This backout-patch restores us to our previous state.
[String/UUID change made/needed]: None.
Attachment #8811055 - Flags: approval-mozilla-aurora?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Backout by dholbert@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1c43a08d519f
Backed out changeset b2b359340a84 for causing talos regressions.
Comment on attachment 8811055 [details] [diff] [review]
backout patch

Went ahead and pushed the backout to aurora.
Attachment #8811055 - Flags: approval-mozilla-aurora?
Depends on: 1318530
Depends on: 1067360
Depends on: 1220414
No longer depends on: 1220414
No longer depends on: 1067360
Depends on: 1319458
I'm planning on re-landing this later this afternoon. (I'm waiting for bug 1319458 to get a few cycles of talos runs on inbound, so that it's easier to distinguish glterrain differences [if any] caused by bug 1319458's patch vs. by this bug's patch.)
Flags: needinfo?(dholbert)
Pushed by dholbert@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a494dd0ae2b0
Sets the default value for nglayout.initialpaint.delay to 5ms on desktop (stays 250 ms on Android) per user research conclusions. r=tn
Flags: needinfo?(dholbert)
Note: This re-landing might initially be implicated as causing a tsvgx regression on inbound and/or central. Bug 1318530 (currently on autoland) should address that regression. After bug 1318530 is merged to a branch that has this paint-suppression reduction, the tsvgx regression should mostly go away, though there may be a small regression in the new "svg_static" suite (the pieces of tsvgx that we spun off in bug 1318530). A small regression there is an expected & acceptable consequence of this bug, per comment 16.
Status: REOPENED → RESOLVED
Closed: 8 years ago8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Hi Daniel,

Here is the perf alert for this landed patch:
https://treeherder.mozilla.org/perf.html#/alerts?id=4345

Do you think all of these are expected?
Thanks, Alison! I think these are all OK. Specifically:

* The "damp" regressions weren't reported the previous times this was landed, but they don't surprise me. (They also look like they might be sporadic / within the range of noise on the graph.)
* The tart improvements make sense.
* The "fileio" ones are *NOT* expected, and almost certainly unrelated to this change. (Looking at the graph for those, it looks like this "regression" is actually us returning to the same value we had a few days ago. Not sure if that's from a backout or from a bimodal test result or what, but I'm confident it's unrelated to this bug's one-liner, non-fileio-related patch.)
* The tp5o and tsvgx regressions are expected.
I actually think the fileio regressions are related; they improved when this was backed out, and if you look at the 30-day history, you can see when we first landed this, backed it out, and now recently landed it again:
https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=%5Bmozilla-inbound,2f3af3833d55ff371ecf01c41aeee1939ef3a782,1,1%5D&series=%5Bautoland,2f3af3833d55ff371ecf01c41aeee1939ef3a782,1,1%5D

These are all non-main-startup fileio. I am not sure what this is; I would have to push to try with some hacks to talos to make it output data or an artifact which I could use to figure out which files are the problem. I did this about a month ago for another problem, and it turned out we were loading different fonts. Let me know if you want me to do that and I could get some try pushes put together.
Depends on: 1332558
Depends on: 1385479
Regressions: 1720582
Regressions: 1698396
No longer regressions: 1698396
No longer regressions: 1720582