Closed Bug 1150351 (opened 9 years ago, closed 9 years ago)

test_deferred_start.html failing on Linux mochitest-e10s

Categories

(Core :: CSS Parsing and Computation, defect)

x86_64
Linux
defect
Not set
normal

Tracking

RESOLVED WORKSFORME
Tracking Status
e10s m8+ ---
firefox40 --- affected

People

(Reporter: dbaron, Assigned: dbaron)

References

(Blocks 1 open bug)

Details

We're disabling test_deferred_start.html on Linux mochitest-e10s because it's failing, probably as a result of enabling the vsync compositor on Linux and off-main-thread animations (OMTA) on the same day.

(More in a sec; need the bug number now for the comment.)
from #developers:

> [2015-04-01 21:27:58 -0700] <dbaron> really?  That test failure that **was actually fixed in the try run of the same patches** decided to come back?
> [2015-04-01 21:29:05 -0700] <birtles> dbaron: yeah, I just saw that...
> [2015-04-01 21:29:34 -0700] <philor> there's that perfectly good Mulet bug, though, not like it would be the first permaorange we've just merrily starred
> [2015-04-01 21:29:35 -0700] <birtles> I did two try runs with just that patch and that test didn't fail once
> [2015-04-01 21:30:54 -0700] <dbaron> birtles, let me see if anything else interesting landed between the base of my try run and my inbound push
> [2015-04-01 21:31:37 -0700] <philor> did you check whether the try pushes for no good reason had RELEASE_BUILD set?
> [2015-04-01 21:32:47 -0700] <dbaron> philor, I did a parallel try push that did show failures
> [2015-04-01 21:33:24 -0700] <dbaron> philor, also, it got better on the next push
> [2015-04-01 21:35:44 -0700] <pulsebot> Check-in: https://hg.mozilla.org/integration/mozilla-inbound/rev/0f1095c9567f - Mason Chang - Bug 1144317 - Enable vsync refresh driver on Windows. r=kats
> [2015-04-01 21:38:21 -0700] <dbaron> birtles, in the test that's failing, what makes us wait for stuff to get to the compositor thread?
> [2015-04-01 21:39:23 -0700] <birtles> dbaron: the wait on animation.ready
> [2015-04-01 21:40:31 -0700] <dbaron> birtles, that waits on sync'ing to the compositor thread?
> [2015-04-01 21:41:54 -0700] <birtles> dbaron: in effect--one refresh driver tick after calling EndTransaction on the layer transaction we resolve that promise
> [2015-04-01 21:42:05 -0700] <birtles> that test used to be flaky because there was a race where sometimes the animation transform wasn't applied on the shadow layer tree but bug 1113425 should have fixed that
> [2015-04-01 21:42:40 -0700] <dbaron> birtles, I'm somewhat inclined towards disabling that test on relevant platforms rather than backing out omta again
> [2015-04-01 21:43:33 -0700] <birtles> dbaron: fair enough--I spent a long time getting that test to pass reliably, but it's better to have OMTA on than that test :)
> [2015-04-01 21:43:48 -0700] <dbaron> birtles, also, it might be failing only on mochitest-e10s, which has the compositor in a different process
> [2015-04-01 21:44:05 -0700] <birtles> dbaron: but it passes reliably on B2G which has the compositor in a different process
> [2015-04-01 21:44:17 -0700] <dbaron> birtles, I'm not even sure we're getting correct mochitest results out of B2G
> [2015-04-01 21:44:29 -0700] <dbaron> birtles, can you explain why jwatt's changes or your changes that broke the mochitests elsewhere didn't break them on B2G?
> [2015-04-01 21:44:46 -0700] <birtles> dbaron: I was wondering that myself
> [2015-04-01 21:45:47 -0700] <birtles> dbaron: but some of them could be masked by additional layer transactions that might be triggered by additional UI events etc.
> [2015-04-01 21:46:00 -0700] <dbaron> philor, you don't happen to know how to tell from inside a mochitest that we're in mochitest-e10s?
> [2015-04-01 21:46:20 -0700] <philor> dbaron: sorry, nope
> [2015-04-01 21:46:29 -0700] <birtles> we've had that before with those OMTA tests, move the mouse and the test passes
> [2015-04-01 21:47:12 -0700] <birtles> dbaron: but yes, I think switching off that test again is reasonable
> [2015-04-01 21:48:50 -0700] <birtles> dbaron: assuming that it used to pass reliably (and we weren't just incredibly lucky before) it should be possible to bisect the regression and hopefully easier to fix this time around
> [2015-04-01 21:49:33 -0700] <dbaron> birtles, oh, one of the changes that landed since my try run is:
> [2015-04-01 21:49:38 -0700] <dbaron> changeset:   237046:806abeb59e8f
> [2015-04-01 21:49:38 -0700] <dbaron> user:        Mason Chang <mchang@mozilla.com>
> [2015-04-01 21:49:38 -0700] <dbaron> date:        Wed Apr 01 08:26:37 2015 -0700
> [2015-04-01 21:49:38 -0700] <dbaron> summary:     Bug 1149391 - Enable software vsync compositor on Linux. r=kats
> [2015-04-01 21:50:49 -0700] <birtles> dbaron: we've had trouble with that test + vsync + e10s + omta before
> [2015-04-01 21:51:08 -0700] <birtles> bug 1119981
> [2015-04-01 21:51:13 -0700] <birtles> but bug 1113425 should have fixed it
> [2015-04-01 21:52:38 -0700] <birtles> dbaron: I think we just disable that test... it's going to take a while to work out what's going wrong there
> [2015-04-01 21:52:47 -0700] <birtles> for e10s only
> [2015-04-01 21:53:24 -0700] <dbaron> birtles, I was hoping to just disable half of it (i.e., the second half), but I don't know how to do that, so I'm likely to just disable it in the manifest if that's ok
> [2015-04-01 21:53:33 -0700] <birtles> dbaron: yes, I think that's fine


The set of changes since the good try run was:
hg log -r 'ancestors(7b75233a273c) - ancestors(e5b72a8edb82)'
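For anyone unfamiliar with the revset: `ancestors(X)` is X plus everything reachable from it through parent links, so subtracting the ancestors of the good try base from the ancestors of the inbound push leaves only the changesets that landed in between. A minimal Python sketch of that set difference follows. The toy history and the names `try_base` and `inbound_head` are hypothetical stand-ins, not the real changesets, and this is not how Mercurial implements it internally.

```python
def ancestors(parents, rev):
    """Return rev plus every revision reachable by following parent links."""
    seen = set()
    stack = [rev]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents.get(node, ()))
    return seen


def changes_since(parents, new_head, old_base):
    """Revisions in new_head's history that are not in old_base's history."""
    return ancestors(parents, new_head) - ancestors(parents, old_base)


# Hypothetical toy history: each revision maps to its list of parents.
parents = {
    "root": [],
    "a": ["root"],
    "try_base": ["a"],             # stand-in for the good try run's base
    "b": ["try_base"],
    "c": ["a"],                    # landed on a side branch
    "inbound_head": ["b", "c"],    # merge; stand-in for the inbound push
}

suspects = changes_since(parents, "inbound_head", "try_base")
print(sorted(suspects))  # ['b', 'c', 'inbound_head']
```

Note that `c` shows up even though it branches from before `try_base`: the difference is over reachability, not over a linear range, which is what makes it suitable for finding everything that could have changed behaviour between the two pushes.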
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/bce1203ac470, although the underlying problem is still present.
Blocks: e10s-tests
tracking-e10s: --- → m8+
needinfo as a reminder to birtles
Flags: needinfo?(bbirtles)
And, actually, a try run to check that it's still an issue:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b1cd12ae251a
It looks like maybe I can just back out the disabling... tomorrow, though.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
https://hg.mozilla.org/mozilla-central/rev/10b09c193214
Assignee: nobody → dbaron
Flags: in-testsuite+
This appears to still be failing. The failures are being logged in bug 1148949 and appear to pick up right after this landed on m-i. I'll see if I can reproduce on Linux.
Flags: needinfo?(bbirtles)
I'm able to reproduce this so I'll try to fix it over in bug 1148949.
Depends on: 1148949
Thanks for fixing that so quickly.

(Sorry, probably should have backed this out once it turned out that it wasn't actually fixed, but I guess it's all better now.)