Open Bug 1343884 Opened 8 years ago Updated 2 years ago

Intermittent 1239889-1.html | program error managing timeouts

Categories

(Core :: DOM: Animation, defect, P5)

Tracking

()

Tracking Status
firefox-esr52 --- unaffected
firefox54 --- unaffected
firefox55 --- affected
firefox56 --- affected

People

(Reporter: intermittent-bug-filer, Unassigned)

References

Details

(Keywords: intermittent-failure, leave-open, Whiteboard: [stockwell unknown])

Attachments

(1 file)

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a57a0d11451da00731c31129c0082fe8c57e30d6 I did a try run with a modification that causes a MozAfterPaint event. This is a workaround, but it seems to work fine on try.
Summary: Intermittent REFTEST ERROR | file:///home/worker/workspace/build/tests/reftest/tests/dom/animation/test/crashtests/1239889-1.html | program error managing timeouts → Intermittent 1239889-1.html | program error managing timeouts
I'd like to see if the modification in comment 6 solves this timeout.
Comment on attachment 8862716 [details] Bug 1343884 - Add a tweak to cause a MozAfterPaint in 1239889-1.html. DONTBUILD https://reviewboard.mozilla.org/r/134582/#review137580
Attachment #8862716 - Flags: review?(boris.chiou) → review+
Thank you!
Keywords: leave-open
Pushed by hikezoe@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/cc0825d18ef8 Add a tweak to cause a MozAfterPaint in 1239889-1.html. r=boris DONTBUILD
Unfortunately it still fails. From a failure log:
INFO - REFTEST TEST-LOAD | file:///home/worker/workspace/build/tests/reftest/tests/docshell/base/crashtests/1341657.html | 20 / 3194 (0%)
INFO - REFTEST TEST-START | file:///home/worker/workspace/build/tests/reftest/tests/dom/animation/test/crashtests/1239889-1.html
INFO - REFTEST TEST-LOAD | file:///home/worker/workspace/build/tests/reftest/tests/dom/animation/test/crashtests/1239889-1.html | 21 / 3194 (0%)
ERROR - REFTEST ERROR | file:///home/worker/workspace/build/tests/reftest/tests/dom/animation/test/crashtests/1239889-1.html | program error managing timeouts
ERROR --
INFO - REFTEST TEST-PASS | file:///home/worker/workspace/build/tests/reftest/tests/dom/animation/test/crashtests/1239889-1.html | (LOAD ONLY)
INFO - REFTEST TEST-END | file:///home/worker/workspace/build/tests/reftest/tests/dom/animation/test/crashtests/1239889-1.html
It seems to me that 1341657.html did not finish properly.
Depends on: 1362903
This failure has increased in frequency in the last week, mostly on linux-qr/debug, with almost as many on linux64/debug. These are all e10s failures. :hiro, I see you took a look at this a few weeks ago. Do you think this is easy to fix with a little more effort?
Flags: needinfo?(hikezoe)
Whiteboard: [stockwell needswork]
Looking at the log here:
> [task 2017-06-12T16:01:48.822642Z] 16:01:48 INFO - REFTEST TEST-START | file:///home/worker/workspace/build/tests/reftest/tests/docshell/base/crashtests/1341657.html
> [task 2017-06-12T16:01:48.824451Z] 16:01:48 INFO - REFTEST TEST-LOAD | file:///home/worker/workspace/build/tests/reftest/tests/docshell/base/crashtests/1341657.html | 20 / 3207 (0%)
> [task 2017-06-12T16:01:48.830887Z] 16:01:48 INFO - REFTEST None
> [task 2017-06-12T16:01:48.832675Z] 16:01:48 INFO - REFTEST TEST-START | file:///home/worker/workspace/build/tests/reftest/tests/dom/animation/test/crashtests/1239889-1.html
> [task 2017-06-12T16:01:48.834481Z] 16:01:48 INFO - REFTEST TEST-LOAD | file:///home/worker/workspace/build/tests/reftest/tests/dom/animation/test/crashtests/1239889-1.html | 21 / 3207 (0%)
> [task 2017-06-12T16:01:48.836381Z] 16:01:48 INFO - ++DOMWINDOW == 85 (0x7fb4731f5800) [pid = 1096] [serial = 85] [outer = 0x7fb49dc99800]
> [task 2017-06-12T16:01:48.838195Z] 16:01:48 INFO - ++DOMWINDOW == 86 (0x7fb473c0a000) [pid = 1096] [serial = 86] [outer = 0x7fb49dc99800]
> [task 2017-06-12T16:01:48.839988Z] 16:01:48 ERROR - REFTEST ERROR | file:///home/worker/workspace/build/tests/reftest/tests/dom/animation/test/crashtests/1239889-1.html | program error managing timeouts
> [task 2017-06-12T16:01:48.843445Z] 16:01:48 ERROR -
> [task 2017-06-12T16:01:48.846893Z] 16:01:48 INFO - REFTEST TEST-PASS | file:///home/worker/workspace/build/tests/reftest/tests/dom/animation/test/crashtests/1239889-1.html | (LOAD ONLY)
> [task 2017-06-12T16:01:48.847577Z] 16:01:48 INFO - REFTEST TEST-END | file:///home/worker/workspace/build/tests/reftest/tests/dom/animation/test/crashtests/1239889-1.html
And the specific error we're hitting:
> if (gFailureTimeout != null) {
>   SendException("program error managing timeouts\n");
> }
I wonder if the problem is actually the previous test (1341657.html)? According to the log above it is still running.
(Or even the one before that, 1331295.html, which seems to finish twice.) It seems like the test harness is getting confused. Perhaps the previous test is still running, hence gFailureTimeout is not null.
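To make the hypothesis above concrete, here is a minimal sketch of how a stale timeout left by one test could surface as an error attributed to the next one. Only gFailureTimeout and the error string come from the snippet quoted above; createHarness, startTest, finishTest and the errors array are hypothetical names for illustration and are not the real reftest code.

```javascript
// Simplified model of the timeout bookkeeping. Assumption: the harness checks
// gFailureTimeout when a new test starts, as the quoted snippet suggests.
function createHarness() {
  const errors = [];
  let gFailureTimeout = null;

  function startTest(url) {
    // A non-null timeout here means the previous test never cleaned up.
    if (gFailureTimeout != null) {
      errors.push(`${url} | program error managing timeouts`);
    }
    gFailureTimeout = { url }; // stand-in for a real timer handle
  }

  function finishTest() {
    gFailureTimeout = null;
  }

  return { startTest, finishTest, errors };
}

const harness = createHarness();
harness.startTest("1341657.html");   // never reports completion
harness.startTest("1239889-1.html"); // the error is blamed on this test
console.log(harness.errors);
// → [ '1239889-1.html | program error managing timeouts' ]
```

In this model the test that gets the error did nothing wrong, which would match the observation that skipping it only moves the failure to whichever test runs next.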
Oh, comment 14 says the same thing :)
And it looks like bug 1362903 tracks fixing this. We should probably dupe this bug to that but I'm not sure if that is going to interfere with orange tracking.
I will post an updated fix (actually it's not a fix, just a workaround) in bug 1362903 soon. An unfortunate thing is that the reftest tools (or the test runner?) get confused and report the timeout as happening in this test.
Flags: needinfo?(hikezoe)
:hiro, can you take a look at this test case? We are 2 weeks into high failure rates.
Flags: needinfo?(hikezoe)
This is really unfortunate. Bug 1362903 did not help with the timeout. docshell/base/crashtests/1341657.html still does not finish properly. Hsin-Yi, could you please find someone to fix the timeout issue in docshell/base/crashtests/1341657.html, or disable it? Thanks!
Flags: needinfo?(hikezoe) → needinfo?(htsai)
thanks :hiro for commenting, it would be great to see a fix here.
Sorry for the late reply, Hiro and Joel! Hey Samael, could you please help investigate why docshell/base/crashtests/1341657.html doesn't finish properly? Unfortunately the workaround in bug 1362903 doesn't help much.
Flags: needinfo?(htsai) → needinfo?(sawang)
Pushed by gbrown@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/63637b683c21 Skip crashtest 1239889-1.html for intermittent failures; r=me,test-only
Hopefully Samael can get this test re-enabled soon.
Whiteboard: [stockwell needswork] → [stockwell disabled]
I should have read the discussion here before attempting to disable. When I disabled 1239889-1.html, the next crashtest failed the same way: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=482bb010dbf4a076bfc9e2c397e730b646e8fa61. I backed out the test skip in https://hg.mozilla.org/integration/mozilla-inbound/rev/e3e1e82bcb3dd70b4de49707f7ed0ea4d0b2d07b Can we skip docshell/base/crashtests/1341657.html instead?
Whiteboard: [stockwell disabled] → [stockwell needswork]
(In reply to Geoff Brown [:gbrown] from comment #35)
> I should have read the discussion here before attempting to disable. When I disabled 1239889-1.html, the next crashtest failed the same way: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=482bb010dbf4a076bfc9e2c397e730b646e8fa61.
>
> I backed out the test skip in https://hg.mozilla.org/integration/mozilla-inbound/rev/e3e1e82bcb3dd70b4de49707f7ed0ea4d0b2d07b
>
> Can we skip docshell/base/crashtests/1341657.html instead?
I just re-opened bug 1362903 and started working on it. If you need a quick solution, then yes, 1341657.html is the one that should be disabled.
Flags: needinfo?(sawang)
(In reply to Geoff Brown [:gbrown] from comment #35)
> Can we skip docshell/base/crashtests/1341657.html instead?
I found it's more likely to be caused by 1331295.html passing twice. Investigating...
Here's my theory: gCurrentURL in reftest-content.js is not cleared when a test finishes, so if a subsequent load event for the same test document occurs before the parent sends another test URL, the harness can get confused. 1331295.html calls window.location.reload, so this could happen occasionally.
(In reply to Samael Wang [:freesamael] from comment #38)
> Here's my theory:
>
> gCurrentURL in reftest-content.js is not cleared on test finishes, so if a subsequent load event of the same test document occurs before parent sending another test URL, it can get confused.
>
> In 1331295.html there is window.location.reload, so this could happen occasionally.
And since reftest-content.js would send "reftest:TestDone" twice, reftest.jsm thought the 2nd TestDone was for 1341657.html.
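The theory above can be illustrated with a toy model of the parent/content handshake. This is a sketch under stated assumptions, not the real harness: only gCurrentURL and the "reftest:TestDone" message name come from the comments above, while simulateRun, onLoad, onTestDone and the clearUrlOnFinish flag are hypothetical. Clearing the URL is shown only to contrast the two behaviours and is not claimed to be the fix that landed in bug 1362903.

```javascript
// Toy model: the content side treats a load of gCurrentURL as test completion,
// and the parent attributes every TestDone to the test it believes is running.
function simulateRun(clearUrlOnFinish) {
  const queue = ["1331295.html", "1341657.html", "1239889-1.html"];
  const finishedByParent = [];
  let parentIndex = 0;    // which test the parent thinks is current
  let gCurrentURL = null; // content-side record of the test under way

  // Parent side (stands in for reftest.jsm receiving "reftest:TestDone").
  function onTestDone() {
    finishedByParent.push(queue[parentIndex]);
    parentIndex += 1;
  }

  // Content side (stands in for the load handler in reftest-content.js).
  function onLoad(url) {
    if (url !== gCurrentURL) return;
    if (clearUrlOnFinish) gCurrentURL = null; // hypothetical guard
    onTestDone();
  }

  gCurrentURL = queue[0]; // parent sends the first test URL
  onLoad("1331295.html"); // initial load
  // The reload fires a second load before the parent sends the next URL.
  onLoad("1331295.html");
  return finishedByParent;
}

console.log(simulateRun(false)); // → [ '1331295.html', '1341657.html' ]
console.log(simulateRun(true));  // → [ '1331295.html' ]
```

Without the guard, the parent records 1341657.html as finished even though it never loaded, which is consistent with the logs where that test appears to have no proper end.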
No new failures on trunk since July 15.
Whiteboard: [stockwell needswork] → [stockwell unknown]
Priority: -- → P5
Severity: normal → S3