Closed
Bug 1453863
Opened 7 years ago
Closed 6 years ago
Intermittent TEST-UNEXPECTED-TIMEOUT | dom/events/test/test_wheel_default_action.html | application timed out after 370 seconds with no output
Categories
(Core :: Panning and Zooming, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla61
Tracking | Status | |
---|---|---|
firefox59 | --- | unaffected |
firefox60 | --- | unaffected |
firefox61 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: kats)
References
Details
(Keywords: intermittent-failure)
Crash Data
Attachments
(1 file, 1 obsolete file)
(deleted),
patch
|
zjz
:
review+
|
Details | Diff | Splinter Review |
Filed by: rgurzau [at] mozilla.com
https://treeherder.mozilla.org/logviewer.html#?job_id=173410896&repo=autoland
https://queue.taskcluster.net/v1/task/M83CdU08RqmjEdGCeELEfg/runs/0/artifacts/public/logs/live_backing.log
[task 2018-04-13T02:07:43.265Z] 02:07:43 WARNING - runtests.py | Failed to get app exit code - running/crashed?
[task 2018-04-13T02:07:43.267Z] 02:07:43 INFO - TEST-INFO | Main app process: exit 0
[task 2018-04-13T02:07:43.268Z] 02:07:43 INFO - runtests.py | Application ran for: 0:11:21.305372
[task 2018-04-13T02:07:43.272Z] 02:07:43 INFO - zombiecheck | Reading PID log: /tmp/tmppZdDf0pidlog
[task 2018-04-13T02:07:43.272Z] 02:07:43 INFO - ==> process 1665 launched child process 1685
[task 2018-04-13T02:07:43.272Z] 02:07:43 INFO - ==> process 1665 launched child process 1710
[task 2018-04-13T02:07:43.272Z] 02:07:43 INFO - ==> process 1665 launched child process 1738
[task 2018-04-13T02:07:43.272Z] 02:07:43 INFO - zombiecheck | Checking for orphan process with PID: 1738
[task 2018-04-13T02:07:43.272Z] 02:07:43 INFO - zombiecheck | Checking for orphan process with PID: 1685
[task 2018-04-13T02:07:43.273Z] 02:07:43 INFO - zombiecheck | Checking for orphan process with PID: 1710
[task 2018-04-13T02:07:43.273Z] 02:07:43 INFO - mozcrash Copy/paste: /usr/local/bin/linux64-minidump_stackwalk /tmp/tmp7h19jw.mozrunner/minidumps/61911ce0-0e4d-8c4e-c5be-417b87575a34.dmp /builds/worker/workspace/build/symbols
[task 2018-04-13T02:07:43.288Z] 02:07:43 INFO - mozcrash Saved minidump as /builds/worker/workspace/build/blobber_upload_dir/61911ce0-0e4d-8c4e-c5be-417b87575a34.dmp
[task 2018-04-13T02:07:43.290Z] 02:07:43 INFO - PROCESS-CRASH | Main app process exited normally | application crashed [None]
[task 2018-04-13T02:07:43.291Z] 02:07:43 INFO - Crash dump filename: /tmp/tmp7h19jw.mozrunner/minidumps/61911ce0-0e4d-8c4e-c5be-417b87575a34.dmp
[task 2018-04-13T02:07:43.293Z] 02:07:43 INFO - stderr from minidump_stackwalk:
[task 2018-04-13T02:07:43.294Z] 02:07:43 INFO - 2018-04-13 02:07:43: minidump.cc:4359: INFO: Minidump opened minidump /tmp/tmp7h19jw.mozrunner/minidumps/61911ce0-0e4d-8c4e-c5be-417b87575a34.dmp
[task 2018-04-13T02:07:43.295Z] 02:07:43 INFO - 2018-04-13 02:07:43: minidump.cc:4479: INFO: Minidump not byte-swapping minidump
[task 2018-04-13T02:07:43.297Z] 02:07:43 INFO - 2018-04-13 02:07:43: minidump.cc:4956: INFO: GetStream: type 15 not present
[task 2018-04-13T02:07:43.298Z] 02:07:43 INFO - 2018-04-13 02:07:43: minidump.cc:4956: INFO: GetStream: type 7 not present
[task 2018-04-13T02:07:43.299Z] 02:07:43 INFO - 2018-04-13 02:07:43: minidump.cc:4956: INFO: GetStream: type 7 not present
[task 2018-04-13T02:07:43.300Z] 02:07:43 INFO - 2018-04-13 02:07:43: minidump.cc:4956: INFO: GetStream: type 1197932545 not present
[task 2018-04-13T02:07:43.302Z] 02:07:43 INFO - 2018-04-13 02:07:43: minidump.cc:4956: INFO: GetStream: type 6 not present
[task 2018-04-13T02:07:43.302Z] 02:07:43 INFO - 2018-04-13 02:07:43: minidump.cc:4956: INFO: GetStream: type 1197932546 not present
[task 2018-04-13T02:07:43.303Z] 02:07:43 INFO - 2018-04-13 02:07:43: minidump.cc:4956: INFO: GetStream: type 4 not present
[task 2018-04-13T02:07:43.304Z] 02:07:43 INFO - 2018-04-13 02:07:43: minidump.cc:4956: INFO: GetStream: type 5 not present
[task 2018-04-13T02:07:43.305Z] 02:07:43 INFO - 2018-04-13 02:07:43: minidump.cc:4956: INFO: GetStream: type 3 not present
[task 2018-04-13T02:07:43.306Z] 02:07:43 INFO - 2018-04-13 02:07:43: minidump_processor.cc:152: ERROR: Minidump /tmp/tmp7h19jw.mozrunner/minidumps/61911ce0-0e4d-8c4e-c5be-417b87575a34.dmp has no thread list
[task 2018-04-13T02:07:43.307Z] 02:07:43 INFO - 2018-04-13 02:07:43: stackwalk.cc:139: ERROR: MinidumpProcessor::Process failed
[task 2018-04-13T02:07:43.308Z] 02:07:43 INFO - 2018-04-13 02:07:43: minidump.cc:4331: INFO: Minidump closing minidump
[task 2018-04-13T02:07:43.310Z] 02:07:43 INFO - mozcrash Copy/paste: /usr/local/bin/linux64-minidump_stackwalk /tmp/tmp7h19jw.mozrunner/minidumps/44d12e3e-6ac6-543a-6e32-af01d09c6432.dmp /builds/worker/workspace/build/symbols
[task 2018-04-13T02:07:49.073Z] 02:07:49 INFO - mozcrash Saved minidump as /builds/worker/workspace/build/blobber_upload_dir/44d12e3e-6ac6-543a-6e32-af01d09c6432.dmp
[task 2018-04-13T02:07:49.074Z] 02:07:49 INFO - mozcrash Saved app info as /builds/worker/workspace/build/blobber_upload_dir/44d12e3e-6ac6-543a-6e32-af01d09c6432.extra
[task 2018-04-13T02:07:49.223Z] 02:07:49 INFO - PROCESS-CRASH | Main app process exited normally | application crashed [@ libpthread-2.23.so + 0x1026d]
Comment 1•7 years ago
|
||
[task 2018-04-13T02:07:42.553Z] 02:07:42 INFO - TEST-PASS | dom/events/test/test_wheel_default_action.html | doTestAutoDirScroll(aSettings=deltaX is reverted, {honoursRoot: false}), root = writingMode: horizontal-tb; direction: rtl; target = writingMode: vertical-rl; direction: ltr; adjusted auto-dir scroll to right(originally top) by pixel scroll even if lineOrPageDelta is 0, no vertical scrollbar: scrolled vertical
[task 2018-04-13T02:07:42.554Z] 02:07:42 INFO - TEST-PASS | dom/events/test/test_wheel_default_action.html | doTestAutoDirScroll(aSettings=deltaX is reverted, {honoursRoot: false}), root = writingMode: horizontal-tb; direction: rtl; target = writingMode: vertical-rl; direction: ltr; adjusted auto-dir scroll to right(originally top) by pixel scroll even if lineOrPageDelta is 0, no vertical scrollbar: not scrolled to right, got -992
[task 2018-04-13T02:07:42.555Z] 02:07:42 INFO - Buffered messages finished
[task 2018-04-13T02:07:42.558Z] 02:07:42 ERROR - TEST-UNEXPECTED-TIMEOUT | dom/events/test/test_wheel_default_action.html | application timed out after 370 seconds with no output
Severity: critical → normal
Keywords: crash
Summary: Intermittent PROCESS-CRASH | Main app process exited normally | application crashed [@ libpthread-2.23.so + 0x1026d] → Intermittent TEST-UNEXPECTED-TIMEOUT | dom/events/test/test_wheel_default_action.html | application timed out after 370 seconds with no output
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 3•7 years ago
|
||
Based on comment 1 this looks like a regression from bug 1358017.
Blocks: autodir
Comment 4•7 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #3)
> Based on comment 1 this looks like a regression from bug 1358017.
The test time has been increased for autodir. So 370 seconds is not enough. It may need nearly 10 minutes. So it's not a regression. The allowed test time for the file has also been increased, see https://searchfox.org/mozilla-central/source/dom/events/test/window_wheel_default_action.html#71
Comment 5•7 years ago
|
||
Since there is SimpleTest.requestLongerTimeout(6); in the file.
So I am wondering why 370 seconds tiggered the timed-out failure. Is that some external code which turns off the feature of requestLongerTimeout?
Assignee | ||
Comment 6•7 years ago
|
||
I think the problem might be that there was *no output* for 370 seconds. If you look at the log in comment 0, the last test output is preceded by a line that says "Buffered messages logged at 02:01:28" and the timeout happens at 02:07:42 which is 374 seconds later. So either the test is hanging, or it's running through a patch of 6+ minutes of testing where there's no test output at all. Either way it happens after the test that outputs "doTestAutoDirScroll(aSettings=deltaX is reverted, {honoursRoot: false}), root = writingMode: horizontal-tb; direction: rtl; target = writingMode: vertical-rl; direction: ltr; adjusted auto-dir scroll to right(originally top) by pixel scroll even if lineOrPageDelta is 0, no vertical scrollbar: not scrolled to right, got -992"
Assignee | ||
Comment 7•7 years ago
|
||
And actually if you look further up in the log you see lots of this:
[task 2018-04-13T02:00:55.612Z] 02:00:55 INFO - GECKO(1665) | ###!!! ERROR: Potential deadlock detected:
[task 2018-04-13T02:00:55.613Z] 02:00:55 INFO - GECKO(1665) | === Cyclical dependency starts at
[task 2018-04-13T02:00:55.613Z] 02:00:55 INFO - GECKO(1665) | --- RecursiveMutex : APZCTreeLock calling context
[task 2018-04-13T02:00:55.614Z] 02:00:55 INFO - GECKO(1665) | [stack trace unavailable]
[task 2018-04-13T02:00:55.615Z] 02:00:55 INFO - GECKO(1665) | --- Next dependency:
[task 2018-04-13T02:00:55.616Z] 02:00:55 INFO - GECKO(1665) | --- RecursiveMutex : AsyncPanZoomController (currently acquired)
[task 2018-04-13T02:00:55.617Z] 02:00:55 INFO - GECKO(1665) | calling context
[task 2018-04-13T02:00:55.618Z] 02:00:55 INFO - GECKO(1665) | [stack trace unavailable]
[task 2018-04-13T02:00:55.619Z] 02:00:55 INFO - GECKO(1665) | === Cycle completed at
[task 2018-04-13T02:00:55.620Z] 02:00:55 INFO - GECKO(1665) | --- RecursiveMutex : APZCTreeLock calling context
[task 2018-04-13T02:00:55.620Z] 02:00:55 INFO - GECKO(1665) | [stack trace unavailable]
[task 2018-04-13T02:00:55.621Z] 02:00:55 INFO - GECKO(1665) | Deadlock may happen for some other execution
[task 2018-04-13T02:00:55.622Z] 02:00:55 INFO - GECKO(1665) | [Parent 1665, Main Thread] WARNING: Potential deadlock detected:
[task 2018-04-13T02:00:55.622Z] 02:00:55 INFO - GECKO(1665) | Cyclical dependency starts at
[task 2018-04-13T02:00:55.623Z] 02:00:55 INFO - GECKO(1665) | RecursiveMutex : APZCTreeLock
[task 2018-04-13T02:00:55.624Z] 02:00:55 INFO - GECKO(1665) | Next dependency:
[task 2018-04-13T02:00:55.625Z] 02:00:55 INFO - GECKO(1665) | RecursiveMutex : AsyncPanZoomController (currently acquired)
[task 2018-04-13T02:00:55.626Z] 02:00:55 INFO - GECKO(1665) | Cycle completed at
[task 2018-04-13T02:00:55.626Z] 02:00:55 INFO - GECKO(1665) | RecursiveMutex : APZCTreeLock
[task 2018-04-13T02:00:55.627Z] 02:00:55 INFO - GECKO(1665) | Deadlock may happen for some other execution
[task 2018-04-13T02:00:55.628Z] 02:00:55 INFO - GECKO(1665) | : file /builds/worker/workspace/build/src/xpcom/threads/BlockingResourceBase.cpp, line 309
so it's quite likely that the test is hitting a deadlock because it's acquiring locks in the wrong order.
Assignee | ||
Comment 8•7 years ago
|
||
https://hg.mozilla.org/integration/autoland/rev/509187cd4c1a#l4.220 for example is a call that acquires the APZ tree lock while holding the AsyncPanZoomController's mRecursiveMutex, and so violates the lock ordering requirements.
Assignee | ||
Comment 9•7 years ago
|
||
Here's a try push that might fix it: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d270e34520b9a2642af0af4e88e74e97b82de745
status-firefox59:
--- → unaffected
status-firefox60:
--- → unaffected
status-firefox61:
--- → affected
Component: DOM: Events → Panning and Zooming
OS: Unspecified → All
Priority: P5 → P2
Hardware: Unspecified → All
Version: unspecified → 61 Branch
Comment 10•7 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #8)
> https://hg.mozilla.org/integration/autoland/rev/509187cd4c1a#l4.220 for
> example is a call that acquires the APZ tree lock while holding the
> AsyncPanZoomController's mRecursiveMutex, and so violates the lock ordering
> requirements.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #9)
> Here's a try push that might fix it:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=d270e34520b9a2642af0af4e88e74e97b82de745
Wow. Thank you very much for the spot of the potential deadlock.
Assignee | ||
Comment 11•7 years ago
|
||
Try push seems to be good, I don't see any deadlock warnings in the log.
Assignee: nobody → bugmail
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Attachment #8968902 -
Flags: review?(zjz)
Comment 13•7 years ago
|
||
Thank you for your trust.
But I am sorry, Kartikaya. I am in L1, so I don't have access to review.
You may need to find someone else to review it.
Assignee | ||
Comment 14•7 years ago
|
||
Ah, MozReview might not give you permission but technically you're still allowed to review patches if the patch was written by the relevant module owner or peer, which I am. I think you're an appropriate reviewer since you added the code in question and understand it.
I'm attaching the patch as an attachment instead and obsoleting the MozReview version, so you should be able to review it by changing the review flag to r+. I can then land it manually on inbound.
Attachment #8968902 -
Attachment is obsolete: true
Attachment #8968902 -
Flags: review?(zjz)
Attachment #8968910 -
Flags: review?(zjz)
Comment 15•7 years ago
|
||
Comment on attachment 8968910 [details] [diff] [review]
Patch - remove deadlock scenario
Okay, done
Attachment #8968910 -
Flags: review?(zjz) → review+
Assignee | ||
Comment 16•7 years ago
|
||
Thanks! I'll land it on inbound once the trees are open again.
Comment 17•7 years ago
|
||
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a12624f2ce5b
Avoid potential deadlock scenario caused by improper lock acquisiting order. r=zjz
Comment 18•7 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Comment hidden (Intermittent Failures Robot) |
Comment 20•6 years ago
|
||
Updated•6 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 21•6 years ago
|
||
Please don't reopen bugs where patches have landed. File a new bug if needed.
Status: REOPENED → RESOLVED
Closed: 7 years ago → 6 years ago
Resolution: --- → FIXED
Comment 22•6 years ago
|
||
Recent failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=197969318&repo=mozilla-central&lineNumber=15068
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Updated•6 years ago
|
Comment 23•6 years ago
|
||
I filed a new bug (bug 1489395) for this, because this bug patch has landed.
So, closing this bug.
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Resolution: --- → FIXED
Comment hidden (Intermittent Failures Robot) |
You need to log in
before you can comment on or make changes to this bug.
Description
•