Closed Bug 1622445 Opened 5 years ago Closed 5 years ago

Intermittent GECKO(2625) | SUMMARY: ThreadSanitizer: data race /builds/worker/checkouts/gecko/gfx/skia/skia/include/private/SkPathRef.h:152:13 in isFinite

Categories

(Core :: Graphics, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla76
Tracking Status
firefox76 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: kats)

References

(Regression)

Details

(Keywords: intermittent-failure, regression, Whiteboard: [retriggered][stockwell unknown])

Attachments

(2 files)

Filed by: rmaries [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=293106172&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/ICMdrm6qRLyj-GwYtyGf7w/runs/0/artifacts/public/logs/live_backing.log


[task 2020-03-13T21:19:01.827Z] 21:19:01 INFO - TEST-START | gfx/layers/apz/test/mochitest/test_group_hittest.html
[task 2020-03-13T21:19:22.471Z] 21:19:22 INFO - GECKO(2625) | #35 content_process_main /builds/worker/checkouts/gecko/browser/app/../../ipc/contentproc/plugin-container.cpp:56:28 (firefox+0xc86c7)
[task 2020-03-13T21:19:22.471Z] 21:19:22 INFO - GECKO(2625) | #36 main /builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp:303:18 (firefox+0xc86c7)
[task 2020-03-13T21:19:22.472Z] 21:19:22 INFO - GECKO(2625) | SUMMARY: ThreadSanitizer: data race /builds/worker/checkouts/gecko/gfx/skia/skia/include/private/SkPathRef.h:152:13 in isFinite
[task 2020-03-13T21:19:22.472Z] 21:19:22 INFO - GECKO(2625) | ==================
[task 2020-03-13T21:19:22.472Z] 21:19:22 INFO - GECKO(2625) | ###!!! [Parent][MessageChannel] Error: (msgtype=0x37012E,name=PContent::Msg_DetachBrowsingContext) Channel error: cannot send/recv
[task 2020-03-13T21:19:22.904Z] 21:19:22 ERROR - GECKO(2625) | A content process crashed and MOZ_CRASHREPORTER_SHUTDOWN is set, shutting down
[task 2020-03-13T21:19:34.409Z] 21:19:34 INFO - GECKO(2625) | 1584134374404 Marionette TRACE Received observer notification xpcom-will-shutdown
[task 2020-03-13T21:19:34.410Z] 21:19:34 INFO - GECKO(2625) | 1584134374405 Marionette INFO Stopped listening on port 2828
[task 2020-03-13T21:19:34.410Z] 21:19:34 INFO - GECKO(2625) | 1584134374406 Marionette DEBUG Marionette stopped listening

Component: Panning and Zooming → Graphics
Flags: needinfo?(kats)
Regressed by: 1617427
Has Regression Range: --- → yes
Keywords: regression
Whiteboard: [stockwell needswork:owner] → [stockwell needswork:owner][retriggered]

I'll investigate, thanks.

Assignee: nobody → kats
Flags: needinfo?(kats)
Attached file: Stacks from the log (deleted)

For posterity, here are the stacks. They seem pretty clear:

  • OMTP is enabled.
  • A SkPathRef is created during ClientLayerManager::EndTransaction as part of SVGGeometryFrame::Render (inside GetOrBuildPath).
  • This SkPathRef seems to be cached on the element, and a hit-testing operation ends up calling ContainsPoint on it. Unfortunately, this internally triggers a lazy call to computeBounds, which is a mutating operation. So there's a write from the main thread.
  • This SkPathRef is also passed to the paint thread in a captured drawtarget and read from in the process of rasterization.

I don't remember all the OMTP details here, but I'm guessing this is a legitimate bug in the code. The SkPathRef is held on to via a RefPtr, so in that respect it's threadsafe. But the fact that the main thread holds on to it and mutates it after handing off a reference to the same object to the paint thread seems bad.
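To make the hazard concrete, here is a minimal sketch of the pattern, assuming a simplified stand-in class. LazyPath, Point, and boundsDirty are hypothetical names for illustration and are not Skia's actual SkPathRef implementation. It shows how a query that looks read-only can write to shared state on first use, which is the kind of access tsan flags when another thread is reading the same object.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Hypothetical stand-in for a path object with lazily computed bounds.
// This is NOT Skia's SkPathRef; it only mirrors the pattern described above.
struct Point {
  float x;
  float y;
};

class LazyPath {
 public:
  explicit LazyPath(std::vector<Point> aPoints) : mPoints(std::move(aPoints)) {}

  // Looks like a read-only query, but the first call writes the cache fields.
  bool isFinite() const {
    if (mBoundsDirty) {
      computeBounds();  // mutation hidden behind a const method
    }
    return mIsFinite;
  }

  // Exposed only so the sketch can show that isFinite() changed state.
  bool boundsDirty() const { return mBoundsDirty; }

 private:
  void computeBounds() const {
    bool finite = true;
    for (const Point& p : mPoints) {
      finite = finite && std::isfinite(p.x) && std::isfinite(p.y);
    }
    // Unsynchronized writes: if another thread reads or writes these fields
    // at the same time, that is a data race.
    mIsFinite = finite;
    mBoundsDirty = false;
  }

  std::vector<Point> mPoints;
  mutable bool mBoundsDirty = true;
  mutable bool mIsFinite = false;
};
```

In this sketch, sharing one LazyPath between a thread that hit-tests and a thread that rasterizes would need either synchronization around the cache fields or separate objects per thread. Reference counting alone only protects the object's lifetime, not its contents.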

Also to be clear, I think this is a pre-existing bug that somehow got exposed by the new test I added. It's not clear to me why the new test triggered the bug, because the failure occurs a few tests later. It might be that tsan takes its time to symbolicate the stacks, and so the log is not properly representing exactly when the fault is detected. Or it might be that the SVGGeometryElement is being cached across multiple tests somehow. The hit-test operation that is "writing" to the SkPathRef seems like it could be coming from the new test that I added, and I guess it might be getting painted via OMTP during some later test, but it seems odd that it wouldn't get painted (and trigger the tsan failure) sooner.

The try push at https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&selectedJob=293296971&revision=3d87921dddbf32432677bcecde846dc0322ab826 disables the caching if the drawtarget is a capture DT. That turns caching off during OMTP recording, which should solve the problem. The try push seems green so far.
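The idea behind that change can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual patch: Path, DrawTarget, GeometryElement, isCapture, and hasCachedPath are hypothetical names, and std::shared_ptr stands in for Gecko's RefPtr. The point is that a path built for a capturing drawtarget is never stored on the element, so the main thread keeps no handle to an object the paint thread may later read.

```cpp
#include <cassert>
#include <memory>

// All types and names here are hypothetical, for illustration only.
struct Path {
  int id;
};

struct DrawTarget {
  // True when drawing commands are recorded for replay on another thread.
  bool isCapture;
};

class GeometryElement {
 public:
  std::shared_ptr<const Path> getOrBuildPath(const DrawTarget& aTarget) {
    // Only cache when the path will not be handed to another thread.
    const bool cacheable = !aTarget.isCapture;
    if (cacheable && mCachedPath) {
      return mCachedPath;
    }
    std::shared_ptr<const Path> path = std::make_shared<Path>(Path{mNextId++});
    if (cacheable) {
      mCachedPath = path;
    }
    return path;
  }

  bool hasCachedPath() const { return mCachedPath != nullptr; }

 private:
  std::shared_ptr<const Path> mCachedPath;
  int mNextId = 0;
};
```

The trade-off in this sketch is that every capturing paint rebuilds the path, in exchange for the capture owning the only reference to the object it records.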

Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/09c8d799a2b4
Don't cache the SVGGeometryElement's Path with OMTP rendering. r=longsonr
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla76
