Closed
Bug 1217251
Opened 9 years ago
Closed 9 years ago
Crash in APZ TaskThrottler in Fennec
Categories
(Core :: Panning and Zooming, defect)
Tracking
()
RESOLVED
FIXED
mozilla44
Tracking | Status | |
---|---|---|
firefox44 | --- | fixed |
People
(Reporter: rbarker, Assigned: botond)
References
Details
Attachments
(2 files)
Getting crash in fennec with C++APZ enabled. The TaskThrottler is trying to post a task to a nullptr returned by MessageLoop::current();
Stack trace:
Program received signal SIGSEGV, Segmentation fault.
0x57fd050a in mozilla::layers::TaskThrottler::PostTask (this=this@entry=0x605ab480, aLocation=..., aTask=<error reading variable: Cannot access memory at address 0x0>, aTimeStamp=...)
at /Volumes/firefox/zoom/gfx/layers/apz/src/TaskThrottler.cpp:53
53 if (!MessageLoop::current()) { MOZ_CRASH(); }
(gdb) bt
#0 0x57fd050a in mozilla::layers::TaskThrottler::PostTask (this=this@entry=0x605ab480, aLocation=..., aTask=<error reading variable: Cannot access memory at address 0x0>, aTimeStamp=...)
at /Volumes/firefox/zoom/gfx/layers/apz/src/TaskThrottler.cpp:53
#1 0x57fd079e in mozilla::layers::AsyncPanZoomController::RequestContentRepaint (this=0x5caef600, this@entry=<error reading variable: Cannot access memory at address 0xbeaa6d74>,
aFrameMetrics=..., aFrameMetrics@entry=<error reading variable: Cannot access memory at address 0xbeaa6d74>, aThrottled=aThrottled@entry=true)
at /Volumes/firefox/zoom/gfx/layers/apz/src/AsyncPanZoomController.cpp:2732
#2 0x57fd0bcc in mozilla::layers::AsyncPanZoomController::RequestContentRepaint (this=<error reading variable: Cannot access memory at address 0xbeaa6d74>)
at /Volumes/firefox/zoom/gfx/layers/apz/src/AsyncPanZoomController.cpp:2698
#3 0x57fd0f46 in mozilla::layers::AsyncPanZoomController::ScheduleCompositeAndMaybeRepaint (this=<error reading variable: Cannot access memory at address 0xbeaa6d74>)
at /Volumes/firefox/zoom/gfx/layers/apz/src/AsyncPanZoomController.cpp:2631
#4 0x57fd0f46 in mozilla::layers::AsyncPanZoomController::ScheduleCompositeAndMaybeRepaint (this=<error reading variable: Cannot access memory at address 0xbeaa6d74>)
at /Volumes/firefox/zoom/gfx/layers/apz/src/AsyncPanZoomController.cpp:2631
Reporter | ||
Comment 1•9 years ago
|
||
This is 100% reproducible in the Android 4.3 emulator.
Assignee | ||
Comment 2•9 years ago
|
||
Regression from bug 1213273. I'm assuming MessageLoop::current() being null happens during shutdown or some other edge case, and we can just omit posting the timeout task in that case.
Blocks: 1213273
Reporter | ||
Comment 3•9 years ago
|
||
(In reply to Botond Ballo [:botond] (at standards meeting Oct 19-24) from comment #2)
> Regression from bug 1213273. I'm assuming MessageLoop::current() being null
> happens during shutdown or some other edge case, and we can just omit
> posting the timeout task in that case.
Sorry, I should have given more info on reproducing. Unfortunately this doesn't just occur at shut down. This will happen in Fennec while scrolling a page up and down quickly. I came across this while examining a reftest (gfx/layers/apz/test/reftest/async-scrollbar-1-v.html) for an unrelated failure.
Assignee | ||
Comment 4•9 years ago
|
||
Oh, interesting. Then I guess on Fennec, some threads (that APZ code can run on) don't have a MessageLoop at all?
Assignee | ||
Comment 5•9 years ago
|
||
It would be helpful to know what thread this occurs on. Randall, if you're able to catch this in gdb again, could you get the name of the thread that it occurs on?
Flags: needinfo?(rbarker)
Reporter | ||
Comment 6•9 years ago
|
||
(In reply to Botond Ballo [:botond] (at standards meeting Oct 19-24) from comment #5)
> It would be helpful to know what thread this occurs on. Randall, if you're
> able to catch this in gdb again, could you get the name of the thread that
> it occurs on?
It is easy to reproduce. I did this to try and get the thread name:
if (!MessageLoop::current()) {
RLOG("NO current MessageLoop!");
PRThread* t = nsThreadManager::get()->GetCurrentThread()->GetPRThread();
RLOG("In thread: %s", (PR_GetThreadName(t) ? PR_GetThreadName(t) : "<NO THREAD NAME>"));
return;
}
Unfortunately it printed out: "<NO THREAD NAME>"
Is there something else I can do?
Flags: needinfo?(rbarker) → needinfo?(botond)
Comment 7•9 years ago
|
||
I havea stack in 1217529, feel free to dupe on way or the other ;)
Assignee | ||
Comment 8•9 years ago
|
||
From the stack in bug 1217529, it looks like the call happens on the Java UI thread (since we call into APZ from JNI). I guess it makes sense for the Java UI thread not to have a MessageLoop.
I see APZThreadUtils::RunOnControllerThread() special-cases Android APZ to use a different mechanism (than MessageLoop) to post a Task. We could probably use this function to post the task in TaskThrottler. (It might change the task from sometimes running on the compositor thread and sometimes on the controller thread, to alwas running on the controller thread, but that should be fine.)
Flags: needinfo?(botond)
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → botond
Assignee | ||
Comment 9•9 years ago
|
||
Bug 1217251 - Add APZThreadUtils::RunDelayedTaskOnControllerThread(). r=mstange
Assignee | ||
Comment 10•9 years ago
|
||
Bug 1217251 - In TaskThrottler, dispatch the timeout task correctly from the Java UI thread on Android. r=mstange
Assignee | ||
Comment 11•9 years ago
|
||
The attached patches implement the approach described in comment 8 and should solve the problem.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7bb0ae36566e
Comment 12•9 years ago
|
||
I confirm that this fixes the crashes for me.
Assignee | ||
Comment 14•9 years ago
|
||
Great, thanks! Posting for review.
Assignee | ||
Comment 15•9 years ago
|
||
Comment on attachment 8677686 [details]
MozReview Request: Bug 1217251 - Add APZThreadUtils::RunDelayedTaskOnCurrentThread(). r=kats
Bug 1217251 - Add APZThreadUtils::RunDelayedTaskOnControllerThread(). r=mstange
Attachment #8677686 -
Flags: review?(mstange)
Assignee | ||
Comment 16•9 years ago
|
||
Comment on attachment 8677687 [details]
MozReview Request: Bug 1217251 - In TaskThrottler, dispatch the timeout task correctly from the Java UI thread on Android. r=kats
Bug 1217251 - In TaskThrottler, dispatch the timeout task correctly from the Java UI thread on Android. r=mstange
Attachment #8677687 -
Flags: review?(mstange)
Comment 17•9 years ago
|
||
(In reply to Botond Ballo [:botond] (at standards meeting Oct 19-24) from comment #8)
> (It might
> change the task from sometimes running on the compositor thread and
> sometimes on the controller thread, to alwas running on the controller
> thread, but that should be fine.)
I would like to avoid this if possible. We can instead add an APZThreadUtils::RunDelayedTaskOnCurrentThread() which uses MessageLoop::current() if it is non-null. If it is null then it falls back to what you're doing now, which is posting to the controller thread. We can even add stricter assertions that if MessageLoop::current() is non-null then we must be on Fennec.
Assignee | ||
Comment 18•9 years ago
|
||
Sounds reasonable. I actually considered doing that I just wasn't sure if it mattered enough to be worth it.
Assignee | ||
Comment 19•9 years ago
|
||
Comment on attachment 8677686 [details]
MozReview Request: Bug 1217251 - Add APZThreadUtils::RunDelayedTaskOnCurrentThread(). r=kats
Clearing review flag until I make the requested change.
Attachment #8677686 -
Flags: review?(mstange)
Assignee | ||
Updated•9 years ago
|
Attachment #8677687 -
Flags: review?(mstange)
Assignee | ||
Comment 20•9 years ago
|
||
Comment on attachment 8677686 [details]
MozReview Request: Bug 1217251 - Add APZThreadUtils::RunDelayedTaskOnCurrentThread(). r=kats
Bug 1217251 - Add APZThreadUtils::RunDelayedTaskOnCurrentThread(). r=kats
Attachment #8677686 -
Attachment description: MozReview Request: Bug 1217251 - Add APZThreadUtils::RunDelayedTaskOnControllerThread(). r=mstange → MozReview Request: Bug 1217251 - Add APZThreadUtils::RunDelayedTaskOnCurrentThread(). r=kats
Attachment #8677686 -
Flags: review?(bugmail.mozilla)
Assignee | ||
Comment 21•9 years ago
|
||
Comment on attachment 8677687 [details]
MozReview Request: Bug 1217251 - In TaskThrottler, dispatch the timeout task correctly from the Java UI thread on Android. r=kats
Bug 1217251 - In TaskThrottler, dispatch the timeout task correctly from the Java UI thread on Android. r=kats
Attachment #8677687 -
Attachment description: MozReview Request: Bug 1217251 - In TaskThrottler, dispatch the timeout task correctly from the Java UI thread on Android. r=mstange → MozReview Request: Bug 1217251 - In TaskThrottler, dispatch the timeout task correctly from the Java UI thread on Android. r=kats
Attachment #8677687 -
Flags: review?(bugmail.mozilla)
Comment 22•9 years ago
|
||
Comment on attachment 8677686 [details]
MozReview Request: Bug 1217251 - Add APZThreadUtils::RunDelayedTaskOnCurrentThread(). r=kats
https://reviewboard.mozilla.org/r/22997/#review20677
::: gfx/layers/apz/util/APZThreadUtils.cpp:92
(Diff revision 2)
> + // Fennec does not have a MessageLoop::current().
Indent the body of the else block more.
Also, append "on the controller thread" to this comment.
Attachment #8677686 -
Flags: review?(bugmail.mozilla) → review+
Updated•9 years ago
|
Attachment #8677687 -
Flags: review?(bugmail.mozilla) → review+
Comment 23•9 years ago
|
||
Comment on attachment 8677687 [details]
MozReview Request: Bug 1217251 - In TaskThrottler, dispatch the timeout task correctly from the Java UI thread on Android. r=kats
https://reviewboard.mozilla.org/r/22999/#review20679
Comment 24•9 years ago
|
||
https://reviewboard.mozilla.org/r/22997/#review20681
::: gfx/layers/apz/util/APZThreadUtils.cpp:96
(Diff revision 2)
> + MOZ_ASSERT(false, "This non-Fennec platform should have a MessageLoop::current()");
Actually make this a release assert and remove the |delete| statement. If this ever happens I want to make sure we know about it. In particular I don't want the task throttler's task to get deleted and it to keep going because that seems like a UAF waiting to happen.
Assignee | ||
Comment 25•9 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #24)
> Actually make this a release assert and remove the |delete| statement. If
> this ever happens I want to make sure we know about it. In particular I
> don't want the task throttler's task to get deleted and it to keep going
> because that seems like a UAF waiting to happen.
Ah, good point - it's the running of the task that nulls TaskThrottler::mTimeoutTask, so if we delete the task without running it first then we indeed get a dangling pointer. Thanks for catching this!
Comment 26•9 years ago
|
||
Comment 27•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/027c42d1dcc8
https://hg.mozilla.org/mozilla-central/rev/c6404f3d787c
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
You need to log in
before you can comment on or make changes to this bug.
Description
•