Closed
Bug 1200063
Opened 9 years ago
Closed 9 years ago
Intermittent test_wheel_transactions.html | Test timed out
Categories
(Core :: Panning and Zooming, defect)
Tracking
()
RESOLVED
FIXED
mozilla43
Tracking | Status | |
---|---|---|
firefox43 | --- | fixed |
People
(Reporter: RyanVM, Assigned: botond)
References
Details
(Keywords: intermittent-failure)
Attachments
(7 files)
(deleted),
image/png
|
Details | |
(deleted),
text/x-review-board-request
|
kats
:
review+
|
Details |
MozReview Request: Bug 1200063 - Share a paint throttler between APZCs in the same layers id. r=kats
(deleted),
text/x-review-board-request
|
kats
:
review+
|
Details |
(deleted),
text/x-review-board-request
|
kats
:
review+
|
Details |
(deleted),
text/x-review-board-request
|
kats
:
review+
|
Details |
(deleted),
text/x-review-board-request
|
kats
:
review+
|
Details |
(deleted),
patch
|
Details | Diff | Splinter Review |
No description provided.
Flags: needinfo?(botond)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 4•9 years ago
|
||
Thanks, I'll investigate.
Assignee: nobody → botond
Flags: needinfo?(botond)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 7•9 years ago
|
||
Pushed to try with logging:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=550e934ebb5e
Assignee | ||
Comment 8•9 years ago
|
||
Works better when the logging actually compiles:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4d79a6e896d6
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 14•9 years ago
|
||
diagnosis |
From the logs, what's happening is:
- The test hangs because it never receives a 'scroll' event
after dispatching a native wheel event.
- This is because the repaint request that's expected to
generate this 'scroll' event is never sent by APZ.
- APZ *does* call RequestContentRepaint() after processing
the wheel event.
==> The repaint request is likely stuck in the TaskThrottler.
Reading the TaskThrottler code, it seems to me that this can easily happen if, at the time the RequestContentRepaint() call happens, there has been a previous repaint request within 500 ms (TaskThrottler::mMaxWaitTime), and that repaint request does not produce a layer transaction (which can happen if nothing changes), so TaskComplete() is never called.
Kats, what's supposed to be triggering the execution of the task in TaskThrottler in such a case?
Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(bugmail.mozilla)
Comment 15•9 years ago
|
||
Hm, good catch. It does appear that this would result in the repaint request to get stuck. The max-wait timeout should prevent it from getting permanently stuck in production code since the next repaint request will get processed but in test scenarios we might not get one of those and/or the test may assert things in the meantime.
We should probably rethink the task throttle architecture; it may not make sense to keep any more particularly since it's per-APZC and so not super effective at throttling anyway.
Flags: needinfo?(bugmail.mozilla)
Assignee | ||
Comment 16•9 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #15)
> The max-wait timeout should prevent it from getting
> permanently stuck in production code since the next repaint request will get
> processed
Do you mean that in production we should always get a layer transaction within a short window of time that would trigger a call to TaskComplete()? Is that the case even if nothing is changing on the page?
Comment 17•9 years ago
|
||
No, I just mean that in production the user will likely do something that triggers a repaint, and it won't get stuck permanently. If the user does nothing and nothing is happening on the page then yeah, we might end up in a user-visible bad state.
Assignee | ||
Comment 18•9 years ago
|
||
OK, thanks, that makes sense.
What do you think about revising TaskThrottler in the following ways:
- Keep just one instance per APZCTreeManager, and have all APZCs
in it use that.
In addition to making the throttling per-APZCTM rather than
per-APZC, this will make the tracking of average paint durations
per-APZCTM. Is that reasonable?
- Use an nsITimer to make sure the pending repaint request is sent
when mMaxWaitTime elapses, even if no subsequent repaint request
has arrived in that time.
Comment 19•9 years ago
|
||
I think we should have one per process (or one per layers id), rather than one per APZCTM. We shouldn't really be blocking paints in one process if paints are outstanding in some other process. And using a timer of some sort to ensure the repaint request is flushed would be good, yes.
Assignee | ||
Comment 20•9 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #19)
> I think we should have one per process (or one per layers id), rather than
> one per APZCTM. We shouldn't really be blocking paints in one process if
> paints are outstanding in some other process.
Yep, that's an excellent point.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 28•9 years ago
|
||
Bug 1200063 - Make APZCTreeManager the place where GetFrameTime() can be overridden. r=kats
Attachment #8662057 -
Flags: review?(bugmail.mozilla)
Assignee | ||
Comment 29•9 years ago
|
||
Bug 1200063 - Share a paint throttler between APZCs in the same layers id. r=kats
Attachment #8662058 -
Flags: review?(bugmail.mozilla)
Assignee | ||
Comment 30•9 years ago
|
||
Bug 1200063 - Add a generic implementation of nsITimerCallback that's usable with a lambda or other function object. r=kats
We can consider moving this to xpcom in the future.
Attachment #8662059 -
Flags: review?(bugmail.mozilla)
Assignee | ||
Comment 31•9 years ago
|
||
Bug 1200063 - Make sure TaskThrottler sends its pending repaint request eventually, even if it never receives a notification from the previous request. r=kats
This is important because the previous repaint can be a no-op, and those don't trigger notifications.
Attachment #8662060 -
Flags: review?(bugmail.mozilla)
Assignee | ||
Comment 32•9 years ago
|
||
Here's an attempt at fixing this. A few notes for Kats:
- When the APZCTreeManager is destroyed, so will the throttlers.
Do you think there's a risk of an APZC still being alive and
accessing a throttler (to which it holds a reference) past that
time? If so, I guess we'd need to do something like store a pointer
to the throttler in APZC, null it out in Destroy(), and null-check
all accesses to it, but I'd like to avoid that if possible.
- If you don't like GenericTimerCallback being in APZThreadUtils.h,
I'm happy to move it somewhere else (especially if you suggest
a somewhere else).
- I didn't make TaskThrottler reference-counted, and the timer
callback captures and uses a raw pointer to it, but the
destructor cancels the timer, so it shouldn't fire after the
throttler is destroyed. Does that seem reasonable?
Assignee | ||
Comment 33•9 years ago
|
||
Assignee | ||
Comment 34•9 years ago
|
||
(In reply to Botond Ballo [:botond] from comment #33)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=047fc56ceb24
And, of course, STLport doesn't like storing a UniquePtr in a map...
Updated•9 years ago
|
Attachment #8662057 -
Flags: review?(bugmail.mozilla) → review+
Comment 35•9 years ago
|
||
Comment on attachment 8662057 [details]
MozReview Request: Bug 1200063 - Make APZCTreeManager the place where GetFrameTime() can be overridden. r=kats
https://reviewboard.mozilla.org/r/19491/#review17453
::: gfx/layers/apz/src/AsyncPanZoomController.cpp:816
(Diff revision 1)
> + // mTreeManager must be initialized before GetFrameTime() is called
Technically there is a call to GetFrameTime in the paint throttler initialization too but since it goes away in a future patch ao it doesn't matter. :)
::: gfx/tests/gtest/TestAsyncPanZoomController.cpp:158
(Diff revision 1)
> + TimeStamp GetFrameTime() override {
nit: add blank lines before and after this function.
Comment 36•9 years ago
|
||
Comment on attachment 8662058 [details]
MozReview Request: Bug 1200063 - Share a paint throttler between APZCs in the same layers id. r=kats
https://reviewboard.mozilla.org/r/19493/#review17461
That turned out to be much cleaner than I thought it would be, nice work!
Attachment #8662058 -
Flags: review?(bugmail.mozilla) → review+
Comment 37•9 years ago
|
||
Comment on attachment 8662059 [details]
MozReview Request: Bug 1200063 - Add a generic implementation of nsITimerCallback that's usable with a lambda or other function object. r=kats
https://reviewboard.mozilla.org/r/19495/#review17455
Looks good, but please file a follow-up bug to move it to xpcom.
::: gfx/layers/apz/util/APZThreadUtils.h:90
(Diff revision 1)
> +GenericTimerCallback<Function>* MakeTimerCallback(const Function& aFunction)
This should probably be called NewTimerCallback to be consistent with NewRunnableMethod and friends. I didn't think of that when I wrote MakeAPZCInstance but that one should probably be renamed as well.
Attachment #8662059 -
Flags: review?(bugmail.mozilla) → review+
Comment 38•9 years ago
|
||
Comment on attachment 8662060 [details]
MozReview Request: Bug 1200063 - Make sure TaskThrottler sends its pending repaint request eventually, even if it never receives a notification from the previous request. r=kats
https://reviewboard.mozilla.org/r/19489/#review17465
::: gfx/layers/apz/src/TaskThrottler.cpp:51
(Diff revision 1)
> + MakeTimerCallback([this, timeoutTime]()
I'm not a huge fan of the large indent on the callback implementation but I can live with it because it's a small function. I'd like to find a good coding style for this and put it in the gecko coding style doc before we have too many of these things in the codebase.
Attachment #8662060 -
Flags: review?(bugmail.mozilla) → review+
Comment 39•9 years ago
|
||
(In reply to Botond Ballo [:botond] from comment #32)
> Here's an attempt at fixing this. A few notes for Kats:
>
> - When the APZCTreeManager is destroyed, so will the throttlers.
> Do you think there's a risk of an APZC still being alive and
> accessing a throttler (to which it holds a reference) past that
> time? If so, I guess we'd need to do something like store a pointer
> to the throttler in APZC, null it out in Destroy(), and null-check
> all accesses to it, but I'd like to avoid that if possible.
Hm, I feel like this is a legitimate concern, because we might have layers holding on to AsyncPanZoomController instances and which end up doing things like run another animation frame or something. That might then try to use the dead task throttler which would probably be a UAF vulnerability. We can at least detect it somewhat by adding MOZ_ASSERT(mTreeManager) at places where we use the task throttler (wrapping it in a helper if needed). Another option is to have a static dummy instance of a TaskThrottler, and in Destroy() we point the reference to the dummy instance.
> - If you don't like GenericTimerCallback being in APZThreadUtils.h,
> I'm happy to move it somewhere else (especially if you suggest
> a somewhere else).
This is fine, although it would be good to move it into xpcom.
> - I didn't make TaskThrottler reference-counted, and the timer
> callback captures and uses a raw pointer to it, but the
> destructor cancels the timer, so it shouldn't fire after the
> throttler is destroyed. Does that seem reasonable?
As long as the TaskThrottler destructor and the timer callback are running on the same thread, yes. Can we add asserts for that?
Assignee | ||
Comment 40•9 years ago
|
||
The issue in comment 34 made me have to make TaskThrottler reference-counted, which I believe will end up mooting both lifetime issues. Will post updated patches as soon as I confirm it's building on STLport platforms.
Assignee | ||
Comment 41•9 years ago
|
||
Comment on attachment 8662057 [details]
MozReview Request: Bug 1200063 - Make APZCTreeManager the place where GetFrameTime() can be overridden. r=kats
Bug 1200063 - Make APZCTreeManager the place where GetFrameTime() can be overridden. r=kats
Assignee | ||
Comment 42•9 years ago
|
||
Comment on attachment 8662058 [details]
MozReview Request: Bug 1200063 - Share a paint throttler between APZCs in the same layers id. r=kats
Bug 1200063 - Share a paint throttler between APZCs in the same layers id. r=kats
In this process, TaskThrottler is made reference-counted.
Assignee | ||
Comment 43•9 years ago
|
||
Comment on attachment 8662059 [details]
MozReview Request: Bug 1200063 - Add a generic implementation of nsITimerCallback that's usable with a lambda or other function object. r=kats
Bug 1200063 - Add a generic implementation of nsITimerCallback that's usable with a lambda or other function object. r=kats
We can consider moving this to xpcom in the future.
Assignee | ||
Comment 44•9 years ago
|
||
Comment on attachment 8662060 [details]
MozReview Request: Bug 1200063 - Make sure TaskThrottler sends its pending repaint request eventually, even if it never receives a notification from the previous request. r=kats
Bug 1200063 - Make sure TaskThrottler sends its pending repaint request eventually, even if it never receives a notification from the previous request. r=kats
This is important because the previous repaint can be a no-op, and those don't trigger notifications.
Assignee | ||
Comment 45•9 years ago
|
||
Bug 1200063 - Rename MakeAPZCInstance to NewAPZCInstance for consistency. r=kats
Attachment #8662103 -
Flags: review?(bugmail.mozilla)
Assignee | ||
Comment 46•9 years ago
|
||
Comment on attachment 8662058 [details]
MozReview Request: Bug 1200063 - Share a paint throttler between APZCs in the same layers id. r=kats
Re-requesting review. I believe the reference-counting of TaskThrottler solves the issue of APZC accessing a potentially destroyed throttler.
Attachment #8662058 -
Flags: review+ → review?(bugmail.mozilla)
Assignee | ||
Comment 47•9 years ago
|
||
Comment on attachment 8662060 [details]
MozReview Request: Bug 1200063 - Make sure TaskThrottler sends its pending repaint request eventually, even if it never receives a notification from the previous request. r=kats
Re-requesting review. The lambda now captures an nsRefPtr to the throttler, keeping it alive. (The destructor still calls Cancel() for good measure.)
Attachment #8662060 -
Flags: review+ → review?(bugmail.mozilla)
Assignee | ||
Comment 48•9 years ago
|
||
(In reply to Botond Ballo [:botond] from comment #47)
> Comment on attachment 8662060 [details]
> MozReview Request: Bug 1200063 - Make sure TaskThrottler sends its pending
> repaint request eventually, even if it never receives a notification from
> the previous request. r=kats
>
> Re-requesting review. The lambda now captures an nsRefPtr to the throttler,
> keeping it alive. (The destructor still calls Cancel() for good measure.)
Also, let me know if you like this indentation better :)
Assignee | ||
Comment 49•9 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #37)
> Looks good, but please file a follow-up bug to move it to xpcom.
Filed bug 1205480.
Assignee | ||
Comment 50•9 years ago
|
||
Updated Try push, looking much better:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8aaed62ecfb5
Assignee | ||
Comment 51•9 years ago
|
||
Note that I had to change NS_DECL_ISUPPORTS in GenericTimerCallbackBase to NS_DECL_THREADSAFE_ISUPPORTS because the way we're using it, Release() on the timer callback can be called on different threads. Fixed locally.
Assignee | ||
Comment 52•9 years ago
|
||
(In reply to Botond Ballo [:botond] from comment #50)
> Updated Try push, looking much better:
>
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=8aaed62ecfb5
In this Try push I've retriggered test_wheel_transactions 20 times on both a release and a debug build, and it's passing in all cases, suggesting that the patches really do fix the underlying problem that caused the intermittent.
Updated•9 years ago
|
Attachment #8662058 -
Flags: review?(bugmail.mozilla) → review+
Comment 53•9 years ago
|
||
Comment on attachment 8662058 [details]
MozReview Request: Bug 1200063 - Share a paint throttler between APZCs in the same layers id. r=kats
https://reviewboard.mozilla.org/r/19493/#review17489
Comment 54•9 years ago
|
||
Comment on attachment 8662060 [details]
MozReview Request: Bug 1200063 - Make sure TaskThrottler sends its pending repaint request eventually, even if it never receives a notification from the previous request. r=kats
https://reviewboard.mozilla.org/r/19489/#review17493
Looks good, thanks!
Attachment #8662060 -
Flags: review?(bugmail.mozilla) → review+
Updated•9 years ago
|
Attachment #8662103 -
Flags: review?(bugmail.mozilla) → review+
Comment 55•9 years ago
|
||
Comment on attachment 8662103 [details]
MozReview Request: Bug 1200063 - Rename MakeAPZCInstance to NewAPZCInstance for consistency. r=kats
https://reviewboard.mozilla.org/r/19501/#review17497
Comment 56•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/0c3593c96d65
https://hg.mozilla.org/integration/mozilla-inbound/rev/955917086e25
https://hg.mozilla.org/integration/mozilla-inbound/rev/e40b4dbc4a86
https://hg.mozilla.org/integration/mozilla-inbound/rev/4a6a55be6676
https://hg.mozilla.org/integration/mozilla-inbound/rev/92bf6f7917f3
Comment 57•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/0c3593c96d65
https://hg.mozilla.org/mozilla-central/rev/955917086e25
https://hg.mozilla.org/mozilla-central/rev/e40b4dbc4a86
https://hg.mozilla.org/mozilla-central/rev/4a6a55be6676
https://hg.mozilla.org/mozilla-central/rev/92bf6f7917f3
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
Comment 58•9 years ago
|
||
Comment 59•9 years ago
|
||
Comment on attachment 8662415 [details] [diff] [review]
Rename MakeAPZCInstance to NewAPZCInstance for consistency
Argh, ignore this. Apologies, I typed `hg bz` :(.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Intermittent Failures Robot) |
You need to log in
before you can comment on or make changes to this bug.
Description
•