Closed Bug 978712 Opened 11 years ago Closed 11 years ago

RefreshDriver keeps getting woken up while idling at contact's details page

Categories

(Core :: CSS Parsing and Computation, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla30
blocking-b2g 1.3T+
Tracking Status
b2g-v1.3T --- fixed

People

(Reporter: ting, Assigned: dbaron)

References

Details

Attachments

(1 file)

STR: 1. Press contacts icon at homescren 2. Select any contact to enter the details page 3. RefreshDriver keeps getting woken up by nsIFrame::SchedulePaint(), and invokes nsViewManager::ProcessPendingUpdates() Can still see the behavior even after pressing message button to bring Messages app to foreground.
Enabled "Flash repainted area" but couldn't see any flashing while idling at the page. Though I keep seeing the log at nsViewManager::ProcessPendingUpdatesForView() before/after painting.
Try disabling hwc. I don't think paint flashing works for frames composed by hwc. Also, I have been able to detect this in the past by monitoring CPU usage with adb shell top -m 5.
This is the bug I previously reported, but we could never track it down.
I haven't had a complete picture yet, but I found there's a repeat cycle: ... -> nsLayoutUtils.PaintFrame() -> nsLayoutUtils.HasAnimationOrTransitionForCompositor(aProprerty == eCSSProperty_transform) -> ElementAnimations::CanPerformOnCompositorThread() -> ActiveLayerTracker::NotifyAnimated() ... LayerActivityTracker::NotifyExpired() -> nsIFrame::SchedulePaint() -> nsPresShell::ScheduleViewManagerFlush() -> nsRefreshDriver::ScheduleViewManagerFlush() ... nsRefreshDriver::Tick() -> nsViewManager::ProcessPendingUpdates() -> nsViewManager::ProcessPendingUpdatesForView()
(In reply to Ting-Yu Chou [:ting] from comment #4) > -> nsLayoutUtils.PaintFrame() > -> nsLayoutUtils.HasAnimationOrTransitionForCompositor(aProprerty == > eCSSProperty_transform) > -> ElementAnimations::CanPerformOnCompositorThread() > -> ActiveLayerTracker::NotifyAnimated() We could try to disable the animations under apps/communications/contacts/style/animations.css and see what gives. I wonder if this is caused by our CSS or if it's a problem in Gecko when faced with certain specific conditions.
> We could try to disable the animations under > apps/communications/contacts/style/animations.css and see what gives. I > wonder if this is caused by our CSS or if it's a problem in Gecko when faced > with certain specific conditions. Not sure how to make it. I tried what http://goo.gl/7EZZ4j said, and then the details page does not show up.
(In reply to Gabriele Svelto [:gsvelto] from comment #5) > We could try to disable the animations under > apps/communications/contacts/style/animations.css and see what gives. I > wonder if this is caused by our CSS or if it's a problem in Gecko when faced > with certain specific conditions. I essentially tried this in bug 901964 and it worked. Its just annoying to build a bunch of js code to treat the end of the animation as special and replace it with a different a style. Do we need to do that for all animations or just this one? Its hard to know since we don't understand whats triggering the problem yet.
nsDisplayListBuilder::AddAnimationsAndTransitionsToLayer checks pt->IsRunningAt(currentTime) / anim->IsRunningAt(currentTime) but ElementTransitions::CanPerformOnCompositorThread and ElementAnimations::CanPerformOnCompositorThread do not. I think that's likely the problem; the latter functions should be fixed to match when we can actually perform the animations on the compositor thread.
Status: NEW → ASSIGNED
Component: General → CSS Parsing and Computation
Product: Firefox OS → Core
Assignee: nobody → dbaron
This changes the behavior of the CanPerformOnCompositorThread methods of both ElementAnimations and ElementTransitions to check that the respective animations or transitions are actually running. This is ok because: - The main caller is nsLayoutUtils::HasAnimationsForCompositor, and all of its callers pretty clearly want the more restricted behavior (they're concerned with layer activity) - The only other callers of these functions are nsAnimationManager::FlushAnimations and nsTransitionManager::FlushTransitions (determining when to do throttling), nsAnimationManager::GetAnimationsForCompositor (whose only caller, nsDisplayListBuilder::AddAnimationsAndTransitionsToLayer, also checks IsRunningAt). I think these also all want or are fine with having the IsRunningAt check. As to the actual changes: - In the animation manager, I think it's a mistake that ElementAnimation::IsRunningAt didn't already check mIterationDuration, since we throw out animations with a bad iteration-duration in ElementAnimations::EnsureStyleRuleFor. So this makes that change as well. - In the transition manager, IsRunningAt already checks !IsRemovedSentinel(). I've confirmed in gdb on a device that this fixes the repeated nsIFrame::SchedulePaint calls that were the symptom of this bug.
Attachment #8386420 - Flags: review?(cam)
Also, thanks for figuring out most of the problem. Comment 4 was very helpful.
(In reply to David Baron [:dbaron] (needinfo? me) (UTC-8) from comment #10) > Also, thanks for figuring out most of the problem. Comment 4 was very > helpful. Thank you, David, glad to know it is helpful.
I have a few test cases forthcoming for bug 975261 that I'd like to double-check still pass with this patch. Will update later today.
Comment on attachment 8386420 [details] [diff] [review] Prevent non-running transitions and animations (animations or transitions during their delay period, and animations after they finish) from repeatedly poking layer activity because we think we can run them on the compositor Review of attachment 8386420 [details] [diff] [review]: ----------------------------------------------------------------- I think it would be helpful if the comment above CanPerformOnCompositorThread's declaration described the conditions when it will return true and false. It's not clear from the name alone that it would take the current time and whether the animation is running right now into account. Can you remove the IsRunningAt check from nsDisplayListBuilder::AddAnimationsAndTransitionsToLayer now that it is being done within CanPerformOnCompositorThread? r=me with the comment and the change to AddAnimationsAndTransitionsToLayer if it makes sense.
Attachment #8386420 - Flags: review?(cam) → review+
[2014-03-06 16:24:52 -0800] <heycam> dbaron, ok I am ready to ask now. do we have this ping-ponging because we erroneously try to start running the (not yet running or already finished) animation on the compositor, it finds that the animation shouldn't run, and this repeats? [2014-03-06 16:25:05 -0800] <dbaron> heycam, sort of [2014-03-06 16:25:10 -0800] <dbaron> heycam, let me go refresh my memory... [2014-03-06 16:26:55 -0800] <dbaron> heycam, hmmm, Ting-Yu's first stack doesn't quite make sense [2014-03-06 16:27:04 -0800] <dbaron> heycam, so I need to figure out which caller it actually was [2014-03-06 16:27:52 -0800] <dbaron> heycam, oh, right, it's HasAnimationsForCompositor's internals... [2014-03-06 16:28:13 -0800] <dbaron> (which maybe isn't great in itself) [2014-03-06 16:28:52 -0800] <dbaron> heycam, so every time we ask if we can perform the animation on the compositor [2014-03-06 16:29:02 -0800] <dbaron> heycam, we notify the ActiveLayerTracker that the layer should be active [2014-03-06 16:29:30 -0800] <dbaron> heycam, then, at some point, the layer activity expires, since we expire layer activity on a quick expiration tracker (IIRC, 75ms-100ms???) [2014-03-06 16:30:21 -0800] <dbaron> heycam, so when the layer becomes inactive, we need to repaint because we're merging the contents into the parent layer [2014-03-06 16:30:41 -0800] <dbaron> heycam, then when we paint, we ask if the animation can run on the compositor [2014-03-06 16:30:50 -0800] <dbaron> heycam, and it pokes the layer activity tracker and makes the layer active again [2014-03-06 16:30:53 -0800] <heycam> aha [2014-03-06 16:31:05 -0800] <dbaron> heycam, but since there's no actual animation to keep the layer active [2014-03-06 16:31:16 -0800] <dbaron> heycam, we ping-pong through this cycle and repaint every time layer activity expires [2014-03-06 16:31:22 -0800] <heycam> yep ok that makes sense [2014-03-06 16:31:34 -0800] <heycam> thanks [2014-03-06 16:31:41 -0800] <dbaron> heycam, useful to explain; I don't think I had it fully sorted in my head prior to having to explain it :-) [2014-03-06 16:31:45 -0800] <heycam> :) [2014-03-06 16:32:24 -0800] <dbaron> heycam, mind if I paste this in the bug? [2014-03-06 16:32:30 -0800] <heycam> dbaron, go ahead
Though, actually, my explanation above doesn't quite add up, since the LayerActivityTracker checks for animations using the same code in ActiveLayerTracker::IsStyleAnimated. I feel like I should dig into this a little more; I wonder if we're regularly expiring layers while they're running animations on the compositor.
(A key difference in this case relative to running animations is that nsDisplayListBuilder::AddAnimationsAndTransitionsToLayer doesn't actually add any animations to the layer, and I thought I remembered something depending on that for deciding when to expire the layer's activity, but I no longer remember what that was or if I checked that it was actually the case.)
(In reply to Brian Birtles (:birtles) from comment #12) > I have a few test cases forthcoming for bug 975261 that I'd like to > double-check still pass with this patch. Will update later today. Current patch looks fine with tests from bug 975261 and bug 964646.
So the thing that prevents us from expiring layers when animations are running on the compositor is this: when an animation is running on the compositor, and throttled (not updating style) on the main thread, the refresh driver is still calling the WillRefresh method of the nsAnimationManager/nsTransitionManager every refresh cycle, and that in turn calls CanPerformOnCompositorThread every cycle, which pokes the layer activity timer. So I think the invariant of the current system that we broke in this case is the following invariant: CanPerformOnCompositorThread should never return true for an animation whose mNeedsRefreshes is false, since that can lead to he above mechanism not happening while we have an animation running on the compositor. (Admittedly, this isn't the best situation to be in; it would be nice to not have to do this work on the main thread while the animation is running on the compositor. That said, it's far less work than updating style or repainting.)
(In reply to Cameron McCormack (:heycam) from comment #13) > Can you remove the IsRunningAt check from > nsDisplayListBuilder::AddAnimationsAndTransitionsToLayer now that it is > being done within CanPerformOnCompositorThread? Technically we could for the transitions part, but we can't for the animations part, and I think it's clearer to keep them symmetric. In particular, the reason we can't for the animations part is that an element can have multiple animations affecting the same property. nsAnimationManager::GetAnimationsForCompositor returns the element's ElementAnimations object conditional on having at least one animation of opacity or transform... and a bunch of other conditions. But that doesn't mean we want to add any other (completed or future) animations affecting those properties to the compositor. That said, I'm also not at all confident that AddAnimationsAndTransitionsToLayer handles that case correctly. In fact, I'm somewhat confident that it handles it wrong, now that I think about it.
(In reply to David Baron [:dbaron] (needinfo? me) (UTC-8) from comment #19) > That said, I'm also not at all confident that > AddAnimationsAndTransitionsToLayer handles that case correctly. In fact, > I'm somewhat confident that it handles it wrong, now that I think about it. I filed bug 980769 on this.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
How could I let the patch also land on b2g v1.3t?
I don't know what the rules are for landing on v1.3t. It's probably reasonable to land, although there's some risk that it would cause regressions if bug 828173 isn't landed as well. Is the v1.3t branch getting sufficient testing to detect regressions in animations?
Ting, if you think a patch should be uplifted set the 1.3t? blocking-b2g flag and it will get discussed at the next triage meeting. Once its marked 1.3t+ it can be uplifted. Based on the See Also you added, it appears this will help reduce our memory thrashing on tarako?
blocking-b2g: --- → 1.3T?
(In reply to David Baron [:dbaron] (needinfo? me) (UTC-8) from comment #24) > I don't know what the rules are for landing on v1.3t. > > It's probably reasonable to land, although there's some risk that it would > cause regressions if bug 828173 isn't landed as well. Is the v1.3t branch > getting sufficient testing to detect regressions in animations? I am not sure about this, need to ask around.
(In reply to Ben Kelly [:bkelly] from comment #25) > Ting, if you think a patch should be uplifted set the 1.3t? blocking-b2g > flag and it will get discussed at the next triage meeting. Once its marked > 1.3t+ it can be uplifted. > > Based on the See Also you added, it appears this will help reduce our memory > thrashing on tarako? Ben, thank you for teaching me. It helps for the case starting message activity from contact's detail page, guess it is not that important though.
(In reply to Ting-Yu Chou [:ting] from comment #27) > Ben, thank you for teaching me. It helps for the case starting message > activity from contact's detail page, guess it is not that important though. I think that sounds important enough for the release. Lets see what the triage meeting says. Thanks!
Before we consider taking this onto 1.3T, we should find out if this caused the fallout in bug 981142. bbajaj was suspecting that this patch caused that really ugly animation regression seen in that bug.
(In reply to Jason Smith [:jsmith] from comment #29) > Before we consider taking this onto 1.3T, we should find out if this caused > the fallout in bug 981142. bbajaj was suspecting that this patch caused that > really ugly animation regression seen in that bug. Seems like easier to uplift and see if this causes the regression on 1.3T. I was talking to dbaron about this and we think we should land this patch and 828173 rather sooner than later if we want to consider the tarako branch.
blocking-b2g: 1.3T? → 1.3T+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: