Closed Bug 973105 Opened 10 years ago Closed 10 years ago

[B2G][Dialer] Quickly scrolling Call Log moves touch points (phantom input)

Categories

(Core :: Panning and Zooming, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla35

People

(Reporter: pbylenga, Assigned: kats)

References

Details

(Whiteboard: [apz_testrun])

Attachments

(5 files, 2 obsolete files)

Attached video CallLog_Scrolling.mpeg (deleted) —
Description:
In Dialer app, quickly scrolling the call log appears to move the touch points causing phantom inputs or highlighting.  This issue does not reproduce with APZ disabled.

Repro Steps:
1) Update a Buri to BuildID: 20140214004001
2) Have a lengthy list of call logs in Dialer App
3) Launch Dialer and navigate to call log
4) Quickly scroll in the center of the screen and observe

Actual:
Contacts below will become highlighted, or Missed Calls will become highlighted if scrolled downward.

Expected:
No highlighting of areas not being touched.

v1.3 Environmental Variables:
Device: Buri v1.3 MOZ
BuildID: 20140214004001
Gaia: 22e065f75193f569a78a8ec827b08e1ed520e1b2
Gecko: e3766683210c
Version: 28.0
Firmware Version: v1.2-device.cfg


Notes:

Repro frequency: 5/5, 100%
See attached: video
Minor bug overall.
Blocks: gaia-apzc-2
I've seen this too (Vivien was complaining about this before) and after fixing a couple of issues (bug 949132 and bug 947337) it seemed like the remainder of the problems were coming from a combination of flaky hardware the touch event fluffing code. That being said there might still be something we can do to improve this, but first I'd like to verify the root cause of what you're seeing here.

If I provide a patch with more logging, would you be able to reproduce the problem with that patch applied and post the logcat? If you can't build the patch, can I push it to the tryserver and have you run the build from there?
Flags: needinfo?(pbylenga)
Yeah for sure, send it over and I'll get started :).
Flags: needinfo?(pbylenga) → needinfo?(bugmail.mozilla)
Attached patch Patch for logging (deleted) — Splinter Review
Sorry for the delay. Here's the patch I'd like you to run with. When getting the logcat, please make sure you include timestamps by adding the "-v time" arguments to adb logcat.
Flags: needinfo?(bugmail.mozilla)
Flags: needinfo?(pbylenga)
Attached file PhantomInput_log.txt (deleted) —
Here is a log with the patch on it.  Saw 2 phantom highlights of the contact button at the bottom during scrolling.  At one point in the log, my touch was not light enough and entered into contacts information from the call log.
Flags: needinfo?(pbylenga)
Thank you! That log helps a lot.

I think I understand what is happening. When we unapply the async transform and apply the transformToGecko, in order to convert the point from what the user visually touched to the right coordinate space for gecko, it might be the case that that point is under something else from gecko's point of view. Since we're not transmitting a scrollId with touch events, gecko recalculates what frame the user touched based solely on the coordinates of the point, and computes a different answer because in gecko-land something else is on top of the point. If somehow we could tell gecko "the touch event was at point x,y on the layer with scrollId z" and it could do a hit detection on that, then it would get the right answer (although it would still be under something else).

This seems to be a problem inherent in async panning; I'm not sure what the right solution for this is.
Component: Gaia::Dialer → Panning and Zooming
Product: Firefox OS → Core
Without this we don't hit APZ correctness target for 1.4.
blocking-b2g: --- → 1.4+
Assign to me as a placeholder until the plates of other engineers clear up.
Assignee: nobody → milan
PM Triage: 1.4+
(In reply to (PTO|Mar15-Mar19) Kartikaya Gupta (email:kats@mozilla.com) from comment #6)
> I think I understand what is happening. When we unapply the async transform
> and apply the transformToGecko, in order to convert the point from what the
> user visually touched to the right coordinate space for gecko, it might be
> the case that that point is under something else from gecko's point of view.
> Since we're not transmitting a scrollId with touch events, gecko
> recalculates what frame the user touched based solely on the coordinates of
> the point, and computes a different answer because in gecko-land something
> else is on top of the point. If somehow we could tell gecko "the touch event
> was at point x,y on the layer with scrollId z" and it could do a hit
> detection on that, then it would get the right answer (although it would
> still be under something else).
> 
> This seems to be a problem inherent in async panning; I'm not sure what the
> right solution for this is.

Kats and I have discussed this and we have a proposed solution: when sending a touch event to Gecko, make sure to send a repaint request with the latest asynchronous scrolling state just before the touch event, so that when Gecko processes the touch event, it knows about the same scroll positions that were visible when the user made the touch.

I'll look into this approach.
Assignee: milan → botond
Update on the 1.4+ status of this bug:

This bug concerns a correctness issue where a tap during or immediately after scrolling (particularly, fast scrolling), can be ignored or sent to the wrong target. It has been described as a minor bug (comment 1) but it is nevertheless a correctness bug.

I have no strong feelings about blocking or not blocking on this. I am inclined to give bug 982141 and bug 976605 priority over this; while those are performance bugs, they seem to have higher impact.
Going with comments 1 and 12, minor bug overall, removing 1.4+
blocking-b2g: 1.4+ → ---
Untaking for now as priorities have changed a bit and I'll be focusing on progressive and low-resolution rendering (bug 993473) in the immediate future.
Assignee: botond → nobody
I'm not sure if it's the same issue, but if so, I see this with alarming regularity in 1.3 with APZC enabled, and have experience it on master (but I don't use master from day-to-day at the moment, so I don't know how easy it is to reproduce).

However, I see it more commonly in apps like Twitter, pressing on a tweet after scrolling (not necessarily immediately after scrolling), or sometimes in games, after a drag that doesn't necessarily even scroll anything (at least visibly).
Based on the discussion we had at the gfx work week we will need to implement the fix for this (flushing repaints on new input blocks) before we can get proper hit-testing.
Assignee: nobody → bugmail.mozilla
Blocks: 918288
Attached patch Part 1 - Flush repaints on new input blocks (obsolete) (deleted) — Splinter Review
Attachment #8494807 - Flags: review?(botond)
Attached patch Part 2 - gtest (obsolete) (deleted) — Splinter Review
I haven't actually run this code on a B2G device yet. I should do that before landing.
Attachment #8494808 - Flags: review?(botond)
Comment on attachment 8494807 [details] [diff] [review]
Part 1 - Flush repaints on new input blocks

Review of attachment 8494807 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/apz/src/AsyncPanZoomController.cpp
@@ +2318,5 @@
>    aFrameMetrics.SetUseDisplayPortMargins();
>  
>    // If we're trying to paint what we already think is painted, discard this
>    // request since it's a pointless paint.
> +  const FrameMetrics* last = (aUnthrottled ? &mLastDispatchedPaintMetrics : &mLastPaintRequestMetrics);

I wonder if it would be conceptually simpler to do the following:

  if (aUnthrottled) {
    CancelPendingPaintRequest();
  }

where that is:

  void AsyncPanZoomContorller::CancelPendingPaintRequest() {
    mPaintThrottler.CancelPendingTask();
    mLastPaintRequestMetrics = mLastDispatchedPaintMetrics;
  }

If we keep this, I think we should add a comment explaining this conditional (and perhaps use a reference rather than a pointer).

@@ +2873,5 @@
>          endZoomToMetrics.GetZoom()));
>  
>      // Schedule a repaint now, so the new displayport will be painted before the
>      // animation finishes.
> +    RequestContentRepaint(endZoomToMetrics, false);

false /* not unthrottled */

(or use an enumeration)

::: gfx/layers/apz/src/AsyncPanZoomController.h
@@ +544,5 @@
>    void RequestContentRepaint();
>  
>    /**
>     * Tell the paint throttler to request a content repaint with the given
>     * metrics.  (Helper function used by RequestContentRepaint.)

Update comment to say what |aUnthrottled| does.
Attachment #8494807 - Flags: review?(botond) → review+
(In reply to Botond Ballo [:botond] from comment #19)
> I wonder if it would be conceptually simpler to do the following:
> 
>   if (aUnthrottled) {
>     CancelPendingPaintRequest();
>   }

In fact, we could even move the call to CancelPendingPaintRequest() into FlushRepaintForNewInputBlock(), and avoid adding the aUnthrottled parameter to RequestContentRepaint() altogether.
Comment on attachment 8494808 [details] [diff] [review]
Part 2 - gtest

Review of attachment 8494808 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/tests/gtest/TestAsyncPanZoomController.cpp
@@ +1927,5 @@
> +  AsyncPanZoomController::SetFrameTime(testStartTime + TimeDuration::FromMilliseconds(1000));
> +
> +  // Now do two pans. The first of these will dispatch a repaint request, as above.
> +  // The second will get stuck in the paint throttler because the first one doesn't
> +  // get marked as "completed", so this will result in a non-empty LD transform.

Does the first also get stuck because the repaint request from the earlier part of the test hasn't been marked as "completed", either?
Attachment #8494808 - Flags: review?(botond) → review+
I was able to move the call to CancelPendingTask into FlushRepaintForNewInputBlock (good idea!), however I think we still need the unthrottled argument to RCR. The reason for this is that the paint throttler might still end up queuing the new repaint request if there is an old one outstanding. I updated the method documentation to make the contract more explicit.
Attachment #8494807 - Attachment is obsolete: true
Attachment #8495310 - Flags: review?(botond)
Attached patch Part 2 - gtest (v2) (deleted) — Splinter Review
Updated comment in the gtest to answer your question in comment 21, carrying r+. What's happening there is that the incomplete paint request from the first half of the test is ignored because I advance the timestamp by 1 second which exceeds the 500ms max-wait time of the paint throttler (unfortunately hard-coded in APZC's constructor).
Attachment #8494808 - Attachment is obsolete: true
Attachment #8495311 - Flags: review+
Ran the code on a recent-master hamachi and it seems to be behaving well. I didn't see the buttons at the bottom of the call log getting activated/highlighted while panning around, which is a good sign.
Comment on attachment 8495310 [details] [diff] [review]
Part 1 - Flush repaints on new input blocks (v2)

Review of attachment 8495310 [details] [diff] [review]:
-----------------------------------------------------------------

I think we still could have removed the aUnthrottled parameter if we made CancelPendingTask() clear the mOutstanding flag (and then perhaps call it something slightly different). However, I don't feel strongly about this.

::: gfx/layers/apz/src/AsyncPanZoomController.h
@@ +552,2 @@
>     */
> +  void RequestContentRepaint(FrameMetrics& aFrameMetrics, bool aUnthrottled);

It might be a bit more straightforward to make this aThrottled rather than aUnthrottled. Then we don't end up with things like

   true /* not throttled */
Attachment #8495310 - Flags: review?(botond) → review+
(In reply to Botond Ballo [:botond] from comment #25)
> Comment on attachment 8495310 [details] [diff] [review]
> Part 1 - Flush repaints on new input blocks (v2)
> 
> Review of attachment 8495310 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I think we still could have removed the aUnthrottled parameter if we made
> CancelPendingTask() clear the mOutstanding flag (and then perhaps call it
> something slightly different). However, I don't feel strongly about this.

I considered doing this but then we would introduce (more?) cases where the paint throttler doesn't end up recording the correct duration of the task execution.

> ::: gfx/layers/apz/src/AsyncPanZoomController.h
> @@ +552,2 @@
> >     */
> > +  void RequestContentRepaint(FrameMetrics& aFrameMetrics, bool aUnthrottled);
> 
> It might be a bit more straightforward to make this aThrottled rather than
> aUnthrottled. Then we don't end up with things like
> 
>    true /* not throttled */

Good call, I'll do this.
I flipped the bool to be aThrottled but I also added a default value on it (of true) which keeps the call sites cleaner.

https://hg.mozilla.org/integration/mozilla-inbound/rev/f015affb71ac
https://hg.mozilla.org/integration/mozilla-inbound/rev/8ae17788c3b2
https://hg.mozilla.org/mozilla-central/rev/f015affb71ac
https://hg.mozilla.org/mozilla-central/rev/8ae17788c3b2
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: