Closed Bug 1255968 Opened 8 years ago Closed 8 years ago

Interruptible reflow is broken in e10s

Categories

(Core :: Layout, defect, P1)

x86
Windows 7
defect

Tracking

()

VERIFIED FIXED
mozilla48
Tracking Status
e10s + ---
firefox46 --- wontfix
firefox47 --- wontfix
firefox48 --- fixed

People

(Reporter: alice0775, Assigned: ting)

References

(Blocks 1 open bug)

Details

(Keywords: perf, regression)

Attachments

(2 files, 2 obsolete files)

Build Identifier:
https://hg.mozilla.org/mozilla-central/rev/102886e9ac63b3fa33a6f1b394aea054abce2dfd
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:48.0) Gecko/20100101 Firefox/48.0 ID:20160311030233

Reproducible: always


Steps to Reproduce:
1. Open large page https://html.spec.whatwg.org/#semantics
2. Wait to load.
3. Open Sidebar (Ctrl+B)
   ---- observe how long does it take re-layout
4. Close Sidebar (Ctrl+B)
   ---- observe how long does it take re-layout

Actual Results:
e10s is x5 slower than non-e10s
Blocks: e10s-perf
Priority: -- → P1
On my current trunk build, the scrollbar gets shown very late when open/close sidebar, I'll get a profile.
Profiles from pressing ctrl+b twice (open and close sidebar):

  e10s     https://cleopatra.io/#report=cb468c4ef32e128203a74db67a4ed337764f0d41
  non-e10s https://cleopatra.io/#report=8ff732cd54de1c34758c66b7491d330b1ff48170

A quick glance is e10s takes much longer on doing reflow with this symbol:

  PresShell::ResizeReflowIgnoreOverride(int, int) /home/ting/w/fx/mc/layout/base/nsPresShell.cpp:1853
I tried to do ctrl+b more times on e10s, and found it can be as good as non-e10s. Also I can't consistently reproduce the slowness I saw in comment 1 and 2, probably somehow the lines were dirty in the runs.

Do you see 5x slower on e10s consistently?

Here's the profile that e10s runs as good as non-e10s: https://cleopatra.io/#report=2a56681bf1b9390b0a363f136033c5146db29f9d
Flags: needinfo?(alice0775)
(In reply to Ting-Yu Chou [:ting] from comment #3)
> I tried to do ctrl+b more times on e10s, and found it can be as good as
> non-e10s. Also I can't consistently reproduce the slowness I saw in comment
> 1 and 2, probably somehow the lines were dirty in the runs.
> 
> Do you see 5x slower on e10s consistently?
> 
> Here's the profile that e10s runs as good as non-e10s:
> https://cleopatra.io/#report=2a56681bf1b9390b0a363f136033c5146db29f9d

I can consistently reproduce the slowness.
Flags: needinfo?(alice0775)
I tried to reproduce this on both ubuntu and windows, but no luck. Could you record your screen for both e10s and non-e10s? Thank you.
Flags: needinfo?(alice0775)
Here's the screencast on my windows which I showed/hid sidebar for a few times, I started with e10s and then switched to non-e10s. I couldn't reproduce the 5x slowness.

https://youtu.be/8saMrpq_Dmg
(In reply to Ting-Yu Chou [:ting] from comment #5)

Screen capture:
(It is difficult to screen capture due to screen capture overhead.)
e10s : https://www.youtube.com/watch?v=K3LIq1Fd5zM
disabled e10s : https://www.youtube.com/watch?v=VU8_UUXnDj8
Flags: needinfo?(alice0775)
When is the timing to press ctrl+b? Did you press it quickly and repeatedly? Will there be any differences if you show/hide the sidebar by clicking the menu?

I traced the code a little bit, the only difference between e10s and non-e10s I can tell now is nsFrameLoader::UpdatePositionAndSize() needs an IPC for e10s.
Flags: needinfo?(alice0775)
(In reply to Ting-Yu Chou [:ting] from comment #8)
> When is the timing to press ctrl+b? Did you press it quickly and repeatedly?

I press Ctrl+B after complete re-layout(ie. horizontal, vertical scrollbar and content re-layout)
 
> Will there be any differences if you show/hide the sidebar by clicking the
> menu?

Same, No difference.
Flags: needinfo?(alice0775)
Since I can't reproduce the issue consistently, the profiles I captured in comment 2 may be not helpful.

Would you please profile on your computer (both e10s and non-e10s)? Thanks.
Flags: needinfo?(alice0775)
Profiler, ctrl+b twice
e10s: https://cleopatra.io/#report=57874efcefd83cc844cf5ade07e6385726eb0643
disabled e10s: https://cleopatra.io/#report=a47fc4a5210923cbc37800fdb085d2afcc9276e8
Flags: needinfo?(alice0775)
OS: Unspecified → Windows 7
Hardware: Unspecified → x86
It seems on non-e10s, HavePendingInputEvent() returns true at some point and willReflowAgain is set to true here:

  https://dxr.mozilla.org/mozilla-central/source/layout/generic/nsBlockFrame.cpp#2418

which the following ReflowLine() calls are skipped. I am double checking.
With the STR, in content process nsPresContext::HavePendingInputEvent() goes to nsBaseWidget::HasPendingInputEvent() which always returns false, while in parent process it goes to nsWindow::HasPendingInputEvent().

I haven't sorted out how does widget (GetNearestWidget()) work, but it seems interruptible reflow does not work in content process.

bz, :kanru told me B2G had the same issue before and it was never fixed. Are there anything we can do for e10s?
Flags: needinfo?(bzbarsky)
Well, we need to make nsPresContext::HavePendingInputEvent() not lie, yes?  I don't know enough about how input events are handled on e10s to tell you how to do that, exactly...
Flags: needinfo?(bzbarsky)
Summary: Page layout is slower if enabled e10s → Interruptible reflow is broken in e10s
Input events coming into the child process generally come across the PBrowser IPDL interface (TabParent -> TabChild). What might work is to implement HavePendingInputEvent in PuppetWidget by using the newly-added PeekMessages [1] function to scan the incoming IPDL message queue and detect if there are input events on the way that haven't been processed yet. I'm not sure if that'll work but it's worth a shot.

[1] http://mxr.mozilla.org/mozilla-central/ident?i=PeekMessages
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #15)
> Input events coming into the child process generally come across the
> PBrowser IPDL interface (TabParent -> TabChild). What might work is to
> implement HavePendingInputEvent in PuppetWidget by using the newly-added
> PeekMessages [1] function to scan the incoming IPDL message queue and detect
> if there are input events on the way that haven't been processed yet. I'm
> not sure if that'll work but it's worth a shot.
> 
> [1] http://mxr.mozilla.org/mozilla-central/ident?i=PeekMessages

Great, I'll give it a try, thanks!
Could you try to disable interruptible reflow [1] for non-e10s, see if it runs as slow as e10s? I'd like to make sure it is the root cause. Thank you.

[1] https://developer.mozilla.org/en-US/docs/Disabling_interruptible_reflow
Flags: needinfo?(alice0775)
(In reply to Ting-Yu Chou [:ting] from comment #17)
> Could you try to disable interruptible reflow [1] for non-e10s, see if it
> runs as slow as e10s? I'd like to make sure it is the root cause. Thank you.
> 
> [1] https://developer.mozilla.org/en-US/docs/Disabling_interruptible_reflow

Yes, you are right. The layout is slow when layout.interruptible-reflow.enabled = false.
Flags: needinfo?(alice0775)
Attached patch wip (obsolete) (deleted) — Splinter Review
:bz, what do you think about this? BTW, I am not sure are these all the input events we care.
Attachment #8733789 - Flags: feedback?(bzbarsky)
Comment on attachment 8733789 [details] [diff] [review]
wip

Well, on Mac we basically do this for mouseup/down/move/drag, keyup/down, scrollwheel, and tabletpointer.

That seems to match your list fairly well, actually.  Looks pretty reasonable to me.

I assume there is no way to break out of the peek once we have the state we want?  :(
Attachment #8733789 - Flags: feedback?(bzbarsky) → feedback+
Not yet, but we could probably modify the signature to return true from the callback or something. Also Ting, make sure you get an IPC peer to sign off on this (see [1]).

[1] http://mxr.mozilla.org/mozilla-central/source/ipc/glue/MessageChannel.h?rev=5863b88e10cc#114
Comment on attachment 8733789 [details] [diff] [review]
wip

We'd like to use PeekMessages() to check whether there are pending input events for interruptible reflow in content process. Would it be OK? Thanks.
Attachment #8733789 - Flags: feedback?(jld)
Assignee: nobody → janus926
Attachment #8733789 - Attachment is patch: true
Comment on attachment 8733789 [details] [diff] [review]
wip

(In reply to Ting-Yu Chou [:ting] from comment #22)
> Comment on attachment 8733789 [details] [diff] [review]
> wip
> 
> We'd like to use PeekMessages() to check whether there are pending input
> events for interruptible reflow in content process. Would it be OK? Thanks.

This should go to billm or dvander. gabor can also look at this. billm is currently on extended pto.
Attachment #8733789 - Flags: feedback?(dvander)
Comment on attachment 8733789 [details] [diff] [review]
wip

Seems reasonable. I guess we'd have to add an outparam or return value to the callback to be able to break out of the loop.
Attachment #8733789 - Flags: feedback?(dvander) → feedback+
Comment on attachment 8733789 [details] [diff] [review]
wip

Review of attachment 8733789 [details] [diff] [review]:
-----------------------------------------------------------------

I don't know this area of IPC very well, so I'm deferring to :dvander's judgment on this being the right approach.

::: widget/PuppetWidget.cpp
@@ +1424,5 @@
>  
> +bool
> +PuppetWidget::HasPendingInputEvent()
> +{
> +  static const int inputEvents[] = {

Nit: maybe this should use msgid_t instead of int?
Attachment #8733789 - Flags: feedback?(jld) → feedback+
Attachment #8735349 - Flags: review?(dvander)
Added another message: Msg_UpdateDimensions__ID for bug 1257869, and changed int to msgid_t as :jld suggested.
Attachment #8733789 - Attachment is obsolete: true
Attachment #8735352 - Flags: review?(bzbarsky)
Attachment #8735349 - Flags: review?(dvander) → review+
Comment on attachment 8735352 [details] [diff] [review]
part2 v1 - implement PuppetWidget::HasPendingInputEvent()

r=me
Attachment #8735352 - Flags: review?(bzbarsky) → review+
Try looks good.
Keywords: checkin-needed
Depends on: 1260736
Alice, has this work improved sidebar performance?
Flags: needinfo?(alice0775)
(In reply to Jim Mathies [:jimm] from comment #34)
> Alice, has this work improved sidebar performance?

https://hg.mozilla.org/mozilla-central/rev/538d248fa252a4100082fd9bc3fdc08d322cda22
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:48.0) Gecko/20100101 Firefox/48.0 ID:20160401030216

Yep, improved.
Flags: needinfo?(alice0775)
Ting-Yu, should this fix be uplifted to Aurora 47 in preparation for our e10s experiments on Beta 47? Or is this change too risky?
Status: RESOLVED → VERIFIED
Flags: needinfo?(janus926)
There's a regression bug 1260736 from this, would that be OK?
Flags: needinfo?(janus926) → needinfo?(cpeterson)
IRC log.

<cpeterson> hi. looks like we probably shouldn't uplift 1255968 until we
            understand the regression. no problem. :-)
<ting> ok, i'll cancel the ni, thanks :)
Flags: needinfo?(cpeterson)
Thanks, Ting. That sounds good to me.
bug 1255968 is in inbound now, I'll ask for uplift once it's landed.
(In reply to Ting-Yu Chou [:ting] from comment #40)
> bug 1255968 is in inbound now, I'll ask for uplift once it's landed.

Correction: bug 1260736
Comment on attachment 8735349 [details] [diff] [review]
part1 v1 - break out from the loop of PeekMessages()

Approval Request Comment
[Feature/regressing bug #]: n/a
[User impact if declined]: long reflow can't be interrupted by user inputs
[Describe test coverage new/current, TreeHerder]: automated tests
[Risks and why]: low as it just makes interruptible reflow work in content process
[String/UUID change made/needed]: n/a
Attachment #8735349 - Flags: approval-mozilla-aurora?
Comment on attachment 8735774 [details] [diff] [review]
part2 v2 - implement PuppetWidget::HasPendingInputEvent()

Same as comment 42.
Attachment #8735774 - Flags: approval-mozilla-aurora?
This can't be uplifted to aurora unless you uplift bug 1242609 also, and I'm not sure we want to do that. The introduction of PeekMessages in the APZ code carries some risk.
I guess you can uplift just the PeekMessages function, if you really want.
I see, I should've checked that before asking for uplift. I thought it'd be good if we can have this on 47, but should be fine if we can't.
Attachment #8735349 - Flags: approval-mozilla-aurora?
Attachment #8735774 - Flags: approval-mozilla-aurora?
No longer blocks: 1257869
Blocks: 1257869
Depends on: 1292781
Depends on: 1303649
Depends on: 1321960
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: