Closed Bug 1408792 Opened 7 years ago Closed 7 years ago

corrupt rendering on bbc.co.uk with webrender enabled

Categories

(Core :: Graphics: WebRender, defect, P1)

Unspecified
All
defect

Tracking

()

VERIFIED FIXED
mozilla60
Tracking Status
firefox-esr52 --- unaffected
firefox56 --- unaffected
firefox57 --- unaffected
firefox58 --- unaffected
firefox59 --- disabled
firefox60 --- verified
firefox61 --- verified

People

(Reporter: dvdplm, Assigned: kats)

References

(Blocks 1 open bug, )

Details

(Whiteboard: [wr-reserve])

Attachments

(4 files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:58.0) Gecko/20100101 Firefox/58.0
Build ID: 20171015100127

Steps to reproduce:

Visit http://www.bbc.co.uk/news/resources/idt-sh/death_of_the_nile with webrender turned on using latest firefox ("58.0a1 (2017-10-15) (64-bit)" on macOS v10.13 (17A405).


Actual results:

The article text is missing completely.


Expected results:

The article text should appear.
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:58.0) Gecko/20100101 Firefox/58.0

I was able to reproduce the issue using latest Nightly build (20171018100140) and gfx.webrender.enable=true.

While scrolling down some of the text is missing. 

It works fine with Beta 57.0b9.

I think the correct component is Core=>Graphics:WebRender.
Status: UNCONFIRMED → NEW
Component: Untriaged → Graphics: WebRender
Ever confirmed: true
Product: Firefox → Core
Whiteboard: [wr-mvp] [triage]
Priority: -- → P2
Whiteboard: [wr-mvp] [triage] → [wr-mvp]
I can repro, I'll investigate.
Assignee: nobody → bugmail
Has STR: --- → yes
OS: Unspecified → All
Version: 58 Branch → Trunk
Likely same problem as bug 1411249.
Status: NEW → ASSIGNED
Priority: P2 → P1
The patches I'm going to attach on bug 1373802 and bug 1411249 help here but don't fix the bbc page completely.
The only remaining problem seems to be that when we scroll stuff that should be painted already isn't - it shows up as black until we do a repaint. I'm not yet sure what's causing this.
I reduced the test page somewhat and I think the problem is coming from this part:

    FixedPosition p=0x1260ec018 ... clip(0,0,75900,42720) asr() clipChain(<0,42360,75900,42720> [0x125e7a1d8], <0,0,75900,42720> [root asr]) ...
      nsDisplayTransform p=0x1260dfb98 ...
        LayerEventRegions p=0x1260df758 ...
        WrapList p=0x1260dfa98 ...
          LayerEventRegions p=0x1260df858 ...
          Image p=0x1260df958 ... clip(0,0,75900,42720) asr() clipChain(<0,0,75900,42720> [root asr]) ...

and in SLH we end up doing this:

SLH: processing item 0x1260ec018
WRDL(0x125980980): PopClip
WRDL(0x125980980): PopScrollLayer id=2
WRDL(0x125980980): DefineClip id=9 as=2 ac=(nil) r=(x=0.000000, y=706.000000, w=1265.000000, h=712.000000) m=0x0 b=none complex=0
WRDL(0x125980980): PushClip id=9
WRDL(0x125980980): PushClipAndScroll s=0 c=9
SLH: done setup for 0x1260ec018
...
SLH: processing item 0x1260df958
WRDL(0x125980980): DefineClip id=10 as=(nil) ac=9 r=(x=0.000000, y=0.000000, w=1265.000000, h=712.000000) m=0x0 b=none complex=0
WRDL(0x125980980): PushClip id=10
WRDL(0x125980980): PushClipAndScroll s=0 c=10
SLH: done setup for 0x1260df958

What this means is that we define the clip on the Image as a child of the clip on the FixedPosition item. However, the FixedPosition item's clip scrolls (with ASR 0x125e7a1d8) while the Image item's clip does not (root ASR). So what we're doing in SLH inadvertently makes the Image item's clip scroll as well, which would explain why the image is getting clipped on async scroll.

I should be able to fix this by comparing the ASRs on the child and parent clips and making sure they are the same.
... so I implemented that but it didn't what I expected. Upon further reflection I'm not sure how to interpret the gecko display list for this case. I don't understand the clipchain for the Image item at all; it should either have the same clipChain as the FixedPosition item or it should just be empty and inherit the clipChain from the FixedPosition item. Instead it has a clipChain that is equivalent to the tail part of the FixedPosition item's clipChain.
Attached file reduced test case (deleted) β€”
Here's a reduced test case. It might be possible to reduce it a little further but it's simple enough to work with now and I didn't want to accidentally over-reduce it.
Attached file log from reduced test case (deleted) β€”
Here's a display list dump and SLH/WRDL log from processing it. Markus, if you get a chance can you take a look at this and let me know if the clipChain for Background p=0x113555158 makes sense to you? I'm not sure how to interpret it; I would expect it to have the same clipChain as the ancestor FixedPosition item.
Flags: needinfo?(mstange)
Hmm this actually appears to be the same pattern I see in layout/reftests/async-scrolling/fixed-pos-scrolled-clip-2.html.
Yes, that clip makes sense to me.

The FixedPosition item captures its clip; its contents are unclipped internally. For example, the nsDisplayTransform and the WrapList inside it don't have any clip. But the background image has its own clip. And the background image's clip is not moving with any scroll frame, so the ASR of that clip chain item is the root ASR.
Flags: needinfo?(mstange)
(In reply to Markus Stange [:mstange] from comment #11)
> The FixedPosition item captures its clip

In the world of ScrollingLayersHelper I guess this means that the clip chains of things inside the FixedPosition item don't implicitly connect to the clip chain of the FixedPosition item. Is this "capturing its clip" behaviour specific to FixedPosition items, or does it happen with other items too? i.e. what condition would I need to check to detect this?
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #12)
> (In reply to Markus Stange [:mstange] from comment #11)
> > The FixedPosition item captures its clip
> 
> In the world of ScrollingLayersHelper I guess this means that the clip
> chains of things inside the FixedPosition item don't implicitly connect to
> the clip chain of the FixedPosition item.

Well, it's more of an implicit intersection than an implicit connection of the chains.

Imagine a testcase where you have a fixed element with a scrolled clip (scrolled by asr A), and inside the fixed element there's a nested scroll frame (asr B). The ASR tree would look like this:

 [root ASR]
  ^     ^
 /       \
A         B

In the display list, the nsDisplayFixedPosition will have clips w.r.t. both A and the root ASR, and the display items for B's contents will have clips w.r.t both B and the root ASR. (For example, B's viewport is a clip w.r.t. the root ASR.) Ultimately, B's contents are clipped by the intersection of all these clip chain items, you just need to imagine clipping happening in two steps: First, everything inside the nsDisplayFixedPosition applies their clips, and creates a "rendering" of the nsDisplayFixedPosition item. Then the nsDisplayFixedPosition item's clips are applied to that rendering.
This intersection across sibling ASRs is going to be hard to represent in a clip scroll tree.

> Is this "capturing its clip"
> behaviour specific to FixedPosition items, or does it happen with other
> items too?

See http://searchfox.org/mozilla-central/rev/21363323fd4aa21db074c808fb5358a46df6d698/layout/generic/nsFrame.cpp#2705-2707 - it's not specific to FixedPosition items. It would be good to write the code in such a way that this is supported for all container item.
This makes a lot of sense, thanks!

I tried a few things to try and implement this using the existing WR APIs in a standalone example app. I was able to come close by using the "local clip" on the item, which gets intersected with the clip chain that's active for the item. However the "local clip" isn't a full clip chain and so cannot be used to express the full set of things we need per the example you gave.

I filed https://github.com/servo/webrender/issues/1964 for this in upstream WR.
Deprioritized to P3 during triage - moving to Reserve Backlog.
Assignee: bugmail → nobody
Status: ASSIGNED → NEW
Priority: P1 → P3
Whiteboard: [wr-mvp] → [wr-reserve]
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #14)
> I filed https://github.com/servo/webrender/issues/1964 for this in upstream WR.

There was https://github.com/servo/webrender/pull/2300 (pushed to master 5 days ago)

mozregression --find-fix --bad 2017-10-30 --good 2018-01-10 --pref layers.acceleration.force-enabled:true gfx.webrender.enabled:true gfx.webrender.blob-images:true image.mem.shared:true layout.display-list.retain:false startup.homepage_welcome_url:"http://www.bbc.co.uk/news/resources/idt-sh/death_of_the_nile" startup.homepage_welcome_url.additional:"https://bug1408792.bmoattachments.org/attachment.cgi?id=8922947"
> 11:41.55 INFO: First good revision: 92ff8af42f6fd6db6e7255b287031d6ec298e031
> 11:41.55 INFO: Last bad revision: 27841347525c5e83a67e086d6fd15758c625a9e3
> 11:41.55 INFO: Pushlog:
> https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=27841347525c5e83a67e086d6fd15758c625a9e3&tochange=92ff8af42f6fd6db6e7255b287031d6ec298e031

> 92ff8af42f6f	Kartikaya Gupta β€” Bug 1422057 - Remove now-unnecessary ::Equals checks. r=mstange
> b8931ab9dac9	Kartikaya Gupta β€” Bug 1422057 - Deduplicate DisplayItemClipChain instances on creation. r=mstange
> e448dd0f4aba	Kartikaya Gupta β€” Bug 1422057 - Add hash function and boilerplate for deduplicating DisplayItemClipChain via std::set. r=mstange
> b65a65089ffb	Kartikaya Gupta β€” Bug 1422057 - Extend the clipchain of a display item to the ancestor's clipchain if it is a strict superset. r=mstange
> 78ca58c296fa	Kartikaya Gupta β€” Bug 1422057 - Extract a local variable. r=mstange
> c12c951b5ecc	Kartikaya Gupta β€” Bug 1422057 - Avoid caching clips across stacking contexts with non-identity transforms. r=mstange
> 4dcb02cef0b7	Kartikaya Gupta β€” Bug 1422057 - Adjust some logging-related things to be more useful. r=mstange

I would call this fixed. To me it looks like non-WR now. That website has some bugs itself in non-WR Nightly/Chrome. I tested both website and testcase and results were the same.
Depends on: 1422057
Good :)

Self-assigning this bug to turn the reduced testcase into a reftest and land it.
Assignee: nobody → bugmail
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #16)
> There was https://github.com/servo/webrender/pull/2300 (pushed to master 5 days ago)
= 2018-01-16

> --good 2018-01-10

I was confused about the fact that it looked good to me before the patch from the github issue landed.
Turned it into a reftest and kicked off test-verify jobs with it: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e4fa76a12a376660897ac9ecb4d246349a268f17. That looked good so I added more regular reftest jobs to get some more coverage.
Comment on attachment 8944727 [details]
Bug 1408792 - Add a reftest.

https://reviewboard.mozilla.org/r/214876/#review221472
Attachment #8944727 - Flags: review?(mstange) → review+
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f8f0d318980c
Add a reftest. r=mstange
https://hg.mozilla.org/mozilla-central/rev/f8f0d318980c
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Flags: qe-verify+
I managed to reproduce the bug using an older version of Nightly (2017-10-15) on macOS 10.13. 
I retested everything on latest Nightly 61.0a1 and beta 60.b4 using the same platform and the bug is not reproducing anymore.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: