Closed
Bug 992218
Opened 11 years ago
Closed 11 years ago
Regression: Content glitches/flashes during periods of heavy repainting
Categories
(Firefox for Android Graveyard :: Toolbar, defect)
Tracking
(firefox30 verified, firefox31 verified, fennec30+)
VERIFIED
FIXED
Firefox 31
People
(Reporter: capella, Assigned: kats)
References
()
Details
(Keywords: regression, reproducible)
Attachments
(2 files, 1 obsolete file)
(deleted),
patch
|
BenWa
:
feedback-
|
Details | Diff | Splinter Review |
(deleted),
patch
|
kats
:
review+
lsblakk
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
Not sure if this is the right component for this report ...
Using last a current m-c repo / build (No patches in queue), I notice this oddness while moving text selection handles around the page while in reader mode.
(Starts around :05sec)
https://www.dropbox.com/s/do5fxn065s2kuu1/readerModeFlicker.mp4
Comment 1•11 years ago
|
||
Without a doubt a regression from recent graphics refactoring work. CC'ing some folks.
tracking-fennec: --- → ?
status-firefox30:
--- → affected
status-firefox31:
--- → affected
Keywords: regression,
regressionwindow-wanted
Comment 2•11 years ago
|
||
Regression window wanted:
mozilla central:
good build: 08-03-2014
bad build: 09-03-2014
pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=d01bf8596d3b&tochange=21f293fc8d34
Comment 3•11 years ago
|
||
Teodora, can you use narrow this using mozilla-inbound please?
I suspect this is a regression bug 963073 (enable b2g tiling by default).
Flags: needinfo?(teodora.vermesan)
Comment 4•11 years ago
|
||
(In reply to Aaron Train [:aaronmt] from comment #3)
> I suspect this is a regression bug 963073 (enable b2g tiling by default).
Yes, the bug is Bug 963073.
inbound:
good build: 1394227648
bad build: 1394229086
pushlog: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=c8355055899c&tochange=40b2ac413772
Flags: needinfo?(teodora.vermesan)
Updated•11 years ago
|
Keywords: regressionwindow-wanted
Updated•11 years ago
|
Assignee: nobody → chrislord.net
tracking-fennec: ? → 30+
Updated•11 years ago
|
Blocks: b2g-tiling
Flags: needinfo?(bas)
Keywords: reproducible
Summary: Odd Flicker viewing page in readerMode and moving TextSelection handles → Regression: Scrollable layers yields odd content flicking in Reader Move and or moving Text Selection handles
Updated•11 years ago
|
Summary: Regression: Scrollable layers yields odd content flicking in Reader Move and or moving Text Selection handles → Regression: Scrollable layers yields odd content flickering in Reader Mode and or moving text selection handles around in content
Comment 5•11 years ago
|
||
This is interesting... I expect it's another case of things going wrong in the progressive update code (can you reproduce if you change layers.progressive-paint to false?)
I'll be on PTO for 2 weeks, so I'm not going to get to this soon. needinfoing myself in case this isn't picked up before then, and kats in case this is related to the progressive update code.
Assignee: chrislord.net → nobody
Flags: needinfo?(chrislord.net)
Flags: needinfo?(bugmail.mozilla)
Assignee | ||
Comment 7•11 years ago
|
||
Added to my list of "possibly progressive update" bugs.
Flags: needinfo?(bugmail.mozilla)
Assignee | ||
Updated•11 years ago
|
Comment 8•11 years ago
|
||
I think what I see is this bug but am not 100% certain: https://www.youtube.com/watch?v=WRpKmnobUNg (720p it)
Reporter | ||
Comment 9•11 years ago
|
||
Yah, that UI "twitching" ... I do see it in other places, I just documented the first one I could repro.
Comment 10•11 years ago
|
||
I know little to nothing about this code so for now I'm going to un-need-info myself.
Flags: needinfo?(bas)
Reporter | ||
Comment 11•11 years ago
|
||
Ah, just realized comment #5 might have been a question for me ... in any case after a little more tinkering, I find I can only repro this on my GS/3 4.3, and have been unable to so far on my N7 4.4.2.
On the GS3 I can repro regardless of the layers.progressive-paint setting.
Assignee | ||
Comment 12•11 years ago
|
||
I was also able to repro this on the Nexus 4. I'm looking into it.
Assignee: nobody → bugmail.mozilla
Comment 13•11 years ago
|
||
STR |
also reproducible just by viewing an animation e.g, a *.gif, http://i.imgur.com/wEGGj3w.gif
Reporter | ||
Comment 14•11 years ago
|
||
Ah, better STR! Does occur on the N7 ...
(Spotted that GIF recently btw! Cooler than Honda's Awesome-o http://www.telegraph.co.uk/technology/technology-video/8877806/Honda-show-off-new-faster-smarter-Asimo-robot.html)
Assignee | ||
Comment 15•11 years ago
|
||
That gif makes it much easier to reproduce, thanks! (Also pretty scary..)
I'm not yet sure of the cause of the glitching but if tiling is enabled it usually seems to affect one or two tiles at a time, and if I disable tiling the entire content area seems affected. So it's not specific to tiling per se. I also looked at the layer tree dumps on both the client and compositor sides and there is no difference between the glitchy composites and the normal ones. I suspect that this is just inadequate locking somewhere and some buffer that is in the middle of getting updated is copied, resulting in partial garbage. I'm going to try to audit the codepaths to see if I can find this by inspection.
Summary: Regression: Scrollable layers yields odd content flickering in Reader Mode and or moving text selection handles around in content → Regression: Content glitches/flashes during periods of heavy repainting
Assignee | ||
Comment 16•11 years ago
|
||
CC'ing nical as well, since nical, Bas and Cwiiis are the ones who worked on the b2g-tiling code that caused this regression.
Assignee | ||
Comment 17•11 years ago
|
||
On a debug build where things are slower this is actually quite funny. Tiles are just being misplaced, making it look like a jigsaw puzzle:
https://www.youtube.com/watch?v=Us5sH6AY_4c
Bas, nical, any thoughts on where the problem might be?
Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(bas)
Updated•11 years ago
|
Flags: needinfo?(nical.bugzilla)
Assignee | ||
Comment 18•11 years ago
|
||
After much debugging I seem to have narrowed it down to the texture pool (unsurprising in hindsight) and shrinking the size of the texture pool to 0 fixes the problem. I had tried some things that I thought were equivalent to this before (such as never returning textures to the pool) but they didn't work, so maybe there are some subtler differences between this and what I was trying. Anyway, stashing this patch here for now.
Flags: needinfo?(bas)
Assignee | ||
Comment 20•11 years ago
|
||
Comment on attachment 8410463 [details] [diff] [review]
Workaround
Review of attachment 8410463 [details] [diff] [review]:
-----------------------------------------------------------------
Would you rather distable the texture client pool some other way, if it comes to that? Tomorrow I'll take another look to see if I can figure out exactly what codepaths are triggering the improper texture reuse.
Attachment #8410463 -
Flags: feedback?(bgirard)
Comment 21•11 years ago
|
||
Comment on attachment 8410463 [details] [diff] [review]
Workaround
Review of attachment 8410463 [details] [diff] [review]:
-----------------------------------------------------------------
I would be very apprehensive to disable this cache on b2g because it means multiple synchronous round trip to the compositor to allocate every single tile. I'm afraid this will come out as major performance/checkerboarding regressions. If it comes to it we should instead double down on this issue and bring in nical/cwiiis who know the new tiling code the best.
Attachment #8410463 -
Flags: feedback?(bgirard) → feedback-
Assignee | ||
Comment 22•11 years ago
|
||
I narrowed down the problem to recycling the backbuffer while it is still in use. Based on the code at [1] I think this is the correct fix but as we said before this texture pool is kinda funky so I'm not sure if I'm missing something.
[1] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/client/TiledContentClient.cpp?rev=68c8916ca208#510
Attachment #8411058 -
Flags: review?(bgirard)
Comment 23•11 years ago
|
||
Comment on attachment 8411058 [details] [diff] [review]
Patch
Review of attachment 8411058 [details] [diff] [review]:
-----------------------------------------------------------------
I'm not familiar with this code. -> nical
Attachment #8411058 -
Flags: review?(bgirard) → review?(nical.bugzilla)
Comment 24•11 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #22)
> Created attachment 8411058 [details] [diff] [review]
> Patch
>
> I narrowed down the problem to recycling the backbuffer while it is still in
> use. Based on the code at [1] I think this is the correct fix but as we said
> before this texture pool is kinda funky so I'm not sure if I'm missing
> something.
>
> [1]
> http://mxr.mozilla.org/mozilla-central/source/gfx/layers/client/
> TiledContentClient.cpp?rev=68c8916ca208#510
In theory, as long as a gralloc buffer is bound to a gl texture, it should remain locked. Except error on our side, if a gralloc buffer is the front buffer of something, it should be bound to a texture.
What should happen here is that the client side blocks on the gralloc lock when over-producing, which is the desired result. obviously something is wrong in what I said but that was the idea.
if we take this patch, recycling will be less efficient, but we might not have a choice in the end. I just want to understand what is breaking my assumptions before giving the r+
Assignee | ||
Comment 25•11 years ago
|
||
(In reply to Nicolas Silva [:nical] from comment #24)
> In theory, as long as a gralloc buffer is bound to a gl texture, it should
> remain locked. Except error on our side, if a gralloc buffer is the front
> buffer of something, it should be bound to a texture.
> What should happen here is that the client side blocks on the gralloc lock
> when over-producing, which is the desired result. obviously something is
> wrong in what I said but that was the idea.
On Fennec we don't use gralloc so the behaviour you're describing is probably not valid there.
> if we take this patch, recycling will be less efficient, but we might not
> have a choice in the end. I just want to understand what is breaking my
> assumptions before giving the r+
Honestly I'll be happy to try out any other suggestions you have - I don't really understand this code that well so I'm just grasping at straws.
Comment 26•11 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #25)
> (In reply to Nicolas Silva [:nical] from comment #24)
> > In theory, as long as a gralloc buffer is bound to a gl texture, it should
> > remain locked. Except error on our side, if a gralloc buffer is the front
> > buffer of something, it should be bound to a texture.
> > What should happen here is that the client side blocks on the gralloc lock
> > when over-producing, which is the desired result. obviously something is
> > wrong in what I said but that was the idea.
>
> On Fennec we don't use gralloc so the behaviour you're describing is
> probably not valid there.
Oops, my bad. I am still actively thinking about this. I'll either have a counter-proposal or a r+ by the end of tomorrow.
Comment 27•11 years ago
|
||
Comment on attachment 8411058 [details] [diff] [review]
Patch
Review of attachment 8411058 [details] [diff] [review]:
-----------------------------------------------------------------
r=me with the modification below:
::: gfx/layers/client/TiledContentClient.cpp
@@ +484,5 @@
> TileClient::DiscardBackBuffer()
> {
> if (mBackBuffer) {
> MOZ_ASSERT(mBackLock);
> + if (mBackLock->GetReadCount() > 1) {
Please change this condition to
(!mBackBuffer->ImplementsLocking() && mBackLock->GetReadCount() > 1)
This will keep the current behavior where the texture's blocking lock will prevent over-production (on b2g and windows D3D11), and fallback to discarding tiles more aggressively if the the TextureClient doesn't have a blocking lock (like on fennec).
Attachment #8411058 -
Flags: review?(nical.bugzilla) → review+
Assignee | ||
Comment 28•11 years ago
|
||
Makes sense. I verified the updated patch still fixes the problem on Fennec and landed:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a0fb78183db0
Assignee | ||
Comment 29•11 years ago
|
||
This is the updated patch. Requesting uplift to 30 as well (currently aurora, soon to be beta)
[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 963073
User impact if declined: while content is repainting (e.g. animated gifs, dragging around selection handles, and many other scenarios) often the repainted area will be of the wrong spot, because some other tile's content is painted instead of the right thing. This results in ugly visual glitching, pretty severe in some cases.
Testing completed (on m-c, etc.): locally so far
Risk to taking this patch (and alternatives if risky): this code is used on B2G, metro, and fennec, so it has the potential to negatively impact B2G or metro. However the additional condition nical had me add to the patch should restrict it to Fennec.
String or IDL/UUID changes made by this patch: none
Attachment #8411058 -
Attachment is obsolete: true
Attachment #8412591 -
Flags: review+
Attachment #8412591 -
Flags: approval-mozilla-aurora?
Comment 30•11 years ago
|
||
> Testing completed (on m-c, etc.): locally so far
For driver, tested the patch locally last evening against the steps outlined in this bug and it's duplicates and it worked for me.
Assignee | ||
Updated•11 years ago
|
Comment 31•11 years ago
|
||
Prime example of naming actively confusing even us: this cset landed with this commit msg, " Don't flag texture clients for reuse [...]". A TextureClient is not "a client" but "a texture" (on the client side). When the idea that's floating around, to s/TextureClient/ClientTexture/.
Comment 32•11 years ago
|
||
*Whence
Comment 33•11 years ago
|
||
(In reply to Benoit Jacob [:bjacob] from comment #31)
> Prime example of naming actively confusing even us: this cset landed with
> this commit msg, " Don't flag texture clients for reuse [...]". A
> TextureClient is not "a client" but "a texture" (on the client side). When
> the idea that's floating around, to s/TextureClient/ClientTexture/.
+1. While we're at it, let's please fix the inconsistency where in "ClientLayer", "Layer" is at the end but in "LayerComposite", "Layer" is at the beginning.
Comment 34•11 years ago
|
||
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 31
Updated•11 years ago
|
Attachment #8412591 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee | ||
Comment 35•11 years ago
|
||
Uplifted with minor rebase:
https://hg.mozilla.org/releases/mozilla-aurora/rev/10a8ef2b96c7
Updated•11 years ago
|
Status: RESOLVED → VERIFIED
Comment 36•11 years ago
|
||
The patch for this looks sound to me, clearing needinfo.
Flags: needinfo?(chrislord.net)
Comment 37•10 years ago
|
||
Using the gif provided in the URL, there are no more glitches/flashes, so:
Verified fixed on:
Build: Firefox for Android 30 Beta 4
Device: Alcatel One Touch (Android 4.1.2)
Updated•4 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•