Closed
Bug 1023882
Opened 10 years ago
Closed 10 years ago
Refactor ClientTiledThebesLayer::Render()'s logic to be more easily readable
Categories
(Core :: Graphics: Layers, defect)
Tracking
()
RESOLVED
FIXED
mozilla33
People
(Reporter: kats, Assigned: kats)
References
Details
Attachments
(8 files, 3 obsolete files)
(deleted),
patch
|
cwiiis
:
review+
bajaj
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cwiiis
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cwiiis
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cwiiis
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cwiiis
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cwiiis
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cwiiis
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
cwiiis
:
review+
|
Details | Diff | Splinter Review |
The Render function in ClientTiledThebesLayer is weird because it gets called multiple times with different state bits set, and so you have to mentally keep track of a lot of things to figure out what actually gets run on any given execution of the function. I refactored it a bunch - not sure if I made it better but I think I did a little bit, and at least I understand it now. Would like to land these patches assuming I didn't break anything.
Assignee | ||
Comment 1•10 years ago
|
||
Attachment #8438437 -
Flags: review?(chrislord.net)
Assignee | ||
Comment 2•10 years ago
|
||
Attachment #8438438 -
Flags: review?(chrislord.net)
Assignee | ||
Comment 3•10 years ago
|
||
Attachment #8438439 -
Flags: review?(chrislord.net)
Assignee | ||
Comment 4•10 years ago
|
||
Attachment #8438440 -
Flags: review?(chrislord.net)
Assignee | ||
Comment 5•10 years ago
|
||
Attachment #8438442 -
Flags: review?(chrislord.net)
Assignee | ||
Updated•10 years ago
|
Attachment #8438442 -
Attachment description: Split the invalid region calculation from the low-precision invalid region calculation → Part 5 - Split the invalid region calculation from the low-precision invalid region calculation
Assignee | ||
Comment 6•10 years ago
|
||
Attachment #8438443 -
Flags: review?(chrislord.net)
Assignee | ||
Comment 7•10 years ago
|
||
Attachment #8438444 -
Flags: review?(chrislord.net)
Assignee | ||
Comment 8•10 years ago
|
||
Attachment #8438445 -
Flags: review?(chrislord.net)
Comment 9•10 years ago
|
||
Comment on attachment 8438437 [details] [diff] [review]
Part 1 - Small changes
Review of attachment 8438437 [details] [diff] [review]:
-----------------------------------------------------------------
Much clearer, nice.
::: gfx/layers/client/ClientTiledThebesLayer.cpp
@@ +187,5 @@
> +ClientTiledThebesLayer::UseFastPath()
> +{
> + const FrameMetrics& parentMetrics = GetParent()->GetFrameMetrics();
> + bool multipleTransactionsNeeded = gfxPrefs::UseProgressiveTilePainting()
> + || gfxPrefs::UseLowPrecisionBuffer()
I think currently if you disable progressive tile painting, low precision painting won't happen(?) - I'd be in favour of fixing that though, so this can slide.
::: gfx/layers/client/ClientTiledThebesLayer.h
@@ +80,5 @@
> */
> void BeginPaint();
>
> /**
> + * Determine if we can use a fast path to just do a single high-precision
nit, comma after 'high-precision'.
Attachment #8438437 -
Flags: review?(chrislord.net) → review+
Comment 10•10 years ago
|
||
Comment on attachment 8438438 [details] [diff] [review]
Part 2 - Create a unified first-transaction block of code
Review of attachment 8438438 [details] [diff] [review]:
-----------------------------------------------------------------
Excellent.
Attachment #8438438 -
Flags: review?(chrislord.net) → review+
Comment 11•10 years ago
|
||
Comment on attachment 8438439 [details] [diff] [review]
Part 3 - Extract a high-precision render function
Review of attachment 8438439 [details] [diff] [review]:
-----------------------------------------------------------------
Again, nice.
::: gfx/layers/client/ClientTiledThebesLayer.cpp
@@ +200,5 @@
> + if (aInvalidRegion.IsEmpty() || mPaintData.mLowPrecisionPaintCount != 0) {
> + return false;
> + }
> +
> + // Only draw progressively when the resolution is unchanged.
I wonder if we can factor out the resolution change into the UseFastPath function? The HasShadowTarget bit here is a bit suspect too, but I'd put money on a whole bunch of ref-tests failing if we remove it :(
Attachment #8438439 -
Flags: review?(chrislord.net) → review+
Comment 12•10 years ago
|
||
Comment on attachment 8438440 [details] [diff] [review]
Part 4 - Extract a low-precision render function
Review of attachment 8438440 [details] [diff] [review]:
-----------------------------------------------------------------
Good, with the first comment addressed.
::: gfx/layers/client/ClientTiledThebesLayer.cpp
@@ +417,5 @@
> TILING_PRLOG_OBJ(("TILING 0x%p: Low-precision valid region is %s\n", this, tmpstr.get()), mLowPrecisionValidRegion);
> TILING_PRLOG_OBJ(("TILING 0x%p: Low-precision invalid region is %s\n", this, tmpstr.get()), lowPrecisionInvalidRegion);
>
> // Render the low precision buffer, if there's area to invalidate and the
> // visible region is larger than the critical display port.
The latter part of this comment should be moved into the RenderLowPrecision method.
::: gfx/layers/client/ClientTiledThebesLayer.h
@@ +86,5 @@
> */
> bool UseFastPath();
>
> + // Helper functions to do the high- and low-precision paints, respectively.
> + // These functions return true if they updated the paint buffers.
Any reason to go from C-style to C++-style comments here? I get that it's commenting two functions, but it seems out of place - might as well comment both separately I think? I'm not fussy, so your choice.
Attachment #8438440 -
Flags: review?(chrislord.net) → review+
Assignee | ||
Comment 13•10 years ago
|
||
Comment on attachment 8438443 [details] [diff] [review]
Part 6 - Move the SetRepeatTransaction out of TiledContentClient
This part seems to have a bug based on testing on Fennec. Unflagging for now.
Attachment #8438443 -
Flags: review?(chrislord.net)
Assignee | ||
Comment 14•10 years ago
|
||
Added a missing hunk
Attachment #8438443 -
Attachment is obsolete: true
Attachment #8438486 -
Flags: review?(chrislord.net)
Assignee | ||
Comment 15•10 years ago
|
||
Rebased
Attachment #8438444 -
Attachment is obsolete: true
Attachment #8438444 -
Flags: review?(chrislord.net)
Attachment #8438487 -
Flags: review?(chrislord.net)
Assignee | ||
Comment 16•10 years ago
|
||
Rebased
Attachment #8438445 -
Attachment is obsolete: true
Attachment #8438445 -
Flags: review?(chrislord.net)
Attachment #8438489 -
Flags: review?(chrislord.net)
Comment 17•10 years ago
|
||
Comment on attachment 8438442 [details] [diff] [review]
Part 5 - Split the invalid region calculation from the low-precision invalid region calculation
Review of attachment 8438442 [details] [diff] [review]:
-----------------------------------------------------------------
Ok.
Attachment #8438442 -
Flags: review?(chrislord.net) → review+
Comment 18•10 years ago
|
||
Comment on attachment 8438486 [details] [diff] [review]
Part 6 - Move the SetRepeatTransaction out of TiledContentClient
Review of attachment 8438486 [details] [diff] [review]:
-----------------------------------------------------------------
r+ with comment addressed.
::: gfx/layers/client/ClientTiledThebesLayer.cpp
@@ +406,5 @@
> + if (!mPaintData.mPaintFinished) {
> + // There is still more high-res stuff to paint, so we're not
> + // done yet. A subsequent transaction will take care of this.
> + ClientManager()->SetRepeatTransaction();
> + return;
Previously EndPaint(false) would be called before returning here I think. Am I correct in thinking this, and if so, is it intended to skip it?
Attachment #8438486 -
Flags: review?(chrislord.net) → review+
Comment 19•10 years ago
|
||
Comment on attachment 8438487 [details] [diff] [review]
Part 7 - Move down the low-precision invalid region calculation to where it's needed
Review of attachment 8438487 [details] [diff] [review]:
-----------------------------------------------------------------
Seems sound.
Attachment #8438487 -
Flags: review?(chrislord.net) → review+
Comment 20•10 years ago
|
||
Comment on attachment 8438489 [details] [diff] [review]
Part 8 - Remove the argument to EndPaint
Review of attachment 8438489 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good, nice work :)
::: gfx/layers/client/ClientTiledThebesLayer.cpp
@@ +430,5 @@
> }
> }
>
> + // If we get here, we've done all the high- and low-precision
> + // paints we wanted to do, so we can finish the paint and chill.
https://www.youtube.com/watch?feature=player_detailpage&v=eQZc53dpPrQ#t=34
Attachment #8438489 -
Flags: review?(chrislord.net) → review+
Assignee | ||
Comment 21•10 years ago
|
||
(In reply to Chris Lord [:cwiiis] from comment #9)
> Part 1 - Small changes
>
> I think currently if you disable progressive tile painting, low precision
> painting won't happen(?) - I'd be in favour of fixing that though, so this
> can slide.
I don't see any code that would disable low-precision painting if the progressive tile painting is turned off. However it is kind of odd that low-res tiles are always drawn using the ProgressiveUpdate call, even if progressive tile painting is turned off (I think). There's definitely some entanglement between the two prefs that we should sort out.
>
> nit, comma after 'high-precision'.
Fixed.
(In reply to Chris Lord [:cwiiis] from comment #11)
> Part 3 - Extract a high-precision render function
>
> > + // Only draw progressively when the resolution is unchanged.
>
> I wonder if we can factor out the resolution change into the UseFastPath
> function? The HasShadowTarget bit here is a bit suspect too, but I'd put
> money on a whole bunch of ref-tests failing if we remove it :(
Maybe. I didn't understand the purpose of HasShadowTarget so I was unsure if we could lump that into the fast-path code or not.
(In reply to Chris Lord [:cwiiis] from comment #12)
> Part 4 - Extract a low-precision render function
> > // Render the low precision buffer, if there's area to invalidate and the
> > // visible region is larger than the critical display port.
>
> The latter part of this comment should be moved into the RenderLowPrecision
> method.
Done.
> > + // Helper functions to do the high- and low-precision paints, respectively.
> > + // These functions return true if they updated the paint buffers.
>
> Any reason to go from C-style to C++-style comments here? I get that it's
> commenting two functions, but it seems out of place - might as well comment
> both separately I think? I'm not fussy, so your choice.
Not really. I changed it back to /** comments and added one for each function.
(In reply to Chris Lord [:cwiiis] from comment #18)
> Part 6 - Move the SetRepeatTransaction out of TiledContentClient
> > + if (!mPaintData.mPaintFinished) {
> > + // There is still more high-res stuff to paint, so we're not
> > + // done yet. A subsequent transaction will take care of this.
> > + ClientManager()->SetRepeatTransaction();
> > + return;
>
> Previously EndPaint(false) would be called before returning here I think. Am
> I correct in thinking this, and if so, is it intended to skip it?
Previously EndPaint(false) would get called, but it would be a no-op when mPaintData.mPaintFinished is false, because of if the (!aFinish && !mPaintData.mPaintFinished) check at the top. Now it doesn't get called at all in this case, which is equivalent, and allows the cleanup in part 8.
(In reply to Chris Lord [:cwiiis] from comment #20)
> https://www.youtube.com/watch?feature=player_detailpage&v=eQZc53dpPrQ#t=34
:)
Assignee | ||
Comment 22•10 years ago
|
||
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/05e8c4ee536d
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/039cfd09fa32
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/a92435f75f89
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/42e44e1eb363
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/4ba2f2a98114
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/c38a0cb1e00c
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/811d536995cf
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/501167f9e525
I backed these out in https://hg.mozilla.org/integration/mozilla-inbound/rev/44d144664d36 for breaking b2g mochitest-3 (and also b2g mochitest-4 when the test got moved to a different chunk: https://tbpl.mozilla.org/php/getParsedLog.php?id=41541251&tree=Mozilla-Inbound
Flags: needinfo?(bugmail.mozilla)
Assignee | ||
Comment 24•10 years ago
|
||
I'll look into this failure. Does seem to correlate 100% with the time my patches were in the tree. I also got a talos regression notification for Android rck2 from these changes, so I must have introduced some sort of functional change that I didn't intend to.
Flags: needinfo?(bugmail.mozilla)
Assignee | ||
Comment 25•10 years ago
|
||
So I can reproduce a problem on that test when running the B2G emulator mochitest locally. Not sure if it's the exact same problem but it certainly seems plausible. The problem is that the client side gets stuck at [1], because it tries to access the mutex after the compositor side has deleted it (but before the client side receives the delete notification).
There's probably also a problem in these patches that causes this problem to be exposed in the first place, because in theory there should be no functional changes here. Not sure what that problem is but I have try pushes pending to at least narrow down which patches are to blame.
[1] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/CompositorChild.cpp?rev=9a72db713446#239
Comment 27•10 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #26)
> Randall, any thoughts on comment 25?
It isn't clear why the lock would hang. The CrossProcessMutex should continue to work even if the compositor side has been deleted. I did not see any place where the mutex could be locked and then deleted leaving the mutex in a locked state. I wonder if shared memory used by the mutex is getting garbage written to it some how? From what I remember when I looked at the pthread mutex implementation was the mutex handle was really just an int. If the handle was non-zero when trying to obtain the lock, it would suspend until the handle returned to zero. Worst case would be to send a synchronous message to the Child process to ensure the mutex gets deleted on the child side before the compositor deletes it? Maybe cast the handle to an int and print out its value after the delete and before the hanging lock to see if it is changing between delete and the attempted lock? Here is the pthread_mutex_t struct:
typedef struct
{
int volatile value;
} pthread_mutex_t;
To dig into the mutex source:
http://androidxref.com/4.3_r2.1/xref/bionic/libc/include/pthread.h
http://androidxref.com/4.3_r2.1/xref/bionic/libc/bionic/pthread.c
Flags: needinfo?(rbarker)
Assignee | ||
Comment 28•10 years ago
|
||
Ah, what's happening is that the mutex creation is quickly followed by the mutex deletion on the compositor side. The mutex is deleted before the client side receives the creation message, and so by the time it tries to create its side of the mutex it has already been torn down. For some reason when the client side uses the stale handle to create the client-side CrossProcessMutex it doesn't fail, but instead points to some other shared memory buffer or something, and so the mutex there is "locked".
Comment 29•10 years ago
|
||
Maybe add a check [1] that mCount > 1, if not, call pthread_mutex_init again? When pthread_mutex_destroy is called, I'm pretty sure the handle is set to a non-zero value and needs to be initialized again.
[1] http://mxr.mozilla.org/mozilla-central/source/ipc/glue/CrossProcessMutex_posix.cpp#88
Assignee | ||
Comment 30•10 years ago
|
||
I filed bug 1024728 to spin off this CrossProcessMutex thing. Based on my try pushes part 7 of this bug seems to have introduced the functional change.
Comment 31•10 years ago
|
||
Comment on attachment 8438487 [details] [diff] [review]
Part 7 - Move down the low-precision invalid region calculation to where it's needed
Review of attachment 8438487 [details] [diff] [review]:
-----------------------------------------------------------------
::: gfx/layers/client/ClientTiledThebesLayer.cpp
@@ -239,5 @@
> ClientTiledThebesLayer::RenderLowPrecision(nsIntRegion& aInvalidRegion,
> LayerManager::DrawThebesLayerCallback aCallback,
> void* aCallbackData)
> {
> - if (!aInvalidRegion.IsEmpty() &&
I don't see this !aInvalidRegion.IsEmpty() check coming back at any point - perhaps this is the cause of the functional difference?
Assignee | ||
Comment 32•10 years ago
|
||
(In reply to Chris Lord [:cwiiis] from comment #31)
> I don't see this !aInvalidRegion.IsEmpty() check coming back at any point -
> perhaps this is the cause of the functional difference?
There's a lowPrecisionInvalidRegion.IsEmpty() check before the call to RenderLowPrecision ever happens so I don't think that's the problem. I think the problem might be that before lowPrecisionInvalidRegion was getting before the RenderHighPrecision code and now it's getting computed after. mValidRegion is used in the computation of lowPrecisionInvalidRegion, and it's modified by RenderHighPrecision, so the new value computed will be different. I'll do a try push to see if I can confirm this.
Assignee | ||
Comment 33•10 years ago
|
||
Oh it also might be that I removed the critical displayport isEmpty check from computing the lowPrecisionInvalidRegion. Will test that too.
Assignee | ||
Comment 34•10 years ago
|
||
Comment 32 seems to be right, based on this: https://tbpl.mozilla.org/?tree=Try&rev=97a9bed13bde. I'm also doing a full run at https://tbpl.mozilla.org/?tree=Try&rev=908cc9577c76 just to be safe because I've been backed out way too many times lately.
Assignee | ||
Comment 35•10 years ago
|
||
Re-landed with the lowPrecisionInvalidRegion computation moved up.
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/7577d77815ba
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/e33d7e715a91
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/c90cdfbfe4c6
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/e829b12010d4
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/fe43edb23d89
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/c027aed061ce
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/8508e0cc42b8
remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/32c13ea4c9e4
Comment 36•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/7577d77815ba
https://hg.mozilla.org/mozilla-central/rev/e33d7e715a91
https://hg.mozilla.org/mozilla-central/rev/c90cdfbfe4c6
https://hg.mozilla.org/mozilla-central/rev/e829b12010d4
https://hg.mozilla.org/mozilla-central/rev/fe43edb23d89
https://hg.mozilla.org/mozilla-central/rev/c027aed061ce
https://hg.mozilla.org/mozilla-central/rev/8508e0cc42b8
https://hg.mozilla.org/mozilla-central/rev/32c13ea4c9e4
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
Comment 37•10 years ago
|
||
Could this have cause Android 4.04 robocop checkerboard regression?
http://graphs.mozilla.org/graph.html#tests=[[201,63,29]]&sel=1402547543477.6853,1403024858163,2.4264705882352935,13.235294117647058&displayrange=30&datatype=running
Assignee | ||
Comment 38•10 years ago
|
||
Yeah I got that email too. It's likely this change. I'd rather not back this out again because that test is pretty finicky and I'm going to be making other changes to this code as well which will probably move the numbers around a bunch.
Assignee | ||
Comment 39•10 years ago
|
||
Would like to uplift this to 2.0 / Gecko32 because a bunch of the other low-res fixes landed after it and need uplifting. Uplifting this will make those fixes easier to uplift, and will also make it easier to debug issues on 2.0 if we have further bugs filed against it.
Assignee | ||
Comment 40•10 years ago
|
||
Comment on attachment 8438437 [details] [diff] [review]
Part 1 - Small changes
[Approval Request Comment]
Bug caused by (feature/regressing bug #): none
User impact if declined: No user-visible impact here. The changes are purely code reorganization. However not uplifting this will make it much harder to fix user-visible bug fixes (see list of dep bugs). This is feature that is new to B2G 2.0 and so there will probably be a few more bugfixes needed as things stabilize; having these patches uplifted will speed up development quit a bit.
Testing completed (on m-c, etc.): locally, on m-c for a few days now. I've been working in this code almost exclusively since it landed and haven't found any further issues with it. It did cause a Fennec talos regression that is being tracked separately. I'm actively looking into it and should have a fix for it soon.
Risk to taking this patch (and alternatives if risky): low risk. Code is only used on Fennec and B2G.
String or IDL/UUID changes made by this patch: none
Attachment #8438437 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 41•10 years ago
|
||
^ approval request applies to all patches in this bug
Assignee | ||
Updated•10 years ago
|
status-b2g-v1.4:
--- → wontfix
status-b2g-v2.0:
--- → affected
status-b2g-v2.1:
--- → fixed
status-firefox31:
--- → wontfix
status-firefox32:
--- → affected
status-firefox33:
--- → fixed
Updated•10 years ago
|
Attachment #8438437 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 42•10 years ago
|
||
Would not block on this, but considered approval given the user impact and bake time in comment #40.
blocking-b2g: 2.0? → ---
Assignee | ||
Comment 43•10 years ago
|
||
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/f79198359c86
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/d02aae5cb847
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/f2a65256e3cb
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/8ed48f1320b8
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/fa1f2e6b820b
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/71229cd93bd6
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/5c082ed1469b
remote: https://hg.mozilla.org/releases/mozilla-aurora/rev/b5411d889c2d
You need to log in
before you can comment on or make changes to this bug.
Description
•