Closed Bug 963073 (b2g-tiling) Opened 11 years ago Closed 10 years ago

Turn on tiled scrollable layers on b2g

Categories

(Core :: Graphics: Layers, defect)

All
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla30
blocking-b2g 1.4+
Tracking Status
firefox28 --- wontfix
firefox29 --- wontfix
firefox30 --- fixed
b2g-v1.4 --- fixed

People

(Reporter: cwiiis, Assigned: cwiiis)

References

(Depends on 2 open bugs)

Details

Attachments

(2 files, 8 obsolete files)

This is a bug to track the issues that need to be fixed before we turn on tiles everywhere for b2g. Note, there is a similar bug, bug 894333 that only concerns the browser. This may end up obsoleting that bug.
Depends on: 963076
Depends on: 915234
Depends on: 950841
I'm adding some gralloc bugs as blockers, but this may depend on whether we can show that gralloc could be a significant performance increase for tiles (I would say that this isn't a given).

We may also want to consider some kind of EGLImage-backed tiles implementation to move the texture allocation away from the compositor, if that makes sense...
Depends on: 959089
I don't think we will need that. Content should send the compositor gralloc tiles that content allocated (from the image bridge).
Depends on: 939348
We will use this bug to cover the "scrolling" scenario.  If the measurements support us doing tiling everywhere, we'll create another bug.
Summary: [meta] Turn on tiled layers on b2g → [meta] Turn on tiled scrollable layers on b2g
blocking-b2g: --- → 1.4+
Depends on: 893303
Alias: b2g-tiling
Depends on: 971914
Clearing blocking flag - this is a meta bug, which we will not block on. Per a drivers' decision - please nominate individual actionable dependencies.
blocking-b2g: 1.4+ → ---
Assignee: nobody → chrislord.net
blocking-b2g: --- → 1.4+
Summary: [meta] Turn on tiled scrollable layers on b2g → Turn on tiled scrollable layers on b2g
Depends on: 978248
Depends on: 979973
Depends on: 979084
Depends on: 979712
Attached patch Tiling branch work (obsolete) (deleted) — — Splinter Review
This is the diff of the tiling branch against the large merge point on central. Attaching it to this bug as it seemed the most appropriate, but this fixes bug 893303, bug 915234, bug 963076 and probably a few other related things.
Comment on attachment 8387468 [details] [diff] [review]
Tiling branch work

Review of attachment 8387468 [details] [diff] [review]:
-----------------------------------------------------------------

Here's my first pass through, I'm going to start fixing anything that's straight-forward.

::: gfx/layers/Compositor.h
@@ +137,5 @@
> + */
> +class CompositorBackendSpecificData : public RefCounted<CompositorBackendSpecificData>
> +{
> +public:
> +  //MOZ_DECLARE_REFCOUNTED_TYPENAME(CompositorBackendSpecificData)

Need to uncomment this now I think.

::: gfx/layers/client/ClientThebesLayer.cpp
@@ +22,5 @@
>  #include "nsCOMPtr.h"                   // for already_AddRefed
>  #include "nsISupportsImpl.h"            // for Layer::AddRef, etc
>  #include "nsRect.h"                     // for nsIntRect
>  #include "gfx2DGlue.h"
> +#include "gfxPrefs.h"

Not sure why this has moved... Merging weirdness?

@@ +174,5 @@
>  #endif
> +      gfxPrefs::LayersTilesEnabled() &&
> +      (AsShadowForwarder()->GetCompositorBackendType() == LayersBackend::LAYERS_OPENGL ||
> +       AsShadowForwarder()->GetCompositorBackendType() == LayersBackend::LAYERS_D3D9 ||
> +       AsShadowForwarder()->GetCompositorBackendType() == LayersBackend::LAYERS_D3D11)) {

I wonder if we even need to check the compositor backend type now...?

::: gfx/layers/client/CompositableClient.h
@@ +142,5 @@
>     * A hook for the when the Compositable is detached from it's layer.
>     */
>    virtual void OnDetach() {}
>  
> +  virtual void ClearCachedResources() {}

Needs documentation.

::: gfx/layers/client/TextureClient.cpp
@@ +213,5 @@
>  }
>  
> +#ifdef MOZ_WIDGET_GONK
> +static bool
> +DisableGralloc(SurfaceFormat aFormat)

Let's be careful when doing the final merge that this file mirrors the removals in CompositableClient.cpp correctly.

::: gfx/layers/client/TextureClient.h
@@ +224,5 @@
>    virtual void Unlock() {}
>  
>    virtual bool IsLocked() const = 0;
>  
> +  virtual bool CopyToTextureClient(TextureClient* aTarget,

Needs documentation (especially with regard to the expected state of locks).

@@ +235,5 @@
>     * use immediate uploads (see TextureFlags in CompositorTypes.h)
>     */
>    virtual bool ImplementsLocking() const { return false; }
>  
> +  virtual bool HasInternalBuffer() const = 0;

Needs documentation.

::: gfx/layers/client/TextureClientPool.h
@@ +17,5 @@
> +namespace layers {
> +
> +class ISurfaceAllocator;
> +
> +class TextureClientPool : public RefCounted<TextureClientPool>

All the function declarations in this class need documentation.

@@ +48,5 @@
> +  static const uint32_t sMinCacheSize = 0;
> +
> +  // This is the number of texture clients we don't want to exceed, including
> +  // outstanding TextureClients (from GetTextureClient()), cached
> +  // TextureClients and deferred-return TextureClients.

This needs updating, as deferred-return TextureClients are now considered outstanding.

::: gfx/layers/client/TiledContentClient.cpp
@@ +34,5 @@
>  #ifdef GFX_TILEDLAYER_DEBUG_OVERLAY
>  #include "cairo.h"
>  #include <sstream>
>  using mozilla::layers::Layer;
> +static void DrawDebugOverlay(mozilla::gfx::DrawTarget* dt, int x, int y, int width, int height)

For the record, I don't think this function fully works anymore - I didn't see the text getting drawn last time I enabled it. We could do with the bitmap font debugging functions that Vlad talked about.

::: gfx/layers/client/TiledContentClient.h
@@ +46,5 @@
> +class ClientLayerManager;
> +
> +
> +// A class to help implement copy-on-write semantics for shared tiles.
> +class gfxSharedReadLock : public AtomicRefCounted<gfxSharedReadLock> {

We should consider splitting these into their own files, they aren't really tiling-specific. The methods also need documentation (especially that on creation, the readcount is 1).

@@ +134,5 @@
>   * each tile keeps a reference to a texture client. The texture client
>   * is backed by a gfxReusableSurfaceWrapper that implements a
>   * copy-on-write mechanism while locked. The tile should be
>   * locked before being sent to the compositor and unlocked
>   * as soon as it is uploaded to prevent a copy.

This comment needs updating.

@@ +334,5 @@
>  /**
>   * Provide an instance of TiledLayerBuffer backed by image surfaces.
>   * This buffer provides an implementation to ValidateTile using a
>   * thebes callback and can support painting using a single paint buffer
>   * which is much faster then painting directly into the tiles.

This comment also needs updating. Tests are showing that the single paint buffer is actually quite a bit slower now too... One thing at a time, I guess :)

::: gfx/layers/composite/CompositableHost.cpp
@@ +174,5 @@
>    default:
>      MOZ_CRASH("Unknown CompositableType");
>    }
> +  // We know that Tiled buffers don't use the compositable backend-specific
> +  // data, so don't bother creating it.

Actually, this isn't used anywhere with the current changes. We should either just get rid of compositable backend specific data, or we should add a patch that restores the old path in GrallocTextureHost. I'm in favour of the former, but really, we should probably do the latter. I have a patch for the latter.

::: gfx/layers/composite/TiledContentHost.cpp
@@ +110,5 @@
> +  if(!IsValid()) {
> +    return;
> +  }
> +  // The TextureClients were created with the TEXTURE_IMMEDIATE_UPLOAD flag,
> +  // so calling Update on all the texture hosts will perform the texture upload.

I wonder if there's a way we can assert this? If we don't set this flag on internally buffered surfaces, we could end up with some weirdness.

@@ +130,5 @@
>    printf_stderr("Upload tile %i, %i\n", aTileOrigin.x, aTileOrigin.y);
>    long start = PR_IntervalNow();
>  #endif
>  
> +  aTile.mTextureHost->Updated(nullptr);

So here we upload the entire texture, like we used to, but we could easily use aDirtyRect if we wanted. Probably worth noting in a comment.

@@ +168,5 @@
> +TiledContentHost::~TiledContentHost()
> +{
> +  MOZ_COUNT_DTOR(TiledContentHost);
> +
> +  if (mPendingUpload) {

Probably wouldn't hurt to comment on what's going on here.

@@ +455,5 @@
>    for (;it != stop; ++it) {
>      fprintf_stderr(aFile, "%s", aPrefix);
>      fprintf_stderr(aFile, aDumpHtml ? "<li> <a href=" : "Tile ");
> +    //TODO: Check if there's a DumpTextureHost, if not write one
> +    //DumpDeprecatedTextureHost(aFile, it->mDeprecatedTextureHost);

Someone should follow up on this.

::: gfx/layers/composite/TiledContentHost.h
@@ +87,5 @@
>  
> +  bool IsPlaceholderTile() const { return mTextureHost == nullptr; }
> +
> +  void ReadUnlock() {
> +    NS_WARN_IF_FALSE(mTextureHost == nullptr || mSharedLock != nullptr,

Should probably document this warning - it's basically saying that if you have a texture host, you must have a lock. Could probably express it in an easier to read way.

@@ +121,5 @@
> +  void ReadUnlock();
> +
> +  void ReleaseTextureHosts();
> +
> +  void Upload();

This method especially could do with documentation I think. It's unlikely to actually do anything for textures with no internal buffer.

@@ +147,4 @@
>  };
>  
>  /**
> + * XXX Sort out this comment.

Yup.

::: gfx/layers/ipc/CompositableTransactionParent.cpp
@@ +67,5 @@
>  
>  bool
> +CompositableParentManager::IsCrossProcess() const
> +{
> +  MOZ_ASSERT(false, "TODO[nical] - Implement me!");

File a bug for this, or implement before merge?

::: gfx/layers/ipc/ShadowLayers.h
@@ +254,5 @@
>    void SetMask(ShadowableLayer* aLayer,
>                 ShadowableLayer* aMaskLayer);
>  
>    /**
> +   * See CompositableForwarder::UseTiledLayerBuffer

Seeing as this is an override, maybe we should just remove this comment(?)

::: gfx/layers/opengl/CompositorOGL.cpp
@@ +49,5 @@
>  #endif
> +
> +#ifdef MOZ_WIDGET_GONK
> +#include "EGLImageHelpers.h"
> +#endif

This hunk can disappear, this was for when the code allocated a 1x1 EGLImage to help with unbinding.

@@ +1373,5 @@
> +#ifdef MOZ_WIDGET_GONK
> +  if (mCompositorBackendSpecificData) {
> +    static_cast<CompositorOGLGonkBackendSpecificData*>(mCompositorBackendSpecificData.get())->EndFrame();
> +  }
> +#endif

We still get genlock failures with some drivers, I wonder if this might be improved by moving this block to *before* the SwapBuffers above?

::: gfx/layers/opengl/GrallocTextureHost.h
@@ +57,5 @@
>    {
>      mGraphicBuffer = nullptr;
>    }
>  
>    TemporaryRef<gfx::DataSourceSurface> GetAsSurface();

Note, we need to audit this function - I'm not sure if with the current locking and the way this function gets used if it won't cause genlock failures.

This function is pretty poorly documented, it's not clear whether a lock should be taken or not when using it and when I last looked at mxr, whether this is the case is unclear. Personally, I think a lock should be taken.

::: gfx/layers/opengl/TextureClientOGL.h
@@ +93,5 @@
>    virtual bool ToSurfaceDescriptor(SurfaceDescriptor& aOutDescriptor) MOZ_OVERRIDE;
>  
>    virtual TextureClientData* DropTextureData() MOZ_OVERRIDE { return nullptr; }
>  
> +  virtual bool HasInternalBuffer() const MOZ_OVERRIDE { return false; }

This is unnecessary, the default already returns false and this doesn't inherit from anything else.

::: gfx/layers/opengl/TextureHostOGL.cpp
@@ +53,1 @@
>    return new CompositableDataGonkOGL();

Heh, this would explain why my patch that restores the old path seemed to have no effect... Whoops :)
Attached patch Tiling branch work v2 (obsolete) (deleted) — — Splinter Review
I've addressed everything in my comments, except:
- DrawDebugOverlay
- splitting out gfxSharedReadLock
- the compositable backend data stuff (I have a patch on another machine that I won't get to for another hour or so)*
- DumpTextureHost
- ReadUnlock warning*
- XXX Sort out this comment*
- CompositableParentManager::IsCrossProcess
- Audit GrallocTextureHost::GetAsSurface*

Stuff marked with a star is stuff I intend to fix now, everything else and any issues I missed, take your pick.
Attachment #8387468 - Attachment is obsolete: true
(In reply to Chris Lord [:cwiiis] from comment #7)
> - DumpTextureHost

Done

> - CompositableParentManager::IsCrossProcess

Done

> - Audit GrallocTextureHost::GetAsSurface*

Note that GetAsSurface is only used by debugging helpers
Attached patch Restore old gralloc path for non-tiles (obsolete) (deleted) — — Splinter Review
Here's a patch that restores the old path for non-tiles...
(In reply to Chris Lord [:cwiiis] from comment #9)
> Created attachment 8387565 [details] [diff] [review]
> Restore old gralloc path for non-tiles
> 
> Here's a patch that restores the old path for non-tiles...

So was looking at doing the opposite (removing the old path), and it's quite involved. We need to either remove or port GrallocDeprecatedTextureHostOGL and duplicate whatever it is that CompositableParentManager::ReturnTextureDataIfNecessary is doing (which might need some looking at anyway?).

My vote goes for applying this patch and dealing with this later.
(In reply to Chris Lord [:cwiiis] from comment #9)
> Created attachment 8387565 [details] [diff] [review]
> Restore old gralloc path for non-tiles
> 
> Here's a patch that restores the old path for non-tiles...

After even more inspection, we realised that the new path breaks the fencing stuff satoro was working on, so I don't think not restoring the old path is an option. I've pushed this to the branch.

Also fixed up the other comments I pointed out.

GetAsSurface is only used by debug code, so I think we're ok to sort that out later.
Attached patch Tiles branch work v3 (obsolete) (deleted) — — Splinter Review
I think this is possibly good enough. Obviously, I can't be the one to make that call though.
Attachment #8387521 - Attachment is obsolete: true
Attachment #8387565 - Attachment is obsolete: true
Attached patch Tiling branch work v3, rebased on m-c (obsolete) (deleted) — — Splinter Review
No changes, couple of small header conflicts resolved.
Attachment #8387577 - Attachment is obsolete: true
try push of central with bug 979084 (which will have no effect on tests, but will be handy if anyone wants to use the resultant b2g builds) and rolled up patches:

https://tbpl.mozilla.org/?tree=Try&rev=8c9c9fe90171
Whoops, forgot to add files...

https://tbpl.mozilla.org/?tree=Try&rev=3125296c850d
Attachment #8387582 - Attachment is obsolete: true
And a push with tiles enabled on b2g: https://tbpl.mozilla.org/?tree=Try&rev=a7cb42741777
Actually, here's a better b2g try push: https://tbpl.mozilla.org/?tree=Try&rev=6d98af34bb95
kats caught that I got the bug number wrong. Updating in case someone jumps the gun and pushes straight from this patch :)
Attachment #8387588 - Attachment is obsolete: true
Attached patch Enable tiles by default on b2g (deleted) — — Splinter Review
This is just the pref flip. Giving r? to milan, who can have the final decision on when we turn this on after we merge.
Attachment #8387597 - Flags: review?(milan)
Comment on attachment 8387588 [details] [diff] [review]
Tiling branch work v3, rebased on m-c, TextureClientPool added.

Review of attachment 8387588 [details] [diff] [review]:
-----------------------------------------------------------------

Here are some things I've noticed so far.

::: gfx/layers/client/TiledContentClient.h
@@ +82,5 @@
> +private:
> +  int32_t mReadCount;
> +};
> +
> +class gfxShmSharedReadLock : public gfxSharedReadLock {

It's probably worth noting that each of these locks uses a full 4k page. In the future we might want to consider putting more than one on the same page.

::: gfx/layers/composite/TiledContentHost.cpp
@@ +44,4 @@
>  
> +  // Combine any valid content that wasn't already uploaded
> +  nsIntRegion oldPaintedRegion(aOldPaintedRegion);
> +  oldPaintedRegion.And(oldPaintedRegion, mValidRegion);

Doing:
 nsIntRegion oldPaintedRegion;
 oldPaintedRegion.And(aOldPaintedRegion, mValidRegion);

Will avoid the copy into oldPaintedRegion at the begining.
Attachment #8387588 - Attachment is obsolete: false
ugh, sorry, juggling queues meant I had a stale patch that broke displayport positioning on b2g... This will likely result in errors, so I've cancelled the b2g push, left the first all push and pushed new tries for both:


All: https://tbpl.mozilla.org/?tree=Try&rev=fc55be65eca6
B2G+tiles: https://tbpl.mozilla.org/?tree=Try&rev=8d1ded8e586e
4k per lock is not going to work on Tarako. We need a pool of looks sharing pages. Please file a follow-up bug to be fixed right after landing.
Let's flip the pref after Martijn has had a chance to look at this, Monday at the latest.

Andreas - agreed - we're aiming for hamachi/nexus4/nexus5 until we get more devices to work with next week.
Flags: needinfo?(martijn.martijn)
I've just spotted that TextureHostOGL doesn't implement HasInternalBuffer (which defaults to false on TextureHost), so that's negatively impacting Android performance/memory use. Will push a fix after I've tested.
PM Triage: remains 1.4+
(In reply to Chris Lord [:cwiiis] from comment #24)
> I've just spotted that TextureHostOGL doesn't implement HasInternalBuffer
> (which defaults to false on TextureHost), so that's negatively impacting
> Android performance/memory use. Will push a fix after I've tested.

This wasn't correct, the problem was an uninitialised member variable in TiledLayerBufferComposite (but it had the same effect).

Here's a try push with that fixed: https://tbpl.mozilla.org/?tree=Try&rev=48735e1258b7

Here's an etherpad where we're tracking what we've pushed to try: https://etherpad.mozilla.org/tiling-talos

Playing with this on a Galaxy Nexus, I'm seeing the compositor occasionally block on the client (or at least, it looks that way). I don't think this should stop us from landing, but we need to investigate this.
Attached file Tiling branch work v4, rebased on m-c (obsolete) (deleted) —
Here's the latest with the unnecessary nsIntRegion copy and uninitialised mHasDoubleBufferedTiles fixed.
Attachment #8387588 - Attachment is obsolete: true
Attachment #8387596 - Attachment is obsolete: true
Attached patch Tiling branch work v5, rebased on m-c (deleted) — — Splinter Review
Removed the nsIntRegion optimisation - it didn't build, presumable because aOldPaintedRegion is const. I'm getting tired, I'll let someone else fix that.

New Android push: https://tbpl.mozilla.org/?tree=Try&rev=438f9f0e1035
Attachment #8387755 - Attachment is obsolete: true
Comment on attachment 8387763 [details] [diff] [review]
Tiling branch work v5, rebased on m-c

Review of attachment 8387763 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/client/TiledContentClient.cpp
@@ +28,5 @@
>  
> +// This is the minimum area that we deem reasonable to copy from the front buffer to the
> +// back buffer on tile updates. If the valid region is smaller than this, we just
> +// redraw it and save on the copy (and requisite surface-locking involved).
> +#define MINIMUM_TILE_COPY_AREA ((TILEDLAYERBUFFER_TILE_SIZE * TILEDLAYERBUFFER_TILE_SIZE)/16)

This could be a constant, not a define.

@@ +314,5 @@
> +  MOZ_COUNT_CTOR(gfxShmSharedReadLock);
> +
> +  if (mAllocator) {
> +#define MOZ_ALIGN_WORD(x) (((x) + 3) & ~3)
> +    if (mAllocator->AllocUnsafeShmem(MOZ_ALIGN_WORD(sizeof(ShmReadLockInfo)),

Since ShmReadLockInfo contains just a int32_t, better static_assert(sizeof==4) and then you dont need to do crazy computations and can remove MOZ_ALIGN_WORD  (which btw should be an inline function, not a macro).

::: gfx/layers/client/TiledContentClient.h
@@ +121,5 @@
> +
> +  ShmReadLockInfo* GetShmReadLockInfoPtr()
> +  {
> +    return reinterpret_cast<ShmReadLockInfo*>
> +      (mShmem.get<char>() + mShmem.Size<char>() - sizeof(ShmReadLockInfo));

What assumption are you making here about alignof(ShmReadLockInfo) ? Is that assumption safe? Can you cover it by an assertion?

::: gfx/layers/opengl/GrallocTextureHost.cpp
@@ +130,5 @@
> +  if (mCompositableBackendData) {
> +    // There are two paths for locking/unlocking - if mCompositableBackendData is
> +    // set, we use the texture on there, otherwise we use
> +    // CompositorBackendSpecificData from the compositor and bind the EGLImage
> +    // only in Lock().

Hang on, are we going back to using a single texture compositor-wide for all gralloc buffers? There should at least be a comment here explaining why this is wanted. Also, if you still have these bugs with client processes hanging because the compositor doesn't release a lock on a graphicbuffer, then this would be a place where to look suspisciously!

@@ +324,5 @@
>  
>  void
>  GrallocTextureHostOGL::Unlock()
>  {
>    // Unlock is done internally by binding the texture to another gralloc buffer

I realize that that was there before, but maybe this is still suspicious and could explain these client process hangs you were mentioning... i.e. what if no new compositing is taking place, so this locks remains, and prevents a client process from getting a write lock.
(In reply to Benoit Jacob [:bjacob] from comment #29)
> Comment on attachment 8387763 [details] [diff] [review]
> Tiling branch work v5, rebased on m-c
> 
> Review of attachment 8387763 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/layers/client/TiledContentClient.cpp
> @@ +28,5 @@
> >  
> > +// This is the minimum area that we deem reasonable to copy from the front buffer to the
> > +// back buffer on tile updates. If the valid region is smaller than this, we just
> > +// redraw it and save on the copy (and requisite surface-locking involved).
> > +#define MINIMUM_TILE_COPY_AREA ((TILEDLAYERBUFFER_TILE_SIZE * TILEDLAYERBUFFER_TILE_SIZE)/16)
> 
> This could be a constant, not a define.
> 
> @@ +314,5 @@
> > +  MOZ_COUNT_CTOR(gfxShmSharedReadLock);
> > +
> > +  if (mAllocator) {
> > +#define MOZ_ALIGN_WORD(x) (((x) + 3) & ~3)
> > +    if (mAllocator->AllocUnsafeShmem(MOZ_ALIGN_WORD(sizeof(ShmReadLockInfo)),
> 
> Since ShmReadLockInfo contains just a int32_t, better
> static_assert(sizeof==4) and then you dont need to do crazy computations and
> can remove MOZ_ALIGN_WORD  (which btw should be an inline function, not a
> macro).
> 
> ::: gfx/layers/client/TiledContentClient.h
> @@ +121,5 @@
> > +
> > +  ShmReadLockInfo* GetShmReadLockInfoPtr()
> > +  {
> > +    return reinterpret_cast<ShmReadLockInfo*>
> > +      (mShmem.get<char>() + mShmem.Size<char>() - sizeof(ShmReadLockInfo));
> 
> What assumption are you making here about alignof(ShmReadLockInfo) ? Is that
> assumption safe? Can you cover it by an assertion?

All of this was just blindly derived from gfxSharedImageSurface, so I'm afraid I don't really know off-hand (and am struggling to think about it right now)... If you think this can be simplified, great :)

> ::: gfx/layers/opengl/GrallocTextureHost.cpp
> @@ +130,5 @@
> > +  if (mCompositableBackendData) {
> > +    // There are two paths for locking/unlocking - if mCompositableBackendData is
> > +    // set, we use the texture on there, otherwise we use
> > +    // CompositorBackendSpecificData from the compositor and bind the EGLImage
> > +    // only in Lock().
> 
> Hang on, are we going back to using a single texture compositor-wide for all
> gralloc buffers? There should at least be a comment here explaining why this
> is wanted. Also, if you still have these bugs with client processes hanging
> because the compositor doesn't release a lock on a graphicbuffer, then this
> would be a place where to look suspisciously!

No, the two paths are one texture per compositable or one texture per host locked in a frame. The former pretty much equates to the latter in all cases except for tiles, where we have many textures in one compositable.

It's worth restoring the old path because satoro's fence work relies on it, but also the non-tiles gralloc paths are not well-tested with the new locking mechanism.

> @@ +324,5 @@
> >  
> >  void
> >  GrallocTextureHostOGL::Unlock()
> >  {
> >    // Unlock is done internally by binding the texture to another gralloc buffer
> 
> I realize that that was there before, but maybe this is still suspicious and
> could explain these client process hangs you were mentioning... i.e. what if
> no new compositing is taking place, so this locks remains, and prevents a
> client process from getting a write lock.

The hangs I'm seeing are on Android only, b2g seems fine - I would more suspect something in the TextureClient/Host (maybe a contending lock that shouldn't be?)

gralloc is unfortunately a black box (and more irritatingly, a different black box per driver revision per device) - I think we can debug these issues as follow-up and hopefully think up some more robust plan.
Comment on attachment 8387596 [details] [diff] [review]
Tiling branch work v3, rebased on m-c, TextureClientPool added, summary text corrected

Review of attachment 8387596 [details] [diff] [review]:
-----------------------------------------------------------------

part 1 of my review:

::: browser/devtools/profiler/cleopatra/js/parserWorker.js
@@ +1225,5 @@
>      bugNumber: "772916",
>      check: function(frames, symbols, meta) {
>  
>        return stepContains('PaintGradient', frames, symbols)
> +          && stepContains('ClientTiledLayerBuffer::PaintThebesSingleBufferDraw', frames, symbols)

I say don't bother patching this.

::: gfx/layers/TiledLayerBuffer.h
@@ +199,3 @@
>     * has completed uploading.
>     */
> +  virtual void UseTiledLayerBuffer(ISurfaceAllocator* aAllocator,

UseTiledLayerBuffer is a poor name for this. I like PaintedTiledLayerbuffer much better. Otherwise we should find a better name.

@@ +369,5 @@
> +    if (oldTileCount >= tilesMissing) {
> +      oldRetainedTiles[i] = AsDerived().GetPlaceholderTile();
> +      AsDerived().ReleaseTile(oldTile);
> +    } else {
> +      oldTileCount ++;

nit: no space here.

::: gfx/layers/client/ClientLayerManager.cpp
@@ +61,5 @@
>    mRoot = nullptr;
>  
>    MOZ_COUNT_DTOR(ClientLayerManager);
> +
> +  mTexturePools.clear();

Is this clear needed? The destructor for this object is about to run.

::: gfx/layers/client/TextureClientPool.h
@@ +17,5 @@
> +namespace layers {
> +
> +class ISurfaceAllocator;
> +
> +class TextureClientPool : public RefCounted<TextureClientPool>

This class needs a comment to describe its functionality.

@@ +30,5 @@
> +   * a cached client that was returned to the pool, or a newly allocated
> +   * client if one isn't available.
> +   *
> +   * All clients retrieved by this method should be returned using the return
> +   * functions, or reported lost so that the pool can manage its size correctly.

I'm not a fan of the feature that a texture client can become loss. This complicates the lifetime quite a bit. Is this *really* needed?

@@ +44,5 @@
> +  /**
> +   * Return a TextureClient that is not yet ready to be reused, but will be
> +   * imminently.
> +   */
> +  void ReturnTextureClientDeferred(TextureClient *aClient);

Then it shouldn't be returned to the poll. If you have a good reason to do this then it should be tracked outside of the pool. If you return something here but it's not ready to be re-used then you're putting the responsibility on the caller to ReturnDeferredClients to know when it is. Now what happens if there's two types of lifetimes for tiles returned here and one of the calls to ReturnDeferredClients happens while some tiles still aren't ready.

The user of the texture client are responsible for tracking this.

@@ +56,5 @@
> +  /**
> +   * Attempt to shrink the pool so that there are no more than sMinCacheSize
> +   * unused clients.
> +   */
> +  void ShrinkToMinimumSize();

Ideally these Shrin* would be private (use friend).

@@ +68,5 @@
> +  /**
> +   * Report that a client retrieved via GetTextureClient() has become
> +   * unusable, so that it will no longer be tracked.
> +   */
> +  void ReportClientLost() { mOutstandingClients--; }

This call is gross too. What you're saying here:
- This pool will allocate the texture client
- You have to return the texture client
- ... but if you don't then you need to tell me about it

I don't think this complexity is justified for a pool. The pool should accept objects that are returned and either make it guaranteed to return or optional. We shouldn't track out much outstanding clients are made.

@@ +87,5 @@
> +  static const uint32_t sMinCacheSize = 0;
> +
> +  // The maximum number of texture clients managed by this pool that we want
> +  // to remain active.
> +  static const uint32_t sMaxTextureClients = 50;

I'm not a fan of this max. This wont work for a 4k display for example. In fact I really don't like most of these values. We seems to have done well so far this arbitrary limits so I think we should push back or have a great reason for them and why these numbers are chosen.

::: gfx/layers/composite/TextureHost.h
@@ +430,5 @@
>  
>    /**
> +   * Indicates whether the TextureHost implementation is backed by an
> +   * in-memory buffer. The consequence of this is that locking the
> +   * TextureHost does not contend with locking the texture on the client side.

I don't understand this comment. This leaves me wondering:
If there's an in-memory buffer then both sides can still access it so locking is important.

I'm sure this works so please clean-up the comment/API to make it more obvious what is happening here.

::: gfx/layers/ipc/CompositableForwarder.h
@@ +222,5 @@
>  
>    /**
>     * Returns the type of backend that is used off the main thread.
>     * We only don't allow changing the backend type at runtime so this value can
>     * be queried once and will not change until Gecko is restarted.

This comment is wrong now, see the other GetCompositorBackendType.

::: gfx/layers/ipc/ISurfaceAllocator.h
@@ +85,5 @@
> +   * We only don't allow changing the backend type at runtime so this value can
> +   * be queried once and will not change until Gecko is restarted.
> +   *
> +   * XXX - With e10s this may not be true anymore. we can have accelerated widgets
> +   * and non-accelerated widgets (small popups, etc.)

Shouldn't we fix this by having a different surface allocator or something? It's not clear what assumptions are made on that which will now be broken.
Surprisingly, enabling tiles is green on b2g :) https://tbpl.mozilla.org/?tree=Try&rev=8d1ded8e586e
(In reply to Chris Lord [:cwiiis] from comment #32)
> Surprisingly, enabling tiles is green on b2g :)
> https://tbpl.mozilla.org/?tree=Try&rev=8d1ded8e586e

I'd be surprised if the b2g tests managed to catch a blatant abort() ;)
Comment on attachment 8387597 [details] [diff] [review]
Enable tiles by default on b2g

Review of attachment 8387597 [details] [diff] [review]:
-----------------------------------------------------------------

Milan's on PTO, it's clear we want to do this now, let's back out if it causes issues.
Attachment #8387597 - Flags: review?(milan) → review+
The landing stuck, turned it on by default here:

http://hg.mozilla.org/integration/mozilla-inbound/rev/101152af314f

There seemed to be consensus since it was all green that turning it on my default right now was a good idea. If I interpreted this wrongly I apologize and do backout, but as far as I can see right now the more testing, the earlier, the better.
No longer depends on: 981817
Depends on: 981794
Depends on: 982128
Depends on: 984531
Depends on: 983883
Depends on: 984618
Depends on: 984673
Depends on: 984577
Depends on: 985593
Depends on: 985162
Depends on: 977880
Depends on: 985170
Depends on: 984482
Flags: needinfo?(martijn.martijn)
No longer depends on: 983883
No longer depends on: 985162
No longer depends on: 984531
No longer depends on: 984577
No longer depends on: 985170
No longer depends on: 985779
No longer depends on: 985593
No longer depends on: 984482
Can we follow up on the review here as promised?
Depends on: 992985
Depends on: 992218
Depends on: 1014333
Depends on: 1009306
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: