Closed Bug 1083101 Opened 10 years ago Closed 7 years ago

Add Multithreaded DrawTargetTiled version to Moz2D

Categories

(Core :: Graphics, defect, P1)

defect

Tracking

()

RESOLVED WONTFIX
tracking-b2g +

People

(Reporter: bas.schouten, Assigned: nical)

References

(Depends on 1 open bug)

Details

(Keywords: perf, Whiteboard: [leave-open])

Attachments

(15 files, 23 obsolete files)

(deleted), patch
jrmuizel
: review-
Details | Diff | Splinter Review
(deleted), patch
jrmuizel
: review+
nical
: checkin+
Details | Diff | Splinter Review
(deleted), patch
jrmuizel
: review+
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
jrmuizel
: review+
nical
: checkin+
Details | Diff | Splinter Review
(deleted), patch
jrmuizel
: review+
nical
: checkin+
Details | Diff | Splinter Review
(deleted), patch
jrmuizel
: review+
nical
: checkin+
Details | Diff | Splinter Review
(deleted), patch
jrmuizel
: review+
nical
: checkin+
Details | Diff | Splinter Review
(deleted), patch
jrmuizel
: review+
nical
: checkin+
Details | Diff | Splinter Review
(deleted), patch
jrmuizel
: review+
nical
: checkin+
Details | Diff | Splinter Review
(deleted), patch
jrmuizel
: review+
nical
: checkin+
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
The intention is to add a Multithreaded DrawTargetTiled to Moz2D. That will allow sus to explore the potential performance gains that can be achieved by using multiple cores when painting content.
blocking-b2g: --- → 2.2?
(In reply to Bas Schouten (:bas.schouten) from comment #0)
> The intention is to add a Multithreaded DrawTargetTiled to Moz2D. That will
> allow sus to explore the potential performance gains that can be achieved by
> using multiple cores when painting content.

Could you please point me high level gecko function which is doing this task now in current build ? Then we can evaluate and check how much time we are spending in doing this task on single core.
Flags: needinfo?(bas)
(In reply to Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me) from comment #1)
> Could you please point me high level gecko function which is doing this task
> now in current build ? Then we can evaluate and check how much time we are
> spending in doing this task on single core.

This corresponds to the "rasterization" time reported by the profiler. Otherwise you can look at how much time we spend in mozilla::gfx::DrawTarget methods (DrawTargetCairo in the case of B2G I think), or in gfxContext at a slightly higher level. I doubt that single core devices will ever see a win from running multi-threaded DrawTarget, it'll probably just use the single-threaded code paths.
(In reply to Nicolas Silva [:nical] from comment #2)
> (In reply to Tapas[:tkundu on #b2g/gaia/memshrink/gfx] (always NI me) from
> comment #1)
> > Could you please point me high level gecko function which is doing this task
> > now in current build ? Then we can evaluate and check how much time we are
> > spending in doing this task on single core.
> 
> This corresponds to the "rasterization" time reported by the profiler.
> Otherwise you can look at how much time we spend in mozilla::gfx::DrawTarget
> methods (DrawTargetCairo in the case of B2G I think), or in gfxContext at a
> slightly higher level. I doubt that single core devices will ever see a win
> from running multi-threaded DrawTarget, it'll probably just use the
> single-threaded code paths.

As Nical said.
Flags: needinfo?(bas)
This is new functionality, won't be a blocker bug.  After our planning, it may become a 2.2 feature, but there is not a lot of chance of that with the shortened timeframe.  Will keep this bug up to date.
blocking-b2g: 2.2? → ---
feature-b2g: --- → 2.2?
blocking-b2g: --- → backlog
feature-b2g: 2.2? → ---
Attached patch Part 1: Allow multiple ReadMaps to a DataSourceSurface. (obsolete) (deleted) β€” β€” Splinter Review
Multiple drawing threads can be executing commands using the same DataSourceSurface as a source for reading. This is fine but currently we assert when Unmap is called multiple times, this fixes that for DataSourceSurface in general. D2D will require slightly more complicated code to support this but I doubt this will matter much at the moment.
Attachment #8517348 - Flags: review?(jmuizelaar)
Attached patch Part 2: Make Moz2D refcounting atomic. (obsolete) (deleted) β€” β€” Splinter Review
For obvious reasons.
Attachment #8517353 - Flags: review?(jmuizelaar)
Comment on attachment 8517353 [details] [diff] [review]
Part 2: Make Moz2D refcounting atomic.

Review of attachment 8517353 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/2d/2D.h
@@ +343,5 @@
>    }
>  
>  protected:
>    friend class DrawTargetCaptureImpl;
> +  friend class DrawTargetTiledMT;

Whoops, will move this hunk to another patch.
Attached patch Part 3: Extend and fix DrawCommand storage facilities (obsolete) (deleted) β€” β€” Splinter Review
Attachment #8517354 - Flags: review?(jmuizelaar)
Attached patch Part 4: Add multithreaded DrawTargetTiled to Moz2D. (obsolete) (deleted) β€” β€” Splinter Review
Attachment #8517357 - Flags: review?(jmuizelaar)
Attached patch Part 4: Add multithreaded DrawTargetTiled to Moz2D. v2 (obsolete) (deleted) β€” β€” Splinter Review
And this time with the new files.
Attachment #8517357 - Attachment is obsolete: true
Attachment #8517357 - Flags: review?(jmuizelaar)
Attachment #8517359 - Flags: review?(jmuizelaar)
Attached patch Part 5: Make pixman image refcounting atomic. (obsolete) (deleted) β€” β€” Splinter Review
This is required because cairo keeps a cache of single solid color pixman images around for complex filling purposes. These can race when cairo is used from multiple thread. This seems like a bug in cairo multithreaded.
Attachment #8517360 - Flags: review?(jmuizelaar)
Attached patch Part 6: Integrate DrawTargetTiledMT in Gecko. (obsolete) (deleted) β€” β€” Splinter Review
Attachment #8517363 - Flags: review?(jmuizelaar)
Comment on attachment 8517360 [details] [diff] [review]
Part 5: Make pixman image refcounting atomic.

Review of attachment 8517360 [details] [diff] [review]:
-----------------------------------------------------------------

I'd rather we do something like http://cgit.freedesktop.org/cairo/commit/?id=71e8a4c23019b01aa43b334fcb2784c70daae9b5 to be closer to upstream.
Attachment #8517360 - Flags: review?(jmuizelaar) → review-
Comment on attachment 8517359 [details] [diff] [review]
Part 4: Add multithreaded DrawTargetTiled to Moz2D. v2

Review of attachment 8517359 [details] [diff] [review]:
-----------------------------------------------------------------

It seems like there's a lot of duplicated code between TileCommandProcessor_posix.cpp and TileCommandProcess_win32.cpp. Can you abstract out the threading primitives instead or duplicating the rest of the logic?
(In reply to Jeff Muizelaar [:jrmuizel] from comment #14)
> Comment on attachment 8517359 [details] [diff] [review]
> Part 4: Add multithreaded DrawTargetTiled to Moz2D. v2
> 
> Review of attachment 8517359 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> It seems like there's a lot of duplicated code between
> TileCommandProcessor_posix.cpp and TileCommandProcess_win32.cpp. Can you
> abstract out the threading primitives instead or duplicating the rest of the
> logic?

No, they use different threading primitives and the actual scheduling mechanism is a little different. I could abstract out -some- more code and I looked into doing this, but the value in the end seemed very limited.
(In reply to Bas Schouten (:bas.schouten) from comment #15)
> (In reply to Jeff Muizelaar [:jrmuizel] from comment #14)
> > Comment on attachment 8517359 [details] [diff] [review]
> > Part 4: Add multithreaded DrawTargetTiled to Moz2D. v2
> > 
> > Review of attachment 8517359 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > It seems like there's a lot of duplicated code between
> > TileCommandProcessor_posix.cpp and TileCommandProcess_win32.cpp. Can you
> > abstract out the threading primitives instead or duplicating the rest of the
> > logic?
> 
> No, they use different threading primitives and the actual scheduling
> mechanism is a little different. I could abstract out -some- more code and I
> looked into doing this, but the value in the end seemed very limited.

What's the difference?
(In reply to Jeff Muizelaar [:jrmuizel] from comment #16)
> (In reply to Bas Schouten (:bas.schouten) from comment #15)
> > (In reply to Jeff Muizelaar [:jrmuizel] from comment #14)
> > > Comment on attachment 8517359 [details] [diff] [review]
> > > Part 4: Add multithreaded DrawTargetTiled to Moz2D. v2
> > > 
> > > Review of attachment 8517359 [details] [diff] [review]:
> > > -----------------------------------------------------------------
> > > 
> > > It seems like there's a lot of duplicated code between
> > > TileCommandProcessor_posix.cpp and TileCommandProcess_win32.cpp. Can you
> > > abstract out the threading primitives instead or duplicating the rest of the
> > > logic?
> > 
> > No, they use different threading primitives and the actual scheduling
> > mechanism is a little different. I could abstract out -some- more code and I
> > looked into doing this, but the value in the end seemed very limited.
> 
> What's the difference?

So for one if you look at the code, one uses a win32 signal for shutting down, which creates a slightly more elegant shutdown mechanism than when using a CondVar. It also uses a Win32 event for the scheduling of commands. Which in my last testing was a little more efficient than using a CondVar on windows.
(In reply to Bas Schouten (:bas.schouten) from comment #17)
> (In reply to Jeff Muizelaar [:jrmuizel] from comment #16)
> > (In reply to Bas Schouten (:bas.schouten) from comment #15)
> > > (In reply to Jeff Muizelaar [:jrmuizel] from comment #14)
> > > > Comment on attachment 8517359 [details] [diff] [review]
> > > > Part 4: Add multithreaded DrawTargetTiled to Moz2D. v2
> > > > 
> > > > Review of attachment 8517359 [details] [diff] [review]:
> > > > -----------------------------------------------------------------
> > > > 
> > > > It seems like there's a lot of duplicated code between
> > > > TileCommandProcessor_posix.cpp and TileCommandProcess_win32.cpp. Can you
> > > > abstract out the threading primitives instead or duplicating the rest of the
> > > > logic?
> > > 
> > > No, they use different threading primitives and the actual scheduling
> > > mechanism is a little different. I could abstract out -some- more code and I
> > > looked into doing this, but the value in the end seemed very limited.
> > 
> > What's the difference?
> 
> So for one if you look at the code, one uses a win32 signal for shutting
> down, which creates a slightly more elegant shutdown mechanism than when
> using a CondVar. It also uses a Win32 event for the scheduling of commands.
> Which in my last testing was a little more efficient than using a CondVar on
> windows.

Oh, the Win32 version is also able to use an Event for waiting for completion, which created a slightly more elegant completion reporting mechanism.
add tracking flag for backlog.
tracking-b2g: --- → +
blocking-b2g: backlog → ---
Comment on attachment 8517353 [details] [diff] [review]
Part 2: Make Moz2D refcounting atomic.

Review of attachment 8517353 [details] [diff] [review]:
-----------------------------------------------------------------

We need documentation on the thread safety guarantees provided by Moz2D. If SourceSurfaces can be used by multiple threads we'll need to fix the DrawTargetWillChange implementations to be threadsafe.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #20)
> Comment on attachment 8517353 [details] [diff] [review]
> Part 2: Make Moz2D refcounting atomic.
> 
> Review of attachment 8517353 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> We need documentation on the thread safety guarantees provided by Moz2D. If
> SourceSurfaces can be used by multiple threads we'll need to fix the
> DrawTargetWillChange implementations to be threadsafe.

Note for current usage this doesn't matter as DTWC won't be accessed on any of the involved DTs, but it's a fair point. Note Matt's Canvas patch already adds the atomic refcounts and uses them off multiple threads.
Attached patch Part 1: Allow multiple readmaps to DataSourceSurface (obsolete) (deleted) β€” β€” Splinter Review
Rebased.
Assignee: bas → nical.bugzilla
Attachment #8517348 - Attachment is obsolete: true
Attachment #8517348 - Flags: review?(jmuizelaar)
Attachment #8604657 - Flags: review?(jmuizelaar)
Attached patch Part 2: Make Moz2D ref counting atomic (deleted) β€” β€” Splinter Review
Rebased.
Attachment #8517353 - Attachment is obsolete: true
Attachment #8517353 - Flags: review?(jmuizelaar)
Attachment #8604659 - Flags: review?(jmuizelaar)
Rebased.
Attachment #8517354 - Attachment is obsolete: true
Attachment #8517354 - Flags: review?(jmuizelaar)
Attachment #8604660 - Flags: review?(jmuizelaar)
Attached patch Part 4: Add multithreaded DrawTargetTiled (obsolete) (deleted) β€” β€” Splinter Review
Rebased, fixed some threading issues and added (perhaps a tad too many) assertions.
Attachment #8517359 - Attachment is obsolete: true
Attachment #8517359 - Flags: review?(jmuizelaar)
Attachment #8604665 - Flags: review?(jmuizelaar)
Attachment #8517360 - Attachment is obsolete: true
Attachment #8604667 - Flags: review?(jmuizelaar)
Rebased.

With these 6 patches, enabling multi-threaded DrawTarget "survives" through reftests on mac (a lot of them fail but no crashes or dead-locks).
I tested using cairo as the drawing backend because I ran into crashes inside CG locally so I assume there are some still some issues with using CG DrawTargets from multiple threads.
Attachment #8517363 - Attachment is obsolete: true
Attachment #8517363 - Flags: review?(jmuizelaar)
Attachment #8604674 - Flags: review?(jmuizelaar)
Attachment #8604667 - Flags: review?(jmuizelaar) → review+
Comment on attachment 8604657 [details] [diff] [review]
Part 1: Allow multiple readmaps to DataSourceSurface

Review of attachment 8604657 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/2d/SourceSurfaceD2D.cpp
@@ +252,5 @@
>  {
>    // DataSourceSurfaces used with the new Map API should not be used with GetData!!
>    MOZ_ASSERT(!mMapped);
> +  // Add support for multiple read locks here!
> +  MOZ_ASSERT(mMapCount == 0);

This assertion seems to contradict the title of the patch.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #28)
> 
> This assertion seems to contradict the title of the patch.

True that. I think that the original intent was to get some form of initial support for cpu backends and just assert that we are not doing it with D2D.
Comment on attachment 8604665 [details] [diff] [review]
Part 4: Add multithreaded DrawTargetTiled

Review of attachment 8604665 [details] [diff] [review]:
-----------------------------------------------------------------

I am reworking the TiledCommandProcessor stuff to get something simpler and probably a tad slower, but that works, so you can either ignore the review of this patch for now or, even better, start looking at it but ignoring the complex logic in TiledCommandProcessor_* for now.
Attachment #8604665 - Flags: review?(jmuizelaar) → feedback?(jmuizelaar)
This patch fixes some bugs with the posix implementation and simplifies it a lot.
I first tried to abstract out synchronization primitives and have one queue for all platforms, but it turned out to be surprisingly hard and inefficient, due to the lack of condition variables on windows xp. Bas also assured me that his fine tuned win32 implementation is correct and 20% faster than a naΓ―ve single-mutex implementation so I ended up keeping it as-is (with the hope that I will never have to fix bugs in it...).

For the posix version (at least) I have a few optimizations in mind that don't involve complex synchronization primitives, but I'll play with them when this has landed.
Attachment #8604665 - Attachment is obsolete: true
Attachment #8604665 - Flags: feedback?(jmuizelaar)
Attachment #8620396 - Flags: review?(jmuizelaar)
Comment on attachment 8620396 [details] [diff] [review]
Part 4: Add multithreaded DrawTargetTiled

Review of attachment 8620396 [details] [diff] [review]:
-----------------------------------------------------------------

Can we split out the TaskQueue implementation from the TileCommandProcessor?
(In reply to Jeff Muizelaar [:jrmuizel] from comment #32)
> Comment on attachment 8620396 [details] [diff] [review]
> Part 4: Add multithreaded DrawTargetTiled
> 
> Review of attachment 8620396 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Can we split out the TaskQueue implementation from the TileCommandProcessor?

Sure, TileCommandProcessor doesn't do much at this point (nothing but forwarding commands to a global task queue). The nice thing with the current setup is that it lets us keep the declaration of the task queue private instead of having a TaskQueueMT.h header with different declarations in posix and win32 targets.
Comment on attachment 8620396 [details] [diff] [review]
Part 4: Add multithreaded DrawTargetTiled

Review of attachment 8620396 [details] [diff] [review]:
-----------------------------------------------------------------

This could use some high-level documentation of how it works. There seem to be two lists and it's not yet clear to me why. Overall, the amount of synchronization used seems pretty heavyweight but it's hard for me to judge because I don't understand it all yet. If you could give an explanation of why each bit of synchronization is needed that would be nice too.

::: gfx/2d/DrawTargetTiledMT.cpp
@@ +88,5 @@
> +DrawTargetTiledMT::DrawFilter(FilterNode *aNode,
> +                                  const Rect &aSourceRect,
> +                                  const Point &aDestPoint,
> +                                  const DrawOptions &aOptions)
> +{

This should have a comment as to why it's synchronous.

@@ +230,5 @@
> +void
> +DrawTargetTiledMT::CopySurface(SourceSurface *aSurface,
> +                             const IntRect &aSourceRect,
> +                             const IntPoint &aDestination)
> +{

This should have a comment as to why it's synchronous.

::: gfx/2d/DrawTargetTiledMT.h
@@ +35,5 @@
> +
> +  bool mClippedOut;
> +  bool mTransformDirty;
> +  // Only member that may be accessed off the main thread by the CommandProcessor.
> +  Task* mLastTask;

I don't understand what this member is for.
Agreed, this needs more documentation.

Tasks represent a drawing command on a given tile.
The two lists work as follows:
 * all tasks are in a std::list
 * tasks for a given tile are pointing to the next task for the same tile to keep track of which task can actually be executed. when a task can be executed, it means it is the first task for a given tile (as in, all of the tasks that come before it for this tile have been processed already, therefor this task is now "the first" for this tile).

When a worker thread tries to get a task to process, it walks the std::list of tasks until it finds one that is marked first (so there is no other task that should be executed before it).
Think of the second nested linked list (the one that is not std::list) as a simple for of dependency graph.

mLastTask is needed when adding a new task to put it at the end of the second linked list (to set the mNext pointer, of the previously last task to point to the new last task).
This ^^ would be useful in code, especially if it's in doxygen format so that it gets extracted.
Comment on attachment 8604659 [details] [diff] [review]
Part 2: Make Moz2D ref counting atomic

Review of attachment 8604659 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not in a rush to do this until we actually need it. The threadness guarantees provided by Moz2D seeem to still be insufficient for it to be used on multiple threads.
Attachment #8604659 - Flags: review?(jmuizelaar) → review-
(In reply to Jeff Muizelaar [:jrmuizel] from comment #37)
> Comment on attachment 8604659 [details] [diff] [review]
> Part 2: Make Moz2D ref counting atomic
> 
> Review of attachment 8604659 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I'm not in a rush to do this until we actually need it. The threadness
> guarantees provided by Moz2D seeem to still be insufficient for it to be
> used on multiple threads.

It also looks like https://msdn.microsoft.com/en-us/library/windows/desktop/jj569217%28v=vs.85%29.aspx suggests that we are already being bad about using our D3D device
Comment on attachment 8620396 [details] [diff] [review]
Part 4: Add multithreaded DrawTargetTiled

Review of attachment 8620396 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/2d/DrawTargetTiledMT.cpp
@@ +68,5 @@
> +  FlushCommand command;
> +
> +  TileCommandProcessor::PostTask(&command, mTransform, &mTiles.front(), mTiles.size(), false);
> +
> +  FinishDrawing();

Question:  Is this function, DrawTargetTiledMT::Flush, called from the main thread?
if so, main-thread will be still block here.
(In reply to C.J. Ku[:cjku] from comment #39)
> Comment on attachment 8620396 [details] [diff] [review]
> Part 4: Add multithreaded DrawTargetTiled
> 
> Review of attachment 8620396 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/2d/DrawTargetTiledMT.cpp
> @@ +68,5 @@
> > +  FlushCommand command;
> > +
> > +  TileCommandProcessor::PostTask(&command, mTransform, &mTiles.front(), mTiles.size(), false);
> > +
> > +  FinishDrawing();
> 
> Question:  Is this function, DrawTargetTiledMT::Flush, called from the main
> thread?
> if so, main-thread will be still block here.

Yes, I am in the process of re-doing this in a way that, in most cases, we'll only have to block the main thread at the end of the transaction (currently, flush is typically called once per layer after all of the drawing commands).
Attached patch Add a basic memory pool to Moz2D (obsolete) (deleted) β€” β€” Splinter Review
Some patches to come are relying on this memory pool, and I'll also modify Part4 to use it instead of doing its own thing for the comand buffer.

The memory pool is rather simple. It can be created either as a growable or a non-growable pool, and is used like this:

> auto handle = pool.Alloc<Type>(constructorParams);
> Type* tempPtr = pool.GetStorage(handle);

The pool does not attempt to run any destructor or copy/move-constructor when the storage is cleared or reallocated.
Attachment #8634224 - Flags: review?(jmuizelaar)
Comment on attachment 8634224 [details] [diff] [review]
Add a basic memory pool to Moz2D

Review of attachment 8634224 [details] [diff] [review]:
-----------------------------------------------------------------

Some quick comments:

It seems like this more of an arena/region than a pool because you can't unallocate individual objects.

FWIW, here's my version of an arena:
https://github.com/jrmuizel/full-scene-rasterizer/blob/master/arena.h

The paper cited might be worth looking at.
Attached patch Add a task scheduler to Moz2D (WIP) (obsolete) (deleted) β€” β€” Splinter Review
To give an idea of where I am going with my alternative multi-threaded tiling experiment. TaskBuffers are created by the main thread and enqued in a very simple multi-threaded queue. Worker threads dequeue task buffers and process tasks until the end of the task buffer or until they reach a SyncObject that is not in the wait state. Sync objects can express dependencies betwen task buffers in a way that is similar to fence objects in gpu APIs.

Among the advantages of this approach, it can express the graph of all drawing commands and their dependencies for the entire frame and let us only block at the end of the transaction (unless we try to read from the DrawTarget's snapshot and a few other obvious cases), while the other approach has a lot of points of synchronization. This one should also have much less contention on the queue's locks.
It is also simpler in general, because while there's more code and concepts, the tricky thread synchronization code of the other method are much harder to understand and debug.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #42)
> It seems like this more of an arena/region than a pool because you can't
> unallocate individual objects.

True that, I'll rename the class.

> 
> FWIW, here's my version of an arena:
> https://github.com/jrmuizel/full-scene-rasterizer/blob/master/arena.h
> 
> The paper cited might be worth looking at.

Interesting, I'll have a look at the paper. I also like that your version recycle arena buffers automatically (I planned to have worker threads and the multi-threaded DrawTarget do that for TaskBuffers manually).
Attached patch Add a basic memory arena to Moz2D. (obsolete) (deleted) β€” β€” Splinter Review
The MemPool patch renamed into Arena. Also I forgot to hg add the tests in the previous patch.
Attachment #8634224 - Attachment is obsolete: true
Attachment #8634224 - Flags: review?(jmuizelaar)
Attachment #8635269 - Flags: review?(jmuizelaar)
Attached patch Add a task scheduler to Moz2D (obsolete) (deleted) β€” β€” Splinter Review
Most of the task scheduler work is done now, with some neat gtest tests in TestTaskScheduler.cpp.

Most of what's left is DrawTarget/DrawingCommand integration, and make sure it builds on non-posix platforms (I haven't implemented the windows version yet).
Attachment #8634246 - Attachment is obsolete: true
Attachment #8635270 - Flags: feedback?(jmuizelaar)
Comment on attachment 8635269 [details] [diff] [review]
Add a basic memory arena to Moz2D.

Review of attachment 8635269 [details] [diff] [review]:
-----------------------------------------------------------------

It's worth noting up front that this stores each objects size so that it can iterate over all of the objects. It might be worth calling this an IterableArena.
Attached patch Add a basic memory arena to Moz2D. (deleted) β€” β€” Splinter Review
Renamed Arena into IterableArena.
Attachment #8635269 - Attachment is obsolete: true
Attachment #8635269 - Flags: review?(jmuizelaar)
Attachment #8637868 - Flags: review?(jmuizelaar)
Attached patch Add a task scheduler to Moz2D (obsolete) (deleted) β€” β€” Splinter Review
Attachment #8635270 - Attachment is obsolete: true
Attachment #8635270 - Flags: feedback?(jmuizelaar)
Attachment #8637871 - Flags: review?(jmuizelaar)
Comment on attachment 8637871 [details] [diff] [review]
Add a task scheduler to Moz2D

Review of attachment 8637871 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/2d/TaskScheduler.h
@@ +1,4 @@
> +/* -*- Mode: C++; tab-width: 20; indent-tabs-mode: nil; c-basic-offset: 2 -*-
> + * This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */

From what I understand of this so far it seems pretty sane.

It would be valuable to have a high level description of the architecture here. I'm a little concerned about Run taking a TaskBuffer* as a parameter because it feels like it leaks the abstraction a little but don't have an alternative to suggest at this point.
Comment on attachment 8637871 [details] [diff] [review]
Add a task scheduler to Moz2D

Review of attachment 8637871 [details] [diff] [review]:
-----------------------------------------------------------------

I feel less good about this now. I feel like it might be better to have an explicit graph of prerequisites and subsequents instead of leaving it up to the user to set this up.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #50)
> Comment on attachment 8637871 [details] [diff] [review]
> Add a task scheduler to Moz2D
> 
> Review of attachment 8637871 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/2d/TaskScheduler.h
> @@ +1,4 @@
> > +/* -*- Mode: C++; tab-width: 20; indent-tabs-mode: nil; c-basic-offset: 2 -*-
> > + * This Source Code Form is subject to the terms of the Mozilla Public
> > + * License, v. 2.0. If a copy of the MPL was not distributed with this
> > + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> 
> From what I understand of this so far it seems pretty sane.
> 
> It would be valuable to have a high level description of the architecture
> here. I'm a little concerned about Run taking a TaskBuffer* as a parameter
> because it feels like it leaks the abstraction a little but don't have an
> alternative to suggest at this point.

I am thinking about having the TaskBuffers "process themselves" rather than having the TaskScheduler::ProcessTaks method do it. This way we could have different kinds of task buffers with different TaskBufferData (right now tiling is sort of built into the TaksBufferData struct and thus into the TaskBuffer class as well).
Run taking TaskBuffer* makes it possible to implement the WaitTask which hands the TaskBuffer to a SyncObject.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #51)
> Comment on attachment 8637871 [details] [diff] [review]
> Add a task scheduler to Moz2D
> 
> Review of attachment 8637871 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I feel less good about this now. I feel like it might be better to have an
> explicit graph of prerequisites and subsequents instead of leaving it up to
> the user to set this up.

Could you give me more details about what you have in mind (like, what part of the explicit graph would not be up to the user to set up)?
Here's a simplified description of the Async Task Graph system used in the Unreal engine which is more a long the lines of what I was thinking:

TaskGraph {
  SubmitTask(Task task, TaskEventArray prerequisites); 
  WorkerThread threads[n];
};

WorkerThread {
  Queue(Task task, TaskEventArray prerequisites);
  // Each thread has it's own queue, but can steal work from other threads. If the queue
  // is empty the thread will stall waiting on an os level synchronization primitive.
  TaskQueue queue;
}

Task {
  Execute() {
    if (numOfRemainingPrereqs-- == 0) {
       return;
    } else {
       // Do work
       // Queue subsequent tasks
    }
  }
  Atomic<int> numOfRemainingPrereqs;
  RefPtr<TaskEvent> subsequents;
}

TaskEvent {
  List<Task> subsequents;
}

TaskGraph::SubmitTask() {
   // round-robin schedule the task on one of the worker threads
}

WorkerThread::Queue(Task task, TaskEventArray prerequisites) {
   // check for any prerequisites that are not complete and add this task
   // as a subsequent to those tasks
   // If all are complete add to this threads queue.
}

If you have a lock free linked list implementation this can be implemented with very little OS level synchronization.

The work stealing design seems pretty common these days and is used in Intel's Thread Build Blocks and Cilk frameworks as well as the Java ForkJoinPool and .Net ThreadPool

In our case I can imagine having Tasks that have chunks of Moz2D commands that are executed but without the Task system having to have the concept of a TaskBuffer. i.e. the chunking of work is completely separate from the task system.

Thoughts?
Comment on attachment 8637871 [details] [diff] [review]
Add a task scheduler to Moz2D

Review of attachment 8637871 [details] [diff] [review]:
-----------------------------------------------------------------

Nical and I talked about this over Skype and he agreed to a number of changes.
Attachment #8637871 - Flags: review?(jmuizelaar) → review-
Comment on attachment 8637868 [details] [diff] [review]
Add a basic memory arena to Moz2D.

Review of attachment 8637868 [details] [diff] [review]:
-----------------------------------------------------------------

We can land this now.
Attachment #8637868 - Flags: review?(jmuizelaar) → review+
Attached patch Add a task scheduler to Moz2D (obsolete) (deleted) β€” β€” Splinter Review
This version removes drawing commands from the task scheduler (drawing commands have moved to a separate patch) as discussed with Jeff.
Also, SyncObject now only holds their mutex when touching the list of waiting tasks (the mSignals counter is using atomics), and EventObject is used to perform synchronous waits (it was previously done with SyncObjects).
Attachment #8637871 - Attachment is obsolete: true
Attachment #8641076 - Flags: review?(jmuizelaar)
Comment on attachment 8641076 [details] [diff] [review]
Add a task scheduler to Moz2D

Review of attachment 8641076 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/2d/TaskScheduler.h
@@ +42,5 @@
> +  /// place the Task back in the queue of available task, giving a chance to
> +  /// another task to be scheduled in its place.
> +  ///
> +  /// Should not be used inside of Run implementations.
> +  TaskStatus Yield();

What is Yield() needed for? It looks like if we drop it we can drop mQueue member.
Comment on attachment 8641076 [details] [diff] [review]
Add a task scheduler to Moz2D

Review of attachment 8641076 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/2d/TaskScheduler.h
@@ +85,5 @@
> +  /// Create a synchronization object.
> +  ///
> +  /// aNumSignals determines the number of times the Signal() must be called
> +  /// before the object gets into the signaled state.
> +  SyncObject(int aNumSignals = 1);

Can you explain the purpose of this parameter? From my quick look at the test cases it looks like users are required to set it to the number of dependencies that a task has. Why not just have the task system do that? When would you want them to be different?

::: gfx/2d/TaskScheduler_posix.cpp
@@ +127,5 @@
> +void
> +WorkerThread::Run()
> +{
> +  for (;;) {
> +    Task* commands = nullptr;

I assume 'commands' is a remnant of a previous patch. 'task' would be a better name.
Attached patch Implement the drawing task and CommandBuffer (obsolete) (deleted) β€” β€” Splinter Review
Attachment #8642313 - Flags: review?(jmuizelaar)
Attached patch Mark DrawingCommand's methods const (deleted) β€” β€” Splinter Review
Attachment #8642319 - Flags: review?(jmuizelaar)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #59)
> Comment on attachment 8641076 [details] [diff] [review]
> Add a task scheduler to Moz2D
> 
> Review of attachment 8641076 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/2d/TaskScheduler.h
> @@ +85,5 @@
> > +  /// Create a synchronization object.
> > +  ///
> > +  /// aNumSignals determines the number of times the Signal() must be called
> > +  /// before the object gets into the signaled state.
> > +  SyncObject(int aNumSignals = 1);
> 
> Can you explain the purpose of this parameter? From my quick look at the
> test cases it looks like users are required to set it to the number of
> dependencies that a task has.

This is indeed the number of dependencies for things that wait on the sync object. I ll update the comment.

> Why not just have the task system do that?
> When would you want them to be different?

You mean have the sync object functionality directly in the Task class? this would make it less convenient to have several tasks depending on the same synchronization point: it would mean preceding tasks would have to keep a list of all of their subsequents, while now we can have several preceding tasks signal one SyncObject and have several susbsequent tasks wait for that sync object (and only the one SyncObject has to keep a list of the subsequents).
If the sync object functionality was to move in Task, we'd have to make Task reference counted, make it possible to mutate its list of subsequent while it is already submitted and pay for the synchronization cost of doing that for every preceding task rather than for one sync object. Also, debugging a race condition happening during the interaction between N tasks and 1 sync objects is easier than a race happening between N tasks and M tasks (IMO).

> 
> ::: gfx/2d/TaskScheduler_posix.cpp
> @@ +127,5 @@
> > +void
> > +WorkerThread::Run()
> > +{
> > +  for (;;) {
> > +    Task* commands = nullptr;
> 
> I assume 'commands' is a remnant of a previous patch. 'task' would be a
> better name.

yep
Attached patch Add a task scheduler to Moz2D (obsolete) (deleted) β€” β€” Splinter Review
fixed various typos and comments, but the patch remains mostly the same with the exception that I added a parameter for the number of queues in addition to the number of threads (for now each individual worker thread only pulls from one queue but we can have, say 2 queues with 2 threads each or a queue per thread, or one queue for all threads, etc.). I haven't yet looked at how this parameter affects performance, but it did affect the reproducibility of a bug that I was hunting so it's good to keep.
Attachment #8641076 - Attachment is obsolete: true
Attachment #8641076 - Flags: review?(jmuizelaar)
Attachment #8644335 - Flags: review?(nical.bugzilla)
Very few changes: AddTask was returning void while we need to it to return the command offset, and a few other minor tweaks.
Attachment #8642313 - Attachment is obsolete: true
Attachment #8642313 - Flags: review?(jmuizelaar)
Attachment #8644344 - Flags: review?(jmuizelaar)
Attachment #8644335 - Flags: review?(nical.bugzilla) → review?(jmuizelaar)
Attached patch implement the TaskScheduler based DrawTarget (obsolete) (deleted) β€” β€” Splinter Review
This patch needs to be tweaked and cleaned up a bit, but I can browse the web with tiles painted in parallel without crashing so it's encouraging!
Attachment #8644363 - Flags: feedback?(jmuizelaar)
(In reply to Nicolas Silva [:nical] from comment #62)
> You mean have the sync object functionality directly in the Task class? 

No.

I'm wondering why we expose the sync objects at all. With the current SyncObject design
you need to make sure that you actually hook up the same number of tasks to a SyncObject
as you say you're going to. I'm wondering if there's value in having these mismatch.

i.e. Why not have test cases that look like something like this?

  TaskArray prereqs;
  for (uint32_t i = 0; i < aNumCmdBuffers; ++i) {
    Task* t1 = new TestTask(1, i, &check, TaskScheduler::GetDrawingQueue(),
                            nullptr);
    prereqs.push(t1->completion());
    TaskScheduler::SubmitTask(t1);

    MaybeYieldThread();
  }
  for (uint32_t i = 0; i < aNumCmdBuffers; ++i) {
    Task* t2 = new TestTask(2, i, &check, TaskScheduler::GetDrawingQueue(),
                            prereqs);
    TaskScheduler::SubmitTask(t2);
 
    MaybeYieldThread();
  }
}
(In reply to Jeff Muizelaar [:jrmuizel] from comment #66)
> as you say you're going to. I'm wondering if there's value in having these
> mismatch.

I don't see value in having them mismatch.

> 
> i.e. Why not have test cases that look like something like this?
> 
> [...]

what does t1->Completion() return? The way this code looks seems to imply that subsequents check a list of things rather than one thing to know if they can run (Γ  la UE4).
I think that it is more efficient and also simpler to have prerequisites signal one object and have subsequents register themselves on it rather than have subsequents go check a list of things. having one object that represents a point of synchronization is nice when debugging since it centralizes the states that can answer the question "why didn't the subsequent tasks run?". We could even have it keep lists of all prerequisites and subsequents in debug builds that would not be used by the scheduling logic but would just help tracing things around in gdb when debugging. Anyway, I believe having explicit objects for synchronization to be a good thing, and something that is easy enough to reason about that I wouldn't want to loose its benefits for the sake of only exposing tasks at the API level. 

However, I do agree that initializing SyncObjects with a number is not a super elegant API (and allows us to make mistakes).
How about having Tasks increment the SyncObject's number of signals in their constructor automatically?

RefPtr<SyncObject> sync = new SyncObject();
Task* task1 = new FoobarTask(TaskScheduler::GetDrawingQueue(),
                             nullptr, // no prerequisites
                             sync); // Task's ctor increments SyncObject::mSignals
TaskScheduler::SubmitTask(task1);
Task* task2 = new FoobarTask(TaskScheduler::GetDrawingQueue(),
                             nullptr, // no prerequisites
                             sync); // Task's ctor increments SyncObject::mSignals
TaskScheduler::SubmitTask(task2);


Task* task3 = new FoobarTask(TaskScheduler::GetDrawingQueue(),
                             sync, // task3 depends on task1 and task2
                             ...);
TaskScheduler::SubmitTask(task3);


This would mean that you expose yourself to bugs if you schedule a task before creating all of its prerequisites (arguably this is not a very intuitive thing to do), but in exchange you don't risk messing up the number of signals for the sync object.
We could check at runtime that a prerequisite is never registered to a sync object after a subsequent is scheduled.

How does that sound to you?
(In reply to Nicolas Silva [:nical] from comment #67)
> How does that sound to you?

I like the new api better. I'm still thinking about the rest of your comment.
(In reply to Nicolas Silva [:nical] from comment #67)
> We could check at runtime that a prerequisite is never registered to a sync
> object after a subsequent is scheduled.

Yes, and it might be good to even explicitly 'close()/finish()' a sync object.
It could even return a different type. Which would let you largely check this
invariant at compile time.
Attachment #8642319 - Flags: review?(jmuizelaar) → review+
Comment on attachment 8644335 [details] [diff] [review]
Add a task scheduler to Moz2D

Review of attachment 8644335 [details] [diff] [review]:
-----------------------------------------------------------------

We talked about a couple of modification to this. I'll wait for the new one.
Attachment #8644335 - Flags: review?(jmuizelaar)
Attached patch task-scheduler (obsolete) (deleted) β€” β€” Splinter Review
Updated patch, no more explicit number of signals in the SyncObject api. I tried to have an explicit FinishPrerequisites() call on SyncObject but I wasn't found of how it reflected on the code so I removed it and handled the checks implicitly in SyncObject's implem.
Attachment #8644335 - Attachment is obsolete: true
Attachment #8648029 - Flags: review?(jmuizelaar)
Attached patch implement the TaskScheduler based DrawTarget (obsolete) (deleted) β€” β€” Splinter Review
The implementation is mostly there. There's a glitch in the chrome that I am chasing, and some optimizations to be made (some operations are still performed on the main thread even though they could be deferred), but I can  browse around without crashing. I haven't measured performance, but we could land something close to this preffed off and improve it in-tree.
Attachment #8644363 - Attachment is obsolete: true
Attachment #8644363 - Flags: feedback?(jmuizelaar)
Attachment #8648034 - Flags: review?(jmuizelaar)
OS: Windows 8.1 → All
Hardware: x86_64 → All
It would be good to have a little benchmarking utility to measure task scheduler overhead. i.e. how much time does it take to schedule and run a bunch of tiny tasks.
Attached patch Win32 implementation for TaskScheduler (deleted) β€” β€” Splinter Review
Attachment #8649300 - Flags: review?(jmuizelaar)
Attached patch Add a task scheduler to Moz2D (obsolete) (deleted) β€” β€” Splinter Review
Same patch with Windows build fixes.
Attachment #8649302 - Flags: review?(jmuizelaar)
Attachment #8648029 - Attachment is obsolete: true
Attachment #8648029 - Flags: review?(jmuizelaar)
Attachment #8620396 - Flags: review?(jmuizelaar)
Attachment #8604674 - Flags: review?(jmuizelaar)
Comment on attachment 8649300 [details] [diff] [review]
Win32 implementation for TaskScheduler

Review of attachment 8649300 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/2d/TaskScheduler_win32.h
@@ +18,5 @@
>  
>  class WorkerThread;
>  class Task;
>  
>  class Mutex {

Can we call this a CriticalSection and move it into a separate header. I'd like this to move into MFBT.

I looked into moving our current mozilla::Mutex implementation into mfbt and doing so is very painful because of needing to work with condition variables which don't exist on XP and so require a complex implementation.

Even if you don't want to rename it would be good to have this in a separate header.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #76)
> Can we call this a CriticalSection and move it into a separate header. I'd
> like this to move into MFBT.
> 
> I looked into moving our current mozilla::Mutex implementation into mfbt and
> doing so is very painful because of needing to work with condition variables
> which don't exist on XP and so require a complex implementation.
> 
> Even if you don't want to rename it would be good to have this in a separate
> header.

Okay, sure.
Comment on attachment 8649300 [details] [diff] [review]
Win32 implementation for TaskScheduler

Review of attachment 8649300 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/2d/SourceSurfaceD2D.cpp
@@ +283,1 @@
>    }

Unneeded change.
Attachment #8649300 - Flags: review?(jmuizelaar) → review+
Attachment #8649302 - Flags: review?(jmuizelaar) → review+
Attachment #8604660 - Flags: review?(jmuizelaar) → review+
Attachment #8644344 - Flags: review?(jmuizelaar) → review+
Started landing things (not all of of the r+ed patches yet). I'll put CriticalSection in its own file as a followup patch.
https://hg.mozilla.org/integration/mozilla-inbound/rev/a277ab555649
Attachment #8654047 - Flags: review?(jmuizelaar)
Attachment #8654047 - Flags: review?(jmuizelaar) → review+
Missing MPL headers. Add them.
Flags: needinfo?(nical.bugzilla)
https://hg.mozilla.org/integration/mozilla-inbound/rev/0a6cc12c33d7
https://hg.mozilla.org/integration/mozilla-inbound/rev/42d192dbf938
(In reply to Jeff Muizelaar [:jrmuizel] from comment #68)
> (In reply to Nicolas Silva [:nical] from comment #67)
> > How does that sound to you?
> 
> I like the new api better. I'm still thinking about the rest of your comment.

Hm, so atcually this API/implem has a race. Consider:

> loop {
>   TaskScheduler::SubmitTask(new TaskA(nullptr, sync)); // all tasks will signal 'sync'
> }
> TaskScheduler::SubmitTask(new TaskB(sync, nullptr)); // this should wait for all of the TaskA to finish.

Creating a task increments the number of time sync must be signaled, and the completion of a taskA signals sync. When sync's signal count reaches zero, the tasks that are in sync's waiting list are scheduled.
The problem is that we don't hold a lock around sync's signals counter + its list of pending tasks. we have an atomic counter and a list with its own lock. This means that we can have the following happening:

 1) [worker thread] a TaskA finishes and which brings sync's signal counter to zero
 2) [main thread] another taskA is created, incrementing sync's signal counter
 3) [main thread] taskB is created, the signal counter is not 0, so it goes in sync's pending task list
 4) [worker thread] because of 1), scheduler all of the tasks in the pending task list
 ====> This schedules taskB which should still be waiting for the taskA from 2) to have completed

The previous version of the API, while being a bit uglier, didn't have this problem because you would never get into a state where sync's signals count goes to zero before all of its prerequisites have completed.

A solution could be to add a mutex/cirtical section around the signals counter + the list. I am discarding this right away because this adds a LOT of accesses to that mutex.
Another solution could be to enfore that we don't schedule a Task before all of the tasks that will signal the same sync object have been created, this would look like:

> vector<Task*> tasks;
> loop {
>   tasks.push_back(new TaskA(nullptr, sync));
> }
> loop {
>   TaskScheduler::SubmitTask(tasks.pop());
> }

But that's super inconvenient for the most common use case :(

Otherwise we go back to an API where you specify number of prerequisites when creating the SyncObject. That's my favourite solution so far, but I just had another idea while writing this so I need spend some quality time with a white board before I can formulate it here.
Flags: needinfo?(nical.bugzilla)
Attached patch Add a task scheduler to Moz2D (obsolete) (deleted) β€” β€” Splinter Review
Same patch with a little twist around mHasSubmittedSubsequent: the latter is not debug-only anymore. We check the value of mHasSubmittedSubsequent before deciding to submit a sync object's pending tasks. This removes the race condition I described in my previous comment.
Attachment #8649302 - Attachment is obsolete: true
Attachment #8656533 - Flags: review?(jmuizelaar)
Attachment #8656533 - Flags: review?(jmuizelaar) → review+
Turns out the upstream cairo patch (part 5) that makes removes the thread-unsafe usage of pixman ref-counting is what's causing the OOMs on tbpl. I haven't figured out why yet, though.
Backout:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7a7fa1f33ac2
So there was another issue with the SyncObject's atomics. If we come back to specifying the number of prerequisites ahead of time (instead of letting the api compute it automatically while things are getting scheduled all of that goes away because we have a single state that we need to access atomically rather than several of them. All of the mistakes that we can possibly make by specifying a wrong number of prerequisites are simpler to reason about and fix than implementing synchronized access to several states without a lock so I have made my mind to come back to the original API, with an integer argument in SyncObject's constructor to specify the number of dependencies.
(In reply to Nicolas Silva [:nical] from comment #94)
> So there was another issue with the SyncObject's atomics.

What is the other issue?
(In reply to Jeff Muizelaar [:jrmuizel] from comment #95)
> (In reply to Nicolas Silva [:nical] from comment #94)
> > So there was another issue with the SyncObject's atomics.
> 
> What is the other issue?

I am still trying to figure it out, but the test is intermittently timing out which means sometimes tasks never get scheduled (they stay in the pending task list forever).
Attached patch Add a task scheduler to Moz2D (deleted) β€” β€” Splinter Review
Re-asking for review since I made the following notable changes:
 - specify dependencies when creating the SyncObject
 - added thread names in the posix implementation
 - tasks can be optionally pinned to worker threads rather than to task queues (better matches my mental model and will make things less quirky when I add work-stealing). When passing nullptr (default) the task is submitted to a random queue and any worker can pick it up.
Attachment #8656533 - Attachment is obsolete: true
Attachment #8659121 - Flags: review?(jmuizelaar)
Blocks: 1180705
This replaces the std::vector of waiting tasks and its critical section by an intrusive linked list of tasks, that is accessed concurrently with atomics rather than a mutex.
The list is easy to implement with atomics because the only two operations we need are to push elements and empty the list.
This removes a lot of lock contention + the allocation of the vector and makes my little benchmark that schedules empty tasks between 2 and 2.5 times faster.
Attachment #8661800 - Flags: review?(jmuizelaar)
Implemented as a gtest. I don't think we'd want to land it as is, but I am not sure how to integrate something like this in our infra and if it would actually be usefull (perhaps I should just land it as an unused function and let people who modify TaskScheduler run it manually).
Note that this measures the worst case where all threads do nothing but thrashing one another's cache lines by reading and writing into the Task+SyncObject's atomic members and accessing the queue's mutex.
Attachment #8661800 - Flags: review?(jmuizelaar) → review+
Attachment #8659121 - Flags: review?(jmuizelaar) → review+
Attachment #8648034 - Flags: review?(jmuizelaar) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/61db1a51a7c4
https://hg.mozilla.org/integration/mozilla-inbound/rev/cd54e93993b4
https://hg.mozilla.org/integration/mozilla-inbound/rev/e39dfd9e05cb
Backed everything out in https://hg.mozilla.org/integration/mozilla-inbound/rev/5ce02bfa5d37 for build bustage
Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(nical.bugzilla)
Whiteboard: [leave-open]
Depends on: 1209039
Attachment #8604660 - Flags: checkin+
Attachment #8637868 - Flags: checkin+
Attachment #8642319 - Flags: checkin+
Attachment #8649300 - Flags: checkin+
Attachment #8654047 - Flags: checkin+
Attachment #8659121 - Flags: checkin+
Attachment #8661800 - Flags: checkin+
Attachment #8644344 - Flags: checkin+
rebased patch
Attachment #8648034 - Attachment is obsolete: true
tracking-b2g: + → ---
Priority: -- → P1
Nicolas, could you help to make sure this in v2.5/Gecko44?
Flags: needinfo?(nical.bugzilla)
(In reply to Bobby Chien [:bchien] from comment #108)
> Nicolas, could you help to make sure this in v2.5/Gecko44?

This is a very experimental and risky feature, I don't think it is a reasonable target for v2.5. I'll get back to focusing on it more as soon as I have dealt with some of the issues we are having with texture sharing and recycling.
Flags: needinfo?(nical.bugzilla)
No longer blocks: 1180705
tracking-b2g: --- → +
Keywords: perf
Confirmed that this patch fixes bug 1217194.
Attachment #8604657 - Attachment is obsolete: true
Attachment #8604657 - Flags: review?(jmuizelaar)
Attachment #8677697 - Flags: review?(jmuizelaar)
Blocks: 1217194
Comment on attachment 8677697 [details] [diff] [review]
Part 1: Allow multiple readmaps to DataSourceSurface (rebased)

Review of attachment 8677697 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/2d/2D.h
@@ +468,5 @@
> +    if (!aMappedSurface->mData) {
> +      return false;
> +    }
> +    mMapCount++;
> +    mIsReadMap = aMapType == MapType::READ;

mIsReadMap should be of MapType instead.

::: gfx/2d/SourceSurfaceD2D.cpp
@@ +278,5 @@
>      return false;
>    }
>  
> +  if (!map.pData) {
> +    return false

We should be consistent with if mData/mStride is changed if we return false.

Currently on nightly they -will- be changed and you preserved this behavior in the base type. You should either change them all or none.
Comment on attachment 8677697 [details] [diff] [review]
Part 1: Allow multiple readmaps to DataSourceSurface (rebased)

Review of attachment 8677697 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/2d/2D.h
@@ +469,5 @@
> +      return false;
> +    }
> +    mMapCount++;
> +    mIsReadMap = aMapType == MapType::READ;
> +    return true;

This doesn't seem to fail if you ask for WRITE map on something that has outstanding readmaps.

::: gfx/2d/SourceSurfaceD2D1.cpp
@@ +183,5 @@
>  {
>    // DataSourceSurfaces used with the new Map API should not be used with GetData!!
>    MOZ_ASSERT(!mMapped);
> +  // XXX - Add support for multiple read maps here!
> +  MOZ_ASSERT(mMapCount == 0);

Can we just do the XXX now?
Attachment #8677697 - Flags: review?(jmuizelaar) → review-
(In reply to Jeff Muizelaar [:jrmuizel] from comment #112)
> > +  MOZ_ASSERT(mMapCount == 0);
> 
> Can we just do the XXX now?

I don't think that it's going to be useful since the point of supporting multiple readmaps is to read a SourceSurface from multiple threads, and we won't be able to use the D2D backends for that.
Attachment #8677697 - Attachment is obsolete: true
Attachment #8692961 - Flags: review?(jmuizelaar)
Comment on attachment 8692961 [details] [diff] [review]
Part 1: Allow multiple readmaps to DataSourceSurface

Review of attachment 8692961 [details] [diff] [review]:
-----------------------------------------------------------------

Shouldn't we be returning false in the cases where we are triggering assertions?
(In reply to Jeff Muizelaar [:jrmuizel] from comment #115)
> Comment on attachment 8692961 [details] [diff] [review]
> Part 1: Allow multiple readmaps to DataSourceSurface
> 
> Review of attachment 8692961 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Shouldn't we be returning false in the cases where we are triggering
> assertions?

mIsReadmap is a DebugOnly thing, we can make it exist in release builds as well if you prefer, I'll go through the assertions based on mMapCount and make sure ::Map returns false in these cases.
I like the sound of mIsReadmap in release builds...
Depends on: 1239292
Depends on: 1264023
We are moving in a different direction with webrender.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
Attachment #8692961 - Flags: review?(jmuizelaar)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: