Open Bug 1650352 Opened 4 years ago Updated 2 years ago

Background Taskqueue should be interchangeable with standard nsThread without tweaks

Categories

(Core :: XPCOM, enhancement)

enhancement

Tracking

()

People

(Reporter: jya, Unassigned)

References

Details

In bug 1650277 I wanted to replace a Google's base::Thread with a background taskqueue.

This caused some deadlocks even though no task dispatched is specifically blocking, at worse it dispatches a task to the main thread.
I had to use the NS_DISPATCH_EVENT_MAY_BLOCK dispatch flag to force using the IO thread pool.

If we want to encourage the adoption of using the background taskqueue, we should endeavour to make the replacement transparent and almost equivalent in usage.

Right now unless NS_DISPATCH_EVENT_MAY_BLOCK is set; it will use a default thread pool that is restricted to a single thread.
That makes all concurrent tasks currently being all serialised, as we increase the use of background taskqueue, the worse it will get.

I think the number of default threads used by the default thread pool should be increased.

Here is a pernosco recording showing how when having each thread pool limited to a single thread only would cause a hang
https://pernos.co/debug/Lj9XG-cJQ-e3m0S-3jAQSw/index.html#f{m[INNY,AA_,t[BQ,BtU_,f{e[ARTE,BEE_,s{af56DRAAA,bIQ,oWIxE,uWG4a___

I didn't investigate on the why, for now I will be using for bug 1650277 a taskqueue with a media threadpool instead that will create the number of threads as needed.

Having a thread pool limited to a single thread has proven problematic in the past; particularly during shutdown and why we had to create multiple threadpool for media.

In 1650277 , removing

  // For now just one thread. Can increase easily later if we want.
  rv = pool->SetThreadLimit(1);
  NS_ENSURE_SUCCESS(rv, rv);

prevent the deadlocks.

Ok, in this task, we are blocked in TextureClientRecycleAllocator* CompositableClient::GetTextureClientRecycler() {

It dispatches a task to the ImageBridgeChild thread ; which since bug 1647628 is also using a Background TaskQueue.

So we have to taskqueue, both using the background thread pool waiting for another ; this will deadlock.

Core issue is the nsTimerThread implementation; TimerThread::PostTimerEvent which will dispatch an event ; here we could prevent the deadlock by using the NS_DISPATCH_EVENT_MAY_BLOCK flag in the dispatch. However the nsTimer doesn't know we are dealing with a background task; and giving NS_DISPATCH_EVENT_MAY_BLOCK to a non background taskqueue will cause an error about giving an unknown task.

I'm not sure that having the background tasqueue using a different thread pool according to the NS_DISPATCH_EVENT_MAY_BLOCK flag can scale well if we want to use it more broadly.

Flags: needinfo?(nfroyd)
Blocks: 1650786

(In reply to Jean-Yves Avenard [:jya] from comment #3)

Core issue is the nsTimerThread implementation; TimerThread::PostTimerEvent which will dispatch an event ; here we could prevent the deadlock by using the NS_DISPATCH_EVENT_MAY_BLOCK flag in the dispatch. However the nsTimer doesn't know we are dealing with a background task; and giving NS_DISPATCH_EVENT_MAY_BLOCK to a non background taskqueue will cause an error about giving an unknown task.

I do not understand what "will cause an error about giving an unknown task" is supposed to mean. An unknown flag, perhaps? Maybe that just means we should make everything else understand NS_DISPATCH_EVENT_MAY_BLOCK.

I'm not sure that having the background tasqueue using a different thread pool according to the NS_DISPATCH_EVENT_MAY_BLOCK flag can scale well if we want to use it more broadly.

The core problem that it's meant to solve is that you don't want all the threads in your thread pool blocked on I/O when there are tasks that could be running on the CPU(s)...but neither do you want the number of threads in your thread pool to scale without bound, because there are only so many CPU-intensive tasks that you can really do simultaneously. So in the ideal state, you have two conceptual pools: one that's fixed at the number of cores you have available, for your CPU-intensive tasks, and another that has some policy (both for growing and shrinking the number of threads) for accommodating I/O tasks that can block.

Admittedly, we are not near the ideal state, because both of the underlying thread pools have a single thread at the moment, and the I/O pool doesn't even try to scale according to the load of available I/O-bound tasks. And we don't have anything that attempts to catch mistakes in dispatch flags, either. I think it's at least reasonable to change the number of threads in the background pools and/or the idle shutdown time for those pools given that we're starting to use the background pools more.

What else would you do differently?

Flags: needinfo?(nfroyd)
Type: defect → enhancement
You need to log in before you can comment on or make changes to this bug.