Open Bug 1685321 Opened 4 years ago Updated 4 years ago

Remove the usage of WMF decoder module on the content processes

Categories

(Core :: Audio/Video: Playback, task, P3)

Firefox 84
x86_64
Windows 10
task

Tracking

()

People

(Reporter: andro.marian.v94, Assigned: alwu)

References

Details

(Whiteboard: [media-youtube])

Attachments

(3 files, 5 obsolete files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36 Edg/87.0.664.66

Steps to reproduce:

Video: https://youtu.be/LvL3FY_iFw0
Url: https://www.youtube.com/results?search_query=sketchup+open+door

Actual results:

The page is not loading. The red progress bar is stuck. Firefox CPU usage of any resource is 0%

Expected results:

To load the page.

Works good on: 83.0b10, 85.0b5

OS: Unspecified → Windows 10
Hardware: Unspecified → x86_64

I can't manage to reproduce this issue, not matter what Firefox I'm using the page is correctly loaded.

Mozilla has a list of basic troubleshooting steps. These can solve almost every problem.
https://www.reddit.com/r/firefox/wiki/support/troubleshooting
Could you please follow the steps above it might help?
Thanks.

Flags: needinfo?(andro.marian.v94)

Already try it before. Remove all files from AppData, LocalData, Safe Mode, Private Window.
I can reproduce it all the time in 84, 84.0.1, 84.0.2.

Flags: needinfo?(andro.marian.v94)

Now on a windows new profile is the same.

I'll assign "Core: Audio/Video: Playback" component, hopefully we'll receive some follow up action.

Component: Untriaged → Audio/Video: Playback
Product: Firefox → Core

I made a profiling test: https://share.firefox.dev/35xXgm4

Hi,
From comment0 & comment2, did you mean the Firefox 83 didn't have this problem, but it happened on Firefox 84 and all following versions?
Would you mind to use mozregression to find the culprit?

In addition, because you didn't start video yet, it seems to me that this issue isn't too related with media, I would probably say that's more related with DOM or layout. Move it to DOM component first to see if they have any thought.

Component: Audio/Video: Playback → DOM: Core & HTML
Component: DOM: Core & HTML → Audio/Video: Playback
Flags: needinfo?(andro.marian.v94)

Sorry, in your profiled result, I noticed that there are huge amount of time we were waiting for WMFDecoderModule shutdown, so that's indeed media stuffs.

Severity: -- → S3
Priority: -- → P3
Summary: Firefox is slow and seams to not porocessing anything → Firefox is slow and seams to not processing anything
Whiteboard: [media-youtube]

I try to find a regression starting from 2020-4-1 but I don't find anything on Nightly build.

Flags: needinfo?(andro.marian.v94)
Attached file Builds.txt (deleted) —

This are the builds on mozilla-release. Says the test result at the end of build.

Just another update.

Tested on Virtual Machine with 1803 17134.1006 and the problem seams to not exists.
Tested on Virtual Machine with 20H2 19042.746 and the problem still exists.

What is broken on 84 with 20H2 but other versions works good?

Attached image wmf-shutdown.png (deleted) —

Thank you for help! From the profiled result, we were blocking by destroying the WMFDecoderModule whenever YT was calling MediaSource.isTypeSupported(). That happened several times in the result.

It seems to me that we should create a global PDM used to check the support type, instread of creating and destroying PDM everytime, eg. In mp4decoder.

Assignee: nobody → alwu
Status: UNCONFIRMED → NEW
Ever confirmed: true

So, the other versions is not destroying it?
Like the 83 profile: https://share.firefox.dev/3iqAnpW

They did, on your result in comment12, the shutdown only took around 2ms. But on the result in comment5, it took around 1400~2400ms to shutdown. More specifically, the shutdown were blocked here (MTA thread might be too busy at that time)

Depends on: 1688278
Priority: P3 → P2
Summary: Firefox is slow and seams to not processing anything → Youtube is slow because the main thread is blocked by the process of destroying WMFDecoderModule

We've a lot of places using PDMFactory, but some of them only need the ability of asking if the type is supported, they won't ask for a decoder.

Therefore, implement a interface IPDMDecoderSupport to provide only necessary methods to those callers.

Depends on D102793

Everytime create and destroy a PDMFactory is a cost, especially considering some method would require longer time and block the main thread. Eg. MFShutdown needs to be run on MTA thread and we would synchrously wait until it finishes.

In fact, PDMFactory should only be created once per process and then being reused all the time until that process shutdowns. Therefore, make it to singleton is a necessary thing.

In addition, in order to support encrypted content in a cleaner way, we create a separate factory EMEPDMFactory for them, which would create a EME decoder module and give it a singleton PDMFactory.

Depends on D102854

ClearOnShutdown() can only be used on main thread, so do a sync creation for sPDMFactory, which can also make sure EnsureInit() is running on main thread.

Depends on D102855

Attached file Bug 1685321 - part4 : remove unused variables. (obsolete) (deleted) —

Depends on D102856

In current new design, EnsureInit() doesn't need to be a static function anymore.

Depends on D102857

FWIW, with current nightly, you can't get there as the WMF should only ever be run in the RDD process and so can't block the content process.

So those changes feel like a workaround that shouldn't occur anymore. The goal should be to remove using WMF in the content process rather than adding workaround for WMF not working well there.

(In reply to Jean-Yves Avenard [:jya] from comment #19)

FWIW, with current nightly, you can't get there as the WMF should only ever be run in the RDD process and so can't block the content process.

So those changes feel like a workaround that shouldn't occur anymore. The goal should be to remove using WMF in the content process rather than adding workaround for WMF not working well there.

I agree with that the utimate goal is not to using WMF in content process, but I don't think my patches are a workaround. First, I don't see any necessary to frequantly create and destroy the PDMFactory again and again. We only need to initialize PDMs and add them to mCurrentPDMs once per process. When you enter a video page on Youtube, from loading the resource to starting showing the first video frame, we would create and destroy approximately 54~80 times of PDMFactory.

Second, before removing WMF usage from content process, we might want to know how many users have turned off both GPU and RDD process (which makes them only be able to use WMF on content process, it might be related with reasons, such as that their Graphic card is buggy some they don't want to use GPU for decoding, or turning on RDD would cause crashes or tab frozen)

So my patches here are aim to solve the first problem, which is to avoid redundant function calls. Then it would somehow relief the issue this bug mentioned, but that is not the main goal of those patches. In addition, I tested my patches on Ubuntu, where we can see the improvement on time consuming, it's not a big improvement though. But accumulating more and more small improvements would eventually become a worthy investment.

After an offline discussion with :jya, I will abandon current patches and change a new way, which is to avoid initializing WMF on the content process.

Attachment #9198863 - Attachment is obsolete: true
Attachment #9198864 - Attachment is obsolete: true
Attachment #9198865 - Attachment is obsolete: true
Attachment #9198866 - Attachment is obsolete: true
Attachment #9198907 - Attachment is obsolete: true

For content process, it would get the decoder support information from either GPU or RDD process, where the WMF decoder module is being executed.

Therefore, there is no need to run WMF decoder module on the content process. In the future, we should only run RemoteDecoderModule in content process.

Woorks good now on the 85 release from what I see.

After getting more understanding of the status of RDD, this issue has beeb fixed already on Fx85 where we've enabled RDD by default (bug1681228)

So what I'm going to do in this bug is to remove the possible WMF usage in the content process, which is a task, not a defect. In addition, per this comment, we need to implement a mechanism to propagate the error from GPU/RDD process to content process if users are not able to create WMF decoder module.

Therefore, based on above reason, I'm going to change the priority of this bug.

Severity: S3 → N/A
Type: defect → task
Priority: P2 → P3
Summary: Youtube is slow because the main thread is blocked by the process of destroying WMFDecoderModule → Remove the usage of WMF decoder module on the content processes

So my current plans are that,

  1. fix the decoder doctor in order to show the banner when encountering the WMF failure
  2. use a pref to disable using WMF on content process, which can solve bug 1521370
    => and probably add some telemetry probes to see if we still need WMF on content process or not. Eventually we would only use it on GPU/RDD.
  3. implement a mechanism to send the error flag from RDD/GPU process back to the content process, in order to report the error by decoder doctor
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: