10-second long startup hangs on developer machines after each update because of the sanity test "snapshot" checks, if webrender compositor is enabled
Categories
(Core :: Graphics, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox73 | --- | wontfix |
firefox74 | + | disabled |
firefox75 | --- | verified |
People
(Reporter: Gijs, Assigned: sotaro, NeedInfo)
References
(Regression)
Details
(Keywords: hang, perf, regression)
Attachments
(4 files, 1 obsolete file)
(deleted),
application/json
|
Details | |
(deleted),
text/x-phabricator-request
|
pascalc
:
approval-mozilla-beta-
|
Details |
(deleted),
text/x-phabricator-request
|
pascalc
:
approval-mozilla-beta-
|
Details |
(deleted),
video/webm
|
Details |
I'm seeing this both on current nightly and on beta. Profile from beta: https://perfht.ml/2SG3Boi
Looks like the parent is doing sync IPC via mozilla::layers::PWebRenderBridgeChild::SendGetSnapshot at startup. Stack:
mozilla::layers::PWebRenderBridgeChild::SendGetSnapshot(mozilla::layers::PTextureChild*)
PWebRenderBridge::Msg_GetSnapshot
mozilla::layers::WebRenderLayerManager::MakeSnapshotIfRequired(mozilla::gfx::IntSizeTyped<mozilla::LayoutDevicePixel>)
mozilla::layers::WebRenderLayerManager::EndTransactionWithoutLayer(nsDisplayList*, nsDisplayListBuilder*, WrFiltersHolder&&, mozilla::layers::WebRenderBackgroundData*)
nsDisplayList::PaintRoot(nsDisplayListBuilder*, gfxContext*, unsigned int)
nsDisplayList::PaintRoot
static nsLayoutUtils::PaintFrame(gfxContext*, nsIFrame*, nsRegion const&, unsigned int, nsDisplayListBuilderMode, nsLayoutUtils::PaintFrameFlags)
nsLayoutUtils::PaintFrame
mozilla::PresShell::RenderDocument(nsRect const&, mozilla::RenderDocumentFlags, unsigned int, gfxContext*)
mozilla::dom::CanvasRenderingContext2D::DrawWindow(nsGlobalWindowInner&, double, double, double, double, nsTSubstring<char> const&, unsigned int, mozilla::ErrorResult&)
mozilla::dom::CanvasRenderingContext2D_Binding::drawWindow(JSContext*, JS::Handle<JSObject *>, void*, JSJitMethodCallArgs const&)
CanvasRenderingContext2D.drawWindow
mozilla::dom::binding_detail::GenericMethod<mozilla::dom::binding_detail::NormalThisPolicy,mozilla::dom::binding_detail::ThrowExceptions>(JSContext*, unsigned int, JS::Value*)
js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason)
Interpret(JSContext*, js::RunState&)
js::RunScript(JSContext*, js::RunState&)
takeWindowSnapshot
testCompositor
runSanityTest
receiveMessage
js::RunScript
js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason)
JS::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, JS::HandleValueArray const&, JS::MutableHandle<JS::Value>)
mozilla::dom::MessageListener::ReceiveMessage(JSContext*, JS::Handle<JS::Value>, mozilla::dom::ReceiveMessageArgument const&, JS::MutableHandle<JS::Value>, mozilla::ErrorResult&)
nsFrameMessageManager::ReceiveMessage(nsISupports*, nsFrameLoader*, bool, nsTSubstring<char16_t> const&, bool, mozilla::dom::ipc::StructuredCloneData*, mozilla::jsipc::CpowHolder*, nsIPrincipal*, nsTArray<mozilla::dom::ipc::StructuredCloneData>*, mozilla::ErrorResult&)
MessageListener.receiveMessage
message manager handler
mozilla::dom::BrowserParent::ReceiveMessage(nsTString<char16_t> const&, bool, mozilla::dom::ipc::StructuredCloneData*, mozilla::jsipc::CpowHolder*, nsIPrincipal*, nsTArray<mozilla::dom::ipc::StructuredCloneData>*)
mozilla::dom::BrowserParent::RecvAsyncMessage(nsTString<char16_t> const&, nsTArray<mozilla::jsipc::CpowEntry>&&, nsIPrincipal*, mozilla::dom::ClonedMessageData const&)
BrowserParent::RecvAsyncMessage gfxSanity:ContentLoaded
mozilla::dom::PBrowserParent::OnMessageReceived(IPC::Message const&)
PBrowser::Msg_AsyncMessage
mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&)
mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message&&)
nsThread::ProcessNextEvent(bool, bool*)
NS_ProcessNextEvent(nsIThread*, bool)
mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*)
MessageLoop::RunHandler()
MessageLoop::Run()
nsBaseAppShell::Run()
nsAppShell::Run()
nsAppStartup::Run()
XREMain::XRE_mainRun()
XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&)
XREMain::XRE_main
XRE_main(int, char**, mozilla::BootstrapConfig const&)
NS_internal_main(int, char**, char**)
wmain(int, wchar_t**)
When this happens, the parent stops drawing, in some cases even showing the windows "not responding" titlebar. This is a powerful machine, it should not be hanging for 10 seconds. I have no idea what this code is doing, but unless this is somehow restricted to my machine, we can't ship this.
Reporter | ||
Comment 1•5 years ago
|
||
Reporter | ||
Comment 2•5 years ago
|
||
I suspect enabling compositor tripped this. :sotaro, can you take a look please?
Comment 3•5 years ago
|
||
So this is some JS API calling into CanvasRenderingContext2D.drawWindow
... Do we know what's calling that on startup?
Comment 4•5 years ago
|
||
I guess it is https://searchfox.org/mozilla-central/rev/a4be2fbe9bd4f405c91cc16e4e3a80400f5a9301/toolkit/components/thumbnails/PageThumbUtils.jsm#265?
Comment 5•5 years ago
|
||
Ah, no, SanityTest.jsm, sorry, should've looked at the profiles rather than just the stack :)
Reporter | ||
Comment 6•5 years ago
|
||
I confirmed that on today's nightly, if I use mozregression --launch 2020-02-14 --pref gfx.webrender.compositor:false
, the problem goes away. I'll see if this is a regression for webrender compositor on this machine (ie if I can find a specific bug if I force-enable the pref while running mozregression)
Reporter | ||
Comment 7•5 years ago
|
||
With compositor forced on, I get this regression window, which does seem very related... https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=5b5ca79bd644487d77a68aa57159f50f3db855d1&tochange=dab4de5a90b9799223befb8f607ede91944fc743
Updated•5 years ago
|
Updated•5 years ago
|
Comment 8•5 years ago
|
||
I don't quite understand the profile. It shows that, for most of the time we spend waiting, the GPU process isn't actually running? Or maybe whatever delays the GPU process happens before the profiler gets initialized?
Reporter | ||
Comment 9•5 years ago
|
||
(In reply to Markus Stange [:mstange] from comment #8)
I don't quite understand the profile. It shows that, for most of the time we spend waiting, the GPU process isn't actually running? Or maybe whatever delays the GPU process happens before the profiler gets initialized?
I can reproduce this issue very reliably, so if there is some way for me to get this information please let me know and I can try doing so. The graphics failure section in about:support does include "Killing GPU process due to IPC reply timeout" so perhaps something breaks the GPU process? I have no idea about the specifics of the profile, the profile was just how I figured out what was causing this massive hang on startup that I was seeing...
Comment 10•5 years ago
|
||
Hey Gijs, I looked at this today with jrmuizel and mstange, and their strong suspicion is that the GPU process crashes or has hung, the launcher times out, and relaunches a new GPU process, and the main thread in the parent process gets blocked waiting on the sync IPC message to be serviced. Eventually that process gets killed automatically after 10s, the sanity test completes, and a new GPU process gets launched.
So what we need here is the stack for the hung GPU process. Is there a crash report in about:crashes? We might have gathered one when force-killing the hung GPU process.
Updated•5 years ago
|
Comment 11•5 years ago
|
||
If there's no crash report, we could extend the auto-kill time so that a debugger can be attached. You bump that time by setting layers.gpu-process.ipc_reply_timeout_ms
Reporter | ||
Comment 12•5 years ago
|
||
(In reply to Mike Conley (:mconley) (:βοΈ) from comment #11)
If there's no crash report, we could extend the auto-kill time so that a debugger can be attached. You bump that time by setting
layers.gpu-process.ipc_reply_timeout_ms
There's no crash report. So I did what you suggested, but MSVS at least out of the box did not break on anything when attached to the gpu process... it just updates the UI to indicate the GPU process died (ie no longer indicates it's attached to anything etc.).
What would be the easiest way for me to gather the actually-interesting stack here? I tried looking at https://searchfox.org/mozilla-central/rev/174f1195ec740e8f17223b48018f7e14e6d4e40e/gfx/layers/ipc/CompositorManagerChild.cpp#258 and can find the parent process side of things from there, but I'm not sure how to find out what the child process is doing at that point - when I break the debugger in the child process at this point it seems to "just" be waiting in mozilla::widget::WinUtils' WaitForMessage
loop. It's basically using (almost) no CPU while the parent process is "stuck" waiting for the child.
Comment 14•5 years ago
|
||
(In reply to :Gijs (he/him) from comment #12)
when I break the debugger in the child process at this point
Which process are you referring to here, exactly? Is the "child process" the GPU process? Another confusing aspect to this is that the "parent process" / "UI process" is the child-side of the CompositorManager actor, and the GPU process is the parent-side of that actor.
When the parent process main thread is blocking in GetSnapshot, it's waiting for something to happen on the Compositor thread of the GPU process, which is waiting for something to happen on the Renderer thread of the GPU process. The WaitForMessage
stack sounds like it might have been on the main thread of the GPU process. So the stacks we need are of the Compositor thread and on the Renderer thread of the GPU process.
Any content processes are not of interest here - this hang is purely an interaction between the parent process and the gpu process.
Reporter | ||
Comment 15•5 years ago
|
||
(In reply to Markus Stange [:mstange] from comment #14)
(In reply to :Gijs (he/him) from comment #12)
when I break the debugger in the child process at this point
Which process are you referring to here, exactly? Is the "child process" the GPU process?
Yeah, sorry, the GPU process.
Another confusing aspect to this is that the "parent process" / "UI process" is the child-side of the CompositorManager actor, and the GPU process is the parent-side of that actor.
When the parent process main thread is blocking in GetSnapshot, it's waiting for something to happen on the Compositor thread of the GPU process, which is waiting for something to happen on the Renderer thread of the GPU process. TheWaitForMessage
stack sounds like it might have been on the main thread of the GPU process. So the stacks we need are of the Compositor thread and on the Renderer thread of the GPU process.
Renderer:
> xul.dll!mozilla::wr::RenderCompositorANGLE::MaybeReadback(const mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> & aReadbackSize, const mozilla::wr::ImageFormat & aReadbackFormat, const mozilla::Range<unsigned char> & aReadbackBuffer) Line 945 C++
xul.dll!mozilla::wr::RendererOGL::UpdateAndRender(const mozilla::Maybe<mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> > & aReadbackSize, const mozilla::Maybe<mozilla::wr::ImageFormat> & aReadbackFormat, const mozilla::Maybe<mozilla::Range<unsigned char> > & aReadbackBuffer, bool aHadSlowFrame, mozilla::wr::RendererStats * aOutStats) Line 168 C++
xul.dll!mozilla::wr::RenderThread::UpdateAndRender(mozilla::wr::WrWindowId aWindowId, const mozilla::layers::BaseTransactionId<mozilla::VsyncIdType> & aStartId, const mozilla::TimeStamp & aStartTime, bool aRender, const mozilla::Maybe<mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> > & aReadbackSize, const mozilla::Maybe<mozilla::wr::ImageFormat> & aReadbackFormat, const mozilla::Maybe<mozilla::Range<unsigned char> > & aReadbackBuffer, bool aHadSlowFrame) Line 478 C++
xul.dll!mozilla::wr::WebRenderAPI::Readback::Readback::Run(mozilla::wr::RenderThread & aRenderThread, mozilla::wr::WrWindowId aWindowId) Line 492 C++
xul.dll!mozilla::wr::RenderThread::RunEvent(mozilla::wr::WrWindowId aWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> > aEvent) Line 414 C++
[Inline Frame] xul.dll!mozilla::detail::RunnableMethodArguments<mozilla::wr::WrWindowId,mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> > &&>::applyImpl(mozilla::wr::RenderThread * o, void(mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> >) args, mozilla::Tuple<StoreCopyPassByConstLRef<mozilla::wr::WrWindowId>,StoreCopyPassByRRef<mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> > > > &, std::integer_sequence<unsigned long long,0,1>) Line 1160 C++
[Inline Frame] xul.dll!mozilla::detail::RunnableMethodArguments<mozilla::wr::WrWindowId,mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> > &&>::apply(mozilla::wr::RenderThread * o, void(mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> >)) Line 1166 C++
xul.dll!mozilla::detail::RunnableMethodImpl<mozilla::wr::RenderThread *,void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> >),1,mozilla::RunnableKind::Standard,mozilla::wr::WrWindowId,mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> > &&>::Run() Line 1215 C++
[Inline Frame] xul.dll!MessageLoop::RunTask(already_AddRefed<nsIRunnable> aTask) Line 442 C++
xul.dll!MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask && pending_task) Line 450 C++
xul.dll!MessageLoop::DoWork() Line 523 C++
xul.dll!base::MessagePumpDefault::Run(base::MessagePump::Delegate * delegate) Line 35 C++
[Inline Frame] xul.dll!MessageLoop::RunInternal() Line 315 C++
xul.dll!MessageLoop::RunHandler() Line 309 C++
[Inline Frame] xul.dll!MessageLoop::Run() Line 290 C++
xul.dll!base::Thread::ThreadMain() Line 192 C++
xul.dll!`anonymous namespace'::ThreadFunc(void * closure) Line 20 C++
[External Code]
[Inline Frame] mozglue.dll!mozilla::interceptor::FuncHook<mozilla::interceptor::WindowsDllInterceptor<mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,1> >,void (*)(int, void *, void *)>::operator()(int &) Line 151 C++
mozglue.dll!patched_BaseThreadInitThunk(int aIsInitialThread, void * aStartAddress, void * aThreadParam) Line 583 C++
[External Code]
Compositor:
[External Code]
> nss3.dll!_PR_MD_WAIT_CV(_MDCVar * cv, _MDLock * lock, unsigned int timeout) Line 252 C
[Inline Frame] nss3.dll!_PR_WaitCondVar(PRThread * thread, PRCondVar * cvar, PRLock * lock, unsigned int timeout) Line 177 C
nss3.dll!PR_WaitCondVar(PRCondVar * cvar, unsigned int timeout) Line 539 C
nss3.dll!PR_Wait(PRMonitor * mon, unsigned int ticks) Line 308 C
[Inline Frame] xul.dll!mozilla::ReentrantMonitor::Wait(unsigned int aInterval) Line 95 C++
[Inline Frame] xul.dll!mozilla::layers::SynchronousTask::Wait() Line 25 C++
xul.dll!mozilla::wr::WebRenderAPI::Readback(const mozilla::TimeStamp & aStartTime, mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> size, const mozilla::gfx::SurfaceFormat & aFormat, const mozilla::Range<unsigned char> & buffer) Line 517 C++
xul.dll!mozilla::layers::WebRenderBridgeParent::RecvGetSnapshot(mozilla::layers::PTextureParent * aTexture) Line 1876 C++
xul.dll!mozilla::layers::PWebRenderBridgeParent::OnMessageReceived(const IPC::Message & msg__, IPC::Message * & reply__) Line 979 C++
xul.dll!mozilla::layers::PCompositorManagerParent::OnMessageReceived(const IPC::Message & msg__, IPC::Message * & reply__) Line 590 C++
xul.dll!mozilla::ipc::MessageChannel::DispatchSyncMessage(mozilla::ipc::ActorLifecycleProxy * aProxy, const IPC::Message & aMsg, IPC::Message * & aReply) Line 2184 C++
xul.dll!mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message && aMsg) Line 2133 C++
xul.dll!mozilla::ipc::MessageChannel::MessageTask::Run() Line 2009 C++
[Inline Frame] xul.dll!MessageLoop::RunTask(already_AddRefed<nsIRunnable> aTask) Line 442 C++
xul.dll!MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask && pending_task) Line 450 C++
xul.dll!MessageLoop::DoWork() Line 523 C++
xul.dll!base::MessagePumpForUI::DoRunLoop() Line 204 C++
[Inline Frame] xul.dll!base::MessagePumpWin::RunWithDispatcher(base::MessagePump::Delegate * delegate, base::MessagePumpWin::Dispatcher * dispatcher) Line 57 C++
xul.dll!base::MessagePumpWin::Run(base::MessagePump::Delegate * delegate) Line 79 C++
[Inline Frame] xul.dll!MessageLoop::RunInternal() Line 315 C++
xul.dll!MessageLoop::RunHandler() Line 309 C++
[Inline Frame] xul.dll!MessageLoop::Run() Line 290 C++
xul.dll!base::Thread::ThreadMain() Line 192 C++
xul.dll!`anonymous namespace'::ThreadFunc(void * closure) Line 20 C++
[External Code]
[Inline Frame] mozglue.dll!mozilla::interceptor::FuncHook<mozilla::interceptor::WindowsDllInterceptor<mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,1> >,void (*)(int, void *, void *)>::operator()(int &) Line 151 C++
mozglue.dll!patched_BaseThreadInitThunk(int aIsInitialThread, void * aStartAddress, void * aThreadParam) Line 583 C++
[External Code]
I think this points at the code from bug 1585619 (the regressing bug) but I don't know what other information this really provides... It looks like the Renderer thread is hung in PrintWindow()
? But I don't know why... Does that help or did I look at the wrong thing?
Reporter | ||
Comment 16•5 years ago
|
||
The "Remarks" section on https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-printwindow#remarks does seem to indicate that blocking on this is unwise...
Comment 17•5 years ago
|
||
Thanks, can you get the rest of the stack of the Renderer thread (the part in external code), too? Your last comment indicates that it's blocking in PrintWindow, ok. So not in WaitForCommitCompletion. Can you confirm?
Reporter | ||
Comment 18•5 years ago
|
||
ntdll.dll!KiUserCallbackDispatch() Unknown
win32u.dll!NtUserPrintWindow() Unknown
xul.dll!mozilla::wr::RenderCompositorANGLE::MaybeReadback(const mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> & aReadbackSize, const mozilla::wr::ImageFormat & aReadbackFormat, const mozilla::Range<unsigned char> & aReadbackBuffer) Line 945 C++
xul.dll!mozilla::wr::RendererOGL::UpdateAndRender(const mozilla::Maybe<mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> > & aReadbackSize, const mozilla::Maybe<mozilla::wr::ImageFormat> & aReadbackFormat, const mozilla::Maybe<mozilla::Range<unsigned char> > & aReadbackBuffer, bool aHadSlowFrame, mozilla::wr::RendererStats * aOutStats) Line 168 C++
xul.dll!mozilla::wr::RenderThread::UpdateAndRender(mozilla::wr::WrWindowId aWindowId, const mozilla::layers::BaseTransactionId<mozilla::VsyncIdType> & aStartId, const mozilla::TimeStamp & aStartTime, bool aRender, const mozilla::Maybe<mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> > & aReadbackSize, const mozilla::Maybe<mozilla::wr::ImageFormat> & aReadbackFormat, const mozilla::Maybe<mozilla::Range<unsigned char> > & aReadbackBuffer, bool aHadSlowFrame) Line 478 C++
xul.dll!mozilla::wr::WebRenderAPI::Readback::Readback::Run(mozilla::wr::RenderThread & aRenderThread, mozilla::wr::WrWindowId aWindowId) Line 492 C++
xul.dll!mozilla::wr::RenderThread::RunEvent(mozilla::wr::WrWindowId aWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> > aEvent) Line 414 C++
[Inline Frame] xul.dll!mozilla::detail::RunnableMethodArguments<mozilla::wr::WrWindowId,mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> > &&>::applyImpl(mozilla::wr::RenderThread * o, void(mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> >) args, mozilla::Tuple<StoreCopyPassByConstLRef<mozilla::wr::WrWindowId>,StoreCopyPassByRRef<mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> > > > &, std::integer_sequence<unsigned long long,0,1>) Line 1160 C++
[Inline Frame] xul.dll!mozilla::detail::RunnableMethodArguments<mozilla::wr::WrWindowId,mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> > &&>::apply(mozilla::wr::RenderThread * o, void(mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> >)) Line 1166 C++
xul.dll!mozilla::detail::RunnableMethodImpl<mozilla::wr::RenderThread *,void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> >),1,mozilla::RunnableKind::Standard,mozilla::wr::WrWindowId,mozilla::UniquePtr<mozilla::wr::RendererEvent,mozilla::DefaultDelete<mozilla::wr::RendererEvent> > &&>::Run() Line 1215 C++
[Inline Frame] xul.dll!MessageLoop::RunTask(already_AddRefed<nsIRunnable> aTask) Line 442 C++
xul.dll!MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask && pending_task) Line 450 C++
xul.dll!MessageLoop::DoWork() Line 523 C++
xul.dll!base::MessagePumpDefault::Run(base::MessagePump::Delegate * delegate) Line 35 C++
[Inline Frame] xul.dll!MessageLoop::RunInternal() Line 315 C++
xul.dll!MessageLoop::RunHandler() Line 309 C++
[Inline Frame] xul.dll!MessageLoop::Run() Line 290 C++
xul.dll!base::Thread::ThreadMain() Line 192 C++
xul.dll!`anonymous namespace'::ThreadFunc(void * closure) Line 20 C++
> kernel32.dll!BaseThreadInitThunk() Unknown
[Inline Frame] mozglue.dll!mozilla::interceptor::FuncHook<mozilla::interceptor::WindowsDllInterceptor<mozilla::interceptor::VMSharingPolicyShared<mozilla::interceptor::MMPolicyInProcess,1> >,void (*)(int, void *, void *)>::operator()(int &) Line 151 C++
mozglue.dll!patched_BaseThreadInitThunk(int aIsInitialThread, void * aStartAddress, void * aThreadParam) Line 583 C++
ntdll.dll!RtlUserThreadStart() Unknown
This is all I can get out of MSVS... does that answer your question?
Reporter | ||
Comment 19•5 years ago
|
||
(In reply to Markus Stange [:mstange] from comment #17)
So not in WaitForCommitCompletion. Can you confirm?
Right, it's definitely in PrintWindow().
Assignee | ||
Comment 20•5 years ago
|
||
With WR native compositor, screenshot becomes very slow(normally 30ms-60ms). Then when profiler screenshot happens, WR native compositor is disabled by Bug 1602643 . But we could not do it in this case. So at first, we might need to think about a possibility of removing drawWindow() if WR compositor is enabled on Windows.
https://searchfox.org/mozilla-central/rev/174f1195ec740e8f17223b48018f7e14e6d4e40e/toolkit/components/gfx/SanityTest.jsm#108
Comment 21•5 years ago
|
||
I wonder if it would work to use something like:
hdc_target = GetDC(window);
BitBlt(hdc, 0, 0, capture->width, capture->height, hdc_target,
capture->x, capture->y, SRCCOPY);
ReleaseDC(NULL, hdc_target);
instead of PrintWindow()
Assignee | ||
Comment 23•5 years ago
|
||
Hmm, I tested the following STR as to cause PrintWindow() on 2 Win10 PCs, but RenderCompositorANGLE::MaybeReadback() took around 30-40ms.
[1] Confirm that WebRender is enabled in about:support
[2] Enable the prefs gfx.webrender.compositor=true
[3] Remve "sanity-test.driver-version" pref
[4] Restart Firefox
Assignee | ||
Updated•5 years ago
|
Updated•5 years ago
|
Assignee | ||
Comment 24•5 years ago
|
||
Sanity test was originally added by Bug 1156135. It checks hardware video decoding and rendering capability. If it failed, "media.hardware-video-decoding.failed" pref is set to true.
Then Bug 1377866 added a check if Advanced Layers works. If the check fails, Advanced layer was disabled and update compositor by calling utils.triggerDeviceReset().
Assignee | ||
Comment 25•5 years ago
|
||
It seems that we could skip gfx sanity tests when WebRender is used. It is basically better to be handled by blacklist. And GPU process support(Bug 1314133) was enabled after Bug 1156135. By Bug 1338011, content process does not do hardware video decoding by default. It seems that it could reduce the video rendering problem. And D3D9DXVA2Manager blacklists several GPUs now. Further, WebRender is carefully rolling out. WebRender is not qualified for old gpu drivers.
- https://searchfox.org/mozilla-central/rev/0439db3a39faddb90197a87cc184c57dcbf0a770/dom/media/platforms/wmf/DXVA2Manager.cpp#48
- https://searchfox.org/mozilla-central/rev/0439db3a39faddb90197a87cc184c57dcbf0a770/dom/media/platforms/wmf/DXVA2Manager.cpp#78
And if WebRender is used, it seems that we could disable Advanced Layers as fallback candidate.
If the above could be acceptable, gfx sanity test could be skipped if WebRender is used.
Assignee | ||
Comment 26•5 years ago
|
||
Comment 27•5 years ago
|
||
Pushed by sikeda.birchill@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/beaad55e6508 Skip drawWindow() when LayerManger type is WebRender r=jrmuizel
Comment 28•5 years ago
|
||
bugherder |
Reporter | ||
Comment 29•5 years ago
|
||
Hoping to test that the fix works on my machine once today's nightlies are built, though I'll note here for anyone else looking at this that bug 1615683 turned webrender compositor off again on nightly (making this issue go away anyway - but not affecting beta), which confused me for a bit.
Reporter | ||
Comment 30•5 years ago
|
||
I'm still seeing this hang on today's nightly. Do you want me to file a follow-up bug, or reopen this one, or...?
Assignee | ||
Comment 31•5 years ago
|
||
Oh, sorry. I made a silly mistake during creating the patch. I am going to re-open the bug and create a followup fix.
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 32•5 years ago
|
||
Updated•5 years ago
|
Assignee | ||
Comment 33•5 years ago
|
||
Comment 34•5 years ago
|
||
Pushed by sikeda.birchill@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e5f581db2514 Skip takeWindowSnapshot() with WebRender r=nical
Comment 35•5 years ago
|
||
bugherder |
Comment 36•5 years ago
|
||
Sotaro, is that something we could uplift safely to beta or should it ride the 75 train?
Updated•5 years ago
|
Assignee | ||
Comment 37•5 years ago
|
||
:Gijs, can you check if the problem is addressed on latest nightly?
Assignee | ||
Comment 40•5 years ago
|
||
Comment on attachment 9127739 [details]
Bug 1615590 - Skip takeWindowSnapshot() with WebRender
Beta/Release Uplift Approval Request
- User impact if declined: 10-second long startup hang might happen on some PCs just after updating Firefox.
https://phabricator.services.mozilla.com/D63125 is necessary to this patch.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Medium
- Why is the change risky/not risky? (and alternatives if risky): With the change, hardware decoded video readback test is skipped. If there is a PCs that are not blacklisted for hardware decoded video yet and the PCs are qualified for WebRender, hardware decoded video rendering could cause a problem. But hardware decoded video is already managed by the blacklisting. If the problem will happen, the blacklist should be updated.
- String changes made/needed: none
Assignee | ||
Updated•5 years ago
|
Comment 41•5 years ago
|
||
Comment on attachment 9127739 [details]
Bug 1615590 - Skip takeWindowSnapshot() with WebRender
Given the importantperf regression fixed, uplift approved for 74.0b8. If this patch causes functional regressions or instabilities, we will back it out before shipping 74. Thanks
Updated•5 years ago
|
Comment 42•5 years ago
|
||
Conflict on toolkit/components/gfx/SanityTest.jsm when I import https://hg.mozilla.org/mozilla-central/rev/e5f581db2514.
Updated•5 years ago
|
Assignee | ||
Comment 43•5 years ago
|
||
Razvan Maries, I could apply the 2 patches on beta without problem. D63431 depends on D63125, did you apply them in the same order?
Comment 44•5 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/641687384e58
https://hg.mozilla.org/releases/mozilla-beta/rev/bdf8c742a153
Updated•5 years ago
|
Assignee | ||
Comment 45•5 years ago
|
||
:jrmuizel, is it better to backout the fix from beta? DC compositor is going to be disabled by Bug 1618377 and the fix causes Bug 1618175.
Assignee | ||
Comment 47•5 years ago
|
||
(In reply to Sebastian Hengst [:aryx] (needinfo on intermittent or backout) from comment #44)
https://hg.mozilla.org/releases/mozilla-beta/rev/641687384e58
https://hg.mozilla.org/releases/mozilla-beta/rev/bdf8c742a153
:aryx, can you backout the above from beta because of Comment 45 and Comment 46? Thanks!
Comment 49•5 years ago
|
||
Pascal can the flags be modified so this doesn't show up in the uplift query since it got backed out?
Updated•5 years ago
|
Updated•5 years ago
|
Comment 50•5 years ago
|
||
Gijs, can you try capturing a Firefox or Edge window with https://webrtc.github.io/samples/src/content/getusermedia/getdisplaymedia/. I believe the Window capture code also uses PrintWindow so it would be interesting to see if you see hangs there too.
Reporter | ||
Comment 51•5 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #50)
Gijs, can you try capturing a Firefox or Edge window with https://webrtc.github.io/samples/src/content/getusermedia/getdisplaymedia/. I believe the Window capture code also uses PrintWindow so it would be interesting to see if you see hangs there too.
I don't see a lack of responsiveness on the page or the UI but the printwindow call doesn't seem to work properly - see attached recording from the page. Almost the entire screen is white, the only bit that isn't is what it thinks is the window frame - which looks like a win7 aero basic frame, which is odd because this is win10 and my window frame looks nothing like that, and the titlebar is not even enabled anyway (ie tabs are in the titlebar). Profile from another attempt in case that's enlightening: https://perfht.ml/2VDY4lo .
Assignee | ||
Updated•3 years ago
|
Updated•3 years ago
|
Description
•