Closed Bug 1645677 Opened 4 years ago Closed 2 years ago

[Wayland] Crash when opening toolbar panel

Categories

(Core :: Widget: Gtk, defect, P2)

x86_64
Linux
defect

RESOLVED FIXED
107 Branch
Tracking Status
firefox-esr102 --- disabled
firefox79 --- disabled
firefox80 --- disabled
firefox81 --- disabled
firefox105 --- disabled
firefox106 --- disabled
firefox107 --- fixed

People

(Reporter: jan, Assigned: stransky)

References

(Blocks 3 open bugs, Regressed 1 open bug, )

Details

(Keywords: crash, nightly-community)

Attachments

(6 files, 1 obsolete file)

Gnome Wayland, Debian Testing
Main process crash. I clicked to open the panel of enhanced-h264ify. It doesn't seem to be reproducible.
gfx.webrender.max-partial-present-rects is 0 (default),
but I have manually set gfx.webrender.panic-on-gl-error to true.

This bug is for crash report bp-9d065196-d957-4144-9e88-9ee940200614.

Top 10 frames of crashing thread:

0 libEGL_mesa.so.0 dri2_wl_swap_buffers_with_damage ./build/../src/egl/drivers/dri2/platform_wayland.c:1107
1 libEGL_mesa.so.0 dri2_swap_buffers_with_damage ./build/../src/egl/drivers/dri2/egl_dri2.c:1956
2 libEGL_mesa.so.0 _eglSwapBuffersWithDamageCommon ./build/../src/egl/main/eglapi.c:1376
3 libxul.so mozilla::gl::GLContextEGL::SwapBuffers gfx/gl/GLContextProviderEGL.cpp:536
4 libxul.so mozilla::wr::RenderCompositorEGL::EndFrame gfx/webrender_bindings/RenderCompositorEGL.cpp:113
5 libxul.so mozilla::wr::RendererOGL::UpdateAndRender gfx/webrender_bindings/RendererOGL.cpp:176
6 libxul.so mozilla::wr::RenderThread::UpdateAndRender gfx/webrender_bindings/RenderThread.cpp:478
7 libxul.so mozilla::wr::RenderThread::HandleFrameOneDoc gfx/webrender_bindings/RenderThread.cpp:356
8 libxul.so mozilla::detail::RunnableMethodImpl<mozilla::wr::RenderThread*, void  xpcom/threads/nsThreadUtils.h:1237
9 libxul.so base::MessagePumpDefault::Run ipc/chromium/src/base/message_pump_default.cc:35
Blocks: wr-linux
Severity: -- → S3

I clicked to open the panel of enhanced-h264ify.

When clicking on it after startup (while restored app tabs are still loading) I can reproduce a fallback to OpenGL (no crash report):

(#0) Error window is null
(#1) Error Failed to create EGLSurface
(#2) Error We don't have EGLSurface to draw into. Called too early?
(#3) Error We don't have EGLSurface to draw into. Called too early?
(#4) Error Compositors might be mixed (5,2)

I assume it could have the same underlying problem as the autoscroll icon.

Crash Signature: [@ dri2_wl_swap_buffers_with_damage] → [@ dri2_wl_swap_buffers_with_damage] [@ libEGL_mesa.so.0@0x28a81]
Depends on: 1650583
Crash Signature: [@ dri2_wl_swap_buffers_with_damage] [@ libEGL_mesa.so.0@0x28a81] → [@ dri2_wl_swap_buffers_with_damage] [@ libEGL_mesa.so.0@0x28a81] [@ wl_proxy_create_wrapper | libEGL_mesa.so.0@0x236b9 ]

Interesting! I wouldn't be surprised if this was related to bug 1650246 - i.e. maybe something about the first buffer swap is wrong for popups. Thanks for the hint with gfx.webrender.panic-on-gl-error, will try that.

Summary: [Wayland] Crash in [@ dri2_wl_swap_buffers_with_damage] → [Wayland] Crash when opening toolbar panel
Crash Signature: [@ dri2_wl_swap_buffers_with_damage] [@ libEGL_mesa.so.0@0x28a81] [@ wl_proxy_create_wrapper | libEGL_mesa.so.0@0x236b9 ] → [@ dri2_wl_swap_buffers_with_damage] [@ libEGL_mesa.so.0@0x28a81] [@ wl_proxy_create_wrapper | libEGL_mesa.so.0@0x236b9 ] [@ <name omitted> | <name omitted> | dri2_allocate_textures]
Crash Signature: [@ dri2_wl_swap_buffers_with_damage] [@ libEGL_mesa.so.0@0x28a81] [@ wl_proxy_create_wrapper | libEGL_mesa.so.0@0x236b9 ] [@ <name omitted> | <name omitted> | dri2_allocate_textures] → [@ dri2_wl_swap_buffers_with_damage] [@ libEGL_mesa.so.0@0x28a81] [@ wl_proxy_create_wrapper | libEGL_mesa.so.0@0x236b9 ] [@ <name omitted> | <name omitted> | dri2_allocate_textures] [@ wl_proxy_create_wrapper | libEGL_mesa.so.0@0x255c8]

(Matt Fagnani from bug 1656415 comment #3)

Firefox Nightly 81.0a1 (2020-8-5) on Wayland with WebRender enabled in Plasma 5.19.4 in Fedora Rawhide crashed when I clicked to disable Tracking protection in the Tracking protection popup. A segmentation fault occurred in wl_proxy_create_wrapper at src/wayland-client.c:2237 in libwayland-client-1.18.0-2.fc33.x86_64 in the Renderer thread, as in the crash I originally reported here. The crash address was 0x0, so a null pointer dereference might have happened. https://crash-stats.mozilla.org/report/index/23ad0563-e71b-4986-b183-cff7f0200805

The functions in frames 1 and 2 were similar to those in the original crash.
1 dri2_wl_create_window_surface in ../src/egl/drivers/dri2/platform_wayland.c:377 in mesa-libEGL-20.1.4-2.fc33.x86_64
2 _eglCreateWindowSurfaceCommon in ../src/egl/main/eglapi.c:971 in mesa-libEGL-20.1.4-2.fc33.x86_64

Crash Signature: [@ dri2_wl_swap_buffers_with_damage] [@ libEGL_mesa.so.0@0x28a81] [@ wl_proxy_create_wrapper | libEGL_mesa.so.0@0x236b9 ] [@ <name omitted> | <name omitted> | dri2_allocate_textures] [@ wl_proxy_create_wrapper | libEGL_mesa.so.0@0x255c8] → [@ dri2_wl_swap_buffers_with_damage] [@ libEGL_mesa.so.0@0x28a81] [@ wl_proxy_create_wrapper | libEGL_mesa.so.0@0x236b9 ] [@ <name omitted> | <name omitted> | dri2_allocate_textures] [@ wl_proxy_create_wrapper | libEGL_mesa.so.0@0x255c8] [@ <name omi…
Crash Signature: omitted> | <name omitted> | <name omitted> | mozilla::gl::CreateSurfaceFromNativeWindow ] → omitted> | <name omitted> | <name omitted> | mozilla::gl::CreateSurfaceFromNativeWindow ] [@ wl_proxy_create_wrapper | dri2_wl_create_window_surface]
No longer depends on: 1650583

I saw a segmentation fault in dri2_wl_swap_buffers_with_damage at ../src/egl/drivers/dri2/platform_wayland.c:1111 in mesa-libEGL-20.2.0~rc4-1.fc33.x86_64 in Nightly 82.0a1 (2020-9-12) on Wayland with Webrender enabled in Plasma 5.19.5 in Fedora 33. I clicked on Bookmarks in the menu bar then Bookmark This Page when the crash happened. https://crash-stats.mozilla.org/report/index/f484f315-0319-42cb-89a1-7c4590200912

platform_wayland.c:1111 was dri2_surf->wl_win->attached_width = dri2_surf->base.Width;
The crash might've involved a race condition in which dri2_surf or dri2_surf->wl_win was occasionally freed or corrupted before it was used. The crashes in update_buffers in mesa-libEGL at https://bugzilla.mozilla.org/show_bug.cgi?id=1655120 involved dri2_surf or dri2_surf->wl_win being an invalid pointer and using the bookmarks menu or toolbar buttons. The crashes in wl_proxy_create_wrapper at https://bugzilla.mozilla.org/show_bug.cgi?id=1656415 happened when the Wayland proxy was an invalid pointer and using the bookmarks menu or toolbar buttons.

The bug persists in Firefox 82.0.2 and is very annoying.

The fallback to gl/basic appears to be fixed by bug 1681107 - (try: https://treeherder.mozilla.org/jobs?repo=try&revision=9f07ace949c2dfd903d1fa002d74e3016f42214a). Jan (or somebody else affected), can you confirm this?

As for the crash in dri2_wl_swap_buffers_with_damage, I think it can still happen under very unfortunate circumstances - and that we could avoid it by locking the wl_surface while swapping buffers, see bug 1680961
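
The locking idea above could be sketched roughly as follows. This is a minimal illustration and not the code from bug 1680961: `SurfaceGuard`, `Swap` and `Release` are hypothetical names, and an `int` stands in for the wl_surface/EGL surface pair. The only point is that the render thread's swap and the main thread's teardown take the same mutex, so a swap never runs against a surface that is being destroyed.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a widget that owns a native surface.
// In real code this would wrap the wl_surface / EGLSurface pair.
class SurfaceGuard {
 public:
  SurfaceGuard() : mSurface(std::make_unique<int>(0)) {}

  // Called on the render thread. Returns false if the surface is gone.
  bool Swap() {
    std::lock_guard<std::mutex> lock(mMutex);
    if (!mSurface) {
      return false;  // Surface already released: skip the swap.
    }
    ++*mSurface;  // Placeholder for the actual buffer swap.
    return true;
  }

  // Called on the main thread when the popup is hidden.
  void Release() {
    std::lock_guard<std::mutex> lock(mMutex);
    mSurface.reset();
  }

 private:
  std::mutex mMutex;
  std::unique_ptr<int> mSurface;
};
```

With this shape, a swap that arrives after `Release()` is skipped instead of dereferencing a freed surface. The cost is that hiding a popup may briefly block on an in-flight swap.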

I haven't seen this in quite a while after bug 1681107 landed, thus closing. Please report back and/or reopen if you still see this issue.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME

Today I saw this crash: bp-1a5e6f53-ee30-4f54-b1f9-520350220801

When the crash happened, I had two monitors and was using a single full-screen Firefox window on one of them.

  1. Drag a tab out of the browser and drop it onto the other monitor
  2. A new Firefox window opens
  3. In the meantime, I installed the extension from bug 1683612 comment 0 via about:debugging
  4. Click the extension icon on the toolbar in the new browser window
  5. Change the resolution of the other monitor
  6. Repeat 4) and 5) several times
  7. Click the Multi-Account Containers icon on the toolbar
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---

The severity field for this bug is relatively low, S3. However, the bug has 4 duplicates.
:bhood, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(bhood)
Flags: needinfo?(bhood)

I think I have a backtrace for it:

Hiding windows in main thread:

#0  0x00007f020ff0b12b in munmap () at /lib64/libc.so.6
#1  0x00007f020e967a5a in gdk_wayland_cairo_surface_destroy (p=0x7f01a7477eb0) at ../gdk/wayland/gdkdisplay-wayland.c:1354
#2  0x00007f020e38268f in _cairo_user_data_array_fini () at /lib64/libcairo.so.2
#3  0x00007f020e3d65f4 in cairo_surface_destroy () at /lib64/libcairo.so.2
#4  0x00007f020e972274 in drop_cairo_surfaces (window=window@entry=0x7f01af353de0) at ../gdk/wayland/gdkwindow-wayland.c:311
#5  0x00007f020e977eed in gdk_wayland_window_hide_surface (window=0x7f01af353de0) at ../gdk/wayland/gdkwindow-wayland.c:3451
#6  0x00007f020e949af0 in gdk_window_withdraw (window=0x7f01af353de0) at ../gdk/gdkwindow.c:5782
#7  0x00007f020ed8c2cd in gtk_window_unmap (widget=0x7f01aef6a360) at ../gtk/gtkwindow.c:6426
#8  0x00007f020f267db0 in g_closure_invoke () at /lib64/libgobject-2.0.so.0
#9  0x00007f020f294185 in signal_emit_unlocked_R.isra.0 () at /lib64/libgobject-2.0.so.0
#10 0x00007f020f284a2e in g_signal_emit_valist () at /lib64/libgobject-2.0.so.0
#11 0x00007f020f284cb3 in g_signal_emit () at /lib64/libgobject-2.0.so.0
#12 0x00007f020ed67e84 in gtk_widget_unmap (widget=0x7f01aef6a360) at ../gtk/gtkwidget.c:5085
#13 0x00007f020ed8be0b in gtk_window_hide (widget=0x7f01aef6a360) at ../gtk/gtkwindow.c:6221
#14 0x00007f020f267db0 in g_closure_invoke () at /lib64/libgobject-2.0.so.0
#15 0x00007f020f294185 in signal_emit_unlocked_R.isra.0 () at /lib64/libgobject-2.0.so.0
#16 0x00007f020f284a2e in g_signal_emit_valist () at /lib64/libgobject-2.0.so.0
#17 0x00007f020f284cb3 in g_signal_emit () at /lib64/libgobject-2.0.so.0
#18 0x00007f020ed6dfaf in gtk_widget_hide (widget=0x7f01aef6a360) at ../gtk/gtkwidget.c:4953
#19 0x00007f020792d147 in nsWindow::HideWaylandPopupWindow(bool, bool) (this=this@entry=0x7f01aee28400, aTemporaryHide=<optimized out>, aRemoveFromPopupList=<optimized out>)
    at /raid/src3/widget/gtk/nsWindow.cpp:1213
#20 0x00007f020792d9aa in nsWindow::WaylandPopupRemoveClosedPopups() (this=this@entry=0x7f01aee28400) at /raid/src3/widget/gtk/nsWindow.cpp:1262
#21 0x00007f02079316de in nsWindow::UpdateWaylandPopupHierarchy() (this=this@entry=0x7f01aee28400) at /raid/src3/widget/gtk/nsWindow.cpp:1830
#22 0x00007f0207928c3e in nsWindow::NativeShow(bool) (this=this@entry=0x7f01aee28400, aAction=<optimized out>) at /raid/src3/widget/gtk/nsWindow.cpp:6485
#23 0x00007f020792a66a in nsWindow::Show(bool) (this=0x7f01aee28400, aState=false) at /raid/src3/widget/gtk/nsWindow.cpp:940
#24 0x00007f02078b9556 in nsView::DoResetWidgetBounds(bool, bool) (this=<optimized out>, aMoveOnly=false, aInvalidateChangedSize=false) at /raid/src3/view/nsView.cpp:320
#25 0x00007f02078bbe0e in nsView::ResetWidgetBounds(bool, bool) (aRecurse=false, aForceSync=true, this=<optimized out>) at /raid/src3/view/nsView.cpp:189
#26 nsViewManager::ProcessPendingUpdatesForView(nsView*, bool) (this=0x7f01bdd2dbc0, aView=<optimized out>, aFlushDirtyRegion=false) at /raid/src3/view/nsViewManager.cpp:355
#27 0x00007f0207b5b94b in mozilla::PresShell::DoFlushPendingNotifications(mozilla::ChangesToFlush) (this=0x7f01bdd3c000, aFlush=...) at /raid/src3/layout/base/PresShell.cpp:4424
#28 0x00007f0207b31644 in mozilla::PresShell::FlushPendingNotifications(mozilla::ChangesToFlush) (this=0x7f01bdd3c000, aType=...)
    at /raid/src3/objdir-opt/dist/include/mozilla/PresShell.h:1470
#29 nsRefreshDriver::Tick(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp, nsRefreshDriver::IsExtraTick)
    (this=0x7f01c0495400, aId=aId@entry=..., aNowTime=..., aIsExtraTick=aIsExtraTick@entry=nsRefreshDriver::IsExtraTick::No) at /raid/src3/layout/base/nsRefreshDriver.cpp:2599
#30 0x00007f0207b377f9 in mozilla::RefreshDriverTimer::TickDriver(nsRefreshDriver*, mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp)
    (driver=0x7f01a68f5000, aId=..., now=...) at /raid/src3/layout/base/nsRefreshDriver.cpp:375
#31 mozilla::RefreshDriverTimer::TickRefreshDrivers(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp, nsTArray<RefPtr<nsRefreshDriver> >&)
    (this=this@entry=0x7f01bdd143a0, aId=aId@entry=..., aNow=..., aNow@entry=..., aDrivers=nsTArray<RefPtr<nsRefreshDriver> > & = {...}) at /raid/src3/layout/base/nsRefreshDriver.cpp:353
#32 0x00007f0207b37674 in mozilla::RefreshDriverTimer::Tick(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp)
     (this=this@entry=0x7f01bdd143a0, aId=aId@entry=..., now=now@entry=...) at /raid/src3/layout/base/nsRefreshDriver.cpp:369
#33 0x00007f0207b37581 in mozilla::VsyncRefreshDriverTimer::RunRefreshDrivers(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp) (this=0x7f01a68f5000, 
    this@entry=0x7f01bdd143a0, aId=..., aId@entry=..., aTimeStamp=..., aTimeStamp@entry=...) at /raid/src3/layout/base/nsRefreshDriver.cpp:896
#34 0x00007f0207b37025 in mozilla::VsyncRefreshDriverTimer::TickRefreshDriver(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp)
    (this=this@entry=0x7f01bdd143a0, aId=..., aVsyncTimestamp=...) at /raid/src3/layout/base/nsRefreshDriver.cpp:810
#35 0x00007f0207b36d81 in mozilla::VsyncRefreshDriverTimer::NotifyVsyncOnMainThread(mozilla::VsyncEvent const&) (this=this@entry=0x7f01bdd143a0, aVsyncEvent=...)
    at /raid/src3/layout/base/nsRefreshDriver.cpp:731
#36 0x00007f0207b36aea in mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::NotifyVsyncTimerOnMainThread() (this=<optimized out>)
    at /raid/src3/layout/base/nsRefreshDriver.cpp:594
#37 0x00007f0207b378bd in mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::NotifyVsync(mozilla::VsyncEvent const&)::{lambda()#1}::operator()() const (this=<optimized out>)
    at /raid/src3/layout/base/nsRefreshDriver.cpp:566
#38 mozilla::detail::RunnableFunction<mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::NotifyVsync(mozilla::VsyncEvent const&)::{lambda()#1}>::Run() (this=<optimized out>)
    at /raid/src3/objdir-opt/dist/include/nsThreadUtils.h:531
#39 0x00007f0204ec5ab9 in mozilla::RunnableTask::Run() (this=0x7f01a82bc680) at /raid/src3/xpcom/threads/TaskController.cpp:538

Rendering content by WR:

#0  0x00007f020fe8ec4c in __pthread_kill_implementation () from /lib64/libc.so.6
(gdb) bt
#0  0x00007f020fe8ec4c in __pthread_kill_implementation () at /lib64/libc.so.6
#1  0x00007f020fe3e9c6 in raise () at /lib64/libc.so.6
#2  0x00007f0208ea2b55 in nsProfileLock::FatalSignalHandler(int, siginfo_t*, void*) (signo=11, info=0x7f01d4b2b030, context=<optimized out>)
    at /raid/src3/toolkit/profile/nsProfileLock.cpp:174
#3  0x00007f02098eced9 in WasmTrapHandler(int, siginfo_t*, void*) (signum=11, info=0x7f01d4b2b030, context=0x7f01d4b2af00) at /raid/src3/js/src/wasm/WasmSignalHandlers.cpp:783
#4  0x00007f020fe3ea70 in <signal handler called> () at /lib64/libc.so.6
#5  0x00007f01d3788841 in dri2_wl_swap_buffers_with_damage (disp=0x7f01eec91f00, draw=0x7f01a7e68800, rects=0x7f01a75c67a0, n_rects=1) at ../src/egl/drivers/dri2/platform_wayland.c:1551
#6  0x00007f01d377e0fc in dri2_swap_buffers_with_damage (disp=0x7f01eec91f00, surf=0x7f01a7e68800, rects=0x7f01a75c67a0, n_rects=1) at ../src/egl/drivers/dri2/egl_dri2.c:2038
#7  0x00007f01d376faf3 in _eglSwapBuffersWithDamageCommon (disp=0x7f01eec91f00, surf=0x7f01a7e68800, rects=0x7f01a75c67a0, n_rects=1) at ../src/egl/main/eglapi.c:1386
#8  0x00007f0205866211 in mozilla::gl::GLLibraryEGL::fSwapBuffersWithDamage(void*, void*, int const*, int)
    (this=0x51868400000000, dpy=0x7f020fd3eba8, rects=0x7f01a75c67a0, n_rects=1, surface=<optimized out>) at /raid/src3/gfx/gl/GLLibraryEGL.h:510
#9  mozilla::gl::EglDisplay::fSwapBuffersWithDamage(void*, int const*, int) (this=0x0, rects=0x7f01a75c67a0, n_rects=1, surface=<optimized out>) at /raid/src3/gfx/gl/GLLibraryEGL.h:939
#10 mozilla::gl::GLContextEGL::SwapBuffers() (this=0x7f01f4378c00) at /raid/src3/gfx/gl/GLContextProviderEGL.cpp:537
#11 0x00007f0205af5385 in mozilla::wr::RenderCompositorEGL::EndFrame(nsTArray<mozilla::wr::Box2D<int, mozilla::wr::DevicePixel> > const&) (this=0x7f01b2478c00, aDirtyRects=<optimized out>)
    at /raid/src3/gfx/webrender_bindings/RenderCompositorEGL.cpp:148
#12 0x00007f0205b010e9 in mozilla::wr::RendererOGL::UpdateAndRender(mozilla::Maybe<mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> > const&, mozilla::Maybe<mozilla::wr::ImageFormat> const&, mozilla::Maybe<mozilla::Range<unsigned char> > const&, bool*, mozilla::wr::RendererStats*)
    (this=0x7f01ae3fdc80, aReadbackSize=..., aReadbackFormat=..., aReadbackBuffer=..., aNeedsYFlip=0x0, aOutStats=aOutStats@entry=0x7f01d4b2b7a0)
    at /raid/src3/gfx/webrender_bindings/RendererOGL.cpp:223
#13 0x00007f0205b0054b in mozilla::wr::RenderThread::UpdateAndRender(mozilla::wr::WrWindowId, mozilla::layers::BaseTransactionId<mozilla::VsyncIdType> const&, mozilla::TimeStamp const&, bool, mozilla::Maybe<mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> > const&, mozilla::Maybe<mozilla::wr::ImageFormat> const&, mozilla::Maybe<mozilla::Range<unsigned char> > const&, bool*) (this=this@entry=0x7f01eec5b040, aWindowId=aWindowId@entry=..., aStartId=..., aStartTime=..., aRender=true, aReadbackSize=..., aReadbackFormat=..., aReadbackBuffer=..., aNeedsYFlip=0x0)
    at /raid/src3/gfx/webrender_bindings/RenderThread.cpp:565
#14 0x00007f0205afff9b in mozilla::wr::RenderThread::HandleFrameOneDoc(mozilla::wr::WrWindowId, bool) (this=0x7f01eec5b040, aWindowId=..., aRender=<optimized out>)
    at /raid/src3/gfx/webrender_bindings/RenderThread.cpp:411
#15 0x00007f0205b0760f in mozilla::detail::RunnableMethodArguments<mozilla::wr::WrWindowId, bool>::applyImpl<mozilla::wr::RenderThread, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, bool), StoreCopyPassByConstLRef<mozilla::wr::WrWindowId>, StoreCopyPassByConstLRef<bool>, 0ul, 1ul>(mozilla::wr::RenderThread*, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, bool), mozilla::Tuple<StoreCopyPassByConstLRef<mozilla::wr::WrWindowId>, StoreCopyPassByConstLRef<bool> >&, std::integer_sequence<unsigned long, 0ul, 1ul>)
    (o=<optimized out>, m=<optimized out>, args=...) at /raid/src3/objdir-opt/dist/include/nsThreadUtils.h:1147
#16 mozilla::detail::RunnableMethodArguments<mozilla::wr::WrWindowId, bool>::apply<mozilla::wr::RenderThread, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, bool)>(mozilla::wr::RenderThread*, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, bool)) (this=0x38, o=<optimized out>, m=<optimized out>)
    at /raid/src3/objdir-opt/dist/include/nsThreadUtils.h:1153
#17 mozilla::detail::RunnableMethodImpl<mozilla::wr::RenderThread*, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, bool), true, (mozilla::RunnableKind)0, mozilla::wr::WrWindowId, bool>::Run() (this=0x0) at /raid/src3/objdir-opt/dist/include/nsThreadUtils.h:1200
#18 0x00007f0204eb8f77 in nsThread::ProcessNextEvent(bool, bool*) (this=0x7f01f1db5c40, aMayWait=<optimized out>, aResult=0x7f01d4b2bb1f) at /raid/src3/xpcom/threads/nsThread.cpp:1199
#19 0x00007f0204ebca7b in NS_ProcessNextEvent(nsIThread*, bool) (aThread=0x7f020fd3eba8, aThread@entry=0x7f01f1db5c40, aMayWait=true) at /raid/src3/xpcom/threads/nsThreadUtils.cpp:465
#20 0x00007f020554850a in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) (this=0x7f01eda745c0, aDelegate=0x7f01d4b2bbd0)
    at /raid/src3/ipc/glue/MessagePump.cpp:330
#21 0x00007f02054fbfd6 in MessageLoop::RunInternal() (this=0x81) at /raid/src3/ipc/chromium/src/base/message_loop.cc:381
#22 MessageLoop::RunHandler() (this=0x81) at /raid/src3/ipc/chromium/src/base/message_loop.cc:374
#23 MessageLoop::Run() (this=0x81) at /raid/src3/ipc/chromium/src/base/message_loop.cc:356
#24 0x00007f0204eb655e in nsThread::ThreadFunc(void*) (aArg=0x7f01f1dbaf80) at /raid/src3/xpcom/threads/nsThread.cpp:384
#25 0x00007f02100f8789 in _pt_root (arg=0x7f01eda5c700) at /raid/src3/nsprpub/pr/src/pthreads/ptthread.c:201
#26 0x00007f020fe8ce2d in start_thread () at /lib64/libc.so.6
#27 0x00007f020ff121b0 in clone3 () at /lib64/libc.so.6

It can be reproduced with large popups that use HW WebRender.

Copying crash signatures from duplicate bugs.

Crash Signature: omitted> | <name omitted> | <name omitted> | mozilla::gl::CreateSurfaceFromNativeWindow ] [@ wl_proxy_create_wrapper | dri2_wl_create_window_surface] → omitted> | <name omitted> | <name omitted> | mozilla::gl::CreateSurfaceFromNativeWindow ] [@ wl_proxy_create_wrapper | dri2_wl_create_window_surface] [@ dri2_wl_swap_buffers_with_damage.lto_priv.0]

It looks like we call dri2_wl_swap_buffers_with_damage() after dri2_wl_release_buffers().

dri2_wl_swap_buffers_with_damage():

(gdb) p* dri2_surf->current
$5 = {
  wl_buffer = 0x0,
  wl_release = false,
  dri_image = 0x0,
  linear_copy = 0x0,
  data = 0x0,
  data_size = 0,
  bo = 0x0,
  locked = false,
  age = 1
}  
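
The dump above shows a current buffer whose pointers are all null, which is consistent with a swap running after the buffers were released. A hedged sketch of that hazard follows. `ColorBuffer`, `Surface`, `ReleaseBuffers` and `CanSwap` are simplified, hypothetical models for illustration only, not Mesa's real structures or functions.

```cpp
#include <cassert>

// Simplified, hypothetical model of one color buffer slot.
struct ColorBuffer {
  void* wl_buffer = nullptr;
  void* dri_image = nullptr;
  bool locked = false;
};

// Simplified, hypothetical model of a window surface.
struct Surface {
  ColorBuffer buffers[4];
  ColorBuffer* current = nullptr;
};

// Models the observed state after a release: every slot is emptied.
// `current` is deliberately left pointing at an emptied slot, which is
// the situation the gdb dump suggests (an assumption of this sketch).
void ReleaseBuffers(Surface& s) {
  for (ColorBuffer& b : s.buffers) {
    b = ColorBuffer{};
  }
}

// A swap is only meaningful if the current slot still has backing objects.
bool CanSwap(const Surface& s) {
  return s.current && s.current->dri_image && s.current->wl_buffer;
}
```

In this model a caller that checks `CanSwap()` after `ReleaseBuffers()` gets `false` and can skip the frame. A caller that swaps unconditionally would touch null members, which matches the kind of crash seen in the backtrace.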
Crash Signature: omitted> | <name omitted> | <name omitted> | mozilla::gl::CreateSurfaceFromNativeWindow ] [@ wl_proxy_create_wrapper | dri2_wl_create_window_surface] [@ dri2_wl_swap_buffers_with_damage.lto_priv.0] → omitted> | <name omitted> | <name omitted> | mozilla::gl::CreateSurfaceFromNativeWindow ] [@ wl_proxy_create_wrapper | dri2_wl_create_window_surface] [@ dri2_wl_swap_buffers_with_damage.lto_priv.0]

I wonder if we paint too early, as I'm getting these log entries when running in the debugger:

[Parent 64779: Main Thread]: D/WidgetPopup [7fff955cc400]:   moz_container_wayland_add_or_fire_initial_draw_callback ConfigureCompositor
[Parent 64779: Main Thread]: D/WidgetPopup [7fff955cc400]: nsWindow::ResumeCompositorImpl()
[Parent 64779: Main Thread]: D/WidgetPopup GtkCompositorWidget::EnableRendering() [7fff955cc400]
[Parent 64779: Main Thread]: D/WidgetPopup   quit, mIsRenderingSuspended = false
[Parent 64779: Main Thread]: D/Widget [7fff955cc400] moz_container_wayland_add_or_fire_initial_draw_callback set visible

and

(gdb) p dri2_surf->color_buffers
$6 = {{
    wl_buffer = 0x0,
    wl_release = false,
    dri_image = 0x0,
    linear_copy = 0x0,
    data = 0x0,
    data_size = 0,
    bo = 0x0,
    locked = false,
    age = 0
  }, {
    wl_buffer = 0x0,
    wl_release = false,
    dri_image = 0x0,
    linear_copy = 0x0,
    data = 0x0,
    data_size = 0,
    bo = 0x0,
    locked = false,
    age = 0
  }, {
    wl_buffer = 0x0,
    wl_release = false,
    dri_image = 0x0,
    linear_copy = 0x0,
    data = 0x0,
    data_size = 0,
    bo = 0x0,
    locked = false,
    age = 0
  }, {
    wl_buffer = 0x0,
    wl_release = false,
    dri_image = 0x0,
    linear_copy = 0x0,
    data = 0x0,
    data_size = 0,
    bo = 0x0,
    locked = false,
    age = 1
  }}

Definitely a Firefox bug, looking at it.

Component: Graphics → Widget: Gtk
Priority: -- → P2

Copying crash signatures from duplicate bugs.

Crash Signature: omitted> | <name omitted> | <name omitted> | mozilla::gl::CreateSurfaceFromNativeWindow ] [@ wl_proxy_create_wrapper | dri2_wl_create_window_surface] [@ dri2_wl_swap_buffers_with_damage.lto_priv.0] → omitted> | <name omitted> | <name omitted> | mozilla::gl::CreateSurfaceFromNativeWindow ] [@ wl_proxy_create_wrapper | dri2_wl_create_window_surface] [@ dri2_wl_swap_buffers_with_damage.lto_priv.0] [@ dri2_query_image]

Map/unmap signals create and delete the mContainer Wayland surface and EGL window.

As we need to call the handlers in the correct order (mContainer::map -> nsWindow::map and nsWindow::unmap -> mContainer::unmap),
connect the signals to the mContainer widget and call mContainer::unmap from nsWindow::unmap.

Then nsWindow::unmap can update the compositor before the wl_surface/EGL window is released by mContainer.
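
The ordering described above can be sketched as follows. `Container` and `Window` are hypothetical stand-ins, not the real mContainer or nsWindow code. The sketch only shows why the window's unmap handler has to run first and then call the container's unmap itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: the container owns the native surface.
struct Container {
  bool hasSurface = false;
  void Map() { hasSurface = true; }
  void Unmap() { hasSurface = false; }
};

// Hypothetical model: the window owns the compositor state that
// draws into the container's surface.
struct Window {
  Container container;
  bool compositorActive = false;
  std::vector<std::string> log;

  // Map: create the surface first, then start the compositor on it.
  void OnMap() {
    container.Map();
    log.push_back("container map");
    compositorActive = container.hasSurface;
    log.push_back("window map");
  }

  // Unmap: stop the compositor while the surface still exists,
  // then let the container release it.
  void OnUnmap() {
    assert(container.hasSurface);
    compositorActive = false;
    log.push_back("window unmap");
    container.Unmap();
    log.push_back("container unmap");
  }
};
```

If the two unmap handlers were connected independently, the container could release the surface while the compositor is still active. Calling the container's unmap from the window's handler fixes the order by construction.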

Assignee: nobody → stransky

When a GtkWidget is hidden, the underlying wl_surface is deleted. We also need to update the EGLSurface of the GtkWidget (GtkCompositorWidget),
as the EGLSurface is directly linked to the wl_surface:

  • When the GtkWidget is hidden, call GtkCompositorWidget::DisableRendering(). That releases the GtkCompositorWidget resources
    related to the GtkWidget (XWindow/XVisual etc.) and marks the widget as hidden.
  • If the GtkWidget is backed by EGL, call compositor resume, which forces the compositor to create a new EGLSurface.
  • Make sure GLContextEGL can create an EGLSurface even when the GtkWidget is hidden and the wl_surface is missing.
    This prevents a fallback to SW rendering or a paused RenderCompositorEGL, which leads to Bug 1777664 (whole browser UI freeze).
  • Return early from RenderCompositorEGL::BeginFrame()/RenderCompositorEGL::EndFrame() when the GtkCompositorWidget is hidden.
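
The last bullet above, the early return while the widget is hidden, could look roughly like this. It is a sketch under stated assumptions: `CompositorWidget` and `RenderCompositor` are hypothetical stand-ins for GtkCompositorWidget and RenderCompositorEGL, and `IsHidden()` and `SwapCount()` are invented here for illustration. The actual patch may be structured differently.

```cpp
#include <cassert>

// Hypothetical stand-in for the compositor widget. DisableRendering()
// marks it hidden, EnableRendering() makes it drawable again.
class CompositorWidget {
 public:
  void DisableRendering() { mHidden = true; }
  void EnableRendering() { mHidden = false; }
  bool IsHidden() const { return mHidden; }

 private:
  bool mHidden = false;
};

// Hypothetical stand-in for the render compositor.
class RenderCompositor {
 public:
  explicit RenderCompositor(const CompositorWidget& aWidget)
      : mWidget(aWidget) {}

  // Returns false when there is nothing to draw into.
  bool BeginFrame() { return !mWidget.IsHidden(); }

  // Returns true only if a swap actually happened.
  bool EndFrame() {
    if (mWidget.IsHidden()) {
      return false;  // No native surface: skip the buffer swap.
    }
    ++mSwapCount;  // Placeholder for the real SwapBuffers() call.
    return true;
  }

  int SwapCount() const { return mSwapCount; }

 private:
  const CompositorWidget& mWidget;
  int mSwapCount = 0;
};
```

The hidden check on its own does not remove the race between the main thread and the render thread. It relies on the hide and the render thread's view of it being synchronized, which is presumably why the patch series also changes how the compositor is paused and resumed.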

Depends on D157357

This can be tested with large popups which are accelerated.
try: https://treeherder.mozilla.org/jobs?repo=try&revision=4952df88411933a8cc075610c150529b5ae7ac87

Crash Signature: omitted> | <name omitted> | <name omitted> | mozilla::gl::CreateSurfaceFromNativeWindow ] [@ wl_proxy_create_wrapper | dri2_wl_create_window_surface] [@ dri2_wl_swap_buffers_with_damage.lto_priv.0] [@ dri2_query_image] → omitted> | <name omitted> | <name omitted> | mozilla::gl::CreateSurfaceFromNativeWindow ] [@ wl_proxy_create_wrapper | dri2_wl_create_window_surface] [@ dri2_wl_swap_buffers_with_damage.lto_priv.0] [@ dri2_query_image]

I tested the latest try build and found that the hamburger menu is almost always cropped when I open it (basically bug 1717351, but worse than ever). Martin, can you reproduce this?

Pushed by stransky@redhat.com:
https://hg.mozilla.org/integration/autoland/rev/5ba3a56b3038 [Wayland] Attach map/unmap signals to mContainer r=emilio
https://hg.mozilla.org/integration/autoland/rev/6841dc516087 [Wayland] Update EGLSurface when wl_surface is deleted r=emilio,jgilbert
https://hg.mozilla.org/integration/autoland/rev/ca32ff38fc67 [Linux] Don't use SendResumeAsync() as we need to sync render thread with main thread where Gtk operates r=emilio
https://hg.mozilla.org/integration/autoland/rev/27a918234048 [Wayland] Stop vsync before GtkWindow si hidden r=emilio
https://hg.mozilla.org/integration/autoland/rev/4f164c3d0ba9 [Linux] Call unmap for X11 mozcontainer r=emilio

Will look at it, Thanks.

Attached file Bug 1645677 [Linux] Fix build on X11 r?emilio (obsolete) (deleted)

Depends on D157795

Attachment #9295730 - Attachment is obsolete: true

Depends on D157629

Flags: needinfo?(stransky)
Pushed by stransky@redhat.com:
https://hg.mozilla.org/integration/autoland/rev/2bc535493586 [Wayland] Attach map/unmap signals to mContainer r=emilio
https://hg.mozilla.org/integration/autoland/rev/e069eb737805 [Wayland] Update EGLSurface when wl_surface is deleted r=emilio,jgilbert
https://hg.mozilla.org/integration/autoland/rev/58442f6d4429 [Linux] Don't use SendResumeAsync() as we need to sync render thread with main thread where Gtk operates r=emilio
https://hg.mozilla.org/integration/autoland/rev/f94e6c6f35e3 [Wayland] Stop vsync before GtkWindow si hidden r=emilio
https://hg.mozilla.org/integration/autoland/rev/8292303219f9 [Linux] Call unmap for X11 mozcontainer r=emilio
https://hg.mozilla.org/integration/autoland/rev/3825a87d293c [Linux] Fix build on X11 r=emilio
Regressions: 1792315
Regressions: 1792568
Regressions: 1788247
Flags: qe-verify+

(In reply to Hiroyuki Ikezoe (:hiro) from comment #13)

Today I saw this crash: bp-1a5e6f53-ee30-4f54-b1f9-520350220801

When the crash happened, I had two monitors and was using a single full-screen Firefox window on one of them.

  1. Drag a tab out of the browser and drop it onto the other monitor
  2. A new Firefox window opens
  3. In the meantime, I installed the extension from bug 1683612 comment 0 via about:debugging
  4. Click the extension icon on the toolbar in the new browser window
  5. Change the resolution of the other monitor
  6. Repeat 4) and 5) several times
  7. Click the Multi-Account Containers icon on the toolbar

I am attempting to verify this fix, but I could not reproduce the crash. There was no crash and all the panels seem to be painted correctly. Could you help by answering some questions?

  1. First, are these the steps that are supposed to have been fixed?
  2. In step 5, which monitor should have its resolution changed? What resolution should it be set to? (Any with an incompatible aspect ratio?)
  3. How many tries did it take to reproduce in your case?
  4. For step 7, I suppose I'd need to install the "Multi-Account Containers" addon, right?
  5. Is there anything else missing?

Thank you!

Flags: needinfo?(stransky)

It depends on whether the addon popup is large enough, or uses remote content, so that WebRender is used for it. If you want to reproduce it, use the following steps:

  1. Install an addon with a large popup - I used https://addons.mozilla.org/en-US/firefox/addon/youtube-high-definition/
  2. Run Firefox with the Wayland backend and click repeatedly between the Pocket popup and the addon popup.
  3. You may see a browser freeze after some time.
Flags: needinfo?(stransky)

The steps used to reproduce are as follows:

  1. Opened an affected build, Nightly v107.0a1 from 2022-09-20 (with the Wayland window protocol).
  2. Dragged a tab from this window to a second monitor so that a new Firefox window opened.
  3. Installed Multi-Account Containers and YouTube High Definition (and other addons that have big/complex pop-ups when clicking the icons in the main bar).
  4. Stress-clicked the extension icons on the toolbar of the new browser window so that the pop-ups were painted.
  5. Changed the resolution of the second monitor between several of the available ones.
  6. Repeated 4) and 5) several times.

No crash or freeze was observed on my system after many tries.
This could be a compatibility issue, or some steps may be missing.

(In reply to Bodea Daniel [:danibodea] from comment #43)
Please don't spend much time on that, as it's a race condition and depends on timing and actual computer speed.

I also attempted to reproduce it on another, less powerful system, with all the mentioned addons loaded, the browser session stressed by resource-consuming websites, and the second monitor switched through all of its available resolutions. I tried both Nightly v107.0a1 from 2022-09-20 (a build a few days older than the fix) and v105.0a1 from 2022-07-31 (the exact build on which the crash was first reported), but it could not be reproduced.

Also, considering the assignee's comment 44, QA cannot manually verify this fix.

Flags: qe-verify+