Closed Bug 1225044 Opened 9 years ago Closed 8 years ago

Important graphic glitches on GNU/Linux with gtk 3

Categories

(Core :: Graphics, defect)

46 Branch
Unspecified
Linux
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla50
Tracking Status
firefox45 + wontfix
firefox46 --- wontfix
firefox47 --- wontfix
firefox48 --- wontfix
firefox49 --- verified
firefox50 --- verified

People

(Reporter: Sylvestre, Assigned: acomminos, NeedInfo)

References

Details

(Keywords: regression, Whiteboard: gfx-noted)

Attachments

(3 files)

Attached image sc.png (deleted) —
With nightly 45, I have some important graphic glitches under Linux.
I don't have any thing special and 42 beta was working fine.

The attachment shows all. I am of course available to test.
Spotted that on a VM (debian stable uptodate) with nightly available on friday.
Whiteboard: gfx-noted
Bas, any news about this? This makes the GNU/Linux version a pain to use.
We don't want this bug in aurora. Thanks
Flags: needinfo?(bas)
(In reply to Sylvestre Ledru [:sylvestre] from comment #2)
> Bas, any news about this? This makes the GNU/Linux version a pain to use.
> We don't want this bug in aurora. Thanks

Hmm, I thought this was actually fixed.

Nical, could you check this out? I suspect this may be a regression from bug 1210560.
Flags: needinfo?(bas)
Lee, you also did some clipping work on GNU linux recently, may also be related.
Assignee: nobody → nical.bugzilla
Flags: needinfo?(lsalzman)
I think this is a dup of 1224974, so already fixed in currently nightlies.
(In reply to Nathan Froyd [:froydnj] from comment #5)
> I think this is a dup of 1224974, so already fixed in currently nightlies.

Sylvestre, can you check if this is now fixed or if it is still occuring for you?
Flags: needinfo?(lsalzman) → needinfo?(sledru)
Yes, it is still occurring on a locally built nightly. However, it is not appearing all the time but it is easy to reproduce (5 clicks on the hamburger menu).
I can do a demo to nical if needed.
Flags: needinfo?(sledru)
(In reply to Sylvestre Ledru [:sylvestre] from comment #7)
> Yes, it is still occurring on a locally built nightly. However, it is not
> appearing all the time but it is easy to reproduce (5 clicks on the
> hamburger menu).
> I can do a demo to nical if needed.

Hmm, I tried a few different pref combinations (layers acceleration on/off, cairo vs. skia, etc.) and I can't manage to reproduce any of the reported symptoms here on a recent Linux nightly.
I can also reproduce on Nightly (build id: 20151214122242), using Ubuntu 15.10.

Graphic driver is Intel 3.0 Mesa 11.0.2
I also spotted some glitches on some B2G devices, namely at first Xperia Z1 Compact and Xperia M2, based of AOSP. Then I have also spotted them on Wileyfox Switft running recent B2G build based of CyanogenMod so I am starting to wonder if it might be related. Glitches are often visible when entering task switcher, around the place where the app name is displayed above the screenshot.
I haven't experienced this for a while. Closing it.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
Steps to reproduce:
1. Open any versions Firefox on Ubuntu 14.04/15.10/16.04
2. menubar > Help > Firefox Tour

Regression range:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=1ee54e043b9b05d69e6a9f981aa6c4ef0dd65be3&tochange=939320b957c588ad809e9b4a64b7f232dd4d9b72
That's a GTK3 switch.
Assignee: nical.bugzilla → andrew
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Summary: Important graphic glitches on GNU/Linux → Important graphic glitches on GNU/Linux with gtk 3
The main usability issue here (at least the one I can reproduce, where tour highlights are garbage/transparent) is caused by our widget fighting for the placement of the toplevel tour highlights with the window manager.

> 864626880[7ff1cb0a24a0]: nsWindow::Move [7ff18a52bc00] 1849.000000 58.000000
> 864626880[7ff1cb0a24a0]: configure event [7ff18a52bc00] 1845 58 51 51
> 864626880[7ff1cb0a24a0]: nsWindow::Move [7ff18a52bc00] 1849.000000 58.000000
> 864626880[7ff1cb0a24a0]: configure event [7ff18a52bc00] 1845 58 51 51
> etc...

This causes the expose events on the highlights to be discarded due to the event queue overflowing. The reason this fighting occurs is because the tour highlights are of type GTK_WINDOW_TOPLEVEL, which in our current widget code enforces their position (whereas GTK_WINDOW_POPUP windows do not). I'm currently looking into reasons why the positional fighting is occurring.
Never mind, the ConfigureNotify -> resize loop is a separate, disjoint issue (which is intermittent and likely merits its own bug). Still looking into the underlying reason why expose events aren't reaching our window from GDK.
Looks like this is a GTK bug we're exposing when rapidly showing and hiding a toplevel GtkWindow.

The reason we're not getting expose events on the UITour highlights is because their frame clock is permanently frozen (hence only affecting versions of GTK with GtkFrameClock). When we call gtk_window_show on our highlight widget, GTK internally freezes the frame clock until a ConfigureNotify event is received, incrementing the window's configure_request_count for toplevel windows whose size and position may be changed by the window manager. The intent is that when a ConfigureNotify is received on the window, if configure_request_count is greater than zero it is decremented and the frame clock is thawed.

Unfortunately, there is a call to gtk_widget_hide from nsWindow before the ConfigureNotify request is received, that sets configure_request_count to zero without thawing the frame clock. This is what's causing our freeze.

We should be able to work around this; I'll write up a patch.
Comment on attachment 8754052 [details]
Bug 1225044 - Workaround GTK windows freezing on unmap.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/53692/diff/1-2/
Comment on attachment 8754052 [details]
Bug 1225044 - Workaround GTK windows freezing on unmap.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/53692/diff/2-3/
Bug 1255747 is tracked as a regression in Firefox 47 but was marked as a duplicate of this bug. Any chance of getting this fixed in what is currently beta? Or are we ok with it being a polish bug and shipping with it?
Flags: needinfo?(milan)
The patch doesn't look that large or scary, so if it passes review, we can probably uplift.
Karl, review ping (on top of all the other pings on GTK related stuff you're getting :)
Flags: needinfo?(milan) → needinfo?(karlt)
Marking as a regression as per dub bug 1255747.
(In reply to Andrew Overholt [:overholt] from comment #21)
> Bug 1255747 is tracked as a regression in Firefox 47 but was marked as a
> duplicate of this bug. Any chance of getting this fixed in what is currently
> beta? Or are we ok with it being a polish bug and shipping with it?

I can reproduce bug 1255747 in 46, which has actually already shipped.
The reporter has a different range but I suspect that may be due to certain
timing required to reproduce.

This is a bug in libgtk, which has been fixed.  A partial workaround for
faulty gtk versions may be possible, but uplifting to 47 is not appropriate at
this stage.

(In reply to Milan Sreckovic [:milan] from comment #22)
> The patch doesn't look that large or scary, so if it passes review, we can
> probably uplift.
> Karl, review ping (on top of all the other pings on GTK related stuff you're
> getting :)

I'm not happy with the patch as is because it reorders events.
That can be fixed, but I wonder whether a more reliable approach is possible, and so it may make sense to check that out before adjusting the partial workaround here.
We're still not clear why this bug is not reproducible in GTK 3.6.3, 
libgtk-3-0_3.10.8-0ubuntu1.6_amd64.deb is the oldest version I've tested and seen this.
Flags: needinfo?(karlt)
Version: Trunk → 46 Branch
Although the GTK bug a theoretically existed for unmap after resize since GTK
2.14, it only started showing up for unmap after show in GTK 3.8 due to
https://git.gnome.org/browse/gtk+/commit/gtk/gtkwindow.c?id=97ba4b1b8eb82563f13762a4bd8cfe9beb8a121c
which changed from comparing the size request in gtk_window_move_resize
against the last size request to against the last received resize
notification.

Previously a configure notification was not expected after a show/map.  Now
one is expected when the window is resized before map (and an additional
resize request is sent and the window is frozen during check-resize after
map.)

I didn't work out why Gecko is hiding the window so soon after show.

#11 0x00007fdc99f5e573 in gtk_widget_hide (widget=0x7fdc9ad76800)
    at gtkwidget.c:4268
#12 0x00007fdc916a08f6 in nsWindow::NativeShow (this=<optimized out>, 
    aAction=<optimized out>)
    at /mnt/ssd1/karl/moz/dev/widget/gtk/nsWindow.cpp:4182
#13 0x00007fdc916a64c4 in nsWindow::Show (this=<optimized out>, 
    aState=<optimized out>)
    at /mnt/ssd1/karl/moz/dev/widget/gtk/nsWindow.cpp:1095
#14 0x00007fdc91673786 in nsView::DoResetWidgetBounds (this=<optimized out>, 
    aMoveOnly=<optimized out>, aInvalidateChangedSize=<optimized out>)
    at /mnt/ssd1/karl/moz/dev/view/nsView.cpp:322
#15 0x00007fdc91675dac in nsViewManager::ProcessPendingUpdatesForView (
    this=<optimized out>, aView=<optimized out>, 
    aFlushDirtyRegion=<optimized out>)
    at /mnt/ssd1/karl/moz/dev/view/nsViewManager.cpp:399
#16 0x00007fdc9167603a in nsViewManager::UpdateWidgetGeometry (
    this=<optimized out>)
    at /mnt/ssd1/karl/moz/dev/view/nsViewManager.cpp:1132
#17 0x00007fdc918910fe in PresShell::FlushPendingNotifications (
    this=<optimized out>, aFlush=...)
    at /mnt/ssd1/karl/moz/dev/layout/base/nsPresShell.cpp:4139
#18 0x00007fdc918911b8 in PresShell::FlushPendingNotifications (
    this=<optimized out>, aType=<optimized out>)
    at /mnt/ssd1/karl/moz/dev/layout/base/nsPresShell.cpp:3980
#19 0x00007fdc90a1b522 in nsDocument::FlushPendingNotifications (
    this=<optimized out>, aType=<optimized out>)
    at /mnt/ssd1/karl/moz/dev/dom/base/nsDocument.cpp:8393
#20 0x00007fdc917935f3 in nsComputedDOMStyle::UpdateCurrentStyleSources (
    this=<optimized out>, aNeedsLayoutFlush=<optimized out>)
    at /mnt/ssd1/karl/moz/dev/layout/style/nsComputedDOMStyle.cpp:646
#21 0x00007fdc917bb6de in nsComputedDOMStyle::GetPropertyCSSValue (
    this=<optimized out>, aPropertyName=..., aRv=...)
    at /mnt/ssd1/karl/moz/dev/layout/style/nsComputedDOMStyle.cpp:808
#22 0x00007fdc917bb80f in nsComputedDOMStyle::GetPropertyValue (
    this=<optimized out>, aPropertyName=..., aReturn=...)
    at /mnt/ssd1/karl/moz/dev/layout/style/nsComputedDOMStyle.cpp:383
#23 0x00007fdc917bb8e6 in nsComputedDOMStyle::GetPropertyValue (
    this=<optimized out>, aPropID=<optimized out>, aValue=...)
    at /mnt/ssd1/karl/moz/dev/layout/style/nsComputedDOMStyle.cpp:318
#24 0x00007fdc90b1a6d6 in nsDOMCSSDeclaration::GetPaddingTop (rv=..., 
    aValue=..., this=<optimized out>)
    at /mnt/ssd1/karl/moz/dev/layout/style/nsCSSPropList.h:3409
#25 mozilla::dom::CSS2PropertiesBinding::get_paddingTop (cx=<optimized out>, 
    obj=..., self=<optimized out>, args=...)
    at /mnt/sda11/karl/obj-opt/dom/bindings/CSS2PropertiesBinding.cpp:29471
Attachment #8754052 - Flags: review?(karlt) → review-
Comment on attachment 8754052 [details]
Bug 1225044 - Workaround GTK windows freezing on unmap.

https://reviewboard.mozilla.org/r/53692/#review55512

Thanks for diagnosing the cause here, Andrew.  Now that it is fixed in GTK,
the workaround can be limited to GTK version &lt;= 3.20.

I don&#39;t really object to this approach if it is simpler, although it would be
nice to unmap sooner rather than later, and I think that is possible.

It looks like it would be possible to count the maximium number of potential
configure_event calls required by watching the check-resize signal.

This count can be decremented in OnConfigureEvent().

When unmap is to be called, GtkWindow::configure_event() can be called the
remaining number of times with size from gtk_widget_get_allocation().
Do this only for gtk_get_minor_version() &lt; 21, in case configure_event is
modified in the future.

This approach would also work even if the window manager has not yet replied
(perhaps because the configure request is still in Xlib&#39;s output buffer).

What do you think?

If you think that checking the Xlib event queue is simpler, then I&#39;d like some
changes to the event checking as indicated below.

::: widget/gtk/nsWindow.cpp:1087
(Diff revision 3)
> +    // Determine if a ConfigureNotify event is pending for this window; if we
> +    // unmap a window while GTK is waiting for a configure event to unthaw the
> +    // frame counter, we can permanently freeze the counter. See bug 1225044.
> +    mUnmapOnConfigure = !aState && HasPendingConfigure();
> +    if (mUnmapOnConfigure)
> +        return NS_OK;

This might be better in NativeShow()?
That would also take effect during resize to zero size.
The call from OnConfigureEvent() would not need to check HasPendingConfigure() itself then.

::: widget/gtk/nsWindow.cpp:2459
(Diff revision 3)
> +    if (mUnmapOnConfigure && !HasPendingConfigure()) {
> +        mUnmapOnConfigure = false;
> +        NativeShow(false);
> +    }
> +

Probably better to have this earlier in the function in case there are delays or nested event loops during NotifyWindowMoved().

::: widget/gtk/nsWindow.cpp:7008
(Diff revision 3)
> +    if (XCheckTypedWindowEvent(mXDisplay, mXWindow, ConfigureNotify, &ev)) {
> +        XPutBackEvent(mXDisplay, &ev);

The events can be kept in order by using XCheckIfEvent() with a predicate that always returns
False.

I'd also like this limited to pending configures associated with this window, so that an unmap is not delayed for a period of moving another window.
Check that the approach is still successful with that limitation.
Andrew, any news about this? Thanks
Flags: needinfo?(andrew)
(In reply to Sylvestre Ledru [:sylvestre] from comment #27)
> Andrew, any news about this? Thanks

Thanks for the reminder, lost track of this one.

> It looks like it would be possible to count the maximium number of potential
> configure_event calls required by watching the check-resize signal.

This seems like a much better approach, I'll work on a patch for this. I don't think there's a way to make checking the Xlib event queue work 100% of the time without some sort of synchronous communication with the window manager, which isn't ideal.
Flags: needinfo?(andrew)
Comment on attachment 8754052 [details]
Bug 1225044 - Workaround GTK windows freezing on unmap.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/53692/diff/3-4/
Attachment #8754052 - Attachment description: MozReview Request: Bug 1225044 - Avoid unmapping windows with a pending ConfigureNotify, fixing GDK frame counter freezes. r?karlt → Bug 1225044 - Workaround GTK windows freezing on unmap.
Attachment #8754052 - Flags: review- → review?(karlt)
Attachment #8754052 - Flags: review?(karlt) → review-
Comment on attachment 8754052 [details]
Bug 1225044 - Workaround GTK windows freezing on unmap.

https://reviewboard.mozilla.org/r/53692/#review57474

Thanks for doing this!

::: widget/gtk/mozgtk/mozgtk.c:203
(Diff revision 4)
>  STUB(gtk_entry_get_type)
>  STUB(gtk_entry_new)
>  STUB(gtk_entry_set_activates_default)
>  STUB(gtk_entry_set_text)
>  STUB(gtk_enumerate_printers)
> +STUB(gtk_events_pending)

No longer necessary, with the nested event loop removed.

::: widget/gtk/nsWindow.cpp:4198
(Diff revision 4)
>      else {
>          if (mIsTopLevel) {
> -            gtk_widget_hide(GTK_WIDGET(mShell));
> +            // Workaround window freezes on GTK versions before 3.21.2 by
> +            // ensuring that configure events get dispatched to windows before
> +            // they are unmapped. See bug 1225044.
> +            if (gtk_check_version(3, 21, 2) && mPendingConfigures > 0) {

Can you make this gtk_check_version(3, 21, 2) != nullptr please, to indicate that the rv is not boolean, and so hint that the meaning return value is not as simple as one might expect (which is the opposite).

::: widget/gtk/nsWindow.cpp:4203
(Diff revision 4)
> +            if (gtk_check_version(3, 21, 2) && mPendingConfigures > 0) {
> +                GtkAllocation allocation;
> +                gtk_widget_get_allocation(GTK_WIDGET(mShell), &allocation);
> +
> +                GdkEvent event;
> +                memset(&event, 0, sizeof(event));

PodZero(&event);

GdkEvent event = { 0 };
is fine if that works for unions, but I guess it doesn't.

GdkEvent event = GdkEvent();
is another option.

::: widget/gtk/nsWindow.cpp:4220
(Diff revision 4)
> +
> +                // Since gtk_widget_hide sets the widget's
> +                // configure_request_count to 0, ensure our mock events get
> +                // processed first.
> +                while (gtk_events_pending()) {
> +                    gtk_main_iteration();

This risks running other events beyond the configures inserted above.  Consider all nested event loops evil.  They change the order of execution, surprising calling code in all manner of ways, and there is no knowing when they are going to complete.  One nested event loop may start another event loop that prevents all others from unwinding (if it triggers a JS alert() for example).

Call only GTK_WIDGET_GET_CLASS(mShell)->configure_event()

mPendingConfigures would then need to be adjusted here, as OnConfigureEvent would not be called, which I think is desirable.

::: widget/gtk/nsWindow.cpp:6080
(Diff revision 4)
> +check_resize_cb (GtkContainer* container, gpointer user_data)
> +{
> +    static_cast<nsWindow*>(user_data)->OnCheckResize();

Passing the nsWindow* through user_data is a little simpler than using get_window_for_gtk_widget(), but there is a risk of something using the GtkWindow after the nsWindow is destroyed, so this approach would require disconnecting from the signal before the nsWindow is disconnected from the GtkWindow.

I think I probably prefer following the existing get_window_for_gtk_widget() approach over having a set of disconnect calls isolated from signal-handling code.

theme_changed_cb needs to use user_data because the signal is on the GtkSettings instead of the GtkWindow.
https://reviewboard.mozilla.org/r/53692/#review57474

> PodZero(&event);
> 
> GdkEvent event = { 0 };
> is fine if that works for unions, but I guess it doesn't.
> 
> GdkEvent event = GdkEvent();
> is another option.

The configure_event parameter is GdkEventConfigure* so declare GdkEventConfigure.
No need for the GdkEvent union.
Comment on attachment 8754052 [details]
Bug 1225044 - Workaround GTK windows freezing on unmap.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/53692/diff/4-5/
Attachment #8754052 - Flags: review- → review?(karlt)
Comment on attachment 8754052 [details]
Bug 1225044 - Workaround GTK windows freezing on unmap.

https://reviewboard.mozilla.org/r/53692/#review57806
Attachment #8754052 - Flags: review?(karlt) → review+
Pushed by acomminos@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5e37b253b0c4
Workaround GTK windows freezing on unmap. r=karlt
https://hg.mozilla.org/mozilla-central/rev/5e37b253b0c4
Status: REOPENED → RESOLVED
Closed: 8 years ago8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
Firefox 49 is marked as affected. What do you think about uplifting a fix?
Flags: needinfo?(andrew)
Yup, this is a good candidate for uplift- while it makes nonstandard use of GTK (calling GtkWidget::configure_event outside of a ConfigureNotify event), support across GTK versions seems to be quite sane.
Flags: needinfo?(andrew)
Hi Andrew,
Please create a uplift request for 49. Thanks.
Flags: needinfo?(andrew)
Comment on attachment 8754052 [details]
Bug 1225044 - Workaround GTK windows freezing on unmap.

Approval Request Comment
[Feature/regressing bug #]: N/A
[User impact if declined]: Firefox tour highlights will be invisible.
[Describe test coverage new/current, TreeHerder]: None
[Risks and why]: Minor risk; we manually trigger a GTK event handler that gets called frequently anyway.
[String/UUID change made/needed]: N/A
Flags: needinfo?(andrew)
Attachment #8754052 - Flags: approval-mozilla-beta?
Comment on attachment 8754052 [details]
Bug 1225044 - Workaround GTK windows freezing on unmap.

This patch fixes a graphic regression. Take it in 49 beta. Should be in 49 beta 5.
Attachment #8754052 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: qe-verify+
Hello! I only managed to reproduce that tour highlights are garbage/transparent on above mentioned old Fx builds. 
I also investigated this bug on latest Aurora 50.0a2 (2016-09-01) and 49.0b9 build1 (20160901141622), using Ubuntu 14.04 x86 (gtk 3.10.8). I still reproduced garbage/transparent tour highlights on latest Aurora (see screenshots https://drive.google.com/open?id=0B0nYKG6PRiCcLXlNc2owSUNpc00; https://drive.google.com/open?id=0B0nYKG6PRiCcS1oxc2JkNHN2MWs). In rest, on beta 9 no glitches or other graphical issues were encountered.
I investigated again this issue on Ubuntu 14.04 x86 (gtk 3.10.8) using latest Nightly 52.0a1 (2016-09-26), latest Aurora 51.0a2 (2016-09-26), 50.0b2 (20160926162149) and 49.0.1 build3 (20160922113459). As I already specified in comment 42, garbage/transparent tour highlights are still reproducible on Firefox 50. Using https://www.mozilla.org/en-US/firefox/51.0a2/firstrun/ to start the tour in the other builds, the problem still occurs. In rest, no glitches or other graphical issues were encountered. Any thoughts about these?
Flags: needinfo?(andrew)
Filed bug 1319764 as a follow up. 
Setting the corresponding flags based on this and on previous comments.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Depends on: 1319764
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: