Closed Bug 680817 Opened 13 years ago Closed 10 years ago

Investigate X11 OpenGL layers performance regressions

Categories

(Core :: Graphics, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla30

People

(Reporter: mattwoodrow, Assigned: mattwoodrow)

References

Details

Attachments

(2 files, 3 obsolete files)

Enabling OpenGL layers regressed Talos fairly badly, we should fix this :)

For x64:

Paint - increase 34.9%
SVG - increase 7.74%
tscroll - increase 6.43%
DHTML - increase 3.25%
Tp5 (RSS) - increase 6.46%
Tp5 (Private Bytes) - increase 4.68%

Profiling with perf shows that we're spending huge amounts of time allocating and free'ing memory. The majority of these are coming from calls to glXBindTexImageEXT. It looks like binding the pixmap is resulting in a copy internally, rather than texturing directly from the pixmap.

Adding a patch to avoid rebinding ThebesLayer pixmaps when the content is unchanged. This improves the situation slightly, but is a long way short of fixing the regression.

Do we have any contacts at nvidia we can ask about this?
Attachment #554778 - Flags: review?(joe)
Attached file Standalone test app (obsolete) (deleted) —
There are two main things to test here, one is the relative performance of XRender compositing vs OpenGL, and the other is just the raw cost of glXBindTexImageEXT.

For relative performance, basic canvas demos seem to show decent results. I've been using http://ie.microsoft.com/testdrive/performance/psychedelicbrowsing/Default.html and getting around a 20% performance regression.

For glXBindTexImageEXT performance the attached standalone program should give us an idea. It takes around 14 seconds to run on my q6600 with an Nvidia GTX 275.

Would be interesting to see if a similar problem occurs with other cards/drivers.
Attached file Standalone test app v2 (deleted) —
Added back missing lines
Attachment #555028 - Attachment is obsolete: true
Jeff tried this on a Radeon HD 5670 with the opensource drivers and got identical in-browser performance and 0.059 seconds for the standalone test.

Does anyone have an intel card that they can test this on?
We have a Sandy Bridge machine (Blake Winton's) in the Toronto office that we might be able to experiment with.
glXBindTexImageEXT() won't copy the pixmap contents in our driver except in extreme low memory situations (The system likely won't be usable in this state anyway).  However, it does allocate/free some data structures, especially the first time a pixmap is bound.

Our driver doesn't require that you re-bind the texture when the contents of the pixmap change.  Since the memory is shared, it will always be up to date.  For portability reasons, we had to allow for a copy in the actual extension so a re-bind is required to support all drivers, but if you want to detect our driver and switch to a path that only binds the pixmap once, we support that method.  The recommended method for doing this is checking the GL vendor string for "NVIDIA".

If the above isn't acceptable, I can file an internal bug to work on improving perf of BindTexImage, but it's never going to be nearly as fast as skipping it.
Thanks for the quick reply James!

> glXBindTexImageEXT() won't copy the pixmap contents in our driver except in
> extreme low memory situations (The system likely won't be usable in this
> state anyway).  However, it does allocate/free some data structures,
> especially the first time a pixmap is bound.

Sorry, I saw some large allocations (7mb iirc) coming from this call and assumed it was copying the pixel data.


> Our driver doesn't require that you re-bind the texture when the contents of
> the pixmap change.  Since the memory is shared, it will always be up to
> date.  For portability reasons, we had to allow for a copy in the actual
> extension so a re-bind is required to support all drivers, but if you want
> to detect our driver and switch to a path that only binds the pixmap once,
> we support that method.  The recommended method for doing this is checking
> the GL vendor string for "NVIDIA".

Great, this sounds like it will solve our problem. Will try it out ASAP.
This looks to have fixed the performance regressions for me.

Pushed to try server to get Talos results.

http://tbpl.allizom.org/?tree=Try&usebuildbot=1&rev=164b609a4ae8
Attachment #554778 - Attachment is obsolete: true
Attachment #554778 - Flags: review?(joe)
Attachment #555645 - Flags: review?
Attachment #555645 - Flags: review? → review?(joe)
I'm getting a lot of failures on tryserver with this change.

###!!! ABORT: X_GLXVendorPrivate: BadValue (integer parameter out of range for operation); 24427 requests ago; id=0x20de

*** glibc detected *** /home/cltbld/talos-slave/test/build/firefox/firefox-bin: double free or corruption (!prev): 0x00000000021acdf0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x34d2e74576]
/home/cltbld/talos-slave/test/build/firefox/firefox-bin(free+0x1c)[0x404408]
/usr/lib64/nvidia/libGL.so.1[0x34eda76818]

And many more :)

I can't reproduce any of these locally (with the 275.19 driver), does anybody know why tinderbox is still running 190.42?
(In reply to Matt Woodrow (:mattwoodrow) from comment #8)
> does
> anybody know why tinderbox is still running 190.42?

Probably because nobody has asked them to install a newer one. Things only get installed to Talos machines when there is a specific request for that, esp. as things like that can also have an influence on performance measures.
Comment on attachment 555645 [details] [diff] [review]
Leave pixmaps bound to textures with the NVIDIA driver

Review of attachment 555645 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/thebes/GLXLibrary.h
@@ +118,5 @@
>      void ReleaseTexImage(GLXPixmap aPixmap);
>  
> +    // Make sure a bound pixmap is updated to include
> +    // changes made to the X Pixmap. This is equivalent
> +    // to unbinding and then binding again.

Maybe add that we do this for perf on NVIDIA cards.
Attachment #555645 - Flags: review?(joe) → review+
Depends on: 684165
Assignee: nobody → matt.woodrow
Depends on: 678940
Maybe we could workaround this crash with __GL_NO_DSO_FINALIZER (ftp://download.nvidia.com/XFree86/Linux-x86/190.42/README/README.txt)
Marco ran some recent comparisons on x86 try with links in bug 787853.

Unfortunately compare-talos is not working (bug 696196 perhaps) and I don't
have measures for variance.
                               RENDER      GL  
tdhtmlr_paint                  408.15    428.12
tresize                         12.0      19.14
tdhtmlr_nochrome_paint         422.91    414.18
tscrollr_paint               14080     14691
ts_paint                       707.05    791.79
tpaint                         223.0     237.0
tspaint_places_generated_med   702.89    784.89
tspaint_places_generated_max   711.47    804.95
tsvgr                         2920.15   4233.23
tsvgr_opacity                  100.0     110.0
tp5n_paint                     372.03    385.91
tp5n_main_rss_paint            108.5     121.6
tp5n_xres_paint                  7.1       5.1
tp5n_shutdown_paint           1123.0    1076.0
Depends on: 788873
Rebased!
Attachment #555645 - Attachment is obsolete: true
Attachment #8379384 - Flags: review+
Blocks: 974709
https://hg.mozilla.org/mozilla-central/rev/73d924b389d4
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: