Closed
Bug 453199
Opened 16 years ago
Closed 16 years ago
_cairo_xlib_remove_close_display_hook is O(n) slower than it should be
Categories
(Core :: Graphics, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: karlt, Assigned: karlt)
References
Details
(Keywords: memory-footprint, perf)
Attachments
(1 file, 1 obsolete file)
(deleted),
patch
|
mozilla
:
review+
|
Details | Diff | Splinter Review |
cairo_xlib_display_t uses a linked list for close_display_hooks, and so
_cairo_xlib_remove_close_display_hook searches through each link in the list
for the hook to remove.
This was apparently designed for a small number of hooks, because, when the
number of links in the list becomes large, this becomes expensive.
However the number of hooks does become large because there is a hook for each
xlib surface (and sometimes a second when there is a scaled font on the
surface). imgContainers and nsWindows hold references to many surfaces.
During talos Tp tests, the number of xlib surfaces can exceed 2000. The
proportion of CPU_CLK_UNHALTED events for the Firefox process occurring in
_cairo_xlib_remove_close_display_hooks is 7.5% during talos's repeat
iterations.
The number of xlib surfaces may also be large when a session contains a large
number of tabs.
Further, removed hooks are kept in a freelist and the memory is not freed
until the cairo_xlib_display_t is destroyed.
Assignee | ||
Updated•16 years ago
|
Assignee | ||
Comment 1•16 years ago
|
||
Callers provide the memory for the hook, thus avoiding additional memory
allocations.
For this change to be effective, lookup of entries in cairo_hash_tables should
be O(1) (bug 453200).
Tp savings are about 5% with an O(1) hash table lookup
(if we can believe the tryserver).
Attachment #336396 -
Flags: review?(vladimir)
That is horrible, nice find :( Let me get behdad on ickle to review that.
Comment 3•16 years ago
|
||
Karl, if you embed the hook struct in the object itself, with prev and next pointers (that is, kernel-style linked lists), you don't need any additional structure and free is O(1).
Assignee | ||
Comment 4•16 years ago
|
||
Comment on attachment 336396 [details] [diff] [review]
use a hash table instead of a linked list
Behdad's suggestion is better.
Attachment #336396 -
Flags: review?(vladimir)
Assignee | ||
Comment 5•16 years ago
|
||
The changes to _cairo_xlib_call_close_display_hooks are to ensure that
close_display_hooks remains a valid list even after the memory for a hook is
released in hook->func.
Attachment #336396 -
Attachment is obsolete: true
Attachment #336434 -
Flags: review?(mozilla)
Comment 6•16 years ago
|
||
Comment on attachment 336434 [details] [diff] [review]
use a doubly-linked list of hooks provided by the caller
I like this!
Do you think we should do the same thing about the workqueue?
Attachment #336434 -
Flags: review?(mozilla) → review+
Assignee | ||
Comment 7•16 years ago
|
||
(In reply to comment #6)
> Do you think we should do the same thing about the workqueue?
It would probably be nice from a memory allocation POV, but it looks like there may not always be an existing object in which to embed the job struct.
It looks like random access (removal or otherwise) is not required with the workqueue.
I haven't studied workqueue usage patterns to know whether there is ever a large number of jobs in the queue.
Comment 8•16 years ago
|
||
Thankyou, that's a good improvement. The only alteration I made to the patch as I rebased it against master (939b836bfa95df759aca96936bb9a6d89d3130b8) was to drop the data member from the hook, and use the container_of macro to recover the original pointer to the structure. (And be explicit about the associated data for cairo_xlib_surface_font_private_t!)
(In reply to comment #7)
> (In reply to comment #6)
> > Do you think we should do the same thing about the workqueue?
>
> It would probably be nice from a memory allocation POV, but it looks like there
> may not always be an existing object in which to embed the job struct.
Aye, this is the case as the workqueue is, typically, associated with an XID that needs to be freed by some random function. Having said that, this is now only true for the XRenderPictures freed during surface finalization, so we could keep the cairo_xlib_surface_t alive until we run the workqueue... Except at the moment it would take some reorganisation of the higher levels to stop cairo_surface_destroy() from freeing the surface.
> It looks like random access (removal or otherwise) is not required with the
> workqueue.
Correct, once put onto the workqueue the resource is considered finalized by the xlib backend, and so the list is only consumed upon notification.
> I haven't studied workqueue usage patterns to know whether there is ever a
> large number of jobs in the queue.
From my tests (which admittedly have completely different usage patterns from yours! - it would be handy if you can capture profiling data using cairo-trace so that we have some realistic benchmarks upon which to work...) frequently there are no items in the list, and we can save a measurable amount of time by checking whether the list if empty before taking the lock - I think I can justify why that is safe! (However since it does require justification I've considered it too dirty to include without someone pointing to a profile and complaining.)
Comment 9•16 years ago
|
||
(In reply to comment #8)
> Thankyou, that's a good improvement. The only alteration I made to the patch as
> I rebased it against master (939b836bfa95df759aca96936bb9a6d89d3130b8)
Just in case anyone wonders, that's cairo master!
Assignee | ||
Comment 10•16 years ago
|
||
(In reply to comment #8)
Thanks, Chris.
> drop the data member from the hook, and use the container_of macro to recover
> the original pointer to the structure.
Nice.
> From my tests (which admittedly have completely different usage patterns from
> yours! - it would be handy if you can capture profiling data using cairo-trace
> so that we have some realistic benchmarks upon which to work...) frequently
> there are no items in the list, and we can save a measurable amount of time by
> checking whether the list if empty before taking the lock.
cairo-trace looks fancy.
FWIW _cairo_xlib_display_notify consumes < 0.01% of CPU resources in profiles that I've seen, though I think we might make the lock a noop in Mozilla.
Comment 11•16 years ago
|
||
(In reply to comment #8)
> there are no items in the list, and we can save a measurable amount of time by
> checking whether the list if empty before taking the lock - I think I can
> justify why that is safe! (However since it does require justification I've
> considered it too dirty to include without someone pointing to a profile and
> complaining.)
It'll be safe if you do an atomic read of the pointer. But lets not fix something that is not currently a problem.
Comment 12•16 years ago
|
||
(In reply to comment #10)
> FWIW _cairo_xlib_display_notify consumes < 0.01% of CPU resources in profiles
> that I've seen, though I think we might make the lock a noop in Mozilla.
Note that Firefox compiles cairo with no mutex support. Would be interesting to compile firefox with system cairo and see how much time is spent in various mutexes.
Assignee | ||
Updated•16 years ago
|
Assignee | ||
Comment 13•16 years ago
|
||
Chris landed the modified fix upstream (comment 8) and bug 453765 landed the fix in Mozilla.
You need to log in
before you can comment on or make changes to this bug.
Description
•