Closed
Bug 397303
Opened 17 years ago
Closed 16 years ago
Thebes rendering of inset/outset borders very slow
Categories
(Core :: Graphics, defect, P2)
Tracking
()
RESOLVED
FIXED
People
(Reporter: bzbarsky, Assigned: vlad)
References
(Blocks 1 open bug)
Details
(Keywords: perf, regression)
Attachments
(1 file, 1 obsolete file)
(deleted),
text/html
|
Details |
As I debugged bug 392629, I figured out that changing the forms.css rules that say:
border: 1px outset black !important;
border-left-width: 2px ! important;
to either have "border-style: solid" on all sides or have "border-width: 1px" on all sides sped things up immensely. Setting all sides to the same width but keeping the same style was still painfully slow.
We're going to fix bug 392629 by just not repainting a the list control's border as one scrolls inside it. But we should still address the beveled border painting issue.
Flags: blocking1.9?
Updated•17 years ago
|
Flags: blocking1.9? → blocking1.9+
Updated•17 years ago
|
Priority: -- → P4
Updated•17 years ago
|
Assignee: nobody → vladimir
Reporter | ||
Comment 1•17 years ago
|
||
This might be related to the wikipedia problem with clip performance issues, perhaps?
In any case, should this really be a P4, and not P3 given the current "P4 is actually just 'wanted'" thing?
Comment 2•17 years ago
|
||
(In reply to comment #1)
> This might be related to the wikipedia problem with clip performance issues,
> perhaps?
>
> In any case, should this really be a P4, and not P3 given the current "P4 is
> actually just 'wanted'" thing?
>
Hrm. Do we think this is common perf issue? If so I'd vote for movin it up..
Reporter | ||
Comment 3•17 years ago
|
||
Hmm. I'm being self-contradictory in comment 0; that's what happens when I type things as I test. I seem to recall that this was a problem no matter what the widths were, as long as the outset style was used. Which means any <table border="N"> for N > 0 will trigger it. I can double-check when I get back to Chicago, I guess...
Comment 4•17 years ago
|
||
Bz you get a chance to check this out?
Reporter | ||
Comment 5•17 years ago
|
||
This got lost somehow. I just retested, and the behavior is:
1) Solid borders of different widths are reasonably fast.
2) Outset borders all of the same width > 1px are slow. The 1px case is fast.
3) Outset borders of different widths are even slower than outset borders all
of the same width.
So yes, <table border="N"> for N > 1, will trigger this bug.
Comment 6•17 years ago
|
||
(In reply to comment #5)
> This got lost somehow. I just retested, and the behavior is:
>
> 1) Solid borders of different widths are reasonably fast.
> 2) Outset borders all of the same width > 1px are slow. The 1px case is fast.
> 3) Outset borders of different widths are even slower than outset borders all
> of the same width.
>
> So yes, <table border="N"> for N > 1, will trigger this bug.
>
K - thanks Bz. Vlad you got this? Should it be higher than P4?
Assignee | ||
Comment 7•17 years ago
|
||
I've got a benchmark somewhere here for border rendering that I used when writing this code; let me grab some numbers from all three platforms and see where we're at.
Assignee | ||
Comment 8•17 years ago
|
||
Ok, BorderBench 2008* says that inset/outset/groove/ridge borders are significantly slower than even the most general case of 4 totally different borders. This is obviously wrong and will be fixed. This is also the case on Win32 as well as Linux; oddly, Mac doesn't show this problem.
[*] http://people.mozilla.com/~vladimir/misc/borderbench.html
OS: Linux → All
Priority: P4 → P2
Reporter | ||
Comment 9•17 years ago
|
||
For what it's worth, with Linux and outset borders, the gfxContext::Fill call under FillFastBorderPath seems to be a large contributor to painting performance on the testcase from comment 0 with the invalidate that covers the border rect reintroduced.
On that codepath, we call _cairo_surface_fallback_fill, then _clip_and_composite_trapezoids, and end up spending almost all our time actually under XSync called from XRenderCompositeTrapezoids (this is a --sync profile, of course).
Comment 10•17 years ago
|
||
Can we add this benchmark to Tgfx?
Assignee | ||
Comment 11•17 years ago
|
||
We can, though I'm starting to rethink the value of very targetted tests in Tgfx. Right now there are a bunch of actual pages in there, which is good, but also some specific benchmarks that I wrote. These were good while I was working on those code paths, but I don't think they're useful to monitor in general -- only people working on those specific areas should really have need of the fine-grained benchmarks.
Comment 12•17 years ago
|
||
(In reply to comment #11)
> We can, though I'm starting to rethink the value of very targetted tests in
> Tgfx. Right now there are a bunch of actual pages in there, which is good, but
> also some specific benchmarks that I wrote. These were good while I was
> working on those code paths, but I don't think they're useful to monitor in
> general -- only people working on those specific areas should really have need
> of the fine-grained benchmarks.
>
Why not? Fine grained benchmarks are a great way to isolate the particular problem at hand. If we already got them why not get them into the "permanent record" as it were.
Assignee | ||
Comment 13•17 years ago
|
||
bz, can you test this patch out and see if it improves things? It's not a final fix, because I still need to understand why the optimization is actually slowing things down, but this improves my perf by at least 2x.
Reporter | ||
Comment 14•17 years ago
|
||
That patch definitely improves things a lot.
Assignee | ||
Comment 15•17 years ago
|
||
(In reply to comment #14)
> That patch definitely improves things a lot.
Not sure whether to be happy or not about that. I'm trying to isolate things down into a cairo-perf testcase so that I can see what the heck the bad 'optimized' path is doing, since the 4-pass method does a pile more clipping than the 2-pass.
However, what I'm hoping I'll find is that there's some brokenness that causes 2-pass to be slower for a whole bunch of good reasons, in which case I can go and delete a pile of code and have either 1-pass or 4-pass rendering.
Assignee | ||
Comment 16•17 years ago
|
||
Comment on attachment 296232 [details] [diff] [review]
test patch
[Checkin: Comment 17]
Ok, let's go with this for now. What's happening is that in the 4-pass case, we set up a clip path for just that one side. This means that we end up creating temporary surfaces that are overall small compared to the entire border area, even though we do a bunch more work.
For the 2-pass case, we don't do any clipping, we just draw two paths.. but the bounds of both of those paths are the entire rectangle covered by the border area. This means that a temporary surface gets created that's huge.
A real, final fix for all of this stuff is to render each rectangular side region independently from the corners, since the sides can be very fast rectangular aligned fills. I don't know whether I should keep this bug open for that or file a new one, but we should at least take this patch for now.
Once I do the other bit, I'll remove all the unnecessary code that deals with 4.
Attachment #296232 -
Flags: review?(roc)
Attachment #296232 -
Flags: superreview+
Attachment #296232 -
Flags: review?(roc)
Attachment #296232 -
Flags: review+
Assignee | ||
Comment 17•17 years ago
|
||
Ok, bandaid checked in.
Updated•17 years ago
|
Attachment #296232 -
Attachment description: test patch → test patch
[Checkin: Comment 17]
Comment 18•17 years ago
|
||
See bug 377419 comment 39 for testcase(s), exhibiting 2(!) regressions, which were fixed by this checkin;
there was also a discussion in that bug which actually concerns the current bug and not bug 377419...
No longer blocks: 377419
Assignee | ||
Comment 19•17 years ago
|
||
(In reply to comment #18)
> See bug 377419 comment 39 for testcase(s), exhibiting 2(!) regressions, which
> were fixed by this checkin;
You're talking about performance regressions and not functionality/rendering/etc, correct?
Comment 20•17 years ago
|
||
'Perf', yes:
see bug 377419 comment 46.
Reporter | ||
Comment 21•17 years ago
|
||
So the bad news is that the bandaid made scrolling tinderbox a lot slower. :( A single middle-click on the right edge of the bottom scrollbar (to look at what the Windows builds are doing) takes about 3-4 seconds to repaint. Dragging the scroll thumb or using the arrow keys is even worse (more repaints). Reverting back to 2-pass rendering for inset/outset makes the number < 0.5 second for sure. I bet part of the problem is that tables are drawing too many of the clipped-out borders.
In any case, I profiled scrolling the page with arrow keys while running --sync, and during said scrolling we spend about 75% of total time under nsCSSRendering::PaintBorder. I'll attach a jprof profile of just the time spent under there. To make a long story short, the time is largely spent calling:
XRenderCompositeTrapezoids
XRenderComposite
_create_a8_picture
XRenderFreePicture
all under gfxContext::Clip().
Reporter | ||
Comment 22•17 years ago
|
||
Updated•17 years ago
|
Flags: wanted-next+
Flags: tracking1.9+
Flags: blocking1.9-
Reporter | ||
Comment 23•17 years ago
|
||
It might be worth backing out the band-aid if it fixes bug 421601 or something.... does it?
Comment 24•17 years ago
|
||
(In reply to comment #23)
> It might be worth backing out the band-aid if it fixes bug 421601 or
> something.... does it?
Yes, it seems like it does.
With the testcase from that bug, I get 160206ms as result before I backed out the patch, after backing out the patch, I got 17914ms. (note, I tested with a debug build)
Reporter | ||
Comment 25•17 years ago
|
||
Usually numbers from debug builds are pretty meaningless... They can be off by many orders of magnitude in some cases.
Comment 26•17 years ago
|
||
From my experience, I seem to get pretty reliable numbers from debug builds.
Fwiw, I'm trying to get a non-debug build, but I'm now blocked by bug 338224 (and was blocked by all kinds of Vista woes). I guess it could take me quite some time to get a working non-debug build, I'm afraid.
Assignee | ||
Comment 27•17 years ago
|
||
Comment on attachment 296232 [details] [diff] [review]
test patch
[Checkin: Comment 17]
Backing this patch out; marking obsolete.
It caused a significant perf regression in more common situations.
Attachment #296232 -
Attachment is obsolete: true
Assignee | ||
Comment 28•16 years ago
|
||
This should be significantly better now after bug 424423 was fixed.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•