Closed Bug 787947 Opened 12 years ago Closed 12 years ago

Corrupted layer drawn

Categories

(Core :: Layout, defect)

x86
All
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla18
Tracking Status
firefox15 --- unaffected
firefox16 + fixed
firefox17 + fixed
firefox18 + verified

People

(Reporter: BenWa, Assigned: roc)

References

Details

(Keywords: regression)

Attachments

(4 files, 1 obsolete file)

Attached image Screenshot (deleted) —
I got this corruption while working on cleopatra against nightly. I setup a custom deployment for this bug: http://people.mozilla.com/~bgirard/cleopatra_bug/?report=e9e2e60361ced1e3c0368f655358158b2a04d88a Change that caused this bug: js/diagnosticBar.js - if (width < 1) return 0; + //if (width < 1) return 0; This removes the minimal size for a diagnostic (red icons) which increases the complexity of the page and adds many small element.
Problem does not reproduce with Mac OMTC.
This regression is currently in Beta but hasn't shipped yet. We NEED to fix this before releasing.
STR: You must be running with OpenGL layers (check about:support for GPU accelerated windows > 0)
If you get a chance Alice a regression window here would be very helpful.
Benoit - are you planning to investigate further since this is a gfx issue? Do you feel this will be a common issue post-release?
Assignee: nobody → bgirard
I'm going to try and schedule time for this tomorrow. My gut feeling is that this issue is very situational with a uncommon trigger but it does reproduce 100% of the time. Given that the severity of the bug is very high. So I don't estimate that it would be very common in general but very common in sites that trigger it.
We don't have TBPL builds so I'm going to guess patch at random :(. First attempt is: http://hg.mozilla.org/mozilla-central/rev/461c9816a3be Because it deals with non pixel align gradients which this test case has.
Yay, I got lucky, the problem is in fact Bug 779399. Roc should I back out 461c9816a3be from central, aurora, beta or do you want to fix it? (In reply to Alex Keybl [:akeybl] from comment #6) > Benoit - are you planning to investigate further since this is a gfx issue? > Do you feel this will be a common issue post-release? With the regression identified I can give a better answer. This will reproduce 100% of the time if a page uses a non pixel aligned gradient.
Blocks: 779399
(In reply to Benoit Girard (:BenWa) from comment #9) > Yay, I got lucky, the problem is in fact Bug 779399. > > Roc should I back out 461c9816a3be from central, aurora, beta or do you want > to fix it? > > (In reply to Alex Keybl [:akeybl] from comment #6) > > Benoit - are you planning to investigate further since this is a gfx issue? > > Do you feel this will be a common issue post-release? > > With the regression identified I can give a better answer. This will > reproduce 100% of the time if a page uses a non pixel aligned gradient. Given the user impact here, if bug 779399 feels like an isolated issue and a forward fix carries any risk, we should definitely back out for Beta.
I don't understand how https://hg.mozilla.org/releases/mozilla-aurora/rev/5b1b65bf7f4d, which only makes some tweaks to gradient rendering and only touches public Thebes APIs, could cause texture corruption like that seen in the screenshot in comment #0. Maybe it tickles an existing bug?
Depends on: 792903
(In reply to Benoit Girard (:BenWa) from comment #11) > Backout on beta: > https://hg.mozilla.org/releases/mozilla-beta/rev/fc24961171a3 This causes 16Beta4 crash! See Bug792903 comment#5
So did the backout actually fix the bug originally reported here? This bug, with the oblique lines in the screenshot, definitely looks like a wrong stride or width value in a texture upload.
STR: Use a Mac Nightly from September 3 Use current Gecko Profiler add-on (September 24) Load the profile link given in comment 0 Unroll the tree until it starts scrolling down: corruption then starts appearing, initially concentrated in the scrollbar area on the right (I'm on OSX 10.8 if that matters) and in the top ~ 20 pixels of the treeview. Also a diagonal gray line (slope -1) bars the whole treeview across. Uploading screenshot.
Whiteboard: STR in comment 15
Also, taking this screenshot instantly resulted in the treeview area becoming completely garbled.
Sorry, didn't realize that this had only been backed out on beta -- so no need to use an old nightly to reproduce.
(In reply to Benoit Girard (:BenWa) from comment #9) > With the regression identified I can give a better answer. This will > reproduce 100% of the time if a page uses a non pixel aligned gradient. One thing that's weird is that the backed-out patch (https://hg.mozilla.org/releases/mozilla-aurora/rev/5b1b65bf7f4d) makes the edges of the gradient *more* pixel-aligned, not less (at the cost of introducing a small scale at times). I think we should reland the backed-out patch. This bug is the only regression linked to that patch (which was in the tree for a long time, and is still on Nightly and Aurora). And, simple Thebes calls like the patch uses should not be able to cause the kind of corruption seen in this bug, so the patch can't be the root cause.
I looked at the layer tree in the failing case. There's a big ThebesLayer containing almost the entire window's content, and some of the panes aren't corrupt, so this doesn't look like a layers bug. Now I'm thinking a cairo/quartz bug.
I think I've figured this out and it's my bug. Stand by...
Assignee: bgirard → roc
Whiteboard: STR in comment 15
The bug is that if the tile rect gets snapped to an empty rect, the patch in bug 779399 will scale us by 0 in the horizontal or vertical direction, which puts the cairo context into an error state, which means nothing else gets drawn. (IMHO it would be more logical for drawing operations to be no-ops while the CTM is singular, and return to normal after resetting the CTM, but whatever.) What looks like stride errors is simply the contents of the destination buffer with nothing drawn into it.
Attached patch Avoid scaling by 0 when snapping gradient tiles (obsolete) (deleted) — Splinter Review
This is the actual fix, for trunk (on top of the patch 779399, although it sort of replaces that patch).
Attachment #664443 - Flags: review?(jmuizelaar)
Attachment #664442 - Flags: review?(matt.woodrow) → review+
Whiteboard: [leave open]
Attachment #664443 - Flags: review?(jmuizelaar) → review+
Component: Graphics: Layers → Layout
Flags: in-testsuite? → in-testsuite+
OS: Mac OS X → All
Whiteboard: [leave open]
Once this has merged to central we should try to get this on Aurora before the uplift.
The reftest failed due to numerical inaccuracy. We paint single-pixel-wide slices of a gradient and sometimes the slice width is slightly different from one pixel, and that causes us to paint the slice with a small amount of transparency, which creates the lighter-colored lines seen in the Windows reftest analyzer. The Mac failures are less regular but probably related to the same issue.
I also realized that this test would normally pass even without the fix here, since the gradient line covers the gradient tile so we'd paint the whole gradient in one pass instead of slice by slice. That doesn't actually happen in this case due to numerical inaccuracy making us think the gradient line is slightly non-perpendicular to the tile edge, but we should fix that (in a separate bug). So we need to do the test differently.
Attached patch patch with updated test (deleted) — Splinter Review
This version of the test doesn't check that the gradient renders --- the gradient is just a transition from white to transparent white, over a white background, so it shouldn't be visible on any platform. Instead it checks that we're able to draw the text on top of the gradient, which means the graphics context should not have been put into an error state by drawing the gradient.
Attachment #664443 - Attachment is obsolete: true
Reftests are green. I'll reland this.
I'm pretty sure the reftest failure was caused by bug 794709, not this bug, since the try push is green and the failing test is a reftest-print reftest and doesn't use gradients. So I'm repushing this.
Comment on attachment 667782 [details] [diff] [review] patch with updated test Review of attachment 667782 [details] [diff] [review]: ----------------------------------------------------------------- This is a pretty bad regression from bug 779399 that is showing up on real Web sites. We fixed it on beta by backing out bug 779399 (which left other visible bugs behind). This is the real fix, so we should take it for Aurora.
Attachment #667782 - Flags: approval-mozilla-aurora?
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
Attachment #667782 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Approved for aurora . Please land before monday oct 8th merge.
Reproduced the STR in comment 15 with HWA off on Nightly 2012-09-04. Verified fixed on FF 18b4 Win 7x64, Mac OS X 10.7.5.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: