Closed Bug 732985 Opened 13 years ago Closed 13 years ago

[Azure] Extremely large images are rendered garbled or not at all after scrolling

Tracking

()

Status:

VERIFIED FIXED

Milestone:

mozilla15

People

(Reporter: pxbugz, Assigned: bas.schouten)

References

Details

Attachments

(3 files, 2 obsolete files)

H6EMd.jpg 13 years ago Steve Scott (pxbugz) (deleted), image/jpeg		Details
Part 1: Add image scaling code to Azure 13 years ago Bas Schouten (:bas.schouten) (deleted), patch		Details \| Diff \| Splinter Review
Part 1: Add image scaling code to Azure v2 13 years ago Bas Schouten (:bas.schouten) (deleted), patch	jrmuizel : review-	Details \| Diff \| Splinter Review
Part 2: Use ImageScales for large images in Azure 13 years ago Bas Schouten (:bas.schouten) (deleted), patch	jrmuizel : review+	Details \| Diff \| Splinter Review
Part 1: Add image scaling code to Azure v3 13 years ago Bas Schouten (:bas.schouten) (deleted), patch	jrmuizel : review+	Details \| Diff \| Splinter Review

Steve Scott (pxbugz)

Reporter

Description

•

13 years ago

Attached image H6EMd.jpg (deleted) — Details

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120304 Firefox/13.0a1 Build ID: 20120304031031 Steps to reproduce: -Set gfx.content.azure.enabled to true -Encountered an extremely large (nearly 10,000px tall) 'comic strip' image on Reddit and viewed. Actual results: The viewable part of the image, after expanding to full size, renders correctly. Scrolling the image renders it all garbled. Expected results: Image should be rendered correctly. Issue not present without gfx.content.azure.enabled true. Latest nightly tested Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120304 Firefox/13.0a1 Mozregression tested back to http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=4795500b7c1d&tochange=8ae16e346bd0 when gfx.content.azure.enabled was first introduced so I guess I just haven't encountered a huge enough image until now.

Steve Scott (pxbugz)

Reporter

Updated

•

13 years ago

Blocks: 715768

Bas Schouten (:bas.schouten)

Assignee

Updated

•

13 years ago

Assignee: nobody → bas.schouten

Status: UNCONFIRMED → ASSIGNED

Ever confirmed: true

Bas Schouten (:bas.schouten)

Assignee

Comment 2

•

13 years ago

Attached patch Part 1: Add image scaling code to Azure (obsolete) (deleted) — Details — Splinter Review

This patch adds image scaling code to Azure, including a number of tests for the new code.

Attachment #613285 - Flags: review?(jmuizelaar)

Jeff Muizelaar [:jrmuizel]

Comment 3

•

13 years ago

(In reply to Bas Schouten (:bas) from comment #2) > Created attachment 613285 [details] [diff] [review] > Part 1: Add image scaling code to Azure > > This patch adds image scaling code to Azure, including a number of tests for > the new code. Should this include the SSE2 code?

Bas Schouten (:bas.schouten)

Assignee

Comment 4

•

13 years ago

Attached patch Part 1: Add image scaling code to Azure v2 (obsolete) (deleted) — Details — Splinter Review

With SSE2 code this time.

Attachment #613285 - Attachment is obsolete: true

Attachment #613420 - Flags: review?(jmuizelaar)

Attachment #613285 - Flags: review?(jmuizelaar)

Bas Schouten (:bas.schouten)

Assignee

Comment 5

•

13 years ago

Attached patch Part 2: Use ImageScales for large images in Azure (deleted) — Details — Splinter Review

This patch makes us use the image scaler for very large images in Azure. It also moves some stl::vectors to use plain new[] to avoid array initialization costs (which become quite high on these large images), and fixes 2 bugs in partial upload code. There's a couple of things this doesn't cover: - DrawSurface uses its own code, it should use this too, it's only used by canvas though, so that issue is already in releases Fx versions without issues. It should be addressed soon though. - Perhaps we should cache the downsampled surface for future drawing, on animated cases that should make a big performance difference. It also has an interesting side-effect, it makes extremely large images look much better when scaled down than just very large images.

Attachment #613446 - Flags: review?(jmuizelaar)

Jeff Muizelaar [:jrmuizel]

Comment 6

•

13 years ago

Comment on attachment 613446 [details] [diff] [review] Part 2: Use ImageScales for large images in Azure Review of attachment 613446 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/2d/Tools.h @@ +75,5 @@ > + return false; > +} > + > +static inline Float > +Dist2D(Point aA, Point aB) I think you can just call this Distance(). @@ +78,5 @@ > +static inline Float > +Dist2D(Point aA, Point aB) > +{ > + return sqrt((aB.x - aA.x) * (aB.x - aA.x) + > + (aB.y - aA.y) * (aB.y - aA.y)); Using hypot is more correct here. Also, it seems like on OS X this will use the double version of sqrt instead of the single precision one.

Attachment #613446 - Flags: review?(jmuizelaar) → review+

Jeff Muizelaar [:jrmuizel]

Comment 7

•

13 years ago

Comment on attachment 613420 [details] [diff] [review] Part 1: Add image scaling code to Azure v2 Review of attachment 613420 [details] [diff] [review]: ----------------------------------------------------------------- I haven't reviewed everything yet, but here are some comments to start on. ::: gfx/2d/ImageScalingSSE2.cpp @@ +79,5 @@ > + * == ~((~(a + b)) >> 1) > + * == (a + b) >> 1 > + * == g(a, b) > + */ > +__forceinline __m128i avg_sse2_8x2(__m128i *a, __m128i *b, __m128i *c, __m128i *d) Why do you pass these by pointer instead of by value? __forceinline is msvc only. There's probably something in mfbt to help with this. @@ +84,5 @@ > +{ > +#define shuf1 _MM_SHUFFLE(2, 0, 2, 0) > +#define shuf2 _MM_SHUFFLE(3, 1, 3, 1) > + > + __m128i t = _mm_castps_si128(_mm_shuffle_ps(_mm_castsi128_ps(*a), _mm_castsi128_ps(*b), shuf1)); add a wrapper function that does these casts maybe call it shuffle_si128 @@ +99,5 @@ > + > + __m128i carry = _mm_or_si128(_mm_and_si128(*a, *b), _mm_or_si128(_mm_and_si128(*a, *c), _mm_and_si128(*b, *c))); > + > + __m128i minusone = _mm_set1_epi32(0xffffffff); > +#define not(arg) _mm_xor_si128(arg, minusone) just use an inline function instead of a macro @@ +114,5 @@ > +#define not(arg) _mm_xor_si128(arg, minusone) > + > + return not(_mm_avg_epu8(not(a), not(b))); > +#undef not > +} These functions should be static inline @@ +133,5 @@ > +inline __m128i loadUnaligned128(const __m128i *aSource) > +{ > + // Yes! We use uninitialized memory here, we'll overwrite it though! > + __m128 res = _mm_loadl_pi(_mm_set1_ps(0), (const __m64*)aSource); > + return _mm_castps_si128(_mm_loadh_pi(res, ((const __m64*)(aSource)) + 1)); Why not just use _mm_loadu_si128? @@ +148,5 @@ > + // underflow. > + sum = (((sum ^ d) & mask) >> 1) + (sum & d); > + > + return (((sum ^ carry) & mask) >> 1) + (sum & carry); > +} It would be nice not to duplicate these functions. @@ +169,5 @@ > +ImageHalfScaler::HalfImage2DSSE2(uint8_t *aSource, int32_t aSourceStride, > + const IntSize &aSourceSize, uint8_t *aDest, > + uint32_t aDestStride) > +{ > + for (int y = 0; y < aSourceSize.height; y += 2) { What if aSourceSize.height = 1? @@ +176,5 @@ > + // Run a loop depending on alignment. > + if (!(uintptr_t(aSource + (y * aSourceStride)) % 16) && > + !(uintptr_t(aSource + ((y + 1) * aSourceStride)) % 16)) { > + for (; x < (aSourceSize.width - 7); x += 8) { > + uint8_t *upperRow = aSource + (y * aSourceStride + x * 4); Seems like the type of these should __m128i*. Please fix all of them @@ +177,5 @@ > + if (!(uintptr_t(aSource + (y * aSourceStride)) % 16) && > + !(uintptr_t(aSource + ((y + 1) * aSourceStride)) % 16)) { > + for (; x < (aSourceSize.width - 7); x += 8) { > + uint8_t *upperRow = aSource + (y * aSourceStride + x * 4); > + uint8_t *lowerRow = aSource + ((y + 1) * aSourceStride + x * 4); It might be worth renaming 4 to pixelSize. It took me a while to figure out how the 8 and 4 went together. @@ +198,5 @@ > + __m128i d = loadUnaligned128((__m128i*)lowerRow + 1); > + > + *storage++ = avg_sse2_8x2(&a, &b, &c, &d); > + } > + } else if (!(uintptr_t(aSource + (y * aSourceStride)) % 16)) { This condition is the same as above. Presumably you meant y+1? Please add this to the tests. @@ +247,5 @@ > + > +void > +ImageHalfScaler::HalfImageVSSE2(uint8_t *aSource, int32_t aSourceStride, > + const IntSize &aSourceSize, uint8_t *aDest, > + uint32_t aDestStride) I think we can afford to use Vertical instead of V

Attachment #613420 - Flags: review?(jmuizelaar) → review-

Bas Schouten (:bas.schouten)

Assignee

Comment 8

•

13 years ago

Here's my replies, only on stuff where I had questions. The rest seems strictly improvements. (In reply to Jeff Muizelaar [:jrmuizel] from comment #7) > Comment on attachment 613420 [details] [diff] [review] > Part 1: Add image scaling code to Azure v2 > > Review of attachment 613420 [details] [diff] [review]: > ----------------------------------------------------------------- > > I haven't reviewed everything yet, but here are some comments to start on. > > ::: gfx/2d/ImageScalingSSE2.cpp > @@ +79,5 @@ > > + * == ~((~(a + b)) >> 1) > > + * == (a + b) >> 1 > > + * == g(a, b) > > + */ > > +__forceinline __m128i avg_sse2_8x2(__m128i *a, __m128i *b, __m128i *c, __m128i *d) > > Why do you pass these by pointer instead of by value? Since MSVC's calling convention doesn't allow passing more than 3 SSE arguments through registers, it would mean passing one on the stack. MSVC does not allow passing 16-byte aligned structures on the stack. As such passing my pointer is the only option in MSVC. > __forceinline is msvc only. There's probably something in mfbt to help with > this. MOZ_ALWAYS_INLINE, yes, sorry, an oversight on my part. > @@ +133,5 @@ > > +inline __m128i loadUnaligned128(const __m128i *aSource) > > +{ > > + // Yes! We use uninitialized memory here, we'll overwrite it though! > > + __m128 res = _mm_loadl_pi(_mm_set1_ps(0), (const __m64*)aSource); > > + return _mm_castps_si128(_mm_loadh_pi(res, ((const __m64*)(aSource)) + 1)); > > Why not just use _mm_loadu_si128? It's significantly slower according to anyone I've talked to. Bjacob's math libraries to the same thing. > @@ +148,5 @@ > > + // underflow. > > + sum = (((sum ^ d) & mask) >> 1) + (sum & d); > > + > > + return (((sum ^ carry) & mask) >> 1) + (sum & carry); > > +} > > It would be nice not to duplicate these functions. It would, sadly I'd need to shove them into headers if we wanted that, GCC for some reason does not allow using SSE2 intrinsics in a file which is not compiled with sse2 arch, which for GCC also means it will use SSE2 in all other code it compiles. > @@ +169,5 @@ > > +ImageHalfScaler::HalfImage2DSSE2(uint8_t *aSource, int32_t aSourceStride, > > + const IntSize &aSourceSize, uint8_t *aDest, > > + uint32_t aDestStride) > > +{ > > + for (int y = 0; y < aSourceSize.height; y += 2) { > > What if aSourceSize.height = 1? Hrm, my intention was to never size down to below 4x4 (doesn't seem to make too much sense at that point). Seems I forgot those checks, will add. > @@ +177,5 @@ > > + if (!(uintptr_t(aSource + (y * aSourceStride)) % 16) && > > + !(uintptr_t(aSource + ((y + 1) * aSourceStride)) % 16)) { > > + for (; x < (aSourceSize.width - 7); x += 8) { > > + uint8_t *upperRow = aSource + (y * aSourceStride + x * 4); > > + uint8_t *lowerRow = aSource + ((y + 1) * aSourceStride + x * 4); > > It might be worth renaming 4 to pixelSize. It took me a while to figure out > how the 8 and 4 went together. Will do. > @@ +247,5 @@ > > + > > +void > > +ImageHalfScaler::HalfImageVSSE2(uint8_t *aSource, int32_t aSourceStride, > > + const IntSize &aSourceSize, uint8_t *aDest, > > + uint32_t aDestStride) > > I think we can afford to use Vertical instead of V I assume you'd like the same for Horizontal then?

Jeff Muizelaar [:jrmuizel]

Comment 9

•

13 years ago

(In reply to Bas Schouten (:bas) from comment #8) > > > > > > Why do you pass these by pointer instead of by value? > > Since MSVC's calling convention doesn't allow passing more than 3 SSE > arguments through registers, it would mean passing one on the stack. MSVC > does not allow passing 16-byte aligned structures on the stack. As such > passing my pointer is the only option in MSVC. Hmm... You should definitely add a comment about this reason. Hopefully, most compilers are able to inline it all and untangle the loads and stores. > > > > Why not just use _mm_loadu_si128? > > It's significantly slower according to anyone I've talked to. Bjacob's math > libraries to the same thing. It looks like this has been fixed in Nehalem and newer. You should add a comment about that and later when a large precentage of our users are on Nehalem+ we can switch it. > > @@ +247,5 @@ > > > + > > > +void > > > +ImageHalfScaler::HalfImageVSSE2(uint8_t *aSource, int32_t aSourceStride, > > > + const IntSize &aSourceSize, uint8_t *aDest, > > > + uint32_t aDestStride) > > > > I think we can afford to use Vertical instead of V > > I assume you'd like the same for Horizontal then? Yes.

Bas Schouten (:bas.schouten)

Assignee

Comment 10

•

13 years ago

(In reply to Jeff Muizelaar [:jrmuizel] from comment #9) > (In reply to Bas Schouten (:bas) from comment #8) > > > > > > > > > Why do you pass these by pointer instead of by value? > > > > Since MSVC's calling convention doesn't allow passing more than 3 SSE > > arguments through registers, it would mean passing one on the stack. MSVC > > does not allow passing 16-byte aligned structures on the stack. As such > > passing my pointer is the only option in MSVC. > > Hmm... You should definitely add a comment about this reason. Hopefully, > most compilers are able to inline it all and untangle the loads and stores. I checked the generated assembly for MSVC 2010 -O2 and it was fine. It did require __forceinline, inline wasn't sufficient to get this inlined and resulted in a bunch of wasted loads/stores and a function call. > > > > > > > Why not just use _mm_loadu_si128? > > > > It's significantly slower according to anyone I've talked to. Bjacob's math > > libraries to the same thing. > > It looks like this has been fixed in Nehalem and newer. You should add a > comment about that and later when a large precentage of our users are on > Nehalem+ we can switch it. Yes, that's what Derf said as well. I'll add a comment.

Bas Schouten (:bas.schouten)

Assignee

Comment 11

•

13 years ago

(In reply to Jeff Muizelaar [:jrmuizel] from comment #7) > Comment on attachment 613420 [details] [diff] [review] > Part 1: Add image scaling code to Azure v2 > > Review of attachment 613420 [details] [diff] [review]: > ----------------------------------------------------------------- > > @@ +84,5 @@ > > +{ > > +#define shuf1 _MM_SHUFFLE(2, 0, 2, 0) > > +#define shuf2 _MM_SHUFFLE(3, 1, 3, 1) > > + > > + __m128i t = _mm_castps_si128(_mm_shuffle_ps(_mm_castsi128_ps(*a), _mm_castsi128_ps(*b), shuf1)); > > add a wrapper function that does these casts maybe call it shuffle_si128 I tried this but it's tricky, as it wants __Imm to be a compile time constant. If I make this a wrapper function the compile cannot guarantee that when compiling the function. I'll make it a #define for now.

Bas Schouten (:bas.schouten)

Assignee

Comment 12

•

13 years ago

(In reply to Jeff Muizelaar [:jrmuizel] from comment #7) > Comment on attachment 613420 [details] [diff] [review] > Part 1: Add image scaling code to Azure v2 > > Review of attachment 613420 [details] [diff] [review]: > ----------------------------------------------------------------- > > @@ +198,5 @@ > > + __m128i d = loadUnaligned128((__m128i*)lowerRow + 1); > > + > > + *storage++ = avg_sse2_8x2(&a, &b, &c, &d); > > + } > > + } else if (!(uintptr_t(aSource + (y * aSourceStride)) % 16)) { > > This condition is the same as above. Presumably you meant y+1? Please add > this to the tests. I can't add this to the tests as when !(uintptr_t(aSource + ((y + 1) * aSourceStride)) % 16) in the current code this will just use the last clause of all unaligned loads, which works just fine but just performs worse.

Bas Schouten (:bas.schouten)

Assignee

Comment 13

•

13 years ago

Attached patch Part 1: Add image scaling code to Azure v3 (deleted) — Details — Splinter Review

Attachment #613420 - Attachment is obsolete: true

Attachment #617318 - Flags: review?(jmuizelaar)

Jeff Muizelaar [:jrmuizel]

Comment 14

•

13 years ago

Comment on attachment 617318 [details] [diff] [review] Part 1: Add image scaling code to Azure v3 Review of attachment 617318 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/2d/ImageScaling.cpp @@ +107,5 @@ > + } > + > + delete [] mDataStorage; > + // Allocate 15 bytes extra to make sure we can get 16 byte alignment. > + mDataStorage = new uint8_t[internalSurfSize.height * mStride + 15]; Perhaps a reference to bug 751696 @@ +122,5 @@ > + > + IntSize currentSampledSize = mOrigSize; > + uint32_t currentSampledStride = mOrigStride; > + uint8_t *currentSampledData = mOrigData; > + It would be nice to add a bit of documentation about how you're dealing with odd sizes. advantages/alternatives etc. @@ +136,5 @@ > + if (Factory::HasSSE2()) { > + HalfImage2DSSE2(currentSampledData, currentSampledStride, currentSampledSize, > + mData, mStride); > + } else > +#endif Please move this #ifdefery out of this main function. For example add a HalfImage2D_C and have HalfImage2D call the appropriate function. Also, it might be worth adding an underscore to HalfImage2DSSE2 as case isn't enough to separate the words here.

Attachment #617318 - Flags: review?(jmuizelaar) → review+

Ed Morley [:emorley]

Comment 15

•

13 years ago

https://hg.mozilla.org/mozilla-central/rev/b937d3d96745 https://hg.mozilla.org/mozilla-central/rev/f448869ad726

Status: ASSIGNED → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla15

Honza Bambas (:mayhemer)

Updated

•

13 years ago

Depends on: 753790

Serge Gautherie (:sgautherie)

Updated

•

13 years ago

Depends on: 753772

alexander :surkov (:asurkov)

Comment 16

•

13 years ago

I get build error: gfx/2d/Tools.h(81) : C3861: 'hypotf': identifier not found on VC 2010. Ideas?

Joe Drew (not getting mail)

Comment 17

•

13 years ago

surkov, bug 753790

alexander :surkov (:asurkov)

Comment 18

•

13 years ago

(In reply to Joe Drew (:JOEDREW!) from comment #17) > surkov, bug 753790 thank you

Arjen de Jager

Updated

•

13 years ago

Status: RESOLVED → VERIFIED

Serge Gautherie (:sgautherie)

Updated

•

13 years ago

Depends on: 756010

Bas Schouten (:bas.schouten)

Assignee

Updated

•

13 years ago

No longer blocks: 715768

Bas Schouten (:bas.schouten)

Assignee

Updated

•

13 years ago

Blocks: 715768

You need to log in before you can comment on or make changes to this bug.