Open Bug 898436 Opened 11 years ago Updated 2 years ago

Write unit tests to exercise RasterImage's decode functions to make sure you can always call them

Categories

(Core :: Graphics: ImageLib, defect)

x86_64
Linux
defect

Tracking

()

People

(Reporter: joe, Unassigned, Mentored)

References

Details

(Whiteboard: [lang=c++])

Attachments

(1 file, 2 obsolete files)

We need to write some unit tests for RasterImage so that we're sure that we can always call them with a size decode open, for example. Ideally we'd create a RasterImage and then call the various Decode methods on it in various orders. I want a test to ensure that we never run into locking problems that result in deadlocks, like bug 887466. It's probably easiest to do this with gtest: https://developer.mozilla.org/en-US/docs/GTest
I'm interested in this one, but please advise if other mentored bugs in the gfx team are more urgent. Looks like I should derive a new TestRasterImage from TestBase in gfx/tests/gtest and mention the cpp in that directory's moz.build. but re the " !='windows' " in moz.build, I'm not sure if this should be included in windows or not. Would it offend your coding standard if I didn't make a header, but just declared the new test class in the new cpp? Somebody did that in TestTiledLayerBuffer, but it's not a TestBase derivative. I think I'll just make a dummy test class and ask for a review of that first, then you can tell me more about what you want tested.
Attached patch TestRasterImageDummy.patch (obsolete) (deleted) — Splinter Review
Here's the dummy test for RasterImage. Please provide feedback about whether I put it in the right place, followed the coding standard, need to do something else as well, etc, and then tell me more about what specific tests are required. When can we meet on IRC #gfx ?
Attachment #785363 - Flags: review?(joe)
Flags: needinfo?(bgirard)
BenWa, can you help Adrian?
Hi folks, Apart from feedback on the attached skeleton, I'd like to brainstorm ahead a bit and ask for feedback. Firstly, if the goal is to hunt deadlocks, then I'm not seeing anything in gtest or any mozilla extension of gtest that could find one without deadlocking the whole test run. So either it's there and I've missed it, or it could be written generically. I'm thinking of something along the lines of ENSURE_THIS_COMPLETES_WITHIN_TIMEOUT in someplace like TestBase. Should I write that or not? If so, where? What's the currently favoured platform-independent threads/locks/timers mechanism? Secondly, the bug that triggered this one wouldn't have happened if the lock mechanism didn't block lock attempts from the owning thread. Microsoft's critical sections and mutex's are like that, i.e. they allow multiple locks from the same thread. So is it worth changing or replacing the lock mechanism in this way? There's a conversation about this here: http://stackoverflow.com/questions/2821263/lock-a-mutex-multiple-times-in-the-same-thread which seems to imply that linux offers a flag about this but implementation might be patchy. I could write a test to see if recursive mutexes are working. As for the actual tests, right now I'm just trying to figure out which of these decode functions are actually called in what order in the code. Then I'll have a better idea of what tests are needed. It's taking me a while to read it. I think I discovered a flow from an image request to some consumers (presumably the renderer) via various trackers and observers. The idea seems to be that the end consumer gets updated with progress about the dimensions of the image, individual frames of an animated one and regions of a huge one, although either the responding server or the decoder might be slow and it all might get cancelled anytime. But precisely which requests get posted to which threads and which decode functions get called is eluding me still, so I'm not blocked.
Attached patch TestRasterImage.patch (obsolete) (deleted) — Splinter Review
Or like this? Now it needs a clobber after patching to rebuild properly, and I'm not sure if RasterImage is part of Moz2D or not (or will be.)
Attachment #786297 - Flags: review?(bgirard)
Comment on attachment 786297 [details] [diff] [review] TestRasterImage.patch Review of attachment 786297 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/2d/unittest/TestRasterImage.h @@ +6,5 @@ > +#pragma once > + > +#include "TestBase.h" > + > +class TestRasterImage : public TestBase TestBase is something used by moz2d which is our 2D drawing API which is a separate module from our image stuff. You shouldn't need this header and TestBase and instead just import gtest.h like other non moz2d tests. This means that this test wont break if something changes with moz2d. In fact it would be better if the test lived with RasterImage. Can you setup this folder to host gtests: http://mxr.mozilla.org/mozilla-central/source/image/test/? You can find more documentation in https://developer.mozilla.org/en-US/docs/GTest note that it's a bit out of date has more stuff has moved to moz.build since I wrote it. ::: gfx/tests/gtest/TestMoz2D.cpp @@ +26,5 @@ > > ASSERT_EQ(failures, 0); > } > + > +TEST(Moz2D, RasterImage) { And in your image/test/gtest/TestRasterImage.cpp you will just need one of these function per test you want to run. No need wrap moz2D test. The reason Moz2D work this way is because it has its own standalone unit test framework and here we just wrap the tests. ::: gfx/tests/gtest/moz.build @@ +21,5 @@ > 'TestBase.cpp', > 'TestMoz2D.cpp', > 'TestPoint.cpp', > 'TestScaling.cpp', > + 'TestRasterImage.cpp', Also ideally you test would also work on windows so wouldn't need to be with the != "windows" block.
Attachment #786297 - Flags: review?(bgirard) → feedback+
(In reply to Adrian May [:adrianmay] from comment #4) > Hi folks, > > Apart from feedback on the attached skeleton, I'd like to brainstorm ahead a > bit and ask for feedback. > > Firstly, if the goal is to hunt deadlocks, then I'm not seeing anything in > gtest or any mozilla extension of gtest that could find one without > deadlocking the whole test run. So either it's there and I've missed it, or > it could be written generically. I'm thinking of something along the lines > of ENSURE_THIS_COMPLETES_WITHIN_TIMEOUT in someplace like TestBase. Should I > write that or not? If so, where? What's the currently favoured > platform-independent threads/locks/timers mechanism? A test completing is a sign that the functionality is working well. We timeout our tests if there's no input for 300 seconds so for now let's not implement anything fancy. Our best platform abstractions are generally prefer mfbt, otherwise xpcom and nspr (and sometimes chromium ipc bits we imported. For the threading and locking I suggest using the xpcom bits: http://mxr.mozilla.org/mozilla-central/source/xpcom/threads/nsThread.h http://mxr.mozilla.org/mozilla-central/source/xpcom/glue/Mutex.h > > Secondly, the bug that triggered this one wouldn't have happened if the lock > mechanism didn't block lock attempts from the owning thread. Microsoft's > critical sections and mutex's are like that, i.e. they allow multiple locks > from the same thread. So is it worth changing or replacing the lock > mechanism in this way? There's a conversation about this here: > > http://stackoverflow.com/questions/2821263/lock-a-mutex-multiple-times-in- > the-same-thread > which seems to imply that linux offers a flag about this but implementation > might be patchy. I could write a test to see if recursive mutexes are > working. What your looking for is re-entrant locks. We have support for them but generally we prefer to use non re-entrant lock. We should understand what we do while holding a lock and know what code we're hitting while holding that lock. So ideally we should know if we are hitting something that requires to grab a lock we already own. I don't know the RasterImage code so I can't comment about the specific but if you need more specific help let me know and I'll check out the code. > > As for the actual tests, right now I'm just trying to figure out which of > these decode functions are actually called in what order in the code. Then > I'll have a better idea of what tests are needed. It's taking me a while to > read it. I think I discovered a flow from an image request to some consumers > (presumably the renderer) via various trackers and observers. The idea seems > to be that the end consumer gets updated with progress about the dimensions > of the image, individual frames of an animated one and regions of a huge > one, although either the responding server or the decoder might be slow and > it all might get cancelled anytime. But precisely which requests get posted > to which threads and which decode functions get called is eluding me still, > so I'm not blocked. Ok let us know if you need help and I can read up on the code to give you some help or find someone who knows this code. Thanks for looking into this!
Flags: needinfo?(bgirard)
OK I have a patch like you said in the pipeline, but I think I'm gonna concentrate on 877115 for a few days because (a) this one is making me feel the need to understand all the code from imgRequest to the consumers, and (b) this one might become mostly obsolete if somebody fixes 876687. 877115, OTOH, gives me a quick and easy way to do something useful while I get my brain around the codebase. Returnig to this one, how would you feel about writing gtest macros in the mutex code itself? Something like: Mutex::Lock() { #ifdef GTEST if (We're the owner) GTEST_BARF #endif Lock as usual... } It could be frowned on to mention something specific like gtest in such a central place, but I'm not sure how else to be sure we detected the reentries. I'm assuming that #define GTEST exists and that the mutex gets compiled that way when building gtest, but perhaps even that isn't true.
Attached patch TestRasterImage_2.patch (deleted) — Splinter Review
Ok, I can find the reentries as easily as this - no need to muck about with the locks. But it aborts the whole test run which seems a bit drastic. You wouldn't want to land this cos the test deliberately fails. I just want to check that there's nothing wrong so far.
Attachment #785363 - Attachment is obsolete: true
Attachment #786297 - Attachment is obsolete: true
Attachment #785363 - Flags: review?(joe)
Attachment #787938 - Flags: review?(bgirard)
(In reply to Adrian May [:adrianmay] from comment #8) > > Returnig to this one, how would you feel about writing gtest macros in the > mutex code itself? Something like: > > Mutex::Lock() > { > #ifdef GTEST > if (We're the owner) > GTEST_BARF > #endif > Lock as usual... > } > > It could be frowned on to mention something specific like gtest in such a > central place, but I'm not sure how else to be sure we detected the > reentries. I'm assuming that #define GTEST exists and that the mutex gets > compiled that way when building gtest, but perhaps even that isn't true. Our primary builds (nightly, aurora, beta, release, developers if they build with test) are compiled with GTEST. If you want to add general assertion then you should do so under #IFDEF DEBUG. We run GTEST both on a release and DEBUG configuration. We can't support a general #ifdef GTEST define that only gets compile for GTEST when running unit test without a large charge to how we build because we'd have to build a cpp twice and conditionally link one or the other. It's best to write general assertions and have them run in ifdef DEBUG.
Whiteboard: [lang=c++][mentor=joe@drew.ca] → [lang=c++][mentor=benwa]
(In reply to Adrian May [:adrianmay] from comment #9) > Created attachment 787938 [details] [diff] [review] > TestRasterImage_2.patch > > Ok, I can find the reentries as easily as this - no need to muck about with > the locks. But it aborts the whole test run which seems a bit drastic. I can't see what's triggering the abort for them patch. Can you elaborate? Perhaps you forgot to include a change? > > You wouldn't want to land this cos the test deliberately fails. I just want > to check that there's nothing wrong so far. If we have a test that depends on a fix it's still worth checking-in the test into the tree but waiting for the fix to activate the test.
Attachment #787938 - Flags: review?(bgirard)
Try it and you'll see what I mean. My test deliberately provokes the timeout you told me about. Or perhaps it's annoying something specific to the mutexes. But the result is not merely a fail, it's breaking out of the whole test script. Clearly I don't need the GTEST thing I was talking about earlier. I figured out in the meantime that all the cpps are not recompiled when you type ./mach gtest. But if you land this, gtest itself is broken. You'll only get the first half of the tests. Like this: 0:09.56 TEST-START | TiledLayerBuffer.TileStart 0:09.56 TEST-PASS | TiledLayerBuffer.TileStart | test completed (time: 0ms) 0:09.56 TEST-START | TiledLayerBuffer.EmptyUpdate 0:09.56 TEST-PASS | TiledLayerBuffer.EmptyUpdate | test completed (time: 1ms) 0:09.56 TEST-START | TestRasterImage.Easy 0:09.56 ###!!! ERROR: Potential deadlock detected: 0:09.56 === Cyclical dependency starts at 0:09.56 --- Mutex : TestRasterImageMutex (currently acquired) 0:09.56 calling context 0:09.56 [stack trace unavailable] 0:09.56 0:09.56 === Cycle completed at 0:09.56 --- Mutex : TestRasterImageMutex (currently acquired) 0:09.58 calling contextgtests TEST-UNEXPECTED-FAIL | firefox | test crashed 0:09.58 0:09.58 [stack trace unavailable] 0:09.58 0:09.58 ###!!! Deadlock may happen NOW! 0:09.58 0:09.58 ###!!! ASSERTION: Potential deadlock detected: 0:09.58 Cyclical dependency starts at 0:09.58 Mutex : TestRasterImageMutex (currently acquired) 0:09.58 Cycle completed at 0:09.58 Mutex : TestRasterImageMutex (currently acquired) 0:09.58 0:09.58 ###!!! Deadlock may happen NOW! 0:09.58 0:09.58 : 'Error', file /home/ad/Mozilla/Gfx/mozilla-central/obj-x86_64-unknown-linux-gnu/xpcom/build/BlockingResourceBase.cpp, line 125 0:09.58 UNKNOWN [/home/ad/Mozilla/Gfx/mozilla-central/obj-x86_64-unknown-linux-gnu/dist/bin/gtest/libxul.so +0x01C7FE95] 0:09.58 UNKNOWN [/home/ad/Mozilla/Gfx/mozilla-central/obj-x86_64-unknown-linux-gnu/dist/bin/gtest/libxul.so +0x0181975D] 0:09.58 UNKNOWN [/home/ad/Mozilla/Gfx/mozilla-central/obj-x86_64-unknown-linux-gnu/dist/bin/gtest/libxul.so +0x01805580] 0:09.58 UNKNOWN [/home/ad/Mozilla/Gfx/mozilla-central/obj-x86_64-unknown-linux-gnu/dist/bin/gtest/libxul.so +0x01805695] 0:09.58 UNKNOWN [/home/ad/Mozilla/Gfx/mozilla-central/obj-x86_64-unknown-linux-gnu/dist/bin/gtest/libxul.so +0x01805744]
(In reply to Adrian May [:adrianmay] from comment #12) > Try it and you'll see what I mean. My test deliberately provokes the timeout > you told me about. Or perhaps it's annoying something specific to the > mutexes. But the result is not merely a fail, it's breaking out of the whole > test script. We should be writing a test where we don't expect a deadlock to happen. If the deadlock does happen then we have a real bug and it's not a big deal if the entire test suite is aborted in a bug. > > Clearly I don't need the GTEST thing I was talking about earlier. I figured > out in the meantime that all the cpps are not recompiled when you type > ./mach gtest. There are not, you want to use something like './mach build && ./mach gtest'. > > But if you land this, gtest itself is broken. You'll only get the first half > of the tests. Like this: > Right currently the test forces a deadlock. We should be calling the RasterImage with various valid sequence and make sure we never hit a deadlock.
> We should be writing a test where we don't expect a deadlock to happen. If > the deadlock does happen then we have a real bug and it's not a big deal if > the entire test suite is aborted in a bug. OK > Right currently the test forces a deadlock. We should be calling the > RasterImage with various valid sequence and make sure we never hit a > deadlock. Well, I'm poking through all the surrounding code to see how these decode methods are called in reality. I haven't got my brain around the thread structure yet, but I will if I stare at it for a while. I guess I have to think about different types and sizes of image and server speeds too.
Mentor: bgirard
Whiteboard: [lang=c++][mentor=benwa] → [lang=c++]
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: