Closed Bug 387132 Opened 18 years ago Closed 16 years ago

speed up reftest running

Categories

(Testing :: Reftest, defect)

defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: vlad, Assigned: vlad)

References

Details

(Keywords: fixed1.9.1)

Attachments

(2 files, 9 obsolete files)

(deleted), patch
vlad
: review+
Details | Diff | Splinter Review
(deleted), patch
Details | Diff | Splinter Review
Attached patch speed up reftest (obsolete) (deleted) — Splinter Review
Reftest currently uses toDataURL() to encode each test to a PNG data: URI, then compares strings. This is pretty slow. We should use a binary component instead that examines the data directly. As a bonus, we can get the exact number of pixels that differ... note that I changed the output format for this, not sure what else needs to be updated for that.
Attachment #271230 - Flags: review?(dbaron)
Blocks: 372196
Hmm, not working for me on OS X. :-( rm -f libreftest_s.dylib c++ -fno-rtti -fno-exceptions -Wall -Wconversion -Wpointer-arith -Wcast-align -Woverloaded-virtual -Wsynth -Wno-ctor-dtor-privacy -Wno-non-virtual-dtor -Wno-long-long -isysroot /Developer/SDKs/MacOSX10.4u.sdk -fpascal-strings -no-cpp-precomp -fno-common -fshort-wchar -I/Developer/SDKs/MacOSX10.4u.sdk/Developer/Headers/FlatCarbon -pipe -DDEBUG -D_DEBUG -DDEBUG_dolske -DTRACING -g -fPIC -o libreftest_s.dylib nsReftestHelper.o -L../../../dist/bin -L../../../dist/lib -lthebes -L../../../dist/bin -lxpcom -lxpcom_core -L../../../dist/bin -L../../../dist/lib -lplds4 -lplc4 -lnspr4 -bundle /usr/bin/ld: warning can't open dynamic library: @executable_path/libmozz.dylib referenced from: ../../../dist/bin/libthebes.dylib (checking for undefined symbols may be affected) (No such file or directory, errno = 2) /usr/bin/ld: Undefined symbols: _MOZ_Z_deflate referenced from libthebes expected to be defined in @executable_path/libmozz.dylib _MOZ_Z_deflateEnd referenced from libthebes expected to be defined in @executable_path/libmozz.dylib _MOZ_Z_deflateInit_ referenced from libthebes expected to be defined in @executable_path/libmozz.dylib collect2: ld returned 1 exit status make[1]: *** [libreftest_s.dylib] Error 1 make: *** [default] Error 2 My .mozconfig is for a debug shared build.
Ah, fixed. Needed to add |zlib| to REQUIRES, and |$(ZLIB_LIBS)| to EXTRA_DSO_LDOPTS in the Makefile.
would be nice if the IDL file had docs, especially about the meaning of the return value
With this patch, reftest runs on my laptop dropped from ~15 minutes to ~5 minutes! Yay! Is it useful to count the number of differing pixels? It might be faster still to only use the memcmp result, and let the caller determine if it wants to do extra work concerning the difference... For |!=| reftests, the memcmp result is sufficient.
Speaking of !=, a fair number of tests are !=. Maybe pass in the expected result to optimize for both cases, not just the cases where the data in the two canvases are the same? If memcmp reports a difference on a !=, I shouldn't really care how many differ.
Counting the number of changed pixels can be useful; I'd like to see that introduced into the reftest language, so that you can set a max threshhold. For example, there are a few reftests that are currently marked as failing on the mac because Quartz has an off-by-one error in some compositing operators; it would be nice to be able to say "on mac, this test will have exactly 2 differing pixels" or "at most 1 differing pixel" so that we can still get coverage of the test there.
I would suggest that it would be better if the comparison just reported the pixels that are different. Is there a way to identify the test results you describe? Yes. It would be better done in the harness that runs the tests. The current scheme used to invoke reftests is static, using lists of tests in files. A dynamic system that could find the list of available tests, decide which to run based on rules, and then decide how to interpret the results based on other rules would be better. At least do not make the test result less informative.
I timed the reftests with and without pixel counting on failure... With pixel counting: real 4m52.232s user 3m38.345s sys 0m51.753s Without pixel counting: real 4m46.770s user 3m34.409s sys 0m50.956s Might as well count 'em.
Cool, thanks for doing that timing.. I had a feeling it was going to be something like that (at that point it's such a tight loop that it should be really cheap).
+interface nsIReftestHelper : nsISupports { + unsigned long compareCanvas (in nsIDOMHTMLCanvasElement canvas1, + in nsIDOMHTMLCanvasElement canvas2); +}; Some documentation of this method seems appropriate. Tabs in this patch need to be spaces... How about refactoring CompareCanvas so it uses a single function that takes an nsIDOMHTMLCanvasElement* and returns a gfxImageSurface* or null on failure? Then you can call it twice ... would save quite a bit of code. I don't personally think the pixel count is all that interesting, but what the heck. + if (*(unsigned int *) p1 != + *(unsigned int *) p2) PRUint32 ... if we're going to use it anywhere, better use it here :-) BTW the short lines are a little odd ... did you compose this on a VIC-20 or something? +function CanvasToURL(canvas) You don't seem to use this. You probably should check with dbaron that the changes to the reftest output are OK. I don't know what scripts might be trying to parse it.
Attached patch speed up reftest, v2.0 (obsolete) (deleted) — Splinter Review
* Fixed indentation * Abstracted out canvas-to-image-surface * Removed change to output format (just added an extra line instead) * Made reftests work even when the helper class wasn't available (via the old toDataURL)
Attachment #271230 - Attachment is obsolete: true
Attachment #274025 - Flags: review?(roc)
Attachment #271230 - Flags: review?(dbaron)
+ nsRefPtr<gfxImageSurface> img = new gfxImageSurface(gfxIntSize(w, h), gfxASurface::ImageFormatARGB32); + nsRefPtr<gfxContext> t = new gfxContext(img); Check for null? + nsCOMPtr<nsICanvasElement> elt1 = do_QueryInterface(canvas1); + nsCOMPtr<nsICanvasElement> elt2 = do_QueryInterface(canvas2); + if (!elt1 || !elt2) + return NS_ERROR_FAILURE; + + PRUint32 w, h, w2, h2; + + rv = elt1->GetSize(&w, &h); + NS_ENSURE_SUCCESS(rv, rv); + + rv = elt2->GetSize(&w2, &h2); + NS_ENSURE_SUCCESS(rv, rv); + + if (w != w2 || h != h2) + return NS_ERROR_FAILURE; + Isn't it easier to just get the gfxImageSurfaces first and then compare their sizes? + nsRefPtr<gfxImageSurface> img1 = CanvasToImageSurface(canvas1); + nsRefPtr<gfxImageSurface> img2 = CanvasToImageSurface(canvas2); + + if (img1->Stride() != img2->Stride()) You definitely need to check for nulls here. + if (*(unsigned int *) p1 != *(unsigned int *) p2) PRUint32!!
Attached patch speed up reftest, v2.1 (checked in) (obsolete) (deleted) — Splinter Review
Updated.
Attachment #274025 - Attachment is obsolete: true
Attachment #274330 - Flags: review?(roc)
Attachment #274025 - Flags: review?(roc)
Does this work on OS X as is? In comment #2 I had to add sprinkle on some magic zlib dust to get things going.
Does for me, I thought I added the magic zlib dust? I'll check before checking in, though the original built fine for me too, so I dunno.
Comment on attachment 274330 [details] [diff] [review] speed up reftest, v2.1 (checked in) >+ dump("---- number of differing pixels in reftest: " + differences + "\n"); Could you prefix this with "REFTEST:" so that we can still get all reftest output by grepping for that? r=dbaron with that (although I didn't really look at the C++ parts)
Attachment #274330 - Flags: review+
Sure, will do. We might break a parser or two, but we can fix that separately.
Gah, this is being screwed over by libxul. The component is checked in, but with the binary bit disabled in the Makefile. Not sure what to do.
Just add LIBXUL_LIBRARY = 1 to the Makefile? (I actually was going to suggest that, but then figured it probably didn't matter; I suppose it does.)
Wasn't able to get it working with libxul -- nsICanvasRenderingContextInternal uses strings :( bsmedberg said he was going to take a whack at it. For now, we can reenable the binary bit if libxul is disabled.
Attachment #275202 - Flags: review?
Attachment #275202 - Flags: review? → review?(dbaron)
Attached patch reftest-fast for libxul, rev. 1 (obsolete) (deleted) — Splinter Review
Attachment #275453 - Flags: review?(vladimir)
Comment on attachment 275453 [details] [diff] [review] reftest-fast for libxul, rev. 1 Looks good, thanks for looking into this!
Attachment #275453 - Flags: review?(vladimir) → review+
Comment on attachment 275202 [details] [diff] [review] don't build the binary component if libxul is enabled r=dbaron if you still need it, although I'm hoping you don't
Attachment #275202 - Flags: review?(dbaron) → review+
Comment on attachment 275453 [details] [diff] [review] reftest-fast for libxul, rev. 1 Hmm, looks like bsmedberg's last patch (to make this work in libxul) got dropped on the floor after r+... AFAIK there's no reason not to take this, so I'll request sr/a-1.9.
Attachment #275453 - Flags: superreview?(roc)
Attachment #275453 - Flags: superreview?(roc)
Attachment #275453 - Flags: superreview+
Attachment #275453 - Flags: approval1.9+
So, CVS history like this was previously checked in, but had the Makefile change backed out due to OS X linkage issues. Naturally, that was the only file I looked at to see if this was in yet. :( This is an updated patch without the bits already checked in, but will presumably need fixing before it's ready for checkin.
Attachment #275202 - Attachment is obsolete: true
Attachment #275453 - Attachment is obsolete: true
Attachment #283094 - Attachment is patch: true
Attachment #283094 - Attachment mime type: application/octet-stream → text/plain
Attached patch Almost works (obsolete) (deleted) — Splinter Review
The problem with the last patch is that XPCOM_GLUE_LDOPTS and MOZ_COMPONENT_LIBS both contain "-Wl,-executable_path,../../../dist/bin" (when libxul is enabled on OS X). Apparently this displeases the linker, even though both values are the same, and it complains. This patch is basically what bsmedberg tried doing to fix the bustage last time, which is to pull out and use just the bits of XPCOM_GLUE_LDOPTS that are actually needed. This is probably evil and prone to breakage, but whatever! :-) I clobbered a few times with and without libxul (on OS X) and this now compiles. Unfortunately, it doesn't seem to actually work. :-( Bug 396972 changed the canvas API used here; now either my patch is using it wrong or it [::Render()] is broken... The pixel values we end up looking at are always zeros.
Attachment #283094 - Attachment is obsolete: true
Attachment #274330 - Attachment description: speed up reftest, v2.1 → speed up reftest, v2.1 (checked in)
Attached patch Patch for review, v.3 (deleted) — Splinter Review
Vlad pointed out the error of my ways. This patch now works for me. I kinda hate to set a review flag on this bug yet again, but will err on the side of caution since the PLEASE_LINK_KTHX hack is kind of ugly. I'm not really sure if there's a better way to do it, or if this is just "good enough".
Attachment #274330 - Attachment is obsolete: true
Attachment #283306 - Attachment is obsolete: true
Attachment #283310 - Flags: review?(vladimir)
Attachment #283310 - Attachment is patch: true
Attachment #283310 - Attachment mime type: application/octet-stream → text/plain
Attachment #283310 - Flags: review?(vladimir) → review+
Attachment #283310 - Flags: superreview?(roc)
Attachment #283310 - Flags: approval1.9?
Comment on attachment 283310 [details] [diff] [review] Patch for review, v.3 I'm not really a build system kind of guy
Attachment #283310 - Flags: superreview?(roc) → superreview?(benjamin)
Attachment #283310 - Flags: superreview?(benjamin) → superreview+
Comment on attachment 283310 [details] [diff] [review] Patch for review, v.3 Sorry, I forgot we're code red right now. I *hate* the fact that we're using the same flag to mean different things at different times --- and that we're repeatedly switching between the two meanings!
Attachment #283310 - Flags: approval1.9+ → approval1.9?
Attachment #283310 - Flags: approval1.9? → approval1.9+
cvs commit: Examining layout/tools/reftest Checking in layout/tools/reftest/Makefile.in; /cvsroot/mozilla/layout/tools/reftest/Makefile.in,v <-- Makefile.in new revision: 1.9; previous revision: 1.8 done Checking in layout/tools/reftest/nsReftestHelper.cpp; /cvsroot/mozilla/layout/tools/reftest/nsReftestHelper.cpp,v <-- nsReftestHelper.cpp new revision: 1.3; previous revision: 1.2 done
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Attached patch Disable fast reftests on Windows (obsolete) (deleted) — Splinter Review
Bah. Now the linker problems are on Windows. Linux and OS X seem to be working fine, so I'm going to conditionally revert this by not building on Windows.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Windows is complaining: nsReftestHelper.obj : error LNK2001: unresolved external symbol "public: virtual unsigned int __thiscall gfxASurface::BeginPrinting(class nsAString const &,class nsAString const &)" (?BeginPrinting@gfxASurface@@UAEIABVnsAString@@0@Z) I don't really get why, as all the BeginPrinting() stuff is within Thebes. I do note that although gfxASurface.h defines it as a virtual function (and gfxASurface.cpp provides a default implementation), all the other gfxXXXXSurface subclasses also define and implement it... Except for gfxImageSurface, which is what nsReftestHelper is using. Maybe MSVC doesn't like this?
I'm getting libreftest.so link problems when building on Ubuntu Linux with fresh builds from this morning. Errors: http://pastebin.mozilla.org/212807 I fixed this locally by substituting "Linux" for "WINNT" in the line added by the "Disable fast reftests on Windows" patch.
Here are the .mozconfig options that I used when getting the above link problems: mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/../obj mk_add_options MOZ_CO_PROJECT=browser ac_add_options --enable-application=browser ac_add_options --enable-debug --disable-optimize ac_add_options --disable-crashreporter I also tried using --disable-libxul, and that didn't fix the issue.
Also, I got these link issues on two separate machines: - a laptop running Ubuntu 7.04 - a desktop running Ubuntu 7.10 beta
I started to get /usr/bin/ld: nsReftestHelper.o: relocation R_X86_64_PC32 against `NS_NewGenericModule2(nsModuleInfo const*, nsIModule**)' can not be used when making a shared object; recompile with -fPIC /usr/bin/ld: final link failed: Bad value
Ugh -- punting. I've disabled this code again. :-( I'm not sure why Windows/Linux fails now, even on non-libxul builds. Seems like something wonky is going on here, with the different platforms failing in different ways. Even though this was working on OS X now (ironically), I disabled it there too lest the unknown-wonkiness spread and cause headaches for someone else later.
Attachment #283648 - Attachment is obsolete: true
Comment on attachment 275453 [details] [diff] [review] reftest-fast for libxul, rev. 1 Resetting approval flags on bugs that have not been checked in by Oct 22, 11:59PM PDT. Please re-request approval if needed.
Attachment #275453 - Flags: approval1.9+
Comment on attachment 283310 [details] [diff] [review] Patch for review, v.3 Resetting all approval1.9+ flags on bugs that have not been checked in by Oct 22 11:59 PM PDT. Please re-request approval if needed.
Attachment #283310 - Flags: approval1.9+
Comment on attachment 275453 [details] [diff] [review] reftest-fast for libxul, rev. 1 This was landed by bsmedberg on 2007-08-06 13:13.
Attachment #275453 - Flags: approval1.9?
Comment on attachment 275453 [details] [diff] [review] reftest-fast for libxul, rev. 1 Reset approval flag to + as it was already checked in.
Attachment #275453 - Flags: approval1.9? → approval1.9+
Note to anyone coming in for the first time: this is checked in, but disabled in the makefile due to not being able to figure out how to make the damn thing link.
So what situations does this actually work in? Can we enable it in as many such situations if we can (using platform ifdefs and libxul ifdefs as needed)? If we can speed up developers' test runs or some of the unit test boxen, that would be graet, even if we can't speed up all reftest consumers because of the string stuff. That said, nsICanvasRenderingContextInternal just seems to use |string| and |wstring|; those shouldn't cause linkage issues, right?
[missed 1.9 checkin] - This landed but is disabled.
Whiteboard: [missed 1.9 checkin]
Hardware: PC → All
Version: unspecified → Trunk
FWIW, it works fine for me on Tiger if I comment out the 'ifdef DISABLED_SEE_BUG_387132' lines in layout/tools/reftest/Makefile.in.
(In reply to comment #45) > FWIW, it works fine for me on Tiger Yeah, this is known to work on Mac, but breaks on Windows/Linux, so we're currently disabling it globally, since the breakage isn't well understood. See comment 37 for more info / reasoning.
It's disabled, but not because of any concerns about 1.9 -- just noone could get the thing to link correctly under all cases. bz: The issue is with gfxASurface::BeginPrinting/EndPrinting, which are defined as non-virtual methods in the header. They use an nsAString, which ends up changing their signature when used outside of core code. One idea (that I think I tried?) would be to just define those methods inline in the header (as they are trivial no-ops on gfxASurface).
Whiteboard: [missed 1.9 checkin] → [not needed for 1.9]
So why can't we link this the same way we link libgklayout? In other words, MOZILLA_INTERAL_API, and no XPCOM glue mess?
Another option is just to add a "compare two canvases" API to DOMWindowUtils in all builds.
That's not a bad idea; there's no security impact or anything, so the code can just be moved to DOMWindowUtils. Well, I guess someone might be able to use it to see if a write-only canvas is the same as one that they control, so maybe we need to limit it to read/write canvases?
Yeah, I think limiting to read/write or caller having UniversalXPConnect is good.
I was assuming we'd require UniversalXPConnect like the other DOMWindowsUtils methods. But sure, imposing the same check as getImageData makes sense. Maybe in fact we should just make it a moz method in canvas itself.
Attached patch let's try this again, in DOMWindowUtils now (obsolete) (deleted) — Splinter Review
Ok, moved to DOMWindowUtils. Is fast. I'll kick off a tryserver run.
Attachment #349289 - Flags: review?(roc)
Comment on attachment 349289 [details] [diff] [review] let's try this again, in DOMWindowUtils now + if (gWindowUtils) { + dump ("Using canvas comparison acceleration.\n"); } Probably don't want this. Once your patch lands, it's always going to be there. Add some comments in nsIDOMWindowUtils.idl explaining that if the canvas sizes are different, it throws, and also that UniversalXPConnect is needed.
Vlad, you want to change WindowSnapshot.js too. Otherwise the mochitests that were using nsIReftestHelper to compare canvases will fail.
Er, I guess that code dealt with lack of reftest helper, of course. But still, would be good to use the new setup there. ;)
We should fix this ASAP, it should save developers and Tinderboxes some time.
Flags: wanted1.9.1?
Whiteboard: [not needed for 1.9]
I didn't even know about WindowSnapshot.js, updated. Do I want to try to land this now or wait until after the beta?
Flags: wanted1.9.1? → wanted1.9.1+
Vlad, don't forget to land this :-)
Too late, tree's clolsed.
Ted, any chance you could land this in the wee hours of the morning before the west coast wakes up? I'd try to get it in tonight but I might pass out before it gets late enough.
I may be able to land this this afternoon if you like.
If roc doesn't get it, I can land it tomorrow.
Pushed to trunk as c0af304f59ea
Status: REOPENED → RESOLVED
Closed: 17 years ago16 years ago
Resolution: --- → FIXED
Whiteboard: [needs 191 approval]
Comment on attachment 349289 [details] [diff] [review] let's try this again, in DOMWindowUtils now Speeds up reftests a lot, worth having on the 1.9.1 branch
Attachment #349289 - Flags: approval1.9.1?
Comment on attachment 349289 [details] [diff] [review] let's try this again, in DOMWindowUtils now a191=beltzner
Attachment #349289 - Flags: approval1.9.1? → approval1.9.1+
Whiteboard: [needs 191 approval] → [needs 191 landing]
Component: Layout → Reftest
Flags: wanted1.9.1+
Product: Core → Testing
Version: Trunk → unspecified
Attachment #349289 - Attachment is obsolete: true
Attachment #349348 - Attachment description: updated → updated [Checkin: Comment 65 & 68]
[Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.2a1pre) Gecko/20081209 Minefield/3.2a1pre] (home, optim default) (W2Ksp4) (http://hg.mozilla.org/mozilla-central/rev/85507cfcdda8) V.Fixed: duration dropped to 4 mn from 8 mn on my computer.
Status: RESOLVED → VERIFIED
Keywords: fixed1.9.1
Whiteboard: [needs 191 landing]
This added few unsigned vs. signed warnings to nsDOMWindowUtils.
Depends on: 504929
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: