Closed
Bug 978966
Opened 11 years ago
Closed 10 years ago
964 WebGL conformance test failures in mochitest-plain1 on Windows Server 2012 instances
Categories
(Core :: Graphics: CanvasWebGL, defect)
Tracking
()
RESOLVED
FIXED
mozilla34
People
(Reporter: bugzilla, Assigned: bjacob)
References
Details
(Whiteboard: webgl-correctness)
Attachments
(4 files, all since deleted)
application/zip
text/plain
patch (u480271: review+)
patch (u480271: review+)
Modulo other problems caused by configuration issues, there are still 964 failures in mochitest-plain1 in the WebGL conformance test suite.
Reporter | ||
Comment 1•11 years ago
|
||
Reporter | ||
Updated•11 years ago
|
Component: Platform Support → Canvas: WebGL
Product: Release Engineering → Core
QA Contact: coop
Hardware: x86_64 → x86
Version: other → Trunk
Comment 2•11 years ago
|
||
These tests pass on regular Windows machines, but fail on the server, right?
Reporter | ||
Comment 3•11 years ago
|
||
Correct.
Reporter | ||
Comment 4•11 years ago
|
||
Note that the Windows Server 2012 test environment is set up such that the test session is running with the RemoteFX virtual GPU.
Comment 5•11 years ago
|
||
Nothing subtle about these differences.
Here's the actual WebGL error list from this run. Looking at the individual failures, only 34 tests have failures in them (some have many, probably from the same underlying cause).
On the VMs, these are using the Windows Server virtual GPU. Hopefully this is something that's fixable, as the virtual GPU seemed to be enough to run things like Epic Citadel just fine.
How do we get access to the virtual GPU for testing? I'd like to see screenshots of actual vs. expected at a minimum.
Comment 8•11 years ago
|
||
If you open the error log (attachment) you'll see URLs in it that say something like "paste this into your browser address to see results". So, you can at least see what that run looked like.
Aaron, can you set up Dan with a VM on AWS? This kind of thing should be debuggable without any of the actual testing infra, since it should reproduce by just starting up firefox in the VM and running the webgl tests. If that's difficult, I can give Dan an instance on my AWS account.
Reporter | ||
Comment 10•11 years ago
|
||
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #9)
> Aaron, can you set up Dan with a VM on AWS? This kind of thing should be
> debuggable without any of the actual testing infra, since it should
> reproduce by just starting up firefox in the VM and running the webgl tests.
> If that's difficult, I can give Dan an instance on my AWS account.
John should be able to help out with that.
Flags: needinfo?(jhopkins)
Comment 11•11 years ago
|
||
We have two configurations of Windows Server 2012 VMs in use at the moment:
1) continuous integration VMs for testing the Date branch builds.
2) VMs that run tests against a manually specified build upon bootup and output the test logs locally.
Both of these run the tests in a RDP environment to give us the proper graphical context.
What we can't do is simply RDP to the VM and run tests manually because then we won't have the right graphical context. For example, the RDP server may use our local graphics card for acceleration instead of the RemoteFX-based acceleration.
> This kind of thing should be debuggable without any of the actual testing infra, since it should reproduce by just starting up firefox in the VM and running the webgl tests.
For the reasons above, I don't think this will work. Please let me know if I've missed something.
Flags: needinfo?(jhopkins)
(In reply to John Hopkins (:jhopkins) from comment #11)
> > This kind of thing should be debuggable without any of the actual testing infra, since it should reproduce by just starting up firefox in the VM and running the webgl tests.
>
> For the reasons above, I don't think this will work. Please let me know if
> I've missed something.
Yes -- when Dan RDP's in to the cltbld user, he should have the same graphics config that the VM has when the autologin user sets up the RDP session back to itself. If it doesn't reproduce the problem, then we can look for another solution, but for now a VM of the #2 style should work (just something that he can RDP into, or even VNC if we want to be 100% sure that it'll be identical but I don't think we have vnc servers set up.. but that can be plan B if the tests don't reproduce).
Updated•11 years ago
|
Whiteboard: webgl-correctness
Comment 13•11 years ago
|
||
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #12)
> Yes -- when Dan RDP's in to the cltbld user, he should have the same
> graphics config that the VM has when the autologin user sets up the RDP
> session back to itself.
AFAIK, only a single RDP connection is permitted at one time so that probably won't work.
> If it doesn't reproduce the problem, then we can
> look for another solution, but for now a VM of the #2 style should work
> (just something that he can RDP into, or even VNC if we want to be 100% sure
> that it'll be identical but I don't think we have vnc servers set up.. but
> that can be plan B if the tests don't reproduce).
I tried pinging you on IRC earlier re: whether you self-served this request (per your email asking about the AMI). Please let me know. If not, I'll create a new instance for you Thursday morning (Eastern).
Flags: needinfo?(vladimir)
It doesn't need to be simultaneous, he just needs to be able to connect in. Him RDP'ing should be fine. I have not yet had a chance to get a VM for Dan -- GDC stuff is taking up too much time. If you could spin him up a windows VM and send him the credentials, that'd be great. Thanks!
Flags: needinfo?(vladimir)
Comment 15•11 years ago
|
||
Loan request fulfilled in bug 983196
Assignee | ||
Comment 17•10 years ago
|
||
Hi, I'm taking this for a couple days to at least diagnose what we should do about this.
Can I get a VM loaned to me?
Assignee: nobody → bjacob
Flags: needinfo?(taras.mozilla)
Comment 18•10 years ago
|
||
No progress, it dropped off my radar. I see Benoit is picking it up.
Flags: needinfo?(dglastonbury)
Comment 19•10 years ago
|
||
Chris can hook you up (if he hasn't already).
Flags: needinfo?(taras.mozilla) → needinfo?(catlee)
Assignee | ||
Comment 20•10 years ago
|
||
Thanks Chris, I am now reproducing this on the VM.
It's very strange. On the VM, running individual test pages from our content/canvas/test directory, I can reproduce the failures; but running the same tests from the upstream 1.0.1 tests from Khronos' server, most of these tests pass -- even tests that are identical to the version we have in our tree. Investigating.
Flags: needinfo?(catlee)
Assignee | ||
Comment 21•10 years ago
|
||
Figured it out. The problem on these VMs is that the MSAA (multisample anti-aliasing) implementation is broken in that it assumes that blending is enabled and uses the default blendFunc(SRC_ALPHA, ONE_MINUS_SRC_ALPHA). Indeed, disabling blending or trying to change the blendFunc has no effect at all, as long as MSAA is used.
For example, in the gl-clear.html test, we get failures like this:
PASS should be 0,0,0,255
FAIL should be 128,128,128,192
at (0, 0) expected: 128,128,128,192 was 32,32,32,239
The value 32,32,32,239 is what we would get if blending were enabled, but the test disables it; that disabling has no effect on this driver. I tried keeping blending enabled and using a blendFunc (ONE, ZERO) to simulate no blending, but again that blendFunc call had no effect. Then I suspected a bug in the driver's framebuffer operations so I disabled MSAA and then, everything works fine.
That gives us a very easy way out: in WebGL, antialiasing is not mandatory. So let's disable antialiasing on this driver.
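As a rough check of the arithmetic behind the gl-clear.html failure, here is a minimal sketch of the blend equation. It assumes the canvas held 0,0,0,255 (the preceding PASS line) when 128,128,128,192 was written; the helper name `blend` and the variable names are made up for illustration and are not part of any patch in this bug.

```typescript
// RGBA color with 8-bit channels, as printed in the test log.
type Rgba = [number, number, number, number];

// Generic blend: result = src * srcWeight + dst * dstWeight, per channel.
// Weights are in the 0..1 range; results are rounded to 8-bit values.
function blend(src: Rgba, dst: Rgba, srcWeight: number, dstWeight: number): Rgba {
  const mix = (s: number, d: number): number =>
    Math.round(s * srcWeight + d * dstWeight);
  return [mix(src[0], dst[0]), mix(src[1], dst[1]), mix(src[2], dst[2]), mix(src[3], dst[3])];
}

// Color the test expected to read back (blending disabled writes it unchanged).
const expectedColor: Rgba = [128, 128, 128, 192];
// Assumption: canvas contents left over from the previous, passing check.
const previousColor: Rgba = [0, 0, 0, 255];
// Source alpha as a fraction, about 0.753.
const alpha = expectedColor[3] / 255;

// blendFunc(SRC_ALPHA, ONE_MINUS_SRC_ALPHA): src * a + dst * (1 - a).
const srcAlphaBlend = blend(expectedColor, previousColor, alpha, 1 - alpha);

// Same two weights applied the other way around: src * (1 - a) + dst * a.
const swappedBlend = blend(expectedColor, previousColor, 1 - alpha, alpha);
```

Under these assumptions `srcAlphaBlend` comes out as 96,96,96,208, while `swappedBlend` comes out as 32,32,32,239, which is the value in the log. Either way the result differs from the unblended 128,128,128,192, consistent with the driver blending when it should not; which weights the driver really applies is not established here.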
Assignee | ||
Comment 22•10 years ago
|
||
Disabling anti-aliasing in these WebGL conformance tests also seems not too bad from the perspective of losing test coverage. It's orthogonal to WebGL state machine behavior which is what these mochitests primarily intend to cover. Anti-aliasing still works when the default blending operations are used, so we'll still be able to exercise antialiasing in Talos and Reftest.
Assignee | ||
Comment 23•10 years ago
|
||
After discussing this with Vlad:
Tweaking the WebGL conformance tests specifically to avoid antialiasing would be a non-upstreamable departure from Khronos tests, not something that we'd be thrilled to do.
Instead we probably need to accept that since antialiasing is broken on this driver, we must just avoid AA unconditionally there.
If we could agree to keep WebGL Reftests and Talos tests running on real graphics hardware, where they would get antialiasing, that would make it much more acceptable to have just the mochitests run without antialiasing. Because again, antialiasing is not so vital to the mochitests, but it's much more important in reftests (compositing correctness) and Talos (compositing performance).
Assignee | ||
Comment 24•10 years ago
|
||
For my reference - the GL_RENDERER string that we need to check for:
Microsoft Basic Render Driver Direct3D9Ex vs_3_0 ps_3_0
Exposed by ANGLE as
ANGLE (Microsoft Basic Render Driver Direct3D9Ex vs_3_0 ps_3_0)
I guess we should match a substring.
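A minimal sketch of such a substring match follows. The function and constant names are hypothetical, and this is not the code from the attached patch (which is not shown in this bug); it only illustrates that one substring covers both the raw and the ANGLE-wrapped renderer strings.

```typescript
// Substring common to both GL_RENDERER strings quoted above.
const BROKEN_MSAA_RENDERER = "Microsoft Basic Render Driver";

// Hypothetical helper: returns true when the renderer string identifies the
// driver on which antialiasing should be avoided.
function shouldDisableAntialiasing(renderer: string): boolean {
  return renderer.indexOf(BROKEN_MSAA_RENDERER) !== -1;
}
```

Matching a substring rather than the full string means the check does not depend on whether ANGLE adds its "ANGLE (...)" wrapper or on the shader-model suffix.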
Assignee | ||
Comment 25•10 years ago
|
||
Attachment #8461773 -
Flags: review?(jgilbert)
Assignee | ||
Comment 26•10 years ago
|
||
Comment on attachment 8461773 [details] [diff] [review]
no-multisample-in-redmond
Whoever gets this first.
Attachment #8461773 -
Flags: review?(dglastonbury)
Comment 27•10 years ago
|
||
Comment on attachment 8461773 [details] [diff] [review]
no-multisample-in-redmond
Review of attachment 8461773 [details] [diff] [review]:
-----------------------------------------------------------------
LGTM
Attachment #8461773 -
Flags: review?(jgilbert)
Attachment #8461773 -
Flags: review?(dglastonbury)
Attachment #8461773 -
Flags: review+
Comment 28•10 years ago
|
||
Benoit, try run?
Assignee | ||
Comment 29•10 years ago
|
||
Land or do not land, there is no try.
https://hg.mozilla.org/integration/mozilla-inbound/rev/fed0f0f3cb1c
We can leave open though until Aaron confirms that it's fixed.
Chris, I probably don't need my VM anymore.
Flags: needinfo?(catlee)
Flags: needinfo?(aklotz)
Whiteboard: webgl-correctness → webgl-correctness [leave open]
Reporter | ||
Comment 31•10 years ago
|
||
It's WAY better, but there's still quite a bit of stuff in here:
https://tbpl.mozilla.org/php/getParsedLog.php?id=44670275&tree=Date&full=1
Any ideas?
Flags: needinfo?(aklotz)
Assignee | ||
Comment 32•10 years ago
|
||
Great, so now we're down to just 4 WebGL test pages failing:
conformance/context/context-attributes-alpha-depth-stencil-antialias.html
conformance/renderbuffers/framebuffer-object-attachment.html
conformance/state/gl-object-get-calls.html
conformance/more/functions/isTests.html
(Somehow these didn't fail when I ran the upstream 1.0.1 tests on the VM, but our mochitests' copy differs substantially from upstream).
You just need to add them (copy and paste the above 4 lines) into this file:
http://hg.mozilla.org/mozilla-central/file/75fe3b8f592c/dom/canvas/test/webgl-conformance/failing_tests_windows.txt
That file happens to be empty at the moment (i.e. the current Windows slaves run these tests without any failures), which may look confusing, so here is an example of a non-empty one:
http://hg.mozilla.org/mozilla-central/file/75fe3b8f592c/dom/canvas/test/webgl-conformance/failing_tests_android.txt
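For illustration, a sketch of how a failing-tests list like this could be consumed, assuming the file simply holds one test path per line. The function names are hypothetical, and the real harness code in test_webgl_conformance_test_suite.html may work differently.

```typescript
// Split the file contents into test paths, ignoring blank lines and
// surrounding whitespace (assumption: one path per line).
function parseFailingTests(fileContents: string): string[] {
  return fileContents
    .split(/\r?\n/)
    .map((line: string): string => line.trim())
    .filter((line: string): boolean => line.length > 0);
}

// A test page is an expected failure if its path appears in the list.
function isExpectedFailure(testPath: string, failingTests: string[]): boolean {
  return failingTests.indexOf(testPath) !== -1;
}

// The four pages listed above, as they would appear in the file.
const newVmFailures = parseFailingTests(
  "conformance/context/context-attributes-alpha-depth-stencil-antialias.html\n" +
  "conformance/renderbuffers/framebuffer-object-attachment.html\n" +
  "conformance/state/gl-object-get-calls.html\n" +
  "conformance/more/functions/isTests.html\n"
);
```

With this list, `isExpectedFailure("conformance/state/gl-object-get-calls.html", newVmFailures)` is true, and an empty file yields an empty list, meaning no failures are expected.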
Assignee | ||
Comment 33•10 years ago
|
||
Note: I'd gladly write that (4-line) patch but we can't land it as long as we're running the current windows slaves. What we could do is have two separate files, failing_tests_windows_old.txt and failing_tests_windows_new_vms.txt, and add these 4 lines to the latter only, and add some code in the mochitest to switch between the two. For example, here is current code that we use to switch between android failures files:
http://hg.mozilla.org/mozilla-central/file/75fe3b8f592c/dom/canvas/test/webgl-conformance/test_webgl_conformance_test_suite.html#l525
...hm, OK, let me take a stab at writing that patch.
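A sketch of the two-file idea, using the file names proposed above. How the mochitest would tell the new VMs apart from the current slaves is not stated here; keying off the GL_RENDERER substring from comment 24 is purely an assumption, and all function and type names are hypothetical.

```typescript
// The two kinds of Windows test machines discussed in this bug.
type WindowsTestHost = "current-slaves" | "new-vms";

// Assumption: the new VMs can be recognized by their renderer string.
function detectWindowsTestHost(renderer: string): WindowsTestHost {
  return renderer.indexOf("Microsoft Basic Render Driver") !== -1
    ? "new-vms"
    : "current-slaves";
}

// Pick the failing-tests file for the detected host.
function failingTestsFileFor(host: WindowsTestHost): string {
  return host === "new-vms"
    ? "failing_tests_windows_new_vms.txt"
    : "failing_tests_windows_old.txt";
}
```

Keeping the old file untouched means the current slaves still report any new failure, while the four known failures are tolerated only on the new VMs.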
Assignee | ||
Comment 34•10 years ago
|
||
Attachment #8463612 -
Flags: review?(dglastonbury)
Attachment #8463612 -
Flags: review?(dglastonbury) → review+
Assignee | ||
Comment 35•10 years ago
|
||
Previous try run had a wrong trychooser command, "windows" instead of "win32".
New try - also has win64 in a futile hope that it would run on the new WS2012 VMs, but I'm told that that's not the case -
https://tbpl.mozilla.org/?tree=Try&rev=86b3fdc7a864
Assignee | ||
Comment 36•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/d240749902d7
Aaron, how is it looking now?
Flags: needinfo?(aklotz)
Reporter | ||
Comment 38•10 years ago
|
||
Looks good! No more WebGL conformance test failures on M1!
https://tbpl.mozilla.org/?tree=Date&rev=10bd24ec3f55
Flags: needinfo?(aklotz)
Assignee | ||
Comment 39•10 years ago
|
||
Excellent! Let's close this bug, then.
Chris, I definitely don't need the VM anymore. Thanks for the help!
Whiteboard: webgl-correctness [leave open] → webgl-correctness
Assignee | ||
Updated•10 years ago
|
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated•10 years ago
|
Target Milestone: --- → mozilla34
QA Whiteboard: [qa-]