Closed
Bug 899697
Opened 11 years ago
Closed 11 years ago
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
mozilla27
People
(Reporter: RyanVM, Assigned: jandem)
References
Details
(Keywords: intermittent-failure)
Attachments
(1 file)
patch (deleted) | terrence: review+
https://tbpl.mozilla.org/php/getParsedLog.php?id=25915625&tree=Fx-Team
WINNT 5.2 fx-team build on 2013-07-30 07:56:15 PDT for push 051d88f15bc7
slave: w64-ix-slave130
TEST-PASS | e:\builds\moz2_slave\fx-team-w32-000000000000000000\build\js\src\jit-test\tests\auto-regress\bug704136.js |
FAIL - e:\builds\moz2_slave\fx-team-w32-000000000000000000\build\js\src\jit-test\tests\auto-regress\bug704136.js
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\fx-team-w32-000000000000000000\build\js\src\jit-test\tests\auto-regress\bug704136.js | --ion-eager: e:\builds\moz2_slave\fx-team-w32-000000000000000000\build\js\src\jit-test\tests\auto-regress\bug704136.js:8:0 ReferenceError: jsTestDriverEnd is not defined
TEST-PASS | e:\builds\moz2_slave\fx-team-w32-000000000000000000\build\js\src\jit-test\tests\auto-regress\bug704136.js | --baseline-eager
TEST-PASS | e:\builds\moz2_slave\fx-team-w32-000000000000000000\build\js\src\jit-test\tests\auto-regress\bug704136.js | --baseline-eager --no-ti --no-fpu
TEST-PASS | e:\builds\moz2_slave\fx-team-w32-000000000000000000\build\js\src\jit-test\tests\auto-regress\bug704136.js | --no-baseline --no-ion
TEST-PASS | e:\builds\moz2_slave\fx-team-w32-000000000000000000\build\js\src\jit-test\tests\auto-regress\bug704136.js | --no-baseline --no-ion --no-ti
Comment 1 (Assignee) • 11 years ago
We've had a lot of intermittent jit-test failures like this one recently: a test that's expected to throw, say, a ReferenceError fails with exactly the error we expect, and yet the test is marked as a failure.
I also had one on Try: https://tbpl.mozilla.org/?tree=Try&rev=dd3bd6f21e16
My Try build, bug 899611, and this bug are all Windows-only. I wonder if there's a threading- or timing-related bug in the jit-test harness or something.
Comment 2 (Assignee) • 11 years ago
Bug 880086 is another one. The test looks like this:
// |jit-test| error:Error
function jsTestDriverEnd() {}
this.__defineSetter__("x", function () {});
x %= 5;
jsTestDriverEnd();
mjitChunkLimit();
The interpreter throws an error, as expected:
TEST-UNEXPECTED-FAIL | e:\builds\moz2_slave\m-in-w32-000000000000000000000\build\js\src\jit-test\tests\auto-regress\bug726636.js | --no-baseline --no-ion --no-ti: e:\builds\moz2_slave\m-in-w32-000000000000000000000\build\js\src\jit-test\tests\auto-regress\bug726636.js:10:0 ReferenceError: mjitChunkLimit is not defined
And yet it's marked as a failure... This one is also Windows 7.
It's possible the shell returns an exit code other than 3 for some reason, but I've no idea why. I will try to reproduce this on Windows now.
Comment 3 (Assignee) • 11 years ago
(In reply to Jan de Mooij [:jandem] from comment #2)
> It's possible the shell returns an exit code other than 3 for some reason,
> but I've no idea why. I will try to reproduce this on Windows now.
I'm able to reproduce this on Windows after running the same test thousands of times. Will add some logging and see if that tells us anything...
Comment 4 (Assignee) • 11 years ago
OK, so the exit code is 0 instead of 3, even though we do print the uncaught exception... If I change the shell to always return a non-zero value from main(), the jit-test harness still thinks it's 0...
So either the jit-test harness is wrong, or it's a threading issue somehow.
Comment 5 (Assignee) • 11 years ago
Unfortunately, I won't be near my Windows PC for a few days, so I can't debug this until Monday :( Anybody should feel free to investigate further...
If you run the bug726636.js jit-test in a loop, with an opt32 thread-safe Windows shell build, it should fail in under 15 minutes. I will try this on OS X as well, but it looks like this is Windows-only.
The shell seems to exit with code 0 every X thousand runs, no matter what main() returns. I looked for exit(..) calls but couldn't find anything interesting.
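A loop like the one described above could be scripted roughly as follows. This is only a sketch, not code from the jit-test harness: `count_exit_codes` is a hypothetical helper, and the command passed to it (shell path, flags, test path) is left to the caller.

```python
import subprocess
import sys
from collections import Counter


def count_exit_codes(cmd, runs):
    """Run `cmd` `runs` times and tally the exit codes it returns.

    `cmd` is an argument list, e.g. [path_to_shell, path_to_test];
    both paths are assumptions supplied by the caller.
    """
    codes = Counter()
    for _ in range(runs):
        # Capture output so thousands of runs don't flood the terminal.
        proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        codes[proc.returncode] += 1
    return codes
```

With a test that is expected to throw, every run should land under key 3; any count under key 0 would be a reproduction of the failure described in this bug.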
Comment 6 (Assignee) • 11 years ago
Bug 892697 is another one, also Windows. Test expects an error, shell throws that error, but somehow exits with a code other than 3.
I will get to the bottom of this in a few days, when I'm back.
Updated (Assignee) • 11 years ago
Assignee: general → jdemooij
Status: NEW → ASSIGNED
Comment 7 (Assignee) • 11 years ago
Bug 776043 exposed this bug. Before that bug, we'd only check if stderr contained the expected error and we ignored the return code. Now we also make sure the return code is 3.
And indeed, a few days later the first of these bugs was filed. Here's a list of bugs that are all caused by this bug. The list may be incomplete:
Bug 874858, bug 876213, bug 878611, bug 880086, bug 881403, bug 881604, bug 883224, bug 883327, bug 884064, bug 884183, bug 884451, bug 885142, bug 885146, bug 886171, bug 887003, bug 887559, bug 888868, bug 891063, bug 892697, bug 892975, bug 894032, bug 894436, bug 894613, bug 897298, bug 898084, bug 899611, bug 899697, bug 902047, bug 902052
All of these are Windows 7. I'm still investigating what's causing this on Windows.
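The difference between the two checks can be sketched as below. This is an illustration of the behaviour described in this comment, not the harness source: the function names and the constant are hypothetical, and the value 3 is taken from this bug's discussion.

```python
# Exit code expected when a test ends with an uncaught exception, per this bug.
EXPECTED_ERROR_EXIT_CODE = 3


def passes_old_check(stderr, expected_error):
    """Before bug 776043: only stderr was inspected; the return code was ignored."""
    return expected_error in stderr


def passes_new_check(stderr, returncode, expected_error):
    """After bug 776043: the return code must also be 3."""
    return expected_error in stderr and returncode == EXPECTED_ERROR_EXIT_CODE
```

Under the old check, a run that printed the expected ReferenceError but exited with code 0 would pass; under the new check the same run fails, which matches the failures listed above.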
Comment 8 (Assignee) • 11 years ago
Even if main() always returns 3, it will fail every X thousand runs. I'm still narrowing it down but I think it's caused by something we do when destroying the JSRuntime.
It's possible for a process to return 0 although the "real" exit code was something else:
http://blogs.msdn.com/b/oldnewthing/archive/2008/05/06/8461730.aspx
My best guess is that it's an NSPR locking/threading thing triggering that somehow.
Comment 9 (Assignee) • 11 years ago
These intermittent Windows-only jit-test failures keep coming in: philor just filed at least 7 bugs, and Ryan has also filed lots of them over the past few weeks (see also the list in comment 7).
I think we should revert bug 776043 for Windows (don't check the return code there). Terrence, are you ok with that?
Flags: needinfo?(terrence)
Comment 10 (Assignee) • 11 years ago
Or a bit better: on Windows always allow code 0, so that we will still fail if we see another error code.
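A minimal sketch of that idea, assuming the harness compares the shell's return code against a set of allowed values. The attached patch is not shown in this bug, so `exit_code_ok` and its `is_windows` parameter are hypothetical names, not the actual change.

```python
import sys


def exit_code_ok(returncode, is_windows=sys.platform.startswith("win")):
    """Return True if `returncode` is acceptable for a test that expects an error."""
    allowed = {3}  # normal exit code for an uncaught exception, per this bug
    if is_windows:
        # Workaround: the shell intermittently reports 0 on Windows.
        allowed.add(0)
    return returncode in allowed
```

Any other exit code still counts as a failure on every platform, so a crash or an unrelated error code would not be masked by the workaround.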
Comment 11 • 11 years ago
(In reply to Jan de Mooij [:jandem] from comment #10)
> Or a bit better: on Windows always allow code 0, so that we will still fail
> if we see another error code.
Yes, that sounds like an excellent workaround. I'd guess Python's multiprocessing just does not work as well on Windows.
Christian, did you see any notes in the multiprocessing docs that might cause the above when you were parallelizing the jit-tests?
Flags: needinfo?(terrence) → needinfo?(choller)
Thanks, Jan. Let's just patch this now.
Attachment #806129 - Flags: review?(terrence)
Comment 13 • 11 years ago
Comment on attachment 806129
jittest-fix
Review of attachment 806129:
-----------------------------------------------------------------
Thanks for beating me to this! The logic looks correct: r=me.
Attachment #806129 - Flags: review?(terrence) → review+
Comment 15 • 11 years ago
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
Comment 16 (Reporter) • 11 years ago
https://hg.mozilla.org/releases/mozilla-aurora/rev/0b5e84aa74bd
https://hg.mozilla.org/releases/mozilla-beta/rev/6d91dd498f23
https://hg.mozilla.org/releases/mozilla-esr24/rev/b2a6325734a6
status-firefox25: --- → fixed
status-firefox26: --- → fixed
status-firefox27: --- → fixed
status-firefox-esr24: --- → fixed
Comment 48 • 11 years ago
Comment 49 • 11 years ago
Bah, bad awesomebar.