Open
Bug 896840
Opened 11 years ago
Updated 2 years ago
Find cause of reftests sometimes failing with "load failed: null", on Android at least
Categories
(Testing :: Reftest, defect)
Tracking
(Not tracked)
NEW
People
(Reporter: dholbert, Unassigned)
References
Details
Attachments
(1 file)
(deleted),
patch
|
dbaron
:
review-
|
Details | Diff | Splinter Review |
As documented in bug 789751, we have lots of Android test failures where the harness reports the failure as e.g.:
{
REFTEST TEST-UNEXPECTED-FAIL | http://10.250.48.23:30311/tests/gfx/tests/crashtests/580719-1.html | load failed: null
}
The "null" there is us trying to print a passed-in explanation for the failure, but either no explanation was passed, or the explanation string was never set, or something like that.
We should be better about actually printing a useful message here. (I suspect these are mostly timeouts, but I can't really know for sure because all I have is "null".)
Some handy MXR links:
- The failure message is printed here:
> 1641 function LoadFailed(why)
> 1642 {
> [...]
> 1644 gDumpLog("REFTEST TEST-UNEXPECTED-FAIL | " +
> 1645 gURLs[0]["url" + gState].spec + " | load failed: " + why + "\n");
[1]
So, presumably a null value is being passed for the "why" argument there.
LoadFailed gets called by RecvFailedLoad() [2], which is registered as a message listener for "reftest:FailedLoad"[3], which gets invoked asynchronously by SendFailedLoad in reftest-content.js [4], which gets invoked by LoadFailed() in reftest-content.js [5].
And this appears to be where the "why" explanation-string traces back to -- LoadFailed passes in gFailureReason as the explanation. And there are some clauses that set or append to gFailureReason, but there's also code (e.g. in RecordResult()) that sets it to null.
We might want to null-check it at the callsite in [5] and pass in the constant-string "unknown reason" or something like that, as a first pass, to make sure it's *actually* making it through to the actual logging code via the code-path I described above.
[1] http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest.js#293
[2] http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest.js#1869
[3] http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest.js#1801
[4] http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest-content.js#790
[5] http://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest-content.js#655
Comment 1•11 years ago
|
||
See also: bug 789751
Comment 2•11 years ago
|
||
(In reply to Daniel Holbert [:dholbert] from comment #0)
> As documented in bug 789751, we have lots of Android test failures where the
> harness reports the failure as e.g.:
I can read.
Comment 3•11 years ago
|
||
dholbert++ for filing this :-)
Reporter | ||
Comment 4•11 years ago
|
||
[Thanks, edmorley!]
So here's a patch to hopefully make some progress on this. Per comment 0, this seems to trace back to gFailureReason being undefined or null, so this patch initializes it and resets it to actual helpful string values instead of to those values.
(I initially wrote this as a temporary debugging patch, with the idea being we'd land it & watch for bug 789751 & then backout. But we might even just want to keep this or a modified version of it.)
Attachment #781970 -
Flags: review?(dbaron)
Reporter | ||
Comment 5•11 years ago
|
||
(I pushed this to Try, hoping to hit an instance of bug 789751 & learn something, but even after triggering tons of reftest/crashtest jobs, I haven't hit bug 789751 yet, unfortunately. It's low-frequency enough that it's hard to hit on-demand, I guess. That Try run is here, for reference: https://tbpl.mozilla.org/?tree=Try&rev=4167e2a99987 )
Comment 6•11 years ago
|
||
Comment on attachment 781970 [details] [diff] [review]
diagnosing patch 1
It doesn't seem all that interesting to distinguish between these two cases -- isn't all it will tell you whether this is the first reftest in this reftest run?
I think both of these strings should probably say
"Haven't started test load yet."
... or at least should, if you have another assignment to gFailureReason for all later steps in the loading process. (The current gFailureReason logic is mostly about debugging the special things we wait for, and not the "normal" loading.)
Also, for debugging this problem, it's probably more useful to find points *earlier* in the test loading process at which you could set gFailureReason -- that seems more likely to lead to useful information. That may or may not mean building a complete set of gFailureReasons describing everything we're waiting for.
Also, given the way gFailureReason works, it seems bad to check in reasons with gaps in the logic *after* them, since gFailureReason is a description of what we're currently waiting for.
But I think once that's done, the "Haven't started test load yet" message would be useless, since if we have everything we're waiting for documented, then we would never get stuck in that state.
Attachment #781970 -
Flags: review?(dbaron) → review-
Comment 7•11 years ago
|
||
FWIW, we're hitting these on B2G now as well.
Comment 8•11 years ago
|
||
Comment 9•11 years ago
|
||
Comment 10•11 years ago
|
||
Comment 11•11 years ago
|
||
OOps, just realised comments 8-10 were meant to be in bug 789751
Updated•11 years ago
|
Blocks: 789751
Summary: Reftests sometimes fail with "load failed: null", on Android at least → Find cause of reftests sometimes failing with "load failed: null", on Android at least
Comment 12•11 years ago
|
||
(In reply to David Baron [:dbaron] (needinfo? me) (UTC-7) from comment #6)
> Also, for debugging this problem, it's probably more useful to find points
> *earlier* in the test loading process at which you could set gFailureReason
> -- that seems more likely to lead to useful information. That may or may
> not mean building a complete set of gFailureReasons describing everything
> we're waiting for.
I've been playing around with some try pushes for this, and got the jsreftest failure to occur with some additional logging:
https://tbpl.mozilla.org/php/getParsedLog.php?id=38005581&tree=Try
When this failure occurs, in StartTestURI() we return from LoadURI() [1] but then never enter OnDocumentLoad() [2].
[1] https://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest-content.js#136
[2] https://mxr.mozilla.org/mozilla-central/source/layout/tools/reftest/reftest-content.js#556
David, are you the best person to investigate further?
Comment 13•10 years ago
|
||
Clearing old needinfo requests where I've given up expecting a response, to clear up my bugzilla-todos dashboard "to nag" list.
Flags: needinfo?(dbaron)
Updated•2 years ago
|
Severity: normal → S3
You need to log in
before you can comment on or make changes to this bug.
Description
•