Closed Bug 484231 Opened 16 years ago Closed 14 years ago

A Mochitest crashdump turns the waterfall orange but misses to |TinderboxPrint|

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: sgautherie, Assigned: sgautherie)

References

()

Details

(Keywords: fixed1.9.1, Whiteboard: [Waiting for (additional) bug 483062 on 1.9.1] [fixed1.9.1b4: Bv1])

Attachments

(2 files, 2 obsolete files)

Example:
{
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1237459052.1237466973.12663.gz&fulltext=1
Linux mozilla-central unit test on 2009/03/19 03:37:32

Build Error Summary

TEST-UNEXPECTED-FAIL | (automation.py) | Exited with code 1 during test run
TEST-UNEXPECTED-FAIL | (automation.py) | Browser crashed (minidump found)
TEST-UNEXPECTED-FAIL | runtests-leaks | missing output line for total leaks!
}

***

automation.py > runApp()
{
  if checkForCrashes(profileDir, symbolsPath):
    status = -1

  ...

  return status
}
which is fine.


runtests.py.in > main()
{
  status = automation.runApp(testURL, browserEnv, options.app,

  ...

  # hanging due to non-halting threads is no fun; assume we hit the errors we
  # were going to hit already and exit with a success code
  sys.exit(0)
}
This code was added by bug 418009.
In general, can't we use |sys.exit(status)| here ?
(Should help bug 483230 too, shouldn't it ?)


steps.py:
I think this should be specifically handled in ShellCommandReportTimeout().
(or at least MozillaReftest() and MozillaMochitest())
I don't understand what you're asking for here: what do you think should be TinderboxPrinted?
I think the "expected" report for a crash would be "FAIL".
But, because it seems this crash happened after the test suite completed, steps.py doesn't catch it in MozillaMochitest() either.

I'm thinking about a patch for the steps.py part: hold on...
Assignee: nobody → sgautherie.bz
(To be applied on top of bug 473259 patch Cv1a and Dv1.)
Attachment #368407 - Flags: review?(ccooper)
Status: NEW → ASSIGNED
Av1 Reftest |elif crashRe.match(line):| could not work :-<

While there:
*Enhance leak detection which can have the same waterfall issue.
*Keep Reftest in sync'. (So I can proceed with bug 469518 too.)
*Explicitly differentiate test counts and leak 'FAIL's.

(To be applied on top of bug 473259 patch Cv1a and Dv1.)
Attachment #368407 - Attachment is obsolete: true
Attachment #369117 - Flags: review?(ccooper)
Attachment #368407 - Flags: review?(ccooper)
(In reply to comment #0)
> In general, can't we use |sys.exit(status)| here ?

Jeff ?
Blocks: 469518
I think you could exit with status there, but I suspect it might cut tinderbox builds short if the first mochitest run fails, even if it's useful to know whether the other mochitest runs pass or not.

I think the important part there was that it exited rather than returned, because other threads in Python itself might hang (processing server output in that case) and prevent runtests.py from finishing at all.  That might or might not be a problem any more, but I'd be inclined to leave it alone and not worry about it.
(Following our irc discussion, with nthomas too.)

All other |runApp()| callers return |status|:
http://mxr.mozilla.org/mozilla-central/search?string=runApp%5C%28&regexp=on&case=on


(In reply to comment #6)
> I think you could exit with status there, but I suspect it might cut tinderbox
> builds short if the first mochitest run fails

nthomas and I think a non-zero return code is not expected to abort following test suites.

What I expect it to do is trigger the catch-all 'Unknown Error: command finished with exit code: %d' from
http://mxr.mozilla.org/build/source/buildbotcustom/unittest/steps.py
... in order to catch (other) "unexpected" cases.

> I think the important part there was that it exited rather than returned,

That's how I understood the comment too :-)
Attachment #369190 - Flags: review?(jwalden+bmo)
Comment on attachment 369190 [details] [diff] [review]
(Bv1) runtests.py.in: return |status| instead of |0|, and 3 (unrelated) nits
[Checkin: See comment 15 & 16]

If it works, fine; I just wouldn't have spent any time trying to figure that out given that we already get error messages when a non-zero status happens, even if 0 gets returned here.
Attachment #369190 - Flags: review?(jwalden+bmo) → review+
ccooper, ping ?
end of quarter == really bad time to expect quick turnaround on reviews. Sorry.
Av2, extended to TUnit.
Attachment #369117 - Attachment is obsolete: true
Attachment #369803 - Flags: review?(ccooper)
Attachment #369117 - Flags: review?(ccooper)
Comment on attachment 369803 [details] [diff] [review]
(Av2a) steps.py: Add 'CRASH' support to summaryText() and use it, Enhance leak detection and use it for TUnit and Reftest too
[Checkin: Comment 13]

>-def summaryText(passCount, failCount, knownFailCount = None, leaked = False):
>+def summaryText(passCount, failCount, knownFailCount = None,
>+        crashed = False, leaked = False):

Not that it really matters because you're touching all the affected files, but is there a reason you changed the args order here, inserting crashed before leaked rather than just adding crashed to the end?

>+        # Regular expression for crash and leak detections.
>+        harnessErrorsRe = re.compile(r"TEST-UNEXPECTED-FAIL \| .* \| (missing output line for total leaks!|negative leaks caught!|leaked \d+ bytes during test execution)$")
>+        # Process the log.
>         for line in log.readlines():
>             if "TEST-PASS" in line:
>                 passCount = passCount + 1
>+                continue
>             if "TEST-UNEXPECTED-" in line:
>-                failCount = failCount + 1
>+                # Set the error flags.
>+                # Or set the failure count.
>+                m = harnessErrorsRe.match(line)
>+                if m:
>+                    r = m.group(1)
>+                    if r == "missing output line for total leaks!":
>+                        leaked = None
>+                    else:
>+                        leaked = True
>+                else:
>+                    failCount = failCount + 1
>+                # continue

Do we need a elif case for negative leaks, or is reporting it as an UNEXPECTED-FAIL + leak sufficient? (same for other cases below)
Attachment #369803 - Flags: review?(ccooper) → review+
Comment on attachment 369803 [details] [diff] [review]
(Av2a) steps.py: Add 'CRASH' support to summaryText() and use it, Enhance leak detection and use it for TUnit and Reftest too
[Checkin: Comment 13]


http://hg.mozilla.org/build/buildbotcustom/rev/6ed37da063bd


(In reply to comment #12)

Per your irc approval:

> is there a reason you changed the args order here, inserting crashed before
> leaked rather than just adding crashed to the end?

Simply seemed more ordered that way.

> Do we need a elif case for negative leaks, or is reporting it as an
> UNEXPECTED-FAIL + leak sufficient?

On the waterfall, no need to distinguish negative/positive leaks.
Attachment #369803 - Attachment description: (Av2a) steps.py: Add 'CRASH' support to summaryText() and use it, Enhance leak detection and use it for TUnit and Reftest too → (Av2a) steps.py: Add 'CRASH' support to summaryText() and use it, Enhance leak detection and use it for TUnit and Reftest too [Checkin: Comment 13]
Attachment #369803 - Flags: checked‑in+ checked‑in+
Depends on: 483062
Whiteboard: [Waiting for bug 483062]
Av2a is now running in production.
Depends on: 486192
Comment on attachment 369190 [details] [diff] [review]
(Bv1) runtests.py.in: return |status| instead of |0|, and 3 (unrelated) nits
[Checkin: See comment 15 & 16]


http://hg.mozilla.org/mozilla-central/rev/fe9f9ba0b6ff

After fixing context for
{
patching file build/automation.py.in
Hunk #1 FAILED at 62
1 out of 1 hunks FAILED
patching file layout/tools/reftest/runreftest.py
Hunk #1 FAILED at 142
1 out of 1 hunks FAILED
}
Attachment #369190 - Attachment description: (Bv1) runtests.py.in: return |status| instead of |0|, and 3 (unrelated) nits → (Bv1) runtests.py.in: return |status| instead of |0|, and 3 (unrelated) nits [Checkin: See comment 15]
Comment on attachment 369190 [details] [diff] [review]
(Bv1) runtests.py.in: return |status| instead of |0|, and 3 (unrelated) nits
[Checkin: See comment 15 & 16]


http://hg.mozilla.org/releases/mozilla-1.9.1/rev/e57603b0c758
Attachment #369190 - Attachment description: (Bv1) runtests.py.in: return |status| instead of |0|, and 3 (unrelated) nits [Checkin: See comment 15] → (Bv1) runtests.py.in: return |status| instead of |0|, and 3 (unrelated) nits [Checkin: See comment 15 & 16]
Keywords: fixed1.9.1
Whiteboard: [Waiting for bug 483062] → [Waiting for (additional) bug 483062] [fixed1.9.1b4: Bv1]
Blocks: 473259
Whiteboard: [Waiting for (additional) bug 483062] [fixed1.9.1b4: Bv1] → [Waiting for (additional) bug 483062 on 1.9.1] [fixed1.9.1b4: Bv1]
Its been over a year since last comment. Is this still a valid bug or can we close?
Serge: based on the whiteboard, can you post a patch that contains *only* the files you do need from bug 483062 on 1.9.1? 

That will take the guesswork out of someone (me) landing it.
I'm happy to land an updated patch when it appears.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → INCOMPLETE
Product: mozilla.org → Release Engineering
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: