Closed Bug 1599583 Opened 5 years ago Closed 5 years ago

Browser chrome tests fail to log results

Categories

(Testing :: Mochitest, defect)

Version 3
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1543447

People

(Reporter: mossop, Unassigned)

References

Details

Running a browser-chrome in debug builds on OSX right now no longer shows any output or exits when complete.

Running shows the normal logs as the browser starts up and then the message INFO runtests.py | Waiting for browser... appears and then nothing at all after, which is weird for a debug build.

The test does actually run in the browser but once complete the browser just sits there doing nothing and I have to manually close and kill the test harness.

Correction. If I wait for 5 minutes after the browser window closes the test results do suddenly show up in the console.

I ran something like this several times after doing a debug, artifact build on my local macosx1015 (catalina):

./mach test browser/components/urlbar/tests/browser/browser_removeUnsafeProtocolsFromURLBarPaste.js

There were times when the test window hung, but each time the test browser window and the harness window closed. The hangs were resolved if I refocused onto the harness window.

I figured out that this problem goes away if I comment out the assertion that fires on every startup from bug 1590906.

Depends on: 1590906

Hmm, goes away until my test run hits a crash I'm investigating, at which point output stops getting logged for five minutes again. In both cases the initial like of the problem is output then the output pauses and when it resumes you get the stack of the problem. So you get logs like this:

 1:30.23 GECKO(24493) Hit MOZ_CRASH() at /Users/dave/mozilla/source/trunk/xpcom/base/nsCycleCollector.cpp:608
 6:52.38 GECKO(24493) #01: PtrInfo::AnnotatedReleaseAssert(bool, char const*) (nsCycleCollector.cpp:608, in XUL)
 6:52.38 GECKO(24493) #02: nsCycleCollector::ScanWhiteNodes(bool) (nsCycleCollector.cpp:2899, in XUL)
 6:52.38 GECKO(24493) #03: nsCycleCollector::ScanRoots(bool) (nsCycleCollector.cpp:2969, in XUL)
 6:52.38 GECKO(24493) #04: nsCycleCollector::Collect(ccType, js::SliceBudget&, nsICycleCollectorListener*, bool) (nsCycleCollector.cpp:3435, in XUL)

(In reply to Dave Townsend [:mossop] (he/him) from comment #4)

Hmm, goes away until my test run hits a crash I'm investigating, at which point output stops getting logged for five minutes again. In both cases the initial like of the problem is output then the output pauses and when it resumes you get the stack of the problem. So you get logs like this:

 1:30.23 GECKO(24493) Hit MOZ_CRASH() at /Users/dave/mozilla/source/trunk/xpcom/base/nsCycleCollector.cpp:608
 6:52.38 GECKO(24493) #01: PtrInfo::AnnotatedReleaseAssert(bool, char const*) (nsCycleCollector.cpp:608, in XUL)
 6:52.38 GECKO(24493) #02: nsCycleCollector::ScanWhiteNodes(bool) (nsCycleCollector.cpp:2899, in XUL)
 6:52.38 GECKO(24493) #03: nsCycleCollector::ScanRoots(bool) (nsCycleCollector.cpp:2969, in XUL)
 6:52.38 GECKO(24493) #04: nsCycleCollector::Collect(ccType, js::SliceBudget&, nsICycleCollectorListener*, bool) (nsCycleCollector.cpp:3435, in XUL)

Yeah, I read your description and this bug last night when you posted on IRC, was puzzled and had no ideas. Then this morning I suddenly remembered what this might be, which maybe you've figured out by now - the long delay is (I think!) caused by symbolicating the stack trace (for either MOZ_CRASH or MOZ_ASSERT). This can take a long time on mac, though ISTR that in the past I had to manually pipe logs through a tool to do this, and without that the delay wouldn't show up but also there'd be no symbols so the stack traces on artifact builds were useless.

I'm not sure what would have changed here - I assume it was working before. Perhaps you didn't hit these asserts/crashes before? You could perhaps figure out when this regressed using hg bisect.

From IRC, AIUI this is an artifact build. Nathan or Chris, any ideas what might have changed here? This predates bug 1420363 landing, AFAICT as this was filed a week ago... but perhaps this was bug 1593465? My original reason for filing there was about infra, but it seems plausible that if we now upload more artifacts, if we end up downloading those again for artifact builds, that we end up using them locally or something?

Flags: needinfo?(nfroyd)
Flags: needinfo?(cmanchester)

(In reply to Dave Townsend [:mossop] (he/him) from comment #3)

I figured out that this problem goes away if I comment out the assertion that fires on every startup from bug 1590906.

Is this bug 1543447 in disguise?

Flags: needinfo?(nfroyd)

(In reply to :Gijs (he/him) from comment #5)

(In reply to Dave Townsend [:mossop] (he/him) from comment #4)

Hmm, goes away until my test run hits a crash I'm investigating, at which point output stops getting logged for five minutes again. In both cases the initial like of the problem is output then the output pauses and when it resumes you get the stack of the problem. So you get logs like this:

 1:30.23 GECKO(24493) Hit MOZ_CRASH() at /Users/dave/mozilla/source/trunk/xpcom/base/nsCycleCollector.cpp:608
 6:52.38 GECKO(24493) #01: PtrInfo::AnnotatedReleaseAssert(bool, char const*) (nsCycleCollector.cpp:608, in XUL)
 6:52.38 GECKO(24493) #02: nsCycleCollector::ScanWhiteNodes(bool) (nsCycleCollector.cpp:2899, in XUL)
 6:52.38 GECKO(24493) #03: nsCycleCollector::ScanRoots(bool) (nsCycleCollector.cpp:2969, in XUL)
 6:52.38 GECKO(24493) #04: nsCycleCollector::Collect(ccType, js::SliceBudget&, nsICycleCollectorListener*, bool) (nsCycleCollector.cpp:3435, in XUL)

Yeah, I read your description and this bug last night when you posted on IRC, was puzzled and had no ideas. Then this morning I suddenly remembered what this might be, which maybe you've figured out by now - the long delay is (I think!) caused by symbolicating the stack trace (for either MOZ_CRASH or MOZ_ASSERT). This can take a long time on mac, though ISTR that in the past I had to manually pipe logs through a tool to do this, and without that the delay wouldn't show up but also there'd be no symbols so the stack traces on artifact builds were useless.

I'm not sure what would have changed here - I assume it was working before. Perhaps you didn't hit these asserts/crashes before? You could perhaps figure out when this regressed using hg bisect.

I'm pretty sure I've hit crashes in tests before and not seen this, but I didn't notice that the line numbers were present, I've always been annoyed that I didn't get line numbers in crash stacks. Did we turn something on to do that now? Or did I accidentally turn something on to do that?

From IRC, AIUI this is an artifact build. Nathan or Chris, any ideas what might have changed here? This predates bug 1420363 landing, AFAICT as this was filed a week ago... but perhaps this was bug 1593465? My original reason for filing there was about infra, but it seems plausible that if we now upload more artifacts, if we end up downloading those again for artifact builds, that we end up using them locally or something?

This is a full debug build, doesn't reproduce with an artifact build.

(In reply to Dave Townsend [:mossop] (he/him) from comment #7)

I'm pretty sure I've hit crashes in tests before and not seen this, but I didn't notice that the line numbers were present, I've always been annoyed that I didn't get line numbers in crash stacks. Did we turn something on to do that now? Or did I accidentally turn something on to do that?

If you haven't always seen this and/or it isn't new since catalina, maybe? hg bisect would help with this, but if you're only seeing this with full builds then that's a pretty serious timesink...

(In reply to Nathan Froyd [:froydnj] from comment #6)

(In reply to Dave Townsend [:mossop] (he/him) from comment #3)

I figured out that this problem goes away if I comment out the assertion that fires on every startup from bug 1590906.

Is this bug 1543447 in disguise?

Yes it does look like that, thanks for pointing that out the environment variable solves my issues. I'm a little confused about why I'm suddenly hitting this though.

The behaviour of having to wait so long to get output is quite frustrating. I wonder if it would be better to turn off stack fixing by default and output a link explaining how to turn it on and the costs of doing so.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → DUPLICATE
Flags: needinfo?(cmanchester)
You need to log in before you can comment on or make changes to this bug.