385248 - [client] tinderbox should (sometimes?) run tools/rb/fix-linux-stack.pl / fix-macosx-stack.pl over log

Assignee

Description

•

17 years ago

Tinderbox logs sometimes have crash stacks or assertion stacks in them.  However, on Linux, these stacks aren't very useful by default, unless tools/rb/fix-linux-stack.pl is run over the log.  We should do this so that the logs are more useful.  It needs to be run on the machine with the build, before the build is deleted, since it's getting debugging information out of the libraries.

(If we do this, we should also follow up by installing some *-debuginfo RPMs on the tinderboxes.)

David Baron :dbaron:

Assignee

Updated

•

16 years ago

Summary: [client] tinderbox should (sometimes?) run tools/rb/fix-linux-stack.pl over log → [client] tinderbox should (sometimes?) run tools/rb/fix-linux-stack.pl / fix-macosx-stack.pl over log

David Baron :dbaron:

Assignee

Comment 1

•

16 years ago

See also bug 432385 for doing this over the leak stats differences.

cls

Updated

•

16 years ago

Severity: normal → enhancement

Boris Zbarsky [:bzbarsky]

Comment 2

•

15 years ago

What can I do to help this happen?  I'm trying to sort out orange on tbox due to an assert that would take all of 2 minutes to pin down if I knew what the immediate caller of the function asserting is.  As it is, I've now spent the better part of an hour on it, and counting.

Michael Morgan [:morgamic]

Comment 3

•

15 years ago

This sounds like we just need to set up some cron jobs?  What needs to happen?  I can work with IT to speed this up.

Boris Zbarsky [:bzbarsky]

Comment 4

•

15 years ago

I think comment 0 sort of sums it up.  In case of a crash or fatal assert, we need to run this script on the log.  If it's easier to run in all cases, we should just do that.

Boris Zbarsky [:bzbarsky]

Comment 5

•

15 years ago

Oh, and I don't see how a cron job would do the right thing.

Michael Morgan [:morgamic]

Comment 6

•

15 years ago

Yeah, re-read comment 0 -- I was hoping we could run the script on logs in a batch but I guess that's not possible.  Some questions I had in an email to dbaron:

* who worked on this in the past?  who can work on it now?
* where is the code we'd be changing?
* is there a test env for us to test this in?

Michael Morgan [:morgamic]

Comment 7

•

15 years ago

Regarding who works on it, I'm really just figuring out who could be a contact for info about how to setup/install/etc.

Reed Loden [:reed]

Comment 8

•

15 years ago

IT would have nothing to do with build machines.

Assignee: morgamic → nobody

Component: Tinderbox → Release Engineering

Product: Webtools → mozilla.org

QA Contact: tinderbox → release

Version: Trunk → other

Michael Morgan [:morgamic]

Comment 9

•

15 years ago

Thanks, really helpful.

Boris Zbarsky [:bzbarsky]

Comment 10

•

15 years ago

Setup/install what?  The script?  It's just at tools/rb/fix-linux-stack.pl (or fix-macosx-stack.pl) in the m-c tree.  If you have a source tree, you have it.  I guess on the test machines we don't have the source tree, so we might need to copy it over or something.....  It needs no setup; just needs to be run.

David Baron :dbaron:

Assignee

Comment 11

•

15 years ago

Back when this bug was filed (or at least pretty shortly before that point), there was no component for tinderbox client bugs other than Webtools/Tinderbox.  That's why it was in that component, marked "[client]", meaning that it was a bug in the code run on the build machines.

All this needs is a change to the buildbot scripts to run the scripts over the logs that come from debug builds.

Michael Morgan [:morgamic]

Comment 12

•

15 years ago

Ok - thanks guys.  Will check with releng and see who can help.

Nick Thomas [:nthomas] (UTC+12)

Comment 13

•

15 years ago

So what type of builds would this need to run on in modern builds ? leak test (debug) and unit (for the stacks on crash) ? 

For the former, can we teach leaktest.py to pipe the crash stack through fix-*-stack.pl ?  Could you be more specific which "logs" need to be processed.

Boris Zbarsky [:bzbarsky]

Comment 14

•

15 years ago

Just anything that has a stack.  If you can pipe just the stack through it, that's perfect.

John O'Duinn [:joduinn] (please use "needinfo?" flag)

Comment 15

•

15 years ago

This script should be run, per unittest job. Doing this as a cron job wont always work right, depending on how quickly the slave cleans up to start another job.

afaict, the best approach is to have these scripts called as part of the existing Makefile targets that we run per build. Probably just near the end, after the logs are produced, but before the Makefile returns - but I dont know details of how the scripts work.

I see from comment#10 that the scripts are already checked in. So, it seems like remaining steps are to change the Makefile targets?

Assigning over to BuildConfig - Ted, does that seem like a fair summary to you?

Component: Release Engineering → Build Config

Product: mozilla.org → Firefox

QA Contact: release → build.config

Version: other → unspecified

David Baron :dbaron:

Assignee

Comment 16

•

15 years ago

Basically "anything that has a stack" means "any stdout output from a debug build", since debug builds will print stacks for both assertions and crashes.

I think this means it has to happen in the buildbot scripts.

Component: Build Config → Release Engineering

Product: Firefox → mozilla.org

QA Contact: build.config → release

Version: unspecified → other

David Baron :dbaron:

Assignee

Comment 17

•

15 years ago

Er, I suppose if everything's in the tree you're right that it goes in Firefox::Build Config, although I'm not sure if that's the case for the alive test, which is often the first (and only, if its failure makes the rest of the tests not run) one to fail when something goes wrong.

So moving back to build config, and if we fix it there and alive test is still an issue, we'll move it back.

Component: Release Engineering → Build Config

Product: mozilla.org → Firefox

QA Contact: release → build.config

Version: other → Trunk

David Baron :dbaron:

Assignee

Comment 18

•

15 years ago

And from nthomas, the commands the leak test builders currently run are:

python leaktest.py -- -register
python leaktest.py -- -CreateProfile default
python leaktest.py -l bloat.log
python leaktest.py -- --trace-malloc malloc.log --shutdown-leaks=sdleak.log
obj-firefox/dist/bin/leakstats ../malloc.log
obj-firefox/dist/bin/leakstats ../malloc.log.old
bash -c perl build/tools/trace-malloc/diffbloatdump.pl --depth=15 --use-address
  /dev/null sdleak.log > sdleak.tree
/bin/bash -c perl build/tools/rb/fix-linux-stack.pl sdleak.tree.raw > sdleak.tree
perl build/tools/trace-malloc/diffbloatdump.pl --depth=15 sdleak.tree.old
  sdleak.tree

John O'Duinn [:joduinn] (please use "needinfo?" flag)

Comment 19

•

15 years ago

(In reply to comment #18)
> And from nthomas, the commands the leak test builders currently run are:
> 
> python leaktest.py -- -register
> python leaktest.py -- -CreateProfile default
> python leaktest.py -l bloat.log
> python leaktest.py -- --trace-malloc malloc.log --shutdown-leaks=sdleak.log
> obj-firefox/dist/bin/leakstats ../malloc.log
> obj-firefox/dist/bin/leakstats ../malloc.log.old
> bash -c perl build/tools/trace-malloc/diffbloatdump.pl --depth=15 --use-address
>   /dev/null sdleak.log > sdleak.tree
> /bin/bash -c perl build/tools/rb/fix-linux-stack.pl sdleak.tree.raw >
> sdleak.tree
> perl build/tools/trace-malloc/diffbloatdump.pl --depth=15 sdleak.tree.old
>   sdleak.tree

Ted: how about moving all that (and dbaron's additional script request) into a new Makefile target?

("make leaktest" was being discussed in irc, but I don't care much what its called.)

Nick Thomas [:nthomas] (UTC+12)

Comment 20

•

15 years ago

That's a somewhat abbreviated list, leaving out previous log downloads and new log uploads. For the full list I recommend searching for "BuildStep started" in a leak test log, or reading 
http://hg.mozilla.org/build/buildbotcustom/file/tip/process/factory.py#l444
http://hg.mozilla.org/build/buildbotcustom/file/tip/steps/test.py#l45

David Baron :dbaron:

Assignee

Comment 21

•

15 years ago

(In reply to comment #19)
> Ted: how about moving all that (and dbaron's additional script request) into a
> new Makefile target?

If we want to do that, can we do it in a different bug?

This bug has accumulated far too many unrelated comments already today.

John O'Duinn [:joduinn] (please use "needinfo?" flag)

Comment 22

•

15 years ago

(In reply to comment #21)
> (In reply to comment #19)
> > Ted: how about moving all that (and dbaron's additional script request) into a
> > new Makefile target?
> 
> If we want to do that, can we do it in a different bug?

Ted, let me know if you want a separate bug filed - I'd be happy to oblige! :-)


> This bug has accumulated far too many unrelated comments already today.

David Baron :dbaron:

Assignee

Comment 23

•

15 years ago

I have a work-in-progress patch that works for me at
http://hg.mozilla.org/users/dbaron_mozilla.com/patches/file/tip/run-output-through-stack-fixer
though I still need to do a lot more testing, and I suspect (for portability) that I also need to call fix-linux-stack/fix-macosx-stack by calling perl rather than by calling the script directly.

Assignee: nobody → dbaron

David Baron :dbaron:

Assignee

Updated

•

15 years ago

OS: Linux → All

Priority: -- → P3

Product: Firefox → Core

QA Contact: build.config → build-config

Hardware: x86 → All

Target Milestone: --- → mozilla1.9.2

David Baron :dbaron:

Assignee

Comment 24

•

15 years ago

Attached patch patch (obsolete) (deleted) — Details — Splinter Review

I think this is at least the bulk of what's needed here.

I've tested that it fixes up stacks when running leaktest (both regular and trace-malloc commands), reftest, and mochitest on my Linux debug build.

I also sent it to try server with the "if IS_DEBUG_BUILD" changed to "if True" so that it would actually get run under the build automation; mostly green, though I got what I have to presume is a random-orange timeout in the middle of mochitest-plain on Mac.  (Unit tests passed fully on Windows and Linux, and everything other than mochitest-plain passed on Mac.)


The core of the patch is pretty simple; in the case where we're piping the output (i.e., no debugger attached), stick an extra pipe in the chain if appropriate.  A few things of note:

 * I don't have a whole lot of experience writing python, so you should review my python somewhat closely

 * I added two new global variables in automation.py (PERL and IS_LINUX), but I didn't put them in the __all__ list, which I presume is the list of stuff the module exports, because I tend to think other callers should avoid using them if possible.  (Especially so for IS_LINUX, since we want people to use UNIXISH in general, but in this case fix-linux-stack is actually only known to work on Linux; I'm not sure if any other Unixish platforms ship addr2line (part of GNU binutils).)

 * I install fix-linux-stack or fix-macosx-stack to dist/bin/ from the Makefile in build/ because I didn't want to deal with adding one for tools/rb and figuring out where it should fit in the build process.  They're currently installed ifdef ENABLE_TESTS (that ifdef starts just above the context of the diff), although we only actually use them in debug builds because I think debug builds are the only ones that print stacks to stdout in their default configuration.

Attachment #392979 - Flags: review?(ted.mielczarek)

(not currently active) Ted Mielczarek

Updated

•

15 years ago

Attachment #392979 - Flags: review?(ted.mielczarek) → review+

(not currently active) Ted Mielczarek

Comment 25

•

15 years ago

Comment on attachment 392979 [details] [diff] [review]
patch

This looks good, with the caveat that it's not going to work right if we start running unittests on packaged debug builds. Since we aren't currently doing that, I'm ok with you checking this in, as long as you file a follow-up bug on fixing that. FWIW, what would need fixing is:
1) Include the fix-stack scripts in the test package, probably just by putting them in the list here: http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/Makefile.in#92
2) Provide some way to override PERL from the command line.

Comments on the patch itself:
+
+ifeq ($(OS_ARCH),Darwin)
+libs:: $(topsrcdir)/tools/rb/fix-macosx-stack.pl
+	$(INSTALL) $< $(DIST)/bin
 endif
+
+ifeq ($(OS_ARCH),Linux)
+libs:: $(topsrcdir)/tools/rb/fix-linux-stack.pl
+	$(INSTALL) $< $(DIST)/bin
+endif
+endif

Do you have an extra endif there? (Did you mean to have an ifdef ENABLE_TESTS on top?)

+        stackFixerProcess = Process([PERL, os.path.join(utilityPath, stackFixerCommand)], stdin=logsource, stdout=subprocess.PIPE)

I wonder if you ought to be calling .wait() on this process later, to ensure that we don't wind up with zombies.

David Baron :dbaron:

Assignee

Comment 26

•

15 years ago

(In reply to comment #25)
> Do you have an extra endif there? (Did you mean to have an ifdef ENABLE_TESTS
> on top?)

It's in an existing ENABLE_TESTS; the "extra" is just because diff decided to put the existing one in the middle rather than at the end (note the one without a + that looks like it ought to have one).

David Baron :dbaron:

Assignee

Comment 27

•

15 years ago

Attached patch patch [Checkin: Comment 29] (deleted) — Details — Splinter Review

Here's a revised patch that I think addresses all your comments other than the passing perl as an argument issue.  (Not quite sure how to approach that given how many ways automation.py is used; maybe an environment variable would be better?)

So the changes here are:
 * waiting for the stack fixer process, and reporting a failure if it returns nonzero status

 * adding fix-linux-stack.pl / fix-macosx-stack.pl to the test package

Attachment #392979 - Attachment is obsolete: true

Attachment #393560 - Flags: review?(ted.mielczarek)

(not currently active) Ted Mielczarek

Updated

•

15 years ago

Attachment #393560 - Flags: review?(ted.mielczarek) → review+

(not currently active) Ted Mielczarek

Comment 28

•

15 years ago

Comment on attachment 393560 [details] [diff] [review]
patch
[Checkin: Comment 29]

Looks good. Yeah, an environment variable for PERL would probably work. For now, this will work fine if we get debug unittests on packaged builds, since all our build slaves will have perl in the same place.

David Baron :dbaron:

Assignee

Comment 29

•

15 years ago

http://hg.mozilla.org/mozilla-central/rev/219bd7354882

Status: NEW → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

Target Milestone: mozilla1.9.2 → mozilla1.9.2b1

David Baron :dbaron:

Assignee

Comment 30

•

15 years ago

And filed bug 509620 as a followup on the path-to-perl issue.

David Baron :dbaron:

Assignee

Comment 31

•

15 years ago

I should probably also file separate bugs for xpcshell-tests and maybe make check.

(not currently active) Ted Mielczarek

Comment 32

•

15 years ago

|make check| doesn't seem worth the effort, since it's a bunch of standalone C++ programs scattered about various makefiles.

xpcshell-tests should be easy, the patch will probably look very similar to this, in runxpcshelltests.py.

Olli Pettay [:smaug][bugs@pettay.fi]

Updated

•

15 years ago

Depends on: 510014

David Baron :dbaron:

Assignee

Comment 33

•

15 years ago

I filed bug 510550 for runxpcshelltests.py.

Nick Thomas [:nthomas] (UTC+12)

Updated

•

15 years ago

Blocks: 510489

Serge Gautherie (:sgautherie)

Updated

•

15 years ago

Blocks: 487494

Serge Gautherie (:sgautherie)

Updated

•

15 years ago

Attachment #393560 - Attachment description: patch → patch [Checkin: Comment 29]

Jesse Ruderman

Updated

•

15 years ago

Depends on: 531479

Wtownsend702@gmail.com

Updated

•

10 years ago

blocking-b2g: --- → koi?

David Baron :dbaron:

Assignee

Updated

•

10 years ago

blocking-b2g: koi? → ---

BMO Automation

Updated

•

6 years ago

Product: Core → Firefox Build System

patch 15 years ago David Baron :dbaron: (deleted), patch	ted : review+	Details \| Diff \| Splinter Review
patch [Checkin: Comment 29] 15 years ago David Baron :dbaron: (deleted), patch	ted : review+	Details \| Diff \| Splinter Review