Closed
Bug 590526
Opened 14 years ago
Closed 14 years ago
Don't start two runs of the same job at the same time, since Tinderbox hides one of them
Categories
(Tree Management Graveyard :: TBPL, defect)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 630538
People
(Reporter: bhearsum, Unassigned)
References
Details
But, in at least one case I can see it in Buildbot. http://test-master01.build.mozilla.org:8012/builders/Rev3%20MacOSX%20Snow%20Leopard%2010.6.2%20tryserver%20opt%20test%20mochitests-1%2F5/builds/592 is a run of opt mochitest 1/5 on ebac4228ed4a. It doesn't show up on TBPL: http://grab.by/64tW And I can't seem to find it on plain old Tinderbox either.
Reporter
Comment 1•14 years ago
Looks like this is caused by start time collisions. The following all started opt mochitest 1/5 on 64-bit mac at Wed Aug 25 01:32:18 2010:
bb55562ff200
ae49d384bc7d
a5117807db6f
b1c4b5f417dd
eb2a4ec3eab7
ebac4228ed4a
4a7d0ce9f897
e53fab4ba86c
Based on the sheer improbability of all of these builds having sendchanges sent at the same instant, I'm going to guess that either the test master was hung shortly before this time, or had issues with the connection to the db, or something like that.
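For illustration only, here is a minimal Python sketch of how the collision in comment 1 could be spotted: group builds by (job, start time) and report any group holding more than one revision. The tuple layout, the job label, and the function name are assumptions made up for this sketch and are not anything Buildbot or Tinderbox exposes; the revisions and the timestamp are the ones listed in the comment.

```python
from collections import defaultdict


def find_start_time_collisions(builds):
    """Return {(job, start_time): [revisions]} for keys shared by 2+ builds."""
    groups = defaultdict(list)
    for job, start_time, revision in builds:
        groups[(job, start_time)].append(revision)
    # Keep only the keys that more than one build maps to.
    return {key: revs for key, revs in groups.items() if len(revs) > 1}


JOB = "opt mochitest 1/5 (64-bit mac)"  # hypothetical label for the job
START = "Wed Aug 25 01:32:18 2010"      # start time reported in comment 1
builds = [(JOB, START, rev) for rev in (
    "bb55562ff200", "ae49d384bc7d", "a5117807db6f", "b1c4b5f417dd",
    "eb2a4ec3eab7", "ebac4228ed4a", "4a7d0ce9f897", "e53fab4ba86c",
)]
collisions = find_start_time_collisions(builds)
```

All eight revisions end up under a single (job, start time) key, which is the situation the rest of this bug is about.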
Updated•14 years ago
Component: Release Engineering → Tinderboxpushlog
Product: mozilla.org → Webtools
QA Contact: release → tinderboxpushlog
Comment 4•14 years ago
This happens all the time to me. Is there some way forward towards fixing this issue?
Comment 6•14 years ago
I know nothing about this system, but why can't we use the same mechanism that gets the logs into http://ftp.mozilla.org/pub/mozilla.org/firefox/tryserver-builds ? That seems to work consistently.
Comment 7•14 years ago
We just use what tinderbox provides us, for now that is. Dropping our dependency is long underway but progress is very, very slow. :-(
Comment 8•14 years ago
(In reply to comment #7)
> We just use what tinderbox provides us, for now that is. Dropping our
> dependency is long underway but progress is very, very slow. :-(

Yes, and we've come a long way thus far! But the logs are out there, somewhere; is there a technical reason right now that we can't get the logs from [1], or use the same mechanism for log retrieval as [1]? I'm not comfortable with the idea that we're stuck with this bug until a "very, very slow" process finishes.

[1] http://ftp.mozilla.org/pub/mozilla.org/firefox/tryserver-builds
Comment 9•14 years ago
This has (very little) to do with log retrieval. The problem at hand is that we ask tinderbox, “what jobs were run in the past 12 hours” and it just doesn’t return all the jobs. But you are right in the sense that log retrieval and parsing is one small step to using builddb for the “tell me what happened” part.
Comment 10•14 years ago
This is becoming problematic. Every tryserver push I've made in the past few weeks (with the exception of single-job pushes) has been missing at least one job. I'd guess that things are getting worse because try is getting busier.
Comment 11•14 years ago
There is no shortcut for tbpl: tinderbox *will not* tell us about two builds with the same start time, so the only thing we can do is completely rewrite tbpl to use a data source which is not currently available to us which will tell us about them while also giving us everything else we need. If you're going to get a solution other than that, it would have to be from buildbot not starting two builds of the same job at the same time, which from my vague understanding of how it works you aren't going to get.
Component: Tinderboxpushlog → Release Engineering
Priority: P1 → --
Product: Webtools → mozilla.org
QA Contact: tinderboxpushlog → release
Summary: lots of test runs are missing from TBPL on try → Don't start two runs of the same job at the same time, since Tinderbox hides one of them
Comment 12•14 years ago
The fix to tinderbox would be invasive enough that it makes more sense to invest that effort in making TBPL get this log data from another source.
Component: Release Engineering → Tinderboxpushlog
Product: mozilla.org → Webtools
QA Contact: release → tinderboxpushlog
Comment 14•14 years ago
The workaround for this problem is to load up the self-serve URL to find out the status (green/orange/red/pending/completed):
https://build.mozilla.org/buildapi/self-serve/try/rev/eb6877e334c8
If you need to actually see the logs, you have to load up tinderbox and search for your changeset across the different columns:
http://tinderbox.mozilla.org/admintree.cgi?tree=Try
There is a greasemonkey script to help with looking around:
http://www.google.ca/search?q=greasemonkey+tinderbox&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:unofficial&client=firefox-a
Comment 15•14 years ago
Um, no. If your workaround includes the word "tinderbox" then you're misunderstanding the problem.

Buildbot sends email to Tinderbox saying "a run of Linux opt mochitests-5/5 on 281be3877d80 which started at 1306774890 was orange, here's the log" and it stores that as linux-opt-mochitests-5/5-1306774890. Buildbot then sends email to Tinderbox saying "a run of Linux opt mochitests-5/5 on 777a46249d2a which started at 1306774890 was green, here's the log" and Tinderbox overwrites linux-opt-mochitests-5/5-1306774890, completely erasing any sign of the previous orange. Tinderbox believes that there's a 1:1 correspondence between job names and physical machines, so it doesn't believe it's possible for two jobs to start at the same time.

The actual, miserable, workaround for try is to not filter try email to the trash, and for anything which you got email saying it wasn't green to download the full logs from ftp, dig through them with no showlog.cgi to help you find the failure, and search Bugzilla with no help from tbpl to see if it's a known failure.
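The overwrite described in comment 15 can be modelled in a few lines of Python. This is only a toy sketch: the dictionary store and the names `storage_key` and `record` are invented for illustration and are not Tinderbox's actual code; the only thing taken from the comment is that a result is stored under job name plus start time, with the revision playing no part in the key.

```python
def storage_key(job_name, start_time):
    # Assumed key shape, following "linux-opt-mochitests-5/5-1306774890".
    return f"{job_name}-{start_time}"


def record(store, job_name, revision, start_time, result):
    # The revision is NOT part of the key, so a second run of the same job
    # with the same start time silently replaces the first one.
    store[storage_key(job_name, start_time)] = {
        "revision": revision,
        "result": result,
    }


store = {}
record(store, "linux-opt-mochitests-5/5", "281be3877d80", 1306774890, "orange")
record(store, "linux-opt-mochitests-5/5", "777a46249d2a", 1306774890, "green")
```

After both calls the store holds a single entry, for 777a46249d2a, and the orange run on 281be3877d80 is gone.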
Comment 16•14 years ago
So could buildbot be convinced to instead send email to Tinderbox saying "a run of Linux opt mochitests-5/5 on 777a46249d2a which started at 1306774890." + random(1000000) + " was green, here's the log"? What's a few milliseconds between friends?
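A rough Python sketch of what comment 16 proposes, appending a random fractional part to the reported start time, might look like the following. The function name and the `rng` parameter are hypothetical, and whether Tinderbox would accept a non-integer start time at all is an open assumption that this thread does not confirm.

```python
import random


def jittered_start_time(start_time, rng=random):
    # Mirrors '1306774890.' + random(1000000): append a random six-digit
    # fractional part so that two runs started in the same second are
    # unlikely to report the same start time.
    return f"{start_time}.{rng.randrange(1000000):06d}"


first = jittered_start_time(1306774890)
second = jittered_start_time(1306774890)
```

This only makes a collision unlikely (roughly one in a million per pair of same-second runs) rather than impossible.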
Assignee
Updated•10 years ago
Product: Webtools → Tree Management
Updated•10 years ago
Version: other → unspecified
Assignee
Updated•10 years ago
Product: Tree Management → Tree Management Graveyard