Closed
Bug 617626
Opened 14 years ago
Closed 14 years ago
increase space on ftp.m.o so we can save tinderbox builds for ~1 month
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: joe, Assigned: mrz)
References
Details
Right now we save tinderbox builds for something like 2 days, which is just not enough when trying to find regressions. We should buy enough disk to make it possible for us to save tinderbox builds for a lot longer.
This doesn't have to be fast disk, and it doesn't have to be redundant or backed up. It just needs to be faster to download than to build.
(The graphics team was bitten hard by this when trying to bisect a WebGL regression. It happened sometime in the last 3 days, but we don't save enough tinderbox builds to bisect.)
Comment 1•14 years ago
See also bug 463034, where I asked for this 2 years ago.
http://hourly-archive.localgho.st/hourly-archive2/ is an alternative.
Comment 2•14 years ago
FWIW, the current 24 hour expiry policy has us using ~60G in firefox/tinderbox-builds at the moment. That'll fluctuate depending on how many changes land each day.
We can trim that down if we only want to make the Firefox binaries available, since the tests and symbols archives are what's taking up most of the space. It depends on whether the need is to be able to re-run test suites multiple days later or just to be able to pull the builds and check them.
Comment 3•14 years ago
(In reply to comment #2)
> We can trim that down if we only want to make the Firefox binaries available,
> since the tests and symbols archive are what's taking up most of the space.
> Depends if the need is to be able to re-run test suites multiple days later or
> if you just want to be able to pull the builds and check them.
Talos runs would be useful, which means we need at least the symbols. Having only 24 hours to ask for more Talos runs sometimes makes things difficult.
Comment 4•14 years ago
(In reply to comment #2)
> FWIW, the current 24 hour expiry policy has us using ~60G in
> firefox/tinderbox-builds at the moment. That'll fluctuate depending on how many
> changes land each day.
At 60G per day it seems reasonable to be able to store at least 2 weeks worth of builds.
Updated•14 years ago
Assignee: server-ops → mrz
Comment 5•14 years ago
(In reply to comment #4)
> (In reply to comment #2)
> > FWIW, the current 24 hour expiry policy has us using ~60G in
> > firefox/tinderbox-builds at the moment. That'll fluctuate depending on how many
> > changes land each day.
>
> At 60G per day it seems reasonable to be able to store at least 2 weeks worth
> of builds.
Let's increase this by 2TB.
1) We're now posting logs for builds and tests to ftp.m.o, alongside the builds.
2) As agreed in the Tinderbox meeting yesterday, we'd like to keep the builds and their logs on ftp.m.o for ~1 month. (This is a reduction from our current ~60 day retention of *logs* on the tinderbox server, but an increase from our current retention of *builds* on ftp.m.o. It doesn't make much sense to keep logs without builds, so ~30 days is a good compromise. 60GB x 30 = 1.8TB, so rounding this up to 2.0TB.)
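For anyone wanting to redo the sizing with different numbers, here is a minimal sketch of the arithmetic above. It assumes the ~60GB/day figure from comment 2 holds steady (comment 2 notes it fluctuates) and uses decimal units (1TB = 1000GB); the function name is made up for illustration.

```python
# Rough capacity estimate for retaining tinderbox builds.
# Assumptions: a steady ~60 GB/day (from comment 2) and 1 TB = 1000 GB.

def required_storage_tb(gb_per_day: float, retention_days: int) -> float:
    """Return the storage, in TB, needed to keep `retention_days` of builds."""
    return gb_per_day * retention_days / 1000.0


if __name__ == "__main__":
    # Compare the 2-week figure floated in comment 4 with the ~1 month target.
    for days in (14, 30):
        print(f"{days} days: {required_storage_tb(60, days):.2f} TB")
```

Since the daily volume depends on how many changes land, rounding up (as done here, 1.8TB to 2.0TB) leaves only modest headroom.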
Severity: minor → normal
Summary: buy enough disk to make it possible to save tinderbox builds for at least 1 week → increase space on ftp.m.o so we can save tinderbox builds for ~1 month
Updated•14 years ago
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE
Comment 7•14 years ago
This change will consume much of the free space that bug 614786 will create so don't be surprised if you get a request to expand the space on the Netapp #1. Duping kinda obscures that this is asking for more resources.
Comment 8•14 years ago
(In reply to comment #6)
>
> *** This bug has been marked as a duplicate of bug 614786 ***
This is not a DUP; it's a second request for 2TB of additional space on ftp.m.o. This is to support storing builds and logs for a month. Unrelated to bug#614786.
Status: RESOLVED → REOPENED
Flags: needs-treeclosure?
Resolution: DUPLICATE → ---
Comment 9•14 years ago
Bug 614786 is supplying you with an additional 1.8 TB. So you need 2 TB more on top of that?
Comment 10•14 years ago
OK, the question was answered "yes" on IRC:
09:16:40 < joduinn> 2TB is for keeping trybuilds for longer
09:17:00 < joduinn> other 2TB is for keeping incremental builds for longer
Comment 11•14 years ago
Assuming there's space on the netapp shelves, we could in theory expand the space on the existing partitions once the move is done.
Comment 12•14 years ago
Summary after talking with aravind, aki, bhearsum:
1) aravind will add 2TB of space as "firefox/tinderbox-builds".
2) This disk space is RAID, but does not have HA heads.
3) Note: it is possible for this space to disappear if the head fails. If this happens:
* All ~30 days' worth of builds will disappear until the head is replaced. Estimate 1 day. No builds will actually be lost, and they will reappear as soon as the replacement head is online.
* This will *not* close the tree, because the existing high-availability mountpoint will automatically become visible, so new build-on-checkin builds can still post correctly, and still be copied down for testing.
* Bringing the new head back up will require a brief tree closure while we move over the files generated during the outage.
Comment 13•14 years ago
10.253.0.139:/data/tinderbox-builds is ready for use.
Comment 14•14 years ago
The new mount point and the bind mounts are in place, the old tree is being rsynced into the new one. We can call this done. I will delete the old tree once the rsync is done.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → FIXED
Comment 15•14 years ago
(In reply to comment #14)
> The new mount point and the bind mounts are in place, the old tree is being
> rsynced into the new one. We can call this done. I will delete the old tree
> once the rsync is done.
That rsync work is being done in bug#614786
Blocks: 614786
Comment 16•14 years ago
Slight change in *how* we implement this, after IRC discussion with aki, aravind, bhearsum, justdave, joduinn. Posting here (even though the bug is already closed) to keep all interested parties in the loop!
1) aki's point about not filling the HA space just because we have it is valid. It's also shared with release builds, nightlies, etc., and it would be nice not to always play "find more space" games.
2) We did promise to keep builds and logs for longer than we currently do, as part of the stop-using-tinderbox-server project.
3) At the time, we said 30 days (arbitrary), based on the current setup of tinderbox builds kept for 1 day and tinderbox build logs kept for 60 days.
4) I'm proposing that we keep 30 days, and that we now do it as 14 days on HA and 16 days on non-HA. This means that if the non-HA disk fails, it will not close the tree.
5) aki proposed something about using softlinks to avoid breaking links in bugs, blogs, etc. If that is something aki and justdave can get in place, that would be great.
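To make the proposed 14/16 split concrete, here is a small sketch of how a build's age could map to a storage tier. This is only an illustration of the proposal in item 4, not anything that was deployed; the tier names and the function are hypothetical, and the softlink idea from item 5 is not modelled.

```python
from datetime import datetime, timedelta

# Assumed values taken from the proposal above; tier names are hypothetical.
HA_DAYS = 14      # newest builds stay on the high-availability volume
TOTAL_DAYS = 30   # overall retention; the remaining 16 days sit on non-HA disk


def storage_tier(build_time: datetime, now: datetime) -> str:
    """Classify a build as 'ha', 'non-ha', or 'expired' by its age."""
    age = now - build_time
    if age < timedelta(days=HA_DAYS):
        return "ha"
    if age < timedelta(days=TOTAL_DAYS):
        return "non-ha"
    return "expired"
```

With this split, a failure of the non-HA disk would only hide builds older than 14 days, which is why it would not need to close the tree.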
Comment 17•14 years ago
I've adjusted the cron job that cleans up firefox/tinderbox-builds to keep builds for 30 days. [surf:/etc/cron.d/cleanup-hourly-builds]
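The contents of that cron job are not shown in this bug. As a rough sketch of what an age-based cleanup generally looks like, the following lists entries older than a retention window without deleting anything. The function name, the use of modification time, and the flat directory layout are all assumptions for illustration.

```python
import os
import time
from typing import List, Optional


def expired_entries(root: str, max_age_days: int = 30,
                    now: Optional[float] = None) -> List[str]:
    """List entries directly under `root` whose mtime is older than the window.

    Dry run only: nothing is deleted. The caller decides what to remove.
    """
    if now is None:
        now = time.time()
    cutoff = now - max_age_days * 86400  # retention window in seconds
    expired = []
    for entry in os.scandir(root):
        # Use the entry's own mtime rather than a symlink target's.
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            expired.append(entry.path)
    return sorted(expired)
```

Returning the list first, rather than deleting in the same pass, makes it easy to review what a change in retention (for example, from 1 day to 30 days) would actually affect.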
Comment 18•14 years ago
(In reply to comment #16)
> 4) I'm proposing that we keep 30 days, and we now do it as 14days on HA, and 16
> on nonHA. This means that if the nonHA disk fails, it will not close the tree.
> 5) aki proposed something about using softlinks to avoid breaking links in
> bugs, blogs, etc. If that is something aki and justdave can get in place, that
> would be great.
joduinn, this isn't the case right now. Please follow up if you feel strongly about it.
Comment 19•14 years ago
Comment 20•14 years ago
I think it is important for links to tryserver builds to stay consistent so that we can paste them in bugs and have them work for the full time period. Do I need to file a new bug about that?
Comment 21•14 years ago
(In reply to comment #20)
There's already one on file - bug 615963.
Updated•10 years ago
Product: mozilla.org → mozilla.org Graveyard