Closed Bug 413713 Opened 17 years ago Closed 17 years ago

Need rapidly cycling Talos boxes to replace old perf test machines

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rcampbell, Assigned: anodelman)

Before we can mothball the old perf test machines, we need to provide a rapidly-cycling talos box that can reproduce as closely as possible what the old machines provide.

The requirements are:

- 10-30 minute cycle times
- a pared-down pageset for running Tp
- the other tests: Ts, Tdhtml, and Txul

I've been spouting for a little while to whoever would listen that a pared-down talos pageset should be possible, even one that tracks the "real" talos numbers. My thinking is that if we can get our hands on a stats software package, we oughta be able to run a regression analysis on each of the ~400 pages, and find the N pages that, taken together, correlate well with real Tp over the history we currently have.

N could be 50, or 30, or 10 -- whatever tradeoff we want to make between rapid sniff tests and probability of catching regressions. I'm thinking that we name a cycle time, 10 minutes say, and keep running down the list of "next most useful page" until we hit that limit.
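
A rough sketch of that selection idea, in Python, under several assumptions that are not from this bug: per-page history is available as a dict mapping each page to one load time (ms) per historical run, "real Tp" is approximated as the plain mean over all pages (Talos may compute it differently), and the cycle-time budget is approximated by summed mean page load time, ignoring browser start-up. The names `pearson`, `pick_pages`, `history`, and `budget_ms` are hypothetical. It uses greedy forward selection on correlation rather than a full regression analysis.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def pick_pages(history, budget_ms):
    """Greedily pick the "next most useful page" until the time budget is hit.

    history: dict of page name -> list of load times in ms, one per run
             (hypothetical layout; every list must have the same length).
    budget_ms: total mean load time allowed for the chosen pages (assumed proxy
               for cycle time).
    """
    pages = list(history)
    runs = len(next(iter(history.values())))
    # Assumed stand-in for "real Tp": mean over all pages, per run.
    full_tp = [sum(history[p][r] for p in pages) / len(pages) for r in range(runs)]
    # Mean load time of each page, used as its cost against the budget.
    cost = {p: sum(history[p]) / runs for p in pages}

    chosen, totals, spent = [], [0.0] * runs, 0.0
    while True:
        best = None
        for p in pages:
            if p in chosen or spent + cost[p] > budget_ms:
                continue
            # Subset "Tp" per run if page p were added to the current choice.
            trial = [(t + v) / (len(chosen) + 1) for t, v in zip(totals, history[p])]
            score = pearson(trial, full_tp)
            if best is None or score > best[0]:
                best = (score, p)
        if best is None:  # nothing else fits in the budget
            break
        chosen.append(best[1])
        totals = [t + v for t, v in zip(totals, history[best[1]])]
        spent += cost[best[1]]
    return chosen
```

Calling `pick_pages(history, budget_ms)` returns the pages in the order they were picked, so truncating the list gives smaller candidate sets for shorter cycle times. How well any such subset tracks the full Tp would still have to be checked against the real historical data.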

We'll want to be clear that this isn't the real Tp, that full Talos Tp results win all arguments.  Tp_snifftest?  Tquickcheck?  Something.

What are the odds that we could get some kind of straight DB dump that has historical scores for every page?  Maybe the CSV dumps from graph server are enough...
The CSV dumps would be enough, but you'd have a hard time going through enough historical data to get them.  To do this kind of analysis you really just want a copy of the database.

That said, I bet someone could spend an afternoon and manually identify pages that would be useful to have in the set.
Can we just get a mysql dump of the db Aravind?
(In reply to comment #3)
> Can we just get a mysql dump of the db Aravind?
> 

Yup, people.mozilla.com:/tmp/graphs_mozilla_org.2008.01.24.sql.gz
From a meeting today with me, schrep, robcee & joduinn our current thinking is as follows:

- we want rapidly cycling talos boxes so that we can retire tinderbox perf test boxes
- in terms of the fastest route to get us to that point, we need to have tests that run as quickly as the tinderbox perf tests do
- thus, to start, we will just set up a set of talos boxes (win/mac/linux) to run the same tests that the current tinderbox perf test boxes are running (tp with the historic pageset, ts, tdhtml, txul)

This has the advantage of not requiring any statistical analysis of the new page set to determine an interesting subset of pages, and it pretty much exactly replaces the machines that we want to retire.

We can still pursue the concept of a small-scale talos machine that runs pared-down versions of all the tests we have (including tgfx, tsvg, etc) - but that would be considered a separate project.
That's much easier, I agree.  :)
(In reply to comment #5)
> From a meeting today with me, schrep, robcee & joduinn our current thinking is
> as follows:

++ to this plan.   
If it is *super* easy, it would be great to get a set of 1.8 boxes running the same tests - I'm curious about how we are doing on the old Tp pageset on branch vs. trunk.
Depends on: 414456
Depends on: 416237
Assignee: nobody → anodelman
We now have both a set of trunk fast and branch fast machines reporting.  See Firefox & Mozilla1.8 waterfalls.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Mass move of Core:Testing bugs to mozilla.org:ReleaseEngineering. Filter on RelEngMassMove to ignore.
Component: Testing → Release Engineering
Product: Core → mozilla.org
QA Contact: testing → release
Version: Trunk → other
Product: mozilla.org → Release Engineering