Closed Bug 494165 Opened 16 years ago Closed 15 years ago

Mochitest should know how to split tests up into N pieces

Categories

(Testing :: Mochitest, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.9.3a1

People

(Reporter: catlee, Assigned: catlee)

References

Details

Attachments

(1 file, 4 obsolete files)

In order to split up mochitests into smaller pieces and have each piece run in parallel on our build infrastructure, it would be nice if the mochitest test harness knew how to split up the tests into pieces. One idea is to have the harness understand --total-pieces=N and --this-piece=M arguments. Given the same packaged tests, the harness would partition the set of tests into the same N pieces on each machine, and then would only run tests in the M'th piece on any given machine.
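A minimal sketch of what such deterministic partitioning could look like on the harness side, assuming hypothetical totalChunks/thisChunk parameters and a gTestList array like the one the in-browser runner uses (illustrative only, not the eventual patch):

```js
// Sketch: pick the M'th of N roughly-equal slices of the test list.
// Sorting first keeps the partition stable for a given set of tests,
// regardless of the order in which the files were discovered.
function selectChunk(gTestList, totalChunks, thisChunk) {
  var tests = gTestList.slice().sort();
  var perChunk = Math.ceil(tests.length / totalChunks);
  var start = (thisChunk - 1) * perChunk;   // thisChunk is 1-based
  return tests.slice(start, start + perChunk);
}
```

Every machine that runs this over the same packaged tests computes the same N slices, so machine M only needs to know its own chunk number.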
We need to make sure that if somebody adds new tests, it doesn't affect which piece existing tests are placed into.
Why? As long as you continue to split it into equal parts, why does it matter which bin the tests fall into?
It can change the order in which tests are run in relation to each other. Consider the case where you have 4 tests, A,B,C,D that are split up into two pieces, [A,B], [C,D]. If somebody adds two more tests, E,F, and you just split into equal parts, then your new pieces look like [A,B,C], [D,E,F]. Now C, which used to run as the first test in its piece, runs as the last. Tests _should_ be independent, but I suspect they're not. Having the order change like this could make tracking down problems tricky.
I don't think we should be overly worried about that. If we're going to split the tests up, they're going to have to work well in that case.
Attachment #382817 - Flags: review?(ted.mielczarek)
(In reply to comment #3)
> It can change the order in which tests are run in relation to each other.
>
> Consider the case where you have 4 tests, A,B,C,D that are split up into two
> pieces, [A,B], [C,D].
>
> If somebody adds two more tests, E,F, and you just split into equal parts, then
> your new pieces look like [A,B,C], [D,E,F].
>
> Now C, which used to be run as the first test in its piece is run as the last.

...and also D is now run without C running beforehand. Whoever adds the new E,F would be on the hook to figure out why E,F broke C,D in that situation.

Further complication is that even without anyone adding new tests, I believe we are changing partition size based on number of available slaves? In that scenario, we could get [A], [B], [C], [D] one day, and then [A,B], [C,D] the next day. (Think cases where we add new slaves, or took some out of the pool for repairs.) Or [A,B],[C,D] on one o.s. and [A],[B],[C],[D] on another o.s. because of the different numbers of slaves on each o.s.

It's bad if adding a test "causes" another, unrelated test to fail out. It feels worse if tests appear to work in one run and fail in a later run, with no apparent related change, because of dynamic partitions and unknown inter-dependencies. If we get into this situation, do we have an easy way to see which tests were run in which partition, so anyone chasing problems like this could figure out a pattern?

> Tests _should_ be independent, but I suspect they're not. Having the order
> change like this could make tracking down problems tricky.

(In reply to comment #4)
> I don't think we should be overly worried about that. If we're going to split
> the tests up, they're going to have to work well in that case.

Totally agree; it's right and proper for tests not to be interdependent. Manually breaking them up will also trip over this, but at least it gives you known, consistent scenarios, which are easier to debug and fix. My fear is that there are unknown interdependencies which we will uncover by using dynamic partitioning like this, and which, by its dynamic nature, will be harder to debug and fix.

In IRC chat earlier, ted, catlee, and myself agreed to go ahead and see how much of a problem this turns out to be. Or not. Fingers crossed.
(In reply to comment #6)
> Further complication is that even without anyone adding new tests, I believe we
> are changing partition size based on number of available slaves? In that
> scenario, we could get [A], [B], [C], [D] one day, and then [A,B], [C,D] the
> next day. (Think cases where we add new slaves, or took some out of the pool
> for repairs.) Or [A,B],[C,D] on one o.s. and [A],[B],[C],[D] on another o.s.
> because of the different numbers of slaves on each o.s.

We've never considered changing the partition size based on the number of available slaves. You must be thinking of something else. Our plan has always been to pick a semi-hardcoded number of partitions that gives a reasonable balance between test run time and the overhead of the setup involved.
Assignee: ted.mielczarek → catlee
Status: NEW → ASSIGNED
Attachment #382817 - Attachment is obsolete: true
Attachment #382817 - Flags: review?(ted.mielczarek)
Comment on attachment 384199 [details] [diff] [review]
Add --this-chunk, --total-chunks, --chunk-by-dir, and --shuffle support to runtests.py

The --chunk-by-dir option groups tests together by the top N directories. The tests are then split into chunks according to which directory they're in. --chunk-by-dir=2 will allow you to split up all the component directories. ('tests' is always the first directory, hence the need to descend to the second level.)
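A rough sketch of the by-directory grouping described above, assuming forward-slash test paths and a chunkByDir depth (the names are illustrative, not taken from the patch):

```js
// Illustrative only: key each test path by its first chunkByDir directory
// components, so whole directories can be dealt out to chunks without a
// directory ever being split across two chunks.
function groupByDir(testPaths, chunkByDir) {
  var dirs = {};
  for (var i = 0; i < testPaths.length; i++) {
    // e.g. "tests/dom/test_foo.html" with chunkByDir=2 -> "tests/dom"
    var key = testPaths[i].split("/").slice(0, chunkByDir).join("/");
    if (!dirs[key])
      dirs[key] = [];
    dirs[key].push(testPaths[i]);
  }
  return dirs;
}
```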
Attachment #384199 - Flags: review?(ted.mielczarek)
Blocks: 502689
Comment on attachment 384199 [details] [diff] [review]
Add --this-chunk, --total-chunks, --chunk-by-dir, and --shuffle support to runtests.py

Overall looks pretty good, just one main comment. I think you should drop the "--chunk-by-dir" option. If you want test chunking to be by-directory, then just make that the default behavior. You're implementing this for your own use, so there's no need to support theoretical use cases. Just implement what you want to use. r- for that alone; nits below.

+ var dir = test_path.split("/", chunkByDir).join("/");

When you fix the above, I think it's ok to hardcode either the number of directories here, or the known path prefix to strip out to get to the level of directories you care about.

+ TestRunner.logger.log("Running tests in " + dir);

This seems like it'd get a little verbose; could you just build up the list of dirs and print them all on one line at the end?

+ } else {
+   var my_tests = gTestList;
+ }

I'd prefer if you declared |var my_tests = gTestList;| up above the if block, and dropped the else here.

+ if (params.shuffle) {

I guess there's no better way to do this in JS, eh? Makes me pine for random.shuffle.
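For the shuffle nit, an in-place Fisher-Yates pass is about as close as plain JS gets to Python's random.shuffle. A sketch, not taken from the patch:

```js
// Fisher-Yates shuffle: swap each element with a random earlier-or-equal
// index, giving a uniform random permutation of the array in place.
function shuffle(array) {
  for (var i = array.length - 1; i > 0; i--) {
    var j = Math.floor(Math.random() * (i + 1));
    var tmp = array[i];
    array[i] = array[j];
    array[j] = tmp;
  }
  return array;
}
```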
Attachment #384199 - Flags: review?(ted.mielczarek) → review-
Also, the bug title seems to indicate that you'd want this for other test suites. Do you care about those, or is mochitest sufficient?
I think for mobile it would be nice to move to a model based on how many tests to run per chunk. It would be nice to say "run 10 tests per chunk through all the mochitest files". That idea doesn't work for the scenario of test dependencies, though. I don't want to change the focus of this bug, but would manifest files be a good approach here? If we pick a base chunk size, we could make each chunk have a manifest file.
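If chunking were instead driven by a fixed chunk size rather than a fixed chunk count, the split could look roughly like this (purely illustrative; chunkSize is a hypothetical parameter, not one the patch adds):

```js
// Hypothetical fixed-size chunking: each chunk holds at most chunkSize tests,
// and the number of chunks follows from the size of the test list.
function fixedSizeChunks(tests, chunkSize) {
  var chunks = [];
  for (var i = 0; i < tests.length; i += chunkSize)
    chunks.push(tests.slice(i, i + chunkSize));
  return chunks;
}
```

Each chunk could then be written out as its own manifest file, if that route were taken.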
Assignee: catlee → nobody
Assignee: nobody → catlee
Attachment #384199 - Attachment is obsolete: true
Attachment #398395 - Flags: review?(ted.mielczarek)
Attachment #398395 - Attachment is obsolete: true
Attachment #398395 - Flags: review?(ted.mielczarek)
Attachment #399456 - Flags: review?(ted.mielczarek)
Attachment #399456 - Flags: review?(ted.mielczarek) → review+
Comment on attachment 399456 [details] [diff] [review] Add --this-chunk, --total-chunks, --chunk-by-dir, and --shuffle support to runtests.py I still think at some point we just want to hardcode the right behavior here, but let's just get this in now so you can try it out.
Priority: -- → P3
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.3a1
Backed out, this appears to have broken mochitest on tinderbox:
http://hg.mozilla.org/mozilla-central/rev/31dba3bc44f4
http://hg.mozilla.org/mozilla-central/rev/68b5e9772b4c
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1253536109.1253537381.23599.gz

Server listening on port 4443 with cert pgo server certificate
0 INFO SimpleTest START
1 INFO Running undefined...
command timed out: 1200 seconds without output, killing pid 36138
process killed by signal 9
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Same as before, except the my_tests variable is moved up one level in scope.
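In the spirit of that change (a sketch reusing the hypothetical helpers above, not the actual diff): declare my_tests once before the branch so both the chunked and non-chunked paths assign to the same variable.

```js
// Sketch of the scoping fix: my_tests is declared ahead of the if block, so
// the shuffle step below always sees the same variable.
var my_tests = gTestList;
if (params.totalChunks && params.thisChunk) {
  my_tests = selectChunk(gTestList, params.totalChunks, params.thisChunk);
}
if (params.shuffle) {
  shuffle(my_tests);
}
```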
Attachment #399456 - Attachment is obsolete: true
Attachment #401912 - Flags: review?(ted.mielczarek)
Attachment #401912 - Flags: review?(ted.mielczarek) → review+
Status: REOPENED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Blocks: 487689
Summary: Test harnesses should know how to split tests up into N pieces → Mochitest should know how to split tests up into N pieces