Closed Bug 641956 Opened 14 years ago Closed 14 years ago

CLI option '--restart' should restart between each test and reset the profile

Categories

(Testing Graveyard :: Mozmill, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: k0scist, Assigned: k0scist)

References

Details

(Whiteboard: [mozmill-2.0+])

Attachments

(2 files)

The --restart flag should restart between each test and reset the profile. This will ensure that you're running each test in isolation.
Whiteboard: [mozmill-2.0?]
Assignee: nobody → jhammel
Summary: restart should restart between each testa and reset the profile → CLI option '--restart' should restart between each test and reset the profile
Whiteboard: [mozmill-2.0?] → [mozmill-2.0+]
So doing this will break the ability to run restart tests as they are now. The combination of killing state (profile data) and not depending on --restart to just go through tests in order renders the current form of restart tests useless. I have no problem with this, since the tests should be updated to fit the 2.0 model: a test == a file, and restarts can be managed internal to the test. We'll likely have other changes we need to make anyway, and it's probably better to encourage good behaviour (properly atomic tests) instead of allowing bad behaviour to persist (restart tests are files that happen to be in a specific order).
Other things to consider:

- Currently, mozmill:CLI.run does the thing where it rearranges the tests into groups and runs them that way. Since we're already restarting and stopping in the MozMill class, I'm inclined to move this logic there.
- Currently (also), restart tests are marked in manifests with a 'restart' key. In doing this change, what does this mean for manifests and tests with a restart key? Possibilities:

1. A key of 'restart' is the equivalent of a test with restart. That is, the runner is started, the test is run, the runner is shut down, and the profile is reset (or however you want to phrase that flow; I'm being intentionally hand-wavy, the point is that the test runs in its own box).
2. The key of 'restart' is obsoleted; --restart is the only way to control this behaviour.
3. The key of 'restart' does anything else; probably a bad idea.

I'm in favor of 1, but would entertain objections. I'll go ahead with my idea of how this should work unless anyone objects.
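To make the "test runs in its own box" flow concrete, here is a minimal sketch of the boxed-test loop. `Runner` and `Profile` are illustrative stand-ins for this comment's hand-wavy flow, not Mozmill's real classes:

```python
class Profile:
    """Stand-in for a browser profile; reset() wipes accumulated state."""
    def __init__(self):
        self.state = {}

    def reset(self):
        self.state.clear()


class Runner:
    """Stand-in for the application runner started/stopped around tests."""
    def __init__(self, profile):
        self.profile = profile
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False


def run_tests(tests, profile, restart=False):
    """Run each test; with restart=True every test gets its own box:
    the runner is stopped and the profile reset after each test."""
    runner = Runner(profile)
    results = []
    runner.start()
    for test in tests:
        results.append(test(runner))
        if restart:
            runner.stop()
            profile.reset()
            runner.start()
    runner.stop()
    return results
```

With restart=False the second test sees whatever state the first left behind; with restart=True each test starts from an empty profile.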
The other ambiguity, which may only affect the manifest case (or maybe makes option 2 from comment 2 look better), is: when do you reset the profile? Before the restart test? After the restart test? Both?
(In reply to comment #3)
> The other ambiguity, which may only affect the manifest case (or maybe option
> 2. for comment 2 look better) is when do you reset the profile? Before the
> restart test? after the restart test? both?

I would think before the test. Both doesn't make much sense, and if a test needs a clean profile, it shouldn't have to rely on the test before it to clean the profile up.
I guess to me it's ambiguous whether 'restart' + cleaning the profile means "okay, I want a clean profile before running this test", "I want a clean profile after running this test", or both. Since it's quick, I'm inclined to go for both for now (the sledgehammer effect). We also have to make clean profiles cleaner, at least in the specified case. I'll endeavor to write a bug for that soon.
As I mentioned yesterday, I'd be in favour of obsoleting both the 'restart' key and '--restart' and instead just having a 'reset-profile' key. In 2.0, 'normal' tests have the ability to restart the browser, so the *only* use case we could come up with for using --restart was to clean the profile between tests. We could tell people to use '--reset-profile' to reset the profile between tests (or it could be set on a per-test basis via manifest), and that the fact the browser restarts is just a necessary side effect of resetting the profile. This has the obvious advantage of completely abolishing the notion of 'normal' and 'restart' tests. There would only be tests.

tl;dr: I'm in favour of 1, but renaming 'restart' to 'reset-profile'.
Attachment #520232 - Flags: feedback?
So whatever it's called (--restart, --reset, --reset-profile, --random-flag) doesn't have anything to do with the behaviour. I just want to figure out the behaviour desired. Which comes down to a choice of at least two things:

1. Tests should, individually, be able to declare (i.e. through manifests) whether they run in a box (stopping the runner before, resetting the profile, stopping the runner after, resetting the profile) or whether they don't need to do that.
2. Same as 1, but it is a global (or at least per-`run()` call) decision rather than a per-test decision.

I was leaning towards 1. yesterday....today more towards 2. (not for any code complexity reason, fwiw). But I think it's best just to put up both patches and see which one is worth taking, and *then* worry about what we call the damn thing.
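The two choices can be contrasted in a few lines. A hypothetical sketch: the dicts below stand in for parsed manifest entries, and neither function is Mozmill's real API; they only illustrate where the decision lives.

```python
def boxed_tests_per_test(tests):
    """Choice 1: a test is boxed iff its own manifest entry says so."""
    return [t['name'] for t in tests if t.get('restart', False)]


def boxed_tests_global(tests, restart):
    """Choice 2: a single per-run flag boxes every test, or none."""
    return [t['name'] for t in tests] if restart else []
```

With choice 1 the answer varies per manifest entry; with choice 2 one flag decides for the whole run.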
If 2. is taken, it should also be decided whether this is an API-level flag (i.e. a parameter to MozMill.run() and MozMill.run_tests() ) or just a CLI-level flag. I'm inclined to the latter as it will make the code simpler and make the MozMill class do less, but can see the argument for the other case.
Blocks: 607111
Some beautiful myths and why they're wrong:

1. Tests should clean up after themselves, so you shouldn't need to restart or reset the profile between them.

False. Tests should clean up after themselves, no question. But no one is perfect. In practice, it is desirable to be able to run a series of tests each in their own black box, so you can be assured that if a test doesn't properly clean up after itself, subsequent tests aren't affected.

2. Since you want to ensure each test has a blank state, you should reset the profile and restart between each test.

False. If computers were infinitely fast, this would be ideal. But the fact of the matter is that restarting (e.g.) Firefox, and the various Mozmill sleeps associated with it, makes this much more time consuming than running the tests straight through. If tests *reasonably* clean up after themselves (see 1.), then they may normally be run without the restart between each test, with restarting and resetting the profile reserved either for debugging or for other cases where isolation is demanded.

3. Being able to selectively restart and reset the profile in the manifest would be a pretty cool feature.

False. The primary purpose of manifests is to get a list of tests to run and associated data. Note that, in general, associated data is something like "doesn't run on windows" or "disabled due to some bug". Since you can freely pass key/value metadata, the harness (Mozmill, in this case) is free to do whatever it wants with it. However, it should ideally not be used as a functional change to the test or harness; this violates that rule. Also, think about the use case. Each file == a test (which may freely restart internally). You want to run tests in isolation. ... When would you want to selectively run tests in isolation? As seen in points 1. and 2., tests should clean up after themselves and, if you're willing to pay the cost, you can ensure this isolation by restarting between each file.
If you know that a certain test needs isolation, you should just clean up after yourself in the test (or surrounding tests). So I've pretty much convinced myself that, despite my patch, --restart (or again, whatever we end up calling it) should just be a global switch. Whether it should be part of the API or not, I don't know. Initially I'll just add it to the CLI class, which should (in conjunction with the existing patch) give some idea of how the alternative would look.
Comment on attachment 520232 [details] [diff] [review]
method 1: allow restart to be specified as a flag on the test

This is awful; see comment 10.
Attachment #520232 - Flags: feedback? → feedback+
Attachment #520232 - Flags: feedback+ → feedback-
Attached patch Option 2. from comment 2 (deleted) — Splinter Review
I propose this as the way to go. When reviewing, the various options should be taken into account, but this is my suggestion.
Attachment #520302 - Flags: review?(ctalbert)
Attachment #520302 - Flags: feedback?(fayearthur+bugs)
Heh. Quite a long discussion, mostly with you!

FWIW, I think myth 2 is no myth. I far prefer to do a hard reset of AUT state between tests, because I fully believe that if you're doing anything more complicated in teardown than deleting a mock object, rolling back a database, or some other one-fell-swoop reversal, you're doomed to failure. Stepped reversals of operations with side effects on external datastores (like the profile) simply Don't Work unless you follow very specific patterns which are hard to coordinate and enforce (and which we don't follow). Even then they sometimes don't work (hypothetical example: an audit log can't be reversed, since the reversal adds to the log too). And the best reversal pattern won't take into account unanticipated side effects.

If I'd been in on the design at the beginning, I'd have warned us -vehemently- away from that approach. I've attempted to use it in the past, and it's caused many, many very hard to diagnose and debug issues. Even if you don't believe me that by nature that process is so fragile as to be completely undesirable, simply consider that when the AUT isn't reliable (which it never is, or we wouldn't be testing it), reversing state with the AUT is also not reliable.

It's unfortunate that the only way we have to do a full reversal in our case is a complete (and slow) app restart. But I still think it's the way to go, and rather than putting time into trying to make tests clean up perfectly, I'd put time into figuring out how to reset the app faster: either by getting support for profile reset put into the app itself, or by figuring out a quicker way to restart. Failing all that, I'd look for ways to split up the test run into parallel VMs to compress time. Simply put, a fast and unreliable test run is simply unreliable. Reliability is the floor before the test run means anything at all; speed is completely secondary.

But all that aside, I agree: Option 2; comment 2.
If there becomes a use case for per-test options, we can add it later, but what I'm primarily looking for is a single test-run override that applies to all tests.
Comment on attachment 520302 [details] [diff] [review]
Option 2. from comment 2

I like this.
Attachment #520302 - Flags: feedback?(fayearthur+bugs) → feedback+
(In reply to comment #13)
> FWIW, I think myth 2 is no myth. I far prefer to do a hard reset of AUT state
> between tests, because I fully believe that if you're doing anything more
> complicated in teardown than deleting a mock object, rolling back a database,
> or some other one-fell-swoop reversal you're doomed to failure.

Please keep in mind that we do not want to exercise such a clean-up in all cases of our tests. Migration tests, for example, should never reset any data which has been built up before the major upgrade. Otherwise there will be no way for us, in a future migration test-run, to check whether all the profile data from an older version of Firefox has been completely brought over.

Further - beside a couple of false positive failures - we have already found a couple of issues in Firefox itself which only manifested because tests didn't clean up all modified data. If we always reset everything in between tests, we would lose the capability to be an equivalent of manual tests, which are always building up a polluted profile. Clearing up only parts of the profile is already a quite complex process, and it becomes even more complex when restart tests can be injected in the middle of normal tests.
Status: NEW → ASSIGNED
(In reply to comment #15)
> Please keep in mind that we do not want to exercise such a clean-up in all
> cases of our tests. Migration tests for example should never reset any data
> which has been built-up before the major upgrade. Otherwise there will be no
> way for us in a future migration test-run to check if all the profile data from
> an older version of Firefox has been completely brought over.

Sure, there are exceptional cases where you count on old data being there. I would typically do that as a fresh test on a pre-loaded-up profile, though. Also, to be clear, I'm not referring to restart-type tests where you do multiple phases on the same profile. Those have to work without resetting in between phases.

> Further - beside of a couple of false positive failures - we have already found
> a couple of issues in Firefox itself, which have been manifested because tests
> didn't clear-up all modified data. If we would always reset everything
> in-between the tests, we would lose the capability to be an equivalent of
> manual tests, which are always building up a polluted profile.

It's not a good tradeoff, in my opinion. Yes, you may find a few errors you can actually track down using a profile that's accumulating data. But I'd prefer to have something like a UI fuzzer/monkey that has audit logging, profile snapshotting, etc. in place to tell you what might have led to the problem. Those are made to actually generate and diagnose that kind of error. More often you introduce errors you can't possibly isolate, or intermittent failures, and more to the point, you possibly compromise even all passing tests done against the polluted profile by not having a known start state. Having a known start state is such an important quality for a regression test that I believe anything that can make it unknown should generally be eliminated if at all possible.
If, from there, you're still slipping a lot of corner/edge bugs that come down to a pre-existing profile, you expand your testing as mentioned above to cover them.
Comment on attachment 520302 [details] [diff] [review]
Option 2. from comment 2

Looks good. We need to also change the help text on the --restart option to call out this behavior, i.e. "--restart restarts the application and resets the profile between each test file." r+ with that change.
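For illustration, the requested help text could be wired up roughly like this. This is a sketch using argparse; Mozmill's actual CLI plumbing may differ, and `make_parser` is a hypothetical helper, not part of Mozmill:

```python
import argparse

def make_parser():
    """Build a CLI parser whose --restart help text calls out the
    restart-and-reset behavior described in the review comment."""
    parser = argparse.ArgumentParser(prog='mozmill')
    parser.add_argument('--restart', action='store_true', default=False,
                        help='restart the application and reset the '
                             'profile between each test file')
    return parser
```

Keeping the flag in the CLI layer only (rather than as a MozMill.run() parameter) matches the preference stated in comment 7.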
Attachment #520302 - Flags: review?(ctalbert) → review+
pushed to master with the CLI --help change: https://github.com/mozautomation/mozmill/commit/de60cbe152eaf3957ba43213fee4e3650ed5a65b

Running with the internal test manifest, I witness bug 639991, but I'm not going to fix that here, though it should be fixed for 2.0.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
landed a follow-up at https://github.com/mozautomation/mozmill/commit/3ce59b775edb30ea098bb5b22f072fed6e768785; this works around our broken check and in general works better for skipped tests
Product: Testing → Testing Graveyard