Closed Bug 1329957 Opened 8 years ago Closed 8 years ago

[mozmill] calendar recurrence tests regularly failing on linux

Categories

(Calendar :: Internal Components, defect, P2)

x86
Linux
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Taraman, Assigned: Taraman)

References

Details

(Whiteboard: [Thunderbird-testfailure: Z Linux])

Attachments

(5 files, 1 obsolete file)

The Mozmill recurrence tests fail most of the time on linux32 and sometimes on Linux 64 debug. Tests seen failing on Linux32: testWeeklyUntilRecurrence testWeeklyWithExceptionRecurrence testBiweeklyRecurrence Tests seen failing on Linux64: testAnnualRecurrence testLastDayOfMonthRecurrence ince the tests always fail on the first step after closing the event-dialog, it seems it has something to do with event-saving.
I set up a VM with linux32 now, to see if I can reproduce this issue, since the tests are failing all the time there. And it seems, that the datepickers are broken there. They are not displayed correctly and always show 1/1/1970. When running the test, the type command in these fields also everytime puts this value. This is the reason, why the tests which use typed dates fail. So it seems at the moment, this is not a test but, but a real bug.
Attached image evntDatePickerError.png (deleted) —
This is what the eventDialog looks like.
Hardware: Unspecified → x86
Thanks, that's an evidence I was looking for in bug 1325792.
Depends on: 1325792
Is there anything in the error console? If so, please comment in bug 1325792.
Flags: needinfo?(Mozilla)
Flags: needinfo?(Mozilla)
Aside from the Linux32 Problem, there are the Failures on Linux64, which seem to be timing related. It looks, that my function to see if the view loaded completely is not working 100%. So I will set every check for an element after view-change to a waitFor call instead nodeExists.
Damn, did I just report a duplicate of this in bug 1334783? The is certainly more visible since bug 1332708 landed.
Maybe the Linux32 issue is (also) related to the setup of the automation machines. There have been recently comments on other bugs that stated some obscure behaviour disappeared after switching to another desktop or widget engine. So, if modifying timeouts or making sure to have e.g. a menue list opened before trying to click one of its items like recommended in bug 1332708, it may be worth to try to reproduce the issue by setting up a VM with the same configuration that is used on automation (if you haven't that already).
On a second thought, the Linux32 may be hard to fix as long the datetime issue as documented with the screenshot exists on that platform, so we should make that a priority.
Depends on: 1229684
Attached patch hardenTests.diff (deleted) — Splinter Review
This should fix the intermittent failures in the tests by adding "waitFor..." functions after every view change.
Attachment #8833518 - Flags: review?(makemyday)
Thanks, we really need to get the trees (C-C and C-A) greener again. However, on Linux32 I still see: TEST-UNEXPECTED-FAIL | /builds/slave/test/build/tests/mozmill/cal-recurrence/testWeeklyWithExceptionRecurrence.js | testWeeklyWithExceptionRecurrence.js::testWeeklyWithExceptionRecurrence
Yes I know, but this is dependent on bug 1229684.
Well, the log says: INFO - SUMMARY-UNEXPECTED-FAIL | testWeeklyWithExceptionRecurrence.js | testWeeklyWithExceptionRecurrence.js::testWeeklyWithExceptionRecurrence INFO - EXCEPTION: Timeout exceeded for waitForElementNotPresent Lookup: /id("messengerWindow")/id("tabmail-container")/id("tabmail")/id("tabpanelcontainer")/id("calendarTabPanel")/id("calendarContent")/id("calendarDisplayDeck")/id("calendar-view-box")/id("view-deck")/id("day-view")/anon({"anonid":"mainbox"})/anon({"anonid":"scrollbox"})/anon({"anonid":"daybox"})/[0]/anon({"anonid":"boxstack"})/anon({"anonid":"topbox"})/{"flex":"1"}/{"flex":"1"}/{"flex":"1"}/{"tooltip":"itemTooltip","calendar":"mozmill"} INFO - at: utils.js line 447 INFO - TimeoutError utils.js:447 13 INFO - waitFor utils.js:485 11 INFO - MozMillController.prototype.waitFor controller.js:687 3 INFO - MozMillController.prototype.waitForElementNotPresent controller.js:709 3 INFO - testWeeklyWithExceptionRecurrence testWeeklyWithExceptionRecurrence.js:107 5 So that has nothing to do with bug 1229684. Or does it?
Comment on attachment 8833518 [details] [diff] [review] hardenTests.diff Review of attachment 8833518 [details] [diff] [review]: ----------------------------------------------------------------- Thanks, r=me with the below comment considered. ::: calendar/test/mozmill/cal-recurrence/testAnnualRecurrence.js @@ +58,4 @@ > > // day view > switchToView(controller, "day"); > + controller.waitForElement(lookupEventBox("day", ALLDAY, null, 1, null, EVENTPATH)); Can you please make this controller.waitForElement( lookupEventBox("day", ALLDAY, null, 1, null, EVENTPATH) ); This applies also for the other three changes in cal-recurrence/testAnnualRecurrence.js.
Attachment #8833518 - Flags: review?(makemyday) → review+
(In reply to Jorg K (GMT+1) from comment #14) > So that has nothing to do with bug 1229684. Or does it? The current patch is intended for hardening the tests as proposed in bug 1332708 but not making the passing on Linux32 instantly. As mentioned in comment 13, fixing the failing mozmill tests for that platform also depends on resolving the datetime formatter issue by landing bug 1229684. Setting the until date for the recurring event doesn't seem to result in the expceted recurrences which lets fail testing for the same here (which is what the quoted log message is about).
Style nits fixed, pushed to: -> https://hg.mozilla.org/comm-central/rev/56625e348834cc35cb8c8333efca97fc77ed113b Leaving the bug open for a moment, to see if the intermittent failures on linux64 cease. As laid out in comment #16 & comment #13, te linux32 failures are not due to faulty tests, but working tests failing due to a real bug.
Target Milestone: --- → 5.3
https://hg.mozilla.org/comm-central/rev/25eb5fc185536e83d8d54a019ba77a7427d9f0c6 (backout) https://hg.mozilla.org/comm-central/rev/2537a3a08ebd52eaf485d95fe693a1c50bd3f3ac (relanded) Please make sure you land patches with the correct bug number. Also please adhere to the standard format of the commit message.
Target Milestone: 5.3 → 5.6
@Jorg K: sorry for that, will keep an eye on that in the future. We have one more failure at a different spot on linux64 on tryserver: https://treeherder.mozilla.org/logviewer.html#?job_id=75030159&repo=try-comm-central&lineNumber=5878 Here, it looks that for some reason, the recurrence is not set correctly.
Attached image index.png (deleted) —
This is a screenshot taken by Mozmills screenshot function at the time of test failure. The Date should have been 2008-03-31 and there an occurrence is expected. It looks like the function gotToDate of the test-calendar-utils has a timing problem here. If you assume that the month march in the minimonth is clicked and the code to click on the month is executed before the display updates, you will land on 2008-03-02.
Now we have two more failures on linux32debug: 06:37:16 INFO - EXCEPTION: waitFor: Timeout exceeded for '() => { 06:37:16 INFO - return iframe.contentWindow.onLoad && 06:37:16 INFO - iframe.contentWindow.onLoad.hasLoaded == true; 06:37:16 INFO - }' 06:37:16 INFO - EXCEPTION: waitFor: Timeout exceeded for '() => mozmill.utils.getWindows("Calendar:EventDialog").length == 0' The first is while waiting for the event-dialog to open, the second is while witing for it to close- which is mainly because the preceding test stopped and left the dialog open. It seems, that the linux debug builds are the the ones having least performance. So maybe we should raise the timeout - although it seems that the standard 5 secs should be well enough...
Attached patch More fixes (deleted) — Splinter Review
This patch addresses the above 2 issues. The minimonth has an event if the month changed and we can check for this to ensure the day-matrix is updated. For the opening of the event-dialog I increased the timeout. Successful Test-run is here: https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=679a8fcd3a3ce8c23c36c90fde3a98b66493d63e And to be sure I started another one: https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=873c200a502ed469d4262a4e97f679135e3dfc56
Attachment #8835360 - Flags: review?(makemyday)
Whiteboard: [Thunderbird-testfailure: Z Linux]
Comment on attachment 8835360 [details] [diff] [review] More fixes Review of attachment 8835360 [details] [diff] [review]: ----------------------------------------------------------------- Looks good, thanks for keeping traction.
Attachment #8835360 - Flags: review?(makemyday) → review+
May I ask of this is done now? You may have noticed the whiteboard entry and the query on the treeherder status (my new toys): https://mzl.la/2lrQTHC So this bug is one of the ones that will make the tree greener, I hope. More tests are still coming in bug 983791 where I saw a huge patch on try.
I left this open because I want to watch Treeherder if there are any more failures not covered by these two patches. But we can also close it and open a new bug if need be. Bug 983791 could land soon, if we really want all the recurrences to be tested in rotated view - I asked Philipp if we keep it this way.
We have now 1 failure in the last 4 days: https://treeherder.mozilla.org/logviewer.html#?job_id=76808242&repo=comm-central&lineNumber=2714 With this rate of failure, it is hard to track this behaviour down. Is there any way to put some permanent debug code in the test - preferrably only on tryserver. This way I could put some things in the log to help understand what is happening.
12:34:58 INFO - SUMMARY-UNEXPECTED-FAIL | testWeeklyNRecurrence.js | testWeeklyNRecurrence.js::testWeeklyNRecurrence 12:34:58 INFO - EXCEPTION: Expression "{"value":"2009"}" returned null. Anonymous == false 12:34:58 INFO - at: elementslib.js line 486 12:34:58 INFO - reduceLookup elementslib.js:486 15 12:34:58 INFO - Lookup.prototype.getNode elementslib.js:501 10 12:34:58 INFO - MozMillController.prototype.click controller.js:527 17 12:34:58 INFO - goToDate test-calendar-utils.js:257 9 12:34:58 INFO - testWeeklyNRecurrence testWeeklyNRecurrence.js:52 5 12:34:58 INFO - Runner.prototype.wrapper frame.js:585 9 12:34:58 INFO - Runner.prototype._runTestModule frame.js:655 9 12:34:58 INFO - Runner.prototype.runTestModule frame.js:701 3 12:34:58 INFO - Runner.prototype.runTestDirectory frame.js:525 7 12:34:58 INFO - runTestDirectory frame.js:707 3 12:34:58 INFO - Bridge.prototype._execFunction server.js:179 10 12:34:58 INFO - Bridge.prototype.execFunction server.js:183 16 12:34:58 INFO - Session.prototype.receive server.js:283 3 12:34:58 INFO - AsyncRead.prototype.onDataAvailable server.js:88 3 This is the error we still get on the linux debug builds from time to time. It looks like the flaw is in the goToDate function of the calendar-utils. I will have a look how to fix this when I have time.
Priority: -- → P2
Attached patch fixGoToDate V1 (obsolete) (deleted) — Splinter Review
This is another patch addressing the EXCEPTION: Expression "{"value":"2009"}" returned null. Anonymous == false issue by replacing the generic timeout by a test that the value is in view. The positive side-effect is that this speeds up the function on fast systems/builds. Also the timeout for the monthchange-event to fire is raised, since we had timouts for this on the debug-builds. I will start a try run later.
Attachment #8860034 - Flags: review?(makemyday)
Here ist the try run: https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=ade81ade417736af6320d6a9038644cc06b9e00b unfortunately, there is still a timeout failure in the monthchange event. It looks, like there is a case, when the event does not fire. I will investigate further.
It looks, like the monthchange really takes a long time on some occasions, as some debug code revealed: >07:21:04 INFO - console.log: Change to 5-1-2009 >07:21:13 INFO - console.log: Debugstring: select year monthchange took 4.924s >07:21:14 INFO - console.log: Debugstring: select month monthchange took 0.616s on another run even: >07:52:13 INFO - console.log: Change to 1-1-1970 >07:52:18 INFO - --DOCSHELL 0x7f3095371800 == 14 [pid = 2320] [id = {77c12a9c-16de-43a0-a6f6-d37ec947d10c}] >07:52:31 INFO - console.log: Debugstring: select year monthchange took 12.652s Most interestingly the test didn't stop although the timeout was set to 5s only. The log-times also don't reflect the measured times - so maybe this is not too accurate... I will set the Timeout to 10s now to be sure. I thought about doing this only for debug builds, but I don't see any bad side-effects in raising the timeout for the other builds as well, so I will skip this code...
Attached patch fixGoToDate V2 (deleted) — Splinter Review
The patch as per previous comment
Attachment #8860034 - Attachment is obsolete: true
Attachment #8860034 - Flags: review?(makemyday)
Attachment #8862575 - Flags: review?(makemyday)
Comment on attachment 8862575 [details] [diff] [review] fixGoToDate V2 Review of attachment 8862575 [details] [diff] [review]: ----------------------------------------------------------------- I leave the length of the timeout to your discretion, r+ with the below comments considered. ::: calendar/test/mozmill/shared-modules/test-calendar-utils.js @@ +246,4 @@ > anon({"anonid":"yearcell"}) > `)); > > + function getYearListitem (aYear) { Make this let getYearListitem = function(aYear) { please or move this function to the top of goToDate before you define any variable to avoid strict warnings. @@ +251,5 @@ > + ${miniMonth}/anon({"anonid":"minimonth-header"})/ > + anon({"anonid":"minmonth-popupset"})/anon({"anonid":"years-popup"})/ > + {"label":"${aYear}"} > + `) > + } And this };
Attachment #8862575 - Flags: review?(makemyday) → review+
With the above fixed pushed to comm-central: -> https://hg.mozilla.org/comm-central/rev/9c284993401409fa2c6af5a0430197980b0e69bf This should fix this bug now.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: 5.6 → 5.7
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: