Closed Bug 585286 Opened 14 years ago Closed 12 years ago

Screensaver activates sometimes on Rev3 Fedora mochitest machines

Categories

(Release Engineering :: General, defect)

All
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: karlt, Unassigned)

References

Details

(Keywords: intermittent-failure, Whiteboard: [screensaver])

Attachments

(6 files)

I set up the attached program to run and print screen saver state on mochitest timeout (using a variant of attachment 463756 [details] [diff] [review]).

When the screen saver is off and auto-activation is disabled (as expected), the results look something like:

Rev3 Fedora 12 tryserver opt test mochitest-other on 2010/08/05 21:14:36
s: talos-r3-fed-031
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1281068076.1281069032.24698.gz#err4
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/browser/components/privatebrowsing/test/browser/browser_privatebrowsing_beforeunload_enter.js | application timed out after 330 seconds with no output
XScreenSaver state: Disabled
XScreenSaver kind: Blanked
User input idle for 918 seconds
SCREENSHOT: data:image/png;base64,[...]
INFO | automation.py | Application ran for: 0:07:05.174093

Sometimes the screen saver is on.

Rev3 Fedora 12x64 tryserver debug test mochitests-1/5 on 2010/08/06 22:37:32
s: talos-r3-fed64-018
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1281159452.1281160594.1905.gz#err24
3699 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_CrossSiteXHR_origin.html | Test timed out.
XScreenSaver state: On
XScreenSaver has been on for 0 seconds
XScreenSaver kind: Blanked
User input idle for 821 seconds
SCREENSHOT: data:image/png;base64,[...]
3700 INFO SimpleTest finished /tests/content/base/test/test_CrossSiteXHR_origin.html in 327077ms
3701 INFO TEST-START | /tests/content/base/test/test_NodeIterator_basics_filters.xhtml
TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_NodeIterator_basics_filters.xhtml | application timed out after 330 seconds with no output
XScreenSaver state: On
XScreenSaver has been on for 0 seconds
XScreenSaver kind: Blanked
User input idle for 1241 seconds
SCREENSHOT: data:image/png;base64,[...]
INFO | automation.py | Application ran for: 0:14:34.296170

This is causing a number of mochitests to time out or not run correctly.
> XScreenSaver has been on for 0 seconds

I don't believe that the screen-saver is set to activate on an idle-timer. Local experiments show that the til_or_since value reported in this line only gets set to the length of time that the screen-saver has been running if the screen saver was started automatically on idle timeout. This 0 value is what is reported when idle-timer activation is disabled. The "Activate screensaver when computer is idle" settings in attachment 451052 [details] would disable this I assume.

What seems to be happening here is that something else is activating the screen saver. This can be done directly through the same X protocol that "xset s activate" uses, or gnome-screensaver has DBUS APIs that apps can use to initiate this.
> User input idle for 1241 seconds

I would have expected this period to typically be much larger for these machines that don't have any user interaction. This time is only a few minutes longer than the time that the tests have been running. Do the machines get rebooted between each test run?

It is also possible for apps to reset this idle time. (The only way that I know also deactivates the screen saver.)

I assume VNC can also influence the idle time. Can we rule out the possibility of VNC connections while the machine is in the pool?
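To make the "only a few minutes longer" observation concrete, here is a minimal sketch that pulls the idle time and the application run time out of a log excerpt like the ones in comment 0 and subtracts them. The function names are hypothetical, and treating "Application ran for" as the time the tests had been running is an assumption made for illustration; it is not part of the attached program.

```python
import re


def parse_idle_seconds(log_text):
    """Return N from 'User input idle for N seconds', or None if absent."""
    match = re.search(r"User input idle for (\d+) seconds", log_text)
    return int(match.group(1)) if match else None


def parse_run_seconds(log_text):
    """Return 'Application ran for: H:MM:SS.ffffff' in seconds, or None."""
    match = re.search(
        r"Application ran for: (\d+):(\d+):(\d+(?:\.\d+)?)", log_text
    )
    if not match:
        return None
    hours, minutes, seconds = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def idle_before_start(log_text):
    """Seconds of input idleness that predate the application start."""
    idle = parse_idle_seconds(log_text)
    ran = parse_run_seconds(log_text)
    if idle is None or ran is None:
        return None
    return idle - ran


# Excerpt copied from the second log in comment 0.
SAMPLE = """\
XScreenSaver state: On
XScreenSaver has been on for 0 seconds
XScreenSaver kind: Blanked
User input idle for 1241 seconds
INFO | automation.py | Application ran for: 0:14:34.296170
"""

if __name__ == "__main__":
    print(idle_before_start(SAMPLE))
```

For this excerpt the difference is roughly six minutes, which is far less than would be expected if nothing had touched the idle timer since boot.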
The best solution here depends on what is causing this.

* If it is a GNOME app starting the screen saver, then I'm guessing the DBUS API would be disabled by changing the gconf setting /apps/gnome_settings_daemon/screensaver/start_screensaver to false.

* If whatever is activating the screen saver only does so before the tests run, then the command "xset s reset" could be run to deactivate the screen saver.
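The two options above could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, the gconf key is the one guessed at in this comment, and whether changing it actually stops the activation is untested. The runner argument exists so the commands can be inspected without an X session.

```python
import subprocess

# gconf key named in this comment; its effect on the problem is a guess.
GCONF_KEY = "/apps/gnome_settings_daemon/screensaver/start_screensaver"


def disable_gnome_start_command():
    """Command that sets the gconf key to false via gconftool-2."""
    return ["gconftool-2", "--type", "bool", "--set", GCONF_KEY, "false"]


def deactivate_command():
    """Command that deactivates a running X screen saver."""
    return ["xset", "s", "reset"]


def run_all(commands, runner=subprocess.call):
    """Run each command in order and return the list of exit codes."""
    return [runner(command) for command in commands]


if __name__ == "__main__":
    # Requires gconftool-2, xset, and a reachable X display.
    print(run_all([disable_gnome_start_command(), deactivate_command()]))
```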
Blocks: 569237
Blocks: 580483
Blocks: 578591
Blocks: 569974
Blocks: 569425
Blocks: 569238
Blocks: 557456
I think the first report of this problem was 2010/05/31 09:05:00. e.g. bug 569237 comment 0. AFAIK that was some finite time after the switch to Rev3 Fedora machines. Bug 557456 has a couple of reports in early April, but none in May, so those initial reports might be something different.
I'm going to try to verify that this is actually happening, and how.
Not sure I'm going to get a chance to look at this after all :(
Looking at the source code for gnome-screensaver, I see that it (regularly, repeatedly) disables idle activation of the server's built-in screen saver, but otherwise does not use it. The issue here is the built-in screen saver being active, not a gnome-screensaver replacement, so it doesn't look like it is getting activated via gnome-screensaver DBUS APIs. I guess it might be possible that the built-in screen saver has already activated (on idle timeout or otherwise) before gnome-screensaver starts, but, with the automatic login setup, I assume there's not much time between the X server starting and gnome-screensaver being launched. It might be worth checking screen saver state from the "talos-slave" buildbot script startup to see whether that's a good place to "xset s reset". (State could be compared with state at script shutdown, for example.) (Unless there's a reason why perf test machines are different, we'll want to fix this on those machines, also.)
Given that we don't know what is activating the screensaver or when, this seems the best place to turn it off.
Attachment #466880 - Flags: review?(ted.mielczarek)
We know that the screensaver is sometimes running on mochitest machines. I guess this means that it is most likely also sometimes running on talos machines, which would affect results.
Assignee: nobody → karlt
Attachment #466896 - Flags: review?(anodelman)
This seems a bit silly. Can't we just have RelEng disable the screensaver on the slaves?
(In reply to comment #10)
> This seems a bit silly. Can't we just have RelEng disable the screensaver on
> the slaves?

The screensaver is internal to the X server so I think the only way to be certain that that screensaver won't run would be to patch and recompile the server.

The screensaver is sort of "disabled" here:
https://wiki.mozilla.org/ReferencePlatforms/Test/FedoraLinux#Preferences

But this doesn't prevent an application from activating the screensaver, or prevent the screensaver from activating before the gnome session starts.

I guess we could try Option "BlankTime" "0" in the ServerFlags section of xorg.conf, but I'm not expecting that to help. The automatic login should start gnome within the default 10 minute activation time.

There are other things that we could try that are more likely to help. Given that we don't know what changed in May, we don't know what is activating the screensaver so I'm guessing as to what might help.

We could try deactivating the screen saver in "Startup Programs"
https://wiki.mozilla.org/ReferencePlatforms/Test/FedoraLinux#Preferences

Or from buildbot scripts. I had a bit of a look there but didn't get as far as locating a platform-specific script that gets run for all tests.

Which seems least silly to you?
Comment on attachment 466880 [details] [diff] [review]
deactivate screensaver before running the test app

I think anything that involves changing it on the slave sounds better than throwing these hacks in our test harnesses. If we exhaust those options and are still seeing this, then we can revisit.
Attachment #466880 - Flags: review?(ted.mielczarek) → review-
I'll pass this back to release engineering then.
Assignee: karlt → nobody
Attachment #466896 - Flags: review?(anodelman)
> The automatic login should start gnome within the default 10 minute activation
> time.

We do config management (puppet) on boot, so if a large package is getting deployed this might have an impact. I can write a patch to run the command from attachment 466880 [details] [diff] [review].
Assignee: nobody → jhford
Status: NEW → ASSIGNED
Attached patch unit test patch (deleted) — Splinter Review
this patch runs xset s reset once per test run immediately before starting the test harness. I don't know if this needs to be run per test or even more than once.
Attachment #472788 - Flags: feedback?
Comment on attachment 472788 [details] [diff] [review]
unit test patch

> this patch runs xset s reset once per test run immediately before starting the
> test harness.

This sounds good, thanks John.

> I don't know if this needs to be run per test or even more than once.

I'm hoping once per test run will be enough. If not, we're going to have trouble picking the right time to run it.

>+ def addSetupSteps(self):

I notice that TalosFactory calls addSetupSteps explicitly, but UnittestPackagedBuildFactory does not. Does BuildFactory call this?

>+ name='disable_screensaver',

'deactivate_screensaver' might be more accurate.

I think we'll also want to deactivate the screen saver for talos runs (I'm guessing that UnittestPackagedBuildFactory is not used for talos), but if you'd prefer to do that separately then that's fine.
Attachment #472788 - Flags: feedback?
Attachment #472788 - Flags: feedback+
TalosFactory is in an entirely different and unrelated class hierarchy.
Attachment #472788 - Flags: feedback?
Blocks: 597742
Attachment #472788 - Flags: review?(catlee)
Attachment #472788 - Flags: review?(catlee) → review+
Attached patch set the environment (deleted) — Splinter Review
gah, i forgot that we launch buildbot outside of an X environment on build slaves and that we run unit tests on build master slaves.
Attachment #479073 - Flags: review?(lsblakk)
Attachment #479073 - Flags: review?(lsblakk) → review+
to be extra clear, the patch landed in comment 18 was included in a reconfig. There was an issue found with how we launch buildbot, specifically, the DISPLAY variable not being set. Comment 19 is a patch that fixes this issue and will be included in a reconfig happening today
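The DISPLAY problem can be illustrated with a short sketch. The landed patch is not shown in this bug, so this is not that code: the helper names and the ":0" default are assumptions. The idea is only that a process started outside an X session needs DISPLAY in its environment before xset can reach the server.

```python
import os
import subprocess


def x_environment(base_env=None, display=":0"):
    """Copy an environment and set DISPLAY if it is missing.

    The ":0" default is an assumption, not a value taken from this bug.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.setdefault("DISPLAY", display)
    return env


def deactivate_screensaver(runner=subprocess.call, base_env=None):
    """Run 'xset s reset' with DISPLAY guaranteed to be present."""
    return runner(["xset", "s", "reset"], env=x_environment(base_env))


if __name__ == "__main__":
    # Requires xset and an X server on the chosen display.
    print(deactivate_screensaver())
```

An existing DISPLAY value is left alone, so the same call behaves sensibly whether or not the caller already runs inside an X session.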
Flags: needs-reconfig+
follow up landed and was part of reconfig
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Blocks: 601783
No longer blocks: 580483
No longer blocks: 569974
No longer blocks: 569238
No longer blocks: 569237
No longer blocks: 557456
Serge, was it necessary to remove the fixed bugs from the "Blocks" list? It's easier to see which bugs are fixed by looking at that than by searching the comments for duplicates.
(In reply to comment #28)
> Serge, was it necessary to remove the fixed bugs from the "Blocks" list?

It simply seems odd to depend on a duplicate (of itself): I would suggest that you use "R.Fixed, and maybe add 'fixed by bug nnn' on the whiteboard" rather than "R.Duplicate" then.

> It's easier to see which bugs are fixed by looking at that than by searching
> the comments for duplicates.

I fully agree with that.
Whiteboard: [orange]
Whiteboard: [orange] → [orange][screensaver]
We are still seeing issues with X screen blanking on occasion. Reopening.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to Karl Tomlinson (:karlt) from comment #28)
> Serge, was it necessary to remove the fixed bugs from the "Blocks" list?
> It's easier to see which bugs are fixed by looking at that than by searching
> the comments for duplicates.

(FWIW, you don't have to search the comments for duplicates -- there's a "Duplicates" list at the top of this bug, 2 lines above the "Blocks" list)
catlee said in IRC that this works: xset s off s reset
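A sketch of how that command might be issued and checked from a script: "s off" turns off timer-based activation and "s reset" deactivates a screen saver that is already running. The helper names are hypothetical, and the sample text in AFTER_S_OFF is an assumption about what "xset q" typically prints rather than output captured from these machines.

```python
import re
import subprocess


def disable_and_deactivate_command():
    """Command from this comment: turn the timer off, then deactivate."""
    return ["xset", "s", "off", "s", "reset"]


def screensaver_timeout(xset_q_output):
    """Return the Screen Saver timeout from 'xset q' text, or None."""
    match = re.search(
        r"Screen Saver:.*?timeout:\s+(\d+)", xset_q_output, re.DOTALL
    )
    return int(match.group(1)) if match else None


# Assumed shape of the relevant part of 'xset q' after 'xset s off'.
AFTER_S_OFF = """\
Screen Saver:
  prefer blanking:  yes    allow exposures:  yes
  timeout:  0    cycle:  600
"""

if __name__ == "__main__":
    # Requires xset and a reachable X display.
    subprocess.call(disable_and_deactivate_command())
    print(screensaver_timeout(subprocess.check_output(["xset", "q"]).decode()))
```

A timeout of 0 indicates that idle activation is off. It does not stop another client from activating the screen saver explicitly.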
Assignee: jhford → nobody
r+'d by callek on irc.
Attachment #674821 - Flags: review+
Do we anticipate that this patch will fix the issues in bug 805170? (If we think so: once it lands, let's retrigger some of the affected M-Oth jobs and make sure they're green before we reopen Mozilla-Beta/ESR.)
Retriggered jobs passed and trees have been reopened.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 12 years ago
Resolution: --- → FIXED
Blocks: 805170
Whiteboard: [orange][screensaver] → [screensaver]
Product: mozilla.org → Release Engineering