Intermittent make[1]: *** [profiledbuild] Error 1 after error: XDG_RUNTIME_DIR not set in the environment.
Categories
(Firefox Build System :: General, defect, P5)
Tracking
(firefox68 fixed)
Tracking | Status | |
---|---|---|
firefox68 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: mshal)
References
Details
(Keywords: intermittent-failure, Whiteboard: [stockwell unknown])
Attachments
(1 file)
(deleted),
text/x-phabricator-request
Comment 5•6 years ago
Have seen this a lot in the last couple of hours: https://treeherder.mozilla.org/#/jobs?repo=autoland&searchStr=linux%2Cx64%2Cpgo%2Cprofile-guided%2Coptimization%2Cbuilds%2Cgenerate-profile-linux64%2Fpgo%2Cbpgo%28run%29&fromchange=9edba3987a86260f2d3136cbe23387426d4defb1&tochange=653fa722ad95365d077f0392c86c5b418af1679f&selectedJob=221153251
Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=221188856&repo=autoland&lineNumber=273
Comment 6•6 years ago (Assignee)
This is OK for now in the Bpgo(run) step, but it does confirm our suspicion that the failure appears more often in 3-tier PGO than in single-tier PGO. That higher intermittent failure rate in the 3-tier version is the reason we haven't switched Linux PGO builds over yet.
The Bpgo(run) failures can be starred as this bug without requiring a retrigger, but Linux x64 pgo 'B' builds should still be retriggered. If this is causing problems for sheriffing, we can hide the Bpgo(run) builds for now, though we'd like them to still execute while we are implementing 3-tier PGO.
Comment 7•6 years ago (Assignee)
To clarify: Bpgo(run) is new from bug 1507334 and is not needed to ship anything yet (so retriggering is not necessary), though it is the direction we are moving toward in the near future. So we'd prefer not to back it out, but can hide things in the meantime if necessary.
Comment 8•6 years ago
Make them tier 3? Or at least tier 2.
Comment 9•6 years ago
I think this first appeared on Autoland, with a similar log, on a merge from central:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=221011987&repo=autoland&lineNumber=262
Comment 10•6 years ago
There are 27 failures in the last 7 days, all on linux64 pgo.
Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=221619269&repo=mozilla-inbound&lineNumber=241
[task 2019-01-13T21:33:53.165Z] New python executable in /builds/worker/checkouts/gecko/obj-x86_64-pc-linux-gnu/_virtualenvs/init/bin/python2.7
[task 2019-01-13T21:33:53.165Z] Also creating executable in /builds/worker/checkouts/gecko/obj-x86_64-pc-linux-gnu/_virtualenvs/init/bin/python
[task 2019-01-13T21:33:54.534Z] Installing setuptools, pip, wheel...done.
[task 2019-01-13T21:33:54.755Z] WARNING: Python.h not found. Install Python development headers.
[task 2019-01-13T21:33:54.755Z] Error processing command. Ignoring because optional. (optional:setup.py:third_party/python/psutil:build_ext:--inplace)
[task 2019-01-13T21:33:54.755Z] Error processing command. Ignoring because optional. (optional:packages.txt:comm/build/virtualenv_packages.txt)
[task 2019-01-13T21:33:55.482Z] Firefox exited with code 1 during profile initialization
[task 2019-01-13T21:33:55.482Z] Firefox output (/builds/worker/artifacts/profile-run-1.log):
[task 2019-01-13T21:33:55.482Z] error: XDG_RUNTIME_DIR not set in the environment.
[task 2019-01-13T21:33:55.482Z] Unable to init server
[task 2019-01-13T21:33:55.482Z] Error: cannot open display: :2
[task 2019-01-13T21:33:55.482Z]
[task 2019-01-13T21:33:55.514Z] cleanup
[task 2019-01-13T21:33:55.514Z] + cleanup
[task 2019-01-13T21:33:55.514Z] + local rv=1
[task 2019-01-13T21:33:55.514Z] + cleanup_xvfb
[task 2019-01-13T21:33:55.514Z] pidof Xvfb
[task 2019-01-13T21:33:55.514Z] ++ pidof Xvfb
[task 2019-01-13T21:33:55.516Z] + local xvfb_pid=37
[task 2019-01-13T21:33:55.516Z] + local vnc=false
[task 2019-01-13T21:33:55.516Z] + local interactive=false
[task 2019-01-13T21:33:55.516Z] + '[' -n 37 ']'
[task 2019-01-13T21:33:55.516Z] + [[ false == false ]]
[task 2019-01-13T21:33:55.516Z] + [[ false == false ]]
[task 2019-01-13T21:33:55.516Z] + kill 37
[task 2019-01-13T21:33:55.516Z] + screen -XS xvfb quit
[task 2019-01-13T21:33:55.640Z] No screen session found.
[task 2019-01-13T21:33:55.640Z] + true
[task 2019-01-13T21:33:55.640Z] + exit 1
[fetches 2019-01-13T21:33:55.641Z] removing /builds/worker/fetches
[fetches 2019-01-13T21:33:55.641Z] finished
[taskcluster 2019-01-13 21:33:56.023Z] === Task Finished ===
[taskcluster 2019-01-13 21:33:56.161Z] Artifact "public/build/profdata.tar.xz" not found at "/builds/worker/artifacts/profdata.tar.xz"
[taskcluster 2019-01-13 21:33:56.360Z] Artifact "public/build/profile-run-2.log" not found at "/builds/worker/artifacts/profile-run-2.log"
[taskcluster 2019-01-13 21:33:57.064Z] Unsuccessful task run with exit code: 1 completed in 109.299 seconds
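The log above shows Firefox failing to open display :2 right after the XDG_RUNTIME_DIR message. As a rough diagnostic sketch only (not the fix adopted in this bug, and not code from the build scripts), a pre-flight check run before the profile step could report which of those conditions holds. The function name `display_preflight` is hypothetical, and the check assumes the usual X11 convention that a local display `:N` is served through a Unix socket named `XN` under `/tmp/.X11-unix`.

```python
import os
import re


def display_preflight(env, socket_dir="/tmp/.X11-unix"):
    """Return a list of problems found; an empty list means none were detected.

    Hypothetical helper: `env` is a mapping such as os.environ, and
    `socket_dir` is where the X server is assumed to create its sockets.
    """
    problems = []

    # Mirrors the first message in the failing log.
    if not env.get("XDG_RUNTIME_DIR"):
        problems.append("XDG_RUNTIME_DIR not set")

    # Only local displays of the form ":N" or ":N.S" are handled here.
    display = env.get("DISPLAY", "")
    match = re.match(r"^:(\d+)(?:\.\d+)?$", display)
    if not match:
        problems.append("DISPLAY is unset or not a local display: %r" % display)
    else:
        socket_path = os.path.join(socket_dir, "X" + match.group(1))
        if not os.path.exists(socket_path):
            problems.append("no X socket at " + socket_path)

    return problems


if __name__ == "__main__":
    for problem in display_preflight(os.environ):
        print(problem)
```

A check like this would only narrow down whether Xvfb was not yet ready when Firefox started; it says nothing about why the environment differs between runs.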
Comment 12•6 years ago (Assignee)
I just pushed bug 1519424. Hopefully that has an impact here.
Comment 15•6 years ago
Resolving as fixed. Michael pushed debian9 (bug 1519424) on January 14, and he says it appears to have fixed the XDG_RUNTIME_DIR errors.
Comment 17•6 years ago
Reopening this because in the last week there have been 4 failures (so far) classified against bug 1517939, and all of them have this line in the log:
error: XDG_RUNTIME_DIR not set in the environment.
Latest Treeherder push: https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception&revision=732f41524a76330d1c6d72b350edb5056d73468c&selectedJob=238997791
Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=238997791&repo=autoland&lineNumber=34922
see https://bugzilla.mozilla.org/show_bug.cgi?id=1517939#c14
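Since the classification here rests on a specific line appearing in the log, a small sketch of that check may help when triaging similar failures. This is an illustration only, not the tooling Treeherder uses; the names `SIGNATURES` and `matching_signatures` are hypothetical, and the signature strings are copied from the logs quoted in this bug.

```python
# Strings taken verbatim from the failure logs quoted in this bug.
SIGNATURES = (
    "error: XDG_RUNTIME_DIR not set in the environment.",
    "Error: cannot open display:",
)


def matching_signatures(log_text):
    """Return the known signatures that occur anywhere in log_text."""
    return [signature for signature in SIGNATURES if signature in log_text]


if __name__ == "__main__":
    sample = "\n".join([
        "[task 2019-01-13T21:33:55.482Z] error: XDG_RUNTIME_DIR not set in the environment.",
        "[task 2019-01-13T21:33:55.482Z] Unable to init server",
        "[task 2019-01-13T21:33:55.482Z] Error: cannot open display: :2",
    ])
    for signature in matching_signatures(sample):
        print(signature)
```

Plain substring matching is used because the task prefix and timestamp vary from line to line while the message text does not.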
Comment 18•6 years ago (Assignee)
These failures are all from the 1-tier Linux PGO builds. We held off on switching to the 3-tier PGO builds because they hit the XDG_RUNTIME_DIR failures much more often, but those have been fixed by switching to debian9. So I think we should take this opportunity to fully switch Linux over to 3-tier. We can start with regular B builds, and then after a short trial period convert N builds as well.
Comment 22•6 years ago (Assignee)
Now that 3-tier PGO uses a debian9 image to generate the profile data
(bug 1519424), we no longer see the XDG_RUNTIME_DIR failures in the run
task. The frequency of those errors was the primary blocker for enabling
3-tier PGO in the first place. Since we still see those errors
occasionally in 1-tier PGO, we should switch to the 3-tier model for
Linux.
Comment 25•6 years ago
Comment 26•6 years ago
bugherder
Comment 27•6 years ago
Seems like the fix from comment 25 reduced some of the build times on Linux:
== Change summary for alert #20479 (as of Mon, 15 Apr 2019 22:10:52 GMT) ==
Improvements:
41% build times linux64-shippable opt nightly taskcluster-m5.4xlarge 4,888.47 -> 2,891.82
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=20479
Comment 28•6 years ago (Assignee)
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #27)
> Seems like the fix from comment 25 reduced some of the build times on Linux:
> == Change summary for alert #20479 (as of Mon, 15 Apr 2019 22:10:52 GMT) ==
> Improvements:
> 41% build times linux64-shippable opt nightly taskcluster-m5.4xlarge 4,888.47 -> 2,891.82
> For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=20479
Unfortunately they aren't directly comparable since that build was changed from doing a 1-tier PGO, where all parts of the PGO build are done in one task, to the 3-tier model, where the build is split into the "instr", "run", and "B" tasks. Looking at a recent m-c push, the 3 tasks were 39 minutes, 4 minutes, and 51 minutes, so a total of 5640s. Overall it's a little slower, but it allows us to enable PGO on more platforms and opens up the possibility for reproducible PGO builds since the profile data is now an artifact in Taskcluster.
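The arithmetic behind that comparison can be written out as a short sketch. It assumes the alert values are in seconds and that the three durations map to the "instr", "run", and "B" tasks in the order they are listed; neither assumption is stated explicitly in the thread.

```python
# Task durations in minutes from the recent m-c push mentioned above.
# The task-to-duration mapping is an assumption based on listing order.
task_minutes = {"instr": 39, "run": 4, "B": 51}

# Alert values from the perf summary, assumed to be in seconds.
before_seconds = 4888.47  # 1-tier PGO build
after_seconds = 2891.82   # "B" task only, after the switch

# Total wall time of the three 3-tier tasks, in seconds.
three_tier_seconds = sum(task_minutes.values()) * 60

# Improvement as reported by the alert (single task vs. single task).
alert_improvement = (before_seconds - after_seconds) / before_seconds

# Change when all three tasks are counted against the old 1-tier build.
overall_slowdown = three_tier_seconds / before_seconds - 1

print(three_tier_seconds)                # 5640
print(round(alert_improvement * 100))    # 41
print(round(overall_slowdown * 100))     # 15
```

Under those assumptions the summed 3-tier tasks come out roughly 15% longer than the old 1-tier build, which matches the "a little slower" reading rather than the 41% improvement in the alert.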
Comment 30•6 years ago
This continues to occur in recent central-as-beta simulations:
Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=242035935&repo=try&lineNumber=35518
[task 2019-04-23T13:42:11.854Z] 13:42:11 INFO - error: XDG_RUNTIME_DIR not set in the environment.
[task 2019-04-23T13:42:11.854Z] 13:42:11 INFO - Unable to init server
[task 2019-04-23T13:42:11.854Z] 13:42:11 INFO - Error: cannot open display: :2
[task 2019-04-23T13:42:11.854Z] 13:42:11 INFO - Makefile:188: recipe for target 'profiledbuild' failed
[task 2019-04-23T13:42:11.855Z] 13:42:11 ERROR - make[1]: *** [profiledbuild] Error 1
[task 2019-04-23T13:42:11.855Z] 13:42:11 INFO - make[1]: Leaving directory '/builds/worker/workspace/build/src/obj-firefox'
[task 2019-04-23T13:42:11.855Z] 13:42:11 INFO - client.mk:125: recipe for target 'build' failed
[task 2019-04-23T13:42:11.855Z] 13:42:11 INFO - make: *** [build] Error 2
[task 2019-04-23T13:42:11.895Z] 13:42:11 INFO - 292 compiler warnings present.
[task 2019-04-23T13:42:12.023Z] 13:42:12 INFO - Notification center failed: Install notify-send (usually part of the libnotify package) to get a notification when the build finishes.
[task 2019-04-23T13:42:12.082Z] 13:42:12 ERROR - Return code: 2
[task 2019-04-23T13:42:12.082Z] 13:42:12 WARNING - setting return code to 2
[task 2019-04-23T13:42:12.082Z] 13:42:12 FATAL - 'mach build -v' did not run successfully. Please check log for errors.
[task 2019-04-23T13:42:12.082Z] 13:42:12 FATAL - Running post_fatal callback...
[task 2019-04-23T13:42:12.082Z] 13:42:12 FATAL - Exiting -1
[task 2019-04-23T13:42:12.083Z] 13:42:12 INFO - [mozharness: 2019-04-23 13:42:12.082974Z] Finished build step (failed)
[task 2019-04-23T13:42:12.083Z] 13:42:12 INFO - Running post-run listener: _parse_build_tests_ccov
[task 2019-04-23T13:42:12.083Z] 13:42:12 INFO - Running post-run listener: _shutdown_sccache
[task 2019-04-23T13:42:12.083Z] 13:42:12 INFO - Running post-run listener: _summarize
[task 2019-04-23T13:42:12.083Z] 13:42:12 ERROR - # TBPL FAILURE #
Comment 31•6 years ago (Assignee)
It looks like those builds are still using 1-tier PGO. :Callek, should the builds from comment 30 be shippable builds now? Or are they still intended to be nightlies? If it's the latter, we'll probably want to convert them to 3-tier PGO.
Comment 32•6 years ago
The devedition builds are labeled nightlies, but have only ever[1] run on-push on mozilla-beta (and beta-sims). They should be renamed from nightly to -shippable, but doing that is purely cosmetic[2]. That said, independent of what they are named, they should be switched to 3-tier PGO.
[1] At least since the migration to Taskcluster and release-promotion.
[2] The two main points of shippable were to unify the on-push PGO builds with those we ship, and to use the on-push builds for shipping nightlies; both of those have always been true of the devedition builds.
Comment 33•6 years ago (Assignee)
I filed bug 1547395 for enabling 3-tier PGO on the devedition builds.