Closed
Bug 1396845
Opened 7 years ago
Closed 6 years ago
Intermittent sccache: encountered fatal error
Categories
(Firefox Build System :: General, defect, P1)
Firefox Build System
General
Tracking
(firefox64 fixed)
RESOLVED
FIXED
mozilla64
Tracking | Status | |
---|---|---|
firefox64 | --- | fixed |
People
(Reporter: aryx, Assigned: ted)
References
Details
(Keywords: intermittent-failure, Whiteboard: [stockwell infra][stockwell fixed:product])
Attachments
(1 file)
(deleted),
text/x-phabricator-request
|
Details |
https://treeherder.mozilla.org/logviewer.html#?job_id=128529143&repo=autoland
[task 2017-09-05T14:06:13.005910Z] 14:06:13 INFO - gmake[5]: Entering directory '/builds/worker/workspace/build/src/obj-firefox/dom/workers'
[task 2017-09-05T14:06:13.009152Z] 14:06:13 INFO - /builds/worker/workspace/build/src/sccache2/sccache /builds/worker/workspace/build/src/clang/bin/clang++ -B /builds/worker/workspace/build/src/gcc/bin -std=gnu++11 -o Unified_cpp_dom_workers2.o -c -I/builds/worker/workspace/build/src/obj-firefox/dist/stl_wrappers -I/builds/worker/workspace/build/src/obj-firefox/dist/system_wrappers -include /builds/worker/workspace/build/src/config/gcc_hidden.h -DDEBUG=1 -DOS_POSIX=1 -DOS_LINUX=1 -DSTATIC_EXPORTABLE_JS_API -DMOZ_HAS_MOZGLUE -DMOZILLA_INTERNAL_API -DIMPL_LIBXUL -I/builds/worker/workspace/build/src/dom/workers -I/builds/worker/workspace/build/src/obj-firefox/dom/workers -I/builds/worker/workspace/build/src/dom/base -I/builds/worker/workspace/build/src/dom/system -I/builds/worker/workspace/build/src/dom/base -I/builds/worker/workspace/build/src/dom/bindings -I/builds/worker/workspace/build/src/xpcom/build -I/builds/worker/workspace/build/src/xpcom/threads -I/builds/worker/workspace/build/src/obj-firefox/ipc/ipdl/_ipdlheaders -I/builds/worker/workspace/build/src/ipc/chromium/src -I/builds/worker/workspace/build/src/ipc/glue -I/builds/worker/workspace/build/src/obj-firefox/dist/include -I/builds/worker/workspace/build/src/obj-firefox/dist/include/nspr -I/builds/worker/workspace/build/src/obj-firefox/dist/include/nss -fPIC -DMOZILLA_CLIENT -include /builds/worker/workspace/build/src/obj-firefox/mozilla-config.h -MD -MP -MF .deps/Unified_cpp_dom_workers2.o.pp -Qunused-arguments -Qunused-arguments -Wall -Wc++11-compat -Wempty-body -Wignored-qualifiers -Woverloaded-virtual -Wpointer-arith -Wsign-compare -Wtype-limits -Wunreachable-code -Wunreachable-code-return -Wwrite-strings -Wno-invalid-offsetof -Wclass-varargs -Wloop-analysis -Wc++11-compat-pedantic -Wc++14-compat -Wc++14-compat-pedantic -Wc++1z-compat -Wcomma -Wimplicit-fallthrough -Werror=non-literal-null-conversion -Wstring-conversion -Wno-inline-new-delete -Wno-error=deprecated-declarations -Wno-error=array-bounds -Wformat -Wno-gnu-zero-variadic-macro-arguments -Wformat-security -Wno-unknown-warning-option -Wno-return-type-c-linkage -fsanitize=address -fno-exceptions -fno-strict-aliasing -fno-rtti -ffunction-sections -fdata-sections -fno-exceptions -fno-math-errno -pthread -D_GLIBCXX_USE_CXX11_ABI=0 -pipe -g -O1 -fno-omit-frame-pointer -Werror -Wno-error=shadow /builds/worker/workspace/build/src/obj-firefox/dom/workers/Unified_cpp_dom_workers2.cpp
[task 2017-09-05T14:06:13.009239Z] 14:06:13 INFO - sccache: encountered fatal error
[task 2017-09-05T14:06:13.009274Z] 14:06:13 INFO - sccache: error : Invalid checksum
[task 2017-09-05T14:06:13.009303Z] 14:06:13 INFO - sccache: cause: Invalid checksum
[task 2017-09-05T14:06:13.009339Z] 14:06:13 INFO - /builds/worker/workspace/build/src/config/rules.mk:1064: recipe for target 'Unified_cpp_dom_workers2.o' failed
[task 2017-09-05T14:06:13.009365Z] 14:06:13 INFO - gmake[5]: *** [Unified_cpp_dom_workers2.o] Error 254
[task 2017-09-05T14:06:13.009394Z] 14:06:13 INFO - gmake[5]: Leaving directory '/builds/worker/workspace/build/src/obj-firefox/dom/workers'
[task 2017-09-05T14:06:13.009425Z] 14:06:13 INFO - /builds/worker/workspace/build/src/config/recurse.mk:73: recipe for target 'dom/workers/target' failed
[task 2017-09-05T14:06:13.009449Z] 14:06:13 INFO - gmake[4]: *** [dom/workers/target] Error 2
[task 2017-09-05T14:06:13.009473Z] 14:06:13 INFO - gmake[4]: *** Waiting for unfinished jobs....
Reporter | ||
Comment 1•7 years ago
|
||
This permafails on autoland and inbound in the Linux x64 debug static builds now - but not on central. See e.g. https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=0d8847995698e0493fb3dec26864d01061d4611e&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable
Ted, any suggestions how to fix that? Thank you.
Severity: normal → blocker
Component: Build Config → Buildduty
Flags: needinfo?(ted)
Product: Core → Release Engineering
QA Contact: catlee
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 3•7 years ago
|
||
There must be a bad entry in the cache that only these builds are hitting. sccache does have an env var that can be used to skip cache lookups, `SCCACHE_RECACHE=1`, but we don't have a good way to set that for a single build type. I guess the closest thing we could do would be to stick `mk_add_options export SCCACHE_RECACHE=1` in the mozconfig for this specific build type, then back it out once you get a green build?
Flags: needinfo?(ted)
Reporter | ||
Comment 4•7 years ago
|
||
Severity: blocker → normal
Comment 5•7 years ago
|
||
And now, Linux32 opt.
Comment hidden (Intermittent Failures Robot) |
Comment 7•7 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #3)
> There must be a bad entry in the cache that only these builds are hitting.
> sccache does have an env var that can be used to skip cache lookups,
> `SCCACHE_RECACHE=1`, but we don't have a good way to set that for a single
> build type. I guess the closest thing we could do would be to stick
> `mk_add_options export SCCACHE_RECACHE=1` in the mozconfig for this specific
> build type, then back it out once you get a green build?
I wonder if cache wouldn't invalidate eventually.
Worth a try with :ted's solution too.
Priority: -- → P1
Comment hidden (Intermittent Failures Robot) |
Comment 9•7 years ago
|
||
Given Buildduty manifesto[1], this seems like an error that doesn't fall on Buildduty's plate.
Moving this over to General Automation for now, until we find a more suited owner.
Please move back if I'm wrong.
[1]: https://wiki.mozilla.org/ReleaseEngineering/Buildduty_manifesto#Things_Buildduty_isn.E2.80.99t_responsible_for
Component: Buildduty → General Automation
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 13•7 years ago
|
||
For future reference, here is something that I just did, and should work for future occurrences of the problem:
- Select the busted job on treeherder
- Click "...", then "Edit and Retrigger"
- In the create task textbox, under env, add "SCCACHE_RECACHE: '1'"
- Create the task. It will show up in treeherder.
- Wait for it to finish.
- Once it's green, other failing jobs of the same type can then be retriggered (so if a linux32-opt busted and you triggered a task with SCCACHE_RECACHE for linux32-opt, you can then retrigger the other busted linux32-opts, assuming they were getting the same broken object).
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 19•7 years ago
|
||
We have 47 failures on this bug in the last 7 days.
They occur on Windows2012 x64, 32 and the affected build type is opt.
Recent log: https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=155713818&lineNumber=38760
and a relevant part of it:
20:48:29 INFO - Running `z:/build/build/src/sccache2/sccache.exe 'z:/build/build/src/rustc/bin/rustc.exe' --crate-name regex_syntax 'Z:\build\build\src\third_party\rust\regex-syntax\src\lib.rs' --crate-type lib --emit=dep-info,link -C opt-level=2 -C metadata=56cac34a1113c987 -C extra-filename=-56cac34a1113c987 --out-dir 'z:/build/build/src/obj-firefox/toolkit/library\i686-pc-windows-msvc\release\deps' --target i686-pc-windows-msvc -L 'dependency=z:/build/build/src/obj-firefox/toolkit/library\i686-pc-windows-msvc\release\deps' -L 'dependency=z:/build/build/src/obj-firefox/toolkit/library\release\deps' --cap-lints allow -C opt-level=2 -C debuginfo=2`
38759
20:48:29 INFO - sccache: encountered fatal error
38760
20:48:29 INFO - sccache: error : Failed to create temp dir
38761
20:48:29 INFO - sccache: cause: Failed to create temp dir
38762
20:48:29 INFO - error: Could not compile `stable_deref_trait`.
38763
20:48:29 INFO - Caused by:
38764
20:48:29 INFO - process didn't exit successfully: `z:/build/build/src/sccache2/sccache.exe z:/build/build/src/rustc/bin/rustc.exe --crate-name stable_deref_trait Z:\build\build\src\third_party\rust\stable_deref_trait\src\lib.rs --crate-type lib --emit=dep-info,link -C opt-level=2 --cfg feature="default" --cfg feature="std" -C metadata=de8ffe798b49d872 -C extra-filename=-de8ffe798b49d872 --out-dir z:/build/build/src/obj-firefox/toolkit/library\i686-pc-windows-msvc\release\deps --target i686-pc-windows-msvc -L dependency=z:/build/build/src/obj-firefox/toolkit/library\i686-pc-windows-msvc\release\deps -L dependency=z:/build/build/src/obj-firefox/toolkit/library\release\deps --cap-lints allow -C opt-level=2 -C debuginfo=2` (exit code: 4294967294)
38765
20:48:29 INFO - warning: build failed, waiting for other jobs to finish...
38766
20:48:29 INFO - sccache: encountered fatal error
38767
20:48:29 INFO - sccache: error : Failed to create temp dir
38768
20:48:29 INFO - sccache: cause: Failed to create temp dir
38769
20:48:29 INFO - sccache: encountered fatal error
:catlee, can you please take a look at this?
Flags: needinfo?(catlee)
Whiteboard: [stockwell needswork]
Updated•7 years ago
|
Flags: needinfo?(catlee) → needinfo?(ted)
Updated•7 years ago
|
Whiteboard: [stockwell needswork] → [stockwell infra]
Assignee | ||
Comment 20•7 years ago
|
||
This is different from the error message in comment 0. This seems to be failing here:
https://github.com/mozilla/sccache/blob/39504428e28dc9f53ae7abdef77f8676d1c0e525/src/compiler/rust.rs#L131
`TempDir::new` isn't doing anything very fancy, the guts of it look like:
https://github.com/rust-lang-nursery/tempdir/blob/41377a0fb637de79a2e14426ee489f218ea02c19/src/lib.rs#L190
If sccache is failing to create a temp directory there's probably something else bad happening that's making that happen.
Flags: needinfo?(ted)
Assignee | ||
Comment 21•7 years ago
|
||
I changed the error handling for that case so that it should print the underlying error when this fails:
https://github.com/mozilla/sccache/commit/4fe33a6da14b14adc33db539b0cc963e1fa6f8d8
I'll get a patch up to update the version of sccache we use in CI and we can at least hopefully get better diagnostics.
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 32•7 years ago
|
||
Apparently I never got around to updating sccache to pick up that fix until last week (bug 1445218). Looking at the most recent logs from orangefactor shows that the errors all look like:
01:40:09 INFO - sccache: encountered fatal error
01:40:09 INFO - sccache: error : Failed to create temp dir
01:40:09 INFO - sccache: cause: Failed to create temp dir
01:40:09 INFO - sccache: cause: The system cannot find the path specified. (os error 3)
The Rust standard library winds up calling `CreateDirectoryW`:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa363855(v=vs.85).aspx
So this is that function returning with an error, and the error being `ERROR_PATH_NOT_FOUND`. Given that, this means that the temp directory itself does not exist, so this must be a system configuration issue somehow.
Component: General Automation → Worker
Product: Release Engineering → Taskcluster
QA Contact: catlee → pmoore
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Updated•6 years ago
|
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 50•6 years ago
|
||
There are 33 occurrences in the last 7 days, on windows2012-32 opt and windows2012-64 opt.
Updated•6 years ago
|
Whiteboard: [stockwell infra] → [stockwell infra][stockwell needswork]
Comment hidden (Intermittent Failures Robot) |
Comment 53•6 years ago
|
||
Update: There were 33 failures in the last 7 days on windows2012-64 & windows2012-32.
Recent failure log:
https://treeherder.mozilla.org/logviewer.html#?job_id=189683020&repo=autoland&lineNumber=67666
02:05:41 INFO - Comparing manifests...
02:05:41 INFO - Comparing packages...
02:05:41 INFO - mozmake.EXE[3]: Leaving directory 'z:/build/build/src/obj-firefox/config/tests/src-simple'
02:05:41 INFO - mozmake.EXE[2]: Leaving directory 'z:/build/build/src/obj-firefox/config'
02:05:41 INFO - Compiling itoa v0.4.1
02:05:41 INFO - Running `z:/build/build/src/sccache2/sccache.exe 'z:/build/build/src/rustc/bin/rustc.exe' --crate-name itoa 'third_party\rust\itoa\src\lib.rs' --crate-type lib --emit=dep-info,link -C opt-level=2 -C codegen-units=1 --cfg 'feature="default"' --cfg 'feature="std"' -C metadata=91a63898de3b9caa -C extra-filename=-91a63898de3b9caa --out-dir 'z:/build/build/src/obj-firefox/toolkit/library\x86_64-pc-windows-msvc\release\deps' --target x86_64-pc-windows-msvc -L 'dependency=z:/build/build/src/obj-firefox/toolkit/library\x86_64-pc-windows-msvc\release\deps' -L 'dependency=z:/build/build/src/obj-firefox/toolkit/library\release\deps' --cap-lints warn -C opt-level=2 -C debuginfo=2`
02:05:42 INFO - ......sccache: encountered fatal error
02:05:42 INFO - sccache: error : Failed to create temp dir
02:05:42 INFO - sccache: cause: Failed to create temp dir
02:05:42 INFO - sccache: cause: The system cannot find the path specified. (os error 3)
02:05:42 INFO - error: Could not compile `siphasher`.
02:05:42 INFO - Caused by:
02:05:42 INFO - process didn't exit successfully: `z:/build/build/src/sccache2/sccache.exe z:/build/build/src/rustc/bin/rustc.exe --crate-name siphasher third_party\rust\siphasher\src\lib.rs --crate-type lib --emit=dep-info,link -C opt-level=2 -C codegen-units=1 -C metadata=757e07074f2cab3b -C extra-filename=-757e07074f2cab3b --out-dir z:/build/build/src/obj-firefox/toolkit/library\release\deps -L dependency=z:/build/build/src/obj-firefox/toolkit/library\release\deps --cap-lints warn` (exit code: 4294967294)
02:05:42 INFO - warning: build failed, waiting for other jobs to finish...
02:05:42 INFO - .TEST-PASS | autobinscope.py | ../../dist/bin/xul.dll succeeded
02:05:42 INFO - z:/build/build/src/obj-firefox/_virtualenvs/init/Scripts/python.exe z:/build/build/src/build/win32/autobinscope.py ../../dist/bin/nss3.dll ../../dist/crashreporter-symbols/
02:05:42 INFO - sccache: encountered fatal error
02:05:42 INFO - sccache: error : Failed to create temp dir
02:05:42 INFO - sccache: cause: Failed to create temp dir
02:05:42 INFO - sccache: cause: The system cannot find the path specified. (os error 3)
02:05:43 INFO - error: Could not compile `unicode-xid`.
02:05:43 INFO - Caused by:
02:05:43 INFO - process didn't exit successfully: `z:/build/build/src/sccache2/sccache.exe z:/build/build/src/rustc/bin/rustc.exe --crate-name unicode_xid third_party\rust\unicode-xid\src\lib.rs --crate-type lib --emit=dep-info,link -C opt-level=2 -C codegen-units=1 --cfg feature="default" -C metadata=ec858df19fcf253b -C extra-filename=-ec858df19fcf253b --out-dir z:/build/build/src/obj-firefox/toolkit/library\release\deps -L dependency=z:/build/build/src/obj-firefox/toolkit/library\release\deps --cap-lints warn` (exit code: 4294967294)
02:05:43 INFO - warning: build failed, waiting for other jobs to finish...
Comment hidden (Intermittent Failures Robot) |
Comment 55•6 years ago
|
||
Update:
There have been 41 failures in the last 7 days, all on opt build type (with one exception)
Failures on platforms:
windows2012-64: 23
windows2012-32: 17
osx-10-10 / debug: 1
Recent log file with the failure:
https://treeherder.mozilla.org/logviewer.html#?job_id=191394972&repo=mozilla-inbound&lineNumber=64638
Flags: needinfo?(pmoore)
Comment 56•6 years ago
|
||
(In reply to Ted Mielczarek [:ted] [:ted.mielczarek] from comment #32)
> Apparently I never got around to updating sccache to pick up that fix until
> last week (bug 1445218). Looking at the most recent logs from orangefactor
> shows that the errors all look like:
>
> 01:40:09 INFO - sccache: encountered fatal error
> 01:40:09 INFO - sccache: error : Failed to create temp dir
> 01:40:09 INFO - sccache: cause: Failed to create temp dir
> 01:40:09 INFO - sccache: cause: The system cannot find the path
> specified. (os error 3)
>
> The Rust standard library winds up calling `CreateDirectoryW`:
> https://msdn.microsoft.com/en-us/library/windows/desktop/aa363855(v=vs.85).
> aspx
>
> So this is that function returning with an error, and the error being
> `ERROR_PATH_NOT_FOUND`. Given that, this means that the temp directory
> itself does not exist, so this must be a system configuration issue somehow.
Hi Ted,
Do you know what the path is of the directory it is trying to create?
Flags: needinfo?(pmoore) → needinfo?(ted)
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 59•6 years ago
|
||
There are 27 total failures in the last 7 days on windows2012.
Pete/Ted is there any update here?
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 61•6 years ago
|
||
(In reply to Pete Moore [:pmoore][:pete] from comment #56)
> Hi Ted,
>
> Do you know what the path is of the directory it is trying to create?
I don't know exactly because the path doesn't get propagated up with the error unfortunately. The `tempdir` crate calls `env::temp_dir()` to get the temp directory in which to create the new directory, and the Rust docs say this:
https://doc.rust-lang.org/std/env/fn.temp_dir.html#windows
"Returns the value of, in order, the TMP, TEMP, USERPROFILE environment variable if any are set and not the empty string. Otherwise, temp_dir returns the path of the Windows directory. This behavior is identical to that of GetTempPath, which this function uses internally."
The Windows implementation of that function literally just calls `GetTempPathW`:
https://github.com/rust-lang/rust/blob/e847f30f571e617990f7193665ab10127100e9df/src/libstd/sys/windows/os.rs#L291
Flags: needinfo?(ted)
Comment 62•6 years ago
|
||
(In reply to Ted Mielczarek [:ted] [:ted.mielczarek] from comment #61)
> (In reply to Pete Moore [:pmoore][:pete] from comment #56)
> > Hi Ted,
> >
> > Do you know what the path is of the directory it is trying to create?
>
> I don't know exactly because the path doesn't get propagated up with the
> error unfortunately. The `tempdir` crate calls `env::temp_dir()` to get the
> temp directory in which to create the new directory, and the Rust docs say
> this:
> https://doc.rust-lang.org/std/env/fn.temp_dir.html#windows
>
> "Returns the value of, in order, the TMP, TEMP, USERPROFILE environment
> variable if any are set and not the empty string. Otherwise, temp_dir
> returns the path of the Windows directory. This behavior is identical to
> that of GetTempPath, which this function uses internally."
>
> The Windows implementation of that function literally just calls
> `GetTempPathW`:
> https://github.com/rust-lang/rust/blob/
> e847f30f571e617990f7193665ab10127100e9df/src/libstd/sys/windows/os.rs#L291
The rust docs look consistent with https://docs.microsoft.com/en-us/windows/desktop/api/fileapi/nf-fileapi-gettemppathw#remarks - so this probably is accurate.
I'm curious whether those env vars intermittently have invalid values or are not set etc, or whether the temp directory genuinely goes missing.
If there is any way to get some debug in there to show what directory it is trying to delete, and to perhaps to dump the contents of the parent directory, and show DACLs associated to the directory if it exists, or its closest parent that does exist (using e.g. icacls command) that could possibly shed some light.
If the temp directory is getting deleted, I suspect this may be triggered by some process/thread of the task that is executing, rather than by the worker - but anything is possible...
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 67•6 years ago
|
||
This bug failed 36 in the last 7 days. Occurs on windows2012 platforms on opt build type.
Recent log:
https://treeherder.mozilla.org/logviewer.html#?job_id=200252674&repo=mozilla-inbound
INFO - ......sccache: encountered fatal error
20:38:09 INFO - sccache: error : Failed to create temp dir
20:38:09 INFO - sccache: cause: Failed to create temp dir
20:38:09 INFO - sccache: cause: The system cannot find the path specified. (os error 3)
20:38:09 INFO - error: Could not compile `siphasher`.
20:38:09 INFO - Caused by:
20:38:09 INFO - process didn't exit successfully: `z:/build/build/src/sccache2/sccache.exe 'z:/build/build/src/rustc/bin/rustc.exe' --crate-name siphasher 'third_party\rust\siphasher\src\lib.rs' --crate-type lib --emit=dep-info,link -C opt-level=2 -C codegen-units=1 -C metadata=2f1deef41a0ba09e -C extra-filename=-2f1deef41a0ba09e --out-dir 'z:/build/build/src/obj-firefox/toolkit/library\release\deps' -L 'dependency=z:/build/build/src/obj-firefox/toolkit/library\release\deps' --cap-lints warn` (exit code: 4294967294)
20:38:09 INFO - warning: build failed, waiting for other jobs to finish...
20:38:09 INFO - .sccache: encountered fatal error
20:38:09 INFO - sccache: error : Failed to create temp dir
20:38:09 INFO - sccache: cause: Failed to create temp dir
20:38:09 INFO - sccache: cause: The system cannot find the path specified. (os error 3)
20:38:09 INFO - error: Could not compile `libc`.
20:38:09 INFO - Caused by:
20:38:09 INFO - process didn't exit successfully: `z:/build/build/src/sccache2/sccache.exe 'z:/build/build/src/rustc/bin/rustc.exe' --crate-name libc 'third_party\rust\libc\src\lib.rs' --crate-type lib --emit=dep-info,link -C opt-level=2 -C codegen-units=1 --cfg 'feature="default"' --cfg 'feature="use_std"' -C metadata=b63034c45cb453f7 -C extra-filename=-b63034c45cb453f7 --out-dir 'z:/build/build/src/obj-firefox/toolkit/library\release\deps' -L 'dependency=z:/build/build/src/obj-firefox/toolkit/library\release\deps' --cap-lints warn` (exit code: 4294967294)
20:38:09 INFO - warning: build failed, waiting for other jobs to finish...
20:38:09 INFO - TEST-PASS | autobinscope.py | ../../dist/bin/xul.dll succeeded
20:38:09 INFO - z:/build/build/src/obj-firefox/_virtualenvs/init/Scripts/python.exe z:/build/build/src/build/win32/autobinscope.py ../../dist/bin/nss3.dll ../../dist/crashreporter-symbols/
20:38:10 INFO - sccache: encountered fatal error
20:38:10 INFO - sccache: error : Failed to create temp dir
20:38:10 INFO - sccache: cause: Failed to create temp dir
20:38:10 INFO - sccache: cause: The system cannot find the path specified. (os error 3)
20:38:10 INFO - error: Could not compile `unicode-xid`.
20:38:10 INFO - Caused by:
20:38:10 INFO - process didn't exit successfully: `z:/build/build/src/sccache2/sccache.exe 'z:/build/build/src/rustc/bin/rustc.exe' --crate-name unicode_xid 'third_party\rust\unicode-xid\src\lib.rs' --crate-type lib --emit=dep-info,link -C opt-level=2 -C codegen-units=1 --cfg 'feature="default"' -C metadata=5bd8ab608523f65a -C extra-filename=-5bd8ab608523f65a --out-dir 'z:/build/build/src/obj-firefox/toolkit/library\release\deps' -L 'dependency=z:/build/build/src/obj-firefox/toolkit/library\release\deps' --cap-lints warn` (exit code: 4294967294)
20:38:10 INFO - warning: build failed, waiting for other jobs to finish...
20:38:10 INFO - sccache: encountered fatal error
20:38:10 INFO - sccache: error : Failed to create temp dir
20:38:10 INFO - sccache: cause: Failed to create temp dir
20:38:10 INFO - sccache: cause: The system cannot find the path specified. (os error 3)
20:38:10 INFO - sccache: encountered fatal error
20:38:10 INFO - sccache: error : Failed to create temp dir
20:38:10 INFO - sccache: cause: Failed to create temp dir
20:38:10 INFO - sccache: cause: The system cannot find the path specified. (os error 3)
20:38:10 INFO - sccache: encountered fatal error
20:38:10 INFO - sccache: error : Failed to create temp dir
20:38:10 INFO - sccache: cause: Failed to create temp dir
20:38:10 INFO - sccache: cause: The system cannot find the path specified. (os error 3)
20:38:10 INFO - sccache: encountered fatal error
20:38:10 INFO - sccache: error : Failed to create temp dir
20:38:10 INFO - sccache: cause: Failed to create temp dir
20:38:10 INFO - sccache: cause: The system cannot find the path specified. (os error 3)
20:38:10 INFO - error: Could not compile `dtoa`.
20:38:10 INFO - Caused by:
20:38:10 INFO - process didn't exit successfully: `z:/build/build/src/sccache2/sccache.exe 'z:/build/build/src/rustc/bin/rustc.exe' --crate-name dtoa 'third_party\rust\dtoa\src\lib.rs' --crate-type lib --emit=dep-info,link -C opt-level=2 -C codegen-units=1 -C metadata=4012af297245995f -C extra-filename=-4012af297245995f --out-dir 'z:/build/build/src/obj-firefox/toolkit/library\x86_64-pc-windows-msvc\release\deps' --target x86_64-pc-windows-msvc -L 'dependency=z:/build/build/src/obj-firefox/toolkit/library\x86_64-pc-windows-msvc\release\deps' -L 'dependency=z:/build/build/src/obj-firefox/toolkit/library\release\deps' --cap-lints warn -C opt-level=2 -C debuginfo=2` (exit code: 4294967294)
20:38:10 INFO - warning: build failed, waiting for other jobs to finish...
20:38:10 INFO - error: Could not compile `siphasher`.
pmoore: Can you please take a look at this bug?
Flags: needinfo?(pmoore)
Comment hidden (Intermittent Failures Robot) |
Comment 69•6 years ago
|
||
Hey ted, is there any way we can inject some code into the failing process that could dump the info described in comment 62? Thanks!
Flags: needinfo?(pmoore) → needinfo?(ted)
Assignee | ||
Comment 70•6 years ago
|
||
Hm, yeah, I think we could fork the tempdir crate and put in extra logging for testing purposes.
Assignee | ||
Comment 71•6 years ago
|
||
I forked that crate and wrote a simple patch to put the path in the error message:
https://github.com/luser/tempdir/commit/bc045d5e96c85074da83efdf7ca512a2cfd5826b
We'll just need to change the sccache toolchain task to build sccache using this version of the crate.
Assignee | ||
Comment 72•6 years ago
|
||
Assignee | ||
Comment 73•6 years ago
|
||
I looked through some of the builds from comment 66 and I noticed that basically all of those examples are rusttest builds, which is interesting.
Flags: needinfo?(ted)
Assignee | ||
Comment 74•6 years ago
|
||
I managed to reproduce the failure on that try push, but my override of the tempdir crate didn't work properly. :-/
Assignee | ||
Comment 75•6 years ago
|
||
Assignee | ||
Comment 76•6 years ago
|
||
That worked, but I can't say I like the answer it's giving me:
https://treeherder.mozilla.org/logviewer.html#?job_id=201758699&repo=try&lineNumber=45655
19:58:44 INFO - sccache: cause: Failed to create temp dir
19:58:44 INFO - sccache: cause: Error creating temp directory "z:\\build\\build\\src\\c\\Users\\task_1537988825\\AppData\\Local\\Temp\\sccache.HQRR1j9faz6I": The system cannot find the path specified. (os error 3)
That's a pretty messed up path! I assume something has screwed up the TEMPDIR path by eating the colon or something like that? If it was set to `c/Users/.../Temp` then I could see it producing this path.
Assignee | ||
Comment 77•6 years ago
|
||
Assignee | ||
Comment 78•6 years ago
|
||
(In reply to Ted Mielczarek [:ted] [:ted.mielczarek] from comment #77)
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=f5a422b240cf8c5b9db8a359a8273d5dbe0f255e
I added a dump of the full environment to the error message in this push. Hopefully we can catch the failure and get some more info.
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 80•6 years ago
|
||
Took quite a few retries this time but I did finally get a failure:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f5a422b240cf8c5b9db8a359a8273d5dbe0f255e&selectedJob=202697224
The log shows that both TMP and TEMP are set incorrectly:
20:35:00 INFO - TEMP=c/Users/task_1538422421/AppData/Local/Temp
20:35:00 INFO - TERM=cygwin
20:35:00 INFO - TINDERBOX_OUTPUT=1
20:35:00 INFO - TMP=c/Users/task_1538422421/AppData/Local/Temp
Looking further up in the logs to where mozharness is dumping the environment, I see:
19:48:50 INFO - 'TEMP': 'C:\\Users\\task_1538422421\\AppData\\Local\\Temp',
19:48:50 INFO - 'TINDERBOX_OUTPUT': '1',
19:48:50 INFO - 'TMP': 'C:\\Users\\task_1538422421\\AppData\\Local\\Temp',
Those are consistent with the values from successful builds, so I haven't found anything definitive yet.
Assignee | ||
Comment 81•6 years ago
|
||
Assignee | ||
Comment 82•6 years ago
|
||
Assignee | ||
Comment 83•6 years ago
|
||
I caught a failure with that last try push:
https://treeherder.mozilla.org/logviewer.html#?job_id=202917973&repo=try&lineNumber=46042
I added extra logging to dump the environment using `env` from make before launching sccache, and from sccache itself when starting. Those both produce sensible results.
From env:
17:19:27 INFO - TEMP=/tmp
17:19:27 INFO - TERM=cygwin
17:19:27 INFO - TINDERBOX_OUTPUT=1
17:19:27 INFO - TMP=/tmp
From sccache:
17:19:27 INFO - TEMP=C:/Users/task_1538499626/AppData/Local/Temp
17:19:27 INFO - TERM=cygwin
17:19:27 INFO - TINDERBOX_OUTPUT=1
17:19:27 INFO - TMP=C:/Users/task_1538499626/AppData/Local/Temp
However, when it failed it produced the same busted output as previously:
18:00:16 INFO - TEMP=c/Users/task_1538499626/AppData/Local/Temp
18:00:16 INFO - TERM=cygwin
18:00:16 INFO - TINDERBOX_OUTPUT=1
18:00:16 INFO - TMP=c/Users/task_1538499626/AppData/Local/Temp
Assignee | ||
Comment 84•6 years ago
|
||
I realized while looking into that log that these builds also exhibit bug 1495798, which I just filed today. It seems pretty plausible that fixing that bug would fix this issue, since it looks like sccache is getting launched with correct environment variables the first time. If we weren't shutting it down and then re-starting it in an ad-hoc manner during `make check` I don't think this would happen.
Component: Worker → General
Depends on: 1495798
Product: Taskcluster → Firefox Build System
QA Contact: pmoore
Assignee | ||
Comment 85•6 years ago
|
||
I retriggered a bunch of win64-rusttest builds on my try push in bug 1495798 and I hit something that looked like this error:
https://treeherder.mozilla.org/logviewer.html#?job_id=203191252&repo=try&lineNumber=48397
21:17:36 INFO - ..error: couldn't create a temp dir: The system cannot find the path specified. (os error 3)
21:17:36 INFO - error: aborting due to previous error
21:17:36 INFO - error: Could not compile `stylo_tests`.
...but when I looked closer I realized the error message was different. That error comes from rustc itself:
https://github.com/rust-lang/rust/blob/8a0e5cbf124cea5254bb09111cae6af1b86b21e4/src/librustc_codegen_llvm/back/link.rs#L299
Interestingly, that code is trying to create a temp dir inside the output directory, so if that fails things are definitely broken in a strange way!
Assignee | ||
Comment 86•6 years ago
|
||
Although, the fact that it's breaking in that way is a useful data point. My patch in bug 1495798 ensures that we keep the sccache server started during the build alive for the `make check` step, so most Rust compiles in that step should be handled properly through the existing sccache server that has a working environment. Looking at sccache.log verifies that the server did get the request for the compile that failed:
DEBUG:sccache::server: parse_arguments: CannotCache(crate-type): ["--crate-name", "build_script_build", "servo\\ports\\geckolib\\tests\\build.rs", "--crate-type", "bin", "--emit=dep-info,link", "-C", "opt-level=2", "-C", "codegen-units=1", "-C", "metadata=43baea8111549fcb", "-C", "extra-filename=-43baea8111549fcb", "--out-dir", "z:/build/build/src/obj-firefox/toolkit/library\\release\\build\\stylo_tests-43baea8111549fcb", "-L", "dependency=z:/build/build/src/obj-firefox/toolkit/library\\release\\deps", "--extern", "regex=z:/build/build/src/obj-firefox/toolkit/library\\release\\deps\\libregex-9bf31ce226af7ac4.rlib"]
...but it refused to cache it because it's compiling a build script binary, which means rustc was then invoked from the sccache *client* process, where it failed!
Assignee | ||
Comment 87•6 years ago
|
||
I mentioned this on IRC, but looking at the log in comment 83 again I noticed that while not all env vars are messed up, some of them definitely are, and PATH is one of those:
18:00:16 INFO - PATH=z:/build/build/src/obj-firefox/toolkit/library\release\deps;Z:\task_1538499626\build\src\rustc\bin;z/build/build/src/clang/bin:z/build/build/src/clang/bin:z/build/build/src/vs2017_15.8.4/VC/bin/Hostx64/x64:z/build/build/src/vs2017_15.8.4/SDK/bin/10.0.17134.0/x64:z/build/build/src/vs2017_15.8.4/VC/redist/x64/Microsoft.VC141.CRT:z/build/build/src/vs2017_15.8.4/SDK/Redist/ucrt/DLLs/x64:z/build/build/src/vs2017_15.8.4/DIA SDK/bin/amd64:c/Program Files/Mercurial:c/mozilla-build/bin:c/mozilla-build/kdiff3:c/mozilla-build/moztools-x64/bin:c/mozilla-build/mozmake:/bin:/local/bin:c/mozilla-build/nsis-3.01:c/mozilla-build/python:c/mozilla-build/python/Scripts:c/Windows/system32:c/Windows:c/Windows/System32/Wbem:c/Windows/System32/WindowsPowerShell/v1.0/:c/Program Files/Amazon/cfn-bootstrap/:c/Program Files (x86)/GNU/GnuPG/pub:c/Program Files (x86)/Windows Kits/10/Windows Performance Toolkit/:c/mozilla-build/python/lib/site-packages/pywin32_system32:c/mozilla-build/python/lib/site-packages/pywin32_system32
That could definitely explain the symptoms in bug 1470384!
Assignee | ||
Comment 88•6 years ago
|
||
Given the way that cargo prepares the PATH env var for rustc subprocesses it seems like the environment is getting screwed up between make and cargo:
https://github.com/rust-lang/cargo/blob/b1e1d388ea09ecc27515345b180dd02263b087df/src/cargo/core/compiler/compilation.rs#L202-L205
From comment 87 you can see that there are two paths at the start of PATH that look correct, and those are what cargo adds. The rest are mangled, and presumably came from cargo's environment.
Assignee | ||
Comment 89•6 years ago
|
||
I have another theory! When we invoke `mach build` we run mach by way of bash on Windows:
https://dxr.mozilla.org/mozilla-central/rev/dcba2a476ccfb77f859b3614d2c77ed3cc06a225/testing/mozharness/mozharness/mozilla/building/buildbase.py#1131-1138
but when we run `mach build check` we do not use bash:
https://dxr.mozilla.org/mozilla-central/rev/dcba2a476ccfb77f859b3614d2c77ed3cc06a225/testing/mozharness/mozharness/mozilla/building/buildbase.py#1283-1290
Assignee | ||
Comment 90•6 years ago
|
||
Assignee | ||
Comment 91•6 years ago
|
||
Assignee | ||
Comment 92•6 years ago
|
||
I ran 101 build-win64-rusttests/opt tasks on that last try push and of those 96 passed, 4 failed in known xpcshell selftest intermittents, and 1 failed due to an infra error downloading a toolchain. I didn't see a single instance of this or bug 1470384, which seems pretty good!
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 93•6 years ago
|
||
Historically when invoking mach from mozharness on windows we have started
it by way of bash. However, we haven't been doing that for `mach build check`.
This hadn't caused noticeable issues in the past, but now that we are
running rusttest builds, which wind up compiling extra Rust code in that
step, intermittent failures are showing up because of this. The root cause
is likely a bug in msys' translation of environment variables, but msys
is unsupported and we're already using this workaround for the rest of the
build.
Comment 94•6 years ago
|
||
Pushed by tmielczarek@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/af465aa90d73
invoke mach via bash on windows when running `mach build check`. r=froydnj
Comment 95•6 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 6 years ago
status-firefox64:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 97•6 years ago
|
||
The graph from the previous comment shows this failure disappearing as of 2018-10-06, so it looks like this patch did fix the issue!
Updated•6 years ago
|
Whiteboard: [stockwell infra][stockwell needswork] → [stockwell infra][stockwell fixed:product]
Assignee | ||
Comment 98•6 years ago
|
||
Updated•6 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•