Closed Bug 1774585 Opened 2 years ago Closed 2 years ago

sccache-dist fails on m-c opt build: Compiling uniffi_bindgen v0.18.0

Categories

(Firefox Build System :: General, defect, P3)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gerard-majax, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 2 obsolete files)

Attached file fail.log (deleted) —
No description provided.
0:10.07 error: template "Types.kt" not found in directories ["/home/alex/codaz/Mozilla/gecko-cinnabar/third_party/rust/uniffi_bindgen/templates"]
 0:10.07   --> /home/alex/codaz/Mozilla/gecko-cinnabar/third_party/rust/uniffi_bindgen/src/bindings/kotlin/gen_kotlin/mod.rs:99:10
 0:10.07    |
 0:10.07 99 | #[derive(Template)]
 0:10.07    |          ^^^^^^^^
 0:10.07    |
 0:10.07    = note: this error originates in the derive macro `Template` (in Nightly builds, run with -Z macro-backtrace for more info)

There's no third_party/rust/uniffi_bindgen/templates on my system, but there are subfolders:

$ find third_party/rust/uniffi_bindgen/ -type d -name "templates"
third_party/rust/uniffi_bindgen/src/bindings/kotlin/templates
third_party/rust/uniffi_bindgen/src/bindings/python/templates
third_party/rust/uniffi_bindgen/src/bindings/ruby/templates
third_party/rust/uniffi_bindgen/src/bindings/swift/templates
third_party/rust/uniffi_bindgen/src/scaffolding/templates

And there's the file we want: third_party/rust/uniffi_bindgen/src/bindings/kotlin/templates/Types.kt

The proper configuration is defined in https://searchfox.org/mozilla-central/rev/ec3889f74d6b5695833280f4370ca0e9ba59a3e4/third_party/rust/uniffi_bindgen/askama.toml#3 and the default values are from https://searchfox.org/mozilla-central/rev/ec3889f74d6b5695833280f4370ca0e9ba59a3e4/third_party/rust/askama_shared/src/lib.rs#42-43 i.e., CARGO_MANIFEST_DIR/templates/ which would match third_party/rust/uniffi_bindgen/templates. I suspect we miss sending askama.toml when doing dist build?

I hacked sccache to force send askama.toml, build is now passing.

Assignee: nobody → lissyx+mozillians

bdk, could we avoid building bindings stuff related to languages we don't have a use for?

Flags: needinfo?(bdeankawamura)

I think we should be able to add a feature flag for this. But is there a way to prevent uniffi-bindgen from being built altogether? It's only needed when devs manually execute it with mach uniffi generate.

Flags: needinfo?(bdeankawamura)

(In reply to Ben Dean-Kawamura from comment #5)

I think we should be able to add a feature flag for this. But is there a way to prevent uniffi-bindgen from being built altogether? It's only needed when devs manually execute it with mach uniffi generate.

It's a dependency of uniffi_build which is a dependency of glean as of bug 1768834.

(In reply to Ben Dean-Kawamura from comment #5)

I think we should be able to add a feature flag for this. But is there a way to prevent uniffi-bindgen from being built altogether? It's only needed when devs manually execute it with mach uniffi generate.

./mach uniffi generate
It looks like you are trying to run an unknown mach command: uniffi

Run |mach help| to show a list of commands.

What is that command ?

Presumably glean generate python bindings, but I guess we don't need that when building glean for gecko...

Flags: needinfo?(jrediger)

Oops, I thought this was from the phab patch I just pushed. Forget about that.

Glean is using it to build the Rust side of the FFI layer (uniffi-bindgen builds both the Rust and foreign side). It shouldn't need the Kotlin functionality for that and we can work on making that optional. That's going to be a bit of a tricky process though. For a long time there was code to ensure the Rust bindings and Foreign bindings were built by the exact same tool to prevent issues with version mismatches. We're currently working through of relaxing that requirement. How urgent is this one?

Also, it's still going to need uniffi_bindgen/src/scaffolding/templates to build the Rust parts the askama.toml file will still need to be in the correct place.

No longer blocks: 1774556

(In reply to Ben Dean-Kawamura from comment #9)

Oops, I thought this was from the phab patch I just pushed. Forget about that.

Glean is using it to build the Rust side of the FFI layer (uniffi-bindgen builds both the Rust and foreign side). It shouldn't need the Kotlin functionality for that and we can work on making that optional. That's going to be a bit of a tricky process though. For a long time there was code to ensure the Rust bindings and Foreign bindings were built by the exact same tool to prevent issues with version mismatches. We're currently working through of relaxing that requirement. How urgent is this one?

Also, it's still going to need uniffi_bindgen/src/scaffolding/templates to build the Rust parts the askama.toml file will still need to be in the correct place.

Well it's as urgent as this is making my builds much slower if I can only rely on my laptop hence making me slower to investigate bug 1774556

Glean could ship with pregenerated bindings?

Can you help me understand the build failure more? I'm pretty sure askama.toml is in the source tree, what's happening in the build process that's causing it to not copy over?

The secondary issue is the the python/swift/kotlin bindings generation code. That's definitely unneeded and we're going to work on putting that behind a feature flag. That should give a small build time improvement, but I don't think that's going to solve this issue. The Askama templates are also used to build the Rust side of the FFI, so you need that askama.toml file in the correct place. For example, I see error: template "scaffolding_template.rs" not found in your original error log, which represents uniffi-bindgen generating Rust code for the build.

(In reply to Ben Dean-Kawamura from comment #12)

Can you help me understand the build failure more? I'm pretty sure askama.toml is in the source tree, what's happening in the build process that's causing it to not copy over?

As much as I know of sccache, it's copying deps from rustc -Z ls and others, here it's a plain file so it's not considered a file to copy

Attached file GitHub Pull Request (obsolete) (deleted) —

Wait a second, is the problem stemming from a proc macro reading files?

Yeah, looks like this is it... well, I don't think that's a proper thing to do for a proc macro in the first place...

So, I am unsure given latest comment from Mike if uniffi_bindgen with those macros are the way to go, and if they are, it seems we need:

  • to account for the template's macro references to file for rebuilding and caching
  • proper way to discover askama.toml

Ben, could you take over the issue, or let me know if the current set of changes is sound at least ? Also, I can't explain why the issue only hits on opt builds and not on debug, but I really have to focus on other things, and with the heatwave I'm not really at luxury of running that many many builds.

I have also spotted failures only on Python, Swift, Ruby and Kotlin usage, nothing on Rust even though you mentionned it on Comment 12. If you can share me a patch that would avoid building un-necessary bindings such as kotlin, ruby, swift and python I could verify if it unblocks my opt builds or if we still hit it on rust, though.

Flags: needinfo?(bdeankawamura)

I'm happy to help out with this one, but I'm confused about what the issue is.

It seems to me that the issue is that sccache is not copying askama.toml over. Is there an issue with the 2 PRs that Alexandre made? If so, could we try an alternative path?

Another issue is that the askama proc-macro is reading files. I don't know what we can do about this though short of backing out the Glean change and not going forward with the UniFFI desktop JS project. Maybe we could refactor UniFFI so that it worked with the default configuration but that means the templates would be less convenient place for UniFFI devs. Also, the macro would still be reading the template files. I'm hoping to replace Askama, but that's a long-term project.

AFAICT, the fact that uniffi-bindgen is building unneeded python/switch/ruby/kotlin generation code is a secondary issue. I'm pretty sure that fixing this won't fix the build issue, but I put together a branch to test this out: https://github.com/bendk/uniffi-rs/tree/no-bindings. I just removed the bindings generation code altogether there.

Is there a way for me to repro this myself?

Flags: needinfo?(bdeankawamura)

(In reply to Ben Dean-Kawamura from comment #19)

I'm happy to help out with this one, but I'm confused about what the issue is.

It seems to me that the issue is that sccache is not copying askama.toml over. Is there an issue with the 2 PRs that Alexandre made? If so, could we try an alternative path?

That's the root cause. The problem is that so far the only way I found to fix it was to:

  • add cargo packaging info mentionning askama.toml
  • leverage cargo package --list from sccache-dist to know we need the file and copy it for the remote build to happen correctly

I'm not at ease enough with rust to know if we could do differently to detect that askama.toml is a build dependency, because at the end of the day this is what it is. Mike mentionned my use of cargo might be risky.

[...]

Is there a way for me to repro this myself?

Setup sccache-dist, build opt. Somehow debug build dont show that for me. I dont know why, and I've already spent too much time on that problem for the moment.

Assignee: lissyx+mozillians → nobody
Severity: -- → S3
Priority: -- → P3

The simpler version that seems to fix this issue for an sccache build is to include_path that file in the uniffi_bindgen build, as I've one in this commit.

To test add this to Cargo.toml in m-c under [patch.crates-io]:

uniffi_bindgen = { git = "https://github.com/badboy/uniffi-rs", rev = "7fdd34adb1975a65bbdcdfbdc98ec3314d6f31f3" }

Then cargo update -p uniffi_bindgen; mach vendor rust, and it should build again with sccache.
I can get a UniFFI patch release out, so we can easily update it in m-c next.

Flags: needinfo?(jrediger)

The simpler version that seems to fix this issue for an sccache build is to include_path that file in the uniffi_bindgen build, as I've one in this commit.

The downside is that this relies on the compiler and/or the linker removing the unused data (since uniffi_bindgen is a dep of uniffi which is a dep of glean, not a build-dep).

(In reply to Mike Hommey [:glandium] from comment #22)

The simpler version that seems to fix this issue for an sccache build is to include_path that file in the uniffi_bindgen build, as I've one in this commit.

The downside is that this relies on the compiler and/or the linker removing the unused data (since uniffi_bindgen is a dep of uniffi which is a dep of glean, not a build-dep).

Yes, it's a hack. But it's probably the quickest thing we can deliver this week, with potentially restructuring our dependency usage afterwards, which might be a bit more work.

:glandium, do you think that hack is an acceptable solution for now? If so we can probably get that in tomorrow (I'm out for 2 weeks after this week, so I'd like to leave this in a good place)

Flags: needinfo?(mh+mozilla)

It's likely a better hack than mine :)

Attachment #9281722 - Attachment is obsolete: true
Attachment #9281723 - Attachment is obsolete: true

I'm not 100% confident about this, but it seems like uniffi_bindgen is a transitive dependency from the uniffi_macros proc-macro so its data shouldn't affect the libxul binary size. It's also a build-dependency from glean-core, which should be fine as well.

Can we get some fix landed ? This is now also hitting my debug builds

Sorry for the delay. We should have this fixed by the beginning of the next week for sure.

https://phabricator.services.mozilla.com/D151499 should fix this one (is it possible to specify 2 bugs for a phab-patch?)

(In reply to Ben Dean-Kawamura from comment #29)

https://phabricator.services.mozilla.com/D151499 should fix this one (is it possible to specify 2 bugs for a phab-patch?)

Not sure it's possible but it's probably not advisable. Better/simpler to just encode the dependency relationship in bugzilla (e.g. in this case we could either dupe this to bug 1772132 or, perhaps-better since the bugs are about distinct things, just resolve this as FIXED with a dependency on bug 1772132 once the patch lands).

the glean update landed in bug 1772132. I verified it locally with an sccache-dist setup that it all builds again just fine.

Status: NEW → RESOLVED
Closed: 2 years ago
Depends on: 1772132
Flags: needinfo?(mh+mozilla)
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: