Closed Bug 1336013 Opened 8 years ago Closed 7 years ago

make Stylo ASAN builds work

Categories

(Core :: CSS Parsing and Computation, defect, P2)

defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: heycam, Assigned: shinglyu)

References

(Blocks 1 open bug)

Details

(Whiteboard: [Stylo])

ASAN builds currently require --disable-jemalloc, but that is incompatible with --enable-stylo.  I wonder if there is any way we can get ASAN builds working with Stylo?
Why is --disable-jemalloc incompatible with stylo? Does it have to do with that business where we made stylo imply a special jemalloc #define?
Flags: needinfo?(mh+mozilla)
Try this:

diff --git a/toolkit/moz.configure b/toolkit/moz.configure
index c8561108677e..a5aacfcfd41a 100644
--- a/toolkit/moz.configure
+++ b/toolkit/moz.configure
@@ -670,8 +670,10 @@ set_config('MOZ_LIBCLANG_PATH', delayed_getattr(stylo, 'libclang_path'))
 set_config('MOZ_CLANG_PATH', delayed_getattr(stylo, 'clang_path'))
 set_config('MOZ_STYLO_BINDGEN', delayed_getattr(stylo, 'bindgen_enabled'))
 
-@depends(stylo, dmd)
-def jemalloc(stylo, dmd):
+@depends(stylo, dmd, '--enable-jemalloc', when='--enable-compile-environment')
+def jemalloc(stylo, dmd, jemalloc):
+    if jemalloc.origin != 'default' and not jemalloc:
+        return None
     if stylo:
         return 'moz'
     elif dmd:
Flags: needinfo?(mh+mozilla)
That works fine to get things to compile.  I still get a Segmentation fault very early on, though, just like if I leave off --disable-jemalloc for an ASAN non-Stylo build.
Blocks: stylo-tooling
No longer blocks: stylo
Summary: stylo: make ASAN builds work → make ASAN builds work
Whiteboard: [Stylo]
Comment 2 didn't work for me; glandium says for local testing we can do this instead:

diff --git a/toolkit/moz.configure b/toolkit/moz.configure
--- a/toolkit/moz.configure
+++ b/toolkit/moz.configure
@@ -679,8 +679,8 @@ def jemalloc(stylo, target, dmd):
     elif dmd:
         return True
 
-imply_option('--enable-jemalloc', jemalloc,
-             reason='--enable-dmd or --enable-stylo')
 
 option('--with-servo', env='SERVO_TARGET_DIR', nargs=1,
        help='Absolute path of the target directory where libgeckoservo can '

That got me past configure, but then bindgen crashed:

 0:17.10 thread '<unnamed>' panicked at 'C'mon: Continue', C:\bot\slave\stable-dist-rustc-win-msvc-64\build\src\libcore\result.rs:837
...
 0:17.10    7:     0x7ff63f888a2f - core::result::unwrap_failed<bindgen::parse::ParseError>
 0:17.10                         at C:\bot\slave\stable-dist-rustc-win-msvc-64\build\src\libcore\macros.rs:29
 0:17.10    8:     0x7ff63f8d46d0 - bindgen::ir::ty::Type::from_clang_ty
 0:17.10                         at S:\blue\third_party\rust\bindgen\src\ir\ty.rs:1024
...
0:17.11 thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Any', C:\bot\slave\stable-dist-rustc-win-msvc-64\build\src\libcore\result.rs:837
fitzgen, can you help dmajor out with bindgen here?
Flags: needinfo?(nfitzgerald)
FWIW, I filed bug 1347036 about the fact that comment 2 fails to do the expected thing silently. I don't remember exactly what was going on, but ISTR that my conclusion was that it wasn't possible to do it that way. So bug 1347036 is about not silently pretending it works as expected when it doesn't.

We still need to find a way to make this work.

ni?'ing myself to keep that in my queue.
Flags: needinfo?(mh+mozilla)
What's the STR for the bindgen panic?
Flags: needinfo?(nfitzgerald) → needinfo?(dmajor)
(In reply to Nick Fitzgerald [:fitzgen] [⏰PST; UTC-8] from comment #7)
> What's the STR for the bindgen panic?

Build stylo with an LLVM >= 4.0.0 in your PATH. (I was building Win64, if that matters.)

The 4.0.0 release and current trunk (5.0.0) both hit the panic. I built the 3.9.0 tag and it's fine: I was able to build stylo with and without ASan (though the latter doesn't start up, which I'll investigate next).
Flags: needinfo?(dmajor)
> I was able to build stylo with and without ASan
> (though the latter doesn't start up, which I'll investigate next).

Er, I mean "former". The ASan build doesn't start. Windows says "The application was unable to start correctly (0xc0000142). Click OK to close the application." This message usually indicates a library loading problem. I bet it's the ASan runtime DLL.
The Windows ASan runtime in LLVM 3.9.0 is missing too many fixes to be useful. We'll need to get the bindgen stuff sorted out so that we can use LLVM trunk.
I've filed https://github.com/servo/rust-bindgen/issues/584 for tracking the issue on the bindgen side.
Wait, I'm using clang 5.0.0, and stylo builds work fine for me.
I guess this can be due to windows-only headers and similar stuff, we've definitely had bugs like that before.
Alias: stylo-asan
Summary: make ASAN builds work → stylo: make ASAN builds work
Summary: stylo: make ASAN builds work → make ASAN builds work
No longer blocks: 1356988
(In reply to David Major [:dmajor] from comment #8)
> The 4.0.0 release and current trunk (5.0.0) both hit the panic. I built the
> 3.9.0 tag and it's fine: I was able to build stylo with and without ASan
> (though the latter doesn't start up, which I'll investigate next).

dmajor or heycam, is this bug specifically about Stylo ASAN problems with LLVM 4.0? We currently require LLVM 3.9 (bug 1357889) or 5.0 (bug 1359508) and forbid 4.0.

bholley suggested that someone do a Try push that:

(1) Removes the special stylo behavior in mozjemalloc (since we're already working on sorting that out)
(2) Passes --enable-stylo on all platforms.
Flags: needinfo?(dmajor)
Flags: needinfo?(cam)
Priority: -- → P2
I would guess that this bug was for getting Stylo ASAN builds working, whatever that takes. The LLVM issue was one particular roadblock along the way, although it's apparently now cleared. I suppose the next step would be to try a fresh build and see if there's anything else standing in the way.
Flags: needinfo?(dmajor)
(I'd be keen to do so myself but I'm not sure I have the cycles. I'll keep an eye on this bug and revisit if nobody takes it before I do.)
Shing, when you have time, can you please test whether a Stylo ASAN build works correctly? ASAN is one of our blockers building Stylo on Nightly, but we don't know whether it is actually still broken. :)

Bobby suggests testing a Try push that:

(1) Removes the special stylo behavior in mozjemalloc
(2) Passes --enable-stylo on all platforms.
Flags: needinfo?(cam) → needinfo?(slyu)
I'm glad to help, but I'm not quite sure about (1), does that mean applying the patch in comment 2 or is there anything else I need to modify?
Flags: needinfo?(shing.lyu) → needinfo?(bobbyholley)
(In reply to Shing Lyu [:shinglyu] from comment #18)
> I'm glad to help, but I'm not quite sure about (1), does that mean applying
> the patch in comment 2 or is there anything else I need to modify?

That's right.
Flags: needinfo?(bobbyholley)
linux64-asan build seems to work in [1]. I deliberately add a use-after-free error in Gecko and see some ASAN report in [2].

I tried to make OSX and Windows debug build asan using the path [3], but looks like they still can't find the clang they needed. 

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=6583707fd2b19b2e894e3adc28c04f38b09dd15a
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=73e544cec450fdbb3535093c47394838aad24750
[3] https://hg.mozilla.org/try/rev/7bae347b0e953bc0176283184e9046a0e4e88e2d
(In reply to Shing Lyu [:shinglyu] from comment #20)
> linux64-asan build seems to work in [1]. I deliberately add a use-after-free
> error in Gecko and see some ASAN report in [2].

Nice! That's really great news. Can you also do a push with the reftest and mochitest with asan? Would be good to know whether we have any UAFs beyond the code exercised by the build job (which is just some xpcshell smoketests IIRC).

> 
> I tried to make OSX and Windows debug build asan using the path [3], but
> looks like they still can't find the clang they needed. 

Ok, that sounds like the mostly-orthogonal issue of getting stylo build support on other platforms. Chris, do we have bugs on file to track this?

> 
> [1]
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=6583707fd2b19b2e894e3adc28c04f38b09dd15a
> [2]
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=73e544cec450fdbb3535093c47394838aad24750
> [3] https://hg.mozilla.org/try/rev/7bae347b0e953bc0176283184e9046a0e4e88e2d
Flags: needinfo?(shing.lyu)
Flags: needinfo?(cpeterson)
(In reply to Shing Lyu [:shinglyu] from comment #20)
> I tried to make OSX and Windows debug build asan using the path [3], but
> looks like they still can't find the clang they needed. 
> 
> [1]
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=6583707fd2b19b2e894e3adc28c04f38b09dd15a
> [2]
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=73e544cec450fdbb3535093c47394838aad24750
> [3] https://hg.mozilla.org/try/rev/7bae347b0e953bc0176283184e9046a0e4e88e2d

Nathan, shouldn't the Windows and Mac builders get the clang packages they need from Stylo bootstrap bug 1314355? Do we need to do something else to update the builders?
Assignee: nobody → shing.lyu
Depends on: 1314355
Flags: needinfo?(cpeterson) → needinfo?(nfroyd)
Sorry, the linux link in Comment 20 didn't have stylo enabled. This one is the correct one (stylo + linux64-asan/debug and stylo + linux64-asan/opt), and with all the tests bholley requested: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d31cf95b1ac021ba56d010754062a7f3134c7e42
Flags: needinfo?(shing.lyu)
(In reply to Chris Peterson [:cpeterson] from comment #22)
> (In reply to Shing Lyu [:shinglyu] from comment #20)
> > I tried to make OSX and Windows debug build asan using the path [3], but
> > looks like they still can't find the clang they needed. 
> > 
> > [1]
> > https://treeherder.mozilla.org/#/
> > jobs?repo=try&revision=6583707fd2b19b2e894e3adc28c04f38b09dd15a
> > [2]
> > https://treeherder.mozilla.org/#/
> > jobs?repo=try&revision=73e544cec450fdbb3535093c47394838aad24750
> > [3] https://hg.mozilla.org/try/rev/7bae347b0e953bc0176283184e9046a0e4e88e2d
> 
> Nathan, shouldn't the Windows and Mac builders get the clang packages they
> need from Stylo bootstrap bug 1314355? Do we need to do something else to
> update the builders?

Um.  The Windows failure looks like mozconfig bustage, rather than not being able to the find the particular clang packages.  Though I suspect that our normal Windows builds (i.e. the not-static analysis ones) won't be downloading the clang packages they would need, so that will need to be fixed.

The OS X failure looks like the ASan libraries weren't built for the clang we use, so configure errored when using the ASan flags for something.  So we'd have to update our build scripts to build those libraries.  Note that we don't actually build Mac ASan builds anywhere on infra, AFAIK, though I have seen some of our developers building with ASan locally.
Flags: needinfo?(nfroyd)
Blocks: 1363976
So, the leaks there should definitely be investigated. However, I believe those are from running asan with stylo built+enabled, rather than just with stylo built (which is the near-term goal).

Can you redo the push with [1] set to false? That will tell us if asan builds are green with stylo built+disabled.

[1] http://searchfox.org/mozilla-central/rev/d66b9f27d5630a90b2fce4d70d4e9050f43df9b4/modules/libpref/init/all.js#5638
Flags: needinfo?(shing.lyu)
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #25)
> So, the leaks there should definitely be investigated. However, I believe
> those are from running asan with stylo built+enabled, rather than just with
> stylo built (which is the near-term goal).
> 
> Can you redo the push with [1] set to false? That will tell us if asan
> builds are green with stylo built+disabled.
> 
> [1]
> http://searchfox.org/mozilla-central/rev/
> d66b9f27d5630a90b2fce4d70d4e9050f43df9b4/modules/libpref/init/all.js#5638

I did this in bug 1356991 comment 4.
Flags: needinfo?(shing.lyu)
Double confirm: https://treeherder.mozilla.org/#/jobs?repo=try&revision=7f405f68728eee3be0bd28ea0dfbf05c0f7417eb
(In reply to Mike Hommey [:glandium] from comment #6)
> We still need to find a way to make this work.
> 
> ni?'ing myself to keep that in my queue.

So, in the meantime, bug 1361258 landed and --disable-jemalloc and --enable-stylo don't conflict anymore.
Flags: needinfo?(mh+mozilla)
(In reply to Mike Hommey [:glandium] from comment #28)
> So, in the meantime, bug 1361258 landed and --disable-jemalloc and
> --enable-stylo don't conflict anymore.

Hooray! Thanks for fixing that.

Shing, would you be able to try it out again and see if we can have a stylo-off ASAN builds and run successfully across platforms on try?
Flags: needinfo?(shing.lyu)
Nathan, is there anything left to do here, or did you manage to build by default on ASAN?
Flags: needinfo?(nfroyd)
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #30)
> Nathan, is there anything left to do here, or did you manage to build by
> default on ASAN?

Stylo-by-default built on ASan OK.
Flags: needinfo?(nfroyd)
Resolving as WFM as per Nathan's comment 31.
Flags: needinfo?(shing.lyu)
Alias: stylo-asan
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME
Summary: make ASAN builds work → make Stylo ASAN builds work
You need to log in before you can comment on or make changes to this bug.