Closed Bug 1688452 Opened 4 years ago Closed 3 years ago

Sandbox libexpat using RLBox

Tracking

()

Status:

RESOLVED FIXED

Milestone:

96 Branch

Tracking Flags:

Tracking

Status

firefox96

---

fixed

People

(Reporter: deian, Assigned: deian)

References

(Blocks 2 open bugs)

Details

(Keywords: perf-alert)

Attachments

(1 file, 10 obsolete files)

Bug 1688452 - Part 1: Retrofit nsHtml5StreamParser to use RLBoxed libexpat 4 years ago Deian Stefan (deleted), text/x-phabricator-request		Details
Bug 1688452 - Retrofit nsExpatDriver to use RLBoxed libexpat 4 years ago Deian Stefan (deleted), text/x-phabricator-request		Details
Bug 1688452 - Part 3: Add Wasm sandbox support for RLBoxed libexpat 4 years ago Deian Stefan (deleted), text/x-phabricator-request		Details
Bug 1688452 - Part 4: Turn on Wasm sandboxing for expat 4 years ago Deian Stefan (deleted), text/x-phabricator-request		Details
Bug 1688452 - Part 2: Retrofit nsHtml5StreamParser to use RLBoxed libexpat 3 years ago Deian Stefan (deleted), text/x-phabricator-request		Details
Bug 1688452 - Part 3: Eliminate redundant callback implementation in RLBoxed libexpat 3 years ago Deian Stefan (deleted), text/x-phabricator-request		Details
Bug 1688452 - Turn on Wasm sandboxing for libexpat 3 years ago Deian Stefan (deleted), text/x-phabricator-request		Details
Bug 1688452 - Part 5: Make expat sandbox fallible 3 years ago Deian Stefan (deleted), text/x-phabricator-request		Details
Bug 1688452 - Part 6: Use UniquePtr for expat sandbox 3 years ago Deian Stefan (deleted), text/x-phabricator-request		Details
Bug 1688452 - Part 7: Add bump allocator to sandboxed expat 3 years ago Deian Stefan (deleted), text/x-phabricator-request		Details
Bug 1688452 - Part 8: Use bump allocator in nsExpatDriver::Handle{Start,End}Element 3 years ago Deian Stefan (deleted), text/x-phabricator-request		Details

Deian Stefan

Assignee

Description

•

4 years ago

No description provided.

Deian Stefan

Assignee

Comment 1

•

4 years ago

Attached file Bug 1688452 - Part 1: Retrofit nsHtml5StreamParser to use RLBoxed libexpat (obsolete) (deleted) — Details

Deian Stefan

Assignee

Comment 2

•

4 years ago

AFAICT none of the tests in html/parser exercise the stream code that uses expat. I'd love some suggestion for how to at least sanity test this.

EDIT: figured it out. view source of some xml file works

Phabricator Automation

Updated

•

4 years ago

Attachment #9198857 - Attachment description: Bug 1688452 - Part 1: Retrofit streaming HTML parser with RLBox sandboxed expat → Bug 1688452 - Part 1: Retrofit XML view source parser with RLBox-sandboxed libexpat

Gian-Carlo Pascutto [:gcp]

Updated

•

4 years ago

Severity: -- → S4

Priority: -- → P1

Deian Stefan

Assignee

Comment 3

•

4 years ago

Attached file Bug 1688452 - Retrofit nsExpatDriver to use RLBoxed libexpat (deleted) — Details

Depends on D102851

Deian Stefan

Assignee

Updated

•

4 years ago

Blocks: 1693991

Deian Stefan

Assignee

Comment 4

•

4 years ago

Attached file Bug 1688452 - Part 3: Add Wasm sandbox support for RLBoxed libexpat (obsolete) (deleted) — Details

Depends on D102851

Deian Stefan

Assignee

Comment 5

•

4 years ago

Sorry for the two Part 2s. I think we should split out the two uses of libexpat so we can back out things more easily.

So let's hold off on D104658 for a bit. I'll upload a new version of the nsExpatDriver that turns on sandboxing too (and addresses @tjr's comments). For this we need to preload the library much earlier. I did this, but ran into a RLBox assertion when registering callback (could not find an empty slot in sandbox function table). We are creating quite a few sandboxes so this might be the reason, but @shravanrn and I need to sit down and see what's up.

Deian Stefan

Assignee

Comment 6

•

4 years ago

Tom: I was thinking of creating a single sandbox for nsExpatDriver vs. per parser. This might be an okay first step, but we're still curious what the attacker model is here (i.e., do we want something finer grained)?

Flags: needinfo?(tom)

Bobby Holley (:bholley)

Comment 7

•

4 years ago

(In reply to Deian Stefan from comment #6)

Tom: I was thinking of creating a single sandbox for nsExpatDriver vs. per parser. This might be an okay first step, but we're still curious what the attacker model is here (i.e., do we want something finer grained)?

I think the attacker model is basically that somebody injects malformed XML in order to trigger a memory hazard and take over the content process. To that end, I'm not sure what multiple sandboxes would buy us (for libexpat or for any other library I can think of).

What is the memory overhead of each sandbox? If it's non-trivial, we should seriously consider just having a single sandbox per process shared across all RLBoxed libraries. Any reason not to do that?

Deian Stefan

Assignee

Comment 8

•

4 years ago

Sounds good to me. I think we can coalesce things into a single sandbox (beyond expat), but right now the interface tied to library is easier to manage and reason about (e.g., which callbacks can be called by what library). We may want to chat about this in more detail.

For nsExpatDriver, specifically, I think we want two sandboxes (one with callbacks for system principal, one not). I assume like the other usage of expat (specifically this comment), we can't rely on thread locals to keep track of the driver (this)?

Bobby Holley (:bholley)

Comment 9

•

4 years ago

(In reply to Deian Stefan from comment #8)

Sounds good to me. I think we can coalesce things into a single sandbox (beyond expat), but right now the interface tied to library is easier to manage and reason about (e.g., which callbacks can be called by what library).

Perhaps we could keep the interface the same, but simply have create_sandbox share the lucet/wasmtime sandbox under the hood (probably via TLS)?

We may want to chat about this in more detail.

Sounds good, we can talk on Tuesday.

For nsExpatDriver, specifically, I think we want two sandboxes (one with callbacks for system principal, one not).

Can you say more about this?

Bobby Holley (:bholley)

Comment 10

•

4 years ago

(In reply to Bobby Holley (:bholley) from comment #9)

For nsExpatDriver, specifically, I think we want two sandboxes (one with callbacks for system principal, one not).

Can you say more about this?

In particular: I believe the separate codepath for system principal code in nsExpatDriver comes from bug 1539759. The idea there was that attackers could use langpack DTDs to inject markup into privileged documents, and so we modified the parsing code for privileged documents to disallow that. I don't think there's any reason why it would be dangerous for a compromised libexpat (presumably parsing a non-system-principal document containing the exploit) to call the system-principal version of the callbacks.

In general I think it would be helpful to have Peter take a look at the patches here and give a general take as to whether they make sense.

Flags: needinfo?(peterv)

Tom Ritter [:tjr] (Back 9/5)

Comment 11

•

4 years ago

In general I think we should be wary about colocating wasm libraries I think. If we assume an attacker can control the inside of a wasm sandbox, manipulate it fully, and return bad values; then a library A that introduces a vulnerability allowing takeover would allow the attacker to manipulate the result returned by library B. In theory library B's values should be sanity checked to prevent anything too-bad from happening but in practice we aren't putting in all the checks presently. (Bug 1693991) With Fission you won't be able to attack cross-site for the most part, so the risk is lower also.

That said this attack is going to be pretty darn complicated to pull off; and I don't think it's unreasonable to err on the side of performance until we have an inkling that this has become more concerning.

Flags: needinfo?(tom)

Bobby Holley (:bholley)

Comment 12

•

4 years ago

Yeah. I think it's really a question of the memory cost, primarily on windows, which we won't have a good understanding of until we actually land rlbox support on windows.

For now, I think the best approach is to keep the sandboxes conceptually separate (per library I think, rather than per instantiation of the library). That aligns with the design of the rlbox API, and gives us the most flexibility. Later on, if the memory usage is significant, we can coalesce the sandboxes under the hood per my suggestion in comment 9.

Deian Stefan

Assignee

Updated

•

4 years ago

Depends on: 1699028

Phabricator Automation

Updated

•

4 years ago

Attachment #9198857 - Attachment description: Bug 1688452 - Part 1: Retrofit XML view source parser with RLBox-sandboxed libexpat → Bug 1688452 - Part 1: Retrofit nsHtml5StreamParser to use RLBoxed libexpat

Phabricator Automation

Updated

•

4 years ago

Attachment #9202274 - Attachment description: Bug 1688452 - Part 2: Retrofit nsExpatDriver to use RLBox-sandboxed libexpat → Bug 1688452 - Part 2: Retrofit nsExpatDriver to use RLBoxed libexpat

Phabricator Automation

Updated

•

4 years ago

Attachment #9205053 - Attachment description: Bug 1688452 - Part 2: Turn on libexpat sandbox for XML view source parser → Bug 1688452 - Part 3: Add Wasm sandbox support for RLBoxed libexpat

Deian Stefan

Assignee

Comment 13

•

4 years ago

Attached file Bug 1688452 - Part 4: Turn on Wasm sandboxing for expat (obsolete) (deleted) — Details

Depends on D106254

Deian Stefan

Assignee

Comment 14

•

4 years ago

The second patch will need a tiny fix once we fix Bug 1699028, but otherwise I think we're set for another glance (will follow up on phab).

Deian Stefan

Assignee

Comment 15

•

4 years ago

Kicked off try https://treeherder.mozilla.org/jobs?repo=try&revision=81d33b8e68144428a8e3982af2531694f2bbe928

Deian Stefan

Assignee

Comment 16

•

3 years ago

Attached file Bug 1688452 - Part 2: Retrofit nsHtml5StreamParser to use RLBoxed libexpat (obsolete) (deleted) — Details

Phabricator Automation

Updated

•

3 years ago

Attachment #9198857 - Attachment is obsolete: true

Deian Stefan

Assignee

Updated

•

3 years ago

Depends on: 1715198

Phabricator Automation

Updated

•

3 years ago

Attachment #9225773 - Attachment description: Bug 1688452 - Part 1: Retrofit nsHtml5StreamParser to use RLBoxed libexpat → Bug 1688452 - Part 1: Retrofit nsHtml5StreamParser to use RLBoxed libexpat r=tjr

Phabricator Automation

Updated

•

3 years ago

Attachment #9202274 - Attachment description: Bug 1688452 - Part 2: Retrofit nsExpatDriver to use RLBoxed libexpat → Bug 1688452 - Part 2: Retrofit nsExpatDriver to use RLBoxed libexpat r=tjr

Phabricator Automation

Updated

•

3 years ago

Attachment #9205053 - Attachment description: Bug 1688452 - Part 3: Add Wasm sandbox support for RLBoxed libexpat → Bug 1688452 - Part 3: Add Wasm sandbox support for RLBoxed libexpat r=tjr

Phabricator Automation

Updated

•

3 years ago

Attachment #9209651 - Attachment description: Bug 1688452 - Part 4: Turn on Wasm sandboxing for libexpat → Bug 1688452 - Part 4: Turn on Wasm sandboxing for libexpat r=tjr

Phabricator Automation

Updated

•

3 years ago

Attachment #9225773 - Attachment description: Bug 1688452 - Part 1: Retrofit nsHtml5StreamParser to use RLBoxed libexpat r=tjr → Bug 1688452 - Part 1: Retrofit nsHtml5StreamParser to use RLBoxed libexpat

Phabricator Automation

Updated

•

3 years ago

Attachment #9202274 - Attachment description: Bug 1688452 - Part 2: Retrofit nsExpatDriver to use RLBoxed libexpat r=tjr → Bug 1688452 - Part 2: Retrofit nsExpatDriver to use RLBoxed libexpat

Phabricator Automation

Updated

•

3 years ago

Attachment #9205053 - Attachment description: Bug 1688452 - Part 3: Add Wasm sandbox support for RLBoxed libexpat r=tjr → Bug 1688452 - Part 3: Add Wasm sandbox support for RLBoxed libexpat

Phabricator Automation

Updated

•

3 years ago

Attachment #9209651 - Attachment description: Bug 1688452 - Part 4: Turn on Wasm sandboxing for libexpat r=tjr → Bug 1688452 - Part 4: Turn on Wasm sandboxing for libexpat

Phabricator Automation

Updated

•

3 years ago

Attachment #9209651 - Attachment description: Bug 1688452 - Part 4: Turn on Wasm sandboxing for libexpat → Bug 1688452 - Part 4: Load librlbox in artifact jobs

Phabricator Automation

Updated

•

3 years ago

Attachment #9209651 - Attachment description: Bug 1688452 - Part 4: Load librlbox in artifact jobs → Bug 1688452 - Part 4: Turn on Wasm sandboxing for expat

Phabricator Automation

Updated

•

3 years ago

Attachment #9209651 - Attachment is obsolete: true

Deian Stefan

Assignee

Comment 17

•

3 years ago

Attached file Bug 1688452 - Part 3: Eliminate redundant callback implementation in RLBoxed libexpat (obsolete) (deleted) — Details

The default handler and character-data handler callbacks are identical
and some Windows compilers just reconciled them into a single function.
This, unfortunately, resulted in a RLBox runtime error: the same
callback was registered twice. This patch removes the duplicate handler
implementation and just sets the character-data handler callback as the
default handler.

Depends on D106254

Phabricator Automation

Updated

•

3 years ago

Attachment #9242504 - Attachment description: Bug 1688452 - Part 4: Eliminate redundant callback implementation in RLBoxed libexpat → Bug 1688452 - Part 3: Eliminate redundant callback implementation in RLBoxed libexpat

Phabricator Automation

Updated

•

3 years ago

Attachment #9205053 - Attachment description: Bug 1688452 - Part 3: Add Wasm sandbox support for RLBoxed libexpat → Bug 1688452 - Part 4: Add Wasm sandbox support for RLBoxed libexpat

Deian Stefan

Assignee

Comment 18

•

3 years ago

Latest try https://treeherder.mozilla.org/jobs?repo=try&revision=f7f5e3a81d68ef2f5ce53d0b2ca74a498c7df552&selectedTaskRun=bhap1EX9RImo6ljpfEoDGg.0

Deian Stefan

Assignee

Updated

•

3 years ago

Flags: needinfo?(peterv)

Pulsebot

Comment 19

•

3 years ago

Pushed by tritter@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/23ef68478e93 Part 1: Retrofit nsHtml5StreamParser to use RLBoxed libexpat r=tjr,peterv https://hg.mozilla.org/integration/autoland/rev/00fd325069fa Part 2: Retrofit nsExpatDriver to use RLBoxed libexpat r=tjr,peterv https://hg.mozilla.org/integration/autoland/rev/797a7e243d43 Part 3: Eliminate redundant callback implementation in RLBoxed libexpat r=tjr https://hg.mozilla.org/integration/autoland/rev/1a720cffc019 Part 4: Add Wasm sandbox support for RLBoxed libexpat r=tjr

Pulsebot

Comment 20

•

3 years ago

Backout by csabou@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/ab54fa20c599 Backed out 4 changesets for assertion and bc failures on browser_translation_bing.js.

Cosmin Sabou [:CosminS]

Comment 21

•

3 years ago

Backed out for causing several test failures.

Push with failures: https://treeherder.mozilla.org/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception&fromchange=1a720cffc019aff6f92a974b681e7190199406a0&tochange=06aba176b93a62cb358eb2d9b7062c104a411a3a&selectedTaskRun=BM0T3vYfS_6WraQa587lMg.0

Failure logs:

Backout link: https://hg.mozilla.org/integration/autoland/rev/ab54fa20c599a000a04361c6ab6674c2bdaac987

Flags: needinfo?(deian)

Alexandru Ionescu (needinfo me) [:alexandrui]

Comment 22

•

3 years ago

(In reply to Pulsebot from comment #20)

Backout by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ab54fa20c599
Backed out 4 changesets for assertion and bc failures on
browser_translation_bing.js.

== Change summary for alert #31711 (as of Wed, 06 Oct 2021 06:38:13 GMT) ==

Regressions:

Ratio	Test	Platform	Options	Absolute values (old vs new)
2%	google-docs fnbpaint	linux1804-64-shippable-qr	warm webrender	257.02 -> 262.75

Improvements:

Ratio	Test	Platform	Options	Absolute values (old vs new)
2%	google-docs fnbpaint	linux1804-64-shippable-qr	warm webrender	261.82 -> 255.94

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=31711

Alexandru Ionescu (needinfo me) [:alexandrui]

Updated

•

3 years ago

Keywords: perf-alert

Deian Stefan

Assignee

Comment 23

•

3 years ago

With wasmboxing not enabled (just applying first three patches), I think all the windows failures are unrelated to these sets of patches: https://treeherder.mozilla.org/jobs?revision=f62f11e713cf8867b30572c0e22e2698d1332f87&repo=try&selectedTaskRun=Y1Gk7mquRzuSeP8BbReEmg.0

Going to investigate the alternative flags that we set when we enable wasmboxing

Deian Stefan

Assignee

Comment 24

•

3 years ago

Attached file Bug 1688452 - Turn on Wasm sandboxing for libexpat (obsolete) (deleted) — Details

Depends on D106254

Bobby Holley (:bholley)

Comment 25

•

3 years ago

(In reply to Alexandru Ionescu (needinfo me) [:alexandrui] from comment #22)

Regressions:

Ratio Test Platform Options Absolute values (old vs new)

2% google-docs fnbpaint linux1804-64-shippable-qr warm webrender 257.02 -> 262.75

Improvements:

Ratio Test Platform Options Absolute values (old vs new)

2% google-docs fnbpaint linux1804-64-shippable-qr warm webrender 261.82 -> 255.94

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=31711

Bas, could use your insight on this. This patch compiles libexpat to WASM to sandbox it. It was backed out, which caused an associated regression (which reverted post-backout). Round-tripping through WASM definitely adds some overhead, and this regression would suggest that libexpat is on the hot path. However, it's a bit odd to me that the regression only surfaced on linux (and only on gdocs). Do you have a sense of how concerned we should be and how much investigation we should do? What next steps would you recommend?

Sandboxing expat is certainly not required, but it would be nice.

Flags: needinfo?(bas)

Bobby Holley (:bholley)

Comment 26

•

3 years ago

OK nevermind, it's actually straightforward to see non-trivial libexpat usage by just loading a regular gdoc (no need to mess around with automation).

STR:
(1) Run a shippable build
(2) Start the firefox profiler
(3) Open (or shift+reload) https://docs.google.com/document/d/1TNnya6B8pyomDK2F1R9CL3dY10OAmqWlnCxsWyOBDVQ/edit#heading=h.gjgriz8s7dlc
(4) capture profile
(5) Filter samples by MOZ_XML_PARSE

On Nightly OSX, I get around 45 milliseconds spent in expat for SVG parsing. That's pretty significant, and adding normal WASM overhead to that may not be acceptable. That said, a bunch of the work is in callbacks, so the true overhead might be less.

To do a proper comparison, I backfilled shippable OSX builds for the revision this landed, and the one prior:
https://treeherder.mozilla.org/jobs?repo=autoland&revision=1a720cffc019aff6f92a974b681e7190199406a0&searchStr=OSX%2Cshippable
https://treeherder.mozilla.org/jobs?repo=autoland&revision=aa67c5a8e93c72482b4204fe401bef662987cfbe&searchStr=OSX%2Cshippable

When those complete, I'll profile them side by side to get a better picture.

Flags: needinfo?(bas)

Bobby Holley (:bholley)

Comment 27

•

3 years ago

Without rlbox, ~55ms under MOZ_XML_PARSE: https://share.firefox.dev/2Z3CCtU
With rlbox, ~76ms under MOZ_XML_PARSE: https://share.firefox.dev/3aFtyh8

Unfortunately, looks like expat is moderately hot for SVG parsing. :-(

Deian Stefan

Assignee

Comment 28

•

3 years ago

We'll take a look at where the perf overhead is coming from.

Another path: Maybe we can turn it on for nsHtml5StreamParser for in the short term, but not for nsExpatDriver (which we can put behind a flag). The downside is we'd have two copies of the library.

Bobby Holley (:bholley)

Comment 29

•

3 years ago

So I dug into the profiles a bit. Zooming on on the XML parsing in imgRequest::OnDataAvailable (i.e. SVG parsing), it looks the difference is mainly in the callback overhead from the wasmboxed libexpat to nsExpatDriver::Handle{Start,End}Element:

with rlbox, ~19ms: https://profiler.firefox.com/public/jpecm39135gw4spqx6vr3w7dnx8zw2hsj8qtstg/calltree/?globalTrackOrder=80w7&hiddenGlobalTracks=1w6&localTrackOrderByPid=15229-2301~15230-0~15234-0~15397-0~15242-0~15392-0~15237-0~15235-01&thread=9&timelineType=cpu-category&transforms=f-combined-0w859aF9gE7wE9S1OpOtP6WhP4WlwWnWpWr~f-combined-WrXbWuWv&v=6

with rlbox, ~34ms: https://profiler.firefox.com/public/pqrayjcgk7ee9w55zx9m7bbck9fjpkedp1dnn50/calltree/?globalTrackOrder=90w8&hiddenGlobalTracks=1w7&hiddenLocalTracksByPid=15359-0&localTrackOrderByPid=15359-2301~15360-0~15404-0~15414-0~15367-0~15365-0~15361-0~15409-0~15362-01&thread=a&timelineType=cpu-category&transforms=f-combined-0w859ascmorLnIjIkIpToInT8wTbYr~f-combined-YrZcZeZ8Tm&v=6

Seems like there's a bunch of malloc/free at the boundary, both within lookup_symbol and outside of it, as well as some locking. This all strikes me as stuff that can be optimized out. Expat is only ever accessed on the main thread, so the locking shouldn't be necessary, and there's almost certainly a way to design the system such that wasm can repeatedly invoke a callback without mallocing each time.

I was concerned when I first profiled that we were just dealing with the overhead of executing more instructions to implement wasm semantics. But if it's primarily just issues like those described above, I'm much more optimistic that we can make it work.

Shravan Narayan

Comment 30

•

3 years ago

Oh interesting! Yup, the move to static linking will definitely remove lookup_symbol related overheads. Other sources of malloc at the boundary would still have to be addressed though.

Deian Stefan

Assignee

Comment 31

•

3 years ago

Attached file Bug 1688452 - Part 5: Make expat sandbox fallible (obsolete) (deleted) — Details

Depends on D106254

Phabricator Automation

Updated

•

3 years ago

Attachment #9246234 - Attachment description: Bug 1688452 - Part 5: Turn on Wasm sandbox libexpat → Bug 1688452 - Part 6: Turn on Wasm sandbox libexpat

Deian Stefan

Assignee

Comment 32

•

3 years ago

Attached file Bug 1688452 - Part 6: Use UniquePtr for expat sandbox (obsolete) (deleted) — Details

Depends on D130148

Deian Stefan

Assignee

Comment 33

•

3 years ago

Attached file Bug 1688452 - Part 7: Add bump allocator to sandboxed expat (obsolete) (deleted) — Details

Depends on D130238

Deian Stefan

Assignee

Comment 34

•

3 years ago

Attached file Bug 1688452 - Part 8: Use bump allocator in nsExpatDriver::Handle{Start,End}Element (obsolete) (deleted) — Details

Depends on D130239

Phabricator Automation

Updated

•

3 years ago

Attachment #9246234 - Attachment description: Bug 1688452 - Part 6: Turn on Wasm sandbox libexpat → Bug 1688452 - Part 9: Turn on Wasm sandbox libexpat

Bobby Holley (:bholley)

Comment 35

•

3 years ago

I did some more profiling with the static linking changes and the bump allocator. This time I did five refreshes for each profile to smooth out the noise. It appears that if we eliminate the pthread overhead (by stubbing out the synchronization we don't need) as well as the allocation and copying at the boundary (by simply passing the strings directly into the callbacks), we should get within about 1-2% of raw parse performance.

Here are profiles with a bunch of equivalent work filtered out from each, and the aforementioned overhead filtered out from the RLBox profile.

Without RLBox: 98ms (base parse overhead of 282ms) https://share.firefox.dev/3GZY7xd
With RLBox: 103ms (base parse overhead of 333ms) https://share.firefox.dev/3bEZ1AM

Phabricator Automation

Updated

•

3 years ago

Attachment #9248995 - Attachment is obsolete: true

Phabricator Automation

Updated

•

3 years ago

Attachment #9248994 - Attachment is obsolete: true

Phabricator Automation

Updated

•

3 years ago

Attachment #9248993 - Attachment is obsolete: true

Phabricator Automation

Updated

•

3 years ago

Attachment #9248826 - Attachment is obsolete: true

Phabricator Automation

Updated

•

3 years ago

Attachment #9242504 - Attachment is obsolete: true

Phabricator Automation

Updated

•

3 years ago

Attachment #9205053 - Attachment description: Bug 1688452 - Part 4: Add Wasm sandbox support for RLBoxed libexpat → Bug 1688452 - Part 3: Add Wasm sandbox support for RLBoxed libexpat

Phabricator Automation

Updated

•

3 years ago

Attachment #9246234 - Attachment description: Bug 1688452 - Part 9: Turn on Wasm sandbox libexpat → Bug 1688452 - Part 4: Turn on Wasm sandbox libexpat

Bobby Holley (:bholley)

Comment 36

•

3 years ago

PGO build: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f98ba2964b7185b9f6d31d20b063905ef29d8463
auto-try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4fd78fd646a954eea6549e65ac6b45d438cb6ae2

Bobby Holley (:bholley)

Comment 37

•

3 years ago

I did quite a bit of performance measurement on the latest patches (which eliminate all the boundary allocation, copying, and locking, at least on 64-bit). The upshot is that, for the gdocs testcase I measured, RLBox introduces an SVG parsing overhead of about 7% on 64-bit platforms (~80% of our users), and about 20% on 32-bit platforms.

The testcase I focused on was the sprite sheet loaded by gdocs. Gdocs actually loads a number of different sprite sheets. During core page-load, it loads both a small high-priority sheet (5k compressed, 21k uncompressed) and a large low-priority sheet (80k compressed, 335k uncompressed). Then several seconds later (after the doc is largely rendered), it also loads 2 more sheets similar to the large one. The small sheet takes about a millisecond or so to parse, the large sheet takes about 20 milliseconds. Given that the large sheet it marked as low priority, it seems like there we ought to be able to defer parsing it until after initial page load — but that's somewhat out of scope here.

I took measurements by repeatedly loading a testcase 10 times to eliminate noise. I tested on both the gdoc from comment 26, as well as just the sprite sheet — amplified by concatenating the contents 10 times. I did some spot-checking of other test-cases, but nothing exhaustive. The profiles are as follows:

32-bit windows:

Baseline
** loading gdoc (10x): https://share.firefox.dev/3kfVZrm
** loading amplified-sprites (10x): https://share.firefox.dev/3mUGoyT
RLBoxed Expat
** loading gdoc (10x): https://share.firefox.dev/3EY0WwJ
** loading amplified-sprites (10x): https://share.firefox.dev/3o9je7w

64-bit windows:

Baseline
** loading gdoc (10x): https://share.firefox.dev/3H7a5VQ
** loading amplified-sprites (10x): https://share.firefox.dev/3EUn0IA
RLBoxed Expat
** loading gdoc (10x): https://share.firefox.dev/3kiUxV2
** loading amplified-sprites (10x): https://share.firefox.dev/3BWZ4Co

64-bit mac:

Baseline:
** loading amplified-sprites (10x): https://share.firefox.dev/3H6UqFJ
RLBoxed Expat:
** loading amplified-sprites (10x): https://share.firefox.dev/3qyTVPd

Whether or not this overhead is acceptable depends on how frequently users encounter large SVG images. Bas believes that SVG is becoming more important, but that doesn't necessarily mean SVG images. For example, google slides spends much more time in SVG rendering ( https://share.firefox.dev/3qh5gmz ) that google docs, but has identical parse overhead (presumably because all of the slide-specific SVG construction is done programmatically).

I dug into both the tp5 and tp6 pagesets (snapshotted a few years ago), and found very few SVG images. So most widespread cost of shipping this today is likely to be a ~2ms regression loading gsuite — which is probably acceptable, and also probably something we can turn into a 20ms improvement via the priority handling discussed above.

In summary, I think we should ship this.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 38

•

3 years ago

If all the XML parsing is RLBoxed, how much does this affect to initial startup? Frontend has quite a bit xml and in the cases startup cache is empty, all that needs to be parsed.

Bobby Holley (:bholley)

Comment 39

•

3 years ago

(In reply to Olli Pettay [:smaug] from comment #38)

If all the XML parsing is RLBoxed, how much does this affect to initial startup? Frontend has quite a bit xml and in the cases startup cache is empty, all that needs to be parsed.

I believe we have cold startup perf tests running in CI, so the fact that we didn't see them in comment 22 (when the RLBoxing overhead of these patches was about 5 times higher than it is now) suggests we're probably ok. I suspect the total volume of actual semantic XML is quite a bit less than the 335k SVG sprite sheet (but definitely let me know if I'm off-base).

Deian Stefan

Assignee

Updated

•

3 years ago

Depends on: 1740624

Phabricator Automation

Updated

•

3 years ago

Attachment #9202274 - Attachment description: Bug 1688452 - Part 2: Retrofit nsExpatDriver to use RLBoxed libexpat → Bug 1688452 - Retrofit nsExpatDriver to use RLBoxed libexpat

Phabricator Automation

Updated

•

3 years ago

Attachment #9205053 - Attachment is obsolete: true

Phabricator Automation

Updated

•

3 years ago

Attachment #9225773 - Attachment description: Bug 1688452 - Part 1: Retrofit nsHtml5StreamParser to use RLBoxed libexpat → Bug 1688452 - Retrofit nsHtml5StreamParser to use RLBoxed libexpat

Phabricator Automation

Updated

•

3 years ago

Attachment #9246234 - Attachment description: Bug 1688452 - Part 4: Turn on Wasm sandbox libexpat → Bug 1688452 - Turn on Wasm sandboxing for libexpat

Phabricator Automation

Updated

•

3 years ago

Attachment #9202274 - Attachment description: Bug 1688452 - Retrofit nsExpatDriver to use RLBoxed libexpat → Bug 1688452 - Part 1: Retrofit nsExpatDriver to use RLBoxed libexpat

Phabricator Automation

Updated

•

3 years ago

Attachment #9225773 - Attachment description: Bug 1688452 - Retrofit nsHtml5StreamParser to use RLBoxed libexpat → Bug 1688452 - Part 2: Retrofit nsHtml5StreamParser to use RLBoxed libexpat

Phabricator Automation

Updated

•

3 years ago

Attachment #9246234 - Attachment description: Bug 1688452 - Turn on Wasm sandboxing for libexpat → Bug 1688452 - Part 3: Turn on Wasm sandboxing for libexpat

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 40

•

3 years ago

Does this affect innerHTML speed on XHTML documents?
Found an old test http://mozilla.pettay.fi/moztests/innerhtml_nonsimple_nostyle_test.xhtml
xhtml is already quite a bit slower than html http://mozilla.pettay.fi/moztests/innerhtml_nonsimple_nostyle_test.html so I'd rather not slow it even more. But maybe we're lucky and RLBox somehow speeds up memory accesses or something :)

Bobby Holley (:bholley)

Updated

•

3 years ago

Depends on: 1741219

Bobby Holley (:bholley)

Comment 41

•

3 years ago

(In reply to Bobby Holley (:bholley) from comment #37)

I dug into both the tp5 and tp6 pagesets (snapshotted a few years ago), and found very few SVG images. So most widespread cost of shipping this today is likely to be a ~2ms regression loading gsuite — which is probably acceptable, and also probably something we can turn into a 20ms improvement via the priority handling discussed above.

This is now tracked in bug 1741224.

I'll profile startup and the testcases from comment 40 when I get a minute.

Phabricator Automation

Updated

•

3 years ago

Attachment #9225773 - Attachment is obsolete: true

Phabricator Automation

Updated

•

3 years ago

Attachment #9202274 - Attachment description: Bug 1688452 - Part 1: Retrofit nsExpatDriver to use RLBoxed libexpat → Bug 1688452 - Retrofit nsExpatDriver to use RLBoxed libexpat

Phabricator Automation

Updated

•

3 years ago

Attachment #9246234 - Attachment description: Bug 1688452 - Part 3: Turn on Wasm sandboxing for libexpat → Bug 1688452 - Turn on Wasm sandboxing for libexpat

Bobby Holley (:bholley)

Comment 42

•

3 years ago

(In reply to Olli Pettay [:smaug] from comment #38)

If all the XML parsing is RLBoxed, how much does this affect to initial startup? Frontend has quite a bit xml and in the cases startup cache is empty, all that needs to be parsed.

I profiled startup with the patches. It appears that, even for a fresh profile, all of the browser chrome is already read from the prototype cache (presumably we do this as part of the build now), so we don't hit expat.

We do spend about 7 ms parsing SVG, which suggests this might introduce a sub-millisecond impact to startup.

(In reply to Olli Pettay [:smaug] from comment #40)

Does this affect innerHTML speed on XHTML documents?
Found an old test http://mozilla.pettay.fi/moztests/innerhtml_nonsimple_nostyle_test.xhtml
xhtml is already quite a bit slower than html http://mozilla.pettay.fi/moztests/innerhtml_nonsimple_nostyle_test.html so I'd rather not slow it even more. But maybe we're lucky and RLBox somehow speeds up memory accesses or something :)

I profiled this as well, and as a matter of fact, the testcase does get slightly faster with rlbox — probably because mallocing in the sandbox is cheaper and there's not much real parsing work to be done.

Shravan Narayan

Updated

•

3 years ago

Depends on: 1741525

Deian Stefan

Assignee

Updated

•

3 years ago

Depends on: 1738550

Deian Stefan

Assignee

Updated

•

3 years ago

Depends on: 1741552

Deian Stefan

Assignee

Updated

•

3 years ago

Depends on: 1741115
No longer depends on: 1741552

Deian Stefan

Assignee

Updated

•

3 years ago

Component: Security: Process Sandboxing → Security: RLBox

Flags: needinfo?(deian)

Deian Stefan

Assignee

Updated

•

3 years ago

Blocks: 1741995

Phabricator Automation

Comment 43

•

3 years ago

Comment on attachment 9246234 [details]
Bug 1688452 - Turn on Wasm sandboxing for libexpat

Revision D128652 was moved to bug 1741995. Setting attachment 9246234 [details] to obsolete.

Attachment #9246234 - Attachment is obsolete: true

Deian Stefan

Assignee

Comment 44

•

3 years ago

Try without turning on sandbox: https://treeherder.mozilla.org/jobs?repo=try&revision=213f71b0f39c7a1c3b6fb91234e90e7365f9202
Try with it enabled: https://treeherder.mozilla.org/#/jobs?repo=try&revision=489812dc8dacebaa84e31937b0c20dedbb1a16af

Pulsebot

Comment 45

•

3 years ago

Pushed by bholley@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/7893bbd002e0 Retrofit nsExpatDriver to use RLBoxed libexpat r=tjr,peterv,bholley,glandium

Alexandru Michis [:malexandru]

Comment 46

•

3 years ago

Backed out changeset 7893bbd002e0 (Bug 1688452) for causing bustages in nsRLBoxExpatDriver.
Backout link: https://hg.mozilla.org/integration/autoland/rev/c73bcc465e0c2bce7debb0a86277e2dcb27444e4
Push with failures, failure log.

Flags: needinfo?(deian)

Pulsebot

Comment 47

•

3 years ago

Pushed by bholley@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/4294063f1606 Retrofit nsExpatDriver to use RLBoxed libexpat r=tjr,peterv,bholley,glandium

Deian Stefan

Assignee

Updated

•

3 years ago

Flags: needinfo?(deian)

Sandor Molnar

Comment 48

•

3 years ago

Backed out for causing mochitest failures in test_blockParsing & wpt failures

Backout link: https://hg.mozilla.org/integration/autoland/rev/72b186a7935f3cf4c2c6d7cbf10fde98523b48f1

Push with failures

Failure log mochitest

Failure log wpt

Flags: needinfo?(deian)

Pulsebot

Comment 49

•

3 years ago

Pushed by bholley@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/734e90086444 Retrofit nsExpatDriver to use RLBoxed libexpat r=tjr,peterv,bholley,glandium

Iulian Moraru

Comment 50

•

3 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/734e90086444

Status: NEW → RESOLVED

Closed: 3 years ago

status-firefox96: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 96 Branch

Deian Stefan

Assignee

Updated

•

3 years ago

Flags: needinfo?(deian)

Mike Hommey [:glandium]

Updated

•

3 years ago

Blocks: 1758626

You need to log in before you can comment on or make changes to this bug.