Closed Bug 1673497 Opened 4 years ago Closed 4 years ago

Warp: Eliminate bailout loops

Categories

(Core :: JavaScript Engine: JIT, task, P1)

task

Tracking

()

RESOLVED FIXED
86 Branch
Tracking Status
firefox86 --- fixed

People

(Reporter: iain, Assigned: iain)

References

(Blocks 1 open bug)

Details

Attachments

(26 files)

(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details
(deleted), text/x-phabricator-request
Details

One significant improvement from Ion to Warp was a reduction in the frequency of bailout / invalidation loops. We should move as far in that direction as possible, with the eventual goal of being able to assert that we don’t bail out an unbounded number of times without taking some sort of action to improve the situation. (An example of action would be attaching a new baseline IC stub, invalidating the Warp script, and eventually recompiling with more conservative IC information.)

Right now we have a list of dozens of bailout kinds, which represent the kind of operation that failed and caused us to bail out. In most cases, they are assigned during lowering, based on the type of MIR node. A large majority of them have exactly the same behaviour. We should make bailout kinds represent the action we want to take if the instruction bails out, rather than the operation that failed. For example: if a GuardShape generated by the transpiler fails, we want to bail out and let the try-attach logic invalidate the script. If that GuardShape is hoisted out of a loop by LICM, we might instead want to disable LICM.

More detailed thoughts here.

Keywords: leave-open
Pushed by iireland@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/cb0510bea446 Part 1: Invalidate Warp when failing to attach for a transpiled IC r=jandem
Pushed by iireland@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a7e8de3bfbca Part 1: Invalidate Warp when failing to attach for a transpiled IC r=jandem https://hg.mozilla.org/integration/autoland/rev/7dad14e6de1a Part 2: Cancel off-thread compiles when invalidating Warp r=jandem
Flags: needinfo?(iireland)

This patch removes some dead code so that I don't have to modify it in subsequent patches.

(I'm also reordering the members of LSnapshot to improve packing.)

This patch adds a bailout kind field to MDefinition. Subsequent patches will populate that field. Once every node that can generate a snapshot has a known bailout kind, there will be one more patch to propagate the bailout kind from the MDefinition to the snapshot during lowering.

The resultType_ field was moved to improve packing.

Depends on D95786

Some MIR nodes already had a bailoutKind_ field. To avoid confusion, this patch rewrites those instructions to use the bailoutKind_ field in the MInstruction.

MUnbox had 6 dedicated bailout kinds, none of which did anything interesting. I compressed them into a single kind here to make room to define new kinds without running out of bits in the snapshot header.

Depends on D95787

Depends on D95788

Warp currently only generates speculative phi nodes in OSR cases, but once this code lands we can be a bit more aggressive.

Depends on D95789

This patch sets the bailout kind for conversion instructions generated by type policies. In a future patch, I will add invalidate+disable code based on the relative frequency of bailouts vs script entries.

In practice, almost all of the instructions generated in this way use CallPolicy (ensuring that the callee is an object) or ObjectPolicy.

The only exception is ArithPolicy, which can occasionally generate an MToNumberInt32. In the case I looked at, this happened when the MAdd created by InitElemInc depended on an OSR phi, and we had disabled OSR phi specialization. In the long run, I hope all the type policy code can be simplified and driven by YAML. When we do that, we may want to wait until after phi specialization to enforce type policies, to avoid inserting unnecessary unboxes.

The frequent bailout code will be reworked in a later patch. Until then, to avoid regressions, it still disables LICM.

Depends on D96488

Depends on D96489

A grab-bag of places where we create new nodes and there's an obvious predecessor to copy the bailout kind from.

Depends on D96490

Some bailout kinds already have special handling. In preparation for the next patch, this patch moves that code out of lowering and into the constructor.

Depends on D96491

After the previous patches, every MIR node in Warp that can bail out is being assigned a bailout kind. This patch finally flips the switch and assigns snapshots based on the bailout kind in the MIR node, instead of hardcoding it based on the kind of MIR node.

Depends on D96492

Pushed by iireland@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/f242454db37a Part 3: Minor cleanups r=jandem https://hg.mozilla.org/integration/autoland/rev/373823b41fe1 Part 4: Store BailoutKind in MIR nodes r=jandem https://hg.mozilla.org/integration/autoland/rev/8aa8e49802eb Part 5: Remove existing BailoutKind members from MIR nodes r=jandem https://hg.mozilla.org/integration/autoland/rev/7dceebd8ae2f Part 6: Add BailoutKind::TranspiledCacheIR r=jandem https://hg.mozilla.org/integration/autoland/rev/279eb8cf0dcd Part 7: Add BailoutKind::SpeculativePhi r=jandem
Pushed by iireland@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e0bd160cd356 Part 8: Add BailoutKind::TypePolicy r=jandem https://hg.mozilla.org/integration/autoland/rev/70be3f74d252 Part 9: Add BailoutKind::LICM r=jandem https://hg.mozilla.org/integration/autoland/rev/919c2f5620af Part 10: Add BailoutKind::HoistBoundsCheck r=jandem https://hg.mozilla.org/integration/autoland/rev/02b14e1ba195 Part 11: Propagate bailout kinds during optimization r=jandem https://hg.mozilla.org/integration/autoland/rev/239734791622 Part 12: Hardcode bailout kinds for some nodes r=jandem https://hg.mozilla.org/integration/autoland/rev/6c51def94269 Part 13: Initialize snapshot bailout kind from MIR while lowering r=jandem
Regressions: 1676639
Regressions: 1676631
Regressions: 1678709

This landed a bit before the merge. Given there are some regressions, how do you feel about backing out from beta?

Flags: needinfo?(iireland)

Seems like it will be easier/safer than uplifting fixes.

Flags: needinfo?(iireland)

These patches have caused a few performance regressions on beta (bug 1678391, bug 1677782, bug 1678709). It is easier and safer to back them out than to try uplifting fixes.

This rolls back parts 4-13.

Parts 1-3 were preparatory cleanup. Parts 4-12 mostly have no effect without part 13 (which propagates bailout kinds from MIR to LIR), but there are some potential interactions (like LICM changing the bailout kind of an MUnbox, which already stored the bailout kind in MIR), so it seemed safer to roll back the entire thing.

Comment on attachment 9190411 [details]
Bug 1673497: Back out on beta r=jandem

Beta/Release Uplift Approval Request

  • User impact if declined: Performance regressions (see bug 1678391, bug 1677782, bug 1678709)
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This patch backs out work in progress and restores the previous code, which is well tested.
  • String changes made/needed:
Attachment #9190411 - Flags: approval-mozilla-beta?

Comment on attachment 9190411 [details]
Bug 1673497: Back out on beta r=jandem

Reverts us back to the pre-landing state to resolve performance regressions in 84. Approved for 84.0b7.

Attachment #9190411 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment on attachment 9190411 [details]
Bug 1673497: Back out on beta r=jandem

Landed on Beta:
https://hg.mozilla.org/releases/mozilla-beta/rev/bb8c52ad9943

Clearing the approval flag to get this bug off the needs-uplift radar.

Attachment #9190411 - Flags: approval-mozilla-beta+
Pushed by iireland@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/63a3451d9bcc Part 14: Remove dead BailoutKinds. r=jandem

This code was originally added to support ConstStringSplit. Later, we removed ConstStringSplit because it didn't do anything useful, so this code is dead.

(This patch originally simplified some of the bailout work. Now that we're intending to eagerly invalidate warp scripts at the beginning of the fallback, it's just a driveby fix.)

In a future patch, we will add special behaviour for this BailoutKind to avoid triggering spurious assertions.

Depends on D99164

GetPropertyPure was added in bug 1127167, to avoid messiness involving the new script properties analysis and unboxed objects. We don't have to worry about that now, and we've already verified that newTarget is a function with a non-configurable prototype.

When I initially wrote this patch, it fixed a double bailout in basic/newTargetRectifier.js, where we weren't invalidating the warp script because we returned TemporarilyUnoptimizable. Now that we are going to eagerly invalidate at the beginning of DoCallFallback, it's just a driveby cleanup.

Note: after this patch, the only remaining use of TemporarilyUnoptimizable is in tryAttachCallScripted, where we use it to avoid allocating template objects for cold constructors. It would be nice if we could get rid of it entirely.

Depends on D99165

Driveby cleanup. It hasn't had a failure path since bug 1535092.

Depends on D99166

Pushed by iireland@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e6182530443f Part 15: Remove vestigial support for deferred call stubs r=jandem https://hg.mozilla.org/integration/autoland/rev/30a4a3beea91 Part 16: Add BailoutKind::OnStackInvalidation r=jandem https://hg.mozilla.org/integration/autoland/rev/083fbea0812e Part 17: Resolve lazy prototype in getThisForScripted r=jandem https://hg.mozilla.org/integration/autoland/rev/ecdf083be70b Part 18: Make GetElemOptimizedArguments infallible r=jandem
Regressions: 1681806

One major reason for invalidating warp in the baseline fallback (instead of when bailing out) was to help guarantee that we don't recompile without changing anything. However, some of the changes necessary to make that assertable (for example, unlinking the other 0-entry-count stubs to stop them from swallowing a bailout) were causing performance problems. Instead, we're going to use a checksum-based approach to assert that we don't recompile the same thing.

Our previous approach to LICM bailouts relied on the baseline fallback invalidations, so I've rewritten it a bit.

The next patch will add replacement code. I split this up to make the diff clearer.

Depends on D100826

I'm arbitrarily setting the unfixable invalidation threshold at 5x the normal threshold. We can tune it later if necessary. In practice, I browsed around on CNN/Facebook/gdocs/gmail for a few minutes and only hit a single time, so it probably doesn't matter much.

Depends on D100827

ToPropertyKey converts -0 to 0. WarpCacheIRTranspiler::emitGuardToInt32Index should set the flag on MToNumberInt32 so that we unconditionally skip the negative zero check. This matches the comments in maybeGuardInt32Index and CacheIRCompiler::emitGuardToInt32Index.

I'm also changing the name of the flag on MToNumberInt32 from canBeNegativeZero to needsNegativeZeroCheck, which is less ambiguous.

Depends on D100828

Depends on D100829

This will catch cases where we invalidate expecting to recompile with different IC data, but don't change the IC entries.

It should enable the fuzzer to catch any cases where transpiled CacheIR bails out, but the baseline IC does not fail a guard.

The hash is 32 bits. I don't expect collisions to be a problem, especially since the common case will involve rehashing most of the same data with a few differences, and if our hash functions collide on inputs like that, we have bigger problems. If we do start seeing collisions, we can hash twice (initializing one of the hashes with a seed of some sort) to create a 64-bit hash.

This testcase popped up during fuzzing. With --ion-eager and --no-threads:

  1. test calls itself recursively until we overflow the stack.
  2. We bail out into the baseline interpreter to catch the exception.
  3. In the baseline interpreter, we try to execute /a/.test("a").
  4. This calls into self-hosted code, ending up in RegExpBuiltinExec, which calls RegExpTester.
  5. We have never executed the regexp, so we try to parse it. The stack is still full, so we overflow.
  6. We catch the exception in the caller's frame and try to execute the regex again.
  7. We warp-compile RegExpBuiltinExec. Everything after RegExpTester is an MBail.
  8. Because the result of `RegExpTester is unused, and it can be recovered, DCE eliminates it.
  9. We execute in Warp and bail at the first instruction after RegExpTester.
  10. We repeat steps 6 and 7. No CacheIR has changed since our last bailout, so we assert.

One of the necessary elements of this bug is the existence of non-deterministic exceptions, where baseline throws and Warp doesn't. I think the best place to fix this bug is by disabling the assertion if we have ever seen one. It's hard to imagine a bailout loop we would care about that depends on OOM or stack overflow.

Depends on D101101

Pushed by iireland@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/4c650d13ccda Part 19: Invalidate warp lazily r=jandem https://hg.mozilla.org/integration/autoland/rev/cd440be00b8c Part 20: Remove old bailout->invalidation code r=jandem https://hg.mozilla.org/integration/autoland/rev/2129ae8e9459 Part 21: Add BailoutAction r=jandem https://hg.mozilla.org/integration/autoland/rev/6b0fac1de385 Part 22: Don't bail on -0 for GuardToInt32Index r=jandem https://hg.mozilla.org/integration/autoland/rev/f1172191a09e Part 23: Add an [SMDOC] comment r=jandem
Pushed by iireland@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/64183a554e90 Part 24: Hash the ICScript r=jandem https://hg.mozilla.org/integration/autoland/rev/d005f536e24c Part 25: Don't assert for stack overflow or OOM r=jandem
Keywords: leave-open

Oops. These patches exposed a pre-existing problem (see bug 1684085). Forgot to verify that the fix for that bug had landed. I just landed that fix; I will reland this one shortly.

Flags: needinfo?(iireland)
Pushed by iireland@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/534ee223b955 Part 19: Invalidate warp lazily r=jandem https://hg.mozilla.org/integration/autoland/rev/7ca4e9290cf5 Part 20: Remove old bailout->invalidation code r=jandem https://hg.mozilla.org/integration/autoland/rev/a337888b836d Part 21: Add BailoutAction r=jandem https://hg.mozilla.org/integration/autoland/rev/04b30c153cf8 Part 22: Don't bail on -0 for GuardToInt32Index r=jandem https://hg.mozilla.org/integration/autoland/rev/efb3854428dd Part 23: Add an [SMDOC] comment r=jandem https://hg.mozilla.org/integration/autoland/rev/3434c588c71f Part 24: Hash the ICScript r=jandem https://hg.mozilla.org/integration/autoland/rev/7925dd49e9a2 Part 25: Don't assert for stack overflow or OOM r=jandem

Bleh, bug 1684085 was backed out. We can figure this out on Monday.

Flags: needinfo?(iireland)
Pushed by iireland@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/2650cedbdcc5 Part 19: Invalidate warp lazily r=jandem https://hg.mozilla.org/integration/autoland/rev/9d1c5fbfd806 Part 20: Remove old bailout->invalidation code r=jandem https://hg.mozilla.org/integration/autoland/rev/26b00e8e7ac0 Part 21: Add BailoutAction r=jandem https://hg.mozilla.org/integration/autoland/rev/114c93f587d1 Part 22: Don't bail on -0 for GuardToInt32Index r=jandem https://hg.mozilla.org/integration/autoland/rev/c3865dea2f48 Part 23: Add an [SMDOC] comment r=jandem https://hg.mozilla.org/integration/autoland/rev/fe222711a158 Part 24: Hash the ICScript r=jandem https://hg.mozilla.org/integration/autoland/rev/febd0fad0733 Part 25: Don't assert for stack overflow or OOM r=jandem
Regressions: 1686207
Regressions: 1687672
Regressions: 1692857
Regressions: 1699056
Regressions: 1694600
Regressions: 1703817
Regressions: 1713579
Regressions: 1713567
Regressions: 1723943
Regressions: 1735157
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: