Closed Bug 914255 Opened 11 years ago Closed 10 years ago

Repeated bailouts when passing new objects

Categories

(Core :: JavaScript Engine, defect)

Priority: Not set
Severity: normal

Tracking


RESOLVED FIXED
mozilla36

People

(Reporter: yves.gwerder, Assigned: h4writer)

References

(Depends on 2 open bugs, Blocks 2 open bugs)

Details

Attachments

(4 files, 1 obsolete file)

Attached file A test to reproduce the issue (deleted) —
The attached test script results in four bailouts with the following signature:

[Bailouts]  bailing from bytecode: getaliasedvar, MIR: typebarrier [52], LIR: typebarrier [57]
[BaselineBailouts]       Resuming after pc offset 50 (op lambda) (line 12) of tests/getaliasedvar_bailout.js:6

... and three bailouts with the following signature:

[Bailouts]  bailing from bytecode: nop, MIR: constant [0], LIR: label [0]
[BaselineBailouts]       Resuming at pc offset 0 (op zero) (line 8) of tests/getaliasedvar_bailout.js:6

I assume it bails each time it gets a new object. The number of bailouts depends on how many new objects you pass into
Map.test(). I also tested it with 10 objects, which causes 10 bailouts with the first signature and 9 with the second.
It should figure out that the function is used with different objects and stop bailing.
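A minimal sketch of the scenario described above (hypothetical code, not the attached test script): the same function is called with a brand-new object on every iteration, so the callee's observed type set grows by one entry per call until the engine widens it — which is what produces one bailout per new object here.

```javascript
// Hypothetical reduction (not the attached test): `area` is called with
// a fresh object each iteration. In SpiderMonkey's type inference,
// objects created by distinct constructors get distinct type objects,
// so the observed type set of `shape` grows by one entry per call until
// the set is widened to "any object".
function area(shape) {
  return shape.w * shape.h;
}

const results = [];
for (let i = 1; i <= 4; i++) {
  // A fresh constructor per iteration guarantees a distinct object
  // type, mimicking "a new object each turn".
  const Ctor = function (w, h) { this.w = w; this.h = h; };
  results.push(area(new Ctor(i, i)));
}
// results: [1, 4, 9, 16]
```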

I initially discovered this issue because there's a function in our code that runs a lot slower in v26 compared with v1.8.5.
I hope this is the reason for it. There's a graph showing the performance difference of the affected function here:
http://www.wildfiregames.com/forum/index.php?app=core&module=attach&section=attach&attach_rel_module=post&attach_id=6014

In our case the object is a new object each time because it's passed as a structured clone in each simulation turn of the game.
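For illustration, a small example (assumes a runtime with the global `structuredClone`, e.g. Node 17+ or a browser; the game's actual cloning path may differ): a structured clone is always a brand-new object graph, so the consuming code never sees the same object identity twice, even when the data is unchanged.

```javascript
// structuredClone always returns a fresh object graph, so even an
// unchanged game state arrives as a never-before-seen object each turn.
const state = { units: [{ hp: 100 }], turn: 0 };
const clone = structuredClone(state);

const sameIdentity = clone === state;                      // new object
const sameData = clone.units[0].hp === state.units[0].hp;  // same data
```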

Also check these two posts for more information:
http://www.wildfiregames.com/forum/index.php?showtopic=17289&st=100#entry273563
http://www.wildfiregames.com/forum/index.php?showtopic=17289&st=100#entry273741
Blocks: 897962
Attachment #801655 - Attachment mime type: application/octet-stream → text/javascript
So we can keep up to 31 objects in a Type. That means we will recompile this code 31 times before we decide it is time to let any object fly through. I know this is handy for inlining, but there we are limited to a maximum of 5 objects.

- Do we have any other place where we benefit from having so many objects in a Type? Otherwise we should consider making them AnyObject sooner.
- Or make it possible for IonMonkey to see when it doesn't make a difference (i.e. where we don't depend on the different objects in the Type) and only request recompiles when a different MIRType is added, but not when a new object is added...

Brian, what are your thoughts on this?
Flags: needinfo?(bhackett1024)
(In reply to Hannes Verschore [:h4writer] from comment #1)
> So we can keep up to 31 objects in a Type. That means we will recompile
> this code 31 times before we decide it is time to let any object fly
> through. I know this is handy for inlining, but there we are limited to a
> maximum of 5 objects.
> 
> - Do we have any other place where we benefit from having so many objects
> in a Type? Otherwise we should consider making them AnyObject sooner.
> - Or make it possible for IonMonkey to see when it doesn't make a
> difference (i.e. where we don't depend on the different objects in the
> Type) and only request recompiles when a different MIRType is added, but
> not when a new object is added...
> 
> Brian, what are your thoughts on this?

I think limiting the number of type objects in a set to five should be fine.  I don't think I've ever seen a benefit to having more, and with baseline-provided information and checks hoisted in Ion I doubt the large type sets are buying anything.  And, as you point out, they sometimes have a significant cost.
Flags: needinfo?(bhackett1024)
Attached patch temporary test-patch by h4writer (obsolete) (deleted) — Splinter Review
I've tested the attached test-patch with 0 A.D.
In addition to that I had to set alwaysPreserveCode(true) in Runtime.cpp to work around garbage collection of the generated JIT code.

The bailout of the type "Resuming at pc offset 1966 (op getgname) (line 136) of simulation/ai/aegis/map-module.js:39" occurred 46 times before these changes and now only occurs 8 times.
Is the limit 2 per compartment, or why did it bail 8 times instead of 2? Two per compartment would make sense because it was a 4-player replay.
There were 24 bailouts in total in map-module.js:39, counting all types of bailouts.

The strange thing was that it did not improve performance at all in release mode.
On the graph there were still more than 50 peaks visible at the same place as before, even though there were only 24 bailouts.

For me there are two possible explanations for this:
 1. Either the patch or alwaysPreserveCode(true) does not work as expected in release mode. I couldn't find any obvious #ifdef statements that could cause this though.
 2. It now runs more in ion code but the ion code isn't faster than the interpreter code in this case.

I wanted to check it with a trace-log because this should work in release mode and should tell if the code runs in interpreter mode or in ion mode. Unfortunately this only caused lots of errors. I think the problem could be that the code now uses multiple compartments again.
Any other ideas how I could check that?
For the record - the tracelogging works now. Here are the results:
simulation/ai/aegis/map-module.js:39 	130 	22 	25.75%	interpreter run: 0.00%, baseline run: 0.29%, ion compile: 0.63%, ion run: 99.08%,
I've attached the full html log here: http://www.wildfiregames.com/forum/index.php?showtopic=17289&st=100#entry273741
I could reduce the number of objects we keep to 7 before seeing any regression. I still have to find out why we can't lower it to 5, since I only know of places that use a maximum of 5 objects at a time...
jspdf.js:14472 (fontLoaderBind) keeps bailing due to this issue, so pdf.js might see a speed increase once this lands. I tested the version with a maximum of 7 and got an increase of 3%.
Blocks: 807162
Lowers the limit to 7 objects before we switch to AnyObject, to reduce recompiles.

Tested again; the improvement is pretty small for pdf.js. I see no movement with the background compiler and a 1.5% gain without it. With bug 932800 checked in I see the predicted 4% without the background compiler.
Assignee: general → hv1989
Attachment #825218 - Flags: review?(bhackett1024)
Attachment #825218 - Flags: review?(bhackett1024) → review+
https://hg.mozilla.org/mozilla-central/rev/8408cc15ce95
Status: UNCONFIRMED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
> - Do we have any other place where we benefit of having so many objects in a type.

The 4% Dromaeo DOM regressions from this change say yes: when we're walking a DOM tree, we still want to compile optimized DOM property accesses.  In particular, I see 15-25% regressions on the dom-traverse test on all platforms:

http://perf.snarkfest.net/compare-talos/breakdown.html?oldTestIds=31104205&newTestIds=31104803&testName=dromaeo_dom 
http://perf.snarkfest.net/compare-talos/breakdown.html?oldTestIds=31102929&newTestIds=31104449&testName=dromaeo_dom
http://perf.snarkfest.net/compare-talos/breakdown.html?oldTestIds=31103223&newTestIds=31104745&testName=dromaeo_dom

and some regressions to query.html.

This is why I really want us to be able to track subtyping relations in TI...

Note also bug 827404.
Flags: needinfo?(hv1989)
Backout:

https://hg.mozilla.org/integration/mozilla-inbound/rev/1bec2044db5c

So for this to work again we need bug 827404 fixed first.
Status: RESOLVED → REOPENED
Depends on: 827404
Ever confirmed: true
Flags: needinfo?(hv1989)
Resolution: FIXED → ---
Depends on: 934433
https://hg.mozilla.org/mozilla-central/rev/1bec2044db5c
Status: REOPENED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Forgot to annotate with [leave open], since the issue was backed out. So this isn't resolved yet.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [leave open]
This is also an issue on octane-typescript. Fixing it would give us a 10% increase on that benchmark.
Hannes, 10% on TypeScript sounds pretty good. Do you have this on your radar still?
Flags: needinfo?(hv1989)
OS: Linux → All
Hardware: x86_64 → All
Target Milestone: mozilla28 → ---
It is definitely on my radar. It is one of my P1 performance bugs; I just haven't found the time yet. Last week I was finally able to focus exclusively on performance issues, so hopefully this continues and I can try this again in one of the coming weeks!
Flags: needinfo?(hv1989)
So I adjusted TypeSet to go from
specific objects -> objects with a particular clasp -> any object

As a result, if we deoptimize objects, we still retain the class information in most cases.

This doesn't fix the problem with DOM objects: there are way too many clasps for DOM objects, so they still degrade to any object. My idea for fixing that is to save a bitmask of the clasp flags. When NON_NATIVE, JSCLASS_HAS_PRIVATE, JSCLASS_EMULATES_UNDEFINED, JSCLASS_IS_PROXY and JSCLASS_IS_DOMJSCLASS are the same for all objects, save that information. That way we can still see whether an object is a DOM object.
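The bitmask idea can be sketched as follows (a model only; the flag names are real JSClass flags but the values and the `commonFlags` helper are illustrative, not SpiderMonkey's actual representation): intersect the flags of every observed clasp, so a flag survives only if all objects in the set share it.

```javascript
// Illustrative flag values; the real JSClass flag bits differ.
const JSCLASS_HAS_PRIVATE   = 1 << 0;
const JSCLASS_IS_DOMJSCLASS = 1 << 1;
const JSCLASS_IS_PROXY      = 1 << 2;

// Keep the bitwise AND of all observed clasp flags: a bit stays set
// only if every object in the type set has it.
function commonFlags(clasps) {
  return clasps.reduce((mask, c) => mask & c.flags, ~0);
}

const domNode  = { flags: JSCLASS_IS_DOMJSCLASS | JSCLASS_HAS_PRIVATE };
const domElem  = { flags: JSCLASS_IS_DOMJSCLASS | JSCLASS_HAS_PRIVATE };
const plainObj = { flags: 0 };

// All DOM objects: the DOM bit survives even after the specific
// classes are forgotten.
const domOnly = commonFlags([domNode, domElem]);
// Mixed set: the DOM bit is cleared.
const mixed = commonFlags([domNode, domElem, plainObj]);
```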
Attachment #8514215 - Flags: feedback?(bhackett1024)
Attachment #806163 - Attachment is obsolete: true
Attachment #825218 - Attachment description: bug914255-limit-objects → Part 1: Limit the number of objects
Comment on attachment 8514215 [details] [diff] [review]
Part 2: Keep track of clasp for AnyObjects

Review of attachment 8514215 [details] [diff] [review]:
-----------------------------------------------------------------

Generally this looks fine, but I'm mainly concerned about whether this approach will actually help with fixing the dromaeo regressions.  The common DOM getter/setter stuff is based on finding that getter/setter on a prototype shared between the various type objects in a type set, and just knowing the common class (or IS_DOM_CLASS etc. flags) for members of the type set doesn't seem like it will be sufficient to be able to optimize DOM accessors without keeping track of the explicit type objects.  Unless you're planning on doing a subtyping thing too for type sets where we keep track of a common prototype as well as common class information?

Anyhow, if you want to land a reduction in the maximum type set size without all this reengineering, is there a simple filter you could use for when to retain more type objects in type sets, which would give the benefits of smaller type sets without regressing dromaeo?  e.g. if the classes in the type set are DOM classes, allow the type set to have more objects in it.

::: js/src/jit/IonMacroAssembler.cpp
@@ +119,5 @@
>      }
>  
> +    // Emit testing specific objects or object clasp.
> +    if ((!types->unknownObject() && types->getObjectCount() > 0) ||
> +        (types->unknownObject() && !types->unknownObjectWithNoClass()))

This condition is pretty complicated; can you put it in a helper function?  That would also fix the later duplication of this logic in LIRGenerator::visitTypeBarrier.

@@ +135,5 @@
>          bind(&matched);
>          return;
>      }
>  
> +    // If this is the last check, invert the last branch.

Maybe just 'Invert the last branch, if there is one.'

@@ +164,5 @@
>      // see CodeGenerator::link. Other callers should use TypeSet::readBarrier
>      // to trigger the barrier on the contents of type sets passed in here.
> +
> +    // An unknown type can still contain class information. Make sure the clasp
> +    // equals to the objects we observe.

Second sentence is a bit garbled.  How about:

Make sure the clasp matches the object we are testing.

::: js/src/jit/Lowering.cpp
@@ +2359,5 @@
>      // from inside a type barrier test.
>  
>      const types::TemporaryTypeSet *types = ins->resultTypeSet();
> +    bool needTemp = (!types->unknownObject() && types->getObjectCount() > 0) ||
> +                    (types->unknownObject() && !types->unknownObjectWithNoClass());

This should use the helper function requested above.

::: js/src/jsfriendapi.h
@@ +567,5 @@
>  struct TypeObject {
>      const Class *clasp;
>      JSObject    *proto;
> +
> +    static size_t offsetOfClasp() { return offsetof(TypeObject, clasp); }

Why is this necessary?  I don't see it being called anywhere, and we shouldn't need these offsetof methods in the shadow:: API.

::: js/src/jsinfer.cpp
@@ +306,5 @@
>          flags |= TYPE_FLAG_ANYOBJECT;
>      } else  if (type.isTypeObject() && type.typeObject()->unknownProperties()) {
>          flags |= TYPE_FLAG_ANYOBJECT;
> +        // TODO: is it correct to do this:
> +        // payload.clasp = type.typeObject()->clasp();

Unfortunately it isn't, due to object brain transplants.  If the type object has the OBJECT_FLAG_SETS_MARKED_UNKNOWN flag then we allow its prototype and clasp to be mutated by transplants without changing type information, which would invalidate this optimization.

@@ +542,5 @@
>          TypeObject *nobject = type.typeObject();
>          MOZ_ASSERT(!nobject->singleton());
>          if (nobject->unknownProperties())
> +            // TODO: is it correct to do this:
> +            // goto objectClass;

No, for the same reason as earlier.

@@ +582,5 @@
>  
>      TypeSet::addType(type, &cxArg->typeLifoAlloc());
>  
> +    // TODO: there is no specific class object Type.
> +    // Can we leave this. Or will this cause problems?

This should be fine.
Attachment #8514215 - Flags: feedback?(bhackett1024)
The way ion currently optimizes the DOM accesses is like so:

* Check whether we find the prop on the same proto for all objects in the typeset.
* Get the relevant property getter or setter off that proto (out of the baseline IC, actually).
* Verify that this property getter/setter is compatible with all the object types we've seen.  That requires them to be JSCLASS_IS_DOMJSCLASS _and_ then does a call back into Gecko to see whether that JSClass is compatible with the accessor.  This unfortunately requires the actual JSClass pointer right now.

So that kind of sucks.  Some thoughts on possible ways of dealing with this:

1)  We could have different size limits for typesets depending on whether everything in the typeset is a DOM object or not.  Icky, but probably simple to do.  How fast it is, I'm not sure; isDOMClass() is not a cached state in typesets, so adding a new Type would involve rechecking that state or something.
2)  We could have a single limit, but special-case some particularly common DOM things that have lots of subtypes that tend to pass through polymorphic callsites.  Specifically, EventTarget, Node, and Element.  Basically, store state in the typeset (do we have free bits for this?) if everything in it is in one of those buckets.  This can be done without keeping track of the exact JSClasses of things in a typeset; you just update the bit (via a callback to Gecko, I guess; I hope this is not a common operation) if a new Type gets added to the typeset.  Then we add a new Gecko callback Ion can use that takes these bits instead of a JSClass, for use in the cases when we stopped tracking the exact JSClasses and are just tracking the bits instead.
Ok, after rechecking everything I came to some conclusions:

1) I thought having "clasp" instead of "anyobject" would give IM more optimization opportunities (not only related to DOM performance). It seems this is incorrect? E.g. for all uses of getKnownClass() we need more information; for Array, for instance, we need to check extra properties before allowing inlining... So it might not help IM at all? Or is this a chicken-and-egg problem, where having the clasp would cause us to use it more when deciding to optimize?

2) Octane has adjusted TypeScript to run X times instead of for X seconds. As a result, adding this optimization is now only a small win, so it isn't really needed to fix octane-typescript.

3) There is another solution that isn't based on decreasing the limit. Currently TI keeps track of TypeSets, which IM then uses as the basis for its decisions, after which constraints are added to make sure the TI data doesn't change. Now there is a caveat here: TI can say the type is X while IM only needs the operation to work for type Y. E.g. "undefined == x", where x is [String], gets translated into Compare_Null, which supports all types.
So here is the idea: what about using TI like we do now, but instead of freezing the current TypeSet, only freeze the types IM actually specializes for? E.g. when calling a function with 7 TI objects we don't inline (the max is 6 or something), so the code actually only uses ANY_OBJECT. Wouldn't it be nice to only add that constraint, instead of bailing every time a new object is added?
This would remove the need for a heuristic depending on DOM/non-DOM, and would let IM decide whether it used the TI data or compiled based on less strict rules...

4) You are both in favor of adding a different size limit when all objects in a TypeSet are DOM classes (as a quick fix).


Ok, so (4) is really easy and is a quick fix. I'll do that.
I'm now worried that (1) (i.e. this patch) might not help at all. So maybe we shouldn't do that?

@Brian: what do you think about (3)? It has been something I've been thinking about and think it would be a good improvement on what we have now.
> Wouldn't it be nice to only add that constraint, instead of bailing every
> time a new object is added?

Yes!

This may help with DOM stuff too, since then we may be able to add constraints like "is valid for this DOM function".
Attachment #8517386 - Flags: review?(bhackett1024)
Comment on attachment 8517386 [details] [diff] [review]
Limit the number of objects when having no DOM class

Review of attachment 8517386 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsinfer.cpp
@@ +537,5 @@
> +
> +        // If object count is higher than the normal limit, all objects should
> +        // be DOM objects.
> +        if (objectCount > TYPE_FLAG_OBJECT_COUNT_LIMIT) {
> +            if (object->clasp()->isDOMClass())

if (!isDOMClass())

of course.
Comment on attachment 8517386 [details] [diff] [review]
Limit the number of objects when having no DOM class

Review of attachment 8517386 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsinfer.cpp
@@ +539,5 @@
> +        // be DOM objects.
> +        if (objectCount > TYPE_FLAG_OBJECT_COUNT_LIMIT) {
> +            if (object->clasp()->isDOMClass())
> +                goto unknownObject;
> +        }

This should spell out more explicitly the optimization that is being performed, as it's kind of hard to follow the logic here.  Maybe remove the below test for TYPE_FLAG_DOMOBJECT_COUNT_LIMIT and do:

// Limit the number of objects we track. There is a different limit
// depending on whether the set only contains DOM objects, which can
// have many different classes and prototypes but are still optimizable
// by IonMonkey.
if (objectCount >= TYPE_FLAG_OBJECT_COUNT_LIMIT) {
    // Examining the entire type set is only required when we first hit
    // the normal object limit.
    if (objectCount == TYPE_FLAG_OBJECT_COUNT_LIMIT
        ? !isDOMClass()
        : !object->clasp()->isDOMClass())
    {
        goto unknownObject;
    }
    if (objectCount == TYPE_FLAG_DOMOBJECT_COUNT_LIMIT)
        goto unknownObject;
}

@@ +547,1 @@
>              goto unknownObject;

There should be a static_assert(TYPE_FLAG_DOMOBJECT_COUNT_LIMIT > TYPE_FLAG_OBJECT_COUNT_LIMIT) somewhere around here.

::: js/src/jsinfer.h
@@ +397,5 @@
>      /* Mask/shift for the number of objects in objectSet */
>      TYPE_FLAG_OBJECT_COUNT_MASK   = 0x3e00,
>      TYPE_FLAG_OBJECT_COUNT_SHIFT  = 9,
> +    TYPE_FLAG_OBJECT_COUNT_LIMIT  = 7,
> +    TYPE_FLAG_DOMOBJECT_COUNT_LIMIT  =

Nit: stray space
Attachment #8517386 - Flags: review?(bhackett1024) → review+
I did remove the "? :", which I felt was hard to follow.
https://hg.mozilla.org/integration/mozilla-inbound/rev/95e326d4fd7d
This patch is a win on a lot of benchmarks but is also a regression on one specific benchmark: solve-sudoku (according to AWFY). Worth investigating?
(In reply to Guilherme Lima from comment #27)
> This patch is a win on a lot of benchmarks but is also a regression on one
> specific benchmark: solve-sudoku (according to AWFY). Worth investigating?

As I could have guessed before looking into the benchmark, we have 9 different objects somewhere. And since we now only keep track of a maximum of 7 objects, we see a generic object instead of the specific objects. 

> var possibilities = [[],[],[],[],[],[],[],[],[]];

This generates a unique type for every "[]", so 9 specific objects.

If we change the code to
> var possibilities = []
> for (var i = 0; i < 9; i++)
>     possibilities[i] = []

we only use 1 typeobject. So they are all the same. And we only need to keep track of 1 specific object.
So with this change the performance of the code is also back to before this patch.

- Now I don't know how feasible it is to use the same typeobject for every array in [[],[],[],[],[],[],[],[],[]], which would be a possible fix.
- Comment 21 option 3 would also fix this.
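A runnable version of the rewrite discussed above (a sketch; the typeobject behavior itself isn't observable from script): both forms build nine empty sub-arrays, but the loop form has a single allocation site for the inner arrays, which is why it ends up with one shared typeobject instead of nine.

```javascript
// Nine allocation sites: each "[]" literal gets its own typeobject.
const literalForm = [[], [], [], [], [], [], [], [], []];

// One allocation site inside the loop: all nine sub-arrays share a
// single typeobject, so the containing type set stays small.
const loopForm = [];
for (let i = 0; i < 9; i++) {
  loopForm[i] = [];
}
```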
Oops. Forgot to remove the leave-open flag.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Whiteboard: [leave open]
Target Milestone: --- → mozilla36
Depends on: 1161576
No longer depends on: 1234736