Closed Bug 914255 Opened 11 years ago Closed 10 years ago

Repeated bailouts when passing new objects

Categories

(Core :: JavaScript Engine, defect)

Priority: Not set
Severity: normal

Tracking


RESOLVED FIXED
mozilla36

People

(Reporter: yves.gwerder, Assigned: h4writer)

References

(Depends on 2 open bugs, Blocks 2 open bugs)

Details

Attachments

(4 files, 1 obsolete file)

Attached file A test to reproduce the issue (deleted) —
The attached test script results in four bailouts with the following signature:

[Bailouts]  bailing from bytecode: getaliasedvar, MIR: typebarrier [52], LIR: typebarrier [57]
[BaselineBailouts]       Resuming after pc offset 50 (op lambda) (line 12) of tests/getaliasedvar_bailout.js:6

... and three bailouts with the following signature:

[Bailouts]  bailing from bytecode: nop, MIR: constant [0], LIR: label [0]
[BaselineBailouts]       Resuming at pc offset 0 (op zero) (line 8) of tests/getaliasedvar_bailout.js:6

I assume it bails each time it gets a new object. The number of bailouts depends on how many new objects you pass into
Map.test(). I also tested it with 10 objects, which causes 10 bailouts with the first signature and 9 with the second.
It should figure out that the function is used with different objects and stop bailing.
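A minimal sketch of the scenario described above (hypothetical code, not the attached test script): the same function is called with a brand-new object on every iteration, so the callee's observed type set grows by one entry per call until the engine widens it — which is what produces one bailout per new object here.

```javascript
// Hypothetical reduction (not the attached test): `area` is called with
// a fresh object each iteration. In SpiderMonkey's type inference,
// objects created by distinct constructors get distinct type objects,
// so the observed type set of `shape` grows by one entry per call until
// the set is widened to "any object".
function area(shape) {
  return shape.w * shape.h;
}

const results = [];
for (let i = 1; i <= 4; i++) {
  // A fresh constructor per iteration guarantees a distinct object
  // type, mimicking "a new object each turn".
  const Ctor = function (w, h) { this.w = w; this.h = h; };
  results.push(area(new Ctor(i, i)));
}
// results: [1, 4, 9, 16]
```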

I initially discovered this issue because there's a function in our code that runs a lot slower in v26 compared with v1.8.5.
I hope this is the reason for it. There's a graph showing the performance difference of the affected function here:
http://www.wildfiregames.com/forum/index.php?app=core&module=attach&section=attach&attach_rel_module=post&attach_id=6014

In our case the object is a new object each time because it's passed as a structured clone in each simulation turn of the game.
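For illustration, a small example (assumes a runtime with the global `structuredClone`, e.g. Node 17+ or a browser; the game's actual cloning path may differ): a structured clone is always a brand-new object graph, so the consuming code never sees the same object identity twice, even when the data is unchanged.

```javascript
// structuredClone always returns a fresh object graph, so even an
// unchanged game state arrives as a never-before-seen object each turn.
const state = { units: [{ hp: 100 }], turn: 0 };
const clone = structuredClone(state);

const sameIdentity = clone === state;                      // new object
const sameData = clone.units[0].hp === state.units[0].hp;  // same data
```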

Also check these two posts for more information:
http://www.wildfiregames.com/forum/index.php?showtopic=17289&st=100#entry273563
http://www.wildfiregames.com/forum/index.php?showtopic=17289&st=100#entry273741
Blocks: 897962
Attachment #801655 - Attachment mime type: application/octet-stream → text/javascript
So we can keep up to 31 objects in a Type. That means we will recompile this code 31 times before we decide it is time to let any object fly through. I know this is handy for inlining, but there we are limited to a maximum of 5 objects.

- Do we have any other place where we benefit from having so many objects in a Type? Otherwise we should consider making them AnyObject sooner.
- Or make it possible for IonMonkey to see when it doesn't make a difference (i.e. where we don't depend on the different objects in the Type) and only request recompiles when a different MIRType is added, but not when a new object is added...

Brian, what are your thoughts on this?
Flags: needinfo?(bhackett1024)
(In reply to Hannes Verschore [:h4writer] from comment #1)
> So we can keep up to 31 objects in a Type. That means we will recompile
> this code 31 times before we decide it is time to let any object fly
> through. I know this is handy for inlining, but there we are limited to a
> maximum of 5 objects.
> 
> - Do we have any other place where we benefit from having so many objects
> in a Type? Otherwise we should consider making them AnyObject sooner.
> - Or make it possible for IonMonkey to see when it doesn't make a
> difference (i.e. where we don't depend on the different objects in the
> Type) and only request recompiles when a different MIRType is added, but
> not when a new object is added...
> 
> Brian, what are your thoughts on this?

I think limiting the number of type objects in a set to five should be fine.  I don't think I've ever seen a benefit to having more, and with baseline-provided information and checks hoisted in Ion I doubt the large type sets are buying anything.  And, as you point out, they sometimes have a significant cost.
Flags: needinfo?(bhackett1024)
Attached patch temporary test-patch by h4writer (obsolete) (deleted) — Splinter Review
I've tested the attached test-patch with 0 A.D.
In addition to that I had to set alwaysPreserveCode(true) in Runtime.cpp to work around garbage collection of the generated JIT code.

The bailout of the type "Resuming at pc offset 1966 (op getgname) (line 136) of simulation/ai/aegis/map-module.js:39" occurred 46 times before these changes and now only occurs 8 times.
Is the limit 2 per compartment, or why did it bail 8 times instead of 2? Two per compartment would make sense because it was a 4-player replay.
There were 24 bailouts in total in map-module.js:39, counting all types of bailouts.

The strange thing was that it did not improve performance at all in release mode.
On the graph there were still more than 50 peaks visible at the same place as before, even though there were only 24 bailouts.

For me there are two possible explanations for this:
 1. Either the patch or alwaysPreserveCode(true) does not work as expected in release mode. I couldn't find any obvious #ifdef statements that could cause this though.
 2. It now runs more in ion code but the ion code isn't faster than the interpreter code in this case.

I wanted to check it with a trace-log because this should work in release mode and should tell if the code runs in interpreter mode or in ion mode. Unfortunately this only caused lots of errors. I think the problem could be that the code now uses multiple compartments again.
Any other ideas how I could check that?
For the record - the tracelogging works now. Here are the results:
simulation/ai/aegis/map-module.js:39 	130 	22 	25.75%	interpreter run: 0.00%, baseline run: 0.29%, ion compile: 0.63%, ion run: 99.08%,
I've attached the full html log here: http://www.wildfiregames.com/forum/index.php?showtopic=17289&st=100#entry273741
I could reduce the number of objects we keep to 7 before seeing any regression. I still have to find out why we can't lower it to 5, since I only know of places that use a maximum of 5 objects at a time...
jspdf.js:14472 (fontLoaderBind) keeps bailing due to this issue, so pdf.js might see a speed increase once this lands. I tested the version with a maximum of 7 and got an increase of 3%.
Blocks: 807162
Lowers the limit to 7 objects before we switch to AnyObject, to reduce recompiles.

Tested again; the improvement is pretty small for pdf.js. I see no movement with the background compiler and a 1.5% gain without it. With bug 932800 checked in I see the predicted 4% without the background compiler.
Assignee: general → hv1989
Attachment #825218 - Flags: review?(bhackett1024)
Attachment #825218 - Flags: review?(bhackett1024) → review+
https://hg.mozilla.org/mozilla-central/rev/8408cc15ce95
Status: UNCONFIRMED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
> - Do we have any other place where we benefit of having so many objects in a type.

The 4% Dromaeo DOM regressions from this change say yes: when we're walking a DOM tree, we still want to compile optimized DOM property accesses.  In particular, I see 15-25% regressions on the dom-traverse test on all platforms:

http://perf.snarkfest.net/compare-talos/breakdown.html?oldTestIds=31104205&newTestIds=31104803&testName=dromaeo_dom 
http://perf.snarkfest.net/compare-talos/breakdown.html?oldTestIds=31102929&newTestIds=31104449&testName=dromaeo_dom
http://perf.snarkfest.net/compare-talos/breakdown.html?oldTestIds=31103223&newTestIds=31104745&testName=dromaeo_dom

and some regressions to query.html.

This is why I really want us to be able to track subtyping relations in TI...

Note also bug 827404.
Flags: needinfo?(hv1989)
Backout:

https://hg.mozilla.org/integration/mozilla-inbound/rev/1bec2044db5c

So for this to work again we need bug 827404 fixed first.
Status: RESOLVED → REOPENED
Depends on: 827404
Ever confirmed: true
Flags: needinfo?(hv1989)
Resolution: FIXED → ---
Depends on: 934433
https://hg.mozilla.org/mozilla-central/rev/1bec2044db5c
Status: REOPENED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Forgot to annotate with [leave open], since the issue was backed out. So this isn't resolved yet.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [leave open]
This is also an issue on octane-typescript. Fixing it would give us a 10% increase on that benchmark.
Hannes, 10% on TypeScript sounds pretty good. Do you have this on your radar still?
Flags: needinfo?(hv1989)
OS: Linux → All
Hardware: x86_64 → All
Target Milestone: mozilla28 → ---
It is definitely on my radar. It is one of my P1 performance bugs; I just haven't found the time yet. Last week I was finally able to focus exclusively on performance issues, so hopefully this continues and I can try this again in one of the coming weeks!
Flags: needinfo?(hv1989)
So I adjusted TypeSet to go from
specific objects -> objects with a particular clasp -> any object

As a result, if we deoptimize objects, we still retain the class information in most cases.

This doesn't fix the problem with DOM objects: there are way too many clasps for DOM objects, so they still degrade to any object. My idea for fixing that is to save a bitmask of the clasp flags. When NON_NATIVE, JSCLASS_HAS_PRIVATE, JSCLASS_EMULATES_UNDEFINED, JSCLASS_IS_PROXY and JSCLASS_IS_DOMJSCLASS are the same for all objects, save that information. That way we can still see whether an object is a DOM object.
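The bitmask idea can be sketched as follows (a model only; the flag names are real JSClass flags but the values and the `commonFlags` helper are illustrative, not SpiderMonkey's actual representation): intersect the flags of every observed clasp, so a flag survives only if all objects in the set share it.

```javascript
// Illustrative flag values; the real JSClass flag bits differ.
const JSCLASS_HAS_PRIVATE   = 1 << 0;
const JSCLASS_IS_DOMJSCLASS = 1 << 1;
const JSCLASS_IS_PROXY      = 1 << 2;

// Keep the bitwise AND of all observed clasp flags: a bit stays set
// only if every object in the type set has it.
function commonFlags(clasps) {
  return clasps.reduce((mask, c) => mask & c.flags, ~0);
}

const domNode  = { flags: JSCLASS_IS_DOMJSCLASS | JSCLASS_HAS_PRIVATE };
const domElem  = { flags: JSCLASS_IS_DOMJSCLASS | JSCLASS_HAS_PRIVATE };
const plainObj = { flags: 0 };

// All DOM objects: the DOM bit survives even after the specific
// classes are forgotten.
const domOnly = commonFlags([domNode, domElem]);
// Mixed set: the DOM bit is cleared.
const mixed = commonFlags([domNode, domElem, plainObj]);
```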
Attachment #8514215 - Flags: feedback?(bhackett1024)
Attachment #806163 - Attachment is obsolete: true
Attachment #825218 - Attachment description: bug914255-limit-objects → Part 1: Limit the number of objects
Comment on attachment 8514215 [details] [diff] [review]
Part 2: Keep track of clasp for AnyObjects

Review of attachment 8514215 [details] [diff] [review]:
-----------------------------------------------------------------

Generally this looks fine, but I'm mainly concerned about whether this approach will actually help with fixing the dromaeo regressions.  The common DOM getter/setter stuff is based on finding that getter/setter on a prototype shared between the various type objects in a type set, and just knowing the common class (or IS_DOM_CLASS etc. flags) for members of the type set doesn't seem like it will be sufficient to be able to optimize DOM accessors without keeping track of the explicit type objects.  Unless you're planning on doing a subtyping thing too for type sets where we keep track of a common prototype as well as common class information?

Anyhow, if you want to land a reduction in the maximum type set size without all this reengineering, is there a simple filter you could use for when to retain more type objects in type sets, which would give the benefits of smaller type sets without regressing dromaeo?  e.g. if the classes in the type set are DOM classes, allow the type set to have more objects in it.

::: js/src/jit/IonMacroAssembler.cpp
@@ +119,5 @@
>      }
>  
> +    // Emit testing specific objects or object clasp.
> +    if ((!types->unknownObject() && types->getObjectCount() > 0) ||
> +        (types->unknownObject() && !types->unknownObjectWithNoClass()))

This condition is pretty complicated; can you put it in a helper function?  That would also fix the later duplication of this logic in LIRGenerator::visitTypeBarrier.

@@ +135,5 @@
>          bind(&matched);
>          return;
>      }
>  
> +    // If this is the last check, invert the last branch.

Maybe just 'Invert the last branch, if there is one.'

@@ +164,5 @@
>      // see CodeGenerator::link. Other callers should use TypeSet::readBarrier
>      // to trigger the barrier on the contents of type sets passed in here.
> +
> +    // An unknown type can still contain class information. Make sure the clasp
> +    // equals to the objects we observe.

Second sentence is a bit garbled.  How about:

Make sure the clasp matches the object we are testing.

::: js/src/jit/Lowering.cpp
@@ +2359,5 @@
>      // from inside a type barrier test.
>  
>      const types::TemporaryTypeSet *types = ins->resultTypeSet();
> +    bool needTemp = (!types->unknownObject() && types->getObjectCount() > 0) ||
> +                    (types->unknownObject() && !types->unknownObjectWithNoClass());

This should use the helper function requested above.

::: js/src/jsfriendapi.h
@@ +567,5 @@
>  struct TypeObject {
>      const Class *clasp;
>      JSObject    *proto;
> +
> +    static size_t offsetOfClasp() { return offsetof(TypeObject, clasp); }

Why is this necessary?  I don't see it being called anywhere, and we shouldn't need these offsetof methods in the shadow:: API.

::: js/src/jsinfer.cpp
@@ +306,5 @@
>          flags |= TYPE_FLAG_ANYOBJECT;
>      } else  if (type.isTypeObject() && type.typeObject()->unknownProperties()) {
>          flags |= TYPE_FLAG_ANYOBJECT;
> +        // TODO: is it correct to do this:
> +        // payload.clasp = type.typeObject()->clasp();

Unfortunately it isn't, due to object brain transplants.  If the type object has the OBJECT_FLAG_SETS_MARKED_UNKNOWN flag then we allow its prototype and clasp to be mutated by transplants without changing type information, which would invalidate this optimization.

@@ +542,5 @@
>          TypeObject *nobject = type.typeObject();
>          MOZ_ASSERT(!nobject->singleton());
>          if (nobject->unknownProperties())
> +            // TODO: is it correct to do this:
> +            // goto objectClass;

No, for the same reason as earlier.

@@ +582,5 @@
>  
>      TypeSet::addType(type, &cxArg->typeLifoAlloc());
>  
> +    // TODO: there is no specific class object Type.
> +    // Can we leave this. Or will this cause problems?

This should be fine.
Attachment #8514215 - Flags: feedback?(bhackett1024)
The way ion currently optimizes the DOM accesses is like so:

* Check whether we find the prop on the same proto for all objects in the typeset.
* Get the relevant property getter or setter off that proto (out of the baseline IC, actually).
* Verify that this property getter/setter is compatible with all the object types we've seen.  That requires them to be JSCLASS_IS_DOMJSCLASS _and_ then does a call back into Gecko to see whether that JSClass is compatible with the accessor.  This unfortunately requires the actual JSClass pointer right now.

So that kind of sucks.  Some thoughts on possible ways of dealing with this:

1)  We could have different size limits for typesets depending on whether everything in the typeset is a DOM object or not.  Icky, but probably simple to do.  How fast it is, I'm not sure; isDOMClass() is not a cached state in typesets, so adding a new Type would involve rechecking that state or something.
2)  We could have a single limit, but special-case some particularly common DOM things that have lots of subtypes that tend to pass through polymorphic callsites.  Specifically, EventTarget, Node, and Element.  Basically, store state in the typeset (do we have free bits for this?) if everything in it is in one of those buckets.  This can be done without keeping track of the exact JSClasses of things in a typeset; you just update the bit (via a callback to Gecko, I guess; I hope this is not a common operation) if a new Type gets added to the typeset.  Then we add a new Gecko callback Ion can use that takes these bits instead of a JSClass, for use in the cases when we stopped tracking the exact JSClasses and are just tracking the bits instead.
Ok, after rechecking everything I came to some conclusions:

1) I thought having "clasp" instead of "anyobject" would give IM more optimization opportunities (not only related to DOM performance). It seems this is incorrect? E.g. for all uses of getKnownClass() we need more information; for Array, for instance, we need to check extra properties before allowing inlining... So it might not help IM at all? Or is this a chicken-and-egg problem, where having the clasp would cause us to use it more when deciding to optimize?

2) Octane has adjusted TypeScript to run X times instead of for X seconds. As a result, adding this optimization is now only a small win, so it isn't really needed to fix octane-typescript.

3) There is another solution that isn't based on decreasing the limit. Currently TI keeps track of TypeSets, which IM then uses as the basis for its decisions, after which constraints are added to make sure the TI data doesn't change. Now there is a caveat here: TI can say the type is X while IM only needs the operation to work for type Y. E.g. "undefined == x", where x is [String], gets translated into Compare_Null, which supports all types.
So here is the idea: what about using TI like we do now, but instead of freezing the current TypeSet, only freeze the types IM actually specializes for? E.g. when calling a function with 7 TI objects we don't inline (the max is 6 or something), so the code actually only uses ANY_OBJECT. Wouldn't it be nice to only add that constraint, instead of bailing every time a new object is added?
This would remove the need for a heuristic depending on DOM/non-DOM, and would let IM decide whether it used the TI data or compiled based on less strict rules...

4) You are both in favor of adding a different size limit when all objects in a TypeSet are DOM classes (as a quick fix).


Ok, so (4) is really easy and is a quick fix. I'll do that.
I'm now worried that (1) (i.e. this patch) might not help at all. So maybe we shouldn't do that?

@Brian: what do you think about (3)? It has been something I've been thinking about and think it would be a good improvement on what we have now.
> Wouldn't it be nice to only add that constraint, instead of bailing every
> time a new object is added?

Yes!

This may help with DOM stuff too, since then we may be able to add constraints like "is valid for this DOM function".
Attachment #8517386 - Flags: review?(bhackett1024)
Comment on attachment 8517386 [details] [diff] [review]
Limit the number of objects when having no DOM class

Review of attachment 8517386 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsinfer.cpp
@@ +537,5 @@
> +
> +        // If object count is higher than the normal limit, all objects should
> +        // be DOM objects.
> +        if (objectCount > TYPE_FLAG_OBJECT_COUNT_LIMIT) {
> +            if (object->clasp()->isDOMClass())

if (!isDOMClass())

of course.
Comment on attachment 8517386 [details] [diff] [review]
Limit the number of objects when having no DOM class

Review of attachment 8517386 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsinfer.cpp
@@ +539,5 @@
> +        // be DOM objects.
> +        if (objectCount > TYPE_FLAG_OBJECT_COUNT_LIMIT) {
> +            if (object->clasp()->isDOMClass())
> +                goto unknownObject;
> +        }

This should spell out more explicitly the optimization that is being performed, as it's kind of hard to follow the logic here.  Maybe remove the below test for TYPE_FLAG_DOMOBJECT_COUNT_LIMIT and do:

// Limit the number of objects we track. There is a different limit
// depending on whether the set only contains DOM objects, which can
// have many different classes and prototypes but are still optimizable
// by IonMonkey.
if (objectCount >= TYPE_FLAG_OBJECT_COUNT_LIMIT) {
    // Examining the entire type set is only required when we first hit
    // the normal object limit.
    if (objectCount == TYPE_FLAG_OBJECT_COUNT_LIMIT
        ? !isDOMClass()
        : !object->clasp()->isDOMClass())
    {
        goto unknownObject;
    }
    if (objectCount == TYPE_FLAG_DOMOBJECT_COUNT_LIMIT)
        goto unknownObject;
}

@@ +547,1 @@
>              goto unknownObject;

There should be a static_assert(TYPE_FLAG_DOMOBJECT_COUNT_LIMIT > TYPE_FLAG_OBJECT_COUNT_LIMIT) somewhere around here.

::: js/src/jsinfer.h
@@ +397,5 @@
>      /* Mask/shift for the number of objects in objectSet */
>      TYPE_FLAG_OBJECT_COUNT_MASK   = 0x3e00,
>      TYPE_FLAG_OBJECT_COUNT_SHIFT  = 9,
> +    TYPE_FLAG_OBJECT_COUNT_LIMIT  = 7,
> +    TYPE_FLAG_DOMOBJECT_COUNT_LIMIT  =

Nit: stray space
Attachment #8517386 - Flags: review?(bhackett1024) → review+
I did remove the "? :", which I felt was hard to follow.
https://hg.mozilla.org/integration/mozilla-inbound/rev/95e326d4fd7d
This patch is a win on a lot of benchmarks but is also a regression on one specific benchmark: solve-sudoku (according to AWFY). Worth investigating?
(In reply to Guilherme Lima from comment #27)
> This patch is a win on a lot of benchmarks but is also a regression on one
> specific benchmark: solve-sudoku (according to AWFY). Worth investigating?

As I could have guessed before looking into the benchmark, we have 9 different objects somewhere. And since we now only keep track of a maximum of 7 objects, we see a generic object instead of the specific objects. 

> var possibilities = [[],[],[],[],[],[],[],[],[]];

This generates a unique type for every "[]", so 9 specific objects.

If we change the code to
> var possibilities = []
> for (var i = 0; i < 9; i++)
>     possibilities[i] = []

we only use 1 typeobject. So they are all the same. And we only need to keep track of 1 specific object.
So with this change the performance of the code is also back to before this patch.

- Now I don't know how feasible it is to use the same typeobject for every array in [[],[],[],[],[],[],[],[],[]], which would be a possible fix.
- Comment 21 option 3 would also fix this.
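A runnable version of the rewrite discussed above (a sketch; the typeobject behavior itself isn't observable from script): both forms build nine empty sub-arrays, but the loop form has a single allocation site for the inner arrays, which is why it ends up with one shared typeobject instead of nine.

```javascript
// Nine allocation sites: each "[]" literal gets its own typeobject.
const literalForm = [[], [], [], [], [], [], [], [], []];

// One allocation site inside the loop: all nine sub-arrays share a
// single typeobject, so the containing type set stays small.
const loopForm = [];
for (let i = 0; i < 9; i++) {
  loopForm[i] = [];
}
```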
Oops. Forgot to remove the leave-open flag.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Whiteboard: [leave open]
Target Milestone: --- → mozilla36
Depends on: 1161576
No longer depends on: 1234736