Closed Bug 607539 Opened 14 years ago Closed 14 years ago

mandelbrot "benchmark" is a lot slower with profiling on

Tracking

()

Status:

RESOLVED FIXED

Tracking Flags:

Tracking

Status

blocking2.0

---

betaN+

People

(Reporter: bzbarsky, Assigned: billm)

References

(
URL
)

Details

(Keywords: regression, Whiteboard: [jsperf] [fixed-in-tracemonkey][hardblocker])

Attachments

(1 file, 3 obsolete files)

patch 14 years ago Bill McCloskey [inactive unless it's an emergency] (:billm) (deleted), patch	dvander : review+	Details \| Diff \| Splinter Review
additional patch 14 years ago Bill McCloskey [inactive unless it's an emergency] (:billm) (deleted), patch		Details \| Diff \| Splinter Review
additional additional patch 14 years ago Bill McCloskey [inactive unless it's an emergency] (:billm) (deleted), patch	bzbarsky : review+	Details \| Diff \| Splinter Review
new patch 14 years ago Bill McCloskey [inactive unless it's an emergency] (:billm) (deleted), patch	dmandelin : review+	Details \| Diff \| Splinter Review

Boris Zbarsky [:bzbarsky]

Reporter

Description

•

14 years ago

I see 105ms with profiling or jm disabled; 355ms with profiling enabled or tm disabled. I thought we had this in trace-tests... but that might be as a stress-test only. :( Building debug browser now to get some more data.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 1

•

14 years ago

Note that the core part of the code is this: for (var i = 0, curReal = realRange.min; i < numCols; ++i, curReal += realStep) { for (var j = 0, curImag = imagRange.max; j < numRows; ++j, curImag += imagStep) { var n = computeEscapeSpeed(curReal, curImag); addPoint(points, n, i, j) } } where computeEscapeSpeed runs a loop anywhere between 1 and 105 times. Most calls to computeEscapeSpeed run the loop 105 times, but most of the early calls run it just once.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 2

•

14 years ago

So we first start profiling with that inner for loop there (presumably because the loop inside computeEscapeSpeed has never run enough iterations at that point yet to get profiled?). We end up deciding not to trace it because the nested loop inside computeEscapeSpeed (which has run 0 iterations, according to the "NESTED" spew!) has numSelfOps == numSelfOpsMult == 0, so that isCompilationExpensive for that loop returns true. Then we later profile the loop inside computeEscapeSpeed and end up aborting profiling (not sure why yet). So we blacklist that loop. And that sets numSelfOps to MAX_PROFILE_OPS for that innermost loop, which means all loops containing it get blacklisted too. So we blacklist everything.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 3

•

14 years ago

If I change the >= to > in this line: if (numSelfOpsMult >= numSelfOps*100000) then things seem to be good. Or at least, I only see us start profiling for that inner for loop twice, each time decide to trace it and move on. Not sure why I never see us profile the outer loop.... Also not sure yet why we ended up aborting profiling. Trying to figure that out now.

blocking2.0: --- → ?

Keywords: regression

Boris Zbarsky [:bzbarsky]

Reporter

Comment 4

•

14 years ago

Looks like we abort profiling if we return out of a loop we're currently profiling? But unlike aborting tracing, which will back off and try again later it looks like aborting profiling permanently blackists the loop? Is that right? In any case, the abort happens when we take the return in computeEscapeSpeed, which is the common case for that function. In fact, it looks like any loop that returns from inside the loop before hitting 16384 jsops will hit the abort and get blacklisted. In particular, any loop that uses return instead of a break or loop condition and happens to return before it's looped enough times to hit 16384 jsops will hit this case... Would it make more sense to just decide() at that point instead of aborting, possibly?

David Mandelin [:dmandelin]

Updated

•

14 years ago

Whiteboard: [jsperf]

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Comment 5

•

14 years ago

I think there are a lot of things independently going wrong here. I still have a few things to track down, but here's what I've found so far. - The problem with the zero-iteration loop in comment 3 was tripping us up. The fix there seems good to me. - JM wasn't compiling the main script because it used const. I filed bug 607751 to compile those opcodes. The weird thing here is that if JM can't compile a script, then we run in the interpreter, which avoids the profiler entirely. I think this is the correct thing to do: lacking the method jit, we probably should just trace as much as possible, which is what happens without the profiler. - I don't think that aborting due to a return statement is bad. This was done by design. First, we only abort if the return breaks out of the loop being profiled (as opposed to some nested loop). Second, I want to emphasize an important part of profiling: If we profile a loop and decide not to trace it, it's not the end of the world. We may end up profiling an outer loop, deciding that we do want to trace it, and then we'll trace the original loop as well. This is why the "un-blacklisting" mechanism was needed for profiling. In this example, we decide not to trace the computeEscapeSpeed loop due to the return. However, we will still trace it when we decide to trace the |for (var j = 0, ...)| loop. In general, a return statement in a loop suggests that we won't execute many iterations of the loop, so we probably don't want it to be a top-level trace. It's just another "short loop" indicator. - This test case also seems to exhibit a problem with tracer integration, where we start recording a loop when the interpreter is running in JSINTERP_SAFEPOINT mode. It seems like this is undesirable. - I still haven't figured out why we don't trace the outer loop. Once I figure that out, I'll post a patch for these issues. Anyway, thanks for the great test case!

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Comment 6

•

14 years ago

Attached patch patch (obsolete) (deleted) — Details — Splinter Review

This patch fixes the issue with loops with loops of zero iterations. It also fixes the problem with the integration between the profiler and the JITs. With the patch, we trace both the inner and outer main loops and performance with -mjp is that same as with -j. The problem was that, while running in the interpreter, waiting to reach a safe point, we needed to make important profiling decisions. Previously, the profiler could only be called from the method JIT, never the interpreter. This patch fixes the problem. As a result, you can run in weird modes like -j -p. There's no reason to do this, but it should work now. I pushed to the tryserver, since it's slightly risky change.

Assignee: general → wmccloskey

Status: NEW → ASSIGNED

Attachment #486474 - Flags: review?(dvander)

David Anderson [:dvander] - inactive, e-mail if emergency

Comment 7

•

14 years ago

Comment on attachment 486474 [details] [diff] [review] patch >+ if (prof->profiled) { >+ if (prof->traceOK) { >+ return RecordLoopEdge(cx, inlineCallCount, prof->execOK); >+ } else { >+ return MONITOR_NOT_RECORDING; >+ } No return-after-else, just: >+ if (prof->traceOK) >+ return RecordLoopEdge(cx, inlineCallCount, prof->execOK); >+ return MONITOR_NOT_RECORDING;

Attachment #486474 - Flags: review?(dvander) → review+

Boris Zbarsky [:bzbarsky]

Reporter

Comment 8

•

14 years ago

> We may end up profiling an outer loop, deciding that we do want to trace it, > and then we'll trace the original loop as well. See, that's my concern. Once we _abort_ profiling of a loop X (which is different from just profiling and deciding not to trace), then we won't trace any loops that contain X, as far as I can tell. In particular, AbortProfiling will set numSelfOps for X to MAX_PROFILE_OPS. So isCompilationExpensive() for X will return true. And isCompilationExpensive() for a loop containing X will return true if it returns true for X. The only way I can see this working is if we sometimes blow away the LoopProfile structure for X and create a new one, before we start profiling the ancestor loop. As far as I can see this only happens on a ResetJIT call. Am I just missing something?

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Comment 9

•

14 years ago

(In reply to comment #8) > Am I just missing something? No, I was wrong. I forgot about that assignment to numSelfOps in AbortProfiling. I'm pretty sure I added it to avoid tracing something in one of the SunSpider date benchmarks. But I just took out the assignment and I'm not seeing any regressions. It's definitely a good idea to take it out, because it's a silly thing to do in general. I'll post an updated patch with this change and a fix for another problem I found this morning.

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Comment 10

•

14 years ago

Attached patch additional patch (obsolete) (deleted) — Details — Splinter Review

This patch applies on top of the previous one. It removes the line in AbortProfiling. It also fixes a problem with PCWithinLoop. The problem was that it didn't work if you had already left the function where the loop was. Now it compares frame pointers. I should have picked up on this when you asked about it, Dave. Oh well.

Attachment #486674 - Flags: review?(dmandelin)

Boris Zbarsky [:bzbarsky]

Reporter

Comment 11

•

14 years ago

We may still want to set some flag to prevent tracing of the loop being aborted (this time around; we could clear the flag the next time we profile the loop)... I do agree that a priori loops that return while being profiled should probably not get traced, since they'd just trace-abort if they did the return while recording.

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Comment 12

•

14 years ago

Attached patch additional additional patch (obsolete) (deleted) — Details — Splinter Review

Good point. I think this should do it.

Attachment #486687 - Flags: review?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Reporter

Comment 13

•

14 years ago

Comment on attachment 486687 [details] [diff] [review] additional additional patch r=me

Attachment #486687 - Flags: review?(bzbarsky) → review+

David Mandelin [:dmandelin]

Comment 14

•

14 years ago

Comment on attachment 486674 [details] [diff] [review] additional patch >-PCWithinLoop(JSScript *script, jsbytecode *pc, T& loop) >-{ >- return script != loop.script || (pc >= loop.top && pc <= loop.bottom); >+PCWithinLoop(JSStackFrame *fp, jsbytecode *pc, T& loop) >+{ >+ return fp > loop.fp || (fp == loop.fp && pc >= loop.top && pc <= loop.bottom); > } |fp > loop.fp| doesn't look right to me. I know it didn't work before the new stacks, but I guess it probably does do the right thing now. Luke, does this seem safe to you? Or should it walk the stack to check? > /* > * When we hit a nested loop while profiling, we record where it occurs > * and how many iterations we execute it. > */ > struct InnerLoop { >- JSScript *script; >+ JSStackFrame *fp; > jsbytecode *top, *bottom; > uintN iters; I'd prefer 'entryFp', 'startFp', or something like that. Otherwise looks good.

Attachment #486674 - Flags: review?(dmandelin)

Luke Wagner [:luke]

Comment 15

•

14 years ago

(In reply to comment #14) > Luke, does this seem safe to you? Or should it walk the stack to check? The two exceptions I can think of are: - the 'floating frames' stored in generators (which are copies of inactive generators, hence, should not appear here) - reentering the VM (content -> C++ -> chrome), but it looks like that aborts profiling. > I'd prefer 'entryFp', 'startFp', or something like that. By the by, we've been trying to move toward not capitalizing either of the letters in the register-like acronyms, viz. sp, pc, and fp. (E.g., fp->prevpc(), prevPc or prevPC)

Robert Sayre

Updated

•

14 years ago

blocking2.0: ? → beta8+

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Comment 16

•

14 years ago

Attached patch new patch (deleted) — Details — Splinter Review

Unfortunately, one of the previous patches asserted on the tryserver. I fixed the problem and added some additional assertion coverage. This patch subsumes the previous ones. Could you re-review, Dave? I think the fp comparison is safe, based on what Luke said.

Attachment #486474 - Attachment is obsolete: true

Attachment #486674 - Attachment is obsolete: true

Attachment #486687 - Attachment is obsolete: true

Attachment #489328 - Flags: review?(dmandelin)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Updated

•

14 years ago

Blocks: 606890

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Updated

•

14 years ago

Blocks: 609212

David Mandelin [:dmandelin]

Comment 17

•

14 years ago

Comment on attachment 489328 [details] [diff] [review] new patch I just gave it a light review since I had already reviewed it and you checked out the stack size issue. Looks good.

Attachment #489328 - Flags: review?(dmandelin) → review+

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Comment 18

•

14 years ago

http://hg.mozilla.org/tracemonkey/rev/d5257e1f0c8b

Whiteboard: [jsperf] → [jsperf] fixed-in-tracemonkey

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Comment 19

•

14 years ago

Had to back this out http://hg.mozilla.org/tracemonkey/rev/cb76b2d61096

Mauricio Collares [:collares]

Updated

•

14 years ago

Whiteboard: [jsperf] fixed-in-tracemonkey → [jsperf]

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Comment 20

•

14 years ago

http://hg.mozilla.org/tracemonkey/rev/08794e076ded

Whiteboard: [jsperf] → [jsperf] fixed-in-tracemonkey

Robert Sayre

Comment 21

•

14 years ago

http://hg.mozilla.org/mozilla-central/rev/d5257e1f0c8b

Status: ASSIGNED → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Robert Sayre

Comment 22

•

14 years ago

http://hg.mozilla.org/mozilla-central/rev/08794e076ded

Boris Zbarsky [:bzbarsky]

Reporter

Comment 23

•

14 years ago

I'm reopening this bug, since the patches that actually landed here, whatever they do, don't fix it (as one can trivially see by trying a Nov 23 nightly, or a current trunk nightly, on the testcase). I still get 325ms or so, unless I disable methodjit or jitprofiling, in which case I get 100ms. I'm not sure what should happen with the blocking+ flag here: leave as 8, or move to 9?

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Whiteboard: [jsperf] fixed-in-tracemonkey → [jsperf]

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Comment 24

•

14 years ago

I landed the lower-risk stuff. This includes some new assertions and bug fixes: http://hg.mozilla.org/tracemonkey/rev/4d66cf4385f0 http://hg.mozilla.org/tracemonkey/rev/cb3f0ac5494f More to come later, I hope.

Eddward

Comment 25

•

14 years ago

Blocking beta8+? Probably it should be beta9+. Beta8 was released on Dec 21..

Boris Zbarsky [:bzbarsky]

Reporter

Comment 26

•

14 years ago

Eddward: please do read the bug before commenting. Comment 23 paragraph two in particular.

Daniel C

Comment 27

•

14 years ago

I'm not sure whether bugs that block previous releases but are still open are being kept track of or not. https://bugzilla.mozilla.org/buglist.cgi?field0-3-0=cf_blocking_20&type0-1-0=notsubstring&field0-1-0=cf_blocking_20&field0-0-0=cf_blocking_20&query_format=advanced&value0-2-0=beta9&value0-1-0=betaN&type0-3-0=notsubstring&bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&field0-2-0=cf_blocking_20&type0-0-0=substring&value0-0-0=beta&type0-2-0=notsubstring Shows only this bug and bug 586198.

Boris Zbarsky [:bzbarsky]

Reporter

Updated

•

14 years ago

blocking2.0: beta8+ → betaN+

Bill McCloskey [inactive unless it's an emergency] (:billm)

Assignee

Comment 28

•

14 years ago

http://hg.mozilla.org/tracemonkey/rev/94c6d2a90ae6 This is now running in ~100ms for me.

Whiteboard: [jsperf] → [jsperf] [fixed-in-tracemonkey]

Boris Zbarsky [:bzbarsky]

Reporter

Comment 29

•

14 years ago

Yeah, looks like at this point -mjp is only about a 10% slowdown over -j here; that's reasonable. Thanks!

Luke Wagner [:luke]

Updated

•

14 years ago

Whiteboard: [jsperf] [fixed-in-tracemonkey] → [jsperf] [fixed-in-tracemonkey][hardblocker]

Chris Leary [:cdleary] (not checking bugmail)

Comment 30

•

14 years ago

http://hg.mozilla.org/mozilla-central/rev/94c6d2a90ae6

Status: REOPENED → RESOLVED

Closed: 14 years ago → 14 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.