Closed Bug 849325 Opened 11 years ago Closed 10 years ago

Kraken regression between 2012-12-11 and 2012-12-12 - about 40% slower

Categories

(Core :: JavaScript Engine, defect)

x86
Windows Vista
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 890524
Tracking Status
firefox19 --- unaffected
firefox20 + affected
firefox21 --- unaffected
firefox22 --- unaffected

People

(Reporter: bugs, Unassigned)

References

Details

(Keywords: perf, regression)

Attachments

(2 files)

Here's a couple of representative nightlies before the bisect to see the difference
                                           fft       gaussian-blur   crypto-aes
http://x.m8y.org/82 (latest nightly)       261.9ms   616.8ms         241.8ms   
http://x.m8y.org/7u (2012-10-23 nightly)   150.5ms   189.2ms         117.3ms

Although the high-speed stuff continued back to even FF18 nightlies

So, about:build-config for 2012-12-11 reports:
http://hg.mozilla.org/mozilla-central/rev/4dfe323a663d (good)
http://hg.mozilla.org/mozilla-central/rev/634180132e68 (bad)

VM I was testing on was the one described here:
http://m8y.org/tmp/kraken.xhtml
You'll see on awfy that we had a regression starting from December 2th till Januari 30th on kraken. It regressed for 11%. Using awfy we could even find the regression + fix if needed ...

FF19 merged November 19th
FF20 merged Januari 7th (during regression)
FF21 merged February 19th

FF21 will be fine again.
Not sure if we want to take action or not? Dave??
nemo: thanks for the report. And Hannes: thanks for breaking down the versions. We should definitely get this fixed for Fx20 if at all possible. 20 is beta so there is not much time.
I quickly glanced at awfy to find the bugs fixing the regression. I reran the numbers and can confirm we need atleast the following:

1741.9ms to 1639.3ms, 6% improvement
d7dd65663469    Brian Hackett — Bug 833898 - Allow converting mixed arrays of ints an...

1919.8ms to 1727.5ms, 11% improvement
f11a8adb9a77	Brian Hackett — Bug 832578 - Tweaks to improve compilation of element...
Oh, and those are the only bugs we need to back-port. These will give the needed speed-up to be on par with FF19/FF21. Not sure how to advance now? Should Brain get informed and should he request the beta-approval?
Note that according to awfy we have a similar issue with SS:
FF19: 198
FF20: 211, 6% regression
FF21: 198
For SS I see the following bugs (not complete) that improved SS.

3.6% improvement:
71f14579265f	Benjamin Peterson — Bug 836515 - Allow source compression to run while executing the script. r=jorendorff

1.4% improvement:
6450a26eb74f	Brian Hackett — Bug 834242 - Reduce cost of exact stack rooting during property reads, r=terrence.
Bug 832578 could maybe be uplifted, bug 833898 is too complicated.  What caused the regression on awfy?  (wondering if it's noise related to source compression)  Hannes, any chance you could compare performance in the actual browsers to confirm there is a regression?
The regression I measured in the browser between release (FF19) and beta (FF20a1) is far less than reported above. I only see a 3.5% difference (1930 to 1998) for kraken 1.1
And I got 5.2% slowdown on my XP test machine.

Dunno. Virtualbox Vista VM might be more sensitive. I can try running it on another machine.
Quite a few people at work use VMs as their main desktops though, for maintenance ease. (laptop running minimal linux for virtualisation, w/ Windows in a VM).
If I understood nemo correctly on IRC, the regression he's seeing still exists in the current Nightlies. That and the fact that his regression window doesn't match up with the landing of the regressions Hannes is talking about lets me think that maybe there are two different things going on here.
Guilerme, I don't see a From/To - could you just paste the URLs for each?

A URL shorterner might be nice too.  I often use w3m (from alpine) to load these bug responses, and that was a pretty painful URL to copy in segments ;)
In case you're wondering why I didn't just load the bug page into the javascript capable browser - stupid firewall hates BMO :-p
Oh. Also (geez. I'll stop after this, honest).
Looks like BMO doesn't correctly link the kraken benchmark anyway, so you have to copy and paste it to get the final ]}  otherwise you just get a puzzling blank page..

Anyway. Shortening service would solve all that  ☺
This is not a Win-only issue, is it? Then please change the Platfrom attributes. Thanks!
No idea.  But if anyone else is seeing same regression on another platform after that December 2012 nightly identified in report, feel free to change...

TBH, the comments are kinda confusing me.  Not sure if it is the same issue.
(In reply to nemo from comment #16)
> No idea.  But if anyone else is seeing same regression on another platform
> after that December 2012 nightly identified in report, feel free to change...
> 
> TBH, the comments are kinda confusing me.  Not sure if it is the same issue.

That's what I meant in comment 10, too.

@Hannes, maybe you can follow up with nemo regarding the regression he's seeing. If not, I'll do that tomorrow.
Flags: needinfo?(hv1989)
Okay, so the regression I saw was FF20 only and will get solved by taking a part of the patch in Bug 832578. That's covered. (From 3.5% slower, FF20 will get 3.5% faster than FF21)

Now back on this topic, I'm little confused about this 40% slower. So this is an issue starting at 2012-12-11 and is still present in the current nightlies.
@Till, if you want you can take over, as I cannot reproduce meself ...
Flags: needinfo?(hv1989)
(In reply to Hannes Verschore [:h4writer] from comment #18)
> Okay, so the regression I saw was FF20 only and will get solved by taking a
> part of the patch in Bug 832578. That's covered. (From 3.5% slower, FF20
> will get 3.5% faster than FF21)
s/FF21/FF19 offcourse
Ok, so my important results: FROM FF20 - TO FF22
TEST                         COMPARISON            FROM                 TO               DETAILS
crypto-aes:              *2.14x as slow*    140.5ms +/- 1.3%    301.3ms +/- 2.9%     significant
crypto-ccm:              *1.158x as slow*   129.3ms +/- 9.9%    149.7ms +/- 5.0%     significant
crypto-sha256-iterative: ??                 115.3ms +/- 0.8%    137.2ms +/- 29.4%     might be *1.190x as slow*
Attached file Profile before slowdown (deleted) —
Attached file Profile after slowdown (deleted) —
Looking at the two profiles, the first thing to note is that we spend quite a bit more time doing GC.

In terms of the various tests in Kraken, the following differences are interesting:
- FastLog2 goes from 1ms to 156ms!
the most interesting differences are under Timer::Fire:
- calcDFT goes from 312ms to 364ms

Additionally, here's some low-level differences that seemed interesting:
- under symbol 0x2f77: 27ms -> 147ms (GC, probably just different timing)
- under Paint::PresShell::Paint: 65ms -> 106ms
- under symbol 0x28e8 (way deep under reallyNext() @ driver.html:90, through 0x7fcd6), there's a new invocation of RtlpWaitOnCriticalSection, taking 16ms
- under JS::EvaluateString ->0x118ec7:
    - under 0x12d38: 741ms -> 815ms:
        - under 0x27a7: 306ms -> 523ms, all of which is spent in KiFastSystemCallRet (so waiting or doing things we don't record?)
        - some other regressions and a big improvement under 0x4f1c8


To summarize:
- Two tests (FastLog2 and calcDFT) seem to have regressed
- It seems like there's a regression in Paint::PresShell::Paint
- RtlpWaitOnCriticalSection appears in the new profile, taking 16ms
- time spent in or under KiFastSystemCallRet increased quite a lot

Unfortunately, I don't have any idea what to make of all of this, but maybe it's useful to somebody else.
So.  Profiling slows things down significantly, so I wanted to have "ms" values roughly similar to those in the profiling. 
Below are results w/ profiling - best of several runs. Can see they are a lot slower than w/o SPS.
So I'm not totally sure profiling is revealing same thing as w/o.
If I was on a linux system, I'd compare to a system profiler, but don't have much experience w/ that in Windows and no particular desire to learn.
2012-12-11 (good): http://x.m8y.org/89  (3041.9ms)
2012-12-12  (bad): http://x.m8y.org/88  (5033.0ms)

Also, kaelan expressed concern about symbols or lack thereof.
I'd assumed that anyone loading the profile in the same nightly w/ symbols setup, would see nice readable traces.  
If that's not the case, I can certainly try getting symbols set up on here and rerecord.
Just lemme know. I'd rather not add large redundant files.
So. Yeah.  Just 'cause I'm a bit unclear as to whether SPS is revealing the right thing...

With profiling:
2012-12-11 (good): http://x.m8y.org/89  (3041.9ms)
2012-12-12  (bad): http://x.m8y.org/88  (5033.0ms)

Without profiling:
2012-12-11 (good): http://x.m8y.org/8b  (2503.8ms)
2012-12-12  (bad): http://x.m8y.org/8a  (3668.9ms)

Seemed to me more things were a lot slower w/ profiling.
Also the "bad" build was 65% slower w/ profiling but only 47% without.

So profiling exaggerated the difference?
So, FWIW, latest Nightly is allllmost as fast as FF18 was on this setup.
http://x.m8y.org/8d

I assume that's due to BC added to the Ion mix.
I tossed it on http://m8y.org/tmp/kraken.xhtml  - compare w/ FF22 (20130308) which was exhibiting those symptoms that appeared above.

So.  w/ BC y'all don't use Jäger anymore right?  p'raps that was source of the prob?
The other thing I noticed is it got a lot slower between FF19 and FF20 in release builds.
Firefox 19  http://x.m8y.org/7w    2428.2ms ± 3.1%

FF20 (beta) http://x.m8y.org/85    3241.0ms ± 3.5%
Firefox 20  http://x.m8y.org/8e    3183.8ms ± 3.6%

As a result, stable Firefox dropped from 107.5% of stable Chrome's speed (19 vs 25) to 140.9% of stable in 20.

I suppose that's something totally different.  I guess I can try bisecting for something there too, but I guess if FF23 fixes everything, might just be a waste of time.
Keywords: perf, regression
Depends on: 890524
I'm preeeetty sure this can be closed at this point.  There were a couple of commits that basically closed this gap.
Can see how http://m8y.org/tmp/kraken.xhtml now has FF24 on parity w/  FF18, and FF25 passing..
From the results I posted on comment 20, and compare with Nightly, I see that crypto-aes and crypto-sha256-iterative are not a problem anymore. But crypto-ccm regressed a lot.
Firefox 20 - 129ms
Firefox 22 - 149ms
Nightly 28 - 221ms

But maybe I should fill a new bug about this.
nemo, can you see if you see this regression too?
Flags: needinfo?(bugs)
Guilherme, would you mind testing some versions between 22 and 28, like 25?

(And thank you for looking at all these perf bugs!)
Flags: needinfo?(guijoselito)
Before I continue, I checked Firefox 26 (Beta) and there I see 197ms. But this regression doesn't appear on AWFY.

Also, looking on AWFY, overlaping the multiple regressions we can see on the 32-bit machine and 64-bit machine, I see this bugs as regressors:
https://bugzilla.mozilla.org/show_bug.cgi?id=858381
https://bugzilla.mozilla.org/show_bug.cgi?id=867426
https://bugzilla.mozilla.org/show_bug.cgi?id=678037
Don't take this as science, because AWFY doesn't seem to be super stable, but I think this 3 bugs showed up on both machines.
Flags: needinfo?(guijoselito)
Yeah. The whole reason this bug went unnoticed was that things were fine on AWFY.  Shell was ok, browser wasn't.
There was some discussion of a harness for running JS tests in browsers in IRC.  I don't know if any progress was made.
Flags: needinfo?(bugs)
(In reply to nemo from comment #32)
> There was some discussion of a harness for running JS tests in browsers in
> IRC.  I don't know if any progress was made.

We have been running JS tests in the browser via the Talos harness for years, and it did detect this regression!

http://graphs.mozilla.org/graph.html#tests=[[232,1,14],[232,1,12],[232,1,1],[232,1,22]]&displayrange=365

The automated regression detector sent email to dev-tree-management:

https://groups.google.com/forum/#!searchin/mozilla.dev.tree-management/kraken$20regression$20inbound/mozilla.dev.tree-management/GbY_T4CyXZE/Mnja9SQGpwEJ

The regression was discussed in bug 813559 comment 17 and later, though it was dismissed by the developers at first.
I was a bit unclear.  The discussion was about something like AWFY (multiple engines) but in browser.
That would allow testing IE and other browsers too.
I was totally unaware that this had been picked up in existing tests tho.  Go figure.  At the time I'd nagged people about it, it would have been useful to have been able to reference that.
Bug 833898 comment 9 also pointed out one of the browser-only regressions relevant to this bug, and was also dismissed initially.
Assignee: general → nobody
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: