Closed Bug 177958 Opened 22 years ago Closed 22 years ago

DHTML Performance Regression

Categories

(Core :: Layout, defect, P3)

x86
Windows XP
defect

Tracking


RESOLVED WONTFIX
Future

People

(Reporter: kerz, Unassigned)

References


Details

(Keywords: perf, regression)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.2b) Gecko/20021023 Netscape/7.0+
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.2b) Gecko/20021023 Netscape/7.0+

We're running some tests using the DHTML Pageload suite, and we've found that one of the tests' times has risen a huge amount compared to the 1.0 branch. If you go to the above URL, the main problem is with the slidein test. It's taking 3-4 minutes to finish, where it takes no more than 15 seconds on the branch.

Reproducible: Always

Steps to Reproduce:
1. Run the pageload test with a branch build
2. Run the pageload test with a trunk build
3. Compare

Actual Results: Trunk build's results are terrible.

Working on a standalone testcase for that page. You can see the source of the page at http://pageload.craptastic.org/pages/dhtml/slidein/index.html but it won't do anything, because the JS gets triggered by the test script.
Unconfirmed? I think not. CCing some folks. The internal pageload server for this is http://cowtools.mcom.com/page-loader-dhtml/loader.pl
Status: UNCONFIRMED → NEW
Ever confirmed: true
This is principally due to the fix for bug 123273 - "setTimeout(something, 0) causes 100% CPU constant" - which only applies to the trunk. The fix in that case was to set a minimum of 10msec as the timeout time to prevent a flood of timers. If I modify the testcase to explicitly use a timeout of 10msec and run that test with a branch build, I get essentially the same result as I do in a trunk build. (In other words, this change in performance is the inevitable result of setting that 10msec minimum timeout time.)

However (I'm a little curious about this), I tried backing out the fix in GlobalWindow.cpp in a trunk build and running the testcase with a timeout of 1msec. The test starts out moving the divs across the page quite rapidly, but then "bogs down", presumably overrun by timer requests. I'm wondering what particular change on the trunk makes us more susceptible to this 'overrun' than we are on the branch (although I vaguely recall that some significant changes in timer code landed after we branched).
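A rough model of why the clamp stretches the test time (a sketch only; the 10 ms floor is from the bug 123273 fix, but the frame counts below are illustrative, not taken from the real suite):

```javascript
// Sketch: how a minimum-timeout clamp stretches a setTimeout-driven
// animation. MIN_TIMEOUT_MS mirrors the 10 ms floor added on the trunk;
// the frame count used below is hypothetical.
const MIN_TIMEOUT_MS = 10;

// Effective delay after clamping a requested setTimeout() interval.
function clampedDelay(requestedMs) {
  return Math.max(requestedMs, MIN_TIMEOUT_MS);
}

// Lower bound on total wall-clock time for an animation that schedules
// `frames` moves via setTimeout(step, requestedMs).
function minAnimationTimeMs(frames, requestedMs) {
  return frames * clampedDelay(requestedMs);
}

// With a requested 1 ms interval, 1000 frames take at least 10 seconds
// on a clamped build, versus roughly 1 second if 1 ms were honored.
```

This is why explicitly using a 10 msec timeout on the branch reproduces the trunk numbers: the clamp, not rendering speed, sets the floor on the total time.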
> it's taking 3-4 minutes to finish, where it takes no more than 15 seconds on
> the branch

Funny. For me, it takes about 70 seconds on trunk, 18 seconds on branch, for win2k/500MHz. (But then, in some ways, this test is really about timer resolution (and OS/system dependencies), and not so much about painting/reflow/etc.)
Blocks: 21762
QA Contact: gerardok → trix
jst and kevin, can you guys help take a look at this major DHTML regression?
This is probably the result of the fix for http://bugzilla.mozilla.org/show_bug.cgi?id=164931, which changes the PLEvent notification mechanism to use timers 2 seconds after the document is loaded. John, does it "bog down" 2 seconds after the page loads?

I agree with John's assessment that the testcase http://pageload.craptastic.org/pages/dhtml/slidein/index.html is only measuring the frequency of the timer callback, not the reflow/painting performance when moving the object. For more detail see http://bugzilla.mozilla.org/show_bug.cgi?id=164931#c24
Yeah, I see a degradation of performance on all the tests in the suite about 2 seconds after they start.
I'm currently trying to repair my windows box, but from memory, yeah, it did seem to 'bog down' on that test after about 2 seconds.
I think this bug is a WONTFIX. The new behavior of limiting the timer frequency is intended and provides up to 100 frames per second for animations, which is plenty. This also matches IE's behavior: IE uses native timers, which have a 10ms granularity.

I would suggest changing the DHTML Pageload suite to do the following:
1. Set the timeout between moves to 10ms instead of 1ms.
2. Change the testcases so the machines we typically run the perf test on can *not* reflow/render within the 10ms interval. This could be accomplished by moving a large number of elements simultaneously.
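The idea behind suggestion 2 can be sketched with a simple frame-time model (names and numbers here are illustrative, not from the actual suite):

```javascript
// Sketch: when a timed DHTML test measures rendering rather than timer
// granularity. If reflow+paint for one move takes longer than the timer
// interval, the per-frame rendering cost dominates the measured time.
const TIMER_INTERVAL_MS = 10; // the clamped minimum

// Time one frame actually takes: the timer cannot fire before the
// interval elapses, and the next move cannot be processed before
// rendering of the current one finishes.
function frameTimeMs(renderCostMs) {
  return Math.max(TIMER_INTERVAL_MS, renderCostMs);
}

// True when the test result reflects rendering performance instead of
// the fixed timer interval.
function measuresRendering(renderCostMs) {
  return renderCostMs > TIMER_INTERVAL_MS;
}
```

Moving enough elements per tick pushes the render cost per frame above 10ms, so the suite would start measuring reflow/paint throughput instead of the timeout floor.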
Resolving as WONTFIX
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → WONTFIX
resolving wontfix
Status: RESOLVED → VERIFIED
Wait, don't move this from WONTFIX to VERIFIED yet! According to our DHTML test suite, this is a 200-500% regression. We need some second opinions on this WONTFIX decision. CC'ing folks...

win98 800MHz
---------------------
1.0 branch   4341 ms
trunk       20042 ms
ie6         21195 ms

win98 1.7GHz
---------------------
1.0 branch   4587 ms
trunk       20131 ms
ie6         18647 ms

winXP 800MHz
---------------------
1.0 branch   4655 ms
trunk        9654 ms
ie6          7634 ms

winXP 1.7GHz
---------------------
1.0 branch   4752 ms
trunk        9711 ms
ie6          7076 ms
Status: VERIFIED → REOPENED
Resolution: WONTFIX → ---
And Linux and Mac numbers...

linux RH7.2 800MHz
---------------------
1.0 branch   4648 ms
trunk        9451 ms

linux RH7.2 1.7GHz
---------------------
1.0 branch   3962 ms
trunk        8732 ms

mac OSX G4 733MHz
---------------------
1.0 branch   4646 ms
trunk        8436 ms
Actually, I was okay with the wontfix resolution. It looks like a big regression in performance but, as noted in comment #3, it's really just the direct consequence of setting a minimum time of 10ms for setTimeout() calls. I'd like to see a "real world" example where that cap is significantly hurting our usability/performance before revisiting that decision. But, other opinions welcome.
Note: the DHTML perf results using the 1.0 branch are not valid, because we used to allow painting to be starved; i.e., we would move the element but we would not always paint it. Using system timers for PLEvent dispatch fixed this, because the timer will not fire until all painting is complete. System timers on Win32 are limited to a 10ms interval.

Here's why we have to limit it to 10ms:
1. Compatibility with DHTML created to run on IE.
2. Allowing sub-10ms intervals causes high CPU usage when compared with IE.
3. If we require sub-10ms intervals we cannot use system timers to dispatch PLEvents, and painting will be starved again.
4. The only usage for sub-10ms intervals I've seen is these DHTML perf tests, which I believe are flawed. In my opinion, these tests need to be measuring reflow/painting performance, not the minimum timeout interval.
5. A timeout interval of 10ms allows for 100 frames per second, which is more than adequate.
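The arithmetic behind point 5, as a quick check:

```javascript
// A 10 ms timer interval caps a setTimeout-driven animation at
// 1000 ms / 10 ms = 100 frames per second.
const INTERVAL_MS = 10;
const maxFps = 1000 / INTERVAL_MS; // 100
```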
Another indication that the DHTML perf tests are flawed is that they show the Mozilla 1.0 branch being 200-500% faster than IE; this does not match what I've seen on real-world examples.
OK, we've updated the test suite to allow specification of the timeout. It defaults to 10 ms. We are still showing a performance regression (although not as bad as before):

2002-12-05-08-1.0   - 5447 ms
2002-12-05-08-trunk - 6988 ms
IE 6                - 6835 ms

This was using my desktop machine (2.4 GHz).
It's still a bogus comparison because the 1.0 branch can still starve painting when the timeout is set to 10ms.
So, what would be the right approach to make a comparison?
Someone would have to modify the 1.0 branch to force it to synchronously paint any invalidated areas before processing the PLEvent associated with the JavaScript timer driving the animation. I don't think it's worth spending time to do this. There is only a 2% performance difference between the trunk and IE according to comment 17.
I think so too - time is better spent on bug 157681 and on optimizing our painting performance now, I'd say. What's going to be done about this one now (WONTFIX)?
Resolving as WONTFIX
Status: REOPENED → RESOLVED
Closed: 22 years ago
Priority: -- → P3
Resolution: --- → WONTFIX
Target Milestone: --- → Future
No longer blocks: 21762
Blocks: 21762