Closed Bug 124697 Opened 23 years ago Closed 23 years ago

5% of large page spent measuring time intervals in HTMLContentSink::DidProcessAToken

Tracking

()

Status:

VERIFIED FIXED

Milestone:

mozilla1.0

People

(Reporter: bratell, Assigned: kmcclusk)

References

(
URL
)

Details

(Keywords: perf)

Attachments

(1 file, 2 obsolete files)

Sample the clock after every 200 tokens when in low-frequency parser interupt mode 23 years ago Kevin McCluskey (gone) (deleted), patch		Details \| Diff \| Splinter Review
Same as previous patch with mLastSampledUserEventTime initialized 23 years ago Kevin McCluskey (gone) (deleted), patch	harishd : review+ attinasi : superreview+	Details \| Diff \| Splinter Review
Same as previous patch with #define for 200 specified as NS_MAX_TOKENS_DEFLECTED_IN_LOW_FREQ_MODE 23 years ago Kevin McCluskey (gone) (deleted), patch	asa : approval+	Details \| Diff \| Splinter Review

Daniel Bratell

Reporter

Description

•

23 years ago

I did a quantify run of the page in bug 56854 and got 5% of the time spent in 
HTMLContentSink::DidProcessAToken calling PR_IntervalNow. This a quite slow 
function (bug 124695) and shouldn't be called 90000 times which we do now. 

Apparently this function introduced with the interruptible parsing code. I think 
it might be better to use a timer of some kind than polling the clock. Also, it 
could be better to use PR_Now that is faster than PR_IntervalTime (despite the 
promises in the NSPR documentation).

[Not sure of the component. Could be DOM content or layout or something else 
too.]

Daniel Bratell

Reporter

Updated

•

23 years ago

Blocks: 56854

Depends on: 124695

Keywords: perf

harishd

Comment 1

•

23 years ago

Kevin, could you please take a look at this?

Assignee: harishd → kmcclusk

Kevin McCluskey (gone)

Assignee

Comment 2

•

23 years ago

For interruptibing to work we must measure the actual time to process each
token. so it is not be feasible to use a timer. The processing of a single token
can result in a large amount of down-stream processing (frame construction and
initial reflow) and it is not predictable what types of tokens may trigger this. 

I was aware of the overhead of sampling the system clock and I did some
measurements on WIN98 to try a minimize the overhead by using PR_IntervalNow
instead of PR_Interval. PR_IntervalNow was approx 10X faster than PR_Interval.

There really are only two viable solutions:
- Turn of interruptible parsing which will make Mozilla unresponsive during
large document loads.
- Look for an even lighter weight way to measure time then PR_IntervalNow.

Kevin McCluskey (gone)

Assignee

Updated

•

23 years ago

Target Milestone: --- → mozilla1.0.1

Kevin McCluskey (gone)

Assignee

Comment 3

•

23 years ago

Bulk moving Mozilla1.01 bugs to future-P1. I will pull from the future-P1 list
to schedule bugs for post Mozilla1.0 milestones

Priority: -- → P1

Target Milestone: mozilla1.0.1 → Future

Asa Dotzler [:asa]

Updated

•

23 years ago

Keywords: mozilla1.0+

Kevin McCluskey (gone)

Assignee

Comment 4

•

23 years ago

I think I may have a solution for this bug. Although I must sample the system
clock for each token while we are in the high frequency parser interruption mode
which provides fast UI response at the expense of page load time, I think I can
limit the number of calls to PR_IntervalNow in DidProcessAToken when we are in
the low frequency parser interruption mode which interrupts the parser very
infrequently to provide fast page load.

Kevin McCluskey (gone)

Assignee

Comment 5

•

23 years ago

Attached patch Sample the clock after every 200 tokens when in low-frequency parser interupt mode (obsolete) (deleted) — Details — Splinter Review

This patch should reduce the overhead for sampling the clock from 5% down to
.025%. 
I tested the patch loading
:http://lxr.mozilla.org/seamonkey/source/layout/html/style/src/nsCSSFrameConstructor.cpp

Daniel Bratell

Reporter

Comment 6

•

23 years ago

Comment on attachment 72425 [details] [diff] [review]
Sample the clock after every 200 tokens when in low-frequency parser interupt mode

Looks fine and safe to me. Only, you never initialize mLastSampledUserEventTime
before accessing it for the first time. It won't matter in this particular case
but it will cause warnings in people's purify logs and make the parser look
bad. :-)

Maybe make the 200 into a constant but I'm not sure that would make anything
better. It's used in only one place and it's well documented so I would say
it's most readable the way you wrote it. 

I'm a little curious how to trigger the low-frequency parser. Do I have to keep
the pointer absolutely motionless for instance?

Kevin McCluskey (gone)

Assignee

Updated

•

23 years ago

Attachment #72425 - Attachment is obsolete: true

Kevin McCluskey (gone)

Assignee

Comment 7

•

23 years ago

Attached patch Same as previous patch with mLastSampledUserEventTime initialized (obsolete) (deleted) — Details — Splinter Review

Kevin McCluskey (gone)

Assignee

Comment 8

•

23 years ago

"I'm a little curious how to trigger the low-frequency parser. Do I have to keep
the pointer absolutely motionless for instance?"

Yes, you should not type, move the mouse or click if you want the fastest load
time. Once you stop moving the mouse, etc. it switches back into low-frequency
interruption of the parser so the page load will speed back up. This was a
necessary tradeoff because it was not possible to find a static value that
provided good user response without affecting page load times.

Daniel Bratell

Reporter

Comment 9

•

23 years ago

Looks nice. This should be trivial to review for whoever is suitable to do that.
I don't think my r= counts. :-)

It would be nice if the algorithm could be more intelligent and only break for
events that mattered, but that would probably require some sort of time travel. :-)

Marc Attinasi

Comment 10

•

23 years ago

Comment on attachment 72599 [details] [diff] [review]
Same as previous patch with mLastSampledUserEventTime initialized

sr=attinasi, but I wouldn't mind 200 becoming a symbolic constant (#define).
Was the empirical testing on a 'typical' machine?

Attachment #72599 - Flags: superreview+

Kevin McCluskey (gone)

Assignee

Comment 11

•

23 years ago

Attached patch Same as previous patch with #define for 200 specified as NS_MAX_TOKENS_DEFLECTED_IN_LOW_FREQ_MODE (deleted) — Details — Splinter Review

Kevin McCluskey (gone)

Assignee

Comment 12

•

23 years ago

The empirical testing was on a 750Mhz AMD machine running WinXP. I tested it
with a value of 1000. So I assume that scaling it back to 200 would provide
acceptable performance on machines that where 1/5th as fast.

Kevin McCluskey (gone)

Assignee

Updated

•

23 years ago

Attachment #72599 - Attachment is obsolete: true

Kevin McCluskey (gone)

Assignee

Updated

•

23 years ago

Keywords: nsbeta1+

Target Milestone: Future → mozilla1.0

harishd

Comment 13

•

23 years ago

Comment on attachment 72631 [details] [diff] [review]
Same as previous patch with #define for  200 specified as NS_MAX_TOKENS_DEFLECTED_IN_LOW_FREQ_MODE

>+    nsCOMPtr<nsIViewManager> vm;
>+    shell->GetViewManager(getter_AddRefs(vm));

Would vm always exist? If not you need to add NS_ENSURE_TRUE.

>+      if (mDeflectedCount < NS_MAX_TOKENS_DEFLECTED_IN_LOW_FREQ_MODE) {
>+        mDeflectedCount++;

What is mDeflectedCount? What does it mean?

Kevin McCluskey (gone)

Assignee

Comment 14

•

23 years ago

"Would vm always exist? If not you need to add NS_ENSURE_TRUE."

Lots of other things will break if the shell doesn't have a viewmanager but I
can go ahead and add the NS_ENSURE_TRUE. In addition, We would already be
breaking in the current builds since this patch just moves the existing code
which gets the viewmanager to the top of the function.


"What is mDeflectedCount? What does it mean?"

Its the number of tokens that have been processed while in the low frequency
parser interrupt mode without falling through to the logic which decides whether
to switch to the high frequency parser interrupt mode.

harishd

Comment 15

•

23 years ago

Comment on attachment 72599 [details] [diff] [review]
Same as previous patch with mLastSampledUserEventTime initialized

r=harishd

Attachment #72599 - Flags: review+

Asa Dotzler [:asa]

Comment 16

•

23 years ago

Comment on attachment 72631 [details] [diff] [review]
Same as previous patch with #define for  200 specified as NS_MAX_TOKENS_DEFLECTED_IN_LOW_FREQ_MODE

a=asa (on behalf of drivers) for checkin to the 1.0 trunk

Attachment #72631 - Flags: approval+

Kevin McCluskey (gone)

Assignee

Comment 17

•

23 years ago

Patch checked into the trunk.

Status: NEW → RESOLVED

Closed: 23 years ago

Resolution: --- → FIXED

Moied

Comment 18

•

23 years ago

verified fix checked in CVS (Rev 3.525)

Status: RESOLVED → VERIFIED

Ivan Eisen

Comment 19

•

23 years ago

I am having a problem on NT4 and Windows 2000 that may be related to this fix.
For large text documents nothing appears on my screen for many seconds. IN
versions of Mozilla before 9.9 the screen filled almost immediately. 
I can overcome the long delay on the current version by moving the mouse over
where the scrollbar should appear.  This causes the screen to immediately paint. For
a document that takes 20 seconds to complete I sometimes wait almost the full
20 before seeing the screen change. 

Ivan

Daniel Bratell

Reporter

Comment 20

•

23 years ago

Large documents that have few tags? 

Maybe we should use a timer instead. A timer that sets a volatile boolean to
true which tells us to update.

Markus Hübner

Comment 21

•

22 years ago

Ivan, do you still seee this?

Ivan Eisen

Comment 22

•

22 years ago

I saw the same long delay on the current NT 1.01 build.  I don't get this
problem on OS/2 builds. I think a fix was dropped for the delay problem bugzilla bug
142635 (Painting suppression doesn't unsuppress correctly). 

Ivan

You need to log in before you can comment on or make changes to this bug.