650189 - Recent performance regression in loading huge tinderbox logs

Reporter

Description

•

14 years ago

See the URL for example (warning: it might kill your browser). I've recently (the past couple of weeks maybe) am seeing a huge regression in loading these pages. I leave Firefox for like 5 minutes and then kill it, because it's absolutely unresponsive. I don't have an --enable-profiling opt build to profile this with right now, but shark on our (useless for profiling) nightlies shows me some nsTextFrame stuff. This is the worst regression I've ever seen. Can somebody profile please?

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 1

•

14 years ago

Simon is the most likely culprit :-)

Assignee: nobody → smontagu

Josh Matthews [:jdm]

Comment 2

•

14 years ago

I was loading merrily tinderbox logs on a nightly no older than a couple of days.

Boris Zbarsky [:bzbarsky]

Comment 3

•

14 years ago

> but shark on our (useless for profiling) nightlies http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-central/firefox-6.0a1.en-US.mac-shark.dmg tends to have a useful for profiling nightly. I'll take a look if someone doesn't beat me to it, but I suspect regression range may be more useful than profile.

Alice0775 White

Comment 4

•

14 years ago

Please double check the following range. Regression window(m-c hourly): Works: http://hg.mozilla.org/mozilla-central/rev/d114ab94a859 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.2a1pre) Gecko/20110411 Firefox/4.2a1pre ID:20110411100945 Broken: http://hg.mozilla.org/mozilla-central/rev/f65b79eeabd4 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.2a1pre) Gecko/20110411 Firefox/4.2a1pre ID:20110411101509 Pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=d114ab94a859&tochange=f65b79eeabd4 Regression window(cedar hourly): works: http://hg.mozilla.org/projects/cedar/rev/db98c4f46347 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.2a1pre) Gecko/20110409 Firefox/4.2a1pre ID:20110410231038 Broken: http://hg.mozilla.org/projects/cedar/rev/2ad43389d244 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.2a1pre) Gecko/20110409 Firefox/4.2a1pre ID:20110411010443 Pushlog: http://hg.mozilla.org/projects/cedar//pushloghtml?fromchange=db98c4f46347&tochange=2ad43389d244

Justin Wood (:Callek)

Comment 5

•

14 years ago

(In reply to comment #4) > Regression window(cedar hourly): > Pushlog: > http://hg.mozilla.org/projects/cedar//pushloghtml?fromchange=db98c4f46347&tochange=2ad43389d244 Which is smontagu, as roc suspected. thanks alice

(no longer active)

Reporter

Comment 6

•

14 years ago

(In reply to comment #5) > (In reply to comment #4) > > Regression window(cedar hourly): > > Pushlog: > > http://hg.mozilla.org/projects/cedar//pushloghtml?fromchange=db98c4f46347&tochange=2ad43389d244 > > Which is smontagu, as roc suspected. thanks alice Yeah, makes sense to me. We should investigate to see if this problem can show up in other web pages, and back out bug 263359 from aurora.

tracking-firefox5: --- → ?

Boris Zbarsky [:bzbarsky]

Updated

•

14 years ago

Blocks: 263359

Simon Montagu :smontagu

Assignee

Comment 7

•

14 years ago

Very strange, because all my testing showed a huge improvement in loading tinderbox logs from bug 263359. Why is this tinderbox log different from all other tinderbox logs?

Simon Montagu :smontagu

Assignee

Comment 8

•

14 years ago

Ah, I was forgetting that bug 646359 got split out from bug 263359. Testing with the patch from there applied makes the test URL load speedily for me (a local copy of it, that is). I'm still surprised that bug 263359 made things significantly worse, though.

Depends on: 646359

Simon Montagu :smontagu

Assignee

Comment 9

•

14 years ago

On second thoughts, testing with the local copy doesn't prove anything because it gets decoded as Latin1

Simon Montagu :smontagu

Assignee

Comment 10

•

14 years ago

Bug 646359 doesn't help with this.

No longer depends on: 646359

Simon Montagu :smontagu

Assignee

Comment 11

•

14 years ago

So this is basically the same old problem with incremental bidi resolution in long paragraphs that we have talked about in the past, e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=566066#c2. Bug 263359, which was supposed to help with that, was a performance win with smaller files, but it seems that there is a critical mass above which doing bidi resolution on each line separately takes longer than doing it on the whole block. FTR, when I saw performance improvements I was testing with tinderbox logs around 10MB, 85K lines; the log in the URL is just under 32MB, 146K lines.

(no longer active)

Reporter

Comment 12

•

14 years ago

Does that mean that we can't effectively get good perf in both cases (long lines and large blocks)?

Simon Montagu :smontagu

Assignee

Comment 13

•

14 years ago

I'm sure there are other things we can do. One idea that I want to investigate is a NS_LINE_NEEDS_BIDI_RESOLUTION flag for preformatted elements, similar to the NS_BLOCK_NEEDS_BIDI_RESOLUTION that we already use.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 14

•

14 years ago

Why is doing bidi resolution on each line separately a *huge* regression from doing it on the whole block? That sounds surprising to me. Are we sure that's all that's going on here?

Sheila Mooney

Comment 15

•

14 years ago

Why are we nominating this for Aurora? What are the risks?

JP Rosevear [:jpr]

Updated

•

14 years ago

tracking-firefox5: ? → +

JP Rosevear [:jpr]

Comment 16

•

14 years ago

Will track, we need to get to a solid root cause analysis and find out if backout or something else is a good strategy.

(no longer active)

Reporter

Comment 17

•

14 years ago

(In reply to comment #15) > Why are we nominating this for Aurora? What are the risks? We know that the patches landed in bug 263359 (which is on Aurora) have caused huge performance regressions for at least Tinderbox log pages. We don't know whether this regression can also be observed on other (regular) web pages or not. We need to track this on Aurora, to make sure that we back out bug 263359 if we need to.

Simon Montagu :smontagu

Assignee

Comment 18

•

14 years ago

Attached patch w-i-p (deleted) — Details — Splinter Review

It's cool when working on one bug provides illumination for another one... At least part of this was caused by moving the initialization of the nsBlockInFlowLineIterator from Resolve() into ResolveParagraph() in http://hg.mozilla.org/mozilla-central/rev/ad5009233ae9. That can't be the whole story though, because it doesn't fit the regression range, and wasn't checked in to Aurora. This patch also regresses bug 650189 :(

Daniel C

Comment 19

•

14 years ago

(In reply to comment #18) > This patch also regresses bug 650189 :( This is bug 650189.

Simon Montagu :smontagu

Assignee

Comment 20

•

14 years ago

The patch regress *bug 645193*

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 21

•

14 years ago

I think we should do a backout to fix this, both on Aurora and mozilla-central.

Simon Montagu :smontagu

Assignee

Comment 22

•

14 years ago

Can someone double check whether the regression exists in Aurora before backing out there? I'm still seeing improvement rather than regression in Aurora compared to FF4, but my network connection speed is very irregular and I don't know how reliable my data are.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 23

•

14 years ago

It should be possible to simulate a testcase with a script loop that document.write()s chunks of text and then calls document.body.offsetHeight to flush reflow periodically.

(no longer active)

Reporter

Comment 24

•

14 years ago

Using the link in the URL field, I can reproduce the exact regression on Aurora as well. Things *might* be different under slow connections, in which case roc's suggestion in comment 23 might help, but I think that we have enough information now to assert that this regression exists both on mozilla-central and Aurora.

JP Rosevear [:jpr]

Comment 25

•

14 years ago

Since there is no proposed fix, please backout as per #21 as soon as possible.

(no longer active)

Reporter

Comment 26

•

14 years ago

(In reply to comment #25) > Since there is no proposed fix, please backout as per #21 as soon as possible. Simon, can you please take care of this? I'm not sure which patches to back out exactly.

Simon Montagu :smontagu

Assignee

Comment 27

•

14 years ago

http://hg.mozilla.org/mozilla-aurora/rev/13470a83e372 http://hg.mozilla.org/mozilla-aurora/rev/a307c807cfab (bustage fix)

Simon Montagu :smontagu

Assignee

Comment 28

•

14 years ago

http://hg.mozilla.org/mozilla-central/rev/fa0295a97f1b Ehsan, can you confirm that the backout fixes the regression?

(no longer active)

Reporter

Comment 29

•

14 years ago

(In reply to comment #28) > http://hg.mozilla.org/mozilla-central/rev/fa0295a97f1b > > Ehsan, can you confirm that the backout fixes the regression? It does, yes. Thanks!

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Asa Dotzler [:asa]

Updated

•

14 years ago

status-firefox5: --- → fixed