Closed
Bug 650189
Opened 14 years ago
Closed 14 years ago
Recent performance regression in loading huge tinderbox logs
Categories
(Core :: Layout: Text and Fonts, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: ehsan.akhgari, Assigned: smontagu)
References
()
Details
(Keywords: dogfood, perf, regression)
Attachments
(1 file)
(deleted),
patch
|
Details | Diff | Splinter Review |
See the URL for example (warning: it might kill your browser).
I've recently (the past couple of weeks maybe) am seeing a huge regression in loading these pages. I leave Firefox for like 5 minutes and then kill it, because it's absolutely unresponsive.
I don't have an --enable-profiling opt build to profile this with right now, but shark on our (useless for profiling) nightlies shows me some nsTextFrame stuff.
This is the worst regression I've ever seen. Can somebody profile please?
Simon is the most likely culprit :-)
Assignee: nobody → smontagu
Comment 2•14 years ago
|
||
I was loading merrily tinderbox logs on a nightly no older than a couple of days.
Comment 3•14 years ago
|
||
> but shark on our (useless for profiling) nightlies
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-central/firefox-6.0a1.en-US.mac-shark.dmg tends to have a useful for profiling nightly.
I'll take a look if someone doesn't beat me to it, but I suspect regression range may be more useful than profile.
Comment 4•14 years ago
|
||
Please double check the following range.
Regression window(m-c hourly):
Works:
http://hg.mozilla.org/mozilla-central/rev/d114ab94a859
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.2a1pre) Gecko/20110411 Firefox/4.2a1pre ID:20110411100945
Broken:
http://hg.mozilla.org/mozilla-central/rev/f65b79eeabd4
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.2a1pre) Gecko/20110411 Firefox/4.2a1pre ID:20110411101509
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=d114ab94a859&tochange=f65b79eeabd4
Regression window(cedar hourly):
works:
http://hg.mozilla.org/projects/cedar/rev/db98c4f46347
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.2a1pre) Gecko/20110409 Firefox/4.2a1pre ID:20110410231038
Broken:
http://hg.mozilla.org/projects/cedar/rev/2ad43389d244
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.2a1pre) Gecko/20110409 Firefox/4.2a1pre ID:20110411010443
Pushlog:
http://hg.mozilla.org/projects/cedar//pushloghtml?fromchange=db98c4f46347&tochange=2ad43389d244
Comment 5•14 years ago
|
||
(In reply to comment #4)
> Regression window(cedar hourly):
> Pushlog:
> http://hg.mozilla.org/projects/cedar//pushloghtml?fromchange=db98c4f46347&tochange=2ad43389d244
Which is smontagu, as roc suspected. thanks alice
Reporter | ||
Comment 6•14 years ago
|
||
(In reply to comment #5)
> (In reply to comment #4)
> > Regression window(cedar hourly):
> > Pushlog:
> > http://hg.mozilla.org/projects/cedar//pushloghtml?fromchange=db98c4f46347&tochange=2ad43389d244
>
> Which is smontagu, as roc suspected. thanks alice
Yeah, makes sense to me.
We should investigate to see if this problem can show up in other web pages, and back out bug 263359 from aurora.
tracking-firefox5:
--- → ?
Assignee | ||
Comment 7•14 years ago
|
||
Very strange, because all my testing showed a huge improvement in loading tinderbox logs from bug 263359. Why is this tinderbox log different from all other tinderbox logs?
Assignee | ||
Comment 8•14 years ago
|
||
Ah, I was forgetting that bug 646359 got split out from bug 263359. Testing with the patch from there applied makes the test URL load speedily for me (a local copy of it, that is).
I'm still surprised that bug 263359 made things significantly worse, though.
Depends on: 646359
Assignee | ||
Comment 9•14 years ago
|
||
On second thoughts, testing with the local copy doesn't prove anything because it gets decoded as Latin1
Assignee | ||
Comment 11•14 years ago
|
||
So this is basically the same old problem with incremental bidi resolution in long paragraphs that we have talked about in the past, e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=566066#c2.
Bug 263359, which was supposed to help with that, was a performance win with smaller files, but it seems that there is a critical mass above which doing bidi resolution on each line separately takes longer than doing it on the whole block.
FTR, when I saw performance improvements I was testing with tinderbox logs around 10MB, 85K lines; the log in the URL is just under 32MB, 146K lines.
Reporter | ||
Comment 12•14 years ago
|
||
Does that mean that we can't effectively get good perf in both cases (long lines and large blocks)?
Assignee | ||
Comment 13•14 years ago
|
||
I'm sure there are other things we can do. One idea that I want to investigate is a NS_LINE_NEEDS_BIDI_RESOLUTION flag for preformatted elements, similar to the NS_BLOCK_NEEDS_BIDI_RESOLUTION that we already use.
Why is doing bidi resolution on each line separately a *huge* regression from doing it on the whole block? That sounds surprising to me. Are we sure that's all that's going on here?
Comment 15•14 years ago
|
||
Why are we nominating this for Aurora? What are the risks?
Updated•14 years ago
|
Comment 16•14 years ago
|
||
Will track, we need to get to a solid root cause analysis and find out if backout or something else is a good strategy.
Reporter | ||
Comment 17•14 years ago
|
||
(In reply to comment #15)
> Why are we nominating this for Aurora? What are the risks?
We know that the patches landed in bug 263359 (which is on Aurora) have caused huge performance regressions for at least Tinderbox log pages. We don't know whether this regression can also be observed on other (regular) web pages or not. We need to track this on Aurora, to make sure that we back out bug 263359 if we need to.
Assignee | ||
Comment 18•14 years ago
|
||
It's cool when working on one bug provides illumination for another one...
At least part of this was caused by moving the initialization of the nsBlockInFlowLineIterator from Resolve() into ResolveParagraph() in http://hg.mozilla.org/mozilla-central/rev/ad5009233ae9.
That can't be the whole story though, because it doesn't fit the regression range, and wasn't checked in to Aurora. This patch also regresses bug 650189 :(
Comment 19•14 years ago
|
||
Assignee | ||
Comment 20•14 years ago
|
||
The patch regress *bug 645193*
I think we should do a backout to fix this, both on Aurora and mozilla-central.
Assignee | ||
Comment 22•14 years ago
|
||
Can someone double check whether the regression exists in Aurora before backing out there? I'm still seeing improvement rather than regression in Aurora compared to FF4, but my network connection speed is very irregular and I don't know how reliable my data are.
It should be possible to simulate a testcase with a script loop that document.write()s chunks of text and then calls document.body.offsetHeight to flush reflow periodically.
Reporter | ||
Comment 24•14 years ago
|
||
Using the link in the URL field, I can reproduce the exact regression on Aurora as well. Things *might* be different under slow connections, in which case roc's suggestion in comment 23 might help, but I think that we have enough information now to assert that this regression exists both on mozilla-central and Aurora.
Comment 25•14 years ago
|
||
Since there is no proposed fix, please backout as per #21 as soon as possible.
Reporter | ||
Comment 26•14 years ago
|
||
(In reply to comment #25)
> Since there is no proposed fix, please backout as per #21 as soon as possible.
Simon, can you please take care of this? I'm not sure which patches to back out exactly.
Assignee | ||
Comment 27•14 years ago
|
||
Assignee | ||
Comment 28•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/fa0295a97f1b
Ehsan, can you confirm that the backout fixes the regression?
Reporter | ||
Comment 29•14 years ago
|
||
(In reply to comment #28)
> http://hg.mozilla.org/mozilla-central/rev/fa0295a97f1b
>
> Ehsan, can you confirm that the backout fixes the regression?
It does, yes. Thanks!
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated•14 years ago
|
status-firefox5:
--- → fixed
You need to log in
before you can comment on or make changes to this bug.
Description
•