Closed Bug 42138 Opened 24 years ago Closed 23 years ago

unclosed tags and nsBlockFrame::RenumberListsFor (browser seems to hang)

Categories

(Core :: Layout, defect, P2)

defect

Tracking

()

VERIFIED FIXED
mozilla0.9.2

People

(Reporter: zuperdee, Assigned: waterson)

References

()

Details

(Keywords: hang, perf, testcase)

Attachments

(6 files)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux 2.3.49 i586; en-US; m16) Gecko/20000603
BuildID:    2000042708

When I try to visit the given URL, Mozilla completely hangs up.  It does not
crash, but it becomes totally inoperative.

Reproducible: Always
Steps to Reproduce:
1. Visit the given URL
2. Wait a while
3. Notice that Mozilla hangs up.

Actual Results:  Mozilla hangs up.

Expected Results:  Mozilla should render this page, and let me browse it.
I cannot reproduce this on Linux build 2000060908 (M17).  I waited for over an
hour.  Original reporter, please try to reproduce this with a more modern
version of Mozilla, otherwise mark this bug as WORKSFORME.
More modern version??  I just built this darned thing ONE FRICKIN' MINUTE AGO.
What on earth do you consider "more recent?"
According to the BuildID in your bug report, your version of Mozilla is from
April 27th, six weeks ago.
What??  I find this hard to believe...  My build is not from April 27th.  I pull
every night directly from CVS, and build it immediately thereafter.  My build is
less than a DAY old, unless I am stupid, and doing something wrong...  What
could be going wrong here?
It's a common problem that self-made mozilla builds have this wrong build ID.
See bug 21904 (INVALID).
For me mozilla always hangs when clicking on the link titled
"Carburetor Parts & Accessories" on that page. Updating URL.
While loading this page, mozilla will freeze.
(PC/Linux, nightly 2000060908 and a 6/2 build)
I forgot to note the original URL:
http://www.holley.com/HiOctn/ProdLine/Products/FMS/FMS.html
Daniel, please correct me if you think your hang-up is something else than 
what I see on FMSCA/Access.html.

I'm going to attach a testcase that shows the same behaviour for me.
Probably it can be further simplified. It contains the following body:

<MENU>
<FONT SIZE=5><A HREF="foo.html">foo</A> <FONT SIZE=4> Foo<FONT><BR>
<FONT SIZE=5><A HREF="foo.html">foo</A> <FONT SIZE=4> Foo<FONT><BR>
... // and so on, about 465 lines total
</MENU>

Yes, there are missing </FONT> tags, but most of them are missing on the
original page, too.
If you remove the <MENU>...</MENU>, the page only needs a few seconds to
completely load & display (much less than a minute). If you replace the 
"<A...>foo</A>" with "foo", I sometimes hang when running without gdb,
but sometimes the page loads fine for me when running in gdb.
Reducing the number of lines also may make the hang go away, but there
doesn't seem to be a clear minimum for the number of lines needed to cause
a hang. It's more a stochastic thing, but the given number of lines seems
to be large enough to trigger the problem (at least almost always).

When hung, mozilla uses about 95% CPU. When I interrupt it, mozilla seems to be
very busy renumbering lists:
#0  0x40ca9a12 in nsFrame::GetStyleData ()              from libraptorhtml.so
#1  0x40ca3dc8 in nsBlockFrame::RenumberListsFor ()     from libraptorhtml.so
#2  0x40ca3d51 in nsBlockFrame::RenumberListsIn ()      from libraptorhtml.so
#3  0x40ca3ed5 in nsBlockFrame::RenumberListsFor ()     from libraptorhtml.so
#4  0x40ca3d51 in nsBlockFrame::RenumberListsIn ()      from libraptorhtml.so
#5  0x40ca3ed5 in nsBlockFrame::RenumberListsFor ()     from libraptorhtml.so
#6  0x40ca3d51 in nsBlockFrame::RenumberListsIn ()      from libraptorhtml.so
[...]
#1784 0x40ca3d51 in nsBlockFrame::RenumberListsIn ()      from libraptorhtml.so
#1785 0x40ca3ed5 in nsBlockFrame::RenumberListsFor ()     from libraptorhtml.so
#1786 0x40ca3d51 in nsBlockFrame::RenumberListsIn ()      from libraptorhtml.so
#1787 0x40ca3ed5 in nsBlockFrame::RenumberListsFor ()     from libraptorhtml.so
#1788 0x40ca3c99 in nsBlockFrame::RenumberListsInBlock () from libraptorhtml.so
#1789 0x40ca3c27 in nsBlockFrame::RenumberLists ()        from libraptorhtml.so
#1790 0x40c9d26e in nsBlockFrame::Reflow ()               from libraptorhtml.so
The stack is from a 6/2 build, but M15 and 2000060908 would also match it.
Andreas, thank you SOOO much for saving me...  Boy do I feel stupid now.  I
think the hang-up you are finding on FMSCA/Access.html is indeed the same one
I'm seeing.  I apologize for the confusion--it looks like I originally stated
the wrong URL.  I actually was meaning to post the carburators page at http:
//www.holley.com/HiOctn/ProdLine/Products/FMS/FMSC/FMSC.html

I actually use Mozilla for daily browsing now, and have deleted Netscape 4.x
completely...  I was basically just trying innocently to do some research on
finding a new carburator for a relative's Chevy S-10 Blazer, and in the process
happened to stumble upon a mysterious hang-up in Mozilla that I could report!
Daniel: You're welcome. :) 
Adding crash keyword, cc self. Sending from Browser-General to Layout for a look
because the stack (when interrupted) points to libraptorhtml.so.
Assignee: asa → clayton
Component: Browser-General → Layout
Keywords: crash
QA Contact: doronr → petersen
The stack trace points to block code.  Re-assigning to buster as part of 
clayton's bug triage.
Assignee: clayton → buster
those unclosed font tags are creating a very deep frame tree.  we don't yet 
handle this very well.
Status: NEW → ASSIGNED
OS: Linux → All
Priority: P3 → P1
Hardware: PC → All
Nominating for nsbeta2.
Keywords: nsbeta2
the html on this page is lousy.  it causes us to build an exponentially deep 
frame tree, because they never close <font> tags all within a single block.  I 
think the only "fix" for this is to detect the absurd case and bail out of 
RenumberListsFor().

Patch to be attached soon.  Recommend nsbeta2+ since it's a crash on real sites.  
We've seen other bugs like this, too, though I don't have any numbers handy.
Target Milestone: --- → M17
Bug 18480 is about missing end tags, too. But I would not say they are dupes.
I've got a fix for the immediate problem, though there will always be ways to 
bring the browser to its knees with poor html.  we might want something on the 
frame construction side, that closes a container after a certain depth.  So even 
after I check in my fix for this crash, I'll leave the bug open for that.
Keywords: perf
bug 18480 already covers what I was suggesting here, so forget what I said in my 
last comment (thanks, Andreas!)  I'll just mark this fixed once it's checked in.
I've got this one nailed in layout code.  jst should still do soemthing on the 
content sink side as part of bug 18480.
Whiteboard: [fix in hand]
Putting on [nsbeta2+][w/b minus on 6/22] radar.
Whiteboard: [fix in hand] → [nsbeta2+][w/b minus on 6/22][fix in hand]
fix checked in
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Whiteboard: [nsbeta2+][w/b minus on 6/22][fix in hand] → [nsbeta2+][w/b minus on 6/22]
This problem is fixed on Win 98 but still crashes on Mac. Tested with June 20 
builds (2000062008).
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Tested on PC/Linux, build 2000062020.
The testcase.html attachment now completes within about 10 secs.
The page on holley.com completes within less than a minute.
So I can confirm that it doesn't crash mozilla any more on Linux.
The crash on Mac may be unrelated to the originally reported problem.

There still remains a usability problem: Mozilla shouldn't be 
unresponsive (frozen) for such a long time (between 10 and 30 secs).
Waterson, can you please take a look.
Assignee: buster → waterson
Status: REOPENED → NEW
Status: NEW → ASSIGNED
Cleaning up status whiteboard by marking beta2 minus (6/22 has passed).

It sounds like this now *only* crashes on Mac (and the crash may be distinct 
from the original bug report).  It is also showing a significant performance 
problem (at least on the demo page)

Whiteboard: [nsbeta2+][w/b minus on 6/22] → [nsbeta2-]
I looked into this a bit last week. I'm pretty sure that we're blowing the stack  
on Mac; on Win32 I was seeing a *very* deeply nested stack laying out this page.
Here's my idea for a fix: Would it not be possible to make it so that the
"quirks mode" (I hope I have the right idea here) automatically assumes a
closing tag once it encounters opening tags that reach a certain depth--for example,

<FONT>Number 1<FONT>Number 2<FONT>Number 3

is assumed to be:

<FONT>Number 1</FONT><FONT>Number 2</FONT><FONT>Number 3</FONT>

But only if it reaches a certain depth...?  (I am assuming the "certain depth"
would be something like the depth possible before you blow the stack on the Mac,
for example.)  I assume this is possibly what 4.x did as well...  For this
reason, I am nominating this 4xp.

Also changing the summary to be more descriptive of the real problem, and
nominating for nsbeta2+ again, since although I do deplore the bad HTML that
caushes this situation in the first place, this does nevertheless affect some
real-world websites.
Keywords: 4xp
Summary: Mozilla hangs up indefinitely when loading → Mozilla crashes when frame tree grows very deep
Whiteboard: [nsbeta2-]
Can someone confirm that this is still crashing on Mac with a 6/23 or later
build? I ask because bug 18480 is verified fixed now.
So this is really just the same bug as 18480, but we'll call it "performance
problems with unclosed tags" because we've wallpapered over the crash.
Summary: Mozilla crashes when frame tree grows very deep → performance problems with unclosed tags
In that case, I am taking it off of the crash radar, and putting on performance
radar.  I take it this means that the only problem left is the matter of Mozilla
being unresponsive while loading such pages.
Keywords: 4xp, crash, nsbeta2
So vidur tried zuperdee's fix, and it dramatically improved the performance on
the test page. Kicking to harishd, who knows the details of how to fix. Clearing
milestone; resetting priority.
Assignee: waterson → harishd
Status: ASSIGNED → NEW
Priority: P1 → P3
Target Milestone: M17 → ---
*** Bug 43174 has been marked as a duplicate of this bug. ***
The fix to limit the nesting of tags has already been checked in (bug 18480).  
The only thing that we will do for this bug in the beta 3 timeframe is to tune 
the nesting limit.  Marking beta3+.
Status: NEW → ASSIGNED
Keywords: nsbeta3
Whiteboard: [nsbeta3+]
Setting priority to P2.
Priority: P3 → P2
Whiteboard: [nsbeta3+] → [nsbeta3+] [trivial fix]
Forgot to add a comment to the status whiteboard change earlier.  Added "trivial 
fix" to the status whiteboard to refer to the work of tuning the nesting limit.
Whiteboard: [nsbeta3+] [trivial fix] → [nsbeta3+][trivial fix]
But then we turn this into a correctness problem, right? We need to figure out
longer term how we plan to solve this...
Harish tried setting the nesting limit to 50, but did not see a big improvement 
in performance.  I agree that we need to figure out a longer term fix for this.  
Harish will talk to waterson about that today.

The longer term fix is probably not going to make the beta3 train.  Marking 
nsbeta3- and futuring.
Whiteboard: [nsbeta3+][trivial fix] → [nsbeta3-]
Target Milestone: --- → Future
*** Bug 44998 has been marked as a duplicate of this bug. ***
Upon managerial request, adding the "testcase" keyword to 84 open layout bugs that
do not have the "testcase" keyword and yet have an attachement with the word
"test" in the description field. Apologies for any mistakes.
Keywords: testcase
Blocks: 60444
*** Bug 61159 has been marked as a duplicate of this bug. ***
cc'ing eric who has the assignee for bug 61159
Keywords: nsbeta3hang, nsbeta1
Summary: performance problems with unclosed tags → unclosed tags and nsBlockFrame::RenumberListsFor (browser seems to hang)
Whiteboard: [nsbeta3-]

*** This bug has been marked as a duplicate of 56182 ***
Status: ASSIGNED → RESOLVED
Closed: 24 years ago24 years ago
Resolution: --- → DUPLICATE
Harish, are you sure this is a duplicate? This is a performance bug: unclosed
font tags causes much time to be spent in nsBlockFrame::RenumberListsFor, and
during that time the browser is unresponsive (seems to hang). Fixing the other
bug will avoid stack overflow crashes, but it is not obvious to me that this
will make this bug go away. Can you either replace the "duplicate" relation by a
dependency or add a comment why this is a duplicate?
As a matter of fact it's not an exact duplicate. Though I marked it so because 
fixing the hack ( or rather making the frames code more robust ) should fix a 
bunch of similar problems. However, marking this bug a *dependent* of bug 56182, 
would make more sense than being it's duplicate. Thanx Andreas.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Status: REOPENED → ASSIGNED
Depends on: 56182
Reassigning bug to rickg. Since he is working on similar issues.
Assignee: harishd → rickg
Status: ASSIGNED → NEW
*** Bug 77870 has been marked as a duplicate of this bug. ***
This URL also seems to duplicate the problem:

http://www.geocities.com/robbsan.geo/disclaimer.html
I #if 0'd out the body of |RenumberListsInBlock()| on the above URL, and the 
page loaded fine. So, I think that any remaining problems here would be related 
to bug 3246. I'll take this for now...
Assignee: rickg → waterson
Depends on: 3246
Allright, this actually doesn't depend on bug 3246 as much as I'd thought. The
problem here is that some old block-in-inline code is still lying around where
it shouldn't be. (buster reminded me of this a long time ago, seems I'd
forgotten.) Anyway, we'll never, _ever_ see a |display: block| inside a
|display: inline| nowadays, so there's some code in
|nsBlockFrame::RenumberListsFor()| that is no longer necessary. Attaching a patch.
Status: NEW → ASSIGNED
Target Milestone: Future → mozilla0.9.2
*** Bug 66792 has been marked as a duplicate of this bug. ***
Adding dependency of dup'd bug 66792.
Blocks: 77421
r=rbs
No longer blocks: 60444
*** Bug 60444 has been marked as a duplicate of this bug. ***
Looks fine to me. So, it is not even worth asserting that the block in an inline
case never happens? I'll certainly trust your judgement on that... sr=attinasi
FWIW, we actually don't want to assert if |aKid| is an inline frame, because
|RenumberListsFor()| is called once for each child of the block frame (and it's
fine if those are inline frames). We just don't need to recursively descend into
|aKid|'s frame hierarchy looking for list items to renumber when |aKid| is an
inline frame, because there won't be any.
Blocks: 83989
a=dbaron for trunk checkin (on behalf of drivers)
Fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago23 years ago
Resolution: --- → FIXED
Page renders fine in the June 27th branch build (2001062708).
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: