Closed Bug 203784 Opened 22 years ago Closed 20 years ago

Browser crash following deep recursion in [@ PR_GetCurrentThread]

Categories

(Camino Graveyard :: Page Layout, defect)

PowerPC
macOS
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: bfowler, Assigned: bryner)

References

()

Details

(Keywords: crash)

Crash Data

Attachments

(6 files)

User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4b) Gecko/20030418 Chimera/0.7+ Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4b) Gecko/20030418 Chimera/0.7+ I was browsing normally, and had namy pages open. Camino crashed with a stack trace 988 frames long. Reproducible: Didn't try Steps to Reproduce: 1. 2. 3. Expected Results: Hard to say. If the html was pathological, then Camino should stop a certain point and render what it can. It shouldn't crash Stack trace to follow. I can't say which page was involved because Camino cannot sort its history by date.
Attached file Stack trace from crash (deleted) —
Taken from gdb. Gzipped as bugzilla isn't keen on traces this long
O.K. It seems to be a page on the Powerpage site, <URL: http://www.powerpage.org/story.lasso?newsID=10056 > I got an incomplete stacktrace from Crashreporter with 508 entries, but I don't think that it is needed. I will check in P......Debug.
Summary: Browser crash following deep recursion in → Browser crash following deep recursion in
The build: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4b) Gecko/20030415 Mozilla Firebird/0.6 can download and open this page, giving the following assertion messages ###!!! ASSERTION: reflow dirty lines failed: 'NS_SUCCEEDED(rv)', file ../../../../../mozilla/layout/html/base/src/nsBlockFrame.cpp, line 949 Break: at file ../../../../../mozilla/layout/html/base/src/nsBlockFrame.cpp, line 949 nsTraceRefcnt::WalkTheStack Write me, dammit! nsBlockReflowContext: Block(body)(2)@0xa9a27e8 metrics=-559038737,-559038737! nsBlockReflowContext: Block(body)(2)@0xa9a27e8 didn't set whad -559038737,-559038737,-559038737,-559038737! ###!!! ASSERTION: reflow dirty lines failed: 'NS_SUCCEEDED(rv)', file ../../../../../mozilla/layout/html/base/src/nsBlockFrame.cpp, line 949 Break: at file ../../../../../mozilla/layout/html/base/src/nsBlockFrame.cpp, line 949 nsTraceRefcnt::WalkTheStack Write me, dammit! However, whilst left doing nothing, that program crashed, which I will report as a bug against Firebird
Severity: normal → critical
Keywords: crash
This happened on this page with the latest CVS build. I will attach an interesting part of the crash log, and try to capture the html
Attached file Excerpt from stack trace (deleted) —
I have downloaded the page using 'curl' and this html causes Camino to download something from powerpage and crash. This assertion may be significant: WARNING: NS_ENSURE_TRUE(NS_SUCCEEDED(rv)) failed, file ../../../../mozilla/content/base/src/nsImageLoadingContent.cpp, line 436
Attached file HTML doc which crashes Camino (deleted) —
Downloaded with curl, not edited. G'zipped for size. I will try to get a smaller case.
I strongly suspect that the problem is faulty HTML. Certainly, the crash went away as I fixed the validation errors.
Yes. There were something like 221 unclosed <font> tags. I will attach a copy of the HTML with these and many other errors fixed, leaving about 82.
With Camino I was able to load the attached HTML. It took awhile (there's a lot of HTML there) but it finished after about 20 seconds.
The URL given in comment 2 WorksForMe using Camino/2003-04-25-05; no crash occurs. Attachment 122007 [details] and attachment 122058 [details] look like they were encoded in some way, Ben. Were they?
Summary: Browser crash following deep recursion in → Browser crash following deep recursion in [@ PR_GetCurrentThread]
The attachments that were 'encoded in some way' were g'zipped to get under the size limit. Personally, I think that this is a true bill, and I will send in a patch to fix it in due course. Note that I seem to be still using an old version of Gecko. I am using a debug build. This could be a case of resource depletion somewhere non-obvious in my setup. Obviously, undefined behaviour against highly pathological HTML is not so important. Note that the second attachment does NOT crash Camino.
I am still getting a crash with today's build. I shall attach the log. Are people who are not getting the crash getting similar error messages?
Attachment #122007 - Attachment mime type: text/plain → application/x-gzip
Attachment #122058 - Attachment mime type: text/html → application/x-gzip
On the Console? No messages here.
The interesting bit is at the bottom
Attached file Stack trace from today's crash (deleted) —
The full trace is 980 frames deep
The C++ programming language like its ancestor, C is stack-based, but has next to no protection for the stack. If your RTE could cope with the stack growing by 988 frames, and mine cannot, would that explain our results. Could I make a suggestion that (other things being equal) a limit of some kind should be placed on how many times the functions can be called recursively; would 511 be too few?
Worksform 2033082402
I am sorry to be a pain on this, but last time I tried this, it didn't work. I will find time to look at it again sometime in the next couple of weeks, but no promises. There is a significant difference between mine and most people's set up. I am using a debug version of Camino. Unless you are, you will not see the crash. The problem is that in the case of certain ill-formed HTML/CSS, it is possible for Camino to attempt uncontrolled recursion. This will always crash, if and only if the stack overflows. I may be out of my depth here, but I would have thought that the engineered solution would be to to ensure that there was a limit on how much stack was needed to parse even the most egregrious HTML.
Using my 'recurse.html' attachment, the current CVS Camino crashes in exactly the same way as it did in my comment 17. Today there were 980 frames, mainly repetitions of this sequence of calls: #10 0x0fd40600 in nsBlockFrame::Reflow(nsIPresContext*, nsHTMLReflowMetrics&, nsHTMLReflowState const&, unsigned&) at nsBlockFrame.cpp:714 #11 0x0fd51c2c in nsBlockReflowContext::ReflowBlock(nsRect const&, int, nsCollapsingMargin&, int, nsMargin&, nsHTMLReflowState&, unsigned&) at nsBlockReflowContext.cpp:537 #12 0x0fd471ac in nsBlockFrame::ReflowBlockFrame(nsBlockReflowState&, nsLineList_iterator, int*) at nsBlockFrame.cpp:3195 #13 0x0fd45058 in nsBlockFrame::ReflowLine(nsBlockReflowState&, nsLineList_iterator, int*, int) at nsBlockFrame.cpp:2399 #14 0x0fd44474 in nsBlockFrame::ReflowDirtyLines(nsBlockReflowState&) at nsBlockFrame.cpp:2182 When I tried with Firebird,the page seemed to render, but I got this assertion message and a crash. ###!!! ASSERTION: nsView::GetViewFor null widget ptr: 'nsnull != aWidget', file ../../../src/view/src/nsView.cpp, line 202 Break: at file ../../../src/view/src/nsView.cpp, line 202 I am not sure whether the crash was related to the assertion failure, but obviously there is bound to be a potential problem trying to use a widget which is a null pointer. Note: Bug 203724 applies as this "assertion" is actually a pre-condition. Mozilla also crashes, I don't have a stack trace as I get this message on the console Sep 5 08:49:33 tit002 /usr/libexec/crashreporterd: crashdump hung! (pid 2724) instead.
Ben I will try to reproduce with a debug build.
I can reproduce with a debug build from yesterday's or today's source.
Status: UNCONFIRMED → NEW
Ever confirmed: true
HTML doc from 2003-04-29 "which crashes Camino" wfm with Camino nightly build 20040208, Mac OS 10.3.2. Someone want to test a current build with Jaguar?
WorksForMe using Camino/2004021008 on OS X 10.2.8. Does this only occur using debug builds, then?
For me, debug only. See comment 18. Of course, I cannot build Camino on Jaguar.
WFM using the 20040319 NB on both 10.2.8 and 10.3.3. If this is a debug build bug only does this bug need to stay open?
2004082512 (v0.8.1)/Mac OS X 10.2.8: Page took a few seconds but did load... horribly. Graphics were shifted out of position and there were a lot of stray characters in the story however Camino did load it and did not crash.
Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8a5) Gecko/20041025 Firefox/0.9.1+ Since it seems to be Open Season for code stability and recursion problems (see Bug 265736, for instance). I thought I would take another detailed look at this. I have actually spent all day doing this, and not come up with an engineered solution. I suspect that the problem is a variant on Bug 42138, and is caused bu numerous unclosed <font> tags. The reflow system has some documentation for study, and seems to consist of many very large functions. In our case the important thing is that nsBlockFrame::Reflow calls ReflowDirtyLines, which calls ReflowLine which calls ReflowBlockFrame which calls nsBlockReflowContext::ReflowBlock. which and this is the interesting bit, calls nsBlockFrame::Reflow. I have not been able to determine what causes this recursive cycle to terminate - perhaps only the end of the block on which Reflow was originally called. All I have been able to do is identify a variable, ReflowState, http://lxr.mozilla.org/seamonkey/source/layout/html/base/src/nsFrame.cpp#2579 which can be used to provide an early return if too much stack is being used. I will attach a patch to show what I mean but this won't be in the next few days. If I am on the the right lines then this is a flaw in the design of the Reflow system as it means that there is a fearsome limit on the size/ complexity of a page's HTML. This could be obviated by using memory allocated by new() or malloc(). I realise that I seem to have a lot of leeway to make up even to convince you that there is a problem with the code here, exploitable in the wild, but I do feel the need to open up a second front on this bug, and ask for assistance from someone who knows the Reflow system better than an ingenue such as myself to provide an acceptable fix, as I am pretty sure that mine won't be for the elemental reason that if it were, it would already be in. Alternatively, close the bug as INVALID, as I don't see anyone else bothered by it!
I thought gecko already had some recursion limit built in (for Mac OS 9)? Maybe we need to tweak the setting for Mac OS X?
(In reply to comment #30) > I thought gecko already had some recursion limit built in (for Mac OS 9)? > Maybe we need to tweak the setting for Mac OS X? I will look at this again
The current state of this bug is WORKSFORME, I suspect that it has been fixed en passant by a check in for one of the reflow bugs such as Bug 256180 "Upon too many nested tables, Mozilla begins to break out of the outer tables incorrectly" (but it is not that one!) It could be that the check in for Bug 209694 "[MARGIN-C]clear margin merged with bottom margin on empty last child" means that the code path that caused this problem is no longer exercised. Otherwise, I have looked and can't find a check in to account for the fact that the testcase https://bugzilla.mozilla.org/attachment.cgi?id=122058 no longer crashes.
(In reply to comment #32) > ... Bug 256180 "Upon too many nested tables, Mozilla begins to break out > of the outer tables incorrectly" (but it is not that one!) > > ... Bug 209694 "[MARGIN-C]clear margin merged with bottom margin on empty > last child" means that the code path that caused this problem is no longer > exercised. See also Bug 58917 "[FIX] No rendering after tag nesting/TAGVL has been exceeded"
Thanks for investigating.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → WORKSFORME
Crash Signature: [@ PR_GetCurrentThread]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: