Closed
Bug 203784
Opened 22 years ago
Closed 20 years ago
Browser crash following deep recursion in [@ PR_GetCurrentThread]
Categories
(Camino Graveyard :: Page Layout, defect)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: bfowler, Assigned: bryner)
References
()
Details
(Keywords: crash)
Crash Data
Attachments
(6 files)
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4b) Gecko/20030418 Chimera/0.7+
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4b) Gecko/20030418 Chimera/0.7+
I was browsing normally, and had namy pages open. Camino crashed with
a stack trace 988 frames long.
Reproducible: Didn't try
Steps to Reproduce:
1.
2.
3.
Expected Results:
Hard to say.
If the html was pathological, then Camino should stop a certain point
and render what it can.
It shouldn't crash
Stack trace to follow.
I can't say which page was involved because Camino cannot sort its history
by date.
Reporter | ||
Comment 1•22 years ago
|
||
Taken from gdb. Gzipped as bugzilla isn't keen on traces this
long
Reporter | ||
Comment 2•22 years ago
|
||
O.K. It seems to be a page on the Powerpage site,
<URL: http://www.powerpage.org/story.lasso?newsID=10056 >
I got an incomplete stacktrace from Crashreporter with
508 entries, but I don't think that it is needed.
I will check in P......Debug.
Summary: Browser crash following deep recursion in → Browser crash following deep recursion in
Reporter | ||
Comment 3•22 years ago
|
||
The build:
Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4b) Gecko/20030415
Mozilla Firebird/0.6
can download and open this page, giving the following assertion messages
###!!! ASSERTION: reflow dirty lines failed: 'NS_SUCCEEDED(rv)', file
../../../../../mozilla/layout/html/base/src/nsBlockFrame.cpp, line 949
Break: at file ../../../../../mozilla/layout/html/base/src/nsBlockFrame.cpp,
line 949
nsTraceRefcnt::WalkTheStack Write me, dammit!
nsBlockReflowContext: Block(body)(2)@0xa9a27e8 metrics=-559038737,-559038737!
nsBlockReflowContext: Block(body)(2)@0xa9a27e8 didn't set whad
-559038737,-559038737,-559038737,-559038737!
###!!! ASSERTION: reflow dirty lines failed: 'NS_SUCCEEDED(rv)', file
../../../../../mozilla/layout/html/base/src/nsBlockFrame.cpp, line 949
Break: at file ../../../../../mozilla/layout/html/base/src/nsBlockFrame.cpp,
line 949
nsTraceRefcnt::WalkTheStack Write me, dammit!
However, whilst left doing nothing, that program crashed, which I will report
as a bug against Firebird
Reporter | ||
Comment 4•22 years ago
|
||
This happened on this page with the latest CVS build.
I will attach an interesting part of the crash log, and try to capture the html
Reporter | ||
Comment 5•22 years ago
|
||
Reporter | ||
Comment 6•22 years ago
|
||
I have downloaded the page using 'curl' and this html causes Camino
to download something from powerpage and crash.
This assertion may be significant:
WARNING: NS_ENSURE_TRUE(NS_SUCCEEDED(rv)) failed, file
../../../../mozilla/content/base/src/nsImageLoadingContent.cpp, line 436
Reporter | ||
Comment 7•22 years ago
|
||
Downloaded with curl, not edited.
G'zipped for size.
I will try to get a smaller case.
Reporter | ||
Comment 8•22 years ago
|
||
I strongly suspect that the problem is faulty HTML. Certainly, the crash
went away as I fixed the validation errors.
Reporter | ||
Comment 9•22 years ago
|
||
Yes. There were something like 221 unclosed <font> tags. I will attach a copy
of the HTML with these and many other errors fixed, leaving about 82.
Reporter | ||
Comment 10•22 years ago
|
||
Comment 11•22 years ago
|
||
With Camino I was able to load the attached HTML.
It took awhile (there's a lot of HTML there) but it finished after about 20 seconds.
Comment 12•22 years ago
|
||
The URL given in comment 2 WorksForMe using Camino/2003-04-25-05; no crash occurs.
Attachment 122007 [details] and attachment 122058 [details] look like they were encoded in some way,
Ben. Were they?
Summary: Browser crash following deep recursion in → Browser crash following deep recursion in [@ PR_GetCurrentThread]
Reporter | ||
Comment 13•22 years ago
|
||
The attachments that were 'encoded in some way' were g'zipped to get
under the size limit.
Personally, I think that this is a true bill, and I will send in a patch to
fix it in due course.
Note that I seem to be still using an old version of Gecko.
I am using a debug build.
This could be a case of resource depletion somewhere non-obvious in my
setup.
Obviously, undefined behaviour against highly pathological HTML is not
so important.
Note that the second attachment does NOT crash Camino.
Reporter | ||
Comment 14•22 years ago
|
||
I am still getting a crash with today's build.
I shall attach the log. Are people who are not getting the crash
getting similar error messages?
Attachment #122007 -
Attachment mime type: text/plain → application/x-gzip
Attachment #122058 -
Attachment mime type: text/html → application/x-gzip
Comment 15•22 years ago
|
||
On the Console? No messages here.
Reporter | ||
Comment 16•22 years ago
|
||
The interesting bit is at the bottom
Reporter | ||
Comment 17•22 years ago
|
||
The full trace is 980 frames deep
Reporter | ||
Comment 18•22 years ago
|
||
The C++ programming language like its ancestor, C is stack-based, but has
next to no protection for the stack. If your RTE could cope with the stack
growing by 988 frames, and mine cannot, would that explain our results.
Could I make a suggestion that (other things being equal) a limit of some kind
should be placed on how many times the functions can be called recursively;
would 511 be too few?
Comment 19•21 years ago
|
||
Worksform 2033082402
Reporter | ||
Comment 20•21 years ago
|
||
I am sorry to be a pain on this, but last time I tried this, it didn't
work. I will find time to look at it again sometime in the next couple
of weeks, but no promises.
There is a significant difference between mine and most people's set
up. I am using a debug version of Camino. Unless you are, you will not
see the crash.
The problem is that in the case of certain ill-formed HTML/CSS, it is possible
for Camino to attempt uncontrolled recursion. This will always crash, if
and only if the stack overflows.
I may be out of my depth here, but I would have thought that the engineered
solution would be to to ensure that there was a limit on how much stack was
needed to parse even the most egregrious HTML.
Reporter | ||
Comment 21•21 years ago
|
||
Using my 'recurse.html' attachment, the current CVS Camino crashes in exactly
the same way as it did in my comment 17.
Today there were 980 frames, mainly repetitions of this sequence of
calls:
#10 0x0fd40600 in nsBlockFrame::Reflow(nsIPresContext*, nsHTMLReflowMetrics&,
nsHTMLReflowState const&, unsigned&) at nsBlockFrame.cpp:714
#11 0x0fd51c2c in nsBlockReflowContext::ReflowBlock(nsRect const&, int,
nsCollapsingMargin&, int, nsMargin&, nsHTMLReflowState&, unsigned&) at
nsBlockReflowContext.cpp:537
#12 0x0fd471ac in nsBlockFrame::ReflowBlockFrame(nsBlockReflowState&,
nsLineList_iterator, int*) at nsBlockFrame.cpp:3195
#13 0x0fd45058 in nsBlockFrame::ReflowLine(nsBlockReflowState&,
nsLineList_iterator, int*, int) at nsBlockFrame.cpp:2399
#14 0x0fd44474 in nsBlockFrame::ReflowDirtyLines(nsBlockReflowState&) at
nsBlockFrame.cpp:2182
When I tried with Firebird,the page seemed to render, but I got this
assertion message and a crash.
###!!! ASSERTION: nsView::GetViewFor null widget ptr: 'nsnull != aWidget', file
../../../src/view/src/nsView.cpp, line 202
Break: at file ../../../src/view/src/nsView.cpp, line 202
I am not sure whether the crash was related to the assertion failure, but
obviously there is bound to be a potential problem trying to use a widget
which is a null pointer.
Note: Bug 203724 applies as this "assertion" is actually a pre-condition.
Mozilla also crashes, I don't have a stack trace as I get this message on
the console
Sep 5 08:49:33 tit002 /usr/libexec/crashreporterd: crashdump hung! (pid 2724)
instead.
Comment 22•21 years ago
|
||
Ben I will try to reproduce with a debug build.
Comment 23•21 years ago
|
||
I can reproduce with a debug build from yesterday's or today's source.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 24•21 years ago
|
||
HTML doc from 2003-04-29 "which crashes Camino" wfm with Camino nightly build
20040208, Mac OS 10.3.2.
Someone want to test a current build with Jaguar?
Comment 25•21 years ago
|
||
WorksForMe using Camino/2004021008 on OS X 10.2.8. Does this only occur using
debug builds, then?
Reporter | ||
Comment 26•21 years ago
|
||
For me, debug only. See comment 18.
Of course, I cannot build Camino on Jaguar.
Comment 27•21 years ago
|
||
WFM using the 20040319 NB on both 10.2.8 and 10.3.3. If this is a debug build
bug only does this bug need to stay open?
Comment 28•20 years ago
|
||
2004082512 (v0.8.1)/Mac OS X 10.2.8:
Page took a few seconds but did load... horribly.
Graphics were shifted out of position and there were a lot of stray characters
in the story however Camino did load it and did not crash.
Reporter | ||
Comment 29•20 years ago
|
||
Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8a5) Gecko/20041025
Firefox/0.9.1+
Since it seems to be Open Season for code stability and recursion problems
(see Bug 265736, for instance). I thought I would take another detailed look
at this. I have actually spent all day doing this, and not come up with an
engineered solution.
I suspect that the problem is a variant on Bug 42138, and is caused
bu numerous unclosed <font> tags.
The reflow system has some documentation for study, and seems to consist
of many very large functions. In our case the important thing is that
nsBlockFrame::Reflow calls ReflowDirtyLines, which calls ReflowLine
which calls ReflowBlockFrame which calls nsBlockReflowContext::ReflowBlock.
which and this is the interesting bit, calls nsBlockFrame::Reflow.
I have not been able to determine what causes this recursive cycle
to terminate - perhaps only the end of the block on which Reflow
was originally called.
All I have been able to do is identify a variable, ReflowState,
http://lxr.mozilla.org/seamonkey/source/layout/html/base/src/nsFrame.cpp#2579
which can be used to provide an early return if too much stack is being
used.
I will attach a patch to show what I mean but this won't be in the next few
days.
If I am on the the right lines then this is a flaw in the design of the
Reflow system as it means that there is a fearsome limit on the size/
complexity of a page's HTML. This could be obviated by using memory
allocated by new() or malloc().
I realise that I seem to have a lot of leeway to make up even to convince
you that there is a problem with the code here, exploitable in the wild,
but I do feel the need to open up a second front on this bug, and ask
for assistance from someone who knows the Reflow system better than
an ingenue such as myself to provide an acceptable fix, as I am pretty
sure that mine won't be for the elemental reason that if it were, it
would already be in.
Alternatively, close the bug as INVALID, as I don't see anyone else
bothered by it!
Comment 30•20 years ago
|
||
I thought gecko already had some recursion limit built in (for Mac OS 9)? Maybe we need to tweak the
setting for Mac OS X?
Reporter | ||
Comment 31•20 years ago
|
||
(In reply to comment #30)
> I thought gecko already had some recursion limit built in (for Mac OS 9)?
> Maybe we need to tweak the setting for Mac OS X?
I will look at this again
Reporter | ||
Comment 32•20 years ago
|
||
The current state of this bug is WORKSFORME, I suspect that it has been
fixed en passant by a check in for one of the reflow bugs such as
Bug 256180 "Upon too many nested tables, Mozilla begins to break out of the
outer tables incorrectly" (but it is not that one!)
It could be that the check in for Bug 209694 "[MARGIN-C]clear margin merged with
bottom margin on empty last child" means that the code path that caused this
problem is no longer exercised.
Otherwise, I have looked and can't find a check in to account for the fact
that the testcase https://bugzilla.mozilla.org/attachment.cgi?id=122058 no
longer crashes.
Reporter | ||
Comment 33•20 years ago
|
||
(In reply to comment #32)
> ... Bug 256180 "Upon too many nested tables, Mozilla begins to break out
> of the outer tables incorrectly" (but it is not that one!)
>
> ... Bug 209694 "[MARGIN-C]clear margin merged with bottom margin on empty
> last child" means that the code path that caused this problem is no longer
> exercised.
See also Bug 58917 "[FIX] No rendering after tag nesting/TAGVL has been exceeded"
Comment 34•20 years ago
|
||
Thanks for investigating.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → WORKSFORME
Updated•13 years ago
|
Crash Signature: [@ PR_GetCurrentThread]
You need to log in
before you can comment on or make changes to this bug.
Description
•