Closed
Bug 8150
Opened 25 years ago
Closed 25 years ago
top talkback m6: was raptorhtml.dll crash; now NS_NewConverterStream sometimes fails on Win95
Categories
(Core :: Internationalization, defect, P3)
Tracking
()
VERIFIED
FIXED
M8
People
(Reporter: chofmann, Assigned: dp)
References
Details
Attachments
(1 file)
This is a spin-off from bug 7802, which describes several
crash-on-startup problems seen by many users running M6
outside Netscape. A few people inside Netscape have also
seen this crash. Refer to 7802 for a complete listing.
This is the number one problem reported
by the 700 unique users reporting crashes on M6, and it is
hindering our ability to see accurate MTBF numbers.
I'm concerned that if we ship M7 with this crash we risk
several testers giving up...
------- Additional Comments From namachi@netscape.com 06/11/99 12:17 -------
Call Stack: (Signature = nsHTMLReflowState::ComputeContainingBlockRectangle ae968a8c)
nsHTMLReflowState::ComputeContainingBlockRectangle
[d:\builds\seamonkey\mozilla\layout\html\base\src\nsHTMLReflowState.cpp, line 693]
nsHTMLReflowState::InitConstraints
[d:\builds\seamonkey\mozilla\layout\html\base\src\nsHTMLReflowState.cpp, line 769]
nsHTMLReflowState::Init
[d:\builds\seamonkey\mozilla\layout\html\base\src\nsHTMLReflowState.cpp, line 146]
nsHTMLReflowState::nsHTMLReflowState
[d:\builds\seamonkey\mozilla\layout\html\base\src\nsHTMLReflowState.cpp, line 129]
ViewportFrame::Reflow
[d:\builds\seamonkey\mozilla\layout\html\base\src\nsViewportFrame.cpp, line 433]
PresShell::InitialReflow
[d:\builds\seamonkey\mozilla\layout\html\base\src\nsPresShell.cpp, line 889]
XULDocumentImpl::StartLayout
[d:\builds\seamonkey\mozilla\rdf\content\src\nsXULDocument.cpp, line 3931]
XULDocumentImpl::EndLoad
[d:\builds\seamonkey\mozilla\rdf\content\src\nsXULDocument.cpp, line 1831]
CWellFormedDTD::DidBuildModel
[d:\builds\seamonkey\mozilla\htmlparser\src\nsWellFormedDTD.cpp, line 309]
nsParser::DidBuildModel
[d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, line 512]
nsParser::ResumeParse
[d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, line 867]
nsParser::EnableParser
[d:\builds\seamonkey\mozilla\htmlparser\src\nsParser.cpp, line 587]
CSSLoaderImpl::Cleanup
[d:\builds\seamonkey\mozilla\layout\html\style\src\nsCSSLoader.cpp, line 595]
CSSLoaderImpl::SheetComplete
[d:\builds\seamonkey\mozilla\layout\html\style\src\nsCSSLoader.cpp, line 665]
CSSLoaderImpl::ParseSheet
[d:\builds\seamonkey\mozilla\layout\html\style\src\nsCSSLoader.cpp, line 697]
CSSLoaderImpl::DidLoadStyle
[d:\builds\seamonkey\mozilla\layout\html\style\src\nsCSSLoader.cpp, line 727]
DoneLoadingStyle
[d:\builds\seamonkey\mozilla\layout\html\style\src\nsCSSLoader.cpp, line 537]
nsUnicharStreamLoader::OnStopBinding
[d:\builds\seamonkey\mozilla\network\module\nsNetStreamLoader.cpp, line 158]
nsDocumentBindInfo::OnStopBinding
[d:\builds\seamonkey\mozilla\webshell\src\nsDocLoader.cpp, line 1531]
OnStopBindingProxyEvent::HandleEvent
[d:\builds\seamonkey\mozilla\network\module\nsNetThread.cpp, line 594]
StreamListenerProxyEvent::HandlePLEvent
[d:\builds\seamonkey\mozilla\network\module\nsNetThread.cpp, line 474]
PL_HandleEvent [plevent.c, line 492]
PL_ProcessPendingEvents [plevent.c, line 453]
_md_EventReceiverProc[plevent.c, line 872]
KERNEL32.DLL + 0x3663 (0xbff73663)
KERNEL32.DLL + 0x228e0 (0xbff928e0)
0x00768c14
------- Additional Comments From chofmann@netscape.com 06/11/99 13:12 -------
So it looks like we head into this code and crash under some
kind of condition... the question is had the train already
left the tracks?
683 troy 1.46 // Called by InitConstraints() to compute the containing block rectangle for
684 // the element. Handles the special logic for absolutely positioned elements
685 void
686 nsHTMLReflowState::ComputeContainingBlockRectangle(const nsHTMLReflowState* aContainingBlockRS,
687 nscoord& aContainingBlockWidth,
688 nscoord& aContainingBlockHeight)
689 {
690 // Unless the element is absolutely positioned, the containing block is
691 // formed by the content edge of the nearest block-level ancestor
692 aContainingBlockWidth = aContainingBlockRS->computedWidth;
693 aContainingBlockHeight = aContainingBlockRS->computedHeight;
694
695 if (NS_FRAME_GET_TYPE(frameType) == NS_CSS_FRAME_TYPE_ABSOLUTE) {
696 // See if the ancestor is block-level or inline-level
697 if (NS_FRAME_GET_TYPE(aContainingBlockRS->frameType) == NS_CSS_FRAME_TYPE_INLINE) {
698 // The CSS2 spec says that if the ancestor is inline-level, the containing
699 // block depends on the 'direction' property of the ancestor. For direction
700 // 'ltr', it's the top and left of the content edges of the first box and
701 // the bottom and right content edges of the last box
702 //
703 // XXX This is a pain because it isn't top-down and it requires that we've
704 troy 1.46 // completely reflowed the ancestor. It also isn't clear what happens when
705 // a relatively positioned ancestor is split across pages. So instead use
706 // the computed width and height of the nearest block-level ancestor
707 const nsHTMLReflowState* cbrs = aContainingBlockRS;
708 while (cbrs) {
709 nsCSSFrameType type = NS_FRAME_GET_TYPE(cbrs->frameType);
710 if ((NS_CSS_FRAME_TYPE_BLOCK == type) ||
711 (NS_CSS_FRAME_TYPE_FLOATING == type) ||
712 (NS_CSS_FRAME_TYPE_ABSOLUTE == type)) {
713
Reporter | ||
Updated•25 years ago
Target Milestone: M7
If this bug is as important as the description implies, why is it Priority P3
and Severity Normal?
Reporter | ||
Updated•25 years ago
Severity: normal → blocker
I've been running this on NT tonight, and can't seem to get it to crash. The bug
cites win98 as the target OS, but I don't have a 98 machine. I've traced the
code in question, and my current guess is that one of the container frames has a
non-zero (garbage) value for its reflow state. If that's true and we're
messaging it, it could easily explode. I'm wondering if the XUL guys can confirm
that the frames that get constructed have their reflow state initialized
properly.
Reporter | ||
Comment 3•25 years ago
adding hyatt to cc list for more eyes
Comment 4•25 years ago
I've got a couple of Win98 boxes. I'll see what's up.
Comment 5•25 years ago
It's worth noting that I've built on Win98 for months, and I've never seen this
crash before. It might only happen with optimized builds?
Comment 6•25 years ago
I have been seeing an assertion thrown quite regularly on viewer startup that
happens in InitConstraints. This has been going on for a while now on Win98
only.
Comment 7•25 years ago
Yay! The assertion I've been seeing in VIEWER (note that I'm saying VIEWER and
not APPRUNNER) leads to the same crash if I keep going.
Here is the stack trace in viewer.
nsHTMLReflowState::ComputeContainingBlockRectangle(const nsHTMLReflowState * 0x00000000, int & 1, int & 10483000) line 692 + 6 bytes
nsHTMLReflowState::InitConstraints(nsIPresContext & {...}) line 769
nsHTMLReflowState::Init(nsIPresContext & {...}) line 146
nsHTMLReflowState::nsHTMLReflowState(nsIPresContext & {...}, const nsHTMLReflowState & {...}, nsIFrame * 0x01b076e0, const nsSize & {...}) line 129
ViewportFrame::Reflow(ViewportFrame * const 0x01b065e4, nsIPresContext & {...}, nsHTMLReflowMetrics & {...}, const nsHTMLReflowState & {...}, unsigned int & 0) line 433
PresShell::InitialReflow(PresShell * const 0x01ae6130, int 9120, int 4410) line 894
HTMLContentSink::StartLayout() line 2019
HTMLContentSink::OpenBody(HTMLContentSink * const 0x00da7570, const nsIParserNode & {...}) line 1772
CNavDTD::OpenBody(const nsIParserNode & {...}) line 2381 + 40 bytes
CNavDTD::OpenContainer(const nsIParserNode & {...}, int 1) line 2547 + 12 bytes
CNavDTD::HandleDefaultStartToken(CToken * 0x01aeb6c0, nsHTMLTag eHTMLTag_body, nsIParserNode & {...}) line 1094 + 14 bytes
CNavDTD::HandleStartToken(CToken * 0x01aeb6c0) line 1411 + 31 bytes
NavDispatchTokenHandler(CToken * 0x01aeb6c0, nsIDTD * 0x01ae8a60) line 249 + 12 bytes
CTokenHandler::operator()(CToken * 0x01aeb6c0, nsIDTD * 0x01ae8a60) line 80 + 14 bytes
CNavDTD::HandleToken(CNavDTD * const 0x01ae8a60, CToken * 0x01aeb6c0, nsIParser * 0x00da76d0) line 691 + 18 bytes
CNavDTD::BuildModel(CNavDTD * const 0x01ae8a60, nsIParser * 0x00da76d0, nsITokenizer * 0x01ae80a0, nsITokenObserver * 0x00000000, nsIContentSink * 0x00da7570) line 522 + 20 bytes
nsParser::BuildModel() line 902 + 34 bytes
nsParser::ResumeParse(nsIDTD * 0x00000000) line 849 + 11 bytes
nsParser::OnDataAvailable(nsParser * const 0x00da76d4, nsIURL * 0x00dbcc50, nsIInputStream * 0x00dbc260, unsigned int 5978) line 1071 + 17 bytes
nsDocumentBindInfo::OnDataAvailable(nsDocumentBindInfo * const 0x00dbce40, nsIURL * 0x00dbcc50, nsIInputStream * 0x00dbc260, unsigned int 5978) line 1504 + 24 bytes
OnDataAvailableProxyEvent::HandleEvent(OnDataAvailableProxyEvent * const 0x00dbdd00) line 634
StreamListenerProxyEvent::HandlePLEvent(PLEvent * 0x00dbdd04) line 473 + 12 bytes
PL_HandleEvent(PLEvent * 0x00dbdd04) line 491 + 10 bytes
PL_ProcessPendingEvents(PLEventQueue * 0x00d43e90) line 452 + 9 bytes
_md_EventReceiverProc(HWND__ * 0x00000f20, unsigned int 55404, unsigned int 0, long 13909648) line 877 + 9 bytes
KERNEL32! bff7363b()
KERNEL32! bff942e7()
I will follow this up with the assertion that's getting hit in InitConstraints,
since that might help rickg et al. diagnose what's going wrong.
Note that my viewer just crashes randomly on Win98 with this bug. It happens 1
out of every 5 times or so.
Comment 8•25 years ago
I hit an assertion on line 761 of nsHTMLReflowState.cpp. The containing block
is null.
NS_ASSERTION(nsnull != cbrs, "no containing block");
It's only after I keep going past this assertion that I crash. The containing
block gets dereferenced even though it's null, and then I crash.
Comment 9•25 years ago
Ok, so here's what's going down.
In the InitialReflow of a document, we have this ViewPortFrame. We start
looking at child frames. The first child frame is a ScrollFrame. When we first
initialize this child's reflow state object, that object calls its Init method.
That Init method then tries to compute the nearest enclosing containing block
(basically it's looking for a parent frame with a display type of block). It
does this search by crawling up the reflow state stack and looking at the frame
stored in each reflow state object.
It ends up looking at the ViewPortFrame. Now normally when the mDisplay
variable of the mStyleContext for that frame is examined, it has a display type
of BLOCK (represented with a numeric value of 1).
However, about 1 out of every 5 times I run viewer, the outermost ViewPortFrame
instead has a style context whose mDisplay has value of 2, indicating an INLINE
rather than a BLOCK display type.
The containing block search then basically fails, since it crawls all the way up
the reflow state stack without finding a containing block. Then you hit the
assertion warning you about the fact that no containing block was found, and if
you keep going, you crash, since the following code assumes a containing block
was found and tries to dereference it.
I don't know yet why the display type is sometimes 2 instead of 1, but that's
what's happening, folks.
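As an illustration of the search described in the comment above, here is a minimal C++ sketch. The `ReflowState` struct, the `Display` enum, and both function names are hypothetical stand-ins and not the actual Gecko declarations; only the display values 1 (BLOCK) and 2 (INLINE) come from the comment. It shows how the walk up the stack comes back empty when every entry is INLINE, and one possible way for a caller to check for that instead of dereferencing the result.

```cpp
// Hypothetical stand-ins for the types discussed above (not the Gecko ones).
enum class Display { Block = 1, Inline = 2 };

struct ReflowState {
  const ReflowState* parent;  // next entry up the reflow state stack
  Display display;            // display type of the frame for this entry
  int computedWidth;
  int computedHeight;
};

// Walk up the stack from `start` and return the first block-level entry,
// or nullptr if the walk runs off the top without finding one.
const ReflowState* FindContainingBlock(const ReflowState* start) {
  for (const ReflowState* rs = start; rs != nullptr; rs = rs->parent) {
    if (rs->display == Display::Block) {
      return rs;
    }
  }
  return nullptr;
}

// Null-checked size lookup: reports failure to the caller instead of
// dereferencing a containing block that was never found.
bool ContainingBlockSize(const ReflowState* start, int& width, int& height) {
  const ReflowState* cb = FindContainingBlock(start);
  if (cb == nullptr) {
    return false;
  }
  width = cb->computedWidth;
  height = cb->computedHeight;
  return true;
}
```

With a BLOCK viewport at the top of the stack the lookup succeeds; if the viewport is INLINE, as in the failing runs, `FindContainingBlock` returns null and `ContainingBlockSize` returns false rather than crashing.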
Comment 10•25 years ago
cc'ing Peter Linss.
Comment 11•25 years ago
Given my trace and hyatt's comments, it may be something you know how to kill.
Can you please take a look?
Comment 12•25 years ago
Of course the nasty part about this bug is that it does seem to occur only on
Win98.
Reporter | ||
Updated•25 years ago
Whiteboard: top m6 talkback crasher
Updated•25 years ago
Status: NEW → ASSIGNED
Comment 13•25 years ago
This is caused by UA.css occasionally failing to load on Win95/98.
The reason for that is that NS_NewConverterStream fails. I haven't looked too deeply
into that, but it seems that maybe the component manager fails to load the
converters. It could be related to threading/race issues in the component manager that
we've seen before.
I have a workaround ready to go that prevents the layout code from crashing when
UA.css is absent.
Updated•25 years ago
Assignee: peterl → ftang
Status: ASSIGNED → NEW
Component: Layout → Internationalization
Summary: Crash on Startup in raptorhtml.dll -> nsHTMLReflowState -> ComputeContainingBlockRectangle → NS_NewConverterStream sometimes fails on Win95
Whiteboard: top m6 talkback crasher
Comment 14•25 years ago
Work-around to layout dependency on UA.css checked in.
Now someone needs to fix the converter stream problem. Starting with intl folks.
Updated•25 years ago
Status: NEW → ASSIGNED
Comment 15•25 years ago
Is this still an M7 blocker after peterl checks in his workaround? (Does that mean
it won't crash anymore?)
Does anyone have a machine that can reproduce the problem?
Do we know which part of NS_NewConverterStream failed?
Comment 16•25 years ago
I saw the problem in our IQA lab on Japanese Win98.
It says it cannot load UA.css, error code 80040154.
Updated•25 years ago
Assignee: ftang → dp
Status: ASSIGNED → NEW
Comment 17•25 years ago
80040154 is
#define NS_ERROR_FACTORY_NOT_REGISTERED ((nsresult) 0x80040154L)
I have checked the GetUnicodeConverter() code and there is no place where it could
possibly return that error code.
From code review, I am sure the failure is not inside GetUnicodeDecoder(). The only
other place that could return this particular error code is in
xpcom/io/nsUnicharInputStream.cpp:
145 res = nsServiceManager::GetService(kCharsetConverterManagerCID,
146 kICharsetConverterManagerIID, (nsISupports**)&ccm);
inside the NS_NewB2UConverter() implementation (line 131).
[ Note: the converter seems to load OK LATER when I load some Japanese HTML pages.
That means we must be registering the CID of the converter manager correctly; otherwise
the Japanese conversion would not work later either. ]
Reassigning to dp since he owns nsServiceManager::GetService.
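To make the error code in the comment above easier to read: nsresult values follow the COM-style layout in which the top bit flags failure and the low 16 bits carry the specific code. A small sketch of that decoding follows; `Result`, `Failed`, and `Code` are hypothetical local names used for illustration, not the XPCOM macros themselves.

```cpp
#include <cstdint>

// Hypothetical local alias; the real nsresult type lives in the XPCOM headers.
using Result = std::uint32_t;

// Value quoted in the comment above for NS_ERROR_FACTORY_NOT_REGISTERED.
constexpr Result kFactoryNotRegistered = 0x80040154u;
constexpr Result kOk = 0u;

// The top (severity) bit is set for failure codes.
constexpr bool Failed(Result r) { return (r & 0x80000000u) != 0; }

// The low 16 bits carry the specific error code.
constexpr std::uint16_t Code(Result r) {
  return static_cast<std::uint16_t>(r & 0xFFFFu);
}

static_assert(Failed(kFactoryNotRegistered), "0x80040154 is a failure code");
static_assert(!Failed(kOk), "zero is success");
```

Checking the returned value this way at the GetService call site would show whether the service lookup is where 0x80040154 originates.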
Assignee | ||
Comment 18•25 years ago
Ok. I read the report.
Is it apprunner too, or only viewer? I am assuming that some people saw
the crash in the apprunner release build, and that hyatt has gotten only viewer DEBUG to crash
with this symptom.
Peter, viewer shouldn't crash. I presume your workaround is error
checking on the return value. I think your workaround should go in no matter
what.
Am I on the right track so far?
Next, someone with a debug build of apprunner/viewer on win98: could you do
this:
set NSPR_LOG_MODULES=nsComponentManager:5
set NSPR_LOG_FILE=xpcom.log
./viewer (or) apprunner
Reproduce the bug and add the xpcom.log file as an attachment to the bug.
Assignee | ||
Updated•25 years ago
Status: NEW → ASSIGNED
Assignee | ||
Comment 19•25 years ago
The reason I am concerned is that I want to know if we are dealing with the
same bug. viewer DEBUG crashing could be different from apprunner release
crashing.
Reporter | ||
Updated•25 years ago
Summary: NS_NewConverterStream sometimes fails on Win95 → top talkback m6: was raptorhtml.dll crash; now NS_NewConverterStream sometimes fails on Win95
Reporter | ||
Comment 20•25 years ago
about 400 people saw this crash in m6 apprunner.
it was the top win32 talkback crash reported
Comment 21•25 years ago
This crash has been happening since M6 with release mode apprunner (via
talkback). I've reproduced it in viewer under both Win95 and Win98 with current
debug code, but not under NT. (It happens randomly, 1 out of 5 times or so.)
The crash no longer happens with my fix (the crash was layout code not handling
the absence of UA.css), but when the load fails, we still have no UA.css,
which leads to exceptionally bad layout (for instance, everything is INLINE).
I'll attach a log file shortly.
Comment 22•25 years ago
Assignee | ||
Comment 23•25 years ago
I see the log. Thanks peterl for the super fast response.
This is the registry/xpcom multithreading thing, as you suspected, peter: bug# 7308.
From peter's log:
0[10229e0]: nsComponentManager: CreateInstance({1e3f79f1-6b6b-11d2-8a86-00600811a836})
0[10229e0]: nsComponentManager: FindFactory({1e3f79f1-6b6b-11d2-8a86-00600811a836})
0[10229e0]: not found in factory cache. Looking in registry
-429249[1052a90]: nsComponentManager: ProgIDToCLSID(application/x-unknown-content-type)->[FAILED]
0[10229e0]: FindFactory() FAILED
0[10229e0]: CreateInstance() FAILED.
Let me see what I can do about it in
Assignee | ||
Comment 24•25 years ago
I have provided a patch to peterl. Here is what the patch does:
Index: nsComponentManager.cpp
===================================================================
RCS file: /cvsroot/mozilla/xpcom/components/nsComponentManager.cpp,v
retrieving revision 1.35
diff -c -r1.35 nsComponentManager.cpp
*** nsComponentManager.cpp 1999/06/14 02:06:44 1.35
--- nsComponentManager.cpp 1999/06/18 22:31:08
***************
*** 875,881 ****
--- 875,892 ----
{
PR_LOG(nsComponentManagerLog, PR_LOG_ALWAYS,
("\t\tnot found in factory cache. Looking in registry"));
+
+ // bug# 7308 , bug# 8150
+ // Findfactory randomly fails if a ProgIDToCLSID() happenes
+ // at the same time from another thread.
+ // The registry seems to be locking properly. Until I figureout
+ // what the right problem is, I am putting this major locks on
+ // these two routines
+ // PlatformFind() and PlatformProgIDToCLSID()
+ //to achieve mutual exclusion at a course level.
+ PR_EnterMonitor(mMon);
nsresult rv = PlatformFind(aClass, &entry);
+ PR_ExitMonitor(mMon);
// If we got one, cache it in our hashtable
if (NS_SUCCEEDED(rv))
***************
*** 957,963 ****
--- 968,985 ----
else {
// This is the first time someone has asked for this
// ProgID. Go to the registry to find the CID.
+
+ // bug# 7308 , bug# 8150
+ // Findfactory randomly fails if a ProgIDToCLSID() happenes
+ // at the same time from another thread.
+ // The registry seems to be locking properly. Until I figureout
+ // what the right problem is, I am putting this major locks on
+ // these two routines
+ // PlatformFind() and PlatformProgIDToCLSID()
+ //to achieve mutual exclusion at a course level.
+ PR_EnterMonitor(mMon);
res = PlatformProgIDToCLSID(aProgID, aClass);
+ PR_ExitMonitor(mMon);
if (NS_SUCCEEDED(res)) {
// Found it. So put it into the cache.
This is a coarser locking fix than it needs to be, but a much safer one. No
outside module functions are ever called while the lock is held, hence no chance
of deadlock. No returns are missed, hence no forgetting to unlock. Since I
cannot reproduce the bug, I have to rely on the few people who can. Sorry to be
bothering you, peter. Thanks for the help.
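The coarse-grained approach in the patch above can be sketched in portable C++ as follows. This is only an illustration under stated assumptions: the `Registry` class, its method names, and the `mCursor` scratch member are hypothetical, and `std::mutex` with `std::lock_guard` stands in for the NSPR monitor calls used in the real patch. The point it shows is that both lookup routines take the same lock, so neither can run while the other is touching shared state.

```cpp
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for the component registry; not the XPCOM class.
class Registry {
 public:
  void RegisterClass(const std::string& classId, const std::string& library) {
    std::lock_guard<std::mutex> guard(mLock);
    mEntries["classes/" + classId] = library;
  }

  void RegisterProgId(const std::string& progId, const std::string& classId) {
    std::lock_guard<std::mutex> guard(mLock);
    mEntries["progids/" + progId] = classId;
  }

  // Analogue of PlatformFind(): the whole lookup runs under the lock.
  bool Find(const std::string& classId, std::string& out) {
    std::lock_guard<std::mutex> guard(mLock);
    return Lookup("classes/" + classId, out);
  }

  // Analogue of PlatformProgIDToCLSID(): shares the same lock as Find().
  bool ProgIdToClassId(const std::string& progId, std::string& out) {
    std::lock_guard<std::mutex> guard(mLock);
    return Lookup("progids/" + progId, out);
  }

 private:
  // Uses mCursor as shared scratch state, so it must only run under mLock.
  bool Lookup(const std::string& path, std::string& out) {
    mCursor = path;
    auto it = mEntries.find(mCursor);
    if (it == mEntries.end()) {
      return false;
    }
    out = it->second;
    return true;
  }

  std::mutex mLock;
  std::string mCursor;
  std::map<std::string, std::string> mEntries;
};
```

Because `std::lock_guard` releases the lock on every return path, this form also avoids the "forgetting to unlock" hazard without needing paired enter and exit calls; the cost is that unrelated lookups serialize behind one another.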
Assignee | ||
Updated•25 years ago
Severity: blocker → critical
Target Milestone: M7 → M8
Assignee | ||
Comment 25•25 years ago
Coarse-grained locks checked in to achieve mutual exclusion. This fixes the
problem but isn't the right fix.
Keeping the bug open until I check in the right fix.
Assignee | ||
Updated•25 years ago
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 26•25 years ago
Full fix checked in. Rolled back the coarse-grained fix.
Updated•25 years ago
Status: RESOLVED → VERIFIED
Comment 27•25 years ago
Marking as verified fixed.
Comment 28•25 years ago
Something is still very very wrong on Windows 98, and it happens on my machine
with viewer and with apprunner. Every so often (still about 1 out of 5 times),
there is a very long hang before viewer starts up. When it finally does start
up, everything does seem to look and run ok...
This happens on any new window creation and not necessarily just on the first
window creation.
Should I file a separate bug on this issue, or do we assume that it's related to
this problem?
Regardless, things are still very horked on Windows 98, and we need to fix it.
Comment 29•25 years ago
hyatt, see my comments dated 6/08 in bug 4901 dealing with the console slowing
down launch on win95. Is this the same problem?