Closed
Bug 102113
Opened 23 years ago
Closed 23 years ago
nsCompressedCharMap crashes during startup on 64bit Solaris.
Categories
(Core :: Layout, defect)
RESOLVED
FIXED
mozilla0.9.6
People
(Reporter: pavlov, Assigned: bstell)
Details
(Keywords: 64bit)
Attachments
(1 file, 1 obsolete file)
(deleted), patch | shanjian: review+ | brendan: superreview+
stack trace:
=>[1] nsCompressedCharMap::SetChar(this = 0xffffffff7fff4d10, aChar = 338U),
line 156 in "nsCompressedCharMap.cpp"
[2] InitGlobals(), line 810 in "nsFontMetricsGTK.cpp"
[3] nsFontMetricsGTK::Init(this = 0x10050a490, aFont = STRUCT, aLangGroup =
0x100407380, aContext = 0x100441250), line 1035 in "nsFontMetricsGTK.cpp"
[4] nsFontCache::GetMetricsFor(this = 0x100506bc0, aFont = STRUCT, aLangGroup
= 0x100407380, aMetrics = (nil)), line 568 in "nsDeviceContext.cpp"
[5] DeviceContextImpl::GetMetricsFor(this = 0x100441250, aFont = STRUCT,
aLangGroup = 0x100407380, aMetrics = (nil)), line 233 in "nsDeviceContext.cpp"
[6] ComputeLineHeight(aRenderingContext = 0x100508910, aStyleContext =
0x1004ffd18), line 2140 in "nsHTMLReflowState.cpp"
[7] nsHTMLReflowState::CalcLineHeight(aPresContext = 0x10040c290,
aRenderingContext = 0x100508910, aFrame = 0x1004ffd80), line 2182 in
"nsHTMLReflowState.cpp"
[8] nsBlockReflowState::nsBlockReflowState(this = 0xffffffff7fffa140,
aReflowState = STRUCT, aPresContext = 0x10040c290, aFrame = 0x1004ffd80,
aMetrics = STRUCT, aBlockMarginRoot = 4194304), line 156 in
"nsBlockReflowState.cpp"
[9] nsBlockFrame::Reflow(this = 0x1004ffd80, aPresContext = 0x10040c290,
aMetrics = STRUCT, aReflowState = STRUCT, aStatus = 0), line 693 in
"nsBlockFrame.cpp"
I am seeing this on a build on Solaris 8 built with Forte 6U2 with -xarch=v9
Comment 1•23 years ago
In LXR, there is nothing at line 156 in the current version of nsCompressedCharMap.cpp:
http://lxr.mozilla.org/seamonkey/source/gfx/src/nsCompressedCharMap.cpp
Which version of Mozilla are you using?
Reporter
Comment 2•23 years ago
It ends up being line 171 because of the license changes. There haven't been any
other changes to the file. I will update my tree, though, so my line numbers
will be right.
Assignee
Comment 3•23 years ago
Pav: is sheep a 64 bit system?
If not is there a system I can build/debug on?
Reporter
Comment 4•23 years ago
Yeah, sheep can be a 64-bit system.
Add /opt/64bit/bin to the beginning of PATH and /opt/64bit/lib to the beginning
of LD_LIBRARY_PATH (this is where I installed the 64-bit glib/gtk/libIDL libraries on
sheep).
Then set CC to "cc -xarch=v9", CXX to "CC -xarch=v9", and ASFLAGS to "-xarch=v9".
Run configure as you normally would, and build. When it is done, you'll have a
64-bit build. dbx/workshop work as normal.
Assignee
Comment 5•23 years ago
okay, made the indicated changes and I have started a build
Assignee
Comment 6•23 years ago
It seems to be failing to find a 64 bit thread locking routine.
rm -f libmozjs.so
CC -xarch=v9 -I/usr/openwin/include -mt -DDEBUG -DDEBUG_ -DTRACING -g -G
-Qoption ld -z,muldefs -h libmozjs.so -o libmozjs.so jsapi.o jsarena.o
jsarray.o jsatom.o jsbool.o jscntxt.o jsdate.o jsdbgapi.o jsdhash.o jsdtoa.o
jsemit.o jsexn.o jsfun.o jsgc.o jshash.o jsinterp.o jslock.o jslog2.o jslong.o
jsmath.o jsnum.o jsobj.o jsopcode.o jsparse.o jsprf.o jsregexp.o jsscan.o
jsscope.o jsscript.o jsstr.o jsutil.o jsxdrapi.o prmjtime.o lock_SunOS.o
-xildoff -lm -lposix4 -ldl -lnsl -lsocket -L../../dist/bin
-L/builds/bstell/mozilla/dist/lib -lplds4 -lplc4 -lnspr4 -lpthread -ldl
-lsocket -ldl -lm
ld: fatal: file lock_SunOS.o: wrong ELF class: ELFCLASS32
ld: fatal: File processing errors. No output written to libmozjs.so
Reporter
Comment 7•23 years ago
Is this build on top of another build or a fresh tree? Sun's cache might be
getting confused if this is on top of another build. I would recommend doing a
'gmake -f client.mk distclean' on the tree.
Assignee
Comment 8•23 years ago
I did a "gmake -f client.mk distclean" then a "./configure" before the build
Comment 9•23 years ago
Looks like lock_SunOS.s wasn't built using -xarch=v9. Can you double-check
ASFLAGS in config/autoconf.mk and make sure it was set? Also, can you check the
compile line in the log to see how lock_SunOS.o was built?
Assignee
Comment 10•23 years ago
config/autoconf.mk:
ASFLAGS = -K PIC -L -P -D_ASM -D__STDC__=0
Assignee
Comment 11•23 years ago
/usr/ccs/bin/as -o lock_SunOS.o -K PIC -L -P -D_ASM -D__STDC__=0 lock_SunOS.s
Comment 12•23 years ago
Ok, that's the problem. Re-reading your previous comment, I don't see where
CC/CXX/ASFLAGS were passed into the build. You need to either add those
settings to your mozconfig or pass them on the ./configure line.
Add:
CC="cc -xarch=v9"
CXX="CC -xarch=v9"
ASFLAGS="-xarch=v9"
to ~/.mozconfig
or
env CC="cc -xarch=v9" CXX="CC -xarch=v9" ASFLAGS="-xarch=v9" ./configure
Assignee
Comment 13•23 years ago
okay, I finally have a build.
Assignee
Comment 14•23 years ago
okay, I'll set "-g" in CFLAGS and CCFLAGS and see if I get debug symbols
Comment 15•23 years ago
bstell: mark it ASSIGNED if you agree to work on it.
Assignee
Updated•23 years ago
Status: NEW → ASSIGNED
Assignee
Comment 16•23 years ago
Here is the error:
signal BUS (invalid address alignment)
Looks like the array needs to be 64 bit aligned.
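A minimal sketch of the rule behind this bus error, assuming modern C++. The helper name `IsAligned` is hypothetical and is not part of the Mozilla tree. On a strict-alignment CPU such as SPARC V9, a load or store of an N-byte integer generally requires an address that is a multiple of N, and a check like this makes that condition explicit:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns true when `p` lies on a multiple of
// `alignment` bytes. `alignment` must be a nonzero power of two.
inline bool IsAligned(const void* p, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  // For a power of two, the low bits of the address are the remainder.
  return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```

A debug build could assert `IsAligned(ptr, sizeof(T))` before casting a 16-bit array pointer to a wider type `T`, which would turn the crash into a readable assertion failure.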
Comment 17•23 years ago
bstell, sorry I didn't catch this 64-bit impurity in review. RISCs generally
require natural alignment. The only way to ensure it is with a union around the
array of PRUint16s. That'll cost an extra "u." member name and dot operator,
but no big deal.
/be
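The union idea can be sketched as follows. This is an illustration under stated assumptions, not the patch that was checked in: `CharMapStorage`, `kMapLen`, `mAlign`, and `SetWord` are hypothetical names, `PRUint16` is a stand-in typedef for the NSPR type, and `ALU_TYPE` is assumed to be a 64-bit integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint16_t PRUint16;   // stand-in for NSPR's PRUint16
typedef std::uint64_t ALU_TYPE;   // assumption: widest type used for bulk access
const std::size_t kMapLen = 64;   // hypothetical length, for illustration only

struct CharMapStorage {
  PRUint16 mUsedLen;              // a 16-bit field ahead of the map
  union {
    PRUint16 mCCMap[kMapLen];     // the data, still addressed as 16-bit words
    ALU_TYPE mAlign;              // never read; it only raises the alignment
  } u;
};

// A union is aligned for its strictest member, so u (and u.mCCMap with it)
// starts on an ALU_TYPE boundary inside the struct.
static_assert(offsetof(CharMapStorage, u) % alignof(ALU_TYPE) == 0,
              "map must start on an ALU_TYPE boundary");

// Accessors pay for the fix with the extra "u." in every member reference.
inline void SetWord(CharMapStorage& s, std::size_t i, PRUint16 v) {
  assert(i < kMapLen);
  s.u.mCCMap[i] = v;
}
```

The compiler inserts padding after `mUsedLen` so that the union lands on the stricter boundary. The cost is a few bytes per object plus the longer member expression.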
Comment 18•23 years ago
Oh, and (of course) round up to a 0 mod 8 byte boundary when allocating from the
map -- is that going to waste too much space? We have to 0 mod 4 align for
uint32 access, already.
/be
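Rounding an offset up to a "0 mod 8" boundary is a one-line calculation. The sketch below uses a hypothetical helper name, `RoundUp`, and works in bytes; it is not taken from the CCMap allocator.

```cpp
#include <cassert>
#include <cstddef>

// Round `offset` up to the next multiple of `alignment`.
// `alignment` must be a nonzero power of two (4 for uint32, 8 for 64-bit).
inline std::size_t RoundUp(std::size_t offset, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  // Adding alignment-1 and clearing the low bits rounds upward.
  return (offset + alignment - 1) & ~(alignment - 1);
}
```

Each allocation wastes at most `alignment - 1` bytes, so moving from 4-byte to 8-byte rounding raises the worst case per allocation from 3 to 7 bytes.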
Assignee
Comment 19•23 years ago
Assignee
Comment 20•23 years ago
Attachment 52160 forces the map into 16-bit access. This stops the crash.
A complete fix would probably involve typing the memory arrays (both
stack and heap) to ALU_TYPE and doing casts for all the 16 bit accesses.
At present the 64 bit version runs, the profile manager looks okay,
but the pages are completely blank. Not even images show. I believe
this is unrelated but it prevents me from verifying this patch.
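The attachment itself is no longer available, so the following is only a sketch of what "16-bit access" means, not a reconstruction of the patch. `CopyPage16` and `kShortsPerPage` are hypothetical names; the page size of 16 shorts is the figure given elsewhere in this bug.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint16_t PRUint16;          // stand-in for NSPR's PRUint16
const std::size_t kShortsPerPage = 16;   // a page is 16 shorts (32 bytes)

// Copy one page using only 16-bit loads and stores. Both pointers need
// nothing stricter than 2-byte alignment, which a PRUint16 array guarantees.
inline void CopyPage16(PRUint16* dst, const PRUint16* src) {
  assert(dst != nullptr && src != nullptr);
  for (std::size_t i = 0; i < kShortsPerPage; ++i) {
    dst[i] = src[i];
  }
}
```

The trade-off is more memory operations per page (16 instead of 4 with a 64-bit type) in exchange for never issuing a wide access to an address that is only 2-byte aligned.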
Assignee
Updated•23 years ago
Target Milestone: --- → mozilla0.9.5
Assignee
Comment 21•23 years ago
This close to 0.9.4 branch I'd prefer to get the simplest fix in.
Assignee
Comment 22•23 years ago
local files display but remote URLs do not
Assignee
Comment 23•23 years ago
failing to display remote URLs is probably a separate bug
Assignee
Comment 24•23 years ago
When I click the off-line icon I get this error:
###!!! ASSERTION: Should have thread when shutting down.: 'Not Reached', file
nsSocketTransportService.cpp, line 733
###!!! Break: at file nsSocketTransportService.cpp, line 733
JavaScript error:
line 0: uncaught exception: [Exception... "Component returned failure code:
0x80004005 (NS_ERROR_FAILURE) [nsIIOService.offline]" nsresult: "0x80004005
(NS_ERROR_FAILURE)" location: "JS frame ::
chrome://communicator/content/utilityOverlay.js :: toggleOfflineStatus :: line
69" data: no]
Reporter
Comment 25•23 years ago
Comment on attachment 52160
patch; force all to use 16 bit access
r=pavlov
Attachment #52160 -
Flags: review+
Comment 26•23 years ago
Comment on attachment 52160
patch; force all to use 16 bit access
Good for 0.9.5, sr=brendan@mozilla.org.
Please leave this bug open so we can look into wider memory accesses for 0.9.6 trunk.
Attachment #52160 -
Flags: superreview+
Comment 27•23 years ago
Comment on attachment 52160
patch; force all to use 16 bit access
a=asa (on behalf of drivers) for checkin to 0.9.5.
Attachment #52160 -
Flags: approval+
Comment 28•23 years ago
Was this checked in to the m0.9.5 branch?
If so, please move it to m0.9.6 if you want to keep it open.
Assignee
Updated•23 years ago
Target Milestone: mozilla0.9.5 → mozilla0.9.6
Assignee
Updated•23 years ago
Target Milestone: mozilla0.9.6 → mozilla0.9.7
Assignee
Comment 29•23 years ago
from bobbell@zk3.dec.com in bug 108950:
> nsCompressedCharMap::SetChars accesses unaligned memory. With the default
> settings on Tru64 UNIX, Tru64 UNIX correctly detects this, corrects it, and
> prints a warning message. However, Tru64 UNIX can also be set to crash with
> this behavior, and it is technically incorrect.
>
> The problem was discovered using a recent nightly build. Lines 298 and 299 of
> nsCompressedCharMap.cpp are at fault. They read:
> NS_ASSERTION(page[i]==0, "this page should be unused");
> page[i] = aPage[i];
>
> page (from my crash dump) is a pointer on a four byte boundary. This is
> because it is an offset into mCCMap, which is an array of 16-bit data types.
> However, page is a pointer to ALU_TYPE, which on Tru64 UNIX is a 64-bit data
> type. Thus, page is not properly aligned.
Assignee
Comment 30•23 years ago
What I do not understand is where the misalignment comes from. (I would
appreciate anyone pointing out what I am missing or where I am mistaken.)
The pages are each 16 shorts (32 bytes), so the page-to-page distance should
maintain the same ALU boundary alignment as the start of the map.
In the section around line 299, the code that accesses the page does so
in ALU-sized groups, so it should maintain the same ALU boundary alignment as
the start of the page.
Doesn't malloc return memory that is aligned to the largest ALU size?
If not, how could the code safely alloc space for the largest ALU?
Is the base CCMap address on a 4-byte boundary?
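The stride argument above can be checked with a little modular arithmetic. In this sketch, `PageMisalignment` and `kPageBytes` are hypothetical names, and addresses are passed as plain integers so that nothing is dereferenced:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const std::size_t kPageBytes = 32;  // 16 shorts of 2 bytes each

// Byte distance of page number `pageIndex` past the previous `aluSize`
// boundary, for a map that starts at address `mapBase`.
inline std::size_t PageMisalignment(std::uint64_t mapBase,
                                    std::size_t pageIndex,
                                    std::size_t aluSize) {
  // The argument only holds when the page size is a multiple of the ALU size.
  assert(aluSize != 0 && kPageBytes % aluSize == 0);
  return static_cast<std::size_t>((mapBase + pageIndex * kPageBytes) % aluSize);
}
```

Because 32 is a multiple of 8, the result is the same for every page index: the pages inherit whatever misalignment the start of the map has. The reasoning is sound as far as it goes, and the open question is the alignment of the base address.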
Comment 31•23 years ago
I'm seeing this problem again with a tip v9 build using WS5.
(/opt/SUNWspro/WS5.0/bin/sparcv9/dbx) where
current thread: t@1
=>[1] nsCompressedCharMap::SetChar(this = 0xffffffff7fff5e80, aChar = 338U),
line 223 in "nsCompressedCharMap.cpp"
[2] InitGlobals(), line 825 in "nsFontMetricsGTK.cpp"
[3] nsFontMetricsGTK::Init(this = 0x1004e9ec0, aFont = STRUCT, aLangGroup =
0x100420340, aContext = 0x100441430), line 1050 in "nsFontMetricsGTK.cpp"
[4] nsFontCache::GetMetricsFor(this = 0x1004e5a50, aFont = STRUCT, aLangGroup
= 0x100420340, aMetrics = (nil)), line 631 in "nsDeviceContext.cpp"
[5] DeviceContextImpl::GetMetricsFor(this = 0x100441430, aFont = STRUCT,
aLangGroup = 0x100420340, aMetrics = (nil)), line 266 in "nsDeviceContext.cpp"
dbx: warning: can't find file
"/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsHTMLReflowState.o"
dbx: warning: see `help pathmap'
[6] ComputeLineHeight(0x1004e9340, 0x1004e0360, 0xffffffff7fffb104,
0xffffffff74d9ef8c, 0x0, 0xffffffff74d7a808), at 0xffffffff74e1521c
[7] nsHTMLReflowState::CalcLineHeight(0x1004417b0, 0x1004e9340, 0x1004e03c0,
0x0, 0x0, 0x0), at 0xffffffff74e1550c
dbx: warning: can't find file
"/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsBlockReflowState.o"
[8] nsBlockReflowState::nsBlockReflowState(0xffffffff7fffb038,
0xffffffff7fffb4b8, 0x1004417b0, 0x1004e03c0, 0xffffffff7fffb5c8, 0x400000), at
0xffffffff74d9ef8c
dbx: warning: can't find file
"/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsBlockFrame.o"
[9] nsBlockFrame::Reflow(0xffffffff7fffb5c8, 0x1004417b0, 0xffffffff7fffb5c8,
0xffffffff7fffb4b8, 0xffffffff7fffbc04, 0xffffffff755926c0), at 0xffffffff74d7ebb4
dbx: warning: can't find file
"/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsContainerFrame.o"
[10] nsContainerFrame::ReflowChild(0x100489e90, 0x1004e03c0, 0x1004417b0,
0xffffffff7fffb5c8, 0xffffffff7fffb4b8, 0x0), at 0xffffffff74db3324
dbx: warning: can't find file
"/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsHTMLFrame.o"
[11] CanvasFrame::Reflow(0x0, 0x0, 0xffffffff7fffbc04, 0xffffffff7fffb7f8,
0xffffffff7fffbc04, 0x2), at 0xffffffff74e055bc
dbx: warning: can't find file
"/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsBoxToBlockAdaptor.o"
[12] nsBoxToBlockAdaptor::Reflow(0x1004e02d0, 0xffffffff7fffc920, 0x1004417b0,
0xffffffff7fffbbc0, 0xffffffff7fffcc98, 0xffffffff7fffbc04), at 0xffffffff75164ebc
[13] nsBoxToBlockAdaptor::DoLayout(0x0, 0x0, 0x76c, 0x76c, 0x1,
0xffffffff75164588), at 0xffffffff7516435c
dbx: warning: can't find file
"/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsBox.o"
[14] nsBox::Layout(0x1004e02d0, 0xffffffff7fffc920, 0xffffffff7fffbff8, 0x0,
0x0, 0x2), at 0xffffffff751553fc
dbx: warning: can't find file
"/space/home/cls/src/moz/main/obj-opt-ws-64-g/layout/build/nsScrollBoxFrame.o"
Comment 32•23 years ago
*** Bug 108950 has been marked as a duplicate of this bug. ***
Comment 33•23 years ago
In response to Brian Stell's comment #30:
> Doesn't malloc return memory that is aligned to the largest ALU size?
> If not how could the code safely alloc space for the largest ALU?
> Is the base CCMap address on a 4 byte boundary?
malloc() does indeed return memory that is aligned so that it can be used by any
data type (which here would make it 64-bit aligned). However, here the memory
is not being explicitly malloc'ed. The definition of the class
nsCompressedCharMap includes:
protected:
PRUint16 mUsedLen; // in PRUint16
PRUint16 mAllOnesPage;
PRUint16 mCCMap[CCMAP_MAX_LEN];
Thus, mCCMap is only guaranteed to be aligned for PRUint16 access.
I believe what the Compaq cxx compiler is doing internally is aligning mUsedLen
on a 64-bit boundary (either intentionally or by chance), which puts mCCMap only
four bytes (two PRUint16's) later, which is not on a 64-bit boundary.
From some debug printfs I added:
mCCMap @ 0x11fff30f4
page_offset == 0x40
page == 0x11fff3174
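The reported numbers can be reproduced with a short sketch. Assumptions: `PlainLayout`, `kMaxLen`, `PageAddress`, and `CheckReportedNumbers` are hypothetical names, the real value of CCMAP_MAX_LEN is not needed here, and page_offset is taken to count PRUint16 elements, which is the reading under which the printed addresses agree.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint16_t PRUint16;   // stand-in for NSPR's PRUint16
const std::size_t kMaxLen = 64;   // hypothetical; stands in for CCMAP_MAX_LEN

// Same member order as the class excerpt, without the rest of the class.
struct PlainLayout {
  PRUint16 mUsedLen;
  PRUint16 mAllOnesPage;
  PRUint16 mCCMap[kMaxLen];
};

// Address of a page, assuming pageOffset counts PRUint16 elements.
inline std::uint64_t PageAddress(std::uint64_t mapAddr, std::uint64_t pageOffset) {
  return mapAddr + pageOffset * sizeof(PRUint16);
}

// Checks the debug output quoted in the comment against the layout.
inline void CheckReportedNumbers() {
  // Two 16-bit members precede the map, so it starts 4 bytes into the object.
  assert(offsetof(PlainLayout, mCCMap) == 2 * sizeof(PRUint16));
  // 0x11fff30f4 + 0x40 * 2 reproduces the printed page address.
  assert(PageAddress(0x11fff30f4ULL, 0x40) == 0x11fff3174ULL);
  // The map and the page both sit 4 bytes past an 8-byte boundary.
  assert(0x11fff30f4ULL % 8 == 4);
  assert(PageAddress(0x11fff30f4ULL, 0x40) % 8 == 4);
}
```

This matches the explanation: the object itself begins at 0x11fff30f0, which is 8-byte aligned, and the 4-byte header pushes every page onto an address that is only 4-byte aligned.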
Assignee
Comment 34•23 years ago
thanks for the insight
this I can fix
Assignee
Comment 35•23 years ago
bobbell: could you try this patch?
thanks
Attachment #52160 -
Attachment is obsolete: true
Assignee
Comment 36•23 years ago
the patch was made in the gfx directory
Assignee
Updated•23 years ago
Target Milestone: mozilla0.9.7 → mozilla0.9.6
Comment 37•23 years ago
Comment on attachment 57165
patch; use a union to make the C++ object align the map on the largest ALU
Why not give that ALU_TYPE dummy; member the canonical (and less insulting :-)
name, namely 'align'?
sr=brendan@mozilla.org in any event.
/be
Attachment #57165 -
Flags: superreview+
Comment 38•23 years ago
bstell, can you get r= and then mail drivers@mozilla.org for a= to check in for
0.9.6? Thanks,
/be
Assignee
Comment 39•23 years ago
okay, after only 4 hours I have a 64 bit build on sheep and it crashes without
the patch and runs with that patch. (This is so weird: both my linux systems
are still horked from the network upgrade :( so I'm using my Win98 system
to display the Solaris client.)
Comment 40•23 years ago
Comment on attachment 57165
patch; use a union to make the C++ object align the map on the largest ALU
I don't see any problem with the patch. r=shanjian
Attachment #57165 -
Flags: review+
Assignee
Comment 42•23 years ago
checked into 0.9.6 branch
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED