Closed
Bug 590683
Opened 14 years ago
Closed 14 years ago
connect to sync crashes Firefox in [@ pk11_RawPBEKeyGenWithKeyType] with gcc 4.5 because gcc 4.5 is broken
Categories
(Core :: js-ctypes, defect)
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
blocking2.0 | --- | betaN+ |
People
(Reporter: wolfiR, Assigned: glandium)
References
Details
(Keywords: crash)
Crash Data
Attachments
(2 files, 1 obsolete file)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
application/javascript
|
Details |
If I try to connect Sync with Firefox 4.0b4 it crashes in libnss3:
WeaveCrypto: _deriveKeyFromPassphrase() called.
Program received signal SIGSEGV, Segmentation fault.
pk11_RawPBEKeyGenWithKeyType (slot=0xa6a0a400, type=944, params=0x9e4db4d0, keyType=31, keyLen=32, pwitem=0x0, wincx=0x0) at pk11pbe.c:1305
1305 pbev2_params->pPassword = pwitem->data;
(gdb) bt
#0 pk11_RawPBEKeyGenWithKeyType (slot=0xa6a0a400, type=944, params=0x9e4db4d0, keyType=31, keyLen=32, pwitem=0x0, wincx=0x0) at pk11pbe.c:130
#1 0xb5e93922 in PK11_PBEKeyGen (slot=0xa6a0a400, algid=0x9d7a7f20, pwitem=0x0, faulty3DES=0, wincx=0x0) at pk11pbe.c:1392
#2 0xb7a7b95e in ffi_call_SYSV () from /usr/lib/xulrunner-2.0b4/libmozjs.so
I also get the following error in the error console which might be related.
Reporter | ||
Comment 1•14 years ago
|
||
Error: cannot read contents of null pointer
Source File: file:///usr/lib/firefox/components/WeaveCrypto.js
Line: 1004
Reporter | ||
Comment 2•14 years ago
|
||
Line 1004
let intData = ctypes.cast(data, ctypes.uint8_t.array(len).ptr).contents;
Reporter | ||
Updated•14 years ago
|
Keywords: crash
Summary: connect to sync crashes Firefox in libnss3 → connect to sync crashes Firefox in libnss3 [@ pk11_RawPBEKeyGenWithKeyType]
Reporter | ||
Comment 3•14 years ago
|
||
Mike, have you seen something like this by any chance? Installed system NSS is 3.12.8pre
Assignee | ||
Comment 4•14 years ago
|
||
Sync works for me on 4.0b4 with libnss 3.12.8beta2. But I only tested on x86-64, not x86.
Reporter | ||
Comment 5•14 years ago
|
||
hmm, interesting. I've tried it on a x86-64 installation now and it doesn't crash.
I need to test a bit more then. My "master" Firefox is a 64bit one.
I'll need to test another 32bit installation to see if it's the architecture or something else.
Reporter | ||
Comment 6•14 years ago
|
||
Tried it with a new VM 32-bit. And Firefox crashes when sync data is to be imported.
It really seems to me that something is not arch agnostic here.
Assignee | ||
Comment 7•14 years ago
|
||
I just tried in a 32-bits chroot, and it worked properly for me.
Reporter | ||
Comment 8•14 years ago
|
||
Hmm,
http://mxr.mozilla.org/mozilla-central/source/services/crypto/WeaveCrypto.js#1043
let passItem = this.makeSECItem(passphrase, false);
returns a null pointer here.
passphrase is set correctly.
http://mxr.mozilla.org/mozilla-central/source/services/crypto/WeaveCrypto.js#1038
return new this.nss_t.SECItem(this.nss.SIBUFFER, outputData, outputData.length);
This returns NULL while the input (actually outputData) looks sane.
Reporter | ||
Comment 9•14 years ago
|
||
I'm failing to further debug this. Not sure where to set a breakpoint.
Comment 10•14 years ago
|
||
Did you confirm if a Mozilla-produced nightly build crashes? [From IRC you said this was with your own build.]
Reporter | ||
Comment 11•14 years ago
|
||
Firefox 4.0b4 as shipped from Mozilla does not crash.
(Good for you but doesn't help me much)
Comment 12•14 years ago
|
||
What's different about your build env? CFLAGS etc?
Reporter | ||
Comment 13•14 years ago
|
||
Compiler Version Compiler flags
gcc gcc version 4.5.0 20100604 [gcc-4_5-branch revision 160292] (SUSE Linux) -Wall -W -Wno-unused -Wpointer-arith -Wcast-align -W -pedantic -Wno-long-long -fomit-frame-pointer -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables -fasynchronous-unwind-tables -g -Os -fno-strict-aliasing-pthread -pipe -DNDEBUG -DTRIMMED -Os -freorder-blocks -fomit-frame-pointer -finline-limit=50
c++ gcc version 4.5.0 20100604 [gcc-4_5-branch revision 160292] (SUSE Linux) -fno-rtti -fno-exceptions -Wall -Wpointer-arith -Woverloaded-virtual -Wsynth -Wno-ctor-dtor-privacy -Wno-non-virtual-dtor -Wcast-align -Wno-invalid-offsetof -Wno-variadic-macros -Werror=return-type -pedantic -Wno-long-long -fomit-frame-pointer -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables -fasynchronous-unwind-tables -g -Os -fno-strict-aliasing -fno-strict-aliasing -fshort-wchar -pthread -pipe -DNDEBUG -DTRIMMED -Os -freorder-blocks -fomit-frame-pointer -finline-limit=50
Configure arguments
--enable-application=xulrunner --prefix=/usr --libdir=/usr/lib --sysconfdir=/etc --mandir=/usr/share/man --includedir=/usr/include --enable-optimize --enable-extensions=default --with-system-jpeg --with-system-zlib --with-l10n-base=../l10n --disable-tests --disable-mochitest --disable-installer --disable-updater --disable-javaxpcom --enable-startup-notification --enable-url-classifier --enable-system-hunspell --enable-shared-js --with-system-nspr --with-system-nss --enable-libproxy
Assignee | ||
Comment 14•14 years ago
|
||
Could that be related to -D_FORTIFY_SOURCE=2 -fstack-protector ?
Assignee | ||
Comment 15•14 years ago
|
||
... or gcc 4.5. These are the 2 main differences I see with both Debian builds and Mozilla builds.
Reporter | ||
Comment 16•14 years ago
|
||
(In reply to comment #14)
> Could that be related to -D_FORTIFY_SOURCE=2 -fstack-protector ?
We have that since ages (which doesn't rule it out though).
(In reply to comment #15)
> ... or gcc 4.5. These are the 2 main differences I see with both Debian builds
> and Mozilla builds.
I can actually try a build on openSUSE 11.2 with an older gcc to see if it works there.
Reporter | ||
Comment 17•14 years ago
|
||
Running a build from openSUSE 11.2 (pre gcc4.5) on openSUSE 11.3 (gcc4.5) does not crash. So it's most likely compiler toolchain related.
Assignee | ||
Comment 18•14 years ago
|
||
(In reply to comment #17)
> Running a build from openSUSE 11.2 (pre gcc4.5) on openSUSE 11.3 (gcc4.5) does
> not crash. So it's most likely compiler toolchain related.
That could be the compiler breaking libffi (not unheard of). What you can try to do is build against system libffi (bug 551138) with gcc 4.5, and try with libffi itself built with gcc < 4.5 and gcc 4.5.
Reporter | ||
Updated•14 years ago
|
Summary: connect to sync crashes Firefox in libnss3 [@ pk11_RawPBEKeyGenWithKeyType] → connect to sync crashes Firefox in [@ pk11_RawPBEKeyGenWithKeyType] with gcc 4.5
Comment 19•14 years ago
|
||
Or hack the configure arguments to libffi (in js/src/configure.in) to give it CC=gcc4.3.
It'd be somewhat hilarious if libffi was broken with 4.5, given that it lives in the gcc tree. ;)
I'm thinking it's something else, but you should definitely rule this out first.
Comment 20•14 years ago
|
||
Taras says other people have seen this... libffi is broken with 4.5. (Good job guys!)
Component: Crypto → js-ctypes
Product: Weave → Core
QA Contact: crypto → js-ctypes
Summary: connect to sync crashes Firefox in [@ pk11_RawPBEKeyGenWithKeyType] with gcc 4.5 → connect to sync crashes Firefox in [@ pk11_RawPBEKeyGenWithKeyType] with gcc 4.5 because libffi is broken
Version: unspecified → Trunk
Comment 21•14 years ago
|
||
Is this fixed with gcc trunk? If so let's dupe bug 590683 to this.
Comment 22•14 years ago
|
||
Also, we should check if 4.1 branch tip works. (Small chance, but worth a shot!) If so, we can just wait for the next point release, or pull branch tip directly.
Comment 23•14 years ago
|
||
(In reply to comment #22)
> Also, we should check if 4.1 branch tip works.
You mean 4.5 branch tip, right? ;-)
Comment 24•14 years ago
|
||
Um. Yes. :(
Comment 25•14 years ago
|
||
I can spend cycles on this if it blocks, say, betaN. (I think it should, since we can't really deploy 4.5 with this!)
I don't think it'll be that hard to track down, but getting the gcc fix (unless we can work around it in libffi) might take some doing. That alone might make this a non-starter for 4. :(
blocking2.0: --- → ?
Comment 26•14 years ago
|
||
(In reply to comment #25)
> I can spend cycles on this if it blocks, say, betaN. (I think it should, since
> we can't really deploy 4.5 with this!)
>
> I don't think it'll be that hard to track down, but getting the gcc fix (unless
> we can work around it in libffi) might take some doing. That alone might make
> this a non-starter for 4. :(
Problem is that I expect linux distributions will be shipping ff with 4.5(some already are), so whether we switch to 4.5 or not, users will suffer at distribution's mercy.
Updated•14 years ago
|
blocking2.0: ? → betaN+
Updated•14 years ago
|
Assignee: nobody → dwitte
Status: NEW → ASSIGNED
blocking2.0: betaN+ → ?
Updated•14 years ago
|
blocking2.0: ? → betaN+
Comment 27•14 years ago
|
||
I can reproduce this on 4.5.1 opt linux x86_64. Not debug.
We can't get around this by simply building libffi without -O2; I still see the problem. So this is related to the code generation of the function being called by the test (sum_many_bool_cdecl).
I have assembly dumps from 4.4 opt and 4.5.1 opt, but they're rather different, so it might take a bit to grok them.
Comment 28•14 years ago
|
||
This goes away at -O0, -O1, and -O2; it only occurs for -Os. Manually splitting out -Os into all its constituent optimization flags passes, so the gcc docs are out of date or -Os is doing other magic that you can't get otherwise.
I can't reproduce using a .c file that contains the relevant function and calls it, so the bug here is in the interaction between libffi and gcc.
Assignee | ||
Comment 29•14 years ago
|
||
Building CTypes.o with -O2 -finline-functions is enough to break jsctypes. -O3 enables it. Note that depending on how you were setting your build flags, you may not have been overriding the flags set in js/src/Makefile.in, which uses -O3 and some other flags.
Forcing to build CTypes.o (only) with -O2 -fstrict-aliasing -fomit-frame-pointer makes everything work for me (xpcshell-tests and Firefox Sync).
Assignee | ||
Comment 30•14 years ago
|
||
FWIW, here is a reduced testcase (reduced from test_jsctypes.js) that exhibits the problem.
Assignee | ||
Comment 31•14 years ago
|
||
Note that I *can't* reproduce on x86-64 with gcc 4.5.1. Only x86.
Assignee | ||
Comment 32•14 years ago
|
||
I identified where in the CTypes.cpp code the problem is occuring:
in CData::Address:
// Manually set the pointer inside the object, so we skip the conversion step.
void** data = static_cast<void**>(GetData(cx, result));
*data = GetData(cx, obj);
The wierdest thing is that when adding a printf after that for the value of *data, it prints the expected value. But when the value is read in PointerType::ContentsGetter, it's NULL. It turns out at the assembly level (the one gcc 4.5 generates), it actually never assigns *data! It also appears to be a gcc bug, not a C++ backend bug.
I wrote a simple (and scary) testcase to send to the upstream gcc bug.
Summary: connect to sync crashes Firefox in [@ pk11_RawPBEKeyGenWithKeyType] with gcc 4.5 because libffi is broken → connect to sync crashes Firefox in [@ pk11_RawPBEKeyGenWithKeyType] with gcc 4.5 because gcc 4.5 is broken
Assignee | ||
Comment 33•14 years ago
|
||
(In reply to comment #32)
> I identified where in the CTypes.cpp code the problem is occuring:
> in CData::Address:
> // Manually set the pointer inside the object, so we skip the conversion
> step.
> void** data = static_cast<void**>(GetData(cx, result));
> *data = GetData(cx, obj);
The bug only triggers when GetData is inlined. So if you add the inline keyword to the GetData function definition, it starts breaking at -O1.
Assignee | ||
Updated•14 years ago
|
Comment 34•14 years ago
|
||
We could take a workaround in our tree to disable the relevant optimization, but if we do, it should be treewide -- since we have no idea what other code gcc is generating incorrectly (and the use of jsval is rather prevalent!). Per Taras, since we're deploying a custom gcc 4.5 anyway, it wouldn't be hard to add a gcc patch locally so I think that's preferable. If we don't get a fix soon we can reconsider the workaround.
Comment 35•14 years ago
|
||
Updated•14 years ago
|
Attachment #476399 -
Attachment is obsolete: true
Comment 36•14 years ago
|
||
Marking fixed since the relevant patch is in trunk and 4.5 branch gcc, and we've pulled the branch patch into our 4.5 for when we deploy.
Assignee: dwitte → mh+mozilla
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated•14 years ago
|
Crash Signature: [@ pk11_RawPBEKeyGenWithKeyType]
You need to log in
before you can comment on or make changes to this bug.
Description
•