Bug 516858: c-central + m-central MacOSX builds fail to compile after m-c changeset 32506 : 9c3a70ea7acf
Status: RESOLVED FIXED (opened 15 years ago, closed 15 years ago)
Categories: MailNews Core :: Build Config, defect
Tracking: Not tracked
Target Milestone: Thunderbird 3.0rc1
People: Reporter: sgautherie, Assigned: standard8
Attachments (4 files):
patch (deleted); gozer: review+
application/octet-stream (deleted)
text/plain (deleted)
patch (deleted); gozer: review+, standard8: approval-thunderbird3+
http://hg.mozilla.org/mozilla-central/rev/9c3a70ea7acf
{
9c3a70ea7acf
2009-09-15 15:58 -0400
Josh Aas - Breakpad PPC bustage fix. r=ted
}
*****
{
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1253051783.1253055323.22710.gz
OS X 10.5 comm-central-trunk build on 2009/09/15 14:56:23
[...]
/builds/slave/comm-central-trunk-macosx/build/mozilla/toolkit/crashreporter/google-breakpad/src/client/mac/handler/minidump_generator.cc:392: error: 'struct ppc_thread_state' has no member named '__r31'
/builds/slave/comm-central-trunk-macosx/build/mozilla/toolkit/crashreporter/google-breakpad/src/client/mac/handler/minidump_generator.cc:396: error: 'struct ppc_thread_state' has no member named '__mq'
make[8]: *** [minidump_generator.o] Error 1
}
"Same" with
http://tinderbox.mozilla.org/showlog.cgi?log=Thunderbird/1253046599.1253047478.2650.gz
MacOSX 10.5 comm-central build on 2009/09/15 13:29:59
Comment 1 (Assignee) • 15 years ago
My bet would be that we haven't ported the configure changes from http://hg.mozilla.org/mozilla-central/rev/a4e2df0a6af5
Hence comm-central is configuring to build for 10.4 while mozilla-central wants 10.5, and I expect there is quite a mismatch there.
Did I see a patch around somewhere that will add a MOZILLA_1_9_2_BRANCH define?
Comment 2 • 15 years ago
My bustage fix there unfortunately makes Breakpad not compile for PPC with the 10.4 SDK. You can switch your m-c builds to the 10.5 SDK (you can do so in the mozconfig, or include the universal mozconfig from m-c, which sets it) which will fix this bustage. We're not planning on supporting 10.4 on m-c anymore, so you might hit other bustage anyway (like when we switch to Core Text).
Comment 3 (Assignee) • 15 years ago
Well for Thunderbird's mozconfigs we include the universal one:
http://hg.mozilla.org/build/buildbot-configs/file/8612f756fead/thunderbird/macosx/mozconfig
So I'm pretty sure this is just the fact our configure is doing something to mozilla-central's.
Comment 4 • 15 years ago
Ah. Might be a mismatch with --enable-macos-target. Try adding:
ac_add_options --enable-macos-target=10.5
to the mozconfig. That's the default in m-c's configure, but your configure is probably still setting 10.4.
Comment 5 • 15 years ago
Yes, I think we should just go and match mozilla-central and build with 10.5 as a minimum for non-1.9.1 builds. We possibly can revisit 1.9.2 later if we might decide to go with it, but we might just go with 1.9.3 in any case.
Comment 6 (Assignee) • 15 years ago
(In reply to comment #5)
> Yes, I think we should just go and match mozilla-central and build with 10.5 as
> a minimum for non-1.9.1 builds. We possibly can revisit 1.9.2 later if we might
> decide to go with it, but we might just go with 1.9.3 in any case.
I was thinking that I'd see one of Serge's patches with a MOZILLA_1_9_2_BRANCH definition (although I could easily write that), hence we could easily detect our configuration based on the branch we're building with.
Comment 7 (Reporter) • 15 years ago
(In reply to comment #1)
> Did I see a patch around somewhere that will add a MOZILLA_1_9_2_BRANCH define?
Yes: bug 516195.
Depends on: 516195
Updated (Assignee) • 15 years ago
Assignee: nobody → bugzilla
Updated (Assignee) • 15 years ago
Assignee: bugzilla → nobody
Severity: blocker → critical
Component: Breakpad Integration → Build Config
Product: Toolkit → MailNews Core
QA Contact: breakpad.integration → build-config
Target Milestone: --- → Thunderbird 3.0rc1
Comment 8 (Assignee) • 15 years ago
This should fix the bustage - I'm currently rebuilding and testing the patch, however it is basically a port of the configure.in part of the patch on bug 501436.
Comment 9 • 15 years ago
Comment on attachment 401289
The fix
[Checkin: Comment 10]
Looks safe to me, so +1. But I have to say, this bit makes me cringe a little:
if test "$MOZILLA_1_9_1_BRANCH$MOZILLA_1_9_2_BRANCH" = "1"; then
as opposed to actually testing for what is being tested. if $191 or $192; then
I understand they can never both be true, but still...
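For illustration only (this is not part of the checked-in patch), the more explicit form suggested above could be sketched as follows. The function names `supports_10_4` and `pick_sdk_version` are hypothetical, and the sketch assumes each branch variable is either empty or set to `1`.

```shell
#!/bin/sh
# Hypothetical helper (not from the patch): succeeds when either branch
# variable is set to "1". Assumes each variable is empty/unset or "1".
supports_10_4() {
    test "${MOZILLA_1_9_1_BRANCH:-}" = "1" ||
        test "${MOZILLA_1_9_2_BRANCH:-}" = "1"
}

# Hypothetical helper: print the SDK version the build should target.
pick_sdk_version() {
    if supports_10_4; then
        echo "10.4"
    else
        echo "10.5"
    fi
}
```

Unlike the concatenation test, this form would still select 10.4 if both variables were ever set at once, since the concatenated string would then be `11` rather than `1`.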
Attachment #401289 - Flags: review?(gozer) → review+
Comment 10 (Assignee) • 15 years ago
Ok, time for an update. I checked in the patch and a couple of bustage fixes:
http://hg.mozilla.org/comm-central/rev/8897768028ba
http://hg.mozilla.org/comm-central/rev/c10f2b5bb6c3
http://hg.mozilla.org/comm-central/rev/96649d27f85a
The bustage fixes should fix us for bustage from bug 516213.
However we're still broken on trunk - for some reason I think the build is timing out/crashing when we're doing the ppc part of the build, possibly in the LDAP code. No idea as to why yet.
Comment 11 • 15 years ago
I've been able to reproduce on one of the nightly builders.
It's when it's running:
gcc-4.2 -arch ppc -o ufn.o -c -gdwarf-2 -Wmost -fno-common -isysroot /Developer/SDKs/MacOSX10.5.sdk -pthread -O -UDEBUG -DMOZILLA_CLIENT=1 -DNDEBUG=1 -DXP_UNIX=1 -DDARWIN=1 -DHAVE_BSD_FLOCK=1 -Dppc=1 -DHAVE_LCHOWN=1 -DHAVE_STRERROR=1 -DHAVE_GETADDRINFO=1 -DHAVE_GETNAMEINFO=1 -DFORCE_PR_LOG -D_PR_PTHREADS -UHAVE_CVAR_BUILT_ON_SEM -DUSE_WAITPID -DNEEDPROTOS -DNET_SSL -DNO_LIBLCACHE -DLDAP_REFERRALS -DNS_DOMESTIC -UMOZILLA_CLIENT -DUSE_PTHREADS -I/Volumes/Build/comm-central-trunk-macosx-nightly/build/objdir-tb/ppc/mozilla/dist/public/ldap -I/Volumes/Build/comm-central-trunk-macosx-nightly/build/directory/c-sdk/ldap/include -I/Volumes/Build/comm-central-trunk-macosx-nightly/build/objdir-tb/ppc/mozilla/dist/./public /Volumes/Build/comm-central-trunk-macosx-nightly/build/directory/c-sdk/ldap/libraries/libldap/ufn.c
It just sits there. I've been able to re-trigger by running that compilation line by itself, and gcc just sits there. No CPU, no RAM being used, and dtruss shows absolutely no activity, so it must be stuck in gcc's c-land
Comment 12 • 15 years ago
0x942ce791 in __wait4 ()
(gdb) bt
#0 0x942ce791 in __wait4 ()
#1 0x942ce787 in waitpid$UNIX2003 ()
#2 ...
So for some reason, it's stuck waiting for a process to come back, but it never will, since it doesn't exist.
Comment 13 • 15 years ago
Managed to re-run gcc with dtruss, and here is the output. Unfortunately, OS X's dtrace doesn't ship a pid provider, so the observability into gcc is pretty much null.
Comment 14 • 15 years ago
Interestingly, gcc-4.0 has no issues with the file
momo-vm-osx-leopard-05:tmp cltbld$ gcc-4.0 -arch ppc -c -o ufn.o -gdwarf-2 -Wmost -fno-common -isysroot /Developer/SDKs/MacOSX10.5.sdk -pthread -O ufn.c
Works just fine
momo-vm-osx-leopard-05:tmp cltbld$ gcc-4.2 -arch ppc -c -o ufn.o -gdwarf-2 -Wmost -fno-common -isysroot /Developer/SDKs/MacOSX10.5.sdk -pthread -O ufn.c
Gets stuck. I tried the same on my Snow Leopard MacBook Pro and saw the same thing: gcc-4.0 is okay, gcc-4.2 gets stuck.
Comment 15 • 15 years ago
Now that I can repro on my own box, I'll try and shrink ufn.c down some more.
Comment 16 • 15 years ago
(In reply to comment #14)
> gcc-4.0 okay, gcc-4.2 gets stuck.
Since the day I started to build my own Thunderbird I have only used Apple's gcc-4.2 for my builds. I don't know which version you or Mozilla are using, but I also had some build problems with older versions of Apple's gcc-4.2. The first gcc-4.2 version with no problems was gcc version 4.2.1 "Apple Inc. build 5566" (included in Xcode 3.1.2 Developer Tools). So updating the version of gcc-4.2 could be a possibility.
Comment 17 • 15 years ago
gozer: do you have XCode 3.1 installed? That's what our build slaves are using.
Comment 18 • 15 years ago
$ gcc --version
i686-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1 (Apple Inc. build 5484)
$ gcc-4.2 --version
i686-apple-darwin9-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5564)
$ grep 'Xcode version' /Developer/Applications/Xcode.app/Contents/Info.plist
<string>Xcode version 3.1</string>
Interesting, maybe we need to update to Xcode 3.1.2? https://wiki.mozilla.org/ReferencePlatforms/Mac-10.5 seems to indicate the OS X refplatform is at Xcode 3.1, just like mine. Ted?
Comment 19 • 15 years ago
It's also the case that our builds are busted in code that Firefox doesn't compile; we break in LDAP, after all.
Comment 20 • 15 years ago
Ben could comment on exactly what version of XCode is installed. It is possible that there's a compiler problem with that LDAP code, sure. If you have a spare machine, you might try updating to the absolute latest XCode to see if that works. I know Josh mentioned that XCode 3.1.3 did contain some bug fixes in gcc 4.2.
Comment 21 • 15 years ago
We've got:
bm-xserve17:~ cltbld$ gcc --version
i686-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1 (Apple Inc. build 5484)
Copyright (C) 2005 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
bm-xserve17:~ cltbld$ gcc-4.2 --version
i686-apple-darwin9-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5564)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
bm-xserve17:~ cltbld$ cat /Developer/Applications/Xcode.app/Contents/version.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>BuildVersion</key>
<string>1</string>
<key>CFBundleShortVersionString</key>
<string>3.1</string>
<key>CFBundleVersion</key>
<string>1099</string>
<key>ProjectName</key>
<string>DevToolsIDE</string>
<key>SourceVersion</key>
<string>10990000</string>
</dict>
</plist>
Comment 22 • 15 years ago
Also, I can reproduce this bustage with Xcode 3.2 on my Snow Leopard box
i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5646)
So definitely has the smell of a gcc bug
Ben, can you try and compile attachment 401461 on one of your build boxes? It's preprocessed, so it requires nothing besides gcc.
This should hang
$> gcc-4.2 -arch ppc -c -o ufn.o -gdwarf-2 -O1 ufn.c
But not this
$> gcc-4.2 -arch ppc -c -o ufn.o -gdwarf-2 -O2 ufn.c
or this
$> gcc-4.2 -arch ppc -c -o ufn.o -g -O1 ufn.c
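Since the hang is silent, a small watchdog wrapper makes it easier to try flag combinations like these without babysitting each run. The following is only a sketch: `run_with_timeout` is a hypothetical helper (not something from the build system), and returning 124 on a timeout is a convention chosen here.

```shell
#!/bin/sh
# Hypothetical helper: run a command and kill it if it exceeds a time limit.
# Usage: run_with_timeout SECONDS command [args...]
# Returns the command's own exit status, or 124 if the watchdog killed it.
run_with_timeout() {
    limit=$1
    shift
    flag=$(mktemp) || return 125   # empty marker file; written to on timeout

    "$@" &
    cmd_pid=$!

    # Watchdog: after the limit, record the timeout and kill the command.
    ( sleep "$limit"; echo timeout > "$flag"; kill -9 "$cmd_pid" ) >/dev/null 2>&1 &
    watchdog_pid=$!

    wait "$cmd_pid" 2>/dev/null
    status=$?

    # The command is done (or dead); stop the watchdog if it is still pending.
    kill "$watchdog_pid" 2>/dev/null
    wait "$watchdog_pid" 2>/dev/null

    if [ -s "$flag" ]; then
        status=124
    fi
    rm -f "$flag"
    return "$status"
}
```

With this in place, the hanging case could be run as `run_with_timeout 60 gcc-4.2 -arch ppc -c -o ufn.o -gdwarf-2 -O1 ufn.c`, where the 60-second limit is an arbitrary choice.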
Comment 23 • 15 years ago
(In reply to comment #22)
> This should hang
> $> gcc-4.2 -arch ppc -c -o ufn.o -gdwarf-2 -O1 ufn.c
It does
>
> But not this
> $> gcc-4.2 -arch ppc -c -o ufn.o -gdwarf-2 -O2 ufn.c
It doesn't
> or this
> $> gcc-4.2 -arch ppc -c -o ufn.o -g -O1 ufn.c
This one hangs
Comment 24 • 15 years ago
FYI, I've tested it today and on my Mac (Intel iMac) I can build TB 3.1a1pre with 10.5 SDK, gcc-4.2 and "-arch ppc" without any problems...
Comment 25 • 15 years ago
I suspect we could work around this issue by just having the LDAP sdk compiled with -O0, for instance.
Comment 26 • 15 years ago
I am going to try and spin a nightly with
--enable-optimize=-O2
to try and work around that issue, let's see what happens.
Comment 28 (Reporter) • 15 years ago
Any hope of at least a temporary workaround to get builds (even without ldap) again?
Comment 29 (Assignee) • 15 years ago
(In reply to comment #28)
> Any hope of at least a temporary workaround to get builds (even without ldap)
> again?
gozer is going to write up what tests he's done when he gets time. I'm then going to look at getting a fix into the LDAP code base.
All of which shouldn't really take long, but a b4 release and string freeze have just delayed it.
I'm not too concerned yet, as Windows & Linux builds are still running, as are all the test boxes, so we have reasonable coverage there.
Comment 30 • 15 years ago
I believe I narrowed it down to a single line of code, but it's certainly very strange.
in ldap_ufn_expand
if (( msgid = ldap_search( ld, dn, scope, filter, attrs,
aonly )) == -1 ) {
ldap_msgfree( tmpcand );
*err = ldap_get_lderrno( ld, ((void *)0), ((void *)0) );
return( ((void *)0) ); /* XXX */
That last return is causing the gcc hang for me. Commenting it out makes the bug disappear. Changing it is more interesting: return 1; works just fine, gcc is happy. Anything else that the optimizer can resolve to 0 seems to cause problems. Variants I've tried:
return 1; //WORKS
return 2; //WORKS
return 0; //HANGS
return i; //WORKS
return i-i; //HANGS
return 2-2; //HANGS
Absolutely not sure *why* the compiler is doing this, but definitely something tripping up the optimizer somehow.
Comment 31 (Assignee) • 15 years ago
Given the original fix for this bug was to build with 10.5 not 10.4, and the issue we have now is ldap specific, I've spun the LDAP issue off into bug 520401.
Therefore I'm closing this bug as fixed, even though the builds won't work yet.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment 32 (Reporter) • 15 years ago
(In reply to comment #10)
> http://hg.mozilla.org/comm-central/rev/c10f2b5bb6c3
> http://hg.mozilla.org/comm-central/rev/96649d27f85a
>
> The bustage fixes should fix us for bustage from bug 516213.
Mark, should SeaMonkey copy these additional/WebGL changes?
Comment 33 (Reporter) • 15 years ago
Comment on attachment 401289
The fix
[Checkin: Comment 10]
>diff --git a/configure.in b/configure.in
>@@ -383,8 +383,14 @@ if test -n "$CROSS_COMPILE" && test "$ta
>+ dnl 1.9.1 and 1.9.2 support 10.4, 1.9.3 and later don't.
>+ if test "$MOZILLA_1_9_1_BRANCH$MOZILLA_1_9_2_BRANCH" = "1"; then
> CFLAGS="-isysroot /Developer/SDKs/MacOSX10.4u.sdk $CFLAGS"
> CXXFLAGS="-isysroot /Developer/SDKs/MacOSX10.4u.sdk $CXXFLAGS"
>+ else
>+ CFLAGS="-isysroot /Developer/SDKs/MacOSX10.5u.sdk $CFLAGS"
>+ CXXFLAGS="-isysroot /Developer/SDKs/MacOSX10.5u.sdk $CXXFLAGS"
Previously, m-c and c-c used MacOSX10.4u.sdk.
Is it expected that they now use MacOSX10.5.sdk and MacOSX10.5u.sdk respectively?
Comment 34 (Assignee) • 15 years ago
(In reply to comment #32)
> Mark, should SeaMonkey copy these additional/WebGL changes?
If SeaMonkey is compiling fine without then it doesn't need it.
(In reply to comment #33)
> Previously, m-c and c-c used MacOSX10.4u.sdk.
> Is it expected that they now use MacOSX10.5.sdk and MacOSX10.5u.sdk
> respectively?
Oh yes, that's wrong. It might explain some of the problems we've been having as well. I'll attach a patch in a bit.
Comment 35 (Assignee) • 15 years ago
s/MacOSX10.5u.sdk/MacOSX10.5.sdk/ - could explain why our non-universal tinderboxes had a bit of trouble.
Attachment #405243 - Flags: review?(gozer)
Comment 36 • 15 years ago
Comment on attachment 405243
[checked in] The fix
(In reply to comment #35)
> Created an attachment (id=405243) [details]
> The fix
>
> s/MacOSX10.5u.sdk/MacOSX10.5.sdk/ - could explain why our non-universal
> tinderboxes had a bit of trouble.
Definitely, there is no such thing as MacOSX10.5u.sdk, it's MacOSX10.5.sdk. Does this also make the previous mozconfig change unnecessary?
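A sanity check along the following lines would flag a nonexistent SDK path before it reaches the compiler flags. This is only a sketch: `sysroot_flags` is a hypothetical helper, not something that exists in configure.in.

```shell
#!/bin/sh
# Hypothetical helper: print the -isysroot flag for an SDK directory,
# or fail with a message on stderr if the directory does not exist.
sysroot_flags() {
    sdk=$1
    if test -d "$sdk"; then
        echo "-isysroot $sdk"
    else
        echo "SDK not found: $sdk" >&2
        return 1
    fi
}
```

For example, `CFLAGS="$(sysroot_flags /Developer/SDKs/MacOSX10.5.sdk) $CFLAGS"` would only pick up the flag when the directory is really there, and a caller can test the return status to abort early.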
Attachment #405243 - Flags: review?(gozer) → review+
Comment 37 (Assignee) • 15 years ago
Comment on attachment 405243
[checked in] The fix
a=Standard8: minor configure fix to pick up the correct sdk version rather than one that doesn't exist.
Attachment #405243 - Flags: approval-thunderbird3+
Comment 38 (Assignee) • 15 years ago
Comment on attachment 405243
[checked in] The fix
Checked in:
http://hg.mozilla.org/comm-central/rev/2330bc790d88
I've also backed out just the unit test change where we added --with-macosx-sdk to the mozconfig:
http://hg.mozilla.org/build/buildbot-configs/rev/3ef4faf32076
and clobbered the trunk unit test boxes.
If the builds still pass then I'll do the same to the bloat boxes.
Attachment #405243 - Attachment description: "The fix" → "[checked in] The fix"
Comment 39 (Assignee) • 15 years ago
(In reply to comment #38)
> I've also backed out just the unit test change where we added --with-macosx-sdk
> to the mozconfig:
>
> http://hg.mozilla.org/build/buildbot-configs/rev/3ef4faf32076
>
> and clobbered the trunk unit test boxes.
>
> If the builds still pass then I'll do the same to the bloat boxes.
The configure.in fix wasn't enough, so I've backed out (well, put back in) the mozconfig change:
http://hg.mozilla.org/build/buildbot-configs/rev/388a3e541e8f
I've raised bug 522028 for actually figuring out what we're getting wrong that these builds need the --with-macosx-sdk option.
Updated (Reporter) • 15 years ago
Attachment #405243 - Attachment description: "[checked in] The fix" → "The fix [Checkin: Comment 38]"
Updated (Reporter) • 15 years ago
Attachment #401289 - Attachment description: "The fix" → "The fix [Checkin: Comment 10]"
Updated (Assignee) • 15 years ago
Attachment #405243 - Attachment description: "The fix [Checkin: Comment 38]" → "[checked in] The fix"
Comment 40 (Reporter) • 15 years ago
(In reply to comment #34)
> (In reply to comment #32)
> > Mark, should SeaMonkey copy these additional/WebGL changes?
>
> If SeaMonkey is compiling fine without then it doesn't need it.
Eventually, the need for a SeaMonkey port became bug 523562 :-/
Blocks: 523562
Flags: in-litmus-
Updated (Reporter) • 15 years ago
Flags: in-litmus- → in-testsuite-