Closed
Bug 448658
Opened 16 years ago
Closed 15 years ago
nsAutoTArray aligns buffer to 32 bits, causing SIGBUS when array entry type requires 64-bit alignment
Categories
(Core :: XPCOM, defect)
Tracking
()
RESOLVED
FIXED
mozilla1.9.3a4
People
(Reporter: armin76, Assigned: glandium)
References
(Blocks 1 open bug)
Details
(Keywords: crash)
Attachments
(3 files, 5 obsolete files)
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
patch
|
Details | Diff | Splinter Review |
See debian bug:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=483949
I'll attach the backtrace using Gentoo and Ubuntu. Using that patch from Debian, it just delays the sigbus 45mins the first time, so it needs a proper fix.
Reporter | ||
Comment 1•16 years ago
|
||
Assignee | ||
Comment 2•16 years ago
|
||
It would be useful to have the backtrace with the patch applied. I doubt it to be the same.
Reporter | ||
Comment 3•16 years ago
|
||
There we go
Assignee | ||
Comment 4•16 years ago
|
||
Confirmed, it is a different one
Comment 5•16 years ago
|
||
See bug 469276, it discusses probably the same issue.
Reporter | ||
Comment 7•16 years ago
|
||
On trunk, it now fails due to unaligned access on sqlite, which have been handled upstream. See http://www.sqlite.org/cvstrac/tktview?tn=3777
Adding 3.6.13 bug to depend, one small step for sparc...
Depends on: 487664
Reporter | ||
Comment 8•16 years ago
|
||
Patch by Fridrich Oslage <bluebird at gentoo dot org>
Attachment #372899 -
Flags: review?(timeless)
Updated•16 years ago
|
Attachment #372899 -
Flags: review?(timeless) → review?(tony)
Updated•16 years ago
|
Attachment #372899 -
Flags: review?(dcamp)
Comment 9•16 years ago
|
||
Comment on attachment 372899 [details] [diff] [review]
align.patch
Seems fine to me.
Attachment #372899 -
Flags: review?(tony) → review+
Comment 10•16 years ago
|
||
Comment on attachment 372899 [details] [diff] [review]
align.patch
Actually, does this still compiled on MSVC?
Updated•16 years ago
|
Attachment #372899 -
Flags: review?(dcamp)
Updated•16 years ago
|
Assignee: nobody → armin76
Updated•16 years ago
|
Keywords: checkin-needed
Reporter | ||
Comment 11•16 years ago
|
||
(In reply to comment #10)
> (From update of attachment 372899 [details] [diff] [review])
> Actually, does this still compiled on MSVC?
I don't have a system to test that.
Comment 12•16 years ago
|
||
It doesn't...
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1240065334.1240069311.31725.gz
Keywords: checkin-needed
Reporter | ||
Comment 13•16 years ago
|
||
Well, i've been told that that is only a gcc function, so either it has an ifdef or you guys come up with a better solution.
Reporter | ||
Comment 14•16 years ago
|
||
Using an ifdef GNUC
Attachment #372899 -
Attachment is obsolete: true
Attachment #374897 -
Flags: review?(tony)
Assignee | ||
Comment 15•16 years ago
|
||
(In reply to comment #14)
> Created an attachment (id=374897) [details]
> alignv2.patch
>
> Using an ifdef GNUC
If you're depending on gcc specific stuff, why not use __attribute__((aligned(some_value)), instead?
Comment 16•16 years ago
|
||
I have two (potentially stupid) questions: did anyone investigate
(a) why mID is misaligned in the first place (misuse of packed somewhere or compiler bug?)
(b) why does forcing this alignement make the patch suggested to solve #469276 not necessary any more?
IMHO (but of course I'm biased) that patch was better, and sufficient to solve the problem for me, at least. But then I do not understand the bigger picture here...
Reporter | ||
Comment 17•16 years ago
|
||
(In reply to comment #15)
> If you're depending on gcc specific stuff, why not use
> __attribute__((aligned(some_value)), instead?
I'm sorry but i have no clue about C, i just did what i saw on other part of the code. Feel free to improve the patch, i can test it if needed.
(In reply to comment #16)
> I have two (potentially stupid) questions: did anyone investigate
> (a) why mID is misaligned in the first place (misuse of packed somewhere or
> compiler bug?)
Same answer as avobe. The guy who did the patch said he does not know Mozilla's code enough to find where its being unaligned.
> (b) why does forcing this alignement make the patch suggested to solve #469276
> not necessary any more?
Your patch, or some other almost the same if not the same, was already checked-in on trunk on other bug.
See http://hg.mozilla.org/mozilla-central/rev/4151690668c7 and bug 477727
Comment 18•16 years ago
|
||
Comment on attachment 374897 [details] [diff] [review]
alignv2.patch
I agree with Martin that we should find out why mId is getting mis-aligned before patching.
Attachment #374897 -
Flags: review?(tony) → review-
Reporter | ||
Comment 19•16 years ago
|
||
So who's going to do it?
Reporter | ||
Comment 20•15 years ago
|
||
Hi everyone,
the problem is still present on all the 1.9.1.x releases. I've found that Debian doesn't have that issue anymore, so i looked at why.
On gentoo we first did this to compile firefox:
export BUILD_OFFICIAL=1
export MOZILLA_OFFICIAL=1
Debian doesn't do that, so i removed that. Then i compared our version and debian's, and found that the thing that made firefox sigbus or not sigbus(apart from removing thise two variables) is the string on application.ini of firefox.
I have this on application.ini:
[App]
Vendor=Mozilla
Name=Firefox
Version=3.5.5
BuildID=20091109192036
Replacing Firefox with Firefoxa (or whatever not being Firefox, f.ex debian has Iceweasel), it doesn't sigbus anymore.
Have no clue why.
Assignee | ||
Comment 21•15 years ago
|
||
(In reply to comment #16)
> I have two (potentially stupid) questions: did anyone investigate
> (a) why mID is misaligned in the first place (misuse of packed somewhere or
> compiler bug?)
I got some time (and machine) to investigate this issue, and here is what I found:
mID is misaligned because of 2 nsAutoTArray<nsUrlClassifierEntry, 5> declarations, which allocate these nsAutoTArray on stack.
It happens that from the compiler perspective the nsAutoTArrays look like this struct:
struct array {
Header *mHdr; // its value is &mAutoBuf
char mAutoBuf[sizeof(Header) + 5 * sizeof(nsUrlClassifierEntry)];
}
The compiler can't thus know the alignment requirements, and aligns for the Header* pointer and nothing better.
Now, as I didn't want to try to find the combination of unions and modifications to GetAutoArrayBuffer and possibly deeper changes to nsTArray to have a proper alignment, I figured replacing these nsAutoTArray with simple nsTArray works. Patch coming.
Assignee | ||
Comment 22•15 years ago
|
||
See comment 21.
Assignee: armin76 → mh+mozilla
Attachment #374897 -
Attachment is obsolete: true
Attachment #425467 -
Flags: review?(benjamin)
Comment 23•15 years ago
|
||
Comment on attachment 425467 [details] [diff] [review]
Patch
Working around the broken class is certainly expedient, but will probably cause problems reappearing frequently. dbaron, how should this actually be solved?
Attachment #425467 -
Flags: review?(benjamin) → review?(dbaron)
Comment 24•15 years ago
|
||
I presume the underlying requirement in this case is that the entries require 8-byte alignment. However, in theory, entries could require more than that. (I think there may be such things if you're using SSE2 instructions or similar; I recall issues with stack frame alignment on the Mac at one point when we weren't 16-byte aligning stack frames in our xptcall assembly.)
Right now, nsTArray gives 8-byte alignment and nsAutoTArray gives 4-byte alignment. They should really determine the alignment from the type being allocated.
Fixing nsTArray (which isn't the current problem, but could be a problem in theory) is relatively straightforward, since nsTArray is using the allocator to allocate Header + elements, and you can assume that malloc() returns memory aligned to the most strict alignment requirement on the system. To do this, we could make the way the array blocks (Header + elements) are allocated allocate sizeof(S) + (N - 1) * sizeof(elem_type) given struct S { elem_type e; Header h; }, and start the first element at the array at sizeof(S) - sizeof(elem_type) rather than at sizeof(Header). (sizeof() on a struct is guaranteed to round up to that struct's alignment requirements so that you can allocate the structs in an array.)
It's also relatively straightforward to make nsAutoTArray in a similar manner (especially if we're willing to take the hit of extra padding in two different places). However, that would only fix the use of nsAutoTArray on the heap; I don't think it would force the compiler to correctly align an nsAutoTArray on the stack (though I could be wrong). Not sure who else might have ideas about how to fix this, but cc:ing a few candidates.
Summary: nsUrlClassifierDBService has bad alignment, causes SIGBUS → nsUrlClassifierDBService has bad alignment due to nsAutoTArray<nsUrlClassifierEntry, N>, causes SIGBUS
Comment 25•15 years ago
|
||
(In reply to comment #24)
> Right now, nsTArray gives 8-byte alignment and nsAutoTArray gives 4-byte
> alignment.
Er, I should have said that nsAutoTArray gives alignment
min(max(sizeof(void*), alignment requirement of PRUint32), 8 bytes),
which is 4 bytes on a system with 32-bit pointers.
Comment 26•15 years ago
|
||
Two members of a union start at the same address, so one way to get nsAutoTArray::mAutoBuf to be aligned for elem_type would be:
union {
char mAutoBuf[...];
elem_type mAlignmentDummy;
};
Unfortunately, this only works if elem_type is a POD. I can't think of any way to build a POD with the same alignment as elem_type, so a more brute-force strategy would be to pick a POD type that has the greatest alignment on the platform (lets say it was long double) and do:
struct FatPOD { long double _; };
union {
char mAutoBuf[...];
FatPOD mAlignmentDummy;
};
Perhaps that would work?
Assignee | ||
Comment 27•15 years ago
|
||
(In reply to comment #26)
> struct FatPOD { long double _; };
> union {
> char mAutoBuf[...];
> FatPOD mAlignmentDummy;
> };
>
> Perhaps that would work?
I would have gone this way if that didn't mean modifying GetAutoArrayBuffer.
Assignee | ||
Comment 28•15 years ago
|
||
(In reply to comment #27)
> I would have gone this way if that didn't mean modifying GetAutoArrayBuffer.
and Elements() and other similar functions.
Comment 29•15 years ago
|
||
For nsTArray itself, how about we change it so that the contents of the current nsTArray_base::Header are embedded directly into the nsTArray_base object, and then we have a pointer to the elements? Like this
class nsTArray_base {
// ...
protected:
PRUint32 mLength;
PRUint32 mCapacity : 31;
PRUint32 mIsAutoArray : 1;
};
template <typename T>
class nsTArray : public nsTArray_base {
// ...
protected:
elem_type* mElements;
};
That way the compiler and runtime will DTRT.
For nsAutoTArray, I endorse Luke's suggestion, if we're willing to live with always having enough padding in there for the maximum required alignment, which should be fine (it's just another few bytes of wasted space on the stack...) If we really, really need to squeeze out that space, well, have a look at boost's aligned_storage.hpp and imagine what it would take to import all the relevant compiler-specific hacks. (Many of them have to do with MSVC, unfortunately.)
Assignee | ||
Comment 30•15 years ago
|
||
There is the same problem in extensions/cookie/nsPermissionManager.h: nsHostEntry has a mPermissions member defined as nsAutoTArray<nsPermissionEntry, 1>, and nsPermissionEntry has a 64-bits member.
As for the fix, I do agree fixing nsTArray would be better, but I'd need directions, especially regarding GetAutoArrayBuffer. If we are to let the compiler decide the correct alignment, the assumptions in GetAutoArrayBuffer and other places are wrong, and especially in the case of GetAutoArrayBuffer, it is impossible to keep the function the way it is. The best I can think of for it would be to check if the buffer is at most a word size after the header.
Assignee | ||
Comment 31•15 years ago
|
||
(In reply to comment #29)
> For nsTArray itself, how about we change it so that the contents of the current
> nsTArray_base::Header are embedded directly into the nsTArray_base object, and
> then we have a pointer to the elements?
I don't think this is possible, this could break binary compatibility with external components.
Assignee | ||
Comment 32•15 years ago
|
||
After further consideration, it looks like it should be safe to break binary compatibility, as the nsTArray (and nsTPtrArray that derives from it) is in the xpcom glue. I can be wrong, but I don't think a component would share a pointer to a nsTArray with another component.
Other that that, I care more about fixing the current problem being the 8-bytes alignment requirements than fixing all possible alignment problems that hypothetically could happen. I really doubt there is currently any use of nsTArray that would need more than 8-bytes alignment.
Comment 33•15 years ago
|
||
I think we could fix nsAutoTArray by having a little extra buffer size for the cases where we *might* need it depending on the alignment of the stack allocation, and then (in those cases) dynamically changing the pointer to the beginning of the internal stack buffer.
Assignee | ||
Comment 34•15 years ago
|
||
(In reply to comment #29)
> If we really, really need to squeeze out that space, well, have a look at
> boost's aligned_storage.hpp and imagine what it would take to import all the
> relevant compiler-specific hacks. (Many of them have to do with MSVC,
> unfortunately.)
Note there is apparently a TR1 standard implementation of aligned_storage, that seems to be available at least under MS Visual Studio and gcc: std::tr1::aligned_storage. I don't know if that could be used in the mozilla code base.
(In reply to comment #33)
> I think we could fix nsAutoTArray by having a little extra buffer size for the
> cases where we *might* need it depending on the alignment of the stack
> allocation, and then (in those cases) dynamically changing the pointer to the
> beginning of the internal stack buffer.
Since that pretty much needs a change in GetAutoArrayBuffer anyways, I think there is actually no need to make the nsAutoTArray buffer bigger. I'm currently testing a patch that doesn't care about squeezing space and aligns for a 8-bytes word, more on that when it is validated to work.
Assignee | ||
Comment 35•15 years ago
|
||
The idea here is to align at 8-byte boundaries in nsAutoTArray and 4 or 8-byte boundaries in nsAutoTPtrArray (depending on the pointer size ; this is necessary for sparc64).
Considering the alignment constraints, there is only going to be either 0 or 4 bytes between the end of the nsTArray_base contents and the Header in the mAutoBuf. The idea then is to only keep track if we need to adjust for alignment (add 4 bytes) or not, which is 1 bit of information. As the Capacity is already only 31 bits, I tought it is possible to take a bit off the Length to store that information.
In the worst case this implementation wastes 7 + 4 bytes: 4 bytes for alignment, and 7 bytes if the type of data stored in the array is char and the array is created with 1 element only.
Attachment #425467 -
Attachment is obsolete: true
Attachment #427953 -
Flags: review?
Attachment #425467 -
Flags: review?(dbaron)
Assignee | ||
Updated•15 years ago
|
Attachment #427953 -
Flags: review? → review?(dbaron)
Comment 36•15 years ago
|
||
It seems like you're assuming that meeting the alignment requirements of PRUint64 will force an extra word of padding to cause alignment, but I don't think that's guaranteed. And if it doesn't happen, I think the approach you're taking will cause writing and reading 32 bits past the end of the array when it does adjust alignment. I think you could fix this and simplify the patch at the same time by just making GetAutoArrayBuffer static_cast this to an nsAutoTArray, and then just skip mAdjustAlignment.
(I eventually need to look a little more closely, though; this was just a relatively quick look.)
That said, I'd like to see the more correct approach that (a) uses the real alignment requirements of the type and (b) doesn't add the extra padding when it's not needed.
Comment 37•15 years ago
|
||
Er, never mind about the array bounds violation. I was crossing what you were actually doing with what I was expecting to see (and what we'd need to do for the full fix).
However, you could still simplify the patch a good bit and avoid stealing the extra bit by using a static_cast. (It might require moving the inline function's body down to the end of the file, with just a declaration as explicitly |inline| within the class definition.)
Assignee | ||
Comment 38•15 years ago
|
||
(In reply to comment #37)
> However, you could still simplify the patch a good bit and avoid stealing the
> extra bit by using a static_cast. (It might require moving the inline
> function's body down to the end of the file, with just a declaration as
> explicitly |inline| within the class definition.)
That can't work unless the function is moved in either nsTArray or nsTAutoArray. The latter would be painful because the function is used on nsTArrays in nsTArray.cpp, it would also mean having to either reimplement it in nsTPtrAutoArray or make nsTPtrAutoArray inherit from nsTAutoArray too.
As for real alignment requirements of the type, that can't work if the type is not a POD.
Assignee | ||
Comment 39•15 years ago
|
||
(In reply to comment #38)
> That can't work unless the function is moved in either nsTArray or
> nsTAutoArray. The latter would be painful because the function is used on
> nsTArrays in nsTArray.cpp, it would also mean having to either reimplement it
> in nsTPtrAutoArray or make nsTPtrAutoArray inherit from nsTAutoArray too.
Actually, that can't work at all, because nsTArray.cpp implements nsTArray_base and uses both UsesAutoArrayBuffer and GetAutoArrayBuffer in various functions. Moving these functions in the nsTArray template would be silly.
Comment 40•15 years ago
|
||
Comment on attachment 427953 [details] [diff] [review]
Patch for 8-byte alignment of nsAutoT{Ptr,}Array buffer
I think this will be acceptable (although not the ideal solution) if you:
* drop mAdjustAlignment
* make both auto array classes use the same extra thing for alignment (probably better as void* than PRInt64 to have less effect on 32-bit machines)
* make GetAutoArrayBuffer use a static_cast to either nsTAutoArray or (given what you say above) a struct nested inside nsTArray that looks just like it
Depending on the alignment requirements of the type *is* possible using the approach I described above, which is different from the approach you're taking in this patch.
Attachment #427953 -
Flags: review?(dbaron) → review-
Comment 41•15 years ago
|
||
(In reply to comment #40)
> Depending on the alignment requirements of the type *is* possible using the
> approach I described above, which is different from the approach you're taking
> in this patch.
(By "above", I mean comment 24 plus comment 33.)
Assignee | ||
Comment 42•15 years ago
|
||
(In reply to comment #40)
> (From update of attachment 427953 [details] [diff] [review])
> I think this will be acceptable (although not the ideal solution) if you:
> * drop mAdjustAlignment
> * make both auto array classes use the same extra thing for alignment
> (probably better as void* than PRInt64 to have less effect on 32-bit machines)
sparc has 32 bits pointers and needs 64 bits alignment for 64-bits ints, which is the original problem.
> * make GetAutoArrayBuffer use a static_cast to either nsTAutoArray or (given
> what you say above) a struct nested inside nsTArray that looks just like it
The problem is that GetAutoArrayBuffer is in nsTArray_base, and nsTAutoArray is a template... nsTAutoArray<PRInt32> will have different alignment requirements than nsTAutoArray<PRInt64>. What do you want to static_cast to ? The only way this is going to work is to move GetAutoArrayBuffer into the templates, but then, functions in nsTArray.cpp won't be able to work properly.
Comment 43•15 years ago
|
||
(In reply to comment #42)
> (In reply to comment #40)
> > (From update of attachment 427953 [details] [diff] [review] [details])
> > I think this will be acceptable (although not the ideal solution) if you:
> > * drop mAdjustAlignment
> > * make both auto array classes use the same extra thing for alignment
> > (probably better as void* than PRInt64 to have less effect on 32-bit machines)
>
> sparc has 32 bits pointers and needs 64 bits alignment for 64-bits ints, which
> is the original problem.
ok, stick to PRInt64
> > * make GetAutoArrayBuffer use a static_cast to either nsTAutoArray or (given
> > what you say above) a struct nested inside nsTArray that looks just like it
>
> The problem is that GetAutoArrayBuffer is in nsTArray_base, and nsTAutoArray is
> a template... nsTAutoArray<PRInt32> will have different alignment requirements
> than nsTAutoArray<PRInt64>. What do you want to static_cast to ? The only way
> this is going to work is to move GetAutoArrayBuffer into the templates, but
> then, functions in nsTArray.cpp won't be able to work properly.
Leave it in the base class; just nest the dummy struct in nsTArray_base.
Assignee | ||
Comment 44•15 years ago
|
||
(In reply to comment #43)
> Leave it in the base class; just nest the dummy struct in nsTArray_base.
Oh I see, so you just want to always align for 64 bits data, not depending on the type.
Comment 45•15 years ago
|
||
Yes. Your approach already does that for the size of the nsAutoTArray, so we may as well do it for the alignment as well.
Assignee | ||
Comment 46•15 years ago
|
||
One of the reasons I did it the way it currently is was to allow better alignment later (it should be possible to align at 4 or 8 bytes depending on the type being put in the array). If you are fine with always aligning at 8 bytes boundary, we can just go this way.
Assignee | ||
Comment 47•15 years ago
|
||
This should work as expected (currently building on sparc to really make sure).
2 comments:
- I didn't know where would be best to cut the long line in GetAutoArrayBuffer().
- I'm tempted to define a template for the AutoBuf, so that the trick doesn't have to be modified in two different files whenever you feel you want more compact arrays.
Attachment #427953 -
Attachment is obsolete: true
Assignee | ||
Comment 48•15 years ago
|
||
Another one:
- I'm also tempted to define the AutoArray struct using nsTArray_base itself, so that any change in the base class doesn't break everything.
Assignee | ||
Comment 49•15 years ago
|
||
Comment on attachment 429307 [details] [diff] [review]
Patch v2
Confirmed to work on amd64 and sparc. Please see comment 47 and comment 48.
Attachment #429307 -
Flags: review?(dbaron)
Comment 50•15 years ago
|
||
Comment on attachment 429307 [details] [diff] [review]
Patch v2
r=dbaron, if you add a comment above |AutoArray| that says it's a dummy struct just to simulate what the compiler does to the alignment of nsAutoTArray and nsAutoTPtrArray.
Sorry for the delay getting to this.
Attachment #429307 -
Flags: review?(dbaron) → review+
Assignee | ||
Comment 51•15 years ago
|
||
Addressing dbaron's comment 50.
Attachment #429307 -
Attachment is obsolete: true
Assignee | ||
Updated•15 years ago
|
Keywords: checkin-needed
Updated•15 years ago
|
Summary: nsUrlClassifierDBService has bad alignment due to nsAutoTArray<nsUrlClassifierEntry, N>, causes SIGBUS → nsAutoTArray aligns buffer to 32 bits, causing SIGBUS when array entry type requires 64-bit alignment
Comment 52•15 years ago
|
||
Status: NEW → RESOLVED
Closed: 15 years ago
Component: Phishing Protection → XPCOM
Keywords: checkin-needed
Product: Firefox → Core
QA Contact: phishing.protection → xpcom
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.3a4
Comment 53•15 years ago
|
||
Backed out. This caused a crash on Win debug builds:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1269270930.1269273254.15093.gz
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1269273122.1269275969.26503.gz
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 54•15 years ago
|
||
I don't see why this would crash. Is there a way to get more information from the crash ?
Comment 55•15 years ago
|
||
dao, both of those builds look like known random-orange entirely unrelated to this push.
Comment 56•15 years ago
|
||
I didn't spend too much time on it, since I was on the run. What's the bug for the random orange?
Assignee | ||
Comment 57•15 years ago
|
||
(In reply to comment #55)
> dao, both of those builds look like known random-orange entirely unrelated to
> this push.
To be fair, the second one crashes in nsTArray.h.
Updated•15 years ago
|
Keywords: checkin-needed
Comment 58•15 years ago
|
||
(In reply to comment #57)
> (In reply to comment #55)
> > dao, both of those builds look like known random-orange entirely unrelated to
> > this push.
>
> To be fair, the second one crashes in nsTArray.h.
Yeah, I saw that and considered the case closed...
Comment 59•15 years ago
|
||
I'm told the random orange is bug 542700.
Comment 60•15 years ago
|
||
Status: REOPENED → RESOLVED
Closed: 15 years ago → 15 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•