Bugzilla

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Comment 3

•

18 years ago

So nsBaseContentList::Reset gets re-called inside
nsBaseContentList::Reset... :(

Comment 4

•

18 years ago

It would probably be a good idea to make nsCOMArray::Clear safer such that it kills elements from the back by removing them from the array first and then releases them.

Assignee

Comment 5

•

18 years ago

Attached patch fixes the crash (obsolete) (deleted) — Details — Splinter Review

This fixes the crash, but I agree it is better to make nsCOMArray safer.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Assignee

Comment 6

•

18 years ago

Attached patch make nsCOMArray safer (obsolete) (deleted) — Details — Splinter Review

Comment 7

•

18 years ago

Yeah, I think we should fix this in the comarray instead.

Actually, the best way to fix it in the comarray would probably be to copy mImpl to a local variable and clear the elements in there.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 8

•

18 years ago

Comment on attachment 240593 [details] [diff] [review]
make nsCOMArray safer

I'd say this is a bit too slow. Either do the mImpl trick, or you could alternativly add a .swap() function to nsVoidArray and nsCOMArray use that.

Assignee

Comment 9

•

18 years ago

Attached patch proposed patch. (deleted) — Details — Splinter Review

The patch fixes the crash (tested both 1.8 and trunk) and
is perhaps the simplest way to fix this.

Adding Swap wouldn't work in general cases, if someone tried
to swap for example SimpleVoidArray and nsAutoVoidArray.
Or at least I couldn't come up with any sensible API.

Assignee: general → Olli.Pettay

Attachment #240591 - Attachment is obsolete: true

Attachment #240593 - Attachment is obsolete: true

Status: NEW → ASSIGNED

Assignee

Updated

•

18 years ago

Component: DOM → XPCOM

:Gavin Sharp [email: gavin@gavinsharp.com]

Assignee

Comment 10

•

18 years ago

Comment on attachment 240610 [details] [diff] [review]
proposed patch.

Not sure who is the right person to review this.

Attachment #240610 - Flags: review?(dbaron)

Updated

•

18 years ago

QA Contact: ian → xpcom

Comment 11

•

18 years ago

So wait.  How did we get into a state where the content list was the only thing holding a ref to the image?  That shouldn't be happening, should it?  Or is this on branch and the image is in a disconnected subtree or something?

Comment 12

•

18 years ago

Or is the issue that state is DIRTY and hence what's in the list has no bearing on reality?

We should probably clear the list when state goes DIRTY now that we're holding strong refs.  :(  Otherwise we could well hold refs to random stuff for a _long_ time.  I don't like the performance aspects, but I'm not that happy with the perf aspects of the COMArray changes either, and those would affect other callers too.  And the memory usage issue remains even with the COMArray changes.

Comment 13

•

18 years ago

Actually, perf of clearing when going to DIRTY is fine -- just moves the clear to a different location.  And then assert that we're empty where we clear right now.

Comment 14

•

18 years ago

This is a regression, for what it's worth.  We should really get this fixed for 1.8.1, I think.  :(

Flags: blocking1.9?

Flags: blocking1.8.1?

Flags: blocking1.8.0.8?

Assignee

Comment 15

•

18 years ago

(In reply to comment #11)
>Or is this on branch 

This happens both in trunk and branch

Comment 16

•

18 years ago

Comment on attachment 240610 [details] [diff] [review]
proposed patch.

For what it's worth, I tend to think bugs like this are bugs in the caller -- and I'm not really confident that this will prevent bad things from happening in the future, since we'll still make it all the way to recurring into nsBaseContentList::Reset, which could easily do bad things.

That said, this patch does seem ok, although

> void
> nsCOMArray_base::Clear()
> {
>-    mArray.EnumerateForwards(ReleaseObjects, nsnull);
>-    mArray.Clear();
>+    nsVoidArray tmpArray;

I don't know of a good way to write an assertion around types without writing a bunch of templates, but at least add an assertion that sizeof(tmpArray) and sizeof(mArray) are equal, with the assertion text that tmpArray and mArray must be the same type.


I'd offset these 4 lines with whitespace before and after them, with the comment "swap tmpArray and mArray".
>+    char tmp[sizeof(nsVoidArray)];
>+    memcpy(tmp, &tmpArray, sizeof(nsVoidArray));
>+    memcpy(&tmpArray, &mArray, sizeof(nsVoidArray));
>+    memcpy(&mArray, tmp, sizeof(nsVoidArray));

>+    tmpArray.EnumerateForwards(ReleaseObjects, nsnull);

>+    tmpArray.Clear();

No need to bother with explicitly Clearing tmpArray.

Might want additional review from bsmedberg if he's available.

Attachment #240610 - Flags: review?(dbaron)

Attachment #240610 - Flags: review?(benjamin)

Attachment #240610 - Flags: review+

Comment 17

•

18 years ago

(Recurring into PopulateSelf seems more dangerous than recurring into Reset, and we'd still do that as well.)

Benjamin Smedberg

Comment 18

•

18 years ago

Comment on attachment 240610 [details] [diff] [review]
proposed patch.

Why can't this just be:

nsVoidArray tmp(mArray);
mArray.Clear();

tmp.EnumerateForward(ReleaseObjects, nsnull);

Assignee

Comment 19

•

18 years ago

(In reply to comment #18)
> (From update of attachment 240610 [details] [diff] [review] [edit])
> Why can't this just be: 

because
‘nsVoidArray::nsVoidArray(const nsVoidArray&)’ is private

Benjamin Smedberg

Comment 20

•

18 years ago

Yay for us. Then how about nsVoidArray tmp; tmp = mArray; since the assignment operator is implemented?

Assignee

Comment 21

•

18 years ago

(In reply to comment #20)
> Yay for us. Then how about nsVoidArray tmp; tmp = mArray; since the assignment
> operator is implemented?
> 

That does more work than just those 3 memcpys.

But I'm just testing whether this could be fixed by calling Reset()
whenever state goes dirty.

Assignee

Comment 22

•

18 years ago

Attached patch alternative patch (deleted) — Details — Splinter Review

This works too. Clearing the elements list whenever the state goes dirty.
(Tested on 1.8 and trunk, though this patch applies to trunk only)

Attachment #240637 - Flags: review?(bzbarsky)

Comment 23

•

18 years ago

Any idea when this regressed?

Comment 24

•

18 years ago

So what caused this regression, and why?  It seems like we might be better off moving to as well-tested a state as we can at this point.

The second patch seems to be asserting that we're better off releasing during ContentInserted, ContentAppended, ContentRemoved, and AttributeChanged.  How do we know it's safe to release there?  Could it introduce some other bug?

Adam Guthrie

Comment 25

•

18 years ago

This regressed on branch between 2006-08-28-04 and 2006-08-29-04. Checkins: http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=PhoenixTinderbox&branch=MOZILLA_1_8_BRANCH&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2006-08-28+03&maxdate=2006-08-29+05&cvsroot=%2Fcvsroot

So I guess that puts bug 348062 on the hook.

Comment 26

•

18 years ago

Talking to DBaron and Sicking we are a little nervous about this fix for RC2.  Given we have the same problem on 1.5.0.7 (unfortunately) we should probably review and land the second patch on trunk and try and get this on a 1.8.1.1 release..

Flags: blocking1.8.1?

Flags: blocking1.8.1.1?

Flags: blocking1.8.1-

Comment 27

•

18 years ago

And by second patch, we mean attachment 240637 [details] [diff] [review].

Comment 28

•

18 years ago

Comment on attachment 240610 [details] [diff] [review]
proposed patch.

Removing my review+ given that I think it's really the callers responsibility not to recur into manipulating objects that are currently being manipulated.

Attachment #240610 - Flags: review+

Comment 29

•

18 years ago

Yeah, this is a regression from bug 348062.

I do think we should fix this on branches, and I can see being nervous about it for RC2.  Given that this issue can only be triggered by extensions (because it involves progress listeners) and that we can probably live with the memory issues, doing this for 1.8.0.9/1.8.1.1 sounds reasonable.

Blocks: 348062

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 30

•

18 years ago

Comment on attachment 240637 [details] [diff] [review]
alternative patch

>Index: content/base/src/nsContentList.h
>+  /**
>+   * Sets the state to LIST_DIRTY and clears mElements array.
>+   */

Document that this is the only acceptable way to set state to LIST_DIRTY?

With that r+sr=bzbarsky. Thanks for dealing with this!

Attachment #240637 - Flags: superreview+

Attachment #240637 - Flags: review?(bzbarsky)

Attachment #240637 - Flags: review+

Comment 31

•

18 years ago

So we have three options here:

1. Back out the original patch that caused this. That regresses a security bug.
   Though arguably that patch caused another security bug, i.e. this one, so 
   we'd regress an uncommon crasher for a more common one.

2. Leave as status quo. I'm actually worried that there may be other things that
   could be regressed here. All sorts of things tend to happen in destructors,
   so it's bad that we're potentially killing nodes in unexpected places.

3. Take one of the patches in this bug.


I actually think we should do 3. I don't think that the crash we're seing in this bug is known uncommon enough that we should try to live with it until the next dot-release. Landing attachment 240637 [details] [diff] [review] makes us behave much more like we did before the regression, without exposing us to security issues.

I do think we should land attachment 240610 [details] [diff] [review] as well (though in a slightly different form), but not for 2.0 release.

Flags: blocking1.8.1- → blocking1.8.1?

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 32

•

18 years ago

Comment on attachment 240637 [details] [diff] [review]
alternative patch

>@@ -787,20 +783,19 @@ void 
> nsContentList::PopulateSelf(PRUint32 aNeededLength)
> {
>   if (!mRootNode) {
>     return;
>   }
> 
>   ASSERT_IN_SYNC;
> 
>-  if (mState == LIST_DIRTY) {
>-    Reset();
>-  }

Please leave this in on the branch, just in case.

>   PRUint32 count = mElements.Count();
>+  NS_ASSERTION(mState != LIST_DIRTY || count == 0,
>+               "Reset() not called when setting state to LIST_DIRTY?");
> 
>   if (count >= aNeededLength) // We're all set
>     return;
> 
>   PRUint32 elementsToAppend = aNeededLength - count;
> #ifdef DEBUG
>   PRUint32 invariant = elementsToAppend + mElements.Count();
> #endif

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 33

•

18 years ago

Comment on attachment 240637 [details] [diff] [review]
alternative patch

r/sr=me fwiw

Attachment #240637 - Flags: approval1.8.1?

Mike Beltzner [:beltzner, not reading bugmail]

Comment 34

•

18 years ago

Comment on attachment 240637 [details] [diff] [review]
alternative patch

Approved for RC2. Thanks everyone.

Attachment #240637 - Flags: approval1.8.1? → approval1.8.1+

Updated

•

18 years ago

Whiteboard: [checkin needed (1.8 branch)]

Assignee

Comment 35

•

18 years ago

Checked in to trunk and 1.8.1

Keywords: fixed1.8.1

Whiteboard: [checkin needed (1.8 branch)]

Assignee

Updated

•

18 years ago

Attachment #240637 - Flags: approval1.8.0.8?

:Gavin Sharp [email: gavin@gavinsharp.com]

Assignee

Comment 36

•

18 years ago

(Btw, when using *trunk* to test this (with or without patches), I see quite often crash Bug 330776 happening before the actual test is loaded. 
I've seen that bug also in some other cases when leaving a bug document. But anyway, that has nothing to do with this bug.)

Updated

•

18 years ago

Depends on: 355026

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 37

•

18 years ago

This caused bug 355026, apparently on the 1.8 branch only.

Assignee

Comment 38

•

18 years ago

(In reply to comment #37)
> This caused bug 355026, apparently on the 1.8 branch only.
> 

ARGH!

Assignee

Comment 39

•

18 years ago

Comment on attachment 240637 [details] [diff] [review]
alternative patch

Because of bug 355026, need to think better patch for branches.

Attachment #240637 - Flags: approval1.8.0.8?

Assignee

Comment 40

•

18 years ago

Backed out from 1.8.1 per comment
https://bugzilla.mozilla.org/show_bug.cgi?id=355026#c7

Keywords: fixed1.8.1

Comment 41

•

18 years ago

Clearing the 1.8.1 nom.  We are likely going to be shipping with this since the fix has caused regressions.  Would be great to get a fix ready for 1.8.1.1

Flags: blocking1.8.1?

Comment 42

•

18 years ago

Er, right.  So the problem is that the design on branch is to mark state as dirty at the end of PopulateSelf if we have no way of keeping track of mutations.

That doesn't work if marking dirty clears the list.

So the answer is probably to hoist that SetDirty call out to the callers of PopulateSelf.  So the code flow should be:

1)  PopulateSelf
2)  Get the answer the caller wants (node or index or whatever)
3)  SetDirty().

Right now #3 is happening before #2, which means we always have an empty list there, which is what breaks.

Comment 43

•

18 years ago

Attached patch Branch patch with comment 42 taken into account (deleted) — Details — Splinter Review

I'm more or less gone until sometime Tuesday late afternoon.  Sicking, if you decide this is ok, and it gets some testing, would you mind driving it in?

Attachment #240859 - Flags: superreview?(bugmail)

Attachment #240859 - Flags: review?(bugmail)

Assignee

Comment 44

•

18 years ago

Comment on attachment 240859 [details] [diff] [review]
Branch patch with comment 42 taken into account

Doesn't really matter, but r=me, even though this is a bit ugly. (But don't know anything else that could be done on branches.)

Attachment #240859 - Flags: review+

Updated

•

18 years ago

Keywords: regression

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 45

•

18 years ago

Moving nomination to 1.8.0.9 -- if it's too risky for 1.8.1 then we don't want it in 1.8.0.8 either. Especially given the regressions.

If 1.8.1 approves attachment 240859 [details] [diff] [review] we'll take it for 1.8.0.8 too, but given the paltry number of nightly 1.8.0.x testers we can't take as many regression risks.

Flags: blocking1.8.0.8? → blocking1.8.0.9?

Comment 46

•

18 years ago

Oh, and note to self and everyone else here. When we're this close to release we should always make sure to write, review and test branch patches, not trunk patches. Not trying to put blame one anyone else since I reviewed and recommended the original patch to go on branch, just trying to learn from the misstake.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 47

•

18 years ago

Comment on attachment 240859 [details] [diff] [review]
Branch patch with comment 42 taken into account

> nsContentList::PopulateSelf(PRUint32 aNeededLength)
> {
>   if (mState == LIST_DIRTY) {
>     Reset();
>   }
>   PRUint32 count = mElements.Count();
>+  NS_ASSERTION(mState != LIST_DIRTY || count == 0,
>+               "Reset() not called when setting state to LIST_DIRTY?");

The assertion should go before the |if| statement above.

r/sr=me with that. But I do think we should wait until 1.8.1.1 for this. Unfortunately the trunk is different enough that we can't really trunk bake this patch, so we should land asap after the tree opens.

Nominating for 1.8.0.8, but we may consider waiting for .9

Attachment #240859 - Flags: superreview?(bugmail)

Attachment #240859 - Flags: superreview+

Attachment #240859 - Flags: review?(bugmail)

Attachment #240859 - Flags: review+

Attachment #240859 - Flags: approval1.8.0.8?

Comment 48

•

18 years ago

The other thing we should do is add dynamic regression tests for content lists (there are XXX comments to this effect in the file) so we could exercise this code.  If we'd had those up, testing of the initial patch would have caught the problem.  At least if we had a way to run the tests on branch...

Comment 49

•

18 years ago

Oh, and removing 1.9 blocking nomination, since things are all good there.

Flags: blocking1.9?

Benjamin Smedberg

Updated

•

18 years ago

Attachment #240610 - Flags: review?(benjamin)

Comment 50

•

18 years ago

Comment on attachment 240859 [details] [diff] [review]
Branch patch with comment 42 taken into account

We still don't want this on 1.8.0.8 if you think it should wait for 1.8.1.1. Moving approval request to 1.8.0.9

Attachment #240859 - Flags: approval1.8.0.9?

Attachment #240859 - Flags: approval1.8.0.8?

Attachment #240859 - Flags: approval1.8.0.8-

Comment 51

•

18 years ago

Comment on attachment 240637 [details] [diff] [review]
alternative patch

Clearing a flag to match patch backout

Attachment #240637 - Flags: approval1.8.1+

Comment 52

•

18 years ago

Moving nomination so we don't lose track, since this hasn't landed on trunk for testing the current release seems unlikely.

Flags: blocking1.8.1.2?

Flags: blocking1.8.1.1?

Flags: blocking1.8.0.9?

Flags: blocking1.8.0.10?

Assignee

Comment 53

•

18 years ago

This has landed on trunk, but 1.8 patch is a bit different.

Status: ASSIGNED → RESOLVED

Closed: 18 years ago

Resolution: --- → FIXED

Updated

•

18 years ago

Attachment #240859 - Flags: approval1.8.1.1?

Worcester12345

Comment 54

•

18 years ago

(In reply to comment #53)
> This has landed on trunk, but 1.8 patch is a bit different.
> 

So, shouldn't this be reopened, since it is a branch bug? Or is there another bug for branch and make this one trunk?

Comment 55

•

18 years ago

Comment on attachment 240859 [details] [diff] [review]
Branch patch with comment 42 taken into account

approved for 1.8/1.8.0 branch, a=dveditz for drivers

Attachment #240859 - Flags: approval1.8.1.1?

Attachment #240859 - Flags: approval1.8.1.1+

Attachment #240859 - Flags: approval1.8.0.9?

Attachment #240859 - Flags: approval1.8.0.9+