Closed Bug 320742 Opened 19 years ago Closed 19 years ago

crash [@ nsSHistory::EvictGlobalContentViewer] because session history listener shouldn't get called in middle of nsSHistory::AddEntry()

Categories

(Core :: DOM: Navigation, defect)

x86
Windows XP
defect
Not set
critical

Tracking

()

VERIFIED FIXED

People

(Reporter: waymost+firefox, Assigned: bryner)

Details

(Keywords: fixed1.8.1, verified1.8.0.2, Whiteboard: [rft-dl])

Attachments

(3 files, 7 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8) Gecko/20051111 Firefox/1.5
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8) Gecko/20051111 Firefox/1.5

When an entry is added to the session history, the session history listeners get called by AddEntry() before the entry is added to the history, but after the nsSHTransaction corresponding to the current entry is called. If someone registers a nsISHistoryListener that changes mIndex, the browser will crash from an access violation in EvictGlobalContentViewer (see TB13034494Y).

Either the listeners should be called after the entry is added or before currentTxn is acquired.

Reproducible: Always

Steps to Reproduce:
1. Register nsISHistoryListener that changes mIndex at OnHistoryNewEntry (by calling GetEntryAtIndex(entry, true).
2. Run Fx and surf to new page not in session history.

Actual Results:  
Browser crash.

Expected Results:  
Browser should not crash upon changing nsSHistory.mIndex using nsISHistoryListener.OnHistoryNewEntry via GetEntryAtIndex(entry, true).

This bug prevents a nsISHistoryListener from changing the index that the session history is currently at. This affects extensions working with the session history as well as updates/modifications to the session history made by the browser on the JS-level.
Attached file example javascript code (obsolete) (deleted) —
Here's an example of code that causes this bug. See lines 253-272.
This might be related to bug 320488.

Incident ID: 13034494
Stack Signature	nsSHistory::EvictGlobalContentViewer 25b32c3e
Product ID	Firefox15
Build ID	2005111116
Trigger Time	2005-12-16 14:46:11.0
Platform	Win32
Operating System	Windows NT 5.1 build 2600
Module	firefox.exe + (0032183a)
URL visited	
User Comments	SHE test crash
Since Last Crash	2224 sec
Total Uptime	596063 sec
Trigger Reason	Access violation
Source File, Line No.	c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpfe/components/shistory/src/nsSHistory.cpp, line 852
Stack Trace 	
nsSHistory::EvictGlobalContentViewer  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpfe/components/shistory/src/nsSHistory.cpp, line 852]
nsSHistory::EvictContentViewers  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpfe/components/shistory/src/nsSHistory.cpp, line 646]
nsPresContext::EnsureVisible  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/layout/base/nsPresContext.cpp, line 1294]
PresShell::UnsuppressAndInvalidate  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/layout/base/nsPresShell.cpp, line 5022]
PresShell::UnsuppressPainting  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/layout/base/nsPresShell.cpp, line 5070]
nsDocShell::EndPageLoad  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/docshell/base/nsDocShell.cpp, line 4747]
nsWebShell::EndPageLoad  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/docshell/base/nsWebShell.cpp, line 664]
nsDocShell::OnStateChange  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/docshell/base/nsDocShell.cpp, line 4673]
nsDocLoader::FireOnStateChange  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/uriloader/base/nsDocLoader.cpp, line 1210]
nsDocLoader::doStopDocumentLoad  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/uriloader/base/nsDocLoader.cpp, line 844]
nsDocLoader::OnStopRequest  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/uriloader/base/nsDocLoader.cpp, line 665]
nsLoadGroup::RemoveRequest  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/netwerk/base/src/nsLoadGroup.cpp, line 732]
nsDocument::UnblockOnload  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/content/base/src/nsDocument.cpp, line 4968]
DestroyImagePLEvent  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/content/base/src/nsImageLoadingContent.cpp, line 670]
SHELL32.dll + 0x4e0c24 (0x778b0c24)
Severity: normal → critical
Summary: session history listener shouldn't get called in middle of nsSHistory::AddEntry() → crash [@ nsSHistory::EvictGlobalContentViewer] because session history listener shouldn't get called in middle of nsSHistory::AddEntry()
Version: unspecified → Trunk
I was unable to crash a recent trunk build by evaluating this code in the scope of browser window, calling the onload handler manually, and navigating to a page not in session history.
1) Is it still a problem with trunk builds?
2) If so, could you provide some code that can be used to reproduce the crash.
Attached file simpler bug example code (obsolete) (deleted) —
The relevant source (nsSHistory::AddEntry and nsSHistory::EvictGlobalContentViewer) hasn't changed between the branch and trunk so the bug should still be there. I am including simpler example code to make it easier to review. All you should have to do is make the SomeHistoryListener object a SHistoryListener. Then you should get a dialog right before the crash occurs. (This is to help make sure that the listener is being added correctly, since I had issues doing this through the js console, as well as easily verify the crash).

As I mentioned earlier, it should be a pretty simple fix: just move the shistorylistener notification code to either before all the add entry code or after it all, not in the middle of it like it is now.
Attachment #206262 - Attachment is obsolete: true
Attached file my test code (deleted) —
Okay, I'm probably not helping, since I'm actually not familiar with this code and just trying to reproduce.

That last testcase has syntax error. Here's the (a bit simplified, but I believe equivalent) code I'm testing. I get the alert, then no crash. The session history is truncated to the first two entries. This happens on both trunk debug and 1.5 release, so no difference there :)
Looking back at my original testcase, the crash may also have to do with the replaceEntry calls, which I originally didn't think we're playing a role. The location in the code where the history listeners get called should be fixed regardless, but it sounds like something else in that original block of code is contributing to the crash. With the correct minimum case, it should definitely be crashing, because when I first came upon this, I tried plenty of different things and it always crashed at the exact same spot in the code.

If you could e-mail the js to get the listener working through the console (which would be easier than reconstructing the old extension code at this point), then I can hammer out a correct minimum testcase that actually reproduces the crash, and repost it here for confirmation.

(btw, Nikolay, you definitely are helping; since this is turning out to be caused by something different than what I suspected initially)
Attached file minimum testcase (deleted) —
So I'm pretty sure I got to the bottom of this bug and the recent confusion. It does have to do with the listener being called in the middle of the AddEntry function, but there's another wrinkle I didn't notice until this morning.

What my original and new code does, when the listener is registered and a new history entry is being created, is get the index and highest index (history.count - 1) in the session history, see if they are unequal and, if so, go to a different entry in the session history -- the last entry. That's the wrinkle. The code I posted to Nikolay's initial comment didn't get the last entry, it got the entry at index 1, because I hadn't realized this aspect of the problem. AFAICT, this is what's happening in the code.

Given: the session history currently has an mLength of 5, is current at mIndex=2, my most recent shistorylistener is attached
1) I click on a link, generating a new history entry, calling nsSHistory:AddEntry
2) The code before the listener runs: the current transaction is gotten and the persist flag is verified. (If the persist flag is false, the new entry won't get created and we'll be okay, but that's only the case with things like chrome urls.)
3) The SHistoryListeners are iterated through and each one's OnHistoryNewEntry method is called.
4) My listener verifies that we are not at the last entry in the session history (which, given the assumptions, is true) and then sets the current entry of the session history to the last entry in it. This causes mIndex to be set to 4 (mLength - 1).
5) AddEntry then creates a new transaction for the new entry and appends it to the current transaction. (At this point, the transaction list is actually okay. The only problem we have is mIndex being wrong.)
6) mLength is set to (++mIndex)+1. Now mLength is set to 6 since mIndex is erroneously 4. This is the crux of the problem. The session history should only have a length of 4 (with mIndex = 3). So now, any code that uses mLength and uses it to iterate along the transaction list will break...
7) which happens in nsSHistory::EvictGlobalContentViewers. The code uses mLength to set the end condition of the for loop. SHTransactions are iterated through. The last transaction is reached, GetNext is called, and *pop* we get an access violation and crash.

So that's what's going on. The attachment here should create the crash by attaching the listener, getting a few entries in the session history, go back a few entries, and click a link. You'll get the polite dialog saying you're going to crash, and you should. I don't have the trunk installed, but I looked at the code and it's exactly the same from what I can tell, so it should crash on either.

Here's a talkback entry for a crash with the attached code. TB14319458X

Stack Signature	 nsSHistory::EvictGlobalContentViewer bcd008dc
Product ID	Firefox15
Build ID	2005111116
Trigger Time	2006-01-23 09:28:54.0
Platform	Win32
Operating System	Windows NT 5.1 build 2600
Module	firefox.exe + (0032183a)
URL visited	http://forums.mozillazine.org/viewforum.php?f=23
User Comments	bug 320742 crash
Since Last Crash	161211 sec
Total Uptime	3577157 sec
Trigger Reason	Access violation
Source File, Line No.	c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpfe/components/shistory/src/nsSHistory.cpp, line 852
Stack Trace 	
nsSHistory::EvictGlobalContentViewer  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpfe/components/shistory/src/nsSHistory.cpp, line 852]
nsSHistory::EvictContentViewers  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpfe/components/shistory/src/nsSHistory.cpp, line 646]
nsPresContext::EnsureVisible  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/layout/base/nsPresContext.cpp, line 1294]
PresShell::UnsuppressAndInvalidate  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/layout/base/nsPresShell.cpp, line 5022]
PresShell::ProcessReflowCommands  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/layout/base/nsPresShell.cpp, line 6938]
ReflowEvent::HandleEvent  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/layout/base/nsPresShell.cpp, line 6696]
PL_HandleEvent  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpcom/threads/plevent.c, line 689]
SHELL32.dll + 0x4e0c24 (0x778b0c24)
Attachment #209283 - Attachment is obsolete: true
Status: UNCONFIRMED → NEW
Ever confirmed: true
With regards to where the listener should be called, it makes sense to me that it be called after the new entry is added to the history. That way, one can write code that can access the new entry. Currently, the newURI is passed, but that doesn't do anything for the actual new entry. The AddEntry call can't be overriden by a SHistoryListener either (unlike every other action a session history listener can react to), so why not allow access to the new entry via OnHistoryNewEntry?

As an aside, why don't we allow the listener to prevent the addition of the new entry? It can stop any other session history navigation, purge, and so forth.

Also, should we have this bug block bug 324285 so the IDL documentation can be more specific as to when OnHistoryNewEntry is called?
Component: History → History: Session
Product: Firefox → Core
QA Contact: history → history.session
I think fully adding and only then notifying makes the most sense, yes.

We might as well fix the IDL here.  That is, no need to block bug 324285 -- just let it land and then patch on top of it.

Matthew, would you be willing to take a stab at fixing this?  At this point you probably know this code as well as anyone else does....
Flags: blocking1.9a1?
Flags: blocking1.8.1?
Sure, I'm more than willing to write up a fix.
Attached patch patch (obsolete) (deleted) — Splinter Review
Here's the patch with the listener called after the entry has been completely added. The only code that comes after the listener call is whether to purge the history of extra entries, since the code in the listener might add/remove entries.
Attachment #209647 - Flags: review?(bryner)
Hmm, I just realized an unintended consequence of the patch. If we complete the addition of the entry, then we lose the index that the session history was previously at, which might also be useful. We could pass this through the OnHistoryNewEntry as an argument, but that, of course, would require the API to be changed.

We could also put the call to the listener before we add the entry to the history (which would be consistent with when other functions of SHistoryListener is called since every other action is stoppable). Then we don't have access to the added entry.

For my purposes as someone who uses the SHistoryListener in extensions, I'd find the first option easiest. Currently, Session History Extension uses the index of where the session history was previously and then adds a WebProgressListener to get the newly added entry. Avoiding this hacky solution would only be accomplished by passing the index through to the method via arguments. I'm sure there are other concerns as well, but that's my two cents as an extension author. (IMHO, the current SHistoryListener interface is somewhat inflexible and not as useful as it could be, but that's a discussion which probably doesn't belong in this bug.)

(Also, I guess I can't assign myself to the bug? Can a driver assign me then?)
Assignee: nobody → waymost+firefox
Status: NEW → ASSIGNED
Comment on attachment 209647 [details] [diff] [review]
patch

I'm working on a new version of the patch bcause I think there's a better solution than just moving the listener calling code around.
Attachment #209647 - Flags: review?(bryner)
Attached patch patch v2 (obsolete) (deleted) — Splinter Review
I figured the best thing to do for now was to fix the bug while keeping everything else consistent. Since every other method in SHistoryListener is called prior to the event occurring, I think keeping that consistent for now makes most sense.

So the listener is actually called in the exact same spot, except I store mIndex prior to calling the listeners. Then, if mIndex exceeds the original current index, I set it back (basically nullify the change). This gets rid of the bug without changing how anything else works.
Attachment #209647 - Attachment is obsolete: true
Attachment #209676 - Flags: review?(bryner)
Comment on attachment 209676 [details] [diff] [review]
patch v2

It seems like a better solution would be to refetch currentTxn after notifying the listener.  That way, the listener can still change mIndex if desired (certainly that seems like what your extension wants to do), and we'll still cut off the list at the right place.
Attachment #209676 - Flags: review?(bryner) → review-
bryner and I stepped through this and found that what caused unexpected behavior was when a listener changes mIndex and then currentTxn is not pointing to the same place as mIndex.  An easy fix to this is retrieving the currentTxn again after the listeners are called.
Attachment #209676 - Attachment is obsolete: true
Attachment #209851 - Flags: review?(bryner)
(In reply to comment #15)
> (From update of attachment 209676 [details] [diff] [review] [edit])
> It seems like a better solution would be to refetch currentTxn after notifying
> the listener.  That way, the listener can still change mIndex if desired
> (certainly that seems like what your extension wants to do), and we'll still
> cut off the list at the right place.
> 

That definitely makes more sense since it actually fixes the expected behavior as opposed to working around it. And you're absolutely right, that is what my extension wants to do.

Should we add a comment to the new currentTxn retrieval, so its clear why we're getting it a second time? Maybe something along the lines of "Getting currentTxn again in case a listener modified mIndex." I know if I looked at the code without having gone through this, I'd be a little confused by what would look like a redundant call.
Comment on attachment 209851 [details] [diff] [review]
retrieve the currentTxn again after notifying observers

>--- docshell/shistory/src/nsSHistory.cpp	5 Jan 2006 21:01:24 -0000	1.74
>+++ docshell/shistory/src/nsSHistory.cpp	27 Jan 2006 16:37:51 -0000
>@@ -298,16 +298,19 @@ nsSHistory::AddEntry(nsISHEntry * aSHEnt
>       nsCOMPtr<nsIHistoryEntry> hEntry(do_QueryInterface(aSHEntry));
>       if (hEntry) {
>         hEntry->GetURI(getter_AddRefs(uri));
>         listener->OnHistoryNewEntry(uri);
>       }
>     }
>   }
> 
>+  if(mListRoot)
>+    GetTransactionAtIndex(mIndex, getter_AddRefs(currentTxn));
>+

I'm a little worried about this slowing down entry insertion in the common case (where mIndex will be unchanged) since it has to walk a linked list up to mIndex.  So I'd suggest two changes:

1. Compare the index before and after calling the listener, and only call GetTransactionAtIndex a second time if the index has changed.  Add a comment saying that the listener could change the index, since it's clearly not obvious given this bug.

2. Move the whole thing up into the |if (hEntry)| block, since we know that a history listener is the only thing that could change mIndex during execution of this method.
Attachment #209851 - Flags: review?(bryner) → review-
Attached patch patch v3.1 (obsolete) (deleted) — Splinter Review
Patch updated per bryner's changes.

When deciding whether to refetch currentTxn, I only check whether currentIndex == mIndex. If we're inside |if (hEntry)| block, we should already know that there's an mListRoot already, making that check unnecessary.
Attachment #209874 - Flags: review?(bryner)
Comment on attachment 209874 [details] [diff] [review]
patch v3.1

>--- xpfe/components/shistory/src/nsSHistory.cpp	30 Sep 2005 02:14:22 -0000	1.66.6.2
>+++ xpfe/components/shistory/src/nsSHistory.cpp	27 Jan 2006 19:25:06 -0000
>@@ -292,25 +292,32 @@ nsSHistory::AddEntry(nsISHEntry * aSHEnt
>     NS_ENSURE_SUCCESS(currentTxn->SetSHEntry(aSHEntry),NS_ERROR_FAILURE);
>     currentTxn->SetPersist(aPersist);
>     return NS_OK;
>   }
> 
>   nsCOMPtr<nsISHTransaction> txn(do_CreateInstance(NS_SHTRANSACTION_CONTRACTID));
>   NS_ENSURE_TRUE(txn, NS_ERROR_FAILURE);
> 
>+  PRInt32 currentIndex = mIndex;
>+

This should move inside the |if (hEntry)| block as well.  r=me with that fixed, I'll check it in.
Attachment #209874 - Flags: review?(bryner) → review+
Attached patch final patch (obsolete) (deleted) — Splinter Review
Final patch
Attachment #209874 - Attachment is obsolete: true
Attachment #209876 - Flags: review?(bryner)
Comment on attachment 209876 [details] [diff] [review]
final patch

this doesn't actually have the change I mentioned.  don't worry about attaching another patch though.
Attachment #209876 - Flags: review?(bryner)
Whoops, attached the same patch again, instead of the new one. Sorry about that.
/cvsroot/mozilla/docshell/shistory/src/nsSHistory.cpp,v  <--  nsSHistory.cpp
new revision: 1.75; previous revision: 1.74

I'm shooting to get this in on the branches as well.
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Attached patch patch that was checked in (deleted) — Splinter Review
Assignee: waymost+firefox → bryner
Attachment #209851 - Attachment is obsolete: true
Attachment #209876 - Attachment is obsolete: true
Status: RESOLVED → ASSIGNED
Attachment #209884 - Flags: approval1.8.0.2?
Resolution: FIXED → ---
Attachment #209884 - Flags: approval1.8.1?
Comment on attachment 209884 [details] [diff] [review]
patch that was checked in

drivers approval isn't needed for 1.8.1
Attachment #209884 - Flags: approval1.8.1?
checked in on MOZILLA_1_8_BRANCH.
Status: ASSIGNED → RESOLVED
Closed: 19 years ago19 years ago
Keywords: fixed1.8.1
Resolution: --- → FIXED
This crash is still showing up a little bit on 1.5.0.1, so it would be good to get this fix in for the next update.
This bug fix has been checked in on both the 1.8.1 branch and trunk for over two week without any reports of regression. Based on that and what bryner mentioned in comment 29, this should get checked in for 1.8.0.2.
Matthew, in case you missed it the patch is waiting on driver approval for 1.8.0.2.  Drivers are currentl working on the 1.7 branch releases, after which they will look at 1.8.0.2 stuff.
Comment on attachment 209884 [details] [diff] [review]
patch that was checked in

approved for 1.8.0 branch, a=dveditz
Attachment #209884 - Flags: approval1.8.0.2? → approval1.8.0.2+
Flags: blocking1.8.0.2+
checked in on MOZILLA_1_8_0_BRANCH
Keywords: fixed1.8.0.1
oops, fixed1.8.0.2, not 1.8.0.1.
Marking [rft-dl] (ready for testing in Firefox 1.5.0.2 release candidates)
Whiteboard: [rft-dl]
Verified with Fx 1.5.0.2 on Windows build 2006-03-02-06-mozilla1.8.0
Status: RESOLVED → VERIFIED
Whiteboard: [rft-dl]
Whiteboard: [rft-dl]
Flags: blocking1.8.1?
Flags: blocking1.9a1?
Component: History: Session → Document Navigation
QA Contact: history.session → docshell
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: