Closed Bug 572980 Opened 14 years ago Closed 14 years ago

e10s: HttpChannelChild incorrectly refcounted

Categories

(Core :: Networking: HTTP, defect)

x86
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
Tracking Status
fennec 2.0a1+ ---

People

(Reporter: jdm, Assigned: jdm)

References

Details

Attachments

(1 file, 6 obsolete files)

I still can't figure out what's going wrong here, but this stack shows the HttpBaseChannel destructing, which ends up causing the same HttpBaseChannel to destruct again later.


(gdb) bt
#0  0x00daf416 in __kernel_vsyscall ()
#1  0x00b16df6 in nanosleep () from /lib/libc.so.6
#2  0x00b16c11 in sleep () from /lib/libc.so.6
#3  0x0106a299 in ah_crap_handler (signum=11) at /home/t_mattjo/src/firefox/mobilebase/toolkit/xre/nsSigHandlers.cpp:132
#4  0x0106a2f2 in child_ah_crap_handler (signum=11) at /home/t_mattjo/src/firefox/mobilebase/toolkit/xre/nsSigHandlers.cpp:145
#5  <signal handler called>
#6  0x01047d62 in nsTArray_base::Length (this=0xb08bf100) at ../../../../dist/include/nsTArray.h:66
#7  0x01140ce1 in nsTArray<nsHttpHeaderArray::nsEntry>::Clear (this=0xb08bf100) at ../../../dist/include/nsTArray.h:729
#8  0x0114095f in nsHttpHeaderArray::Clear (this=0xb08bf100) at /home/t_mattjo/src/firefox/mobilebase/netwerk/protocol/http/nsHttpHeaderArray.cpp:218
#9  0x01149ac3 in nsHttpResponseHead::ClearHeaders (this=0xb08bf100) at /home/t_mattjo/src/firefox/mobilebase/netwerk/protocol/http/nsHttpResponseHead.h:85
#10 0x011493c5 in nsHttpResponseHead::Reset (this=0xb08bf100) at /home/t_mattjo/src/firefox/mobilebase/netwerk/protocol/http/nsHttpResponseHead.cpp:484
#11 0x01153d2d in nsHttpResponseHead::~nsHttpResponseHead (this=0xb08bf100, __in_chrg=<value optimized out>) at /home/t_mattjo/src/firefox/mobilebase/netwerk/protocol/http/nsHttpResponseHead.h:62
#12 0x0115fc5a in nsAutoPtr<nsHttpResponseHead>::~nsAutoPtr (this=0xb05e25f0, __in_chrg=<value optimized out>) at ../../../dist/include/nsAutoPtr.h:104
#13 0x0115be50 in mozilla::net::HttpBaseChannel::~HttpBaseChannel (this=0xb05e2564, __in_chrg=<value optimized out>) at /home/t_mattjo/src/firefox/mobilebase/netwerk/protocol/http/HttpBaseChannel.cpp:78
#14 0x0115bf43 in mozilla::net::HttpBaseChannel::~HttpBaseChannel (this=0xb05e2564, __in_chrg=<value optimized out>) at /home/t_mattjo/src/firefox/mobilebase/netwerk/protocol/http/HttpBaseChannel.cpp:78
#15 0x022f9ce6 in nsHashPropertyBag::Release (this=0xb05e2564) at /home/t_mattjo/src/firefox/mobilebase/xpcom/ds/nsHashPropertyBag.cpp:72
#16 0x0115c515 in mozilla::net::HttpBaseChannel::Release (this=0xb05e2564) at /home/t_mattjo/src/firefox/mobilebase/netwerk/protocol/http/HttpBaseChannel.cpp:149
#17 0x0104f874 in nsCOMPtr<nsIChannel>::~nsCOMPtr (this=0xb11826f4, __in_chrg=<value optimized out>) at ../../../../dist/include/nsCOMPtr.h:510
#18 0x0125af4b in nsProgressNotificationProxy::~nsProgressNotificationProxy (this=0xb11826e0, __in_chrg=<value optimized out>) at /home/t_mattjo/src/firefox/mobilebase/modules/libpr0n/src/imgLoader.cpp:277
#19 0x01252ab7 in nsProgressNotificationProxy::Release (this=0xb11826e0) at /home/t_mattjo/src/firefox/mobilebase/modules/libpr0n/src/imgLoader.cpp:284
#20 0x0104e99c in nsCOMPtr<nsIInterfaceRequestor>::~nsCOMPtr (this=0xb05e25c8, __in_chrg=<value optimized out>) at ../../../../dist/include/nsCOMPtr.h:510
#21 0x0115be98 in mozilla::net::HttpBaseChannel::~HttpBaseChannel (this=0xb05e2564, __in_chrg=<value optimized out>) at /home/t_mattjo/src/firefox/mobilebase/netwerk/protocol/http/HttpBaseChannel.cpp:78
#22 0x01179863 in mozilla::net::HttpChannelChild::~HttpChannelChild (this=0xb05e2550, __in_chrg=<value optimized out>) at /home/t_mattjo/src/firefox/mobilebase/netwerk/protocol/http/HttpChannelChild.cpp:81
#23 0x011798bd in mozilla::net::HttpChannelChild::~HttpChannelChild (this=0xb05e2550, __in_chrg=<value optimized out>) at /home/t_mattjo/src/firefox/mobilebase/netwerk/protocol/http/HttpChannelChild.cpp:81
#24 0x0118261a in mozilla::net::NeckoChild::DeallocPHttpChannel (this=0xb3108640, channel=0xb05e2550) at /home/t_mattjo/src/firefox/mobilebase/netwerk/ipc/NeckoChild.cpp:103
#25 0x02298ccf in mozilla::net::PNeckoChild::DeallocSubtree (this=0xb3108640) at PNeckoChild.cpp:424
#26 0x02288cb5 in mozilla::dom::PContentProcessChild::DeallocSubtree (this=0xb7d2d038) at PContentProcessChild.cpp:1068
#27 0x0228802c in mozilla::dom::PContentProcessChild::OnChannelClose (this=0xb7d2d038) at PContentProcessChild.cpp:794
#28 0x021cb0ae in mozilla::ipc::AsyncChannel::NotifyChannelClosed (this=0xb7d2d040) at /home/t_mattjo/src/firefox/mobilebase/ipc/glue/AsyncChannel.cpp:324
#29 0x021cb0fa in mozilla::ipc::AsyncChannel::NotifyMaybeChannelError (this=0xb7d2d040) at /home/t_mattjo/src/firefox/mobilebase/ipc/glue/AsyncChannel.cpp:340
#30 0x021cb033 in mozilla::ipc::AsyncChannel::OnNotifyMaybeChannelError (this=0xb7d2d040) at /home/t_mattjo/src/firefox/mobilebase/ipc/glue/AsyncChannel.cpp:311
#31 0x021cc3bd in DispatchToMethod<mozilla::ipc::AsyncChannel, void (mozilla::ipc::AsyncChannel::*)()> (obj=0xb7d2d040, method=
    (void (mozilla::ipc::AsyncChannel::*)(mozilla::ipc::AsyncChannel *)) 0x21caf32 <mozilla::ipc::AsyncChannel::OnNotifyMaybeChannelError()>, arg=...) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/tuple.h:383
#32 0x021cc27b in RunnableMethod<mozilla::ipc::AsyncChannel, void (mozilla::ipc::AsyncChannel::*)(), Tuple0>::Run (this=0xb1be30c0) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/task.h:307
#33 0x023bb69c in MessageLoop::RunTask (this=0xbfffef10, task=0xb1be30c0) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/message_loop.cc:336
#34 0x023bb705 in MessageLoop::DeferOrRunPendingTask (this=0xbfffef10, pending_task=...) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/message_loop.cc:344
#35 0x023bbadb in MessageLoop::DoWork (this=0xbfffef10) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/message_loop.cc:444
#36 0x021d0322 in mozilla::ipc::MessagePump::Run (this=0xb7d11130, aDelegate=0xbfffef10) at /home/t_mattjo/src/firefox/mobilebase/ipc/glue/MessagePump.cpp:122
#37 0x021d0830 in mozilla::ipc::MessagePumpForChildProcess::Run (this=0xb7d11130, aDelegate=0xbfffef10) at /home/t_mattjo/src/firefox/mobilebase/ipc/glue/MessagePump.cpp:232
#38 0x023bb1f9 in MessageLoop::RunInternal (this=0xbfffef10) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/message_loop.cc:216
#39 0x023bb179 in MessageLoop::RunHandler (this=0xbfffef10) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/message_loop.cc:199
#40 0x023bb11d in MessageLoop::Run (this=0xbfffef10) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/message_loop.cc:173
#41 0x02084e02 in nsBaseAppShell::Run (this=0xb58822e0) at /home/t_mattjo/src/firefox/mobilebase/widget/src/xpwidgets/nsBaseAppShell.cpp:175
#42 0x0106b577 in XRE_RunAppShell () at /home/t_mattjo/src/firefox/mobilebase/toolkit/xre/nsEmbedFunctions.cpp:566
#43 0x021d0750 in mozilla::ipc::MessagePumpForChildProcess::Run (this=0xb7d11130, aDelegate=0xbfffef10) at /home/t_mattjo/src/firefox/mobilebase/ipc/glue/MessagePump.cpp:218
#44 0x023bb1f9 in MessageLoop::RunInternal (this=0xbfffef10) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/message_loop.cc:216
#45 0x023bb179 in MessageLoop::RunHandler (this=0xbfffef10) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/message_loop.cc:199
#46 0x023bb11d in MessageLoop::Run (this=0xbfffef10) at /home/t_mattjo/src/firefox/mobilebase/ipc/chromium/src/base/message_loop.cc:173
#47 0x0106aff4 in XRE_InitChildProcess (aArgc=1, aArgv=0xbffff164, aProcess=GeckoProcessType_Content) at /home/t_mattjo/src/firefox/mobilebase/toolkit/xre/nsEmbedFunctions.cpp:447
#48 0x08049080 in main (argc=3, argv=0xbffff164) at /home/t_mattjo/src/firefox/mobilebase/ipc/app/MozillaRuntimeMain.cpp:87
Blocks: fennecko
Severity: normal → critical
Ok, I've figured this problem out.  This actually occurs because the child errors out and tries to clean up the channel actor by calling DeallocPHttpChannel.  This deletes the actor, which still has a positive refcnt.  I think this is worth taking the time to fix properly, even though it should only occur when the child is dying anyways - it's one less crash to feel obligated to investigate in gdb.
Attached patch Patch (obsolete) (deleted) — Splinter Review
I have not been able to reproduce the crash with this patch applied.
Assignee: nobody → josh
Attachment #452205 - Flags: review?(jduell.mcbugs)
Comment on attachment 452205 [details] [diff] [review]
Patch

Heard on IRC:

jduell> cjones: when does ActorDestroy get called with AbnormalShutdown?
jduell> Whenever an IPDL call returns false?
jduell> Does it get called synchronously/immediately with the error?  Otherwise I'm worried about a possible race in one of jdm's patches, where he assumes refcount won't hit 0 in between ActorDestroy getting called, and DeallocPHttpChannel getting called
cjones> jduell, no, on Close() if it's called manually or off the task posted by the IO thread to main thread notifying connection error
cjones> jduell, isn't that always a problem?
cjones> does AllocPHttpChannel() not ref whatever you're concerned about?
jduell_> cjones: code may speak more clearly: look at https://bugzilla.mozilla.org/attachment.cgi?id=452205&action=diff
jduell_> I'm worried about a race possibly setting mExistingReferences to true after refcnt = 0
cjones> why are using |delete| on refcounted stuff?
cjones> instead of NS_RELEASE()?
jduell_> When the refcount hits 0 on the channel child, we send Delete, so when ActorDestoy is called normally, we want to delete
jduell_> It's the joy of having refcounted httpChannels that are also IPDL channels
jduell_> Does that make sense?
cjones> jduell_, it looks like maybe you want a bool mIPDLHasRef or something
cjones> in addition to normal refcount
cjones> on AllocPChannel(), flip to true
cjones> Dealloc(), flip to false
cjones> and then your Release() impl would delete on refcount == 0 && !ipdlhasref
cjones> and on refcount == 0 && ipdlhasref you could send__delete__()
cjones> actually, that'd be refcount == 0 && ipcactive
cjones> since ipcactive => ipdlhasref
jduell_> cjones: But since normal path is to hit refcnt ==0 and ipcactive, then send_delete, we won't hit Release again, and we still need way to delete, no?
cjones> Dealloc() would just NS_RELEASE()
cjones> well
cjones> yeah
cjones> |Dealloc() { actor->IPDLUnref(); actor->Release(); }
cjones> or something
cjones> maybe actor->IPDLRelease(), which would flip the flag to false and then Release()
cjones> whatever makes most sense
cjones> jduell_, the key is you want NS_RELEASE() called by XPCOM code and DeallocPChannel() to end up going through the same path to |delete|
cjones> the former drops the refcount, the latter drops the IPDL flag, and then you check whether it's time to delete in the common code
jduell_> cjones: why is that important in particular?  Why can't you have separate NS_RELEASE call locations, so long as only called once?
cjones> i think
cjones> separate NS_RELEASE call locations?
jduell_> sorry, separate delete calls, as we do now in the patch
jduell_> cjones: ack, I don't have time to spend on this--though it would be a quickie review.  Will post our IRC chatter to bug and let jdm ponder.
cjones> the problem is the logic to determine whether it's time for |delete| is the same for both NS_RELEASE and Dealloc()
cjones> so you want both to end up in common code
jduell_> ok, point taken. 
cjones> i think jdm's patch is about right, but the HasOtherRefs() is backwards --- you want IPDLHasARef
cjones> the salient features are: int refcount, bool ipdlref, bool ipcactive
cjones> (ipdlref is 0/1)
cjones> and there's a relatively simple state machine based on that state that leads to maybe __delete__ and always delete
cjones> make sense?
jduell_> So now you think we need separate ipdlref and ipcactive?
jduell_> Otherwise, yes.  The devil in in the details, but yes.
cjones> they're separate things
cjones> i think it's best for the current impl
cjones> we might change sending messages on dead channels in the future
cjones> (we've talked about that before i think)
jduell_> Indeed we have :)
Attached patch Patch (obsolete) (deleted) — Splinter Review
I did some more pondering, and I no longer believe that the mIPCActive/mIPDLReference split is necessary.  We know that whenever DeallocPHttpChannel is called, the channel's refcount is non-zero - either IPDL is closing down because of an error (in which case things still have references the channel) or we're being called via RefcountHitZero, in which case the recount is temporarily set to 1.  In either case, we can release the channel, and only cause deletion to occur in the correct case.  I've got a fairly comprehensive explanation of this process in the patch.
Attachment #452205 - Attachment is obsolete: true
Attachment #452205 - Flags: review?(jduell.mcbugs)
Note that this turns out to be fairly tied up with bug 575113, I've discovered, so we'll probably want to land both at the same time.
Depends on: 575113
Attached patch Patch (obsolete) (deleted) — Splinter Review
The product of more pondering.  cjones' suggestion of tracking whether IPDL retained a reference or not was correct - we always want to delete the object when the refcount hits zero, but we may not always want to release the actor inside DeallocPHttpChannel.  The release should only ever happen in OnStopRequest has not been and will never be called - ie. if IPDL is shutting down in error.
Attachment #454387 - Attachment is obsolete: true
Attachment #454792 - Flags: review?(jduell.mcbugs)
Attachment #454792 - Flags: review?(jduell.mcbugs) → review?(chris)
Sorry, need to context switch back into this code --- can you remind me again of the problem(s) with Addref()ing in Alloc() and Release()ing in Dealloc()?
DeallocPHttpChannel is called whenever the IPDL tree is deleted.  Since we tightly control the situation in which it is called usually (ie. the refcount of the channel hits 0), we cannot simply delete the actor unconditionally as other objects could still be retaining references to the channel.
I had in mind setting mIPCActive to false and calling Release() in DeallocPHttpChannel() (for a corresponding AddRef() in Alloc()).  When jduell and I chatted, I thought I remembered thinking something was wrong about that, but now I can't remember what.
(In reply to comment #10)
> I had in mind setting mIPCActive to false and calling Release() in
> DeallocPHttpChannel() (for a corresponding AddRef() in Alloc()).  When jduell
> and I chatted, I thought I remembered thinking something was wrong about that,
> but now I can't remember what.

Disregard that, I remember the problem now.  Will get a reviewin'.
After a first glance, I think the meat of this patch looks OK, but
some things are missing.  The current implementation doesn't quite
seem to match my understanding, which is a relatively simple state
machine.  The state is defined by the tuple (refcount, ipdlhasref,
ipcactive) and a tag.  (Below "ipdlhasref" is abbreviated as "ipdl"
and "ipcactive" as "ipc".)

  (0, -, -):START
    new(), AddRef() -->  (1, -, -):IPC_PENDING

  (1, -, -):IPC_PENDING
    AddRef()    -------------->  (>1, -, -):IPC_PENDING
    Release()   --delete------>  [END]
    AsyncOpen() --SendCtor()-->  (1, ipdl, ipc):IPC_ACTIVE

  (>1, -, -):IPC_PENDING
    AddRef()    -------------->  IPC_PENDING
    Release()   -------------->  IPC_PENDING
    AsyncOpen() --SendCtor()-->  (>1, ipdl, ipc):IPC_ACTIVE

  (>1, ipdl, ipc):IPC_ACTIVE
    AddRef()       -------------------->  IPC_ACTIVE
    Release()      -------------------->  IPC_ACTIVE
    Stop()         --Send__delete__()-->  (>1, ipdl, -):IPC_INACTIVE
    ActorDestroy() -------------------->  (>1, ipdl, -):IPC_INACTIVE

  (1, ipdl, ipc):IPC_ACTIVE
    AddRef()       -------------------->  IPC_ACTIVE
    Release()      --Send__delete__()-->  (0, ipdl, -):IPC_INACTIVE
    Stop()         --Send__delete__()-->  (1, ipdl, -):IPC_INACTIVE
    ActorDestroy() -------------------->  (1, ipdl, -):IPC_INACTIVE
     
  (>1, ipdl, -):IPC_INACTIVE
    AddRef()  -->  IPC_INACTIVE
    Release() -->  IPC_INACTIVE
    Dealloc() -->  (>1, -, -):LIMBO

  (1, ipdl, ipc):IPC_INACTIVE
    AddRef()  -->  IPC_INACTIVE
    Release() -->  (0, ipdl, -):IPC_INACTIVE
    Dealloc() -->  (1, -, -):LIMBO

  (0, ipdl, -):IPC_INACTIVE
    Dealloc() --delete--> [END]

  (>1, -, -):LIMBO
    AddRef()  --> LIMBO
    Release() --> LIMBO

  (1, -, -):LIMBO
    AddRef()  ----------> LIMBO
    Release() --delete--> [END]

Is that correct?  If not, then I think there's something I'm missing,
and I'd like to clear that up before we move on.

If we're on the same page, here's what I think is missing

 - Detailed comment describing channel lifetime and ownership.  It
   doesn't have to be a state machine as above, but it needs to at
   least describe the two "refcounts", why they exist, and how
   channels end up being |Send__deleted__()|d and |delete|d.

 - Tests (xpcshell?) that exercise the nontrivial paths to [END] in
   the machine above.  That is, [END] without SendCtor(),
   Send__delete__() triggered by both Stop() and by Release(), and
   Dealloc() triggered by both Release() and Dealloc().  By my count,
   that's five.

I'm being anal about this because it's code we really really want to
get right and ensure stays right.
Nom for blocking because shutting down with (AFAICT) any live channels causes a crash.
tracking-fennec: --- → ?
Yep, live channels are the devil.  I think I'll roll bug 576698 in with this work as well.
Depends on: 576698
tracking-fennec: ? → 2.0a1+
This is made easier by the demands of bug 576698, given that we no longer need custom refcounting behaviour.  There are now a couple states:

* channel has not been opened - if the refcount drops to zero, the object dies.
* channel has been opened, IPDL retains a reference
  - onStopRequest triggers release of reference and destroys protocol
  - any IPDL error triggers release of reference, protocol is already being destroyed

Due to IPDL's retained reference, the refcount will never hit zero between AsyncOpen and either OnStopRequest or an IPDL error.

Does this sound correct to you?
"destroys protocol" means Send__delete__() --> DeallocChannel() --> Unref()?  If so, this sounds good, nice and simple.
Attached patch Simplify refcounting tremendously, add tests (obsolete) (deleted) — Splinter Review
Updated refcounting and greedier Send__delete__.  I've added tests which show different behaviour pre-patch and post-patch, so I think they're demonstrating the correct results now.
Attachment #454792 - Attachment is obsolete: true
Attachment #458357 - Flags: review?(jones.chris.g)
Attachment #454792 - Flags: review?(chris)
Comment on attachment 458357 [details] [diff] [review]
Simplify refcounting tremendously, add tests

>diff --git a/netwerk/ipc/NeckoChild.cpp b/netwerk/ipc/NeckoChild.cpp
>--- a/netwerk/ipc/NeckoChild.cpp
>+++ b/netwerk/ipc/NeckoChild.cpp
>@@ -94,18 +94,21 @@ NeckoChild::AllocPHttpChannel(PIFrameEmb
>   return nsnull;
> }
> 
> bool 
> NeckoChild::DeallocPHttpChannel(PHttpChannelChild* channel)
> {
>   NS_ABORT_IF_FALSE(IsNeckoChild(), "DeallocPHttpChannel called by non-child!");
> 
>-  // Delete channel (HttpChannelChild's refcnt must already hit 0 to get here)
>-  delete channel;
>+  // We only want to do something here if IPDL still holds a logical reference
>+  // (ie. if OnStopRequest has not been called yet). This should only occur if
>+  // IPDL is shutting down abrubtly due to an error condition.
>+  HttpChannelChild* child = static_cast<HttpChannelChild*>(channel);
>+  child->IPDLReleaseReference();

This comment doesn't help me here; I think it's subsumed by the one
for ReleaseIPDLReference().  I'd nuke the version here.

>diff --git a/netwerk/protocol/http/HttpChannelChild.cpp b/netwerk/protocol/http/HttpChannelChild.cpp
>--- a/netwerk/protocol/http/HttpChannelChild.cpp
>+++ b/netwerk/protocol/http/HttpChannelChild.cpp
> //-----------------------------------------------------------------------------
> // HttpChannelChild::PHttpChannelChild
> //-----------------------------------------------------------------------------
> 
>+// We track the lifetime of the IPDL actor with an extra strong reference to
>+// ensure that the actor is not released out from under us.  The protocol

This comment belongs in HttpChannelChild.h.  s/protocol/actor/ on the
above line.

>+// itself will be deleted upon OnStopRequest, and the extra reference released

s/will be deleted/will be Send__delete__()d/

>+// in the event of IPDL error as OnStopRequest will never be reached.
>+void
>+HttpChannelChild::IPDLReleaseReference()

Functions are verbs: s/IPDLReleaseReference/ReleaseIPDLReference/.

>+{
>+  if (!mIPDLHasRef)
>+    return;
>+

AFAICT there's no valid reason for calling ReleaseIPDLReference()
twice, and allowing it seems to me like a recipe for use-after-free
bugs.  Note that Release() isn't idempotent either; it's can't be.
Please convert this into an |NS_ABORT_IF_FALSE(mIPDLHasRef);|
assertion to match Release() semantics.

>+  mIPDLHasRef = false;
>+  Release();
>+}
>+

Please also add a corresponding AddIPDLRef() function to match AddRef.
You should check |NS_ABORT_IF_FALSE(!mIPDLHasRef)| there.

> bool 
> HttpChannelChild::RecvOnStartRequest(const nsHttpResponseHead& responseHead,
>                                      const PRBool& useResponseHead,
>                                      const PRBool& isFromCache,
>                                      const PRBool& cacheEntryAvailable,
>                                      const PRUint32& cacheExpirationTime,
>                                      const nsCString& cachedCharset)
> {
>@@ -189,20 +193,20 @@ HttpChannelChild::RecvOnStopRequest(cons
>   nsresult rv = mListener->OnStopRequest(this, mListenerContext, statusCode);
>   mListener = 0;
>   mListenerContext = 0;
>   mCacheEntryAvailable = PR_FALSE;
> 
>   if (mLoadGroup)
>     mLoadGroup->RemoveRequest(this, nsnull, statusCode);
> 
>-  SendOnStopRequestCompleted();
>+  PHttpChannelChild::Send__delete__(this);
> 
>   // Corresponding AddRef in AsyncOpen().
>-  this->Release();
>+  IPDLReleaseReference();
>

The contract for PFoo::Send__delete__() is that it always ends in
DeallocFoo() regardless of IPC channel state.  I'm guessing that this
second, superfluous ReleaseIPDLReference() is why you added the
early-return from ReleaseIPDLReference() above ;).

Again AFAICT, nothing guarantees that something other than IPDL holds
a ref to the channel at OnStop().  So after the Send__delete__(),
please add a comment loudly warning that |this| may have been deleted.

I trust that part of the PHttpChannel protocol is that OnStop() is
delivered 0 or 1 times?  (This isn't documented in PHttpChannel.ipdl
:/ .)  Otherwise things will go boom here.

>   if (NS_FAILED(rv)) {
>     // TODO: Cancel request: see OnStartRequest (bug 536317)
>     return false;  
>   }
>   return true;
> }
> 
>@@ -383,16 +387,17 @@ HttpChannelChild::AsyncOpen(nsIStreamLis
>                 IPC::URI(mReferrer), mLoadFlags, mRequestHeaders, 
>                 mRequestHead.Method(), uploadStreamData, 
>                 uploadStreamInfo, mPriority, mRedirectionLimit, 
>                 mAllowPipelining, mForceAllowThirdPartyCookie);
> 
>   // The socket transport layer in the chrome process now has a logical ref to
>   // us, until either OnStopRequest or OnRedirect is called.
>   this->AddRef();
>+  mIPDLHasRef = true;
> 

See comment above; please move this code into AddIPDLRef().

To keep things simpler, please move the call to AddIPDLRef() to just
before |gNeckoChild->SendPHttpChannelConstructor(this, tabChild);|.
If sending that constructor message fails, then IPDL will
automatically call DeallocPHttpChannel() on your behalf and you'll hit
the normal ReleaseIPDLReference() path.

>+  void IPDLReleaseReference();
>+

(Per above, docs here, s/IPDLReleaseReference/ReleaseIPDLReference/,
and add |AddIPDLRef|.)

>diff --git a/netwerk/protocol/http/HttpChannelParent.cpp b/netwerk/protocol/http/HttpChannelParent.cpp
>--- a/netwerk/protocol/http/HttpChannelParent.cpp
>+++ b/netwerk/protocol/http/HttpChannelParent.cpp
>@@ -62,16 +62,17 @@ HttpChannelParent::HttpChannelParent(PIF
>   NS_ASSERTION(handler, "no http handler");
> 
>   mTabParent = do_QueryInterface(static_cast<nsITabParent*>(
>       static_cast<TabParent*>(iframeEmbedding)));
> }
> 
> HttpChannelParent::~HttpChannelParent()
> {
>+  mCacheDescriptor = nsnull;

I don't know what a cache descriptor is or why you need the explicit
unref here.  Please add a comment.

>diff --git a/netwerk/test/unit/test_channel_close.js b/netwerk/test/unit/test_channel_close.js
>new file mode 100644
>--- /dev/null
>+++ b/netwerk/test/unit/test_channel_close.js

I didn't review the test.  Have you tried running it under valgrind?

I'd like to see an updated patch.
Attached patch Simplify refcounting tremendously, add tests v2 (obsolete) (deleted) — Splinter Review
Comments addressed.  Valgrind run is clean.
Attachment #458357 - Attachment is obsolete: true
Attachment #458541 - Flags: review?(jones.chris.g)
Attachment #458357 - Flags: review?(jones.chris.g)
Comment on attachment 458541 [details] [diff] [review]
Simplify refcounting tremendously, add tests v2

>diff --git a/netwerk/protocol/http/HttpChannelChild.cpp b/netwerk/protocol/http/HttpChannelChild.cpp
>--- a/netwerk/protocol/http/HttpChannelChild.cpp
>+++ b/netwerk/protocol/http/HttpChannelChild.cpp
>+void
>+HttpChannelChild::AddIPDLReference()
>+{

NS_ABORT_IF_FALSE(!mIPDLHasRef) here.

>+  mIPDLHasRef = true;
>+  AddRef();
>+}
>@@ -189,21 +194,19 @@ HttpChannelChild::RecvOnStopRequest(cons
>   nsresult rv = mListener->OnStopRequest(this, mListenerContext, statusCode);
>   mListener = 0;
>   mListenerContext = 0;
>   mCacheEntryAvailable = PR_FALSE;
> 
>   if (mLoadGroup)
>     mLoadGroup->RemoveRequest(this, nsnull, statusCode);
> 
>-  SendOnStopRequestCompleted();
>+  // The following call will most likely end up invalidating |this|
>+  PHttpChannelChild::Send__delete__(this);
> 

s/will most likely end up invalidating/might delete/

>diff --git a/netwerk/protocol/http/HttpChannelChild.h b/netwerk/protocol/http/HttpChannelChild.h
>--- a/netwerk/protocol/http/HttpChannelChild.h
>+++ b/netwerk/protocol/http/HttpChannelChild.h
>@@ -107,18 +107,25 @@ public:
>   NS_IMETHOD SetRequestHeader(const nsACString& aHeader, 
>                               const nsACString& aValue, 
>                               PRBool aMerge);
>   // nsIHttpChannelInternal
>   NS_IMETHOD SetupFallbackChannel(const char *aFallbackKey);
>   // nsISupportsPriority
>   NS_IMETHOD SetPriority(PRInt32 value);
> 
>+  // We track the lifetime of the IPDL actor with an extra strong reference
>+  // to ensure that the actor is not released out from under us.  The actor
>+  // itself will be Send__delete__()d upon OnStopRequest, and the extra
>+  // reference released in the event of IPDL error as OnStopRequest will never
>+  // be reached.

Would note here that both Send__delete__() and error cleanup will hit DeallocHttpChannel(), where the IPDL ref is dropped.

>diff --git a/netwerk/protocol/http/HttpChannelParent.cpp b/netwerk/protocol/http/HttpChannelParent.cpp
>--- a/netwerk/protocol/http/HttpChannelParent.cpp
>+++ b/netwerk/protocol/http/HttpChannelParent.cpp
>@@ -62,16 +62,17 @@ HttpChannelParent::HttpChannelParent(PIF
>   NS_ASSERTION(handler, "no http handler");
> 
>   mTabParent = do_QueryInterface(static_cast<nsITabParent*>(
>       static_cast<TabParent*>(iframeEmbedding)));
> }
> 
> HttpChannelParent::~HttpChannelParent()
> {
>+  mCacheDescriptor = nsnull;

I still don't understand why this is explicitly unrefed rather than by ~nsCOMPtr when it goes out of scope after ~HttpChannelParent().

r=me with the assertion and comment fixes.
Attachment #458541 - Flags: review?(jones.chris.g) → review+
Carrying forward r+.
Attachment #458541 - Attachment is obsolete: true
Attachment #458559 - Flags: review+
Keywords: checkin-needed
This can land on e10s any time.  We'll be merging to m-c periodically.
As discussed on IRC, we needed to add checks to make sure the child doesn't send IPDL msgs after __delete.
Attachment #458559 - Attachment is obsolete: true
http://hg.mozilla.org/projects/electrolysis/rev/47ac7a7f8b5b
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Keywords: checkin-needed
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: