Closed Bug 942835 Opened 11 years ago Closed 11 years ago

Better CacheEntry purge prevention: browser_bug666317.js | application crashed [@ nsHttpChannelAuthProvider::Release()] with cache2=on

Categories

(Core :: Networking: Cache, defect)

x86_64
Windows 7
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla29

People

(Reporter: mayhemer, Assigned: mayhemer)

References

Details

Attachments

(1 file, 2 obsolete files)

https://hg.mozilla.org/projects/gum/rev/875d13e28b98: https://tbpl.mozilla.org/php/getParsedLog.php?id=30912328&tree=Gum https://tbpl.mozilla.org/php/getParsedLog.php?id=30914194&tree=Gum 13:08:42 INFO - Operating system: Windows NT 13:08:42 INFO - 6.1.7601 Service Pack 1 13:08:42 INFO - CPU: x86 13:08:42 INFO - GenuineIntel family 6 model 30 stepping 5 13:08:42 INFO - 8 CPUs 13:08:42 INFO - Crash reason: EXCEPTION_BREAKPOINT 13:08:42 INFO - Crash address: 0x65d06508 13:08:42 INFO - Thread 36 (crashed) 13:08:42 INFO - 0 xul.dll!nsHttpChannelAuthProvider::Release() [nsHttpChannelAuthProvider.cpp:875d13e28b98 : 1338 + 0x26] 13:08:42 INFO - eip = 0x65d06508 esp = 0x0ec4fb34 ebp = 0x0ec4fb40 ebx = 0x0062ecb0 13:08:42 INFO - esi = 0x1c6ba6a8 edi = 0x63f45ff0 eax = 0x00000000 ecx = 0x6447bfed 13:08:42 INFO - edx = 0x6825e4d8 efl = 0x00000212 13:08:42 INFO - Found by: given as instruction pointer in context 13:08:42 INFO - 1 xul.dll!nsCOMPtr<nsIHttpChannelAuthProvider>::~nsCOMPtr<nsIHttpChannelAuthProvider>() [nsCOMPtr.h:875d13e28b98 : 514 + 0x5] 13:08:42 INFO - eip = 0x65d07a9e esp = 0x0ec4fb48 ebp = 0x0ec4fb64 13:08:42 INFO - Found by: call frame info 13:08:42 INFO - 2 xul.dll!mozilla::net::nsHttpChannel::~nsHttpChannel() [nsHttpChannel.cpp:875d13e28b98 : 216 + 0x51] 13:08:42 INFO - eip = 0x65d23475 esp = 0x0ec4fb54 ebp = 0x0ec4fb64 13:08:42 INFO - Found by: call frame info 13:08:42 INFO - 3 xul.dll!mozilla::net::nsHttpChannel::`vector deleting destructor'(unsigned int) + 0xa 13:08:42 INFO - eip = 0x65d241f6 esp = 0x0ec4fb60 ebp = 0x0ec4fb64 13:08:42 INFO - Found by: call frame info 13:08:42 INFO - 4 xul.dll!nsHashPropertyBag::Release() [nsHashPropertyBag.cpp:875d13e28b98 : 28 + 0x6c] 13:08:42 INFO - eip = 0x65bab2d2 esp = 0x0ec4fb6c ebp = 0x0ec4fb7c 13:08:42 INFO - Found by: call frame info 13:08:42 INFO - 5 xul.dll!mozilla::net::HttpBaseChannel::Release() [HttpBaseChannel.cpp:875d13e28b98 : 159 + 0xb] 13:08:42 INFO - eip = 0x65ce62df esp = 0x0ec4fb84 ebp = 0x0ec4fb8c 13:08:42 INFO - Found by: call frame info 13:08:42 INFO - 6 xul.dll!mozilla::net::nsHttpChannel::Release() [nsHttpChannel.cpp:875d13e28b98 : 4298 + 0xb] 13:08:42 INFO - eip = 0x65d057ec esp = 0x0ec4fb94 ebp = 0x0ec4fb9c 13:08:42 INFO - Found by: call frame info 13:08:42 INFO - 7 xul.dll!nsCOMPtr<nsICacheEntryOpenCallback>::~nsCOMPtr<nsICacheEntryOpenCallback>() [nsCOMPtr.h:875d13e28b98 : 514 + 0x5] 13:08:42 INFO - eip = 0x65caf1e5 esp = 0x0ec4fba4 ebp = 0x0ec4fbb8 13:08:42 INFO - Found by: call frame info 13:08:42 INFO - 8 xul.dll!nsTArray_Impl<mozilla::net::CacheEntry::Callback,nsTArrayInfallibleAllocator>::DestructRange(unsigned int,unsigned int) [nsTArray.h:875d13e28b98 : 1573 + 0x6] 13:08:42 INFO - eip = 0x65cb67da esp = 0x0ec4fbb0 ebp = 0x0ec4fbb8 13:08:42 INFO - Found by: call frame info 13:08:42 INFO - 9 xul.dll!nsTArray_Impl<mozilla::net::CacheEntry::Callback,nsTArrayInfallibleAllocator>::RemoveElementsAt(unsigned int,unsigned int) [nsTArray.h:875d13e28b98 : 1290 + 0x8]
We must release the callback on the same thread where AsyncOpen with it has been called.
Attached patch v1 (obsolete) (deleted) — Splinter Review
This was a tricky one! I realized that just releasing the callback where needed is not a 100% solution here. And more problems appeared when finding the right and a simple as possible solution. First, we must not purge an entry when it's in "progress of being accessed". This access progress is bound to the fact, we want just a single instance of a CacheEntry related to a single storage/URI combination. It means, we must not allow purging between call to CacheStorageService::AddStorageEntry (responsible for creating or finding an entry for a URI/storage and keeping such entry a single instance only) and invoking the callback it self. Besides, when the entry is still referenced by some consumer, we must prevent it as well. Simple reference counting doesn't work here, we store CacheEntry object in more then one hashtable, and the number of internal references is hard to track. So, I decided to have a new counter (beside the standard ref counter) that counts consumers that keep the entry + counts the number of pending open operations. It works like this: Process of opening an entry: we call AddStorageEntry, that returns a handle that first increases the handle-count under the service lock, the handle gets to the storage that calls AsyncOpen on the entry. The entry creates Callback record (put to the queue) that increases the handle-count again. Then, we give a new handle to the entry to the consumer via OnAvail callback. The consumer keeps the handle until it no longer needs it. The handle-count is all this time > 0. cache service lock: |-----| opening reference: |-----| callback reference: |-----| consumer reference: |--...--| total 'handle-count': 0000001111122111221...111000.... When we want to purge an entry, we check the handle-count value again under the service lock. When non-null, we don't purge. If null, we remove the entry. When a new consumer wants it right away or a different thread it must wait for the service lock acquirement. After that, the entry will be reloaded and single again. The patch: - renames and de-nests CacheEntry::Handle to a lonely class CacheEntryHandle - introduced mHandleCount atomic counter on CacheEntry - CacheEntryHandle 'refs' this counter - CacheEntry::Callback does so as well - CacheEntryHandle is passed to OnCacheEntryAvailable also when only open for reading or as a ready entry to keep info on being referenced by all consumers - CacheStorageService::AddStorageEntry returns CacheEntryHandle as well, needed to ++mHandleCount under the lock while manipulating/searching the entry hash table This all ensures that a cache entry is not purged when still in use. Thus, we will never have more then one entry to access the same content. https://tbpl.mozilla.org/?tree=Try&rev=24df36c150ce
Attachment #8338956 - Flags: review?(michal.novotny)
Status: NEW → ASSIGNED
Summary: browser_bug666317.js | application crashed [@ nsHttpChannelAuthProvider::Release()] with cache2=on → Better CacheEntry purge prevention: browser_bug666317.js | application crashed [@ nsHttpChannelAuthProvider::Release()] with cache2=on
Comment on attachment 8338956 [details] [diff] [review] v1 Review of attachment 8338956 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/cache2/CacheEntry.cpp @@ +808,5 @@ > +bool CacheEntry::IsReferenced() const > +{ > + CacheStorageService::Self()->Lock().AssertCurrentThreadOwns(); > + > + // No need to lock, since: This comment is misleading since this code is actually executed under the lock. @@ +810,5 @@ > + CacheStorageService::Self()->Lock().AssertCurrentThreadOwns(); > + > + // No need to lock, since: > + // 1. increasing this counter from 0 to non-null happens only under > + // the the service lock It would be good to have AssertCurrentThreadOwns() at all places where the counter is increased.
Attachment #8338956 - Flags: review?(michal.novotny) → review+
(In reply to Michal Novotny (:michal) from comment #4) > Comment on attachment 8338956 [details] [diff] [review] > v1 > > Review of attachment 8338956 [details] [diff] [review]: Thanks. > This comment is misleading since this code is actually executed under the > lock. Yep. > It would be good to have AssertCurrentThreadOwns() at all places where the > counter is increased. This is only needed in a place where this can get from 0 to 1 (which is at CacheStorageService::AddStorageEntry) since we only need to check for non-nullness of the counter. Other places are not needed to sync, that is the magic of this mechanism :) Note that the counter it self is atomic. I'll add an assertion checking there is one of service lock being held or mRefCnt > 0.
Attached patch v1->v1.1 idiff (obsolete) (deleted) — Splinter Review
- added the assertion suggested in comment 5 - caught instantly a small issue with Recreate: we were throwing the handle away too soon and created later a new one, out of the lock - changed to return the handle that was created in the service under the lock all the way up to caller of recreate()
Attachment #8338956 - Attachment is obsolete: true
Attachment #8348954 - Flags: review?(michal.novotny)
Attached patch v1.2 (deleted) — Splinter Review
Forget the 1-1.1 idiff. This is even better + fixes the bad comment. I've added even more asserts to all other places we increase the handler counter to check it's already non-null.
Attachment #8348954 - Attachment is obsolete: true
Attachment #8348954 - Flags: review?(michal.novotny)
Attachment #8348966 - Flags: review?(michal.novotny)
Attachment #8348966 - Flags: review?(michal.novotny) → review+
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Needs a followup, I mistakenly landed the 1.1 version. Try for cache2=on: https://tbpl.mozilla.org/?tree=Try&rev=cdeb2d7640b3
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: