Closed
Bug 513008
Opened 15 years ago
Closed 14 years ago
Eliminate synchronous reads from cache (make cache reads asynchronous)
Categories
(Core :: Networking: Cache, defect)
Core
Networking: Cache
Tracking
()
RESOLVED
FIXED
mozilla2.0b5
Tracking | Status | |
---|---|---|
blocking2.0 | --- | betaN+ |
People
(Reporter: vlad, Assigned: michal)
References
Details
(Keywords: dev-doc-complete)
Attachments
(2 files, 8 obsolete files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
sdwilsh
:
review+
|
Details | Diff | Splinter Review |
Right now, the cache does reads and, more importantly, writes synchronously in its channel implementation. While this does happen on an IO thread, for writes in particular, that sync write means that the write will block any future reads from the socket (and thus handing off data to the parsers) until we finish that writing. This is really very very bad for mobile perf. This is intended as a stopgap until we do bug 512849.
Assignee | ||
Comment 1•15 years ago
|
||
This patch tries to eliminate only blocking reads since the writes will be fixed in bug #513074. This version is not yet ready for review, but I would like to hear some opinions. Also can someone suggest a way to verify a performance gain? It works so that there is a nsIInputStreamAsyncWrapper that converts any blocking input stream into async input stream. Reads are done on a background thread. Anyone who wants to read from the cache asynchronously will use this wrapper initialized with input stream obtained from cache entry descriptor (the patch currently does this only in nsHttpChannel). What needs to be changed yet: - use one global thread or threadpool - implement nsISeekableStream?
Assignee: nobody → michal.novotny
Comment 2•15 years ago
|
||
(In reply to comment #1) > This patch tries to eliminate only blocking reads since the writes will be > fixed in bug #513074. This version is not yet ready for review, but I would > like to hear some opinions. Also can someone suggest a way to verify a > performance gain? It may be difficult to do that, unless you have a really slow hard drive. A lot of this depends on a number of factors (disk fragmentation, disk cache, file system, disk load, etc), so getting good consistent numbers can be hard. > It works so that there is a nsIInputStreamAsyncWrapper that converts any > blocking input stream into async input stream. Reads are done on a background > thread. Anyone who wants to read from the cache asynchronously will use this > wrapper initialized with input stream obtained from cache entry descriptor (the > patch currently does this only in nsHttpChannel). What about using NS_AsyncCopy instead of creating a whole new stream wrapper?
Summary: Eliminate synchronous writes (and reads?) from cache → Eliminate synchronous reads from cache
Target Milestone: --- → mozilla1.9.3a1
Comment 3•15 years ago
|
||
Stuart et al, this could use some testing on some slow mobile devices where we know disk IO on the main thread is a big deal. The patch is not ready to land, but should apply and show whether it's worth spending time on getting this ready to land and so on.
Assignee | ||
Comment 4•15 years ago
|
||
(In reply to comment #2) > It may be difficult to do that, unless you have a really slow hard drive. A > lot of this depends on a number of factors (disk fragmentation, disk cache, > file system, disk load, etc), so getting good consistent numbers can be hard. I can imagine a testing environment, e.g. profile directory on filesystem located on remote nbd or iscsi, that is accessed via some throttled bandwidth (e.g. OpenVPN). But how should look the testing page and what exactly should I measure? > What about using NS_AsyncCopy instead of creating a whole new stream wrapper? Where exactly would you use the NS_AsyncCopy? The nsIInputStreamAsyncWrapper is really easy to use, see the tiny change in nsHttpChannel.cpp. Also NS_AsyncCopy copies always all content but I think that it is useful to have the possibility to read asynchronously only specified amount of data.
Comment 5•15 years ago
|
||
(In reply to comment #4) > Where exactly would you use the NS_AsyncCopy? The nsIInputStreamAsyncWrapper is > really easy to use, see the tiny change in nsHttpChannel.cpp. Also NS_AsyncCopy > copies always all content but I think that it is useful to have the possibility > to read asynchronously only specified amount of data. In what situations do we need to do that? Having concrete data here would be good, otherwise I'm not sure the extra code is worthwhile.
Updated•15 years ago
|
blocking2.0: --- → ?
Assignee | ||
Comment 6•15 years ago
|
||
(In reply to comment #5) > In what situations do we need to do that? Having concrete data here would be > good, otherwise I'm not sure the extra code is worthwhile. We don't need it right now. The bad is that while looking at where exactly to put the NS_AsyncCopy I found out that reading from the cache is already done on separate thread using NS_AsyncCopy. nsCacheEntryDescriptor::nsInputStreamWrapper is blocking therefore a transport that copies data asynchronously is created in nsInputStreamPump::AsyncRead. In GDB I caught only this PR_Read on the main thread: PR_Read nsDiskCacheBlockFile::ReadBlocks nsDiskCacheMap::ReadDiskCacheEntry nsDiskCacheDevice::FindEntry nsCacheService::SearchCacheDevices nsCacheService::ActivateEntry nsCacheService::ProcessRequest nsCacheService::OpenCacheEntry Could someone with DTrace check where are all the reads on the main thread called from?
Comment 7•15 years ago
|
||
(In reply to comment #6) > Could someone with DTrace check where are all the reads on the main thread > called from? From http://files.shawnwilsher.com/2010/3/9/data.log: operation:read (/Users/sdwilsh/Library/Caches/Firefox/Profiles/bsw4j032.io/Cache/_CACHE_001_) libSystem.B.dylib`read$UNIX2003+0xa XUL`nsDiskCacheBlockFile::ReadBlocks(void*, int, int, int*)+0xf0 XUL`nsDiskCacheMap::ReadDiskCacheEntry(nsDiskCacheRecord*)+0x1d6 XUL`nsDiskCacheDevice::FindEntry(nsCString*, int*)+0xc4 XUL`nsCacheService::SearchCacheDevices(nsCString*, int, int*)+0xc7 XUL`nsCacheService::ActivateEntry(nsCacheRequest*, nsCacheEntry**)+0xbd XUL`nsCacheService::ProcessRequest(nsCacheRequest*, int, nsICacheEntryDescriptor**)+0x55 XUL`nsCacheService::OpenCacheEntry(nsCacheSession*, nsACString_internal const&, int, int, nsICacheListener*, nsICacheEntryDescriptor**)+0xc6 XUL`nsCacheSession::OpenCacheEntry(nsACString_internal const&, int, int, nsICacheEntryDescriptor**)+0x35 XUL`nsHttpChannel::OpenCacheEntry(int, int*)+0x4fb XUL`nsHttpChannel::Connect(int)+0x21e XUL`nsHttpChannel::AsyncOpen(nsIStreamListener*, nsISupports*)+0x360
Assignee | ||
Comment 8•15 years ago
|
||
Hmm, it isn't hard to do AsyncOpenCacheEntry() on a background thread but it isn't so easy with OpenCacheEntry() since it should succeed/fail synchronously. My idea is that we could post methods that do IO operations to the background thread and call NS_ProcessNextEvent() while waiting for the result. During that time we can't hold the cache lock on the main thread due to possible deadlock. I don't think we can safely release the lock anywhere in nsCacheService::ProcessRequest() so we should process whole this method on the background thread and implement the posting and event queue spinning in nsCacheService::OpenCacheEntry() (now I'm just talking about OpenCacheEntry() and ignoring that ProcessRequest() is called also from ProcessPendingRequests() which is called from DoomEntry_Internal(), CloseDescriptor() and ValidateEntry()). The problem is that in nsCacheService::OpenCacheEntry() we don't know if ProcessRequest() would do some IO and we must always post it to the background thread. So it is a question if there would be some performance gain. We could also post only AsyncOpenCacheEntry() to the background thread and get rid of using OpenCacheEntry() in nsHttpChannel. This could be done probably easily in nsHttpChannel::OpenCacheEntry() but not in nsHttpChannel::OpenOfflineCacheEntryForWriting() and nsHttpChannel::MaybeInvalidateCacheEntryForSubsequentGet(). Any thoughts are appreciated.
Comment 9•15 years ago
|
||
(In reply to comment #8) > Hmm, it isn't hard to do AsyncOpenCacheEntry() on a background thread but it > isn't so easy with OpenCacheEntry() since it should succeed/fail synchronously. Why does it need to fail synchronously? > My idea is that we could post methods that do IO operations to the background > thread and call NS_ProcessNextEvent() while waiting for the result. During that > time we can't hold the cache lock on the main thread due to possible deadlock. > I don't think we can safely release the lock anywhere in > nsCacheService::ProcessRequest() so we should process whole this method on the > background thread and implement the posting and event queue spinning in > nsCacheService::OpenCacheEntry() (now I'm just talking about OpenCacheEntry() > and ignoring that ProcessRequest() is called also from ProcessPendingRequests() > which is called from DoomEntry_Internal(), CloseDescriptor() and > ValidateEntry()). The problem is that in nsCacheService::OpenCacheEntry() we > don't know if ProcessRequest() would do some IO and we must always post it to > the background thread. So it is a question if there would be some performance > gain. It's not exactly safe to spin the event loop, so we should really try to avoid doing that. > We could also post only AsyncOpenCacheEntry() to the background thread and get > rid of using OpenCacheEntry() in nsHttpChannel. This could be done probably > easily in nsHttpChannel::OpenCacheEntry() but not in > nsHttpChannel::OpenOfflineCacheEntryForWriting() and > nsHttpChannel::MaybeInvalidateCacheEntryForSubsequentGet(). Why can't we in the latter two?
Assignee | ||
Comment 10•15 years ago
|
||
(In reply to comment #9) > Why does it need to fail synchronously? Because it is how OpenCacheEntry() works. > It's not exactly safe to spin the event loop, so we should really try to avoid > doing that. OK, so we need to use AsyncOpenCacheEntry() anywhere if possible. > > rid of using OpenCacheEntry() in nsHttpChannel. This could be done probably > > easily in nsHttpChannel::OpenCacheEntry() but not in > > nsHttpChannel::OpenOfflineCacheEntryForWriting() and > > nsHttpChannel::MaybeInvalidateCacheEntryForSubsequentGet(). > Why can't we in the latter two? It seems to me that it will require quite a big change in the code, but I could be wrong. Anyway we should do it since we can't change behavior of OpenCacheEntry().
Comment 11•15 years ago
|
||
(In reply to comment #10) > It seems to me that it will require quite a big change in the code, but I could > be wrong. Anyway we should do it since we can't change behavior of > OpenCacheEntry(). Agreed. Do you have an estimate on when you might have a patch that does all this?
Assignee | ||
Comment 12•15 years ago
|
||
This patch eliminates now only calls to OpenCacheEntry in nsCacheService::OpenCacheEntry(). For now, please review only changes in nsHttpChannel and ignore changes in the cache part.
Attachment #420327 -
Attachment is obsolete: true
Attachment #435632 -
Flags: review?(jduell.mcbugs)
Updated•15 years ago
|
Attachment #435632 -
Flags: review?(jduell.mcbugs) → review+
Comment 13•15 years ago
|
||
Comment on attachment 435632 [details] [diff] [review] patch v2 - WIP Sorry to take so long to review this--I wasn't familiar with this code, and it's fairly deep spaghetti :) I think the patch looks good. Despite adding more logic, it actually makes the code slightly easier to follow. There are some nits, though, and at least one error that needs fixing. I assume we've got decent test coverage of app caching in mochitests? This code definitely needs to be extensively tested. And I think biesi or bz should sr it. >diff --git a/netwerk/protocol/http/src/nsHttpChannel.cpp >+ if (mLoadFlags & LOAD_BYPASS_LOCAL_CACHE_IF_BUSY) { >+ // we must use synchronous open when LOAD_BYPASS_LOCAL_CACHE_IF_BUSY >+ // was specified shorter: // must use synchronous open for LOAD_BYPASS_LOCAL_CACHE_IF_BUSY >+ rv = session->OpenCacheEntry(cacheKey, >+ nsICache::ACCESS_READ, PR_FALSE, >+ getter_AddRefs(mCacheEntry)); >+ if (NS_SUCCEEDED(rv)) { >+ mCacheEntry->GetAccessGranted(&mCacheAccess); >+ LOG(("nsHttpChannel::OpenCacheEntry [this=%p grantedAccess=%d]", this, mCacheAccess)); >+ return NS_OK; I'm pretty sure you should set mLoadedFromApplicationCache = PR_TRUE here before returning--the code in trunk does it for the equivalent logic, and you do it for the async case, but you're missing it here. >+ rv = session->AsyncOpenCacheEntry(cacheKey, >+ nsICache::ACCESS_READ, >+ this); >+ >+ if (NS_SUCCEEDED(rv) || rv == NS_ERROR_CACHE_WAIT_FOR_VALIDATION) { >+ *delayed = PR_TRUE; >+ return NS_OK; > } AFAICT you don't need to check for NS_ERROR_CACHE_WAIT_FOR_VALIDATION here--nsCacheSession::AsyncOpenCacheEntry converts it to NS_OK. >+nsresult >+nsHttpChannel::OnOfflineCacheEntryAvailable(nsICacheEntryDescriptor *aEntry, >+ nsCacheAccessMode aAccess, >+ nsresult aStatus, >+ PRBool *aDelayed) >+{ So this is called with aDelayed != null only if a sync request failed? I'd like to see a comment explaining that (and an assert). Otherwise it takes a while to figure out what's going on here. More broadly: the whole "delayed" chain starts as a local variabe in Connect(), and is winding up running down quite a chain of function calls, where it ends up also meaning "this function is called synchronously". Would it be better to just make it a member variable, say "mAsyncCacheOpen", and then have the function parameters be "bool sync"? I think this would make the logic much clearer when you look at the function signatures. >+ if (NS_SUCCEEDED(aStatus)) { >+ // We successfully opened an offline cache session and the entry, >+ // now indiciate we load from the offline cache. edit comment: s/now indiciate we/so indicate we will/ >+ if (mCanceled && NS_FAILED(mStatus)) { >+ LOG(("channel was canceled [this=%p status=%x]\n", this, mStatus)); >+ return mStatus; >+ } >+ >+ if (NS_SUCCEEDED(aStatus)) >+ // Called from OnCacheAvailable, advance to the next state >+ return Connect(PR_FALSE); Do you really want to drop down to Connect again if mCanceled, but mStatus is OK? (Is that possible?) Is there any reason why the "return Connect" above shouldn't be part of the first if(NS_SUCCEEDED) block (the one with the indiciate typo)? Also, in comment about "Called from OnCacheAvailable": s/OnCacheAvailable/onCacheEntryAvailable/ >+ if (!mCacheForOfflineUse && !mFallbackChannel) { Would be nice to have a comment explaining what's going on here--i.e. namespace search of app cache? >+ if (aDelayed) >+ NS_ENSURE_SUCCESS(rv, rv); >+ else >+ return Connect(PR_FALSE); Either the "return Connect" here is wrong, or the next two are. See the next few comments. >+ >+ PRUint32 namespaceType = 0; >+ if (!namespaceEntry || >+ NS_FAILED(namespaceEntry->GetItemType(&namespaceType)) || >+ (namespaceType & >+ (nsIApplicationCacheNamespace::NAMESPACE_FALLBACK | >+ nsIApplicationCacheNamespace::NAMESPACE_OPPORTUNISTIC | >+ nsIApplicationCacheNamespace::NAMESPACE_BYPASS)) == 0) { >+ // When loading from an application cache, only items >+ // on the whitelist or matching a >+ // fallback/opportunistic namespace should hit the >+ // network... >+ mLoadFlags |= LOAD_ONLY_FROM_CACHE; >+ >+ // ... and if there were an application cache entry, >+ // we would have found it earlier. >+ if (aDelayed) >+ return NS_ERROR_CACHE_KEY_NOT_FOUND; >+ else >+ return Connect(PR_FALSE); We already called "return Connect" above if !aDelayed, so I don't see how we can get to the 2nd return Connect here. >+ if (namespaceType & >+ nsIApplicationCacheNamespace::NAMESPACE_FALLBACK) { >+ rv = namespaceEntry->GetData(mFallbackKey); >+ if (aDelayed) >+ NS_ENSURE_SUCCESS(rv, rv); >+ else >+ return Connect(PR_FALSE); Ditto for this 3rd "return Connect" as well. >+ if (mLoadFlags & LOAD_BYPASS_LOCAL_CACHE_IF_BUSY) { >+ if (!delayed) { >+ // This is unexpected state. Delayed is valid iff called from >+ // Connect(). We were called from OnCacheEntryAvailable() and that >+ // shouldn't happen when LOAD_BYPASS_LOCAL_CACHE_IF_BUSY flag was >+ // specifed unless somebody altered mLoadFlags between >+ // OpenCacheEntry() and OnCacheEntryAvailable(). Shorter version: // Unexpected state: delayed==null iff we were called from // OnCacheEntryAvailable(), so LOAD_BYPASS_LOCAL_CACHE_IF_BUSY // shouldn't be set. Unless somebody altered mLoadFlags between // OpenCacheEntry() and OnCacheEntryAvailable()... >+ NS_WARNING( >+ "OpenNormalCacheEntry() called from OnCacheEntryAvailable() when" >+ "LOAD_BYPASS_LOCAL_CACHE_IF_BUSY was specified"); Move "when" to next line to avoid 80+ character line >+ // we must use synchronous open when LOAD_BYPASS_LOCAL_CACHE_IF_BUSY >+ // was specified Shorter: // must use synchronous open for LOAD_BYPASS_LOCAL_CACHE_IF_BUSY >+ else { >+ mOnCacheEntryAvailableCallback = &nsHttpChannel::OnNormalCacheEntryAvailable; > rv = session->AsyncOpenCacheEntry(cacheKey, accessRequested, this); >- if (NS_FAILED(rv)) return rv; >- // we'll have to wait for the cache entry >- *delayed = PR_TRUE; >- } >- else if (NS_SUCCEEDED(rv)) { >- mCacheEntry->GetAccessGranted(&mCacheAccess); >- LOG(("nsHttpChannel::OpenCacheEntry [this=%p grantedAccess=%d]", this, mCacheAccess)); >- } >+ if (NS_SUCCEEDED(rv) || rv == NS_ERROR_CACHE_WAIT_FOR_VALIDATION) { Again, I think the check for NS_ERROR_CACHE_WAIT_FOR_VALIDATION isn't needed. >+ if (!delayed) >+ // Called from OnCacheAvailable, advance to the next state >+ rv = Connect(PR_FALSE); In comment about "Called from OnCacheAvailable": s/OnCacheAvailable/onCacheEntryAvailable >+nsresult >+nsHttpChannel::OnNormalCacheEntryAvailable(nsICacheEntryDescriptor *aEntry, >+ nsCacheAccessMode aAccess, >+ nsresult aStatus, >+ PRBool *aDelayed) >+{ >+ NS_ASSERTION(!aDelayed, "aDelayed should be null"); >+ >+ if (NS_SUCCEEDED(aStatus)) { >+ mCacheEntry = aEntry; >+ mCacheAccess = aAccess; >+ } >+ >+ if (mCanceled && NS_FAILED(mStatus)) { >+ LOG(("channel was canceled [this=%p status=%x]\n", this, mStatus)); >+ return mStatus; >+ } >+ >+ if ((mLoadFlags & LOAD_ONLY_FROM_CACHE) && NS_FAILED(aStatus)) >+ // if this channel is only allowed to pull from the cache, then >+ // we must fail if we were unable to open a cache entry. >+ return NS_ERROR_DOCUMENT_NOT_CACHED; I'd prefer if (NS_FAILED(aStatus) { if (mCanceled) { LOG(("channel was canceled [this=%p status=%x]\n", this, mStatus)); return mStatus; } if ((mLoadFlags & LOAD_ONLY_FROM_CACHE) && NS_FAILED(aStatus)) { // if this channel is only allowed to pull from the cache, then // we must fail if we were unable to open a cache entry. return NS_ERROR_DOCUMENT_NOT_CACHED; } } Good work!
Comment 14•15 years ago
|
||
(In reply to comment #13) > >+{ > >+ NS_ASSERTION(!aDelayed, "aDelayed should be null"); > >+ > >+ if (NS_SUCCEEDED(aStatus)) { > >+ mCacheEntry = aEntry; > >+ mCacheAccess = aAccess; > >+ } > >+ > >+ if (mCanceled && NS_FAILED(mStatus)) { Note mStatus test here. > >+ LOG(("channel was canceled [this=%p status=%x]\n", this, mStatus)); > >+ return mStatus; > >+ } > >+ > >+ if ((mLoadFlags & LOAD_ONLY_FROM_CACHE) && NS_FAILED(aStatus)) Note aStatus test here. > >+ // if this channel is only allowed to pull from the cache, then > >+ // we must fail if we were unable to open a cache entry. > >+ return NS_ERROR_DOCUMENT_NOT_CACHED; > > I'd prefer > > if (NS_FAILED(aStatus) { > if (mCanceled) { > LOG(("channel was canceled [this=%p status=%x]\n", this, mStatus)); > return mStatus; > } > if ((mLoadFlags & LOAD_ONLY_FROM_CACHE) && NS_FAILED(aStatus)) { > // if this channel is only allowed to pull from the cache, then > // we must fail if we were unable to open a cache entry. > return NS_ERROR_DOCUMENT_NOT_CACHED; > } > } Since mStatus is not necessarily aStatus, this is not equivalent -- or so it seems from my drive-by point of view. Hope this helps, /be
Comment 15•15 years ago
|
||
nice catch--I didn't notice the aStatus/mStatus difference.
Comment 16•15 years ago
|
||
Let's get this done for 1.9.3.
blocking2.0: ? → beta1+
Target Milestone: mozilla1.9.3a1 → mozilla1.9.3
Assignee | ||
Comment 17•15 years ago
|
||
(In reply to comment #13) > (From update of attachment 435632 [details] [diff] [review]) > Sorry to take so long to review this--I wasn't familiar with this code, and > it's fairly deep spaghetti :) Actually that was one of the quickest reviews I got :) > I assume we've got decent test coverage of app caching in mochitests? This > code definitely needs to be extensively tested. I need to check if the tests exist. If not, I'll create them. > >+nsresult > >+nsHttpChannel::OnOfflineCacheEntryAvailable(nsICacheEntryDescriptor *aEntry, > >+ nsCacheAccessMode aAccess, > >+ nsresult aStatus, > >+ PRBool *aDelayed) > >+{ > > So this is called with aDelayed != null only if a sync request failed? I'd > like to see a comment explaining that (and an assert). Otherwise it takes a > while to figure out what's going on here. > > More broadly: the whole "delayed" chain starts as a local variabe in Connect(), > and is winding up running down quite a chain of function calls, where it ends > up also meaning "this function is called synchronously". Would it be better to > just make it a member variable, say "mAsyncCacheOpen", and then have the > function parameters be "bool sync"? I think this would make the logic much > clearer when you look at the function signatures. aDelayed is always non-null when called from Connect(), so in this case when sync or async request failed in OpenCacheEntry(). What should be the condition for the assertion? Anyway I've changed it to suggested mAsyncCacheOpen and aSync. > >+ if (mCanceled && NS_FAILED(mStatus)) { > >+ LOG(("channel was canceled [this=%p status=%x]\n", this, mStatus)); > >+ return mStatus; > >+ } > >+ > >+ if (NS_SUCCEEDED(aStatus)) > >+ // Called from OnCacheAvailable, advance to the next state > >+ return Connect(PR_FALSE); > > Do you really want to drop down to Connect again if mCanceled, but mStatus is > OK? (Is that possible?) In theory it could happen if somebody calls Cancel() with a success value (though we don't do this internally). I'm not sure that this is correct, but AFAICS this is the same behavior as in the old code. > Is there any reason why the "return Connect" above shouldn't be part of the > first if(NS_SUCCEEDED) block (the one with the indiciate typo)? Because of the mCanceled check. > >+ if (!mCacheForOfflineUse && !mFallbackChannel) { > > Would be nice to have a comment explaining what's going on here--i.e. > namespace search of app cache? I've just took the old code and tried to keep the logic with async opening. I'm afraid I don't know the appCache part enough to describe it exactly. > Either the "return Connect" here is wrong, or the next two are. See the next > few comments. > ... > > Ditto for this 3rd "return Connect" as well. You're right! These 2 statements were wrong. > > I'd prefer > > > > if (NS_FAILED(aStatus) { > > if (mCanceled) { > > LOG(("channel was canceled [this=%p status=%x]\n", this, mStatus)); > > return mStatus; > > } > > if ((mLoadFlags & LOAD_ONLY_FROM_CACHE) && NS_FAILED(aStatus)) { > > // if this channel is only allowed to pull from the cache, then > > // we must fail if we were unable to open a cache entry. > > return NS_ERROR_DOCUMENT_NOT_CACHED; > > } > > } > > nice catch--I didn't notice the aStatus/mStatus difference. I agree that the names are easy to confuse. I've kept my code but I've renamed aStatus to aResult in OnOfflineCacheEntryAvailable() and OnNormalCacheEntryAvailable().
Attachment #435632 -
Attachment is obsolete: true
Attachment #437740 -
Flags: review?(jduell.mcbugs)
Assignee | ||
Comment 18•15 years ago
|
||
Cache IO thread was added to nsCacheService and mCacheWriteThread was removed from nsHttpHandler.
Attachment #437740 -
Attachment is obsolete: true
Attachment #438628 -
Flags: review?(jduell.mcbugs)
Attachment #437740 -
Flags: review?(jduell.mcbugs)
Comment 19•14 years ago
|
||
Jason - do you know when you might be able to get to this review?
Updated•14 years ago
|
Attachment #438628 -
Flags: review?(jduell.mcbugs) → review+
Comment 20•14 years ago
|
||
Comment on attachment 438628 [details] [diff] [review] patch v4 Looks good. Only a couple of things. You mentioned that you'd either find or write tests for this code. There's no new test in the patch--did you find any existing tests? I'm giving you a +r so this can move to the sr stage, but we'll need test coverage for as much of the many codepaths this touches before we can check this in. >+nsresult >+nsHttpChannel::OnOfflineCacheEntryAvailable(nsICacheEntryDescriptor *aEntry, >+ nsCacheAccessMode aAccess, >+ nsresult aResult, >+ PRBool aSync) 'aResult' is a terrible name. "result" is generally understood to be an "out" parameter. I understand you want to distinguish it from "mStatus": I suggest you call it something like "aEntryStatus". I don't really like 'aSync' either--it seems possible to parse it as meaning 'async', when it really means 'sync.' But I can live with it. >+ // Check for namespace match. >+ nsCOMPtr<nsIApplicationCacheNamespace> namespaceEntry; >+ rv = mApplicationCache->GetMatchingNamespace >+ (cacheKey, getter_AddRefs(namespaceEntry)); >+ NS_ENSURE_SUCCESS(rv, aSync ? rv : Connect(PR_FALSE)); So you're now using the NS_ENSURE_SUCCESS line above, when you used to have if (aDelayed) NS_ENSURE_SUCCESS(rv, rv); else return Connect(PR_FALSE); This is not equivalent. Previously you would always "return Connect", but now you'll only return if Connect does not return a success code. Is that OK? I'm guessing it's not. >+ if (namespaceType & >+ nsIApplicationCacheNamespace::NAMESPACE_FALLBACK) { >+ rv = namespaceEntry->GetData(mFallbackKey); >+ NS_ENSURE_SUCCESS(rv, aSync ? rv : Connect(PR_FALSE)); >+ } Same issue with NS_ENSURE_SUCCESS call. >- if (!gHttpHandler->mCacheWriteThread || >- NS_FAILED(rv) || >- policy == nsICache::STORE_ON_DISK_AS_FILE) { >- LOG(("nsHttpChannel::InstallCacheListener sync tee %p\n", tee.get())); >+ if (NS_FAILED(rv) || policy == nsICache::STORE_ON_DISK_AS_FILE) { >+ LOG(("nsHttpChannel::InstallCacheListener sync tee %p rv=%x policy=%d", >+ tee.get(), rv, policy)); > rv = tee->Init(mListener, out, nsnull); > } else { >- LOG(("nsHttpChannel::InstallCacheListener async tee %p\n", >+ nsCOMPtr<nsICacheService> serv = >+ do_GetService(NS_CACHESERVICE_CONTRACTID, &rv); >+ NS_ENSURE_SUCCESS(rv, rv); >+ >+ nsCOMPtr<nsIEventTarget> cacheIOTarget; >+ rv = serv->GetCacheIOTarget(getter_AddRefs(cacheIOTarget)); >+ >+ if (NS_FAILED(rv) || !cacheIOTarget) { >+ LOG(("nsHttpChannel::InstallCacheListener sync tee %p because of " >+ "missing cache IO target)", tee.get())); >+ rv = tee->Init(mListener, out, nsnull); >+ } else { >+ LOG(("nsHttpChannel::InstallCacheListener async tee %p", > tee.get())); >- rv = tee->InitAsync(mListener, gHttpHandler->mCacheWriteThread, out, nsnull); >- } >- >+ rv = tee->InitAsync(mListener, cacheIOTarget, out, nsnull); >+ } >+ } >+ So you've replaced if (missing cache thread || other_conditions) setup sync tee else setup async tee with if (other_conditions) setup sync tee else if (missing cache thread) setup sync tee else setup async tee Any reason why you couldn't keep the same, simpler branch structure? It looks like you'd just need to move the check for nsICacheService's cacheIOTarget before the first if, right? If there's a performance reason (i.e. common case is other_conditions == true), or if you want to distinguish these cases in the LOG calls (as you now do), then keep it. But all things equal, it's nice to have shorter, simpler code.
Comment 21•14 years ago
|
||
biesi, do you know how well our existing xpcshell/mochitests test our caching logic? You're the most likely candidate to sr this, by the way, so if you can take a look at whether all the codepaths touched by this are already tested, or whether some need additional tests written, that would help move things along.
Comment 22•14 years ago
|
||
(In reply to comment #20) > I don't really like 'aSync' either--it seems possible to parse it as meaning > 'async', when it really means 'sync.' But I can live with it. aIsSync?
Comment 23•14 years ago
|
||
(In reply to comment #20) [...] > >+ nsCOMPtr<nsIApplicationCacheNamespace> namespaceEntry; > >+ rv = mApplicationCache->GetMatchingNamespace > >+ (cacheKey, getter_AddRefs(namespaceEntry)); > >+ NS_ENSURE_SUCCESS(rv, aSync ? rv : Connect(PR_FALSE)); > > So you're now using the NS_ENSURE_SUCCESS line above, when you used to have > > if (aDelayed) > NS_ENSURE_SUCCESS(rv, rv); > else > return Connect(PR_FALSE); > > This is not equivalent. Previously you would always "return Connect", but now > you'll only return if Connect does not return a success code. Is that OK? I'm > guessing it's not. Random comment here w/o actually understanding what this code really does. I agree that the old and new are not equivalent, but the new code does always return if (!aDelayed). NS_ENSURE_SUCCESS() always returns if the first code is an error code, so the new code returns rv if aSync is true, or the return value of Connect() if it's false, whether Connect() returns success or error.
Comment 24•14 years ago
|
||
Actually, I think jst's comment is wrong, too. The new code returns iff GetMatchingNamespace fails. If it succeeds, we used to return Connect if aSync, but we won't any more. I also don't understand exactly what's going on here. It seems weird to get the cacheKey, and then, even if it succeeds, call Connect unconditionally (the previous patch's logic) without keeping the cacheKey around (why bother looking it up then?). So maybe the change in the new version is intentional. Michal?
Assignee | ||
Comment 25•14 years ago
|
||
(In reply to comment #24) > I also don't understand exactly what's going on here. It seems weird to get > the cacheKey, and then, even if it succeeds, call Connect unconditionally (the > previous patch's logic) without keeping the cacheKey around (why bother looking > it up then?). So maybe the change in the new version is intentional. Michal? Yes, the change is intentional. You've pointed out the bad 'return Connect(...)' in comment #13 and I agreed that it was wrong. The original code was: // Check for namespace match. nsCOMPtr<nsIApplicationCacheNamespace> namespaceEntry; rv = mApplicationCache->GetMatchingNamespace (cacheKey, getter_AddRefs(namespaceEntry)); NS_ENSURE_SUCCESS(rv, rv); This code was called always synchronously from Connect, so if GetMatchingNamespace failed then returning an error code led to continuing without the cache entry. With this patch the code can be called from Connect or from OnCacheEntryAvailable. In case it is called from Connect (aIsSync is true) we want to do the same as before - just throw an error. In case it is called from OnCacheEntryAvailable we need to call Connect(PR_FALSE) to start proceeding the request without the cache entry. Anyway in both cases we want to return from OnOfflineCacheEntryAvailable when GetMatchingNamespace fails.
Comment 26•14 years ago
|
||
Comment on attachment 438628 [details] [diff] [review] patch v4 OK, this is ready for sr. Biesi, it would be great to know which of the codepaths this touches are tested by our existing xpcshell/mochi tests, and which we might want to write new tests for.
Attachment #438628 -
Flags: superreview?(cbiesinger)
Assignee | ||
Comment 27•14 years ago
|
||
(In reply to comment #20) > You mentioned that you'd either find or write tests for this code. There's no > new test in the patch--did you find any existing tests? I'm giving you a +r > so this can move to the sr stage, but we'll need test coverage for as much of > the many codepaths this touches before we can check this in. I haven't looked at this yet. I suppose that some new tests will be needed. I'll create a separate patch... > 'aResult' is a terrible name. "result" is generally understood to be an "out" > parameter. I understand you want to distinguish it from "mStatus": I suggest > you call it something like "aEntryStatus". Fixed. > I don't really like 'aSync' either--it seems possible to parse it as meaning > 'async', when it really means 'sync.' But I can live with it. I've renamed it to aIsSync. > So you've replaced > > if (missing cache thread || other_conditions) > setup sync tee > else > setup async tee > > with > > if (other_conditions) > setup sync tee > else > if (missing cache thread) > setup sync tee > else > setup async tee > > Any reason why you couldn't keep the same, simpler branch structure? It looks > like you'd just need to move the check for nsICacheService's cacheIOTarget > before the first if, right? > > If there's a performance reason (i.e. common case is other_conditions == true), > or if you want to distinguish these cases in the LOG calls (as you now do), > then keep it. But all things equal, it's nice to have shorter, simpler code. Yes, that was an attempt at performance optimization, but I've changed it because other_condition is false probably in most cases.
Attachment #438628 -
Attachment is obsolete: true
Attachment #445931 -
Flags: superreview?(cbiesinger)
Attachment #438628 -
Flags: superreview?(cbiesinger)
Comment 28•14 years ago
|
||
Can somebody else possibly do this sr if biesi cannot get to it soon? We have ten days until the beta freeze, and this is blocking b1.
Comment 29•14 years ago
|
||
Comment on attachment 445931 [details] [diff] [review] patch v5 bz is the only one besides biesi who can sr netwerk patches.
Attachment #445931 -
Flags: superreview?(cbiesinger) → superreview?(bzbarsky)
Updated•14 years ago
|
Summary: Eliminate synchronous reads from cache → Eliminate synchronous reads from cache (make cache reads asynchronous)
Any sr can sr. Does this affect interfaces? Want to make sure we know what we're sring for.
Comment 31•14 years ago
|
||
(In reply to comment #30) > Any sr can sr. Huh. I think this page disagrees: <http://www.mozilla.org/hacking/reviewers.html> > Does this affect interfaces? Want to make sure we know what > we're sring for. AFAICT, the only thing in this patch which needs sr is the addition of the cacheIOTarget attribute to nsICacheService.
Updated•14 years ago
|
Version: unspecified → Trunk
"Super-review does not require domain expertise (module owners and peers supply that, usually), so the areas below are not pigeon-holes -- you can solicit a super-review from any reviewer on the list, but using area as a guide will get quicker results in the typical case." sr coming right up.
Comment on attachment 445931 [details] [diff] [review] patch v5 cacheIOTarget would like a better comment, perhaps describing what events are dispatched (or a URL to MDC on which it's explained). "which is used for I/O operations" is unusefully vague. Let's get the patch in, but make sure to document that piece better on MDC at the least. sr=shaver
Attachment #445931 -
Flags: superreview?(bzbarsky) → superreview+
Updated•14 years ago
|
Keywords: dev-doc-needed
Comment 34•14 years ago
|
||
Comment on attachment 445931 [details] [diff] [review] patch v5 see comments below. +++ b/netwerk/cache/src/nsCacheService.cpp #endif - nsCacheService * nsCacheService::gService = nsnull; why'd you remove this empty line? + nsresult rv = NS_NewThread(getter_AddRefs(mCacheIOThread)); + if (NS_FAILED(rv)) { + NS_WARNING("Can't create cache IO thread"); why not return rv; after this? this shouldn't happen normally. +NS_IMETHODIMP nsCacheService::GetCacheIOTarget(nsIEventTarget * *aCacheIOTarget) +{ + NS_ENSURE_ARG_POINTER(aCacheIOTarget); no need to nullcheck out params +++ b/netwerk/protocol/http/src/nsHttpChannel.cpp Why did you make mAsyncCacheOpen a member variable? NS_ENSURE_SUCCESS(rv, aIsSync ? rv : Connect(PR_FALSE)); This is kinda hard to read, why not: if (NS_FAILED(rv) && !aIsSync) return Connect(PR_FALSE); NS_ENSURE_SUCCESS(rv, rv); +++ b/netwerk/protocol/http/src/nsHttpChannel.h + typedef nsresult (nsHttpChannel:: *nsOnCacheEntryAvailableCallback)(nsICacheEntryDescriptor *, nsCacheAccessMode, nsresult, PRBool); 80 columns please
Comment 35•14 years ago
|
||
> Why did you make mAsyncCacheOpen a member variable? See comment 13. I might be wrong, but it seemed better than passing an extra arg through a long set of function calls.
Comment 36•14 years ago
|
||
Michal, can you provide an updated patch here for someone to land ASAP, we'd love to have this in beta1, which is just a few days away now...
Assignee | ||
Comment 37•14 years ago
|
||
(In reply to comment #34) > +++ b/netwerk/cache/src/nsCacheService.cpp > #endif > - > nsCacheService * nsCacheService::gService = nsnull; > > why'd you remove this empty line? I've probably added some code that I removed later... Fixed. > + nsresult rv = NS_NewThread(getter_AddRefs(mCacheIOThread)); > + if (NS_FAILED(rv)) { > + NS_WARNING("Can't create cache IO thread"); > > why not return rv; after this? this shouldn't happen normally. Fixed. > +NS_IMETHODIMP nsCacheService::GetCacheIOTarget(nsIEventTarget * > *aCacheIOTarget) > +{ > + NS_ENSURE_ARG_POINTER(aCacheIOTarget); > > no need to nullcheck out params Fixed. > +++ b/netwerk/protocol/http/src/nsHttpChannel.cpp > > Why did you make mAsyncCacheOpen a member variable? The first patch added a new meaning to the parameter delayed. It was null when the call was from OnCacheEntryAvailable. I've split it into aIsSync and mAsyncCacheOpen. As Jason already wrote, this seems to be a little easier to read. > NS_ENSURE_SUCCESS(rv, aIsSync ? rv : Connect(PR_FALSE)); > > This is kinda hard to read, why not: > if (NS_FAILED(rv) && !aIsSync) > return Connect(PR_FALSE); > NS_ENSURE_SUCCESS(rv, rv); Fixed > +++ b/netwerk/protocol/http/src/nsHttpChannel.h > > + typedef nsresult (nsHttpChannel:: > *nsOnCacheEntryAvailableCallback)(nsICacheEntryDescriptor *, nsCacheAccessMode, > nsresult, PRBool); > > 80 columns please Fixed
Attachment #445931 -
Attachment is obsolete: true
Comment 38•14 years ago
|
||
--> beta2+ but jst said he might land it tonight
blocking2.0: beta1+ → beta2+
Comment 39•14 years ago
|
||
If this is ready to land, I will land it tomorrow morning. Just let me know if I need to do that.
Comment 40•14 years ago
|
||
sdwilsh, if I don't manage to land this tonight then please do land tomorrow morning if you have a chance to!
Keywords: checkin-needed
Comment 41•14 years ago
|
||
This was landed <http://hg.mozilla.org/mozilla-central/rev/b560649a36f1> and then backed out <http://hg.mozilla.org/mozilla-central/rev/e6fcd886df50> due to test failures.
Updated•14 years ago
|
Keywords: checkin-needed
Assignee | ||
Comment 42•14 years ago
|
||
This patch fixes following test failures: 1) mochitest /tests/dom/tests/mochitest/ajax/offline/test_bypass.html The same check for (mLoadFlags & LOAD_ONLY_FROM_CACHE) as after OpenCacheEntry() is needed in case Connect() is called again. Otherwise we would hit the net in case we shouldn't... 2) xpcshelltest content/unit/test_error_codes.js Devices are created on the cache IO thread because of lazy initialization. The assertion occurs because StorageService is initialized off the main thread (see backtrace below). I've disabled lazy initialization for offline device. Does anybody have a better idea? Calling simply do_GetService("@mozilla.org/storage/service;1") in nsCacheService::Init() would also fix the problem but this would be IMHO ugly hack. #3 0x0021a014 in mozalloc_abort (msg= 0xf5dfe5ec "###!!! ASSERTION: Using observer service off the main thread!: 'Error', file /opt/moz/hg/src/xpcom/ds/nsObserverService.cpp, line 128") at /opt/moz/hg/src/memory/mozalloc/mozalloc_abort.cpp:75 #4 0x0229aaee in Abort (aMsg=0xf5dfe5ec "###!!! ASSERTION: Using observer service off the main thread!: 'Error', file /opt/moz/hg/src/xpcom/ds/nsObserverService.cpp, line 128") at /opt/moz/hg/src/xpcom/base/nsDebugImpl.cpp:379 #5 0x0229aab1 in NS_DebugBreak_P (aSeverity=1, aStr=0x2929290 "Using observer service off the main thread!", aExpr=0x2929287 "Error", aFile= 0x2929188 "/opt/moz/hg/src/xpcom/ds/nsObserverService.cpp", aLine=128) at /opt/moz/hg/src/xpcom/base/nsDebugImpl.cpp:366 #6 0x0223ee7c in nsObserverService::AddObserver (this=0x8a30ce8, anObserver=0xf5200be4, aTopic=0x286de48 "xpcom-shutdown", ownsWeak=0) at /opt/moz/hg/src/xpcom/ds/nsObserverService.cpp:128 #7 0x01f19143 in mozilla::storage::Service::initialize (this=0xf5200be0) at /opt/moz/hg/src/storage/src/mozStorageService.cpp:189 #8 0x01f18cec in mozilla::storage::Service::getSingleton () at /opt/moz/hg/src/storage/src/mozStorageService.cpp:114 #9 0x01f18512 in mozilla::storage::ServiceConstructor (aOuter=0x0, aIID=..., aResult=0xf5dfecc0) at /opt/moz/hg/src/storage/build/mozStorageModule.cpp:53 #10 0x0222b110 in nsGenericFactory::CreateInstance (this=0xf5200bc8, aOuter=0x0, aIID=..., aResult=0xf5dfecc0) at nsGenericFactory.cpp:80 #11 0x022805d8 in nsComponentManagerImpl::CreateInstanceByContractID (this=0x893ef88, aContractID=0x257bde0 "@mozilla.org/storage/service;1", aDelegate=0x0, aIID=..., aResult= 0xf5dfecc0) at /opt/moz/hg/src/xpcom/components/nsComponentManager.cpp:1725 #12 0x02281961 in nsComponentManagerImpl::GetServiceByContractID (this=0x893ef88, aContractID=0x257bde0 "@mozilla.org/storage/service;1", aIID=..., result=0xf5dfedac) at /opt/moz/hg/src/xpcom/components/nsComponentManager.cpp:2310 #13 0x0221e536 in CallGetService (aContractID=0x257bde0 "@mozilla.org/storage/service;1", aIID=..., aResult=0xf5dfedac) at nsComponentManagerUtils.cpp:94 #14 0x0221ea6a in nsGetServiceByContractIDWithError::operator() (this=0xf5dfeedc, aIID=..., aInstancePtr=0xf5dfedac) at nsComponentManagerUtils.cpp:288 #15 0x0103b726 in nsCOMPtr<mozIStorageService>::assign_from_gs_contractid_with_error (this=0xf5dfeec0, gs=..., aIID=...) at ../../dist/include/nsCOMPtr.h:1239 #16 0x0103a759 in nsCOMPtr<mozIStorageService>::nsCOMPtr (this=0xf5dfeec0, gs=...) at ../../dist/include/nsCOMPtr.h:612 #17 0x01031837 in nsOfflineCacheDevice::Init (this=0xf5200780) at /opt/moz/hg/src/netwerk/cache/nsDiskCacheDeviceSQL.cpp:980 #18 0x0101a73a in nsCacheService::CreateOfflineDevice (this=0x8fd51f8) at /opt/moz/hg/src/netwerk/cache/nsCacheService.cpp:1067 3) xpcshelltest browser/components/places/tests/test_txnGUIDs.js Cache IO thread is shutdown outside the cache service lock to prevent a deadlock.
Attachment #452729 -
Attachment is obsolete: true
Attachment #454810 -
Flags: review?(jduell.mcbugs)
Comment 43•14 years ago
|
||
(In reply to comment #42) > Devices are created on the cache IO thread because of lazy initialization. The > assertion occurs because StorageService is initialized off the main thread (see > backtrace below). I've disabled lazy initialization for offline device. Does > anybody have a better idea? Calling simply > do_GetService("@mozilla.org/storage/service;1") in nsCacheService::Init() would > also fix the problem but this would be IMHO ugly hack. File a bug on storage. This isn't supposed to be illegal, and it looks like I regressed it. I can get that fixed for you pretty quickly.
Updated•14 years ago
|
Target Milestone: mozilla1.9.3 → mozilla2.0
Updated•14 years ago
|
blocking2.0: beta2+ → betaN+
Comment 44•14 years ago
|
||
Comment on attachment 454810 [details] [diff] [review] patch v7 Looks good. Patch has bitrotted slightly. It sounds like sdwilsh thinks the lazy initialization for offline device should be kept--did we file/fix a bug on that?
Attachment #454810 -
Flags: review?(jduell.mcbugs) → review+
Assignee | ||
Comment 45•14 years ago
|
||
> It sounds like sdwilsh thinks the lazy initialization for offline device should
> be kept--did we file/fix a bug on that?
Yes, the bug number is #578939.
Assignee | ||
Comment 46•14 years ago
|
||
Lazy initialization of the offline device is kept in this patch.
Attachment #454810 -
Attachment is obsolete: true
Comment 47•14 years ago
|
||
Pushed to mozilla-central now that bug 578939 is fixed. http://hg.mozilla.org/mozilla-central/rev/1e8961782824
http://hg.mozilla.org/mozilla-central/pushloghtml?changeset=1ac135c8a484 Backed out due to s: talos-r3-fed64-015 TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/mozilla-central-fedora64-debug-u-xpcshell/build/xpcshell/tests/content/unit/test_error_codes.js | test failed (with xpcshell return code: -6), see following log: Bug 560197 - [xpcshell-tests] Intermittent timeout after test_csputils.js, when running test_error_codes.js PROCESS-CRASH | /home/cltbld/talos-slave/mozilla-central-fedora64-debug-u-xpcshell/build/xpcshell/tests/content/unit/test_error_codes.js | application crashed (minidump found) Thread 0 (crashed) TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/mozilla-central-fedora64-debug-u-xpcshell/build/xpcshell/tests/test_necko/unit/test_fallback_no-cache-entry_passing.js | test failed (with xpcshell return code: 0), see following log: TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/mozilla-central-fedora64-debug-u-xpcshell/build/xpcshell/tests/test_necko/unit/head_channels.js | Could not get contentLength - See following stack: TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/mozilla-central-fedora64-debug-u-xpcshell/build/xpcshell/tests/test_necko/unit/head_channels.js | Error in onStartRequest: 2147500036 - See following stack: TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/mozilla-central-fedora64-debug-u-xpcshell/build/xpcshell/tests/test_necko/unit/head_channels.js | Failed to load URL: 804b0046 - See following stack: TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/mozilla-central-fedora64-debug-u-xpcshell/build/xpcshell/tests/test_necko/unit/head_channels.js | Error in onStopRequest: 2147500036 - See following stack: all over Xo and Xd.
Assignee | ||
Comment 49•14 years ago
|
||
Bug 578939 doesn't seem to fix the problem with storage initialization. I'll have a look at it during the weekend.
Assignee | ||
Comment 50•14 years ago
|
||
There are 2 failing tests: 1) test_fallback_no-cache-entry_passing.js I'm still investigating this. 2) test_error_codes.js There is an assertion because nsOfflineCacheDevice is destroyed on a different thread than it was created and it isn't declared as threadsafe. If I do so (I think that the code is threadsafe) then I got assertion from mozstorage that the database must be closed on the same thread as it was opened (see below). Shawn, do you think that this can be fixed in the mozstorage code? ###!!! ASSERTION: Must close the database on the thread that you opened it with!: 'Error', file /opt/moz/hg/src/storage/src/mozStorageConnection.cpp, line 543 (gdb) bt 20 #0 0x00a96832 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 #1 0x00254a81 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #2 0x0025634a in abort () at abort.c:92 #3 0x0075e014 in mozalloc_abort (msg= 0xffa9488c "###!!! ASSERTION: Must close the database on the thread that you opened it with!: 'Error', file /opt/moz/hg/src/storage/src/mozStorageConnection.cpp, line 543") at /opt/moz/hg/src/memory/mozalloc/mozalloc_abort.cpp:75 #4 0x0248115e in Abort (aMsg= 0xffa9488c "###!!! ASSERTION: Must close the database on the thread that you opened it with!: 'Error', file /opt/moz/hg/src/storage/src/mozStorageConnection.cpp, line 543") at /opt/moz/hg/src/xpcom/base/nsDebugImpl.cpp:379 #5 0x02481121 in NS_DebugBreak_P (aSeverity=1, aStr=0x2ac0eec "Must close the database on the thread that you opened it with!", aExpr=0x2ac0ee6 "Error", aFile=0x2ac0a50 "/opt/moz/hg/src/storage/src/mozStorageConnection.cpp", aLine=543) at /opt/moz/hg/src/xpcom/base/nsDebugImpl.cpp:366 #6 0x02065e95 in mozilla::storage::Connection::setClosedState (this=0xf5e00c38) at /opt/moz/hg/src/storage/src/mozStorageConnection.cpp:543 #7 0x02066341 in mozilla::storage::Connection::Close (this=0xf5e00c38) at /opt/moz/hg/src/storage/src/mozStorageConnection.cpp:630 #8 0x020651ee in mozilla::storage::Connection::~Connection (this=0xf5e00c38, __in_chrg=<value optimized out>) at /opt/moz/hg/src/storage/src/mozStorageConnection.cpp:326 #9 0x020653b1 in mozilla::storage::Connection::Release (this=0xf5e00c38) at /opt/moz/hg/src/storage/src/mozStorageConnection.cpp:329 #10 0x0206474d in nsRefPtr<mozilla::storage::Connection>::~nsRefPtr (this=0xf5e02b7c, __in_chrg=<value optimized out>) at ../../dist/include/nsAutoPtr.h:969 #11 0x0206d453 in mozilla::storage::StorageBaseStatementInternal::~StorageBaseStatementInternal (this=0xf5e02b78, __in_chrg=<value optimized out>) at /opt/moz/hg/src/storage/src/StorageBaseStatementInternal.h:75 #12 0x0206a638 in mozilla::storage::Statement::~Statement (this=0xf5e02b70, __in_chrg=<value optimized out>) at /opt/moz/hg/src/storage/src/mozStorageStatement.cpp:282 #13 0x0206a7a7 in mozilla::storage::Statement::Release (this=0xf5e02b70) at /opt/moz/hg/src/storage/src/mozStorageStatement.cpp:288 #14 0x010f6ca4 in nsCOMPtr<mozIStorageStatement>::~nsCOMPtr (this=0xf5e00820, __in_chrg=<value optimized out>) at ../../dist/include/nsCOMPtr.h:533 #15 0x010ecee4 in nsOfflineCacheDevice::~nsOfflineCacheDevice (this=0xf5e00808, __in_chrg=<value optimized out>) at /opt/moz/hg/src/netwerk/cache/nsDiskCacheDeviceSQL.cpp:787 #16 0x010ecf49 in nsOfflineCacheDevice::~nsOfflineCacheDevice (this=0xf5e00808, __in_chrg=<value optimized out>) at /opt/moz/hg/src/netwerk/cache/nsDiskCacheDeviceSQL.cpp:787 #17 0x010ecaa4 in nsOfflineCacheDevice::Release (this=0xf5e00808) at /opt/moz/hg/src/netwerk/cache/nsDiskCacheDeviceSQL.cpp:775 #18 0x010d6834 in nsCacheService::Shutdown (this=0x9f30eb0) at /opt/moz/hg/src/netwerk/cache/nsCacheService.cpp:778 #19 0x010d5118 in nsCacheProfilePrefObserver::Observe (this=0x9f326e0, subject=0x9cb3f14, topic=0x2b9023c "xpcom-shutdown", data_unicode=0x0) at /opt/moz/hg/src/netwerk/cache/nsCacheService.cpp:262
Comment 51•14 years ago
|
||
(In reply to comment #50) > Shawn, do you think that this can be fixed in the mozstorage code? > > ###!!! ASSERTION: Must close the database on the thread that you opened it > with!: 'Error', file /opt/moz/hg/src/storage/src/mozStorageConnection.cpp, line > 543 No, that's an assertion that we have because we need that invariant to hold true.
Assignee | ||
Comment 52•14 years ago
|
||
- nsOfflineCacheDevice releases DB on the same thread as it was created - fixed handling of fallback channel in nsHttpChannel::OnCacheEntryAvailable This patch seems to pass try server. There were few failures that seem to be unrelated to this patch and I couldn't reproduce them.
Attachment #459002 -
Attachment is obsolete: true
Assignee | ||
Updated•14 years ago
|
Keywords: checkin-needed
Assignee | ||
Comment 53•14 years ago
|
||
Shawn, could you review the database closing in the new patch?
Attachment #467461 -
Flags: review?(sdwilsh)
Comment 54•14 years ago
|
||
This is still waiting for review, right? Removing checkin-needed.
Keywords: checkin-needed
Comment 55•14 years ago
|
||
Comment on attachment 467461 [details] [diff] [review] difference between v8 and v10 Your indentation appears to be weird (could be the interdiff), but otherwise looks fine. r=sdwilsh
Attachment #467461 -
Flags: review?(sdwilsh) → review+
Updated•14 years ago
|
Keywords: checkin-needed
Comment 56•14 years ago
|
||
Landed as: http://hg.mozilla.org/mozilla-central/rev/d0b284052d29
Comment 57•14 years ago
|
||
Looks like this stuck this time! Marking FIXED.
Comment 58•14 years ago
|
||
I backed out the patch because it seems that it was the cause of a talos regression (tp4). On Linux-x86-65 opt, results without the patch are around 327 and with the patch around 337. http://hg.mozilla.org/mozilla-central/rev/643758385088
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to comment #58) > I backed out the patch because it seems that it was the cause of a talos > regression (tp4). On Linux-x86-65 opt, results without the patch are around 327 > and with the patch around 337. > > http://hg.mozilla.org/mozilla-central/rev/643758385088 Did we see Talos numbers go back down after backing this out?
Comment 60•14 years ago
|
||
(In reply to comment #59) > (In reply to comment #58) > > I backed out the patch because it seems that it was the cause of a talos > > regression (tp4). On Linux-x86-65 opt, results without the patch are around 327 > > and with the patch around 337. > > > > http://hg.mozilla.org/mozilla-central/rev/643758385088 > > Did we see Talos numbers go back down after backing this out? Talos tp4 on Linux 64bits is back to 327. However, there were no notifications in tree-management but I have no idea how these notifications should behave...
Comment 61•14 years ago
|
||
So I've been looking more at the data from the talos runs that we did while this patch was in the tree and here's what I see: Windows: flat Linux: ~3% regression Mac: ~8% gain Given that I think we're worse off w/o this patch so I'm voting for landing this again. Michal says he will do some investigation with callgrind here to see if we're missing anyting obvious, but I think we should land that independent of what Michal finds (which he won't have time to do until later in the week). If anyone disagrees please speak up, if not I'll add this to the landing queue later today and get this back in the tree. Here's some graph links for reference, this change landed at 11 am on the 21st and got backed out at 8pm on the 22nd. Windows: http://graphs.mozilla.org/graph.html#tests=[[38,1,351],[38,1,352],[38,1,353],[38,1,354],[38,1,355],[38,1,356],[38,1,357],[38,1,358],[38,1,359],[38,1,360],[38,1,361],[38,1,362],[38,1,363],[38,1,364],[38,1,365],[38,1,366],[38,1,367],[38,1,368],[38,1,369],[38,1,370],[38,1,483],[38,1,484],[38,1,485],[38,1,486],[38,1,487],[38,1,488],[38,1,489],[38,1,490],[38,1,491],[38,1,492],[38,1,493],[38,1,494],[38,1,495],[38,1,496],[38,1,497],[38,1,498],[38,1,499],[38,1,500],[38,1,501],[38,1,648],[38,1,649],[38,1,650],[38,1,651],[38,1,652],[38,1,653],[38,1,654],[38,1,655],[38,1,656],[38,1,657],[38,1,883],[38,1,884],[38,1,885]]&sel=1282129268,1282573826 Linux: http://graphs.mozilla.org/graph.html#tests=[[38,1,431],[38,1,432],[38,1,433],[38,1,434],[38,1,435],[38,1,436],[38,1,437],[38,1,438],[38,1,439],[38,1,440],[38,1,441],[38,1,442],[38,1,443],[38,1,444],[38,1,445],[38,1,446],[38,1,447],[38,1,448],[38,1,449],[38,1,450],[38,1,688],[38,1,689],[38,1,690],[38,1,691],[38,1,692],[38,1,693],[38,1,694],[38,1,695],[38,1,696],[38,1,697],[38,1,698],[38,1,699],[38,1,700],[38,1,701],[38,1,702],[38,1,703],[38,1,704],[38,1,705],[38,1,706],[38,1,707],[38,1,708],[38,1,709],[38,1,710],[38,1,711],[38,1,712],[38,1,713],[38,1,714],[38,1,715],[38,1,716],[38,1,717],[38,1,895],[38,1,896],[38,1,897],[38,1,898],[38,1,899]]&sel=1282129268,1282573826 Mac: http://graphs.mozilla.org/graph.html#tests=[[38,1,530],[38,1,531],[38,1,532],[38,1,533],[38,1,534],[38,1,535],[38,1,536],[38,1,537],[38,1,538],[38,1,539],[38,1,540],[38,1,576],[38,1,577],[38,1,578],[38,1,579],[38,1,580],[38,1,581],[38,1,582],[38,1,718],[38,1,719],[38,1,720],[38,1,721],[38,1,722],[38,1,723],[38,1,724],[38,1,725],[38,1,726],[38,1,727],[38,1,728],[38,1,729],[38,1,730],[38,1,731],[38,1,732],[38,1,733],[38,1,734],[38,1,735],[38,1,736],[38,1,737],[38,1,738],[38,1,739],[38,1,740],[38,1,741],[38,1,742],[38,1,743],[38,1,744],[38,1,745],[38,1,746],[38,1,747],[38,1,900],[38,1,901],[38,1,902],[38,1,903],[38,1,904]]&sel=1282129268,1282573826
Comment 62•14 years ago
|
||
I'm of the opinion that it is best to land something like this regardless of the regressions as long as it doesn't break anything else and work on finding/fixing the regression after it has landed. I think the positives outweigh the negatives here. =)
Comment 63•14 years ago
|
||
Relanded: http://hg.mozilla.org/mozilla-central/rev/72d2863f43c7
Comment 64•14 years ago
|
||
Seems like everything's green.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → FIXED
Updated•14 years ago
|
Target Milestone: mozilla2.0 → mozilla2.0b5
![]() |
||
Comment 65•14 years ago
|
||
Because this patch is suspected to cause following top crash: http://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&signature=nsHttpChannel%3A%3AContinueProcessResponse%28unsigned%20int%29&version=Firefox%3A4.0b5pre I would like to take it for few days off the trunk and see of this crash disappears. Everybody agree?
Comment 66•14 years ago
|
||
Please back this out, it caused the very common test_twostreams failure also, and there's definitely logic errors there which need to be fixed.
Assignee | ||
Comment 67•14 years ago
|
||
I'd prefer to back out only the HTTP part of the patch. If there is any problem it should be there.
![]() |
||
Comment 68•14 years ago
|
||
(In reply to comment #67) > I'd prefer to back out only the HTTP part of the patch. If there is any problem > it should be there. Good idea, probably no need to back the rest out.
Comment 69•14 years ago
|
||
Not sure if this is related, but any idea what this crash is about? Seen on a try run recently, after this patch landed. http://tinderbox.mozilla.org/showlog.cgi?tree=MozillaTry&errorparser=unittest&logfile=1283410607.1283430259.8701.gz&buildtime=1283410607&buildname=WINNT%205.2%20tryserver%20leak%20test%20build&fulltext=1
So...did anyone back out this problematic patch? "Prefer to back out only X" is nice and all, but the responsible thing to do is back out the thing that broke, so it doesn't interfere with others and then land parts of it if they test correctly.
Comment 71•14 years ago
|
||
Yes, this did get backed out in bug 589296, but yes, only partially. Last I spoke with Honza he still wanted to check a few things before confirming in bug 589296 that the crash really is gone (looks good so far AFAIK). This also lead to bug 594882, which now has the likely reason for the crash that was introduced with this patch.
![]() |
||
Comment 72•14 years ago
|
||
Johnny is correct. The crash is gone after backing out the nsHttpChannel part of this bug's patch. Fix + re-land will be tracked in bug 594882.
Comment 73•14 years ago
|
||
The backout appears to be http://hg.mozilla.org/mozilla-central/rev/a93c9118f16d
Comment 74•14 years ago
|
||
Backout landed on GECKO20b5_20100831_RELBRANCH http://hg.mozilla.org/mozilla-central/rev/11e16ebb2d38
Comment 75•14 years ago
|
||
Other than the addition of the cacheIOTarget attribute on nsICacheService, is there anything here that needs noting for developers? After poring over the code for a while, there doesn't really seem to be anything here to document.
Assignee | ||
Comment 76•14 years ago
|
||
(In reply to comment #75) > Other than the addition of the cacheIOTarget attribute on nsICacheService, is > there anything here that needs noting for developers? Maybe somewhere could be noted that using asyncOpenCacheEntry() is preferred to openCacheEntry() since it doesn't generate IO on the calling thread.
Comment 77•14 years ago
|
||
Added a note here: https://developer.mozilla.org/en/XPCOM_Interface_Reference/nsICacheSession#asyncOpenCacheEntry() Added doc for cacheIOTarget: https://developer.mozilla.org/en/XPCOM_Interface_Reference/nsICacheService
Keywords: dev-doc-needed → dev-doc-complete
You need to log in
before you can comment on or make changes to this bug.
Description
•