Closed Bug 1032254 Opened 10 years ago Closed 9 years ago

Provide a way to pin resources in the http cache

Categories

(Core :: Networking: Cache, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking


RESOLVED FIXED
mozilla45
Tracking Status
firefox45 --- fixed

People

(Reporter: fabrice, Assigned: mayhemer)

References

(Depends on 2 open bugs, Blocks 2 open bugs)

Details

Attachments

(1 file, 22 obsolete files)

(deleted), patch, mayhemer: review+
To support the new way of installing apps by downloading packages from http(s), we need to make sure they are not evicted from the cache. That will guarantee that apps will work offline.
Obviously some overlap here with what we are planning for ServiceWorker cache.  There we chose not to use necko cache since we want quota manager support in addition to unlimited age.  Just FYI.
So, after talking with Fabrice on IRC, this is the list of requirements:

- first, we want to make it work only for "!//" URLs, so that the package zip is always "pinned" for an app
- reading from a pinned cached ZIP must be transparent (w/o telling the channel explicitly to do so)
- the storage path must be different than the http cache's (since the cache partition is too small)
- hence, *the "!//" is going to be our indication* that the URL wants to be stored pinned or loaded from the pinning storage
- the pinned cached entries must not be deleted with "Clear Recent History / Cache" or "Forget About This Site"
- pinned entries will only be deleted when the whole app goes away, otherwise persist indefinitely
- the storage will be under the "roaming" part of the profile
- private browsing is not a concern, but will probably be respected (simple to implement)
Regarding the API: when the channel is told to use a pinned cache (for read or write) it will use a new kind of storage I will add to nsICacheStorageService: pinningStorage(nsILoadContext).  For a normal web content load (appid == 0) that will use the normal disk cache.  For apps (appid > 0) it will pass down the information that the entry wants to be pinned or searched in the pinned storage, which will simply use a different path: <roaming-profile>/pinnedCache/entries etc.
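For illustration, a caller might use the proposed storage roughly like this (a minimal sketch only: the method is called pinningStorage above and pinningCacheStorage in the later patch, the contract ID is the existing cache2 service, and the surrounding variables are hypothetical):

// Hypothetical consumer of the proposed pinning storage; error handling trimmed.
nsCOMPtr<nsICacheStorageService> cacheStorageService =
  do_GetService("@mozilla.org/netwerk/cache-storage-service;1");

nsCOMPtr<nsICacheStorage> storage;
nsresult rv = cacheStorageService->PinningCacheStorage(
  loadContextInfo, getter_AddRefs(storage));
if (NS_SUCCEEDED(rv)) {
  // Entries opened via this storage are created in, or looked up from,
  // the pinned area instead of the regular disk cache.
  rv = storage->AsyncOpenURI(uri, EmptyCString(),
                             nsICacheStorage::OPEN_NORMALLY, callback);
}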
Assignee: nobody → honzab.moz
(In reply to Ben Kelly [:bkelly] from comment #1)
> Obviously some overlap here with what we are planning for ServiceWorker
> cache.  There we chose not to use necko cache since we want quota manager
> support in addition to unlimited age.  Just FYI.

There is no overlap.  This is different: SW cache lives in DOM.  This is for loading web content/app content transparently via http: URLs - no SW involved.
(In reply to Honza Bambas (:mayhemer) from comment #4)
> There is no overlap.  This is different: SW cache lives in DOM.  This is for
> loading web content/app content transparently via http: URLs - no SW
> involved.

I think that's a little strong.  The whole purpose of ServiceWorkers is to allow apps to persist offline.  So it seems very relevant to installing apps as mentioned in comment 0.

Also, Cache/CacheStorage will ultimately be available on main-thread and dedicated workers in addition to just ServiceWorkers.

Still, I agree that based on our conversation there is not much overlap here if pinning is relatively easy in http cache.  The main functional difference between the two approaches is that Cache/CacheStorage will live in QuotaManager owned space while http pinning will not.
(In reply to Ben Kelly [:bkelly] from comment #5)
> (In reply to Honza Bambas (:mayhemer) from comment #4)
> > There is no overlap.  This is different, SW cache lives in DOM.  This is for
> > loading web content/app content transparently via http: URLs - no SW
> > involved.
> 
> I think that's a little strong.  The whole purpose of ServiceWorkers is to
> allow apps to persist offline.  So it seems very relevant to installing apps
> as mentioned in comment 0.
> 
> Also, Cache/CacheStorage will ultimately be available on main-thread and
> dedicated workers in addition to just ServiceWorkers.

Yes, but that wants to use the Quota Manager, which the HTTP cache doesn't use and is not planned to.  There were more reasons to go that way anyway (I cannot recall all of them).

And AFAIK, this is already being coded the way I outlined, since it was decided a long time ago.

> 
> Still, I agree that based on our conversation there is not much overlap here
> if pinning is relatively easy in http cache.  The main functional difference
> between the two approaches is that Cache/CacheStorage will live in
> QuotaManager owned space while http pinning will not.

Yes, exactly.
Jonas also clarified for me that the current implementation plan is good.  Sorry for the noise.

Note, though, he wanted to make sure http cache pinning did not impact normal browser cache performance.  Is that ensured by this requirement from comment 2?

- the storage path must be different than the http cache's (since the cache partition is too small)
(In reply to Ben Kelly [:bkelly] from comment #7)
> Jonas also clarified for me that the current implementation plan is good. 
> Sorry for the noise.
> 
> Note, though, he wanted to make sure http cache pinning did not impact
> normal browser cache performance.  Is that ensured by this requirement from
> comment 2?
> 
> - the storage path must be different than the http cache's (since the cache
> partition is too small)

Yes, there will be zero performance impact.  But the channel must be told before AsyncOpen where to query the cache entry.  It can look either in the standard HTTP cache we have now or in the pinned storage for an app.  Not both.
Status: NEW → ASSIGNED
Attached patch wip1 (obsolete) (deleted) — Splinter Review
- slightly de-statics CacheFileIOManager
- now it's possible to have more than a single instance
- gInstance remains intact and does the same job/same lifetime
- other instances managed by CacheFileIOManager::Pinning helper static class are mapped per-app (only for appId > 0, at least for now)
- each points to a different directory: <roaming-profile>/cache2/<appid>/
- CacheHandle and some of the IO manager events are parametrized with CacheFileIOManager (no more use of gInstance directly)
- CacheIndex is not accessible/doesn't work for these pinning managers (it is not a requirement to selectively delete or visit, can be well done in a followup bug)
- special files (used by the index) are still only using gInstance
- eviction of the whole pinning-app-id cache is implemented as a simple trash of the whole <roaming-profile>/cache2/<appid>/ dir
- CacheFileUtils::ParseKey now returns additional params (beyond load context info) in a structure (that structure is intended to grow with any new storage kind we add)

If you ask why the pinning app cannot be carried on the load context info, here is the answer: load context info is a security isolation definer.  That definition has to apply to all storage forms we provide in the cache; it affects all of them.  Therefore I've decided to provide pinning as a new kind of storage instead (hence the new method on nsICacheStorageService), also because it actually is a new storage (regardless of the fact that it's implemented on top of the existing IO manager code).  Hence the pinning-app-id argument is only a parameter (indication) of the use of a different storage type that must not affect the other storage forms we have.
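To make the per-app instancing concrete, the Pinning helper might map appIds to managers roughly like this (a minimal sketch; the hashtable, GetPinningManager and InitForPinning names are illustrative assumptions, not the patch code):

// Illustrative only: one CacheFileIOManager instance per pinning appId.
static nsRefPtrHashtable<nsUint32HashKey, CacheFileIOManager> sPinningManagers;

static already_AddRefed<CacheFileIOManager>
GetPinningManager(uint32_t aAppId)
{
  MOZ_ASSERT(aAppId > 0); // pinning managers exist only for appId > 0, for now
  nsRefPtr<CacheFileIOManager> manager;
  if (!sPinningManagers.Get(aAppId, getter_AddRefs(manager))) {
    manager = new CacheFileIOManager();
    // Each instance points to a different directory:
    // <roaming-profile>/cache2/<appid>/
    manager->InitForPinning(aAppId);
    sPinningManagers.Put(aAppId, manager);
  }
  return manager.forget();
}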
Attachment #8481540 - Flags: feedback?(michal.novotny)
Note on the patch: it's known that the trash timer, when not fired, is going to leak.  I will revisit the added addref/release around this code in the next patch version.
Specifically what I was worried about was if pinned packages count towards the storage limits that the http cache has. I.e. if a package that is 100MB large gets pinned, that should not mean that we now have 100MB less available for "normal" http caching.
(In reply to Jonas Sicking (:sicking) from comment #11)
> Specifically what I was worried about was if pinned packages count towards
> the storage limits that the http cache has. I.e. if a package that is 100MB
> large gets pinned, that should not mean that we now have 100MB less
> available for "normal" http caching.

Pinned space *is not* counted against the limit for common downloaded HTTP web content.
Comment on attachment 8481540 [details] [diff] [review]
wip1

Review of attachment 8481540 [details] [diff] [review]:
-----------------------------------------------------------------

>    /**
> +   * Entries bound to an app will be persisted in a roaming part of the profile
> +   * and won't unpersist until the app goes away.  Common web content will use
> +   * disk cache storage, when under the private browsing mode, entries will be
> +   * held in memory and evicted on limit.
> +   */
> +  nsICacheStorage pinningCacheStorage(in nsILoadContextInfo aLoadContextInfo);

The description is a bit confusing to me. What are entries bound to the app? From bugs 1032254 and 1036275 I understand that we need to store only one or more zip files in such a storage. So what exactly will be stored in this storage, and who will set mPinCacheContent to true in nsHttpChannel, and when? Btw, clarification of this will probably make some of my further questions/comments invalid.

Anyway, I don't like the terminology. From the name of the method one could think that the storage could be used to store any entry permanently. I think something like appPermanentCacheStorage would be IMO less confusing and more self-explanatory. I just don't like the word "pinning", which evokes to me a situation where it is possible to selectively pin some entry in a storage that normally doesn't ensure any persistence.


> - CacheIndex is not accessible/doesn't work for these pinning managers (it
> is not a requirement to selectively delete or visit, can be well done in a
> followup bug)

I don't see in the patch how you ensure that pinning managers do not call any method on the CacheIndex. AFAICS they use CacheIndex which is wrong.


How many entries do we expect to be stored in the pinning managers? If the common case is just one zip file, then having one IO manager for each application seems like overhead to me.


Another quick thoughts:
- AFAICS, once the pinning IO manager is created it stays alive until shutdown. This needs some optimization.
- Some global limit of opened files (kOpenHandlesLimit) is needed.
- Overlimit eviction must be used only by the global IO manager.
Attachment #8481540 - Flags: feedback?(michal.novotny) → feedback+
(In reply to Michal Novotny (:michal) from comment #13)
> Comment on attachment 8481540 [details] [diff] [review]
> wip1
> 
> Review of attachment 8481540 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> >    /**
> > +   * Entries bound to an app will be persisted in a roaming part of the profile
> > +   * and won't unpersist until the app goes away.  Common web content will use
> > +   * disk cache storage, when under the private browsing mode, entries will be
> > +   * held in memory and evicted on limit.
> > +   */
> > +  nsICacheStorage pinningCacheStorage(in nsILoadContextInfo aLoadContextInfo);
> 
> The description is a bit confusing to me. What are entries bound to the app?
> From bugs 1032254 and 1036275 I understand that we need to store to such
> storage only one or more zip files. So what exactly will be stored in this
> storage and who and when will set mPinCacheContent to true in nsHttpChannel?
> Btw, clarification of this will probably make some of my further
> questions/comments invalid.

Actually, the method behavior may be a bit confusing itself, but it makes the caller code much simpler and is somewhat forward compatible - once we enable pinning even for common content, it will "just work".

> 
> Anyway, I don't like the terminology. From the name of the method one could
> think that the storage could be used to store permanently any entry. I think
> something like appPermanentCacheStorage would be IMO less confusing and more
> self explaining. 

I'm not against.

> I just don't like the word "pinning" which evokes to me a
> situation when it is possible to selectively pin some entry in the storage
> that normally doesn't ensure any persistence.

On the other hand, everybody has talked about this as "pinning" for ages, so it depends.

> 
> 
> > - CacheIndex is not accessible/doesn't work for these pinning managers (it
> > is not a requirement to selectively delete or visit, can be well done in a
> > followup bug)
> 
> I don't see in the patch how you ensure that pinning managers do not call
> any method on the CacheIndex. AFAICS they use CacheIndex which is wrong.

There is nothing like that - I mean, there is no code that would prevent it.  But since the pinning storage has a different context key, hashes will not conflict with the regular storage.

I'll check the code and see what calls pinning managers make on the index, how serious they are (mainly anything we could store there!), and for now block them.  It would be great, though, to also de-globalize the index - a followup of course.

> 
> 
> How much entries do we expect to be stored in the pinning managers? If the
> common case will be just 1 zip file then it seems to me as an overhead to
> have one IO manager for each application.

For now I believe there will be just a few files.  But I prefer to keep the code cleaner rather than pass the pinAppId argument everywhere (I tried and stopped after 30 minutes of figuring out all the places I would have to add it!).  In this case it makes perfect sense to have a new instance for each pinning appid, with all the handles/directory/trashing lists, rather than something inside a single global manager.  The manager instance is small enough and it's not expected that there will be 100+ apps anyway.  If we have smart cleanup code, we can get rid of unused managers (as you point out below) and still have well functioning and well _isolated_ code.  Also, I believe we can have more kinds of storage, and more IO manager customization may be needed in the future.  Hence isolating by instance is IMO better; the price is small.

> 
> 
> Another quick thoughts:
> - AFAICS, once the pinning IO manager is created it stays alive until
> shutdown. This needs some optimization.

As mentioned above, but I would prefer to do this in a followup.

> - Some global limit of opened files (kOpenHandlesLimit) is needed.

Very good point, should probably be done in the first patch.

> - Overlimit eviction must be used only by the global IO manager.

It should be - or have I missed that?

Seems like there is a need for a CacheFileIOManager.mPinningManager flag that would completely prevent it - is that so?
Flags: needinfo?(michal.novotny)
After talking to Jason, it seems this is a bug we don't need.  The original goal of pinning jar zips somewhere will be implemented a different way, and pinning in the http cache is not needed.

Let's leave the bug open with the patch attached; maybe one day it will actually be needed.
Flags: needinfo?(michal.novotny)
So, we still need it.  I'll update the patch as following:
- I'll leave the storage name, I still think it will have wider application in the future and the name fits the original requirement
- I'll have a global handles limit
- I'll implement pruning of unused managers in a followup bug (not that simple)
- Same for the index de-globalization, which actually depends on it
Attached patch v1 (obsolete) (deleted) — Splinter Review
- new "pin" attribute on nsICachingChannel
- CacheFileIOManager now keeps an mKind { GENERAL, PINNING } member (sketched below)
- added blocks on use of the index (index left global) based on mKind
- some metadata write schedule code changes reverted between wip1 and this patch
- trash timer now fires Notify(), distinction by m*Timer == aTimer checks, to avoid crashes/shutdown leak
- handle count limit is now global
- memory reporting (not individual) for the pinning managers too
- renaming some badly numbered cache2 xpcshell tests


https://tbpl.mozilla.org/?tree=Try&rev=b22421d0ea16
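As a rough illustration of the mKind distinction mentioned above (a sketch; the exact member placement and the IsPinning() helper are assumptions beyond what the patch excerpts below show):

class CacheFileIOManager {
public:
  enum EKind {
    GENERAL, // the global manager, gInstance
    PINNING  // per-app managers for pinned content
  };

  bool IsPinning() const { return mKind == PINNING; }

private:
  EKind mKind;
};

// Index use can then be blocked for pinning managers, e.g.:
//   if (mKind == PINNING) {
//     return; // CacheIndex stays exclusive to the general manager
//   }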
Attachment #8481540 - Attachment is obsolete: true
Attachment #8497690 - Flags: review?(michal.novotny)
Comment on attachment 8497690 [details] [diff] [review]
v1

Review of attachment 8497690 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/cache2/CacheFileIOManager.cpp
@@ +2028,5 @@
>    if (bytesWritten != -1 && aHandle->mFileSize < aOffset+bytesWritten) {
>      aHandle->mFileSize = aOffset+bytesWritten;
>  
>      if (!aHandle->IsDoomed() && !aHandle->IsSpecialFile()) {
>        uint32_t size = aHandle->FileSizeInK();

No need to get the size if the IO manager is pinning, i.e. put the mKind condition into the one above this line.

@@ +2777,5 @@
>  
>  // static
>  nsresult
>  CacheFileIOManager::EvictAll()
>  {

What's the reason for removing this comment?

@@ +3707,5 @@
>  {
>    MOZ_ASSERT(CacheFileIOManager::IsOnIOThreadOrCeased());
>    MOZ_ASSERT(!aHandle->mFD);
> +  MOZ_ASSERT(gInstance->mHandlesByLastUsed.IndexOf(aHandle) == mHandlesByLastUsed.NoIndex);
> +  MOZ_ASSERT(gInstance->mHandlesByLastUsed.Length() <= kOpenHandlesLimit);

We should make mHandlesByLastUsed static since only the one in the global IO manager is now used.

::: netwerk/cache2/CacheStorageService.cpp
@@ +1776,5 @@
>    NS_ENSURE_ARG(aStorage);
>  
> +  if (aStorage->IsPinning()) {
> +    return NS_ERROR_NOT_AVAILABLE;
> +  }

Shouldn't we implement this for pinning storage too rather than failing?
Attachment #8497690 - Flags: review?(michal.novotny) → feedback+
Depends on: 1121097
Honza, I'd like to get this landed. How can I help?
Flags: needinfo?(honzab.moz)
(In reply to Michal Novotny (:michal) from comment #18)
> >      if (!aHandle->IsDoomed() && !aHandle->IsSpecialFile()) {
> >        uint32_t size = aHandle->FileSizeInK();
> 
> No need to get the size if the IOmanager is pinning. I.e. put the mKind
> contidition to the one above this line.

Definitely.

> >  CacheFileIOManager::EvictAll()
> >  {
> 
> What's the reason for removing this comment?

No idea.  Leave it?

> > +  MOZ_ASSERT(gInstance->mHandlesByLastUsed.IndexOf(aHandle) == mHandlesByLastUsed.NoIndex);
> > +  MOZ_ASSERT(gInstance->mHandlesByLastUsed.Length() <= kOpenHandlesLimit);
> 
> We should make mHandlesByLastUsed static since only the one in the global IO
> manager is now used.

Probably yes.

> > +  if (aStorage->IsPinning()) {
> > +    return NS_ERROR_NOT_AVAILABLE;
> > +  }
> 
> Shouldn't we implement this for pinning storage too rather than failing?

It depends on having a separate index for the per-app pinned data.  So definitely not in this patch/bug.



(In reply to Valentin Gosu [:valentin] from comment #19)
> Honza, I'd like to get this landed. How can I help?

Rebase the patch, address the comments from Michal's review and give a new patch to me and him for review.

Please change the assignee field accordingly.
Flags: needinfo?(honzab.moz)
Attachment #8633518 - Flags: review?(michal.novotny)
Attachment #8497690 - Attachment is obsolete: true
Assignee: honzab.moz → valentin.gosu
Attachment #8633518 - Flags: review?(honzab.moz)
Attached patch interdiff.patch (obsolete) (deleted) — Splinter Review
Just a note - big fat note - that as the patch is written, we only pin stuff having an appid > 0 assigned.  If you want to pin anything (regardless of appId), then it needs some rethinking.
(In reply to Honza Bambas (:mayhemer) from comment #23)
> Just a note - big fat note - that as the patch is written, we only pin stuff
> having an appid > 0 assigned.  If you want to pin anything (regardless of
> appId), then it needs some rethinking.

I think that's a fair assumption. So if you have an appId > 0 you can pin stuff, which can be deleted when the app is uninstalled, but you can't do that with appId == 0. I don't know if that could be necessary for any other feature, but for packaged apps it seems ok.
Comment on attachment 8633518 [details] [diff] [review]
"Provide a way to pin resources in the http cache"

Review of attachment 8633518 [details] [diff] [review]:
-----------------------------------------------------------------

ok with me.

::: netwerk/cache2/CacheFileIOManager.cpp
@@ +898,1 @@
>                                                               mNewName);

nit: indentation (one line will do?)

::: netwerk/cache2/CacheFileIOManager.h
@@ +399,5 @@
>    // before we start an eviction loop.
>    nsresult UpdateSmartCacheSize(int64_t aFreeSpace);
>  
> +  static CacheFileIOManager           *gInstance;
> +  static nsTArray<CacheFileHandle *>   sHandlesByLastUsed;

hmm.. static.. not good.  this should be dynamically allocated.  probably a reason I made it the gInstance->m.. way.

::: netwerk/cache2/PinningCacheStorage.h
@@ +17,5 @@
> +    : CacheStorage(aInfo, true, false)
> +  {
> +  }
> +
> +  virtual bool IsPinning() const;

'override' should be here.
Attachment #8633518 - Flags: review?(honzab.moz) → review+
(In reply to Valentin Gosu [:valentin] from comment #24)
> (In reply to Honza Bambas (:mayhemer) from comment #23)
> > Just a note - big fat note - that as the patch is written, we only pin stuff
> > having an appid > 0 assigned.  if you want to pin anything, then it needs
> > some rethinking.
> 
> I think that's a fair assumption. So if you have an appId > 0 you can pin
> stuff, which can be deleted when the app is uninstalled, but you can't do
> that with appId == 0. I don't know if that could be necessary for any other
> feature, but for packaged apps it seems ok.

Jonas, do originAttributes change anything about the way we do cache pinning?
Flags: needinfo?(jonas)
Blocks: 1180093
Attachment #8633518 - Attachment is obsolete: true
Attachment #8633518 - Flags: review?(michal.novotny)
Sorry, had to back this out for bustage like https://treeherder.mozilla.org/logviewer.html#?job_id=12339447&repo=mozilla-inbound
Flags: needinfo?(valentin.gosu)
Valentin, I'm stealing this bug back.  This patch needs some rethinking for the new sec model.
Assignee: valentin.gosu → honzab.moz
Flags: needinfo?(valentin.gosu)
A couple of questions here.

First, I think Honza is right that this needs some changes to fit with the new security model. Specifically we should ideally use the OriginAttributes struct rather than manually handling appid/browserflag.

But that aside, it looks like this code is only keying things on appid, which isn't correct. At the very least we should use appid+browserflag as extra key in addition to the URL of the resource.


Second, will this enable us to pin a whole package into the http cache? I.e. if we make a request to the URL of the package, and then set the 'pin' attribute of that to true, will that enable us to make request to any resource within the package and know that there will be a cached resource?
Flags: needinfo?(jonas)
Also, if it matters, FirefoxOS currently only has requirements around pinning packages into the cache. However some of the features that we've talked about past the 2.5 release will involve pinning arbitrary other resources. It'd be great if that was possible here too, but definitely not required at this point.
(In reply to Jonas Sicking (:sicking) from comment #32)
> A couple of questions here.
> 
> First, I think Honza is right that this needs some changes to fit with the
> new security model. Specifically we should ideally use the OriginAttributes
> struct rather than manually handling appid/browserflag.
> 
> But that aside, it looks like this code is only keying things on appid,
> which isn't correct. At the very least we should use appid+browserflag as
> extra key in addition to the URL of the resource.
> 
> Second, will this enable us to pin a whole package into the http cache? I.e.
> if we make a request to the URL of the package, and then set the 'pin'
> attribute of that to true, will that enable us to make request to any
> resource within the package and know that there will be a cached resource?

With the current patch, you could only pin resources by setting the flag before loading the resource. Presumably, for packages, when the user pins them you could theoretically reload the package and pin it then, but that may not be the best way of doing it.

Honza may be able to improve the API during the OriginAttributes rewrite.

(In reply to Jonas Sicking (:sicking) from comment #33)
> Also, if it matters, FirefoxOS currently only has requirements around
> pinning packages into the cache. However some of the features that we've
> talked about past the 2.5 release will involve pinning arbitrary other
> resources. It'd be great if that was possible here too, but definitely not
> required at this point.

Pinning arbitrary resources is possible.
Flags: needinfo?(honzab.moz)
So, it seems the best would be to design the pinning system for arbitrary resources rather than to bind pinning to any kind of context identification.  Simply said - any resource can easily be pinned just by setting a flag (or something) on the http channel.

If we are OK with having two distinct caches, where a URL can be cached once pinned and once not pinned (two distinct cached resources), and we always know during the load which cache to look into, it should be easy to do.  Jonas?


We can go two ways:
1. extend entry keying and the current cache index database to add a "pinned" flag (sketched below)
2. deglobalize CacheIndex and have two CacheIOManagers (similar to the patch in this bug), one general and one pinned, working with a completely different directory

ad 1 pros: relatively easy to impl the basics
ad 1 cons: depends on bug 1081069, "clear all" much slower and needs to be rewritten (might be a lot of code)

ad 2 pros: generally a good code change, "clear all" is fast
ad 2 cons: CacheIndex needs to be deglobalized too (as CacheFileIOManager in the current patch), should not be that much work tho
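A minimal sketch of the "pinned" index flag from option 1 (the mask value is an assumption; a kPinnedMask and MarkPinned() of this shape do show up in the later v2 patch review):

// Illustrative bit in the index entry's flags word.
static const uint32_t kPinnedMask = 0x40000000;

bool IsPinned() const { return mFlags & kPinnedMask; }
void MarkPinned() { mFlags |= kPinnedMask; }

// Over-limit eviction can then simply skip pinned records:
//   if (record->mFlags & kPinnedMask) {
//     continue;
//   }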
Flags: needinfo?(honzab.moz) → needinfo?(jonas)
Depends on: 1192881
Are you saying that there's a "pin cache" and a "http cache"? And any time that we do a network request, we'd have to decide if we want necko to look in the "pin cache" or the "http cache"?

If so that doesn't really match what we need. My hope is that once a URL has been pinned, simply browsing to it in a normal browser window would use the pinned resource.

Is there a reason that makes implementation harder?

We wouldn't need any ability to "opt out of" using the pinned resource, it's fine that we'd always use it for any request.

One thing does occur to me though, which is that we actually want to use the pinned resource directly from cache even if the user has connectivity. I.e. if the user has internet connection, we'd want to use the pinned dialer, rather than trying to load a more recent version from the network when the user loads the URL of the dialer.
(In reply to Jonas Sicking (:sicking) from comment #36)
> Are you saying that there's a "pin cache" and a "http cache"? And any time
> that we do a network request, we'd have to decide if we want necko to look
> in the "pin cache" or the "http cache"?
> 
> If so that doesn't really match what we need. 

Jonas, I asked about this several times before.  Yes, that is what the design is about and you have already approved it; the current patch here has done the same thing for several months already.  Why is this now different?

What you want is simply what appcache does, though it has to have the info that it should be looked up as well - the "pin" info is put on the channel from above.

> My hope is that once a URL has
> been pinned, simply browsing to it in a normal browser window would use the
> pinned resource.

And how do we know to pin it?

> 
> Is there a reason that makes implementation harder?

Pretty much.  It does, and the whole process of opening a cache entry will probably slow down.  We will have to store the "pin" info in the index.  Files with pinned content need to be in a different (non-volatile) directory, so prior to opening we need to know where to look for them.  To look quickly in the right place we need the index "pin" flag.  That means that before we know whether a cache entry is pinned, we need to read the whole index first.  This will probably slow down loads during startup.

What's more, if we decide otherwise, days of my work get thrown away.

> 
> We wouldn't need any ability to "opt out of" using the pinned resource, it's
> fine that we'd always use it for any request.

It's impossible, of course.  The channel needs to know which cache it has to work with before it's asyncOpen'ed.  

> 
> One thing does occur to me though, which is that we actually want to use the
> pinned resource directly from cache even if the user has connectivity. I.e.
> if the user has internet connection, we'd want to use the pinned dialer,
> rather than trying to load a more recent version from the network when the
> user loads the URL of the dialer.

If we know a cache entry is pinned (or loaded from a pinning storage) it's very easy to fulfill this.
And, if you want to pin a cache entry later, it obviously means moving the file across file systems.  Not very optimal.

I can start implementing what you want; we can find some ways to make it reasonably efficient, if not quite as optimal as what we have now (with a lot of code glitches, probably).

But PLEASE, let's decide clearly what exactly you want before I start coding anything.
I'm trying to understand this in more detail - you're certainly the cache expert!

(In reply to Honza Bambas (not reviewing) (:mayhemer) from comment #37)
>
> probably slow down.  We will have to store the "pin" info in the index. 

From a high level it seems to me that the pin flag is only interesting during the purge or overwrite phases. Making that part of the index makes sense (especially when dealing with overwrite) - but is there a giant cost to doing that?

> Files with the pinned content need to be in a different directory

This is the part I don't understand. Obviously if they are in a different directory a lot of complications ensue, but why does this data need to be partitioned in another place?
(In reply to Patrick McManus [:mcmanus] from comment #39)
> I'm trying to understand this in more detail - you're certainly the cache
> expert!
> 
> (In reply to Honza Bambas (not reviewing) (:mayhemer) from comment #37)
> >
> > probably slow down.  We will have to store the "pin" info in the index. 
> 
> from a high level it seems to me that pin-flag is only interesting during
> the purge or overwrite phases. Making that part of the index makes sense
> (especially when dealing with overwrite) - but is there a giant cost to
> doing that?
> 
> > Files with the pinned content need to be in a different directory
> 
> this is the part I don't understand. Obviously if they are in a different
> directory a lot of complications ensue, but why does this data need to be
> partitioned in another place?


To sum (again):

- normal HTTP cache content (what we currently cache) is stored purposely in a volatile directory, on android it's /cache, on desktop (e.g. win) it's the local part of the profile (local == considered cache only) that can go away at any time *not just by user intention*
- the pinned cache must be stored in a different area, ideally the roaming part of the user's profile, to ensure it stays put and migrates (roams) with the profile on move or when stored on network drives (as I understand the pinning concept)
- the index file is currently stored along with the files in the volatile directory, where it logically belongs
- the index is considered stale but, well, can at any time be rebuilt from the files on disk (costly, slow, but not that slow)

One important difference is that with the model we have now, or the one I propose, the cache still works efficiently, simply, and correctly even w/o an index.

To locate the correct file (pinned or normal) the index must always be up to date and completely read (hundreds of kilobytes!).  So, to decide where to locate a file (is it in the persistent pinning area or in the normal volatile one?) we either have to wait for the index to be up and ready or try opening the file in both locations (pinned first of course - 99% of cases will fail).  This slows down loads soon after platform start and makes the code more complex.

I think the "use pinning to load" info should come from upper layers, as they can store it or get it more efficiently than the cache; but that is to be decided.

It all can be done, I just need to know exactly what to do before I start coding anything - I've already lost some 4 days...  Now I'm waiting for Jonas to make the call.




So, here are the questions for Jonas:

- can you know prior to a load start whether we should load a pinned resource rather than common web content?  If so, we can set a flag on a channel before it's AsyncOpen'ed - makes things easier

- how exactly are resources going to be pinned in the cache?  As I understand it, the packageappservice should instruct the loading channels to do so before AsyncOpen IMO; I would definitely avoid pin-after-already-loaded, which would have to be done above the HTTP level anyway by a LOAD_FROM_CACHE flagged refetch (we cannot just pin a cache entry with something like nsICacheEntry.pin = true; that would mean moving a *volatile*, so maybe already deleted, file across filesystems to the pinned storage area)
Flags: needinfo?(jonas)
Flags: needinfo?(jonas)
(In reply to Honza Bambas (not reviewing) (:mayhemer) from comment #40)
>
> 
> - normal HTTP cache content (what we currently cache) is stored purposely in
> a volatile directory, on android it's /cache, on desktop (e.g. win) it's the
> local part of the profile (local == considered cache only) that can go away
> at any time *not just by user intention*

I think it's worth asking the question of whether the pinned cache entry can just be an attribute on the regular cache. I'm not really excited about the separate namespaces here. If that's all we can do, then I guess that's the reality - but it's worth a couple more questions.

I think by volatile you mean that some process other than Firefox can delete it? It's possible this is tolerable for the pinning case. Jonas? It's possible the core goal is that normal expiration/frecency rules should not push cached entries out, but worrying about the OS is an unneeded optimization. If the network really doesn't provide a backup for the cache, then this isn't really a cache - right?

Alternatively (if two stores really are needed), the index can manage which one to look in, right? If the common case is loading a valid index, how long does that take? Worrying about a performance hit from a double lookup in that period might be overkill.
If the volatility is not a problem, then this bug is easier to do, right?  In that case we are also not blocked on reading the index.  Pinning mostly just stops the over-limit auto eviction, which is blocked until we have the index up.  OTOH, deleting all common web content will no longer be as simple as renaming cache2 -> trash.X; we will have to delete files one by one.
For FirefoxOS we don't have any need to have the "pinned" files in a separate directory. There's no other system which is going through and deleting files, nor do we need to worry about the user walking into the directory and nuking it.

(We don't actually even have a "clear cache" functionality anywhere I think, but I think we should assume that that will get added.)

(In reply to Honza Bambas (not reviewing) (:mayhemer) from comment #40)
> So, here are the questions for Jonas:
> 
> - can you know prior a load start whether we should load a pinned resource
> rather than a common web content?  if so, we can set a flag on a channel
> before it's AsyncOpen'ed - makes things easier

When we want to *pin* something, I always assumed that we would have to let necko know before we started the load. If for no other reason than that the resource might not end up in the cache due to being too big, or having some cache control headers set.

So yes, I think it's doable for us to indicate that we want to pin before we start the load. Worst case we can always reload once we know that we'll want to pin.

However it'd still be really nice if *loading* from pinned content doesn't require knowing that the URL that we're loading might have been pinned in the past.

It would be nice if there was an easy way to pin all resource in a loadgroup. For example by using a load flag, or by making it easy to use an nsILoadGroup.groupObserver.

> - how exactly are resources gonna be pinned in the cache?  as I understand,
> the packageappservice should instruct the loading channels to do so before
> AsyncOpen IMO ; I would definitely avoid pin-after-already-loaded, that
> would have to be done above HTTP level anyway by LOAD_FROM_CACHE flagged
> refetch (cannot just pin a cache entry with something like nsICacheEntry.pin
> = true, that would mean to move, across filesystems, a *volatile*, so maybe
> already deleted, file to the pinned storage area)

To be clear, no packageappservice will be involved, since this doesn't necessarily involve either packages or apps. Though sometimes the resource that we want to pin is indeed going to be a package, since that would presumably enable us to read any of the resources inside the package.

My thinking was that the way this would work is that we'd create a channel, somehow indicate that it should be pinned, and then call AsyncOpen.

Alternatively, we would navigate a docshell to a particular URL, and as we load both that document, flag all the channels that are created in relation to that page load that they should be pinned. I.e. flag the channel for the HTML as should be pinned, and flag all channels for sub resources, as should be pinned.
Flags: needinfo?(jonas)
The other thing I think I should clarify, is that for FirefoxOS 2.5, we don't need this bug at all. I.e. we could *just* enable pinning of packages, i.e. just do bug 1180092.

Pinning non-package resources is something that we'll want to do in later releases, but it's totally fine if we don't bother with it for now.
(In reply to Jonas Sicking (:sicking) from comment #43)
> For FirefoxOS we don't have any need to have the "pinned" files in a
> separate directory. There's no other system which is going through and
> deleting files, nor do we need to worry about the user walking in to the
> directory and nuking it.

OK, makes things easier.  Just won't be 100% reliable on android and probably under some circumstances involving roaming profiles.

> 
> (We don't actually even have a "clear cache" functionality anywhere I think,
> but I think we should assume that that will get added.)

It doesn't matter whether you want to clear pinned cache entries or not.  The thing is that this will change how we "clear all" commonly cached web content, so it's some amount of work anyway.

> 
> (In reply to Honza Bambas (not reviewing) (:mayhemer) from comment #40)
> > So, here are the questions for Jonas:
> > 
> > - can you know prior a load start whether we should load a pinned resource
> > rather than a common web content?  if so, we can set a flag on a channel
> > before it's AsyncOpen'ed - makes things easier
> 
> When we want to *pin* something, I always assumed that we would have to let
> necko know before we started the load.

That's what I intend to do.

> If for no other reason than that the
> resource might not end up in the cache due to being too big, or having some
> cache control headers set.

So, you want pinned resources to be unconditionally cached regardless of any limits?  What about no-store resources?

> 
> So yes, I think it's doable for us to indicate that we want to pin before we
> start the load. Worst case we can always reload once we know that we'll want
> to pin.

OK

> 
> However it'd still be really nice if *loading* from pinned content doesn't
> require knowing that the URL that we're loading might have been pinned in
> the past.

If we just have a "pinned" flag in the cache index, then it's doable.

> 
> It would be nice if there was an easy way to pin all resource in a
> loadgroup. For example by using a load flag, or by making it easy to use an
> nsILoadGroup.groupObserver.

This is probably doable (a followup bug).

> 
> > - how exactly are resources gonna be pinned in the cache?  as I understand,
> > the packageappservice should instruct the loading channels to do so before
> > AsyncOpen IMO ; I would definitely avoid pin-after-already-loaded, that
> > would have to be done above HTTP level anyway by LOAD_FROM_CACHE flagged
> > refetch (cannot just pin a cache entry with something like nsICacheEntry.pin
> > = true, that would mean to move, across filesystems, a *volatile*, so maybe
> > already deleted, file to the pinned storage area)
> 
> To be clear, no packageappservice will be involved. Since this doesn't
> necessarily involve neither packages or apps. Though sometimes the resource
> that we want to pin is indeed going to be a package since that would
> presumably enable us to read any of the resources inside the package.
> 
> My thinking was that the way this would work is that we'd create a channel,
> somehow indicate that it should be pinned, and then call AsyncOpen.
> 
> Alternatively, we would navigate a docshell to a particular URL, and as we
> load both that document, flag all the channels that are created in relation
> to that page load that they should be pinned. I.e. flag the channel for the
> HTML as should be pinned, and flag all channels for sub resources, as should
> be pinned.

And what about the HTML itself?  pinned later?  or refetched with do-pinning = true?



(In reply to Jonas Sicking (:sicking) from comment #44)
> The other thing I think I should clarify, is that for FirefoxOS 2.5, we
> don't need this bug at all. I.e. we could *just* enable pinning of packages,
> i.e. just do bug 1180092.

Hmm.. but it depends on this bug, right?

> 
> Pinning non-package resources is something that we'll want to do in later
> releases, but it's totally fine if we don't bother with it for now.

You need it anyway - resource as a resource.
Flags: needinfo?(jonas)
> OK, makes things easier.  Just won't be 100% reliable on android and
> probably under some circumstances involving roaming profiles.

There are no plans to use this code in Android or Desktop right now. Of course I can't guarantee that that won't change in the future, since I don't know what the Android/Desktop team plans are. But I also think it's hard to guess at what exact requirements those teams would have, so I would say that for now we should not worry about it.

> > If for no other reason than that the
> > resource might not end up in the cache due to being too big, or having some
> > cache control headers set.
> 
> So, you want pinned resources be unconditionally cached regardless any
> limits?  What about no-store resources?

Yes, we want it cached unconditionally. No matter how big the resource is, or if it is served with no-store headers.

> > However it'd still be really nice if *loading* from pinned content doesn't
> > require knowing that the URL that we're loading might have been pinned in
> > the past.
> 
> If we just have a "pinned" flag in the cache index, then it's doable.

Great!

> > It would be nice if there was an easy way to pin all resource in a
> > loadgroup. For example by using a load flag, or by making it easy to use an
> > nsILoadGroup.groupObserver.
> 
> This is probably doable (a followup bug).

Sounds good.

> > > - how exactly are resources gonna be pinned in the cache?  as I understand,
> > > the packageappservice should instruct the loading channels to do so before
> > > AsyncOpen IMO ; I would definitely avoid pin-after-already-loaded, that
> > > would have to be done above HTTP level anyway by LOAD_FROM_CACHE flagged
> > > refetch (cannot just pin a cache entry with something like nsICacheEntry.pin
> > > = true, that would mean to move, across filesystems, a *volatile*, so maybe
> > > already deleted, file to the pinned storage area)
> > 
> > To be clear, no packageappservice will be involved. Since this doesn't
> > necessarily involve neither packages or apps. Though sometimes the resource
> > that we want to pin is indeed going to be a package since that would
> > presumably enable us to read any of the resources inside the package.
> > 
> > My thinking was that the way this would work is that we'd create a channel,
> > somehow indicate that it should be pinned, and then call AsyncOpen.
> > 
> > Alternatively, we would navigate a docshell to a particular URL, and as we
> > load both that document, flag all the channels that are created in relation
> > to that page load that they should be pinned. I.e. flag the channel for the
> > HTML as should be pinned, and flag all channels for sub resources, as should
> > be pinned.
> 
> And what about the HTML itself?  pinned later?  or refetched with do-pinning
> = true?

Most likely the flow will be something like:

1. User browses to URL
2. Browser loads HTML and all sub-resources and displays page to user.
3. User likes page and decides to pin it.
4. User presses some UI to pin the page.
5. Browser reloads HTML and all sub-resources with pinning enabled.

I.e. we're likely going to have to reload the HTML as well as all resource. And for both the HTML and for all the sub-resources enable pinning on the channel before calling AsyncOpen.

> 
> (In reply to Jonas Sicking (:sicking) from comment #44)
> > The other thing I think I should clarify, is that for FirefoxOS 2.5, we
> > don't need this bug at all. I.e. we could *just* enable pinning of packages,
> > i.e. just do bug 1180092.
> 
> Hmm.. but it depends on this bug, right?
> 
> > 
> > Pinning non-package resources is something that we'll want to do in later
> > releases, but it's totally fine if we don't bother with it for now.
> 
> You need it anyway - resource as a resource.
Flags: needinfo?(jonas)
> (In reply to Jonas Sicking (:sicking) from comment #44)
> > The other thing I think I should clarify, is that for FirefoxOS 2.5, we
> > don't need this bug at all. I.e. we could *just* enable pinning of packages,
> > i.e. just do bug 1180092.
> 
> Hmm.. but it depends on this bug, right?

Yes, but we could technically build something which is package specific and thus not depend on this bug.

But I agree that it's technically better to do a generic resource pinning solution, and then treat packages as just another resource.

I just wanted to let you know what our requirements are and that we have other options if we wanted to.
I realized that we never discussed the requirements here. I put the requirements that we have for pinned packages in bug 1190290 comment 3. But since some of those requirements ideally would apply here too I thought I'd mention it here too.
No longer depends on: 1192881
Attached patch v2 (obsolete) (deleted) — Splinter Review
- a pinned cache entry is a "force valid" entry
- entries are marked "pinned" in the index and in a new entry metadata field "flags"
- metadata version is now 2, but we can read version 1 as well (tested)
- nsICachingChannel has a boolean "pin" attribute; set it to true to store new stuff in the pinning cache (usage sketched below)
- nsICacheStorageService has a new method pinningCacheStorage(loadcontextinfo) which returns a storage where new entries are marked as pinned
- the "pinned" info is carried from CacheStorage down to CacheEntry, CacheFile and even CacheHandle, then to CacheIndex via InitEntry
- pinned entries are omitted from over-limit eviction
- pinned entries are also left intact when nsICacheStorage.asyncEvict() and nsICacheStorageService.clear() is called
- clear all is made by context eviction with null load context info
- CacheFileContextEvictor can now accept a "null" context, which means "delete all contexts" while leaving pinned entries untouched; the "all contexts" persistent eviction file marker is "*" base64'ed
- getting info from the index on whether an entry is pinned is via CacheIndex::HasEntry, whose signature has been extended (maybe we should add a separate method, but I'm not sure because of all the state checks)
- a cache entry (to write to) is mandatory when an HTTP channel is set to pinning; on failure to get an entry the channel fails
- update to the tests, identical to previous patches

Not super-deeply tested, will do tomorrow.  But this is good enough to go through the first r rounds.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=5f9d438c9c5e
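For illustration, a consumer of the new pin attribute might look like this (a hedged sketch: only the nsICachingChannel "pin" attribute comes from the patch; channel and listener are hypothetical):

// Pinning must be requested before AsyncOpen.
nsCOMPtr<nsICachingChannel> cachingChannel = do_QueryInterface(channel);
if (cachingChannel) {
  // New content fetched by this channel is stored in the pinning cache.
  cachingChannel->SetPin(true);
}
nsresult rv = channel->AsyncOpen(listener, nullptr);
// Note: a cache entry is mandatory when pinning is requested; failure
// to get one fails the channel (see the description above).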
Attachment #8633520 - Attachment is obsolete: true
Attachment #8641463 - Attachment is obsolete: true
Attachment #8650055 - Flags: review?(michal.novotny)
Attached patch v2 -w (if useful) (obsolete) (deleted) — Splinter Review
Attached patch v2.1 (obsolete) (deleted) — Splinter Review
- fixed failing tests:
  1. missing "cacheservice:empty-cache" in the EvictByContext(nullptr) case
  2. missing carry of PINNED flag down to CacheFileHandle
- pinning storage's AsyncEvictStorage() now deletes the pinned data
- tests for this feature updated

https://treeherder.mozilla.org/#/jobs?repo=try&revision=d1d1e16a54f9
Attachment #8650055 - Attachment is obsolete: true
Attachment #8650421 - Attachment is obsolete: true
Attachment #8650055 - Flags: review?(michal.novotny)
Attachment #8657452 - Flags: review?(michal.novotny)
Attached patch v2.1 -w (for reference) (obsolete) (deleted) — Splinter Review
Comment on attachment 8657452 [details] [diff] [review]
v2.1

Review of attachment 8657452 [details] [diff] [review]:
-----------------------------------------------------------------

It looks good in general, but I think that passing the pinned status through the handle won't work correctly. CacheFileHandle::mPinned is true iff the CacheEntry was opened by pinningCacheStorage. If some pinned entry is opened by diskCacheStorage we'll treat this handle/entry as not pinned. Even if we update CacheFileHandle::mPinned immediately after parsing the entry, we'll always have a time window in which the handle will have an incorrect pinning status.

::: netwerk/cache2/CacheFile.cpp
@@ +218,5 @@
>                  CacheFileListener *aCallback)
>  {
>    MOZ_ASSERT(!mListener);
>    MOZ_ASSERT(!mHandle);
>  

aMemoryOnly and aPinned are mutually exclusive, so please add an assertion MOZ_ASSERT(!aPinned || !aMemoryOnly).

@@ +292,5 @@
>          LOG(("CacheFile::Init() - CacheFileIOManager isn't initialized, "
>               "initializing entry as memory-only. [this=%p]", this));
>  
>          mMemoryOnly = true;
> +        mMetadata = new CacheFileMetadata(mOpenAsMemoryOnly, mPinned, mKey);

If mPinned is true and we fail to open or create a file, is it OK to return success and work with a memory-only entry?

::: netwerk/cache2/CacheFileContextEvictor.cpp
@@ +100,5 @@
>      }
> +  } else {
> +    // Not providing load context info means we want to delete everything,
> +    // so let's not bother with any currently running context cleanups.
> +    mEntries.Clear();

Persisted eviction info on disk must be removed too.

::: netwerk/cache2/CacheFileIOManager.cpp
@@ +2832,5 @@
>    if (!mCacheDirectory) {
> +    // This is a kind of hack. Somebody called EvictAll() without a profile.
> +    // This happens in xpcshell tests that use cache without profile. We need
> +    // to notify observers in this case since the tests are waiting for it.
> +    nsRefPtr<EvictionNotifierRunnable> r = new EvictionNotifierRunnable();

We should notify observers only if aLoadContextInfo == nullptr. Maybe also if aPinned == false, but I'm not sure.

@@ +3270,3 @@
>  {
>    LOG(("CacheFileIOManager::InitIndexEntry() [handle=%p, appId=%u, anonymous=%d"
>         ", inBrowser=%d]", aHandle, aAppId, aAnonymous, aInBrowser));

Log aPinned too.

::: netwerk/cache2/CacheFileMetadata.cpp
@@ +882,5 @@
>    }
>  
>    mMetaHdr.ReadFromBuf(mBuf + hdrOffset);
>  
> +  if (mMetaHdr.mVersion == 1) {

When we update the metadata in the file we should use the latest version if possible, so we must change mMetaHdr.mVersion to kCacheEntryVersion somewhere. Probably here.

::: netwerk/cache2/CacheFileMetadata.h
@@ +63,5 @@
>      NetworkEndian::writeUint32(ptr, mLastModified); ptr += sizeof(uint32_t);
>      NetworkEndian::writeUint32(ptr, mFrecency); ptr += sizeof(uint32_t);
>      NetworkEndian::writeUint32(ptr, mExpirationTime); ptr += sizeof(uint32_t);
> +    NetworkEndian::writeUint32(ptr, mKeySize); ptr += sizeof(uint32_t);
> +    NetworkEndian::writeUint32(ptr, mFlags);

This will produce wrong metadata when mVersion == 1.

::: netwerk/cache2/CacheIndex.cpp
@@ +802,5 @@
>      if (updated) {
>        updated->Init(aAppId, aAnonymous, aInBrowser);
>        updated->MarkDirty();
> +      if (aPinned) {
> +        updated->MarkPinned();

Pinned status doesn't change during the entry's lifetime, so pass it as an argument to CacheIndexEntry::Init() like we do in the case of appId, anonymous and inBrowser.

@@ +1222,5 @@
>      if (IsForcedValidEntry(&hash)) {
>        continue;
>      }
>  
> +    if (index->mFrecencyArray[i]->mFlags & CacheIndexEntry::kPinnedMask) {

Make a static method bool CacheIndexEntry::IsPinned(CacheIndexRecord *aRec) to avoid direct flags usage in CacheIndex (see GetFileSize).
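For reference, the suggested helper could be as small as this (a sketch modeled on the GetFileSize pattern mentioned above, not the final patch code):

// static
bool
CacheIndexEntry::IsPinned(CacheIndexRecord *aRec)
{
  // Keeps the kPinnedMask detail inside CacheIndexEntry instead of
  // spreading direct flags usage through CacheIndex.
  return aRec->mFlags & kPinnedMask;
}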

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +433,5 @@
>  
>      if (NS_FAILED(rv)) {
>          LOG(("OpenCacheEntry failed [rv=%x]\n", rv));
> +
> +        // Cache entry is demnaded when we are instructed to pin.

demnaded -> demanded

@@ +3549,5 @@
>  {
>      mCacheEntriesToWaitFor &= ~WAIT_FOR_CACHE_ENTRY;
>  
> +    if (MOZ_UNLIKELY(mPinCacheContent) && NS_FAILED(aEntryStatus)) {
> +        // Cache entry is demnaded when we are instructed to pin.

demnaded -> demanded
Attachment #8657452 - Flags: review?(michal.novotny) → feedback+
Blocks: 1203113
(In reply to Michal Novotny (:michal) from comment #53)
> > +    // Not providing load context info means we want to delete everything,
> > +    // so let's not bother with any currently running context cleanups.
> > +    mEntries.Clear();
> 
> Persisted eviction info on disk must be removed too.

I need some advice on how to do this most effectively.
Attached patch v2.2 (obsolete) (deleted) — Splinter Review
- fixed review comments (thanks for them btw!), except the "delete all persistent info" (need advice on that)
- the pin flag is set to true on handles opening for an existing file (presumption of pinning) 
- the pin flag is updated by info from existing entries (after metadata parsed) in CacheFile::OnMetadataRead (on the file and on the handle) and also in CacheEntry::OnFileOpened
- we prevent fallback to 'memory only' on CacheFile when file open fails and pinning is demanded
Attachment #8657452 - Attachment is obsolete: true
Attachment #8657454 - Attachment is obsolete: true
Attachment #8658863 - Flags: review?(michal.novotny)
Attached patch v2.2 -w (obsolete) (deleted) — Splinter Review
Attached patch v2.1 -> v2.2 idiff (obsolete) (deleted) — Splinter Review
Blocks: 1206060
Attached patch v3 (obsolete) (deleted) — Splinter Review
Writing the description of the patch below actually helped me to simplify it :)  One of the reasons to write good comments and descriptions!


(-w patch follows)

- this in general just adds more code to the previously submitted patches
- some of the API consistency is broken: we don't ensure anymore that existing entries opened just before service.clear() are delivered; we only ensure that evicted entries are never delivered/leaked (which is actually the only unbreakable rule of the cache)
- "unknown pinning state" is a period between CacheEntry::AsyncOpen and CacheFileMetadata::ParseMetadata, information from the index is _not_ used (would make the code just more complicated, see the foot note)
- I've introduced a "deferred doom" functionality for CacheEntry and CacheFileHandle (see the sketch after this comment)
- they get to that state when they are to be doomed during the "unknown pinning state" period; note that dooming can happen for pinned, non-pinned, or both
- a handle is set its pinning state directly from CacheFileMetadata::ParseMetadata, which can doom the handle
- a cache entry is set a pinning state in its OnFileReady
- there are 2 places handling the deferred state:

1. an existing ("active") cache entry with an unknown pinning state may be doomed by a call to service.clear() or by a selective eviction of a cache storage.  In that case I insert an artificial Callback object into the callback chain on that entry that (instead of doing any notifications) recreates the entry when found pinned or non-pinned; the goal is to allow the entry to open for consumers opening it before the eviction and never give it to consumers opening the same entry after the eviction

2. there can be a context evictor active while the file is still there (evicting a lot of files, disk is slow).  WasEvicted now returns for which pin state (or both!) the file has been evicted.  The handle is "defer doomed" based on that info (see DoomFileInternal and SetPinned).  ParseMetadata then calls mHandle->SetPinned(mPinned).  When the handle is doomed by that operation, CacheFileMetadata does InitEmptyMetadata on itself and loads as 'new' - this is the place that makes the API no longer 100% consistent; there can be a race.

hopefully I didn't forget about some other situation where this all breaks...
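
To illustrate, a minimal self-contained sketch of the deferred-doom bookkeeping described above (hypothetical types and names, not the actual CacheEntry/CacheFileHandle code): a doom request arriving during the "unknown pinning state" period is recorded for the pin state it targets and applied once the real pin flag is known from parsed metadata.

enum class PinState { Unknown, Pinned, NonPinned };

struct DeferredDoomState {
  PinState pin = PinState::Unknown;
  bool doomPinnedRequested = false;     // doom requested for pinned entries
  bool doomNonPinnedRequested = false;  // doom requested for non-pinned entries
  bool doomed = false;

  // Eviction request; may arrive before the pinning state is known.
  void RequestDoom(bool forPinned) {
    if (pin == PinState::Unknown) {
      // Defer: just remember which pin state the doom targets.
      (forPinned ? doomPinnedRequested : doomNonPinnedRequested) = true;
      return;
    }
    if ((pin == PinState::Pinned) == forPinned) {
      doomed = true;  // state already known, doom immediately
    }
  }

  // Called once metadata is parsed and the real pin flag is known.
  void SetPinned(bool pinned) {
    pin = pinned ? PinState::Pinned : PinState::NonPinned;
    if ((pinned && doomPinnedRequested) ||
        (!pinned && doomNonPinnedRequested)) {
      doomed = true;  // the deferred doom fires now
    }
  }
};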

- introduced an nsICacheTesting "internal" interface allowing suspension of the IO thread execution in a deterministic way
- added new complex tests exercising the two code paths described above


https://treeherder.mozilla.org/#/jobs?repo=try&revision=df260698afbc


Footnote:
Why not use the index to prefill the pin state on CacheEntry?  When a CacheEntry's pinning state is known for existing files from the very start, such an entry will be removed from the master hashtables on eviction.  But a handle for it may still be active in IOman's hashtables, still opening and with an unknown pinning state.  An entry for the same key that may open right after the first entry was evicted would end up using (waiting for) the same in-progress handle.  That is unacceptable.  We would otherwise need complicated 'callback stopper' code on handles as well.  Way too over-complicated.
Attachment #8658863 - Attachment is obsolete: true
Attachment #8658864 - Attachment is obsolete: true
Attachment #8658866 - Attachment is obsolete: true
Attachment #8658863 - Flags: review?(michal.novotny)
Attachment #8664493 - Flags: review?(michal.novotny)
Comment on attachment 8664493 [details] [diff] [review]
v3

Review of attachment 8664493 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/cache2/CacheEntry.h
@@ +166,5 @@
>  
> +    // These are set only for a special-cased Callback instance inserted
> +    // to the callback chain.  When any of these is set and also any of
> +    // the corresponding flags on the entry is set, this callback will
> +    // indicate (via IsDeferredDoomedEntry()) the entry has to be

* via DeferDoom() doom

::: netwerk/cache2/CacheIndex.cpp
@@ +27,5 @@
>  #define kMinUnwrittenChanges   300
>  #define kMinDumpInterval       20000 // in milliseconds
>  #define kMaxBufSize            16384
>  #define kIndexVersion          0x00000001
> +#define kUpdateIndexStartDelay 10000 // in milliseconds

Left over from testing, will remove it.

::: netwerk/cache2/CacheStorageService.cpp
@@ -1803,5 @@
>    if (!entry->IsFileDoomed())
>      return;
>  
> -  if (entry->IsReferenced())
> -    return;

This was actually wrong.  Even a referenced entry should be removed from the master hashtable.  Not sure it belongs in this bug, though.
Attached patch v3 -w (obsolete) (deleted) — Splinter Review
(fixed)
Comment on attachment 8664493 [details] [diff] [review]
v3

Review of attachment 8664493 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/cache2/CacheFileMetadata.cpp
@@ +962,5 @@
> +  if (mHandle) {
> +    if (!mHandle->SetPinned(Pinned())) {
> +      LOG(("CacheFileMetadata::ParseMetadata() - handle has been doomed on setting up "
> +           "its pinning status [this=%p]", this));
> +      return NS_ERROR_FILE_CORRUPTED;

This is probably wrong... from now on we would write to a doomed file, so actually to "nowhere".

Could we just (instead of dooming) truncate and touch the file, i.e. reuse the code in CacheFileMetadata::InitEmptyMetadata()?

The actual code change would just be removing the DoomFileInternal call from CacheFileHandle::SetPinned(); I've locally checked that this works!
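
Roughly, assuming SetPinned() then only records the state and reports the mismatch (a sketch only, not the final patch):

nsresult
CacheFileMetadata::ParseMetadata(...)
{
  ...
  if (mHandle && !mHandle->SetPinned(Pinned())) {
    LOG(("CacheFileMetadata::ParseMetadata() - pinning mismatch, "
         "reinitializing metadata [this=%p]", this));
    // Instead of keeping a doomed file and writing to "nowhere",
    // truncate and start over; the entry then loads as 'new'.
    return InitEmptyMetadata();
  }
  ...
}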
(In reply to Honza Bambas (not reviewing) (:mayhemer) from comment #61)
> This is probably wrong... we will from now write to a doomed file, so
> actually "nowhere".  

One more try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2bed4e0ea80e
(In reply to Honza Bambas (not reviewing) (:mayhemer) from comment #62)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=2bed4e0ea80e

Green!  The |Android 4.3 API11+ debug M(13)| crash is very simple to fix; it will be addressed in the next version of the patch.
Attached patch v3.1 (obsolete) (deleted) — Splinter Review
This should pass the try run.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=dcd4a5fc483b
Attached patch v3 -> v3.1 (obsolete) (deleted) — Splinter Review
Attachment #8665603 - Flags: review?(michal.novotny)
Comment on attachment 8665491 [details] [diff] [review]
v3.1

Review of attachment 8665491 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, just following details need to be explained.

::: netwerk/cache2/CacheFile.cpp
@@ +602,5 @@
>    LOG(("CacheFile::OnMetadataRead() [this=%p, rv=0x%08x]", this, aResult));
>  
> +  // Update the "pinned" flag also on the handle.  It can actually just go from true
> +  // to false here, since until metadata are read we presume a pinned file.
> +  MOZ_ASSERT(mHandle);

The handle is updated in CacheFileMetadata::ParseMetadata(), or did I miss something?

::: netwerk/cache2/CacheFileContextEvictor.cpp
@@ +105,5 @@
> +    // Not providing load context info means we want to delete everything,
> +    // so let's not bother with any currently running context cleanups.
> +    for (uint32_t i = 0; i < mEntries.Length(); ++i) {
> +      if (mEntries[i]->mInfo && mEntries[i]->mPinned == aPinned) {
> +        RemoveEvictInfoFromDisk(mEntries[i]->mInfo, mEntries[i]->mPinned);

I don't understand this. We remove only some persistent info from the disk, but we remove all entries below?

::: netwerk/cache2/CacheIndex.h
@@ +656,5 @@
>  
>    // Returns a hash of the least important entry that should be evicted if the
>    // cache size is over limit and also returns a total number of all entries in
>    // the index minus the number of forced valid entries that we encounter
>    // when searching (see below)

The comment needs to be updated:

"minus the number of forced valid entries and unpinned entries"
Attachment #8665491 - Flags: feedback+
Attachment #8664493 - Flags: review?(michal.novotny)
Attachment #8665603 - Flags: review?(michal.novotny)
(In reply to Michal Novotny (:michal) from comment #66)
> Comment on attachment 8665491 [details] [diff] [review]
> v3.1
> 
> Review of attachment 8665491 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks good, just following details need to be explained.
> 
> ::: netwerk/cache2/CacheFile.cpp
> @@ +602,5 @@
> >    LOG(("CacheFile::OnMetadataRead() [this=%p, rv=0x%08x]", this, aResult));
> >  
> > +  // Update the "pinned" flag also on the handle.  It can actually just go from true
> > +  // to false here, since until metadata are read we presume a pinned file.
> > +  MOZ_ASSERT(mHandle);
> 
> The handle is updated in CacheFileMetadata::ParseMetadata(), or did I miss
> something?

I think this comment and the assertion are leftovers from previous designs of this patch.  This hunk can be removed.

> 
> ::: netwerk/cache2/CacheFileContextEvictor.cpp
> @@ +105,5 @@
> > +    // Not providing load context info means we want to delete everything,
> > +    // so let's not bother with any currently running context cleanups.
> > +    for (uint32_t i = 0; i < mEntries.Length(); ++i) {
> > +      if (mEntries[i]->mInfo && mEntries[i]->mPinned == aPinned) {
> > +        RemoveEvictInfoFromDisk(mEntries[i]->mInfo, mEntries[i]->mPinned);
> 
> I don't understand this. We remove only some persistent info from the disk,
> but we remove all entries below?

Ah, good catch.  Yes, only the entries we remove the persistent info for should be removed from the array.  I'll fix it.


Thanks!
Attached patch v3.1 -> 3.2 interdiff (obsolete) (deleted) — Splinter Review
Attachment #8664493 - Attachment is obsolete: true
Attachment #8664505 - Attachment is obsolete: true
Attachment #8665491 - Attachment is obsolete: true
Attachment #8665603 - Attachment is obsolete: true
Attachment #8671506 - Flags: review?(michal.novotny)
Attached patch v3.2 (obsolete) (deleted) — Splinter Review
Michal, the interdiff left for re-review here is pretty tiny.  If you can do it as the first review, we could land this (plus one more pending bug).  Thanks.
Attached patch v3.2 -> 3.3 interdiff (obsolete) (deleted) — Splinter Review
I've realized one thing (bug 1203113 comment 17): we cannot fail the cache entry open when OPEN_TRUNCATE or openTruncate() is used, pinning is demanded, and the file later fails to open.  There is no way to inform the consumer of the entry whether the data has been successfully written to disk or whether there has been a failure: the output stream can be opened, written, and closed before the handle is obtained in CacheFile.

For pinning, opening with truncation will be the prevalent case.

If there is demand for failure detection, we should fix that in a different bug.  I also think the deferred commit and the commit() callback may provide such functionality (or lay good groundwork for it).
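
A toy model of that ordering (made-up names; this is not the Gecko code, just an illustration of why a late open failure has nowhere to go):

#include <cstdio>
#include <functional>

int main() {
  // Nothing is registered anymore once the consumer closed the stream.
  std::function<void(bool)> consumerCallback;

  // 1. Consumer: open with truncation, write, close.  The data is
  //    buffered; the real file handle is still being opened elsewhere.
  std::printf("consumer: write + close done\n");

  // 2. Later, on the IO thread, the pinned-storage file open fails.
  bool openSucceeded = false;
  if (consumerCallback) {
    consumerCallback(openSucceeded);
  } else {
    std::printf("open failed, but the consumer is gone; failure is silent\n");
  }
  return 0;
}

Since the consumer's callback chain has already unwound by the time the open fails, only something like an explicit commit() callback could report the failure, hence the note above.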


This interdiff patch:
- removes the hard-fail code from CacheFile when pinning is demanded and the file fails to open asynchronously (a few 'else after return' instances fixed along the way)
- removes the same code from CacheEntry
- and also from nsHttpChannel
- nsHttpChannel::Connect doesn't need to AsyncAbort (actually must not); it's OK to just return a failure directly (this is already fixed in the deferred commit patch; carrying it here since I'm updating once more anyway)


https://treeherder.mozilla.org/#/jobs?repo=try&revision=be08ea3dc9ce

-w patch follows.
Attachment #8671968 - Flags: review?(michal.novotny)
Attached patch v3.2 -> 3.3 interdiff (-w) (obsolete) (deleted) — Splinter Review
Attached patch v3.3 (obsolete) (deleted) — Splinter Review
Attachment #8671509 - Attachment is obsolete: true
(In reply to Honza Bambas (not reviewing) (:mayhemer) from comment #72)
> - nsHttpChannel::Connect doesn't need to AsyncAbort (actually must not) ;
> it's OK to just return a failure directly (this is already fixed in the
> deferred commit patch, carrying it here when I'm updating once more anyway)

Please ignore this note.  The hunk was eventually removed completely.
(In reply to Honza Bambas (not reviewing) (:mayhemer) from comment #74)
> Created attachment 8671973 [details] [diff] [review]
> v3.3

https://treeherder.mozilla.org/#/jobs?repo=try&revision=47bd2401bf1a
Attachment #8671506 - Flags: review?(michal.novotny) → review+
Attachment #8671968 - Flags: review?(michal.novotny) → review+
Comment on attachment 8671973 [details] [diff] [review]
v3.3

Review of attachment 8671973 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/cache2/CacheFileContextEvictor.cpp
@@ +594,3 @@
>      if (handle) {
>        // We doom any active handle in CacheFileIOManager::EvictByContext(), so
>        // this must be a new one. Skip it.

I missed this in the previous review. This comment is no longer valid. Please update it with a description of how we ensure that the entry won't be reused when the doom was deferred.

@@ +601,5 @@
>  
> +    CacheIndex::EntryStatus status;
> +    bool pinned;
> +    rv = CacheIndex::HasEntry(hash, &status, &pinned);
> +    if (NS_FAILED(rv)) {

This should never happen because EvictEntries() is run only when the index is up to date, so CacheIndex::HasEntry() cannot fail; this should rather be MOZ_ASSERT(NS_SUCCEEDED(rv)).
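
In other words, something like this (sketch):

rv = CacheIndex::HasEntry(hash, &status, &pinned);
MOZ_ASSERT(NS_SUCCEEDED(rv),
           "EvictEntries() runs only when the index is up to date");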
Attachment #8671973 - Flags: feedback+
Blocks: 1217508
Attached patch v3.3 -> 3.4 interdiff (obsolete) (deleted) — Splinter Review
Attachment #8671506 - Attachment is obsolete: true
Attachment #8671968 - Attachment is obsolete: true
Attachment #8671969 - Attachment is obsolete: true
Attachment #8671973 - Attachment is obsolete: true
Attachment #8677611 - Flags: review?(michal.novotny)
Attached patch v3.4 (deleted) — Splinter Review
rebased on the current m-c
Attachment #8677611 - Flags: review?(michal.novotny) → review+
Attachment #8677612 - Flags: review+
Attachment #8677611 - Attachment is obsolete: true
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/80eff2b52d14
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
Depends on: 1220215
Removing the b2g 2.5 flag since this commit has been reverted due to an incorrect merge; sorry for the confusion.
Depends on: 1221758
No longer blocks: 1229817
Depends on: 1229817
Depends on: 1489817