Closed Bug 567360 Opened 15 years ago Closed 14 years ago

Pages using http-equiv imagetoolbar are only cached for 1 second

Categories

(Core :: Networking: HTTP, defect)

x86
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: jorrit, Assigned: byronm)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 4 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.53 Safari/533.4
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3

It appears that if the following appears in the <head> section of a document:

<meta http-equiv="imagetoolbar" content="no" />

And Cache-Control and Expires headers tell Firefox to cache the response for some time, Firefox will ignore this and instead cache only for one second. Note this http-equiv is used to turn off the image toolbar overlay in IE6 and 7, and appears in lots of webpages.

Note: I have also posted an article about this on my blog, which may or may not be more readable or clear: http://www.chainfire.eu/articles/77/Weird_Firefox_cache_issue_imagetoolbar_/



Reproducible: Always

Steps to Reproduce:
1. A sample PHP script that procudes the cache headers we want and the HTML is attached. May need some adaptation on your part, but you'll get the picture I'm sure. Removing the http-equiv makes the issue go away. Put it on a box somewhere.
2. Load it in Firefox
3. Note the "random number"
4. Click the link to refresh the page
5. GOTO 3
Actual Results:  
Every second the page is reloaded, instead of pulled from cache until it expires. You can see this by the random number changing. It initially looks like it reloads every other request (alternating GET from server and GET from cache), but if you click the link really fast it sometimes loads from cache multiple times. I suspect it reloads once per second.

Expected Results:  
It should have continued pulling the page from cache until the page expired, instead of reloading it once every second.

Using the random number method instead of Firebug to compare it to other browsers, they all behaved as expected (but do not have Firebug :))
Please provide a http log
https://developer-stage.mozilla.org/en/HTTP_Logging
Component: General → Networking: HTTP
Product: Firefox → Core
QA Contact: general → networking.http
Version: unspecified → 1.9.2 Branch
Or just a link to a site showing the problem?
Links to the code in action:

http://www.jongma.org/tmp/firefox/imagetoolbar-fail.php

http://www.jongma.org/tmp/firefox/imagetoolbar-ok.php

'fail' has the HTML code that offends in plain HTML, 'ok' has it inside an IE-specific conditional comment (which works around the issue).

The link mentioned above to provide a HTTP log doesn't work at the moment.
OK, I can definitely reproduce with the imagetoolbar-fail.php link on 1.9.2.  On trunk, the problem isn't present, and the fix range is http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=83c887dff0da&tochange=d6bb0f9e9519 so presumably due to enabling the html5 parser.

Digging into this now.
Aha.  Here's the issue.  In nsHttpChannel::FinalizeCacheEntry we have this code:

2633     if (mResponseHead && mResponseHeadersModified) {
2634         // Set the expiration time for this cache entry
2635         nsresult rv = UpdateExpirationTime();
2636         if (NS_FAILED(rv)) return rv;
2637     }

Now the UpdateExpirationTime function expects mRequestTime to be set to the request time, but that member is only set when hitting the network, not when reading from cache.  So on every other reload we'll load from cache, have an mRequestTime of 0, hit this code (due to the HTML content sink setting the "imagetoolbar" header on the HTTP channel), and in UpdateExpirationTime set the entry to expire "now".  Then the next reload we'll hit the network and set the expiration time correctly.  Then we'll hit cache, etc.

Possible solutions:

1)  Don't call into UpdateExpirationTime unless one of the headers actually
    relevant to expiration time got changed.
2)  Store the mRequestTime in the cache entry and read it back when reading
    from cache.
3)  Skip the UpdateExpirationTime thing when reading from cache(?).

Honza, Jason, thoughts?

Changing version to trunk, since extensions can still hit this if they change HTTP response headers.
Status: UNCONFIRMED → NEW
blocking2.0: --- → ?
Ever confirmed: true
Version: 1.9.2 Branch → Trunk
If possible, could you give an estimate as to which version of Firefox this bug appeared ?
All of them.  This code dates back to Mozilla 0.9.x or so (August 2001).
This sounds like something that could help with cache utilization on more than a few sites. Blocking, and over to Michal. Michal, can you please have a look at this before we release 1.9.3 final (later this year)?
Assignee: nobody → michal.novotny
blocking2.0: ? → beta1+
Blocks: http_cache
blocking2.0: beta1+ → beta2+
Byron says he'll have a crack at this one.
Assignee: michal.novotny → byronm
blocking2.0: beta2+ → betaN+
I looked into Boris' fix range. Turning on the HTML5 parser is what made the bad behavior go away, as he suggested. The changeset where we still see the bad behavior is 062a4fd597b8 and the changeset where it goes away is the next one, 358113b3642e. Changeset 358113b3642e turns on the HTML5 parser.

Boris, I followed your suggestion 3. If we are only reading from the cache, and not hitting the net at all, then UpdateExpirationTime will not be called. However, it will be called if we a) did not find the entry in the cache, or b) did find the entry but it was stale and had to be validated.

This modification makes the bad behavior go away on the last changeset (062a4fd597b8) for which it was visible. Given that we are not seeing the bad behavior on current builds, this seemed to be the best way to check that the problem is fixed. 

One thing though, is that I am not 100% confident on the semantics of when we should call UpdateExpirationTime in general. A couple of things that bother me include:

1) We call UpdateExpirationTime() from both OnStartRequest and OnStopRequest. Is this just because it is possible that we received response headers in between these two callbacks, and we want the cache entry's expiration data to be as up-to-date as possible before we close it? Or is there some other reason?

2) We call UpdateExpirationTime() when we validate a stale entry and receive a "304-Not Modified" status code. Is this correct? I'm guessing we do this just because the headers carrying this status code may have new expiration data, even though the content of the entry itself is the same?

I'd love any feedback on this (very small) patch, the questions in this post, or anything else that needs to be done to resolve this bug. 

Thanks!
Waiting for try server results before requesting review.
Generally OnStartRequest means we got all the server-sent headers.  But consumers could change headers on us after OnStartRequest; I assume that's why the OnStopRequest callsite is there.

For the 304, I think your hypothesis is also correct.

As far as the patch goes, UpdateExpirationTime only messes up if mRequestTime is not initialized, right?  It would make more sense to have the boolean indicate whether mRequestTime is initialized than to depend on the precise interplay of mCachedContentIsValid and mRequestTime.  For one thing, we may well hit the network even if mCachedContentIsValid, if what we have cached is a partial response.
(In reply to comment #13)

Boris, thanks very much for the feedback.

> It would make more sense to have the boolean
> indicate whether mRequestTime is initialized than to depend on the precise
> interplay of mCachedContentIsValid and mRequestTime.  For one thing, we may
> well hit the network even if mCachedContentIsValid, if what we have cached is 
> a partial response.

Right. In that case, would it make sense simply to add mRequestTime != 0 to the check that determines if we call UpdateExpirationTime?
Hmm...  I suppose we could, though I'd rather not rely on magic values that it might happen to be initialized to like that.
This addresses feedback from Comment 13. If we want to dedicate a separate boolean to whether mRequestTime is initialized or not, that is fine. I confirmed that this still eradicates the incorrect behavior on changeset 062a4fd597b8.
Attachment #460710 - Attachment is obsolete: true
(In reply to comment #15)
> Hmm...  I suppose we could, though I'd rather not rely on magic values that it
> might happen to be initialized to like that.

Boris, sorry, I posted the new patch before I saw your comment. I agree. While it seems unlikely that PRTimeToSeconds(PR_Now()) will return 0, better safe than sorry. I will change it such that we set a boolean indicating that mRequestTime is initialized, rather than simply checking that it does not equal zero.
Attachment #460776 - Attachment is obsolete: true
Comment on attachment 460788 [details] [diff] [review]
Uses a bool indicating if mRequestTime has been set rather than checking mRequestTime != 0

Very short patch resolving this bug, incorporating bz's comments. I think this is ready for review.
Attachment #460788 - Flags: review?(bzbarsky)
Comment on attachment 460788 [details] [diff] [review]
Uses a bool indicating if mRequestTime has been set rather than checking mRequestTime != 0

mRequestTimeInitialized should probably be part of the "state flags" bitset instead.  And fix the indent on that if condition?

r=bzbarsky with those.
Attachment #460788 - Flags: review?(bzbarsky) → review+
Attachment #460788 - Attachment is obsolete: true
Attachment #461476 - Flags: review+
Attached patch Correct formatting of patch. (deleted) — Splinter Review
Attachment #461476 - Attachment is obsolete: true
http://hg.mozilla.org/mozilla-central/rev/448f90927cca
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: