Closed Bug 269303 Opened 20 years ago Closed 15 years ago

if-modified-since sent even though vary: cookie indicates cached page is outdated.

Categories

(Core :: Networking: HTTP, defect)

x86
Linux
defect
Not set
minor

Tracking

()

RESOLVED DUPLICATE of bug 510359

People

(Reporter: spam_from_bugzilla, Unassigned)

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040719 Firefox/0.9.1
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040719 Firefox/0.9.1

I am observing what I consider the wrong behaviour when vary: cookie and if-modified-since are combined. I enclose an HTTP dump below that illustrates the problem, but the quick summary is the following:

You have never visited this site before so have nothing cached and no cookies.
You visit page P. It sends a vary: cookie response and a last-modified time. You cache this response.
You visit some other pages in the same site and one of them sets a cookie.
You visit page P again. Since you now have a cookie, and the page declares that cookies influence the content, you should fetch a fresh version of the page rather than using the cached version. Instead it seems that Mozilla sends an if-modified-since request. My server replies "not modified" since the content it would return to a visitor with the cookie has not changed since the specified date.

Any thoughts? HTTP dump follows.

Regards, Phil.

First page request. No cookies, nothing in the cache. Response includes vary: cookie and a last-modified time.
1084485888[809d868]: http request [
1084485888[809d868]: GET /treefic/work/treefic/test?a=tree_page HTTP/1.1
1084485888[809d868]: Host: andorra
1084485888[809d868]: User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040719 Firefox/0.9.1
1084485888[809d868]: Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
1084485888[809d868]: Accept-Language: en-us,en;q=0.5
1084485888[809d868]: Accept-Encoding: gzip,deflate
1084485888[809d868]: Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
1084485888[809d868]: Keep-Alive: 300
1084485888[809d868]: Connection: keep-alive
1084485888[809d868]: ]
1103780784[812bb98]: http response [
1103780784[812bb98]: HTTP/1.1 200 OK
1103780784[812bb98]: Date: Fri, 12 Nov 2004 01:54:23 GMT
1103780784[812bb98]: Server: Apache/2.0.48 (Debian GNU/Linux)
1103780784[812bb98]: Vary: Cookie
1103780784[812bb98]: Last-Modified: Fri, 12 Nov 2004 00:12:47 GMT
1103780784[812bb98]: Keep-Alive: timeout=15, max=100
1103780784[812bb98]: Connection: Keep-Alive
1103780784[812bb98]: Transfer-Encoding: chunked
1103780784[812bb98]: Content-Type: text/html
1103780784[812bb98]: ]

Various images and style sheets follow. Nothing to see here.
1084485888[809d868]: http request [
1084485888[809d868]: GET /treefic/work/treefic.css HTTP/1.1
1084485888[809d868]: Host: andorra
1084485888[809d868]: User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040719 Firefox/0.9.1
1084485888[809d868]: Accept: text/css,*/*;q=0.1
1084485888[809d868]: Accept-Language: en-us,en;q=0.5
1084485888[809d868]: Accept-Encoding: gzip,deflate
1084485888[809d868]: Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
1084485888[809d868]: Keep-Alive: 300
1084485888[809d868]: Connection: keep-alive
1084485888[809d868]: Referer: http://andorra/treefic/work/treefic/test?a=tree_page
1084485888[809d868]: ]
1103780784[812bb98]: http response [
1103780784[812bb98]: HTTP/1.1 200 OK
1103780784[812bb98]: Date: Fri, 12 Nov 2004 01:54:25 GMT
1103780784[812bb98]: Server: Apache/2.0.48 (Debian GNU/Linux)
1103780784[812bb98]: Last-Modified: Sat, 25 Sep 2004 14:24:03 GMT
1103780784[812bb98]: Etag: "bcb18-46a5-d8eca6c0"
1103780784[812bb98]: Accept-Ranges: bytes
1103780784[812bb98]: Content-Length: 18085
1103780784[812bb98]: Keep-Alive: timeout=15, max=100
1103780784[812bb98]: Connection: Keep-Alive
1103780784[812bb98]: Content-Type: text/css
1103780784[812bb98]: ]
1084485888[809d868]: http request [
1084485888[809d868]: GET /treefic/work/imgs/logo.png HTTP/1.1
1084485888[809d868]: Host: andorra
1084485888[809d868]: User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040719 Firefox/0.9.1
1084485888[809d868]: Accept: image/png,*/*;q=0.5
1084485888[809d868]: Accept-Language: en-us,en;q=0.5
1084485888[809d868]: Accept-Encoding: gzip,deflate
1084485888[809d868]: Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
1084485888[809d868]: Keep-Alive: 300
1084485888[809d868]: Connection: keep-alive
1084485888[809d868]: Referer: http://andorra/treefic/work/treefic/test?a=tree_page
1084485888[809d868]: ]
1103780784[812bb98]: http response [
1103780784[812bb98]: HTTP/1.1 200 OK
1103780784[812bb98]: Date: Fri, 12 Nov 2004 01:54:25 GMT
1103780784[812bb98]: Server: Apache/2.0.48 (Debian GNU/Linux)
1103780784[812bb98]: Last-Modified: Mon, 12 Jul 2004 12:52:13 GMT
1103780784[812bb98]: Etag: "a15a7-1889-d2679940"
1103780784[812bb98]: Accept-Ranges: bytes
1103780784[812bb98]: Content-Length: 6281
1103780784[812bb98]: Keep-Alive: timeout=15, max=99
1103780784[812bb98]: Connection: Keep-Alive
1103780784[812bb98]: Content-Type: image/png
1103780784[812bb98]: ]

Navigate to another page. Nothing special happens.

1084485888[809d868]: http request [
1084485888[809d868]: GET /treefic/work/treefic/test?a=timeline HTTP/1.1
1084485888[809d868]: Host: andorra
1084485888[809d868]: User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040719 Firefox/0.9.1
1084485888[809d868]: Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
1084485888[809d868]: Accept-Language: en-us,en;q=0.5
1084485888[809d868]: Accept-Encoding: gzip,deflate
1084485888[809d868]: Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
1084485888[809d868]: Keep-Alive: 300
1084485888[809d868]: Connection: keep-alive
1084485888[809d868]: Referer: http://andorra/treefic/work/treefic/test?a=tree_page
1084485888[809d868]: ]
1103780784[812bb98]: http response [
1103780784[812bb98]: HTTP/1.1 200 OK
1103780784[812bb98]: Date: Fri, 12 Nov 2004 01:55:02 GMT
1103780784[812bb98]: Server: Apache/2.0.48 (Debian GNU/Linux)
1103780784[812bb98]: Vary: Cookie
1103780784[812bb98]: Last-Modified: Fri, 12 Nov 2004 00:12:47 GMT
1103780784[812bb98]: Keep-Alive: timeout=15, max=100
1103780784[812bb98]: Connection: Keep-Alive
1103780784[812bb98]: Transfer-Encoding: chunked
1103780784[812bb98]: Content-Type: text/html
1103780784[812bb98]: ]

Send a POST request, which results in a cookie being set.
1084485888[809d868]: http request [
1084485888[809d868]: POST /treefic/work/treefic/test HTTP/1.1
1084485888[809d868]: Host: andorra
1084485888[809d868]: User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040719 Firefox/0.9.1
1084485888[809d868]: Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
1084485888[809d868]: Accept-Language: en-us,en;q=0.5
1084485888[809d868]: Accept-Encoding: gzip,deflate
1084485888[809d868]: Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
1084485888[809d868]: Keep-Alive: 300
1084485888[809d868]: Connection: keep-alive
1084485888[809d868]: Referer: http://andorra/treefic/work/treefic/test?a=timeline
1084485888[809d868]: ]
1103780784[812bb98]: http response [
1103780784[812bb98]: HTTP/1.1 200 OK
1103780784[812bb98]: Date: Fri, 12 Nov 2004 01:55:14 GMT
1103780784[812bb98]: Server: Apache/2.0.48 (Debian GNU/Linux)
1103780784[812bb98]: Vary: Cookie
1103780784[812bb98]: Set-Cookie: treefic_test_SessionID="89029931"; Version="1"
1103780784[812bb98]: Last-Modified: Fri, 12 Nov 2004 00:12:47 GMT
1103780784[812bb98]: Keep-Alive: timeout=15, max=99
1103780784[812bb98]: Connection: Keep-Alive
1103780784[812bb98]: Transfer-Encoding: chunked
1103780784[812bb98]: Content-Type: text/html
1103780784[812bb98]: ]

Now return to the original page. The cookie received above is sent, as well as what I consider to be an erroneous if-modified-since header.
1084485888[809d868]: http request [
1084485888[809d868]: GET /treefic/work/treefic/test?a=tree_page HTTP/1.1
1084485888[809d868]: Host: andorra
1084485888[809d868]: User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7) Gecko/20040719 Firefox/0.9.1
1084485888[809d868]: Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
1084485888[809d868]: Accept-Language: en-us,en;q=0.5
1084485888[809d868]: Accept-Encoding: gzip,deflate
1084485888[809d868]: Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
1084485888[809d868]: Keep-Alive: 300
1084485888[809d868]: Connection: keep-alive
1084485888[809d868]: Referer: http://andorra/treefic/work/treefic/test
1084485888[809d868]: Cookie: treefic_test_SessionID="89029931"
1084485888[809d868]: If-Modified-Since: Fri, 12 Nov 2004 00:12:47 GMT
1084485888[809d868]: ]
1103780784[812bb98]: http response [
1103780784[812bb98]: HTTP/1.1 304 Not Modified
1103780784[812bb98]: Date: Fri, 12 Nov 2004 01:55:22 GMT
1103780784[812bb98]: Server: Apache/2.0.48 (Debian GNU/Linux)
1103780784[812bb98]: Connection: Keep-Alive
1103780784[812bb98]: Keep-Alive: timeout=15, max=98
1103780784[812bb98]: Vary: Cookie
1103780784[812bb98]: ]

The server replies saying the page is not modified.

Reproducible: Always
Steps to Reproduce:
Mozilla interprets the Vary header to mean that the cached content needs to be validated before being used. Issuing a conditional request is meant as an optimization that lets the server respond with either a 304 or a 200 response. It would seem that the server should return a 200 response if the content is indeed a function of the given Cookie header(s). This bug sounds invalid to me.

From section 14.44 of RFC 2616:

  The Vary field value indicates the set of request-header fields that fully determines, while the response is fresh, whether a cache is permitted to use the response to reply to a subsequent request without revalidation.

Hence, I believe Mozilla's implementation is correct. Marking INVALID.
Status: UNCONFIRMED → RESOLVED
Closed: 20 years ago
Resolution: --- → INVALID
BTW, you might want to use an ETag instead of relying on the Last-Modified header value to discern entities. After all, if the entity depends on the value of the Cookie header, then you really have two (or more) different entities. So, why not create unique ETag values for each entity? That way, the browser will send you an If-None-Match header instead of an If-Modified-Since, and you will be able to look at the ETag value to determine whether or not it is safe to return 304.
Darin,

Thanks for the quick response. I don't agree with you, but thanks for being quick :-)

I have reviewed RFC2616 (esp. section 13.6) and it does seem to agree with Mozilla's behaviour. But surely this will lead to the "wrong" thing happening. For example, if I "vary: accept-language": if I fetch the EN version of a page, then change my preference to FR and fetch it again, I expect to see the FR version of the page. If Mozilla behaves as you suggest, it will do an if-modified-since fetch, discover that the FR version has not changed recently, and display the EN version again.

If you are right, then sites that use Vary: must never return 304 replies (unless, perhaps, they are also using Etags). Is this a fault with RFC2616? Time to find an appropriate mailing list...

Phil.
I think it is intentionally designed this way. The browser does not know if the FR version of the page exists. So, it tells the server what it has in its cache (by sending it an "If-None-Match: entity-tag" request header). Then, the server uses that to decide how to respond. This is another reason why entity-tags are better than last-modified timestamps for validating cache entries. If you use last-modified as the cache validator, then you are saying essentially that the URL is enough to uniquely identify the content. So, maybe Mozilla should bend the rules a bit and not send If-Modified-Since when validating due to a Vary header. But, I could turn it around and ask why the server bothers sending a 304 for a document that is dynamically selected based on some variable request header? IMO, the server should use ETags if it wants to send 304 responses sometimes. Otherwise, it should ignore conditional requests and serve the content with a 200 response.
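The server-side rule Darin describes (answer 304 only on an entity-tag match, otherwise serve the content with a 200) might be sketched as below. This is an illustration only, not code from any real server: `ConditionalHeaders` and `chooseStatus` are hypothetical names, an empty string stands for an absent header, and If-None-Match is compared as one exact tag rather than parsed as a list.

```cpp
#include <cassert>
#include <string>

// Hypothetical view of a request: only the two validators matter here.
// An empty string means the header was not sent.
struct ConditionalHeaders {
    std::string ifNoneMatch;      // value of If-None-Match
    std::string ifModifiedSince;  // value of If-Modified-Since
};

// Returns the HTTP status to send for the variant selected by this request.
// variantEtag is the ETag of that variant (empty if the server has none).
int chooseStatus(const ConditionalHeaders& req, const std::string& variantEtag) {
    // Only an entity tag identifies one specific variant, so only an exact
    // If-None-Match hit is answered with 304.
    if (!variantEtag.empty() && req.ifNoneMatch == variantEtag) {
        return 304;
    }
    // If-Modified-Since is deliberately ignored in this sketch: a date alone
    // cannot tell the variants of one URL apart.
    return 200;
}
```

With these definitions, `assert(chooseStatus(ConditionalHeaders{"\"v1\"", ""}, "\"v2\"") == 200);` holds, because the tag the client holds belongs to a different variant.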
Hi again, Certainly if you use Etags it should all just work. It needs more work in the server though. Last-modified with vary: looks broken-by-design. I will change my server code to use Etags (exclusively). Then will find that it breaks something else..... Thankfully I have the weekend before I need worry about this again. Phil.
*** Bug 341779 has been marked as a duplicate of this bug. ***
No, the resolution as "INVALID" here is incorrect -- see also bug 341779. The statement that "Last-modified with vary: looks broken-by-design" suggests a misunderstanding. It is true that the HTTP spec is a bit opaque on this subject, but hopefully the following should clarify:

Here's a thought experiment. Suppose you have a URL A which produces two different responses depending on whether a cookie x=1 is set, and which has a Last-Modified: time in the past, and neither page will ever change in the future. First consider this request:

<- GET /A HTTP/1.1
<- Host: A

and the server responds,

-> HTTP/1.1 200 OK
-> Content-Type: text/html
-> Vary: Cookie
-> Last-Modified: Sat, 1 Jan 2000 00:00:00 GMT
->
-> body-text-with-no-cookie-set

Clearly this result will never change. So, suppose that the client sends,

<- GET /A HTTP/1.1
<- Host: A
<- If-Modified-Since: Sat, 1 Jan 2000 00:00:00 GMT

Evidently the server may always validly send the response,

-> HTTP/1.1 304 Not Modified
-> Content-Type: text/html
-> Vary: Cookie
-> Last-Modified: Sat, 1 Jan 2000 00:00:00 GMT

because the result has not changed. Now consider the request,

<- GET /A HTTP/1.1
<- Host: A
<- Cookie: x=1

The server sends the response,

-> HTTP/1.1 200 OK
-> Content-Type: text/html
-> Vary: Cookie
-> Last-Modified: Sat, 1 Jan 2000 00:00:00 GMT
->
-> body-text-with-cookie-x=1-set

This is always the correct response to such a request (see assumptions above). Therefore, if the client sends the corresponding conditional request,

<- GET /A HTTP/1.1
<- Host: A
<- Cookie: x=1
<- If-Modified-Since: Sat, 1 Jan 2000 00:00:00 GMT

it is clearly correct for the server always to send the response,

-> HTTP/1.1 304 Not Modified
-> Content-Type: text/html
-> Vary: Cookie
-> Last-Modified: Sat, 1 Jan 2000 00:00:00 GMT

because the response to the non-conditional form of the request will never have changed.

Does this mean that Vary:... with If-Modified-Since: is broken? No, not at all.
The problem occurs *only* if the browser takes a cached copy of the response obtained without the cookie set and assumes that it is also a valid cached copy of the response that would have been obtained if the cookie had been set. This is a bug in Mozilla, which as Darin Fisher says above, interprets Vary as meaning "that the cached content needs to be validated before being used". This is true, if and only if you have cached content FOR THE REQUEST YOU ARE MAKING. You cannot take a response from one request, and assume that it is the correct response for another request that you have not made; if you do, you will come unstuck, which is what Mozilla does.
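The failure described in this comment can be reproduced with a toy model. The sketch below is not Mozilla's cache: `origin`, `UrlOnlyCache` and `fetch` are hypothetical names, dates are reduced to Unix timestamps, and the cache holds a single entry for the one URL in the thought experiment.

```cpp
#include <cassert>
#include <string>

struct Response {
    int status;
    std::string body;
};

// 1 Jan 2000 00:00:00 GMT as a Unix timestamp: the Last-Modified of both variants.
const long kLastModified = 946684800L;

// Toy origin server from the thought experiment: one URL, two variants selected
// by the Cookie header, neither of which ever changes.
// ifModifiedSince == 0 means the request was unconditional.
Response origin(const std::string& cookie, long ifModifiedSince) {
    if (ifModifiedSince != 0 && ifModifiedSince >= kLastModified) {
        // Correct for the variant selected by *this* request: it is unchanged.
        return Response{304, ""};
    }
    const bool hasCookie = (cookie == "x=1");
    return Response{200, hasCookie ? "body-text-with-cookie-x=1-set"
                                   : "body-text-with-no-cookie-set"};
}

// Toy client cache keyed on the URL alone, ignoring Vary: Cookie.
struct UrlOnlyCache {
    bool filled = false;
    std::string body;
};

// Fetch the page, revalidating whatever the cache holds for the URL.
std::string fetch(UrlOnlyCache& cache, const std::string& cookie) {
    const Response r = origin(cookie, cache.filled ? kLastModified : 0L);
    assert(r.status == 200 || r.status == 304);
    if (r.status == 304) {
        return cache.body;  // may be a different variant from the one requested
    }
    cache.filled = true;
    cache.body = r.body;
    return r.body;
}
```

Calling `fetch(cache, "")` and then `fetch(cache, "x=1")` on a fresh `UrlOnlyCache` returns `body-text-with-no-cookie-set` both times, which is the symptom reported here: the 304 was a correct answer about the cookie variant, but it was applied to the stored no-cookie variant.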
oh, two other brief points: firstly it doesn't matter whether you say Vary: Cookie or Vary: * -- Mozilla gets it wrong in both cases; secondly, for comparison, IE gets this right and Opera gets it wrong.
It's now a long time since I filed this bug and I've forgotten the details. But I suggest that "thought experiments" are less useful than carefully reading what the RFC says! My recollection is that I originally felt as you do about how it should work "in theory", but in practice that is not what the spec requires. It's all easy if you use Etags. --Phil.
The thought experiment is there to clarify what the RFC says, since, as I say, its own words are a bit opaque. You are correct that the existing Mozilla implementation does not match what the standard requires, but that means that Mozilla is wrong, not that the meaning of the RFC has changed.

However, let's risk further confusion by wading through the relevant bit of the RFC:

Firstly, what does a conditional GET with If-Modified-Since: mean?

| 14.25 If-Modified-Since
|
| The If-Modified-Since request-header field is used with a method to
| make it conditional: if the requested variant has not been modified
| since the time specified in this field, an entity will not be
| returned from the server; instead, a 304 (not modified) response will
| be returned without any message-body.

In English: If-Modified-Since: allows you to check whether a "variant" of an entity has changed since a given date; it will yield a 304 response if there has been no change.

What is a "variant"? s.1.3:

| variant
| A resource may have one, or more than one, representation(s)
| associated with it at any given instant. Each of these
| representations is termed a `varriant'[sic.]. Use of the term `variant'
| does not necessarily imply that the resource is subject to content
| negotiation.

A resource is "a thing identified by a URI"; a "variant" is "a thing identified by a URI and (perhaps) some other information". A "variant" and a "representation [of a resource]" are equivalent.

The Vary: header (s.14.44),

| The Vary field value indicates the set of request-header fields that
| fully determines, while the response is fresh, whether a cache is
| permitted to use the response to reply to a subsequent request
| without revalidation. For uncacheable or stale responses, the Vary
| field value advises the user agent about the criteria that were used
| to select the representation.
In English: if a response contains a Vary: header, then that header tells you which fields in your request were used to choose the specific variant of the resource that was sent to you; therefore, if any of those fields changes in a subsequent request, then you will get a different variant that time. So: when you send a request you get a "representation of a resource" -- a "variant". Which variant you get depends on the URI and on the fields in your request which were listed in the Vary: header of the response. An If-Modified-Since: header in a request allows you to test whether an old copy of a given variant -- NB NOT a resource -- is still valid. A 304 response to a conditional request for a particular resource tells you that the variant which would have been returned by a previous non-conditional request would still be valid; it does not tell you about any other variant or about the resource in general, and in particular a conditional request for one variant (say, the one you get sending a cookie) does not tell you anything about the validity of any other variant (for instance, the one you get not sending a cookie), and nor does it tell you whether the variant you have is the same as any other variant. Mozilla's behaviour here is wrong. It gets one variant, asks whether another has changed, and if told "no" shows the user the first variant again.
The counter-argument relies on the last two words of the paragraph cited in comment #1: The Vary field value indicates the set of request-header fields that fully determines, while the response is fresh, whether a cache is permitted to use the response to reply to a subsequent request without revalidation. The key thing is _revalidation_ : it says "permitted to use ... without revalidation", not "permitted to use ... at all". The implication is that it _is_ allowed to reuse the response as long as it revalidates it, which is what Moz does. There is ambiguity and complexity here. I can't see any holes in your reading of the RFC: certainly the use of "variant" in the definition that you cite of If-Modified-Since is interesting. Perhaps the best thing to do is to ask the people who wrote the RFC? Practically, I decided that I needed something that worked and I re-implemented my server-side code to use Etags, and it now works fine. You just need to hash together the last-modified time and the significant cookie values to generate an Etag, and return that with the response. One other comment: you mention that this works as you expect in IE. My experience is that IE is very, very pessimistic about caching; I don't think it will ever cache anything if it got a Vary header.
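The workaround Phil mentions (hash the last-modified time together with the significant cookie values) could look roughly like this. It is a sketch under stated assumptions, not his server code: `makeEtag` is a hypothetical name, and `std::hash` is used only for brevity. Its output is not guaranteed to be stable across standard libraries or builds, so a real server would want a fixed digest function instead.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>

// Hypothetical helper: derive an entity tag from the last-modified time and
// the cookie values that influence the page, so each variant gets its own tag.
std::string makeEtag(const std::string& lastModified,
                     const std::string& significantCookies) {
    // HTTP header values cannot contain a newline, so it is a safe separator
    // that keeps ("ab", "c") and ("a", "bc") from producing the same input.
    assert(lastModified.find('\n') == std::string::npos);
    const std::string material = lastModified + '\n' + significantCookies;

    // Assumption: std::hash stands in for a stable digest (see note above).
    const std::size_t digest = std::hash<std::string>()(material);

    std::ostringstream out;
    out << '"' << std::hex << digest << '"';  // entity tags are quoted strings
    return out.str();
}
```

The server would send the result in the ETag response header and compare it against If-None-Match on later requests. Two requests with the same last-modified time but different cookie values then carry different validators, which is what the Last-Modified header alone could not express.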
"permitted to use ... without revalidation", not "permitted to use ... at all" Your error is in thinking that a cached copy of one variant may be used to validate another. You cannot retrieve one variant, then ask questions about its validity and interpret them as giving you information about the validity of another variant (of which you do not even have a copy!). > One other comment: you mention that this works as you expect in IE. My > experience is that IE is very, very pessimistic about caching; I don't think it > will ever cache anything if it got a Vary header. yeah. My guess is that they read the RFC, realised they didn't understand it, and decided to make conservative assumptions which were correct, even if they were not as economical as possible. Being more charitable to Microsoft, it may be as simple as that they understood it where authors of other browsers did not.
I don't agree the RFC is poorly worded. There are very clear MUST-level rules governing this. How a cache is supposed to operate is defined in 13.6. In particular, it says:

"the cache MUST NOT use such a cache entry to construct a response to the new request unless all of the selecting request-headers present in the new request match the corresponding stored request-headers in the original request."

and

"the cache MUST NOT use a cached entry to satisfy the request unless it first relays the new request to the origin server in a conditional request and the server responds with 304 (Not Modified), including an entity tag or Content-Location that indicates the entity to be used."
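The first of these rules amounts to a comparison of the request headers named in Vary. A minimal sketch follows, assuming header names are stored lower-cased and that an absent header can be treated like an empty value. `Headers`, `lowerTrim` and `selectingHeadersMatch` are hypothetical names, and values are compared byte for byte, which is stricter than the whitespace normalisation the RFC allows.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Request headers keyed by lower-cased field name (an assumption of this sketch).
using Headers = std::map<std::string, std::string>;

// Strip surrounding spaces/tabs and lower-case a field name.
static std::string lowerTrim(const std::string& s) {
    const std::size_t first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return std::string();
    const std::size_t last = s.find_last_not_of(" \t");
    std::string out = s.substr(first, last - first + 1);
    std::transform(out.begin(), out.end(), out.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return out;
}

// True if every request header listed in the cached response's Vary value has
// the same value in the stored request and in the new request.
bool selectingHeadersMatch(const std::string& vary,
                           const Headers& storedRequest,
                           const Headers& newRequest) {
    std::istringstream fields(vary);
    std::string name;
    while (std::getline(fields, name, ',')) {
        name = lowerTrim(name);
        if (name.empty()) continue;
        if (name == "*") return false;  // "Vary: *" always fails to match
        const auto a = storedRequest.find(name);
        const auto b = newRequest.find(name);
        const std::string oldValue = (a == storedRequest.end()) ? std::string() : a->second;
        const std::string newValue = (b == newRequest.end()) ? std::string() : b->second;
        if (oldValue != newValue) return false;
    }
    return true;
}
```

In the case reported in this bug, the stored request has no Cookie header and the new one carries a cookie, so the function returns false and the cached entry is not the one selected for the new request.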
OK, I'm willing to reopen this bug and change the behavior to not attempt a cache validation when there is no ETag, provided someone can show me an existing website where this causes Firefox not to work. Otherwise, I'd rather remain consistent with our existing interpretation of RFC 2616.
See the example in bug 341779, with a test case, reproduced below:

Steps to Reproduce:
1. Go to http://caesious.beasts.org/~chris/cgi-bin/vary
2. Note the first line, which tells you whether a particular cookie is set
3. Click the button in the page
4. Click the link below the button

Actual Results:
The page loaded when the link is clicked is exactly the same as the one originally loaded.

Expected Results:
Firefox should have downloaded the new page.

The source for the CGI script above is here: http://caesious.beasts.org/~chris/tmp/20060616/vary

The request and response headers look like this:

1. First request (actually this was shift-reload):

GET /~chris/cgi-bin/vary HTTP/1.1
Host: caesious.beasts.org
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8) Gecko/20060116 Firefox/1.5
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://caesious.beasts.org/~chris/cgi-bin/vary
Pragma: no-cache
Cache-Control: no-cache

2. First response. The request is not conditional so the full response is sent; the body states that no cookie was received:

HTTP/1.1 200 OK
Date: Fri, 16 Jun 2006 16:21:39 GMT
Server: Apache/1.3.19 (Unix) mod_fastcgi/mod_fastcgi-SNAP-0404142202
Vary: Cookie
Last-Modified: Fri, 16 Jun 2006 00:00:00 GMT
Keep-Alive: timeout=15, max=95
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html

3. Second request sent, after clicking the "Set cookie" button in the page. The browser wrongly sends a conditional GET with the newly-set vary_test=1 cookie, even though it does not have a valid cached copy of the resource:

GET /~chris/cgi-bin/vary HTTP/1.1
Host: caesious.beasts.org
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8) Gecko/20060116 Firefox/1.5
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://caesious.beasts.org/~chris/cgi-bin/vary
Cookie: vary_test=1
If-Modified-Since: Fri, 16 Jun 2006 00:00:00 GMT

4. The server sends a correct 304 Not Modified response:

HTTP/1.1 304 Not Modified
Date: Fri, 16 Jun 2006 16:21:55 GMT
Server: Apache/1.3.19 (Unix) mod_fastcgi/mod_fastcgi-SNAP-0404142202
Vary: Cookie
Last-Modified: Fri, 16 Jun 2006 00:00:00 GMT
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/plain

What I suspect is happening here is that the cache is not keyed on the full set of URL + relevant request headers (i.e., those named in the Vary: header sent with the cached response). So when the user clicks the link to get the new version of the page, the cache is consulted and the browser wrongly concludes that it has a fresh cached copy, and all it needs to do is check with the server that it is still valid. The server, thinking that it is being asked, "is the copy of the resource with URL 'http://caesious.beasts.org/~chris/cgi-bin/vary' and a header 'Cookie: vary_test=1' still valid?", responds that it is.

Note that this doesn't just apply to 'Vary: Cookie' -- the same bug occurs with http://caesious.beasts.org/~chris/cgi-bin/vary2, which sends 'Vary: *'. Nevertheless the browser still sends a conditional GET request after the user clicks the link in the page.
NB this is *not* a duplicate of bug 94123, and the statement in the comment of Thomas Rutter that Mozilla handles this case correctly but inoptimally is not accurate (though it may have been in previous versions). More generally, "provided someone can show me an existing website, where this causes Firefox not to work" is a dangerous approach. Web application authors test their applications against different browsers (this is how I discovered that Mozilla is broken in this case); if a particular feature specified in the standard doesn't work in a common browser, as here, then you change the site not to use it, because it is not practical to replace all users' browsers. You cannot test whether something is a "valid" bug by looking for sites broken by it on the web, because it is very likely that the mere fact that the bug exists will have caused web site developers to work around the problem (inefficiently, in this case) so that it does not exhibit. The correct approach is to correctly implement the standard.
Chris: That's a testcase that you created right? How about an actual deployed website? Is this a real problem that causes Firefox not to work on real websites?
This will affect any website which uses (say) cookies for login, and supports If-Modified-Since:. However, because Mozilla is broken as described above, I would expect anyone who has tried to implement such a website to have suppressed support for such conditional GETs (from Mozilla at least), because otherwise users of that browser will get (at best) a confusing user experience or (at worst) a site that does not work at all. Developers who discover that Mozilla's support for conditional GET does not work are forced to handle conditional GETs as non-conditional GETs (thereby negating their advantages), because Mozilla is now prevalent enough that its users cannot be ignored, even though they are using a non-standards-compliant browser. That is why I provided a test case. Obviously even if you do fix this it will be some time before it is safe to implement conditional GET, because there will be a lot of broken copies of Mozilla out there for a long time from now, but it would be nice if at *some point* in the future this stuff started working again (from comments on another bug I understand that this feature used to work properly but has since been broken).
You're overstating the problem. Conditional requests that depend on cookie values work fine provided the server uses ETags appropriately. If there's a real website where this problem occurs, then it becomes a high priority bug to fix. Otherwise, we're just speculating that it is a problem.
gah! It's a SELECTION EFFECT! Because this feature of Mozilla is broken, nobody can use this feature in a website. But this feature is desirable because it saves bandwidth and improves performance. Thought experiment: suppose Mozilla's support for (rolls dice...) CSS was horribly broken, and so nobody used CSS in their website. Would the correct response of the maintainers to the report that that feature was broken be, "nobody uses that so we don't need to fix it"? When did policy on making Mozilla standards-compliant change, ooi?
fair enough... patches welcome
Status: RESOLVED → UNCONFIRMED
Resolution: INVALID → ---
-> default owner
Assignee: darin → nobody
Severity: normal → minor
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P5
(In reply to comment #16)
> How about an actual deployed website?

When I reported this in November 2004 there was an "actual deployed website" where it caused a problem - treefic.com. A user reported that sometimes they would log in yet would not be able to access certain functions. It took me many days to track the problem down and create the HTTP dump in the description at the top of this bug. After Darin's response in comment #1 I went away and recoded it to use Etags. Treefic.com is sadly no longer active.
Thanks for the info Phil!
Try this (not tested):

--- netwerk/protocol/http/src/nsHttpChannel.cpp.orig Tue Jun 20 22:35:38 2006
+++ netwerk/protocol/http/src/nsHttpChannel.cpp Tue Jun 20 22:36:36 2006
@@ -1443,7 +1443,7 @@
         }
     }
 
-    PRBool doValidation = PR_FALSE;
+    PRBool doValidation = PR_FALSE, varies = PR_FALSE;
 
     // Be optimistic: assume that we won't need to do validation
     mRequestHead.ClearHeader(nsHttp::If_Modified_Since);
@@ -1485,6 +1485,7 @@
     else if (ResponseWouldVary()) {
         LOG(("Validating based on Vary headers returning TRUE\n"));
         doValidation = PR_TRUE;
+        varies = PR_TRUE;
     }
     // Check if the cache entry has expired...
     else {
@@ -1562,7 +1563,7 @@
         const char *val;
         // Add If-Modified-Since header if a Last-Modified was given
         val = mCachedResponseHead->PeekHeader(nsHttp::Last_Modified);
-        if (val)
+        if (val && !varies)
             mRequestHead.SetHeader(nsHttp::If_Modified_Since,
                                    nsDependentCString(val));
         // Add If-None-Match header if an ETag was given in the response
What does the patch do? It looks to me as if it unconditionally fetches pages if they have a Vary: header. Is that right?
Not quite unconditionally -- If-None-Match: will still be sent if there is an entity tag.
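To make the effect of the patch concrete, here is a standalone sketch of the validator choice it produces. It does not use any Mozilla types: `CachedValidators` and `conditionalHeaders` are hypothetical stand-ins, an empty string means the cached response lacked that header, and `varies` mirrors the flag the patch adds.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the validators stored with a cached response.
// An empty string means the header was absent.
struct CachedValidators {
    std::string lastModified;
    std::string etag;
};

// Returns the conditional request headers to add when revalidating.
// varies is true when validation was triggered by a Vary mismatch.
std::map<std::string, std::string> conditionalHeaders(const CachedValidators& cached,
                                                      bool varies) {
    std::map<std::string, std::string> headers;
    // With the patch, a date validator is not sent for a varying response.
    if (!cached.lastModified.empty() && !varies) {
        headers["If-Modified-Since"] = cached.lastModified;
    }
    // The entity-tag validator is unaffected by the patch.
    if (!cached.etag.empty()) {
        headers["If-None-Match"] = cached.etag;
    }
    return headers;
}
```

For a cached entry with only a Last-Modified time and `varies == true`, the result is empty, so the request goes out unconditionally and the server has to answer with a full response. If an ETag was cached, If-None-Match is still sent.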
Chris: this will make Firefox not validate Vary objects without ETag, right? Or in other words, it is an optimization to avoid falling into the second-to-last paragraph of 10.3.5 304 Not Modified (which is a MUST-level condition, btw).

Still, there are loopholes where the cache won't operate per the RFC. This should be complemented with an equality check in the 304 processing (second-to-last paragraph of 10.3.5) to catch a number of unexpected cases, and a Content-Location check should be added to the above optimization to allow a conditional request if Content-Location is known.

In the equality tests below, not having a header computes as a blank value to simplify things.

  if (304_have_etag && old_etag == 304_etag)
      ok
  else if (old_etag != 304_etag)
      retry without conditional
  else if (304_have_content_location && old_content_location == 304_content_location)
      ok
  else if (old_content_location != 304_content_location)
      retry without conditional
  else
      ok

If it's possible to remember the previous request headers then the logic can be fine-tuned a bit to allow caching even if the server returns neither ETag nor Content-Location on Vary:ing responses, but I am not sure that would be a good thing...

All the above is based on the assumption of a simple cache with at most one object per URL. Shared caches like Squid, which I work with most, have a bit more to take into account.
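Henrik's equality tests translate almost line for line into code. The sketch below follows his stated convention that a missing header counts as a blank value. `EntityIds`, `Verdict` and `check304` are hypothetical names, and this is one reading of the pseudocode, not code from Firefox or Squid.

```cpp
#include <cassert>
#include <string>

// What to do with a 304 Not Modified reply.
enum class Verdict { UseCached, RetryUnconditional };

// Identifying headers of an entity; an empty string means "header absent".
struct EntityIds {
    std::string etag;
    std::string contentLocation;
};

// cached: headers stored with the cache entry.
// reply:  headers carried by the 304 response.
Verdict check304(const EntityIds& cached, const EntityIds& reply) {
    if (!reply.etag.empty() && cached.etag == reply.etag) {
        return Verdict::UseCached;            // same entity tag: same variant
    }
    if (cached.etag != reply.etag) {
        return Verdict::RetryUnconditional;   // the 304 is about another variant
    }
    // From here on neither side has an ETag; fall back to Content-Location.
    if (!reply.contentLocation.empty() &&
        cached.contentLocation == reply.contentLocation) {
        return Verdict::UseCached;
    }
    if (cached.contentLocation != reply.contentLocation) {
        return Verdict::RetryUnconditional;
    }
    return Verdict::UseCached;                // nothing to compare on either side
}
```

Note that a cached entry with an ETag and a 304 without one leads to a retry, since blank and non-blank values compare as different. The final branch, where neither header is present on either side, accepts the cached entry, exactly as in the pseudocode.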
Chris: Right. It is a somewhat simplified approach to the problem, and perhaps not exactly what the RFC had in mind, but it should mask nearly all of the problem cases and is not in any way a violation. In theory If-Modified-Since should be sent whenever validating an object whose modification time is known, but it's not required. If-None-Match takes higher priority anyway, and there is not very much gained from supporting Vary without ETag. So your optimization is very reasonable. Still, the RFC requires validation of the 304 response before accepting it as valid. This is to catch situations that occur when servers are upgraded to add Vary+ETag support to the content, and other corner cases.
*** Bug 341779 has been marked as a duplicate of this bug. ***
I've got a related problem over here. In some cases, my server returns
- Last-Modified
- ETag
- Expires: (5 minutes in the future)
According to LiveHttpHeaders and about:cache, Firefox gets the resource once (status 200) and recognizes the Expires header. If I re-access the resource while it's fresh, Firefox re-validates its cache entry (with If-None-Match, server returns 304), although it could have used the cached response (request headers are identical, after all). That's a bug, right?
(In reply to comment #30)
> If I re-access the resource while it's fresh, Firefox re-validates its cache
> entry (with If-None-Match, server returns 304), although it could have used the
> cached response (request headers are identical, after all).
>
> That's a bug, right?
Depends on how you re-access the resource. Certain GUI actions force a fresh copy. If it was a plain request made by following a link back to the resource, then it smells like a bug, yes, but in that case it's a different bug from the one this report is about. This bug report is about Firefox not honoring the Vary header properly, causing wrong content to be displayed if the server does not support If-None-Match (or when there is no ETag to use in If-None-Match).
I agree that the RFC is quite clear; here's the full sentence Henrik quotes part of in comment #13: "When the cache receives a subsequent request whose Request-URI specifies one or more cache entries including a Vary header field, the cache MUST NOT use such a cache entry to construct a response to the new request unless all of the selecting request-headers present in the new request match the corresponding stored request-headers in the original request."
Here are easy steps to show the bug:
* set Firefox to prefer English (Tools->Options->Advanced->Languages)
* clear the cache to make sure we have a clean slate
* fetch a web page that correctly sets Vary: Accept-Language, sends a Last-Modified header, and doesn't send an ETag - e.g. http://www.dracos.co.uk/about/
* change Firefox to prefer French
* fetch the same web page with a refresh
As the Accept-Language request-header has changed (from "en,fr;q=0.5" to "fr,en;q=0.5" here), it no longer matches the request-header in the original request, and Firefox "MUST NOT" use its cached copy, as per the RFC. However, it currently does, as you can see, and reshows the English page rather than the new French one. Chris's patch seems sensible to me, with no side effects that I can see, and would certainly improve Mozilla. Is there anything I can do to help get this implemented?
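To make the quoted rule concrete, here is a minimal C++ sketch of the comparison a cache would need, assuming it had kept the original request headers alongside the entry. The names `Headers`, `lowerTrim` and `varyMatches` are hypothetical and do not correspond to anything in nsHttpChannel.cpp. As simplifications, values are compared as exact strings and an absent header is treated as an empty value.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <sstream>
#include <string>

// Request headers keyed by lower-cased field name (an assumption of this sketch).
using Headers = std::map<std::string, std::string>;

// Strip surrounding spaces/tabs and lower-case a field name.
static std::string lowerTrim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t");
    std::string out = s.substr(first, last - first + 1);
    std::transform(out.begin(), out.end(), out.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return out;
}

// Returns true if the cached entry may be used for newRequest, i.e. every
// header named in the stored response's Vary value has the same value in the
// stored request and in the new request.
bool varyMatches(const std::string& varyValue,
                 const Headers& storedRequest, const Headers& newRequest) {
    std::istringstream fields(varyValue);
    std::string field;
    while (std::getline(fields, field, ',')) {
        field = lowerTrim(field);
        if (field.empty()) continue;
        if (field == "*") return false;  // "Vary: *" always fails to match
        const auto oldIt = storedRequest.find(field);
        const auto newIt = newRequest.find(field);
        const std::string oldVal = oldIt == storedRequest.end() ? "" : oldIt->second;
        const std::string newVal = newIt == newRequest.end() ? "" : newIt->second;
        if (oldVal != newVal) return false;
    }
    return true;
}
```

With the steps above, the stored request would carry "en,fr;q=0.5" and the new one "fr,en;q=0.5" for accept-language, so `varyMatches("Accept-Language", stored, current)` returns false and the cached English page must not be reused.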
Chris's patch reduces the value of the cache for Vary:ing objects without ETag. But it's a reasonable compromise without extending the cache to also store the relevant request headers (those listed in Vary). Extending the cache to also store the relevant request headers is required to be fully compliant, however (or alternatively to keep an invalidation timestamp, and making sure the cache is invalidated on actions which may change the request headers, but that does not feel like a good approach). Storing the headers also allows for cache validation of Vary:ing objects with Last-Modified only (no ETag). The example you provided is a good one for explaining the scope of the problem. After the user has changed his language preferences, the cache of language-dependent objects is not supposed to be considered valid, and any visit to such an object (even a fresh one) should cause the cache to be revalidated. It should not be required by the user to force a refresh in order to have the new language preferences reflected. To make that point even more obvious, consider the user closing his browser and returning a day later: since the object is still fresh in the cache, he still receives the old language version...
"It should not be required by the user to force a refresh in order to have the new language preferences reflected." - it doesn't matter if they do that; even a Ctrl-F5 in Firefox here sends an If-Modified-Since header, and so you still get the English version, no matter how much you want the French one. You have to actually clear the cache in order to get the other language.
> Chris's patch seems sensible to me, with no side effects that I can see
The side effect is increased server load in a quite common case. It really is necessary to do this properly, i.e. to put the relevant headers in the cache. Here's the common case: a site has a default appearance that is seen by 99% of visitors. The remaining 1% are subscribers who see a personalised version. Subscribers are identified by a cookie, so the site sends vary:cookie. If done properly, with any cookie header stored in the cache, the cache will hit until the cookie changes. If done as Chris suggests, the cache will never hit. So the server load is increased for all visitors, not just subscribers. Just to reiterate: if you're implementing a site, your life will be much simpler if you use etags.
If you need another test case (or live demo) of the Vary-ignoring bug, here's a great one: http://www.tradeups.net/user/admin
In case the page is down, here's a description of the chain of events:
1. Page /user/admin is requested with standard request headers, HTML page is returned.
2. A script on the page requests /user/admin with Accept: application/json in order to get a JS-readable version of the data. JSON is returned through the AJAX call.
3. View Source or Save Page gets you the JSON, not the HTML, since the JSON is the last to be cached.
All these requests return Vary: Accept in the response headers, but it doesn't help.
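One way to picture why the JSON overwrites the HTML: a cache keyed on the URL alone has a single slot for /user/admin. The C++ sketch below is a toy model, not Firefox's cache. The class `VariantCache` and the alias `RequestHeaders` are hypothetical, and it hard-codes Accept as the only varied header, which is an assumption made for this example. It keys each entry on the URL plus the request's Accept value so that both representations can coexist.

```cpp
#include <map>
#include <string>
#include <utility>

// Request headers keyed by lower-cased field name (an assumption of this sketch).
using RequestHeaders = std::map<std::string, std::string>;

// Toy cache whose key is (URL, Accept request-header value).
class VariantCache {
public:
    void store(const std::string& url, const RequestHeaders& request,
               const std::string& body) {
        entries_[std::make_pair(url, acceptOf(request))] = body;
    }

    // Returns a pointer to the cached body, or nullptr on a miss.
    const std::string* lookup(const std::string& url,
                              const RequestHeaders& request) const {
        const auto it = entries_.find(std::make_pair(url, acceptOf(request)));
        return it == entries_.end() ? nullptr : &it->second;
    }

private:
    // An absent Accept header is treated as an empty value.
    static std::string acceptOf(const RequestHeaders& request) {
        const auto it = request.find("accept");
        return it == request.end() ? std::string() : it->second;
    }

    std::map<std::pair<std::string, std::string>, std::string> entries_;
};
```

Storing the HTML response under the page's Accept value and the JSON response under "application/json" leaves two separate entries, so a later View Source lookup made with the page's original Accept value would find the HTML rather than the JSON.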
It turns out that a newer bug was filed and has a patch, so marking this as a duplicate of the newer bug.
Status: NEW → RESOLVED
Closed: 20 years ago → 15 years ago
Resolution: --- → DUPLICATE