Closed
Bug 718797
Opened 13 years ago
Closed 9 years ago
Heuristic freshness calculation does not account for relaxation of rules in HTTPbis
Categories
(Core :: Networking: Cache, defect)
Core
Networking: Cache
Tracking
()
RESOLVED
FIXED
mozilla44
Tracking | Status | |
---|---|---|
firefox44 | --- | fixed |
People
(Reporter: briansmith, Assigned: mcmanus)
References
()
Details
(Keywords: perf, Whiteboard: [HTTPbis][parity-Chrome])
Attachments
(1 file)
(deleted),
patch
|
u408661
:
review+
|
Details | Diff | Splinter Review |
+++ This bug was initially created as a clone of Bug #468594 +++
This is a spin-off from bug #462243 addressing the problem of heuristic query-freshness. In that bug, we implemented the RFC 2616 requirement for validating cache entries that contain a query string:
From http://tools.ietf.org/html/rfc2616#section-13.9:
"We note one exception to this rule: since some applications have
traditionally used GETs and HEADs with query URLs (those containing a
"?" in the rel_path part) to perform operations with significant side
effects, caches MUST NOT treat responses to such URIs as fresh unless
the server provides an explicit expiration time. This specifically
means that responses from HTTP/1.0 servers for such URIs SHOULD NOT
be taken from a cache. See section 9.1.1 for related information."
NOTICE however, that this is limited to ***HTTP/1.0*** servers and does not apply to HTTP/1.1 responses.
HTTPbis has proposed to change this:
From http://tools.ietf.org/html/draft-ietf-httpbis-p6-cache-18#section-2.3.1.1
"Note: RFC 2616 ([RFC2616], Section 13.9) required that caches do
not calculate heuristic freshness for URIs with query components
(i.e., those containing '?'). In practice, this has not been
widely implemented. Therefore, servers are encouraged to send
explicit directives (e.g., Cache-Control: no-cache) if they wish
to preclude caching."
We should comment on ietf-http-wg@w3.org that the query string restriction HAS been widely implemented (at least, it was implemented by us, which qualifies on its own as "widely").
We should either change our implementation to take the HTTP version into account (only do validation for 1.0, but not 1.1+). Or, if we think that is a bad idea, we should push back on the HTTPbis change ASAP with our rationale to have the change reverted. I do not know the HTTPbis timeline but this probably needs to be done ASAP.
We should also investigate what IE8/9/10 (at least) do in this situation.
See bug 703995 where a Patrick Müller demonstrated that this is a significant performance difference between Chrome and Firefox.
https://bugzilla.mozilla.org/show_bug.cgi?id=703995
Comment 1•13 years ago
|
||
(In reply to Brian Smith (:bsmith) from comment #0)
> From http://tools.ietf.org/html/rfc2616#section-13.9:
>
> "We note one exception to this rule: since some applications have
> traditionally used GETs and HEADs with query URLs (those containing a
> "?" in the rel_path part) to perform operations with significant side
> effects, caches MUST NOT treat responses to such URIs as fresh unless
> the server provides an explicit expiration time. This specifically
> means that responses from HTTP/1.0 servers for such URIs SHOULD NOT
> be taken from a cache. See section 9.1.1 for related information."
>
> NOTICE however, that this is limited to ***HTTP/1.0*** servers and does not
> apply to HTTP/1.1 responses.
Fwiw, I don't agree with your interpretation. My understanding is that 1.1 responses MUST do a conditional request for such cached entries, but since 1.0 do not define conditional requests we cannot send one, and hence cannot use cached 1.0 entries at all (which we won't, because a 1.0 server will not understand the cond req and just return the resource again with a 200).
> HTTPbis has proposed to change this:
>
> From
> http://tools.ietf.org/html/draft-ietf-httpbis-p6-cache-18#section-2.3.1.1
>
> "Note: RFC 2616 ([RFC2616], Section 13.9) required that caches do
> not calculate heuristic freshness for URIs with query components
> (i.e., those containing '?'). In practice, this has not been
> widely implemented. Therefore, servers are encouraged to send
> explicit directives (e.g., Cache-Control: no-cache) if they wish
> to preclude caching."
This is fine but doesn't really specify how a client should behave, right? :) It just states that servers should be more explicit. (Which will avoid the whole problem.)
> We should comment on ietf-http-wg@w3.org that the query string restriction
> HAS been widely implemented (at least, it was implemented by us, which
> qualifies on its own as "widely").
Agreed.
Reporter | ||
Comment 2•13 years ago
|
||
(In reply to Bjarne (:bjarne) from comment #1)
> (In reply to Brian Smith (:bsmith) from comment #0)
> Fwiw, I don't agree with your interpretation. My understanding is that 1.1
> responses MUST do a conditional request for such cached entries, but since
> 1.0 do not define conditional requests we cannot send one, and hence cannot
> use cached 1.0 entries at all (which we won't, because a 1.0 server will not
> understand the cond req and just return the resource again with a 200).
You are right. I misread the spec.
> This is fine but doesn't really specify how a client should behave, right?
> :) It just states that servers should be more explicit. (Which will avoid
> the whole problem.)
The effect is that the requirement "caches MUST NOT treat responses to such URIs as fresh unless the server provides an explicit expiration time" is REMOVED from the HTTP 1.1 specification, which means effectively that we could adopt Chrome's behavior and conform to HTTPbis vs RFC2616.
If we think that HTTPbis is wrong here, and we think we must validate these cached responses, then we need to argue for HTTPbis to revert to the RFC 2616 requirement so that Chrome is considered to be non-compliant. As it is right now, Chrome is just as compliant with HTTPbis as we are (in this respect) AND has much better performance in this scenerio. Not a good situation.
The other thing is that we could be more lenient with some things and not others. For example, we might use heuristic expiration for images, stylesheets, and/or scripts (and related things) either by mime type or by context (e.g. <img src=''>), but force validation for HTML documents, XHR, etc. If we were to do this, it would be difficult for the HTTP specification to say anything too concrete about it, but we should make sure the spec. is more in line with what we do.
Summary: Heuristic freshness calculation does not take into account HTTP version and does not account for relaxation of rules in HTTPbis → Heuristic freshness calculation does not account for relaxation of rules in HTTPbis
Comment 3•13 years ago
|
||
HTTPbis context: http://trac.tools.ietf.org/wg/httpbis/trac/ticket/211
Assignee | ||
Comment 4•13 years ago
|
||
on this one choose fast over safe.
Comment 5•13 years ago
|
||
If Chrome has been doing this for a while (do we know how long?), then I think it's safe to assume that HTTP/1.1 sites are either not returning transient pages with get + query string, or will fix themselves not to. So seems like we should take the HTTPBiz behavior.
(In reply to Bjarne (:bjarne) from comment #1)
> My understanding is that 1.1
> responses MUST do a conditional request for such cached entries, but since
> 1.0 do not define conditional requests we cannot send one, and hence cannot
> use cached 1.0 entries at all (which we won't, because a 1.0 server will not
> understand the cond req and just return the resource again with a 200).
Conditional requests still work in 1.0; HTTP versioning isn't monolithic.
> This is fine but doesn't really specify how a client should behave, right?
> :) It just states that servers should be more explicit. (Which will avoid
> the whole problem.)
>
> > We should comment on ietf-http-wg@w3.org that the query string restriction
> > HAS been widely implemented (at least, it was implemented by us, which
> > qualifies on its own as "widely").
>
> Agreed.
In that text, "widely" means "by many cache implementations" (i.e., it's not talking about market share). Still happy to talk about it in the WG, of course.
Please do do quickly, however; we're trying to get to LC.
While you're looking at cache-related issues, see also:
http://trac.tools.ietf.org/wg/httpbis/trac/ticket/212
... as it'd be good to confirm that works for you too.
Comment 7•9 years ago
|
||
Well, it doesn't seem like we ever took this back to the WG. Should we change something about our implementation here?
Flags: needinfo?(mcmanus)
Assignee | ||
Comment 8•9 years ago
|
||
I've confirmed that chrome will cache these documents - I'm going to make the change to match that behavior.
Flags: needinfo?(mcmanus)
Comment hidden (obsolete) |
Assignee | ||
Comment 10•9 years ago
|
||
Attachment #8671883 -
Flags: review?(hurley)
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → mcmanus
Status: NEW → ASSIGNED
Assignee | ||
Comment 11•9 years ago
|
||
Comment 12•9 years ago
|
||
Comment on attachment 8671883 [details] [diff] [review]
allow heuristic cache of query string resources
Review of attachment 8671883 [details] [diff] [review]:
-----------------------------------------------------------------
I, for one, welcome our new heuristically-valid overlords!
Attachment #8671883 -
Flags: review?(hurley) → review+
Assignee | ||
Comment 13•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/bb031a5714930c31e25b8c160277284bc540efb2
bug 718797 - allow heuristic cache of query string resources r=hurley
Comment 14•9 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox44:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
You need to log in
before you can comment on or make changes to this bug.
Description
•