718797 - Heuristic freshness calculation does not account for relaxation of rules in HTTPbis

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Reporter

Description

•

13 years ago

+++ This bug was initially created as a clone of Bug #468594 +++ This is a spin-off from bug #462243 addressing the problem of heuristic query-freshness. In that bug, we implemented the RFC 2616 requirement for validating cache entries that contain a query string: From http://tools.ietf.org/html/rfc2616#section-13.9: "We note one exception to this rule: since some applications have traditionally used GETs and HEADs with query URLs (those containing a "?" in the rel_path part) to perform operations with significant side effects, caches MUST NOT treat responses to such URIs as fresh unless the server provides an explicit expiration time. This specifically means that responses from HTTP/1.0 servers for such URIs SHOULD NOT be taken from a cache. See section 9.1.1 for related information." NOTICE however, that this is limited to ***HTTP/1.0*** servers and does not apply to HTTP/1.1 responses. HTTPbis has proposed to change this: From http://tools.ietf.org/html/draft-ietf-httpbis-p6-cache-18#section-2.3.1.1 "Note: RFC 2616 ([RFC2616], Section 13.9) required that caches do not calculate heuristic freshness for URIs with query components (i.e., those containing '?'). In practice, this has not been widely implemented. Therefore, servers are encouraged to send explicit directives (e.g., Cache-Control: no-cache) if they wish to preclude caching." We should comment on ietf-http-wg@w3.org that the query string restriction HAS been widely implemented (at least, it was implemented by us, which qualifies on its own as "widely"). We should either change our implementation to take the HTTP version into account (only do validation for 1.0, but not 1.1+). Or, if we think that is a bad idea, we should push back on the HTTPbis change ASAP with our rationale to have the change reverted. I do not know the HTTPbis timeline but this probably needs to be done ASAP. We should also investigate what IE8/9/10 (at least) do in this situation. See bug 703995 where a Patrick Müller demonstrated that this is a significant performance difference between Chrome and Firefox. https://bugzilla.mozilla.org/show_bug.cgi?id=703995

Bjarne (:bjarne)

Comment 1

•

13 years ago

(In reply to Brian Smith (:bsmith) from comment #0) > From http://tools.ietf.org/html/rfc2616#section-13.9: > > "We note one exception to this rule: since some applications have > traditionally used GETs and HEADs with query URLs (those containing a > "?" in the rel_path part) to perform operations with significant side > effects, caches MUST NOT treat responses to such URIs as fresh unless > the server provides an explicit expiration time. This specifically > means that responses from HTTP/1.0 servers for such URIs SHOULD NOT > be taken from a cache. See section 9.1.1 for related information." > > NOTICE however, that this is limited to ***HTTP/1.0*** servers and does not > apply to HTTP/1.1 responses. Fwiw, I don't agree with your interpretation. My understanding is that 1.1 responses MUST do a conditional request for such cached entries, but since 1.0 do not define conditional requests we cannot send one, and hence cannot use cached 1.0 entries at all (which we won't, because a 1.0 server will not understand the cond req and just return the resource again with a 200). > HTTPbis has proposed to change this: > > From > http://tools.ietf.org/html/draft-ietf-httpbis-p6-cache-18#section-2.3.1.1 > > "Note: RFC 2616 ([RFC2616], Section 13.9) required that caches do > not calculate heuristic freshness for URIs with query components > (i.e., those containing '?'). In practice, this has not been > widely implemented. Therefore, servers are encouraged to send > explicit directives (e.g., Cache-Control: no-cache) if they wish > to preclude caching." This is fine but doesn't really specify how a client should behave, right? :) It just states that servers should be more explicit. (Which will avoid the whole problem.) > We should comment on ietf-http-wg@w3.org that the query string restriction > HAS been widely implemented (at least, it was implemented by us, which > qualifies on its own as "widely"). Agreed.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Reporter

Comment 2

•

13 years ago

(In reply to Bjarne (:bjarne) from comment #1) > (In reply to Brian Smith (:bsmith) from comment #0) > Fwiw, I don't agree with your interpretation. My understanding is that 1.1 > responses MUST do a conditional request for such cached entries, but since > 1.0 do not define conditional requests we cannot send one, and hence cannot > use cached 1.0 entries at all (which we won't, because a 1.0 server will not > understand the cond req and just return the resource again with a 200). You are right. I misread the spec. > This is fine but doesn't really specify how a client should behave, right? > :) It just states that servers should be more explicit. (Which will avoid > the whole problem.) The effect is that the requirement "caches MUST NOT treat responses to such URIs as fresh unless the server provides an explicit expiration time" is REMOVED from the HTTP 1.1 specification, which means effectively that we could adopt Chrome's behavior and conform to HTTPbis vs RFC2616. If we think that HTTPbis is wrong here, and we think we must validate these cached responses, then we need to argue for HTTPbis to revert to the RFC 2616 requirement so that Chrome is considered to be non-compliant. As it is right now, Chrome is just as compliant with HTTPbis as we are (in this respect) AND has much better performance in this scenerio. Not a good situation. The other thing is that we could be more lenient with some things and not others. For example, we might use heuristic expiration for images, stylesheets, and/or scripts (and related things) either by mime type or by context (e.g. <img src=''>), but force validation for HTML documents, XHR, etc. If we were to do this, it would be difficult for the HTTP specification to say anything too concrete about it, but we should make sure the spec. is more in line with what we do.

Summary: Heuristic freshness calculation does not take into account HTTP version and does not account for relaxation of rules in HTTPbis → Heuristic freshness calculation does not account for relaxation of rules in HTTPbis

Julian Reschke

Comment 3

•

13 years ago

HTTPbis context: http://trac.tools.ietf.org/wg/httpbis/trac/ticket/211

Patrick McManus [:mcmanus]

Assignee

Comment 4

•

13 years ago

on this one choose fast over safe.

Jason Duell

Comment 5

•

13 years ago

If Chrome has been doing this for a while (do we know how long?), then I think it's safe to assume that HTTP/1.1 sites are either not returning transient pages with get + query string, or will fix themselves not to. So seems like we should take the HTTPBiz behavior.

u7736

Comment 6

•

13 years ago

(In reply to Bjarne (:bjarne) from comment #1) > My understanding is that 1.1 > responses MUST do a conditional request for such cached entries, but since > 1.0 do not define conditional requests we cannot send one, and hence cannot > use cached 1.0 entries at all (which we won't, because a 1.0 server will not > understand the cond req and just return the resource again with a 200). Conditional requests still work in 1.0; HTTP versioning isn't monolithic. > This is fine but doesn't really specify how a client should behave, right? > :) It just states that servers should be more explicit. (Which will avoid > the whole problem.) > > > We should comment on ietf-http-wg@w3.org that the query string restriction > > HAS been widely implemented (at least, it was implemented by us, which > > qualifies on its own as "widely"). > > Agreed. In that text, "widely" means "by many cache implementations" (i.e., it's not talking about market share). Still happy to talk about it in the WG, of course. Please do do quickly, however; we're trying to get to LC. While you're looking at cache-related issues, see also: http://trac.tools.ietf.org/wg/httpbis/trac/ticket/212 ... as it'd be good to confirm that works for you too.

Brian Smith (:briansmith, :bsmith, use NEEDINFO?)

Reporter

Updated

•

12 years ago

Depends on: 835079

Boris Zbarsky [:bzbarsky]

Comment 7

•

9 years ago

Well, it doesn't seem like we ever took this back to the WG. Should we change something about our implementation here?

Flags: needinfo?(mcmanus)

Patrick McManus [:mcmanus]

Assignee

Comment 8

•

9 years ago

I've confirmed that chrome will cache these documents - I'm going to make the change to match that behavior.

Flags: needinfo?(mcmanus)

Comment hidden (obsolete)

Patrick McManus [:mcmanus]

Assignee

Comment 10

•

9 years ago

Attached patch allow heuristic cache of query string resources (deleted) — Details — Splinter Review

Attachment #8671883 - Flags: review?(hurley)

Patrick McManus [:mcmanus]

Assignee

Updated

•

9 years ago

Assignee: nobody → mcmanus

Status: NEW → ASSIGNED

Patrick McManus [:mcmanus]

Assignee

Comment 11

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8a894e24def4

u408661

Comment 12

•

9 years ago

Comment on attachment 8671883 [details] [diff] [review] allow heuristic cache of query string resources Review of attachment 8671883 [details] [diff] [review]: ----------------------------------------------------------------- I, for one, welcome our new heuristically-valid overlords!

Attachment #8671883 - Flags: review?(hurley) → review+

Patrick McManus [:mcmanus]

Assignee

Comment 13

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/bb031a5714930c31e25b8c160277284bc540efb2 bug 718797 - allow heuristic cache of query string resources r=hurley

Carsten Book [:Tomcat]

Comment 14

•

9 years ago

https://hg.mozilla.org/mozilla-central/rev/bb031a571493

Status: ASSIGNED → RESOLVED

Closed: 9 years ago

status-firefox44: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla44

Bugzilla

Heuristic freshness calculation does not account for relaxation of rules in HTTPbis

Categories

(Core :: Networking: Cache, defect)

Tracking

()

People

(Reporter: briansmith, Assigned: mcmanus)

References

(
URL
)

Details

(Keywords: perf, Whiteboard: [HTTPbis][parity-Chrome])

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Updated

Comment 7

Comment 8

Comment 9

Comment 10

Updated

Comment 11

Comment 12

Comment 13

Comment 14

Attachment

General

Description

File Name

Content Type