Closed Bug 1713941 Opened 3 years ago Closed 3 years ago

Firefox cache doesn't seem to handle link header arrays correctly in 304 transforms

Categories

(Core :: Networking: Cache, defect, P2)

Firefox 88
defect

Tracking

()

RESOLVED FIXED
92 Branch
Tracking Status
firefox92 --- fixed

People

(Reporter: aw248065, Assigned: valentin)

References

Details

(Whiteboard: [necko-triaged])

Attachments

(1 file)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:89.0) Gecko/20100101 Firefox/89.0

Steps to reproduce:

We setup a minimum repro case here: https://github.com/maxakuru/firefox-cache-bug where a server is spun up that returns a constant etag on some GET, serves 304 when it receives the etag as if-match, and returns multiple links headers.

To repro you

  1. clone the repo,
  2. run yarn && yarn start,
  3. then open http://localhost:2337 in Firefox

This example was tested on Firefox 88, 89, as well as nightly build 91.0a1 (2021-06-01) (64-bit) on MacOS

Actual results:

The links header containing a list of links is squashed in the cache to only the serve the last entry when the if-match hits and server returns 304

In the repro case, you see that the 2nd request (for which 304 is returned), the only the last link in the array (of three links) is in the cache.

Expected results:

I would have expected all the links of the array to be in the cache; in the example repro, I would have expected both the first request and the 2nd request to return the full links array.

Let's try Necko cache as an initial triage.

Component: Untriaged → Networking: Cache
Flags: needinfo?(jstutte)
Product: Firefox → Core

Just adding a comment to bump this bug, our app is blocked on FF due to it

Clearing needinfo for re-triage.

Flags: needinfo?(jstutte)

The contents of about:cache-entry?storage=disk&context=O^partitionKey=%28http%2C127.0.0.1%2C2337%29,&eid=&uri=http://127.0.0.1:2337/test on my test

Cache entry information

key: http://127.0.0.1:2337/test
fetch count: 12
last fetched: 2021-08-01 23:09:26
last modified: 2021-08-01 23:09:26
expires: 2021-08-01 23:09:26
Data size: 0 B
Security: This document does not have any security info associated with it.
strongly-framed: 1
request-method: GET

ctid: 1
uncompressed-len: 0
net-response-time-onstart: 97
net-response-time-onstop: 116

response-head:

HTTP/1.1 200 OK
etag: some-etag
Content-Length: 0
Date: Sun, 01 Aug 2021 21:09:26 GMT
X-Powered-By: Express
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: ETag,Link
link: <ref3>; param1=value1

original-response-headers:

X-Powered-By: Express
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: ETag,Link
etag: some-etag
link: <ref>; param1=value1
link: <ref2>; param2=value2
link: <ref3>; param1=value1
Date: Sun, 01 Aug 2021 21:09:20 GMT
Connection: keep-alive
Content-Length: 0

From what I can tell, what is happening here is that we do the first request and properly parse the headers.
The second request comes from the cache, but we still revalidate it via a 304 - which again returns the headers and we'll call UpdateHeaders on the cachedResponseHead. This will go through each of the headers in the 304 and overwrite the old ones, so we get left with only the last instance of the Link header.

Assignee: nobody → valentin.gosu
Severity: -- → S3
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Priority: -- → P2
Whiteboard: [necko-triaged]

Going through each of the headers in the 304 response and setting it on the
cached response head will cause us to only keep the last instance when
duplicate header names are present.

We should instead use the GetHeader method which properly handles merging the
response.

Pushed by valentin.gosu@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/e6a29acdaca7
Properly handle duplicated headers in a 304 revalidation r=necko-reviewers,dragana
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 92 Branch

Hi, I was wondering if this fix was in the current nightly build? I installed nightly build a couple days ago (and again today because I thought maybe I got a build before it went in). At first I thought it was fixed because in the console when looking at the HEAD request, it showed response headers with:
link: <ref>; param1=value1;
link: <ref2>; param2=value2
link: <ref3>; param1=value1;

but trying to get the link header with xhr.getResponseHeader("link"), still only returned <ref3>; param1=value1;
When I look at the HEAD request info in the Firefox console for the 304, is that showing the original response? (asking because I don't know)

When I look at the disk cache entry information in about:cache-entry?storage=disk&context=O^partitionKey=%28http%2C127.0.0.1%2C2337%29,&eid=HEAD&uri=http://127.0.0.1:2337/test which looks similar to what valentin.gosu@gmail.com saw originally?

Cache entry information

key: http://127.0.0.1:2337/test
fetch count: 8
last fetched: 2021-08-06 18:07:26
last modified: 2021-08-06 18:07:26
expires: 2021-08-06 18:07:26
Data size: 0 B
Security: This document does not have any security info associated with it.
necko:classified: 1
strongly-framed: 1
request-method: HEAD

response-head:

HTTP/1.1 200 OK
etag: some-etag
Date: Sat, 07 Aug 2021 01:07:26 GMT
X-Powered-By: Express
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: ETag,Link
link: <ref3>; param1=value1;

original-response-headers:

X-Powered-By: Express
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: ETag,Link
etag: some-etag
link: <ref>; param1=value1;
link: <ref2>; param2=value2
link: <ref3>; param1=value1;
Date: Sat, 07 Aug 2021 01:03:39 GMT
Connection: keep-alive
Keep-Alive: timeout=5
ctid: 1
uncompressed-len: 0
net-response-time-onstart: 10
net-response-time-onstop: 11

I was looking at the fix and source code, just trying to understand how things work. Not familiar at all with it so I have just basic questions on how it comes together valentin.gosu@gmail.com (and thank you for your work on the fix)

How does using GetHeader after PeekHeaderAt help to merge instances when duplicate header names are present? I'm sure I've missed something but when I was looking at them they appeared to both look for a specific header in mHeaders and give back the value of the entry in mHeaders.

You mentioned merging the response, does the merge parameter in SetHeader_locked have anything to do with the merging you were talking about or is it just totally not related to that? I saw that the comment above it said it would overwrite the current header value with the new value.

Got latest version of nightly, and now seeing the three links in the response head, not just original header. So I guess when I first testing with the nightly build it didn't have the change in it yet. Thank you again!

I'm still just generally curious about how using GetHeader instead of just PeekHeaderAt merges the response, if you don't mind me asking, but let me know if there's a place I can just ask these questions about the code in general, instead of here..

Previously aOther->mHeaders.PeekHeaderAt(i, header, headerNameOriginal); would get the return the value of the

So, previously you'd have a cache entry with the merged header Link: <ref>; param1=value1, <ref2>; param2=value2, <ref3>; param1=value1
You'd get a 304 response and merge the headers in UpdateHeaders, so for each one we'd call:
aOther->mHeaders.PeekHeaderAt(i, header, headerNameOriginal); and get the unmerged value and set it, so we'd only keep the last one.
Instead we use PeekHeaderAt to only get the name, and at each iteration we set the value of Link to what nsHttpHeaderArray::GetHeader returns. This method calls LookupEntry which according to the comment always returns "merged" headers.

So for each "Link" header in the 304 response we'll set the value on the cached response to the full merged "Link" header in the 304. It's a bit inefficient but there's less risk of regressions.

I see, thanks for the explanation!

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: