Open Bug 1491010 Opened 6 years ago Updated 2 years ago

[Fetch API] Headers.get("WWW-Authenticate") does not return anything if the server sends more than one such header

Categories

(Core :: DOM: Networking, defect, P3)

60 Branch
defect

People

(Reporter: TbSync, Assigned: twisniewski)

References

(Blocks 1 open bug)

Details

(Whiteboard: [necko-triaged])

Attachments

(1 file, 1 obsolete file)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Edge/17.17134

Steps to reproduce:

I create a GET or POST request using the Fetch API without authentication, get back a 401 from the server, and use

   response.headers.get("WWW-Authenticate")

to evaluate the allowed authentication methods. This works fine if the server sends only one such header.

If, however, it sends two or more, Headers.get("WWW-Authenticate") returns null.

I only observe this for the WWW-Authenticate header; other headers, if sent multiple times, will simply be returned as a list (joined by colon).

This is an example set of headers that is not accessible:

WWW-Authenticate: Negotiate
WWW-Authenticate: Basic realm="Contact and Calendar data"
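
For reference, a minimal sketch of what reading these headers should yield per the Fetch standard, which combines repeated headers with ", " (the URL here is just a placeholder):

   fetch("https://example.com/dav/").then(r => {
     // The spec combines repeated header values with ", ", so this
     // should log: Negotiate, Basic realm="Contact and Calendar data"
     console.log(r.headers.get("WWW-Authenticate"));
   });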

I encountered this using Thunderbird 60, but the Fetch API is part of the core.


Actual results:

Headers.get("WWW-Authenticate") returns null if the response contains more than one such header.



Expected results:

Get the values of all WWW-Authenticate headers, joined by "\n".
Component: Untriaged → Networking
Product: Firefox → Core
I think this is DOM:Networking...
Component: Networking → DOM: Networking
If you need a server to test against:
organizer.kdjf.net

that will return the example set from above.
Dragana, the reason I put it in Networking is because of the header parser special-casing WWW-Authenticate, which is likely what leads to the API doing something weird as well.
As for the expected result:

> Get the values of all WWW-Authenticate headers, joined by "\n".

I don't think there's any specification that would make that the expected result.
That is what Ehsan Akhgari said:

An interesting question is what we return from Headers.get() for WWW-Authenticate, given that the fetch spec calls for pasting the multiple header values together with ", " but Gecko internally pastes them together with "\n"...
I may have misread that, but since the Digest auth header contains colons itself, this might be a good idea. Otherwise parsing it would be very difficult.

But that is not in the scope of this bug. I just thought that this was the expected behavior.
The component is correct.  We translate the *-Authenticate headers to \n separation internally for a reason; I would have to look the reasons up to list them here.  The fix has to happen on higher levels and convert auth headers to a comma-separated list.

Or we (Necko) have to provide a different API to read these headers and translate them to .join(',') form for consumers.
Talked to Anne on IRC.  The likely best solution would be to have a special internal API to read the response auth headers separated with \n (to not break the existing code all over the place), while the currently exposed API for response header reading should return them ',' separated.
Status: UNCONFIRMED → NEW
Component: DOM: Networking → Networking: HTTP
Ever confirmed: true
The code I was looking at earlier today was <https://searchfox.org/mozilla-central/rev/a23c3959b62cd3f616435e02810988ef5bac9031/netwerk/protocol/http/nsHttpHeaderArray.h#268>.  The comment there says that the pasting with newline is done because the header value itself may contain commas, which makes sense.

Please note that I haven't debugged this myself, and don't know for example why we return null right now.  That seems surprising to me.  So there may be two issues at play here, one on the DOM side and one on the Necko side.
Thanks Ehsan.  After discussing this with the Necko team and reading comment 0 more carefully, we identify this as a DOM:Net bug specifically in fetch.  We definitely provide the header(s), but delimited by \n and not by ',' for reasons that Ehsan explains.  This has to be fixed in fetch.
Component: Networking: HTTP → DOM: Networking
I still don't think that makes sense. We should have consistent header parsing. HTTP doesn't allow special casing certain header names, with the exception of Set-Cookie. A high-level API effectively undoing the special casing of the low-level API seems like bad design.
(Also note we'd have to undo this special casing for all "high-level" consumers, not just fetch(). Though at the moment that'd be just fetch() and XMLHttpRequest.)
(In reply to Anne (:annevk) from comment #11)
> A high-level API effectively undoing the special
> casing of the low-level API seems like bad design.

Anne, I didn't realize it before, but the problem is that even though the http channel returns "foo\nbar\n" - i.e. a non-empty value - the fetch() API returns null instead of that string.

I'm not aware of the same bug in XHR.
I think fetch wants to use the following APIs:

https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/netwerk/protocol/http/nsIHttpChannel.idl#354-386

This is already used for devtools and maybe even XHR is using this.
Per https://tools.ietf.org/html/rfc7235#section-4.1 the production for WWW-Authenticate is 1#challenge. As far as I know that means that

  WWW-Authenticate: X
  WWW-Authenticate: Y

is equivalent to

  WWW-Authenticate: X, Y

and both should always be treated the same. That we special case the former suggests there's something related to web compatibility not covered by HTTP or simply legacy code that we could fix.

In general I think whenever we treat

  Name: Value
  Name: Value

different from

  Name: Value, Value

that's something we should look into fixing as proxies are allowed to combine, APIs will combine, and treating them differently creates attack opportunities. (I started looking into this a bit, e.g., see https://github.com/web-platform-tests/wpt/pull/10548 for Content-Length, but I haven't really finished this work.)
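
To illustrate the combining that the Headers API itself performs (runnable in any page console; the header name is just an example):

  const h = new Headers();
  h.append("X-Example", "a");
  h.append("X-Example", "b");
  console.log(h.get("X-Example")); // "a, b": repeated values combine with ", "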
I pushed some tests here covering Fetch and XMLHttpRequest: https://github.com/web-platform-tests/wpt/pull/13006. Chrome and Safari don't have this problem. (Tests covering HTTP authentication would be nice too, but I'll leave that to someone else for now.)
I just want to correct my posts in this bug: Whenever I said colon, I meant comma (,). 
-> The digest auth header of course contains commas, not colons

Sorry for that.
(In reply to Honza Bambas (:mayhemer) from comment #14)
> I think fetch wants to use the following APIs:
> 
> https://searchfox.org/mozilla-central/rev/
> dd965445ec47fbf3cee566eff93b301666bda0e1/netwerk/protocol/http/
> nsIHttpChannel.idl#354-386
> 
> This is already used for devtools and maybe even XHR is using this.

That doesn't sound right.

Headers:
https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/dom/fetch/InternalHeaders.cpp#370
XHR:
https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/dom/xhr/XMLHttpRequestMainThread.cpp#1129

Both using VisitResponseHeaders().

devtools network panel however uses VisitOriginalResponseHeaders():
https://searchfox.org/mozilla-central/rev/dd965445ec47fbf3cee566eff93b301666bda0e1/devtools/server/actors/network-monitor/network-observer.js#282
Another question for Honza:

The comments here state that newline is used for easier parsing of the header components.  Where do we need to parse the contents of WWW-Authenticate and Proxy-Authenticate?  How is this not a problem for other consumers?  I can kind of agree with Anne's point in comment 11, to be honest.  (It's a bit annoying that Necko exposes these "wrong" APIs for consumers to inspect headers with!)
The http header code is old and I do not really want to change it. The only thing that can happen is that we introduce some regressions.

Therefore we introduced VisitOriginalResponseHeaders().
It seems to me we have a conflict here between the HTTP spec (which says that all headers but Set-Cookie should logically be mergeable with ", ") and the fact of life on the web that WWW-Auth and Proxy-Auth header values can contain commas (and presumably can have a space following the comma?).

This means that following the spec here in the presence of multiple headers could return a value that can't be parsed correctly/unambiguously.  That seems bad, and makes me ask: what are other browsers doing when a Fetch request reads these headers?  Are they using \n as a delimiter? Commas? Some other hack?  This feels like an HTTP spec bug and a potential cross-browser incompatibility in practice.  The latter is actually my main worry.
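
For example, a single Digest challenge (values hypothetical) already contains commas on its own:

   WWW-Authenticate: Digest realm="users", nonce="abc123", qop="auth"

so a naive split of a comma-joined value on ", " cannot tell challenge boundaries from parameter boundaries.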

I'm less worried about the internal API design (i.e., whether internally we provide all comma-separated values, with a special API for getting them with \n, or vice versa).  But let's talk about it :) :) :)

There are 3 formats we can return HTTP header values in:

1) Off-the-wire without modification (and in order).  Devtools wants this, for instance, so we can show users exactly what came off the wire.  Hence VisitOriginalResponseHeaders().

2) "Useful" format.  Return comma-separated when useful, or use \n when commas would result in a possibly unparsable return value.  It's what GetResponseHeader() does.

3) Spec-correct (always use commas, except for Set-Cookie).  This is the API being asked for here.  It's not clear to me what code we would want to switch to using it (i.e., it's not clear what code wants to see potentially ambiguous comma-separated values for {WWW|Proxy}-Authenticate).

I do agree that for the internal Mozilla developer who isn't familiar with this code, skipping right to "Useful" behavior is unintuitive (hence Ehsan's "It's a bit annoying" comment).  I don't think that's a strong reason to switch the APIs for no functional gain.

Because I apparently enjoy such things, I just tried to figure out the bugzilla and spec/industry history here:

Blame is a wormhole--the change goes back into our CVS repo, where it got landed as part of a huge "http branch" merge in 2001 (I gave up at that point):

   https://github.com/ehsan/mozilla-cvs-history/blame/408c8ab6a6ac62443d50922ef7889b580e07aa1c/netwerk/protocol/http/src/nsHttpHeaderArray.cpp#L76
   https://github.com/ehsan/mozilla-cvs-history/commit/265c2b53cbfd5dc8d0fe3c3aac56737faefec37e

Spec-wise, RFC 2617 explicitly and unhelpfully addresses this issue here https://tools.ietf.org/html/rfc2617#section-1.2 :

   User agents will need to take special care in parsing the WWW-
   Authenticate or Proxy-Authenticate header field value if it contains
   more than one challenge, or if more than one WWW-Authenticate header
   field is provided, since the contents of a challenge may itself
   contain a comma-separated list of authentication parameters.

So 1) this is a known issue; 2) Short of implementing some smart algorithm that reliably guesses which commas are separators, the spec is basically asking for the impossible; 3) We've lived with the current behavior since 2001.

My gut tells me the first thing to resolve is the cross-browser compat issue.  If Fetch is returning multiple Auth headers the same way on different browsers, that's a problem.  Anne, do you happen to know the answer to that? (or have a faster way to find out other than my finding someone to look into it?)
Flags: needinfo?(annevk)
The tests I linked to in comment 16 (now part of web-platform-tests) show that in other browsers there's nothing special about WWW-Authenticate to the extent that can be observed from XMLHttpRequest and fetch(). They treat it like any other header.

Also, reading the latest HTTP Authentication RFC https://tools.ietf.org/html/rfc7235#section-4.1 on this matter it seems that you can unambiguously parse this and are required to when multiple challenges are found in a single header, so we might even have a compliance issue here as well...
Flags: needinfo?(annevk)
Jason, thanks for your insightful comment.
Priority: -- → P3
Whiteboard: [necko-triaged]
Jason, thanks, this really helped me understand the semantics of the currently exposed APIs, I never really correctly understood which functions are supposed to be used for what purpose.

It seems that we need a VisitScriptResponseHeaders() for the "scripted" callers that need to adhere to DOM specs, which can be called from the call sites mentioned in comment 18.  That would fix the bug, as far as I can tell.
Depends on: 1493686
No longer depends on: 1493686
Depends on: 669675
Add response header getter and visitor methods which work as per the Fetch standard, and use them for fetches and XHRs.
I found some time to create a patch that seems to match what's desired here (though the names of the functions and such might need tweaking).

A try-run seems fine: https://treeherder.mozilla.org/#/jobs?repo=try&revision=348de9846a38e0ba5428d485861bebad210d1d45
I may have missed something: you are changing the way XHR returns WWW-Authenticate headers, because it currently does not comply with the standard. OK. But this bug is about fetch() not returning any WWW-Authenticate header if two or more of them are sent by the server. This is actually a showstopper for me, which is making fetch() useless at the moment...
I see, john.bieling. You're right that your original issue isn't going to be addressed by this fix.

When I visit https://organizer.kdjf.net, bypass the SSL warning, cancel the auth-prompt, and then run this in the console (and cancel the auth prompt again):

> fetch("https://organizer.kdjf.net").then(r => { console.log(r.headers.get("www-authenticate")); })

I'm seeing Firefox logging null, while Chrome and Edge display 'Negotiate, Basic realm="Contact and Calendar data"'

Dragana, Honza, is Firefox perhaps concealing the headers because it's a 401 response?
Flags: needinfo?(honzab.moz)
Flags: needinfo?(dd.mozilla)
Yes, that is what I observe as well. If you repeat that test with another server which only returns one WWW-Authenticate header, fetch() will return it as expected. So it is not a general suppression on 401, but some sort of parsing error.
Actually, funny enough, I think I'm dead wrong: I just repeated the test on a build with the patch I just wrote, and it's showing the two headers just fine. So yeah, unless I'm seeing things, the newline fix should also fix what you're running into, john.bieling.
Flags: needinfo?(honzab.moz)
Flags: needinfo?(dd.mozilla)
I was just skimming the patch and did not test it. I will do that soon, but if it works: *Hooray*

But I now have a general question about all this. Consumers like fetch and XHR will now only see one single WWW-Authenticate header, which needs to be parsed manually to get the individual headers. This however is not an easy task, because these headers can contain commas themselves.

If I have understood this correctly, the reasoning behind disallowing access to the pure/original headers is that a proxy in between could have mangled the headers already, so every client needs to be able to cope with that single WWW-Authenticate header.

That should also be valid for nsIChannel / nsIHttpChannel. So there must already be some function to get the individual headers from the serialized header. Can this function be exposed to be used by fetch and XHR?
john.bieling, yes, whatever "raw" headers we receive could likely be exposed in some other manner. But of course what that manner is would be a separate standards discussion from just fixing this bug (and I'm afraid I couldn't tell you whether proxy-mangling would make it worth exposing such detail at all; that would be a discussion for those more knowledgeable than I am on the matter).
John, for how to parse the WWW-Authenticate header, please see comment 22 (https://tools.ietf.org/html/rfc7235#section-4.1).  Using the description given there, a parsing algorithm can be constructed to parse WWW-Authenticate headers with multiple challenges unambiguously.

You may argue that doing so is difficult, but that's hardly a reason to change a feature in web browsers.  Usually in these cases JS libraries come to the rescue.  :-)  A quick GitHub search seems to suggest people have already created libraries to help parse WWW-Authenticate, e.g. https://github.com/izaakschroeder/auth-header.
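
To make that concrete, here is a minimal, untested sketch (not any particular library's code) of splitting a combined WWW-Authenticate value into challenges per the RFC 7235 grammar: a new challenge starts at a token that is not followed by "=", and commas inside quoted strings never count as separators:

  function splitChallenges(value) {
    const challenges = [];
    let current = "";
    let i = 0;
    while (i < value.length) {
      const ch = value[i];
      if (ch === '"') {
        // Copy a quoted-string verbatim, honoring backslash escapes.
        current += value[i++];
        while (i < value.length && value[i] !== '"') {
          if (value[i] === "\\" && i + 1 < value.length) current += value[i++];
          current += value[i++];
        }
        if (i < value.length) current += value[i++]; // closing quote
      } else if (ch === ",") {
        // A comma starts a new challenge only if a bare token (a scheme
        // name) follows, rather than "token=" (an auth-param).
        const m = value.slice(i + 1).match(/^\s*[!#$%&'*+\-.^_`|~0-9A-Za-z]+\s*(=?)/);
        if (m && m[1] !== "=") {
          challenges.push(current.trim());
          current = "";
        } else {
          current += ch;
        }
        i++;
      } else {
        current += value[i++];
      }
    }
    if (current.trim()) challenges.push(current.trim());
    return challenges;
  }

  splitChallenges('Negotiate, Basic realm="Contact and Calendar data"');
  // => ['Negotiate', 'Basic realm="Contact and Calendar data"']

Libraries like the one linked above implement something along these lines.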
Assignee: nobody → twisniewski
I saw those comments, but what I was thinking is that there must already be a full implementation of that parsing in Firefox, because nsIChannel / nsIHttpChannel must be doing it already. So I was asking to let others use that parsing code instead of having to write it again.

Sorry if I cannot explain myself well; English is not my mother tongue.
I think the approach of undoing the 0x0A (\n) rather than 0x2C 0x20 (, ) combining is likely insufficient. I discovered and reported in bug 1499356 that when combining we also omit headers with empty/whitespace values, which is against the HTTP standard and contrary to what other browsers do. A fix for that that gives fetch() and XMLHttpRequest the correct values cannot be undoing things, but needs to operate on the headers earlier on.
(In reply to Anne (:annevk) from comment #36)
> I think the approach of undoing the 0x0A (\n) rather than 0x2C 0x20 (, )
> combining is likely insufficient. I discovered and reported in bug 1499356
> that when combining we also omit headers with empty/whitespace values, which
> is against the HTTP standard and contrary to what other browsers do. A fix
> for that that gives fetch() and XMLHttpRequest the correct values cannot be
> undoing things, but needs to operate on the headers earlier on.

ni? Dragana to make sure she sees this before reviewing the patch...
(In reply to john.bieling from comment #35)
> I saw those comments, but what I was thinking is that there must be a full
> implementation of that parsing already part of Firefox, because nsIChannel /
> nsIHttpChannel must be doing that now already. So I was asking to allow
> others to use that parsing code instead of having to write it again.

I think I understood what you meant originally.  But web browsers typically don't expose their internals to web pages in haphazard ways.  Rather we use web standards, which are documents that describe how various features that are available to web developers are expected to work.  Web browser developers like us look at those standards and code their browser engines according to those standards.  This ensures that the code that the web developers write works in all browsers.

The respective standards here are Fetch and XHR: https://fetch.spec.whatwg.org/ and https://xhr.spec.whatwg.org/.  If you would like new features in these APIs, for example utilities for parsing WWW-Authenticate headers, you need to propose your desired changes to the respective standards, and once they get accepted into one, browsers will start to adopt those changes.

Otherwise, Firefox will never start to randomly expose an internal routine for parsing a header, or anything like that to a web page without it being standardized first.  Hope this helps describe the normal process we use for adding features to the web platform.
@:Ehsan This is so sad, and it shows once more that you do not care at all about your add-on developers. If you land this patch, you will break add-ons using XHR all over the place, and you do not provide any help, even though you could.

There is lots of Mozilla-specific stuff exposed which Mozilla add-ons can use and which will not work in Chrome. Apparently it only gets exposed if one of you needs some off-standard stuff. Understood. And I was not asking to change the XHR or fetch() standard; I was asking to expose the static parse function used by nsIChannel. I do not care at all about my add-on not being able to run in Chrome; I only provide them for the Mozilla ecosystem. I have no words for this.
John, I understand that you're frustrated, but the context of the discussion has so far been the Fetch and XHR standards themselves, not add-ons.

We also can't be keeping interoperability bugs with the web platform forever for the sake of add-ons, because we're a web browser first, and an add-on platform second.

As you mentioned there is also little stopping us from adding another add-on API, if that's ultimately a better solution (though I can see it being argued as redundant if the JS libraries in comment 34 are good enough).

Of course we'll back out or fix whatever we land here if it causes worse issues (with add-ons or web sites), but for now all we know for sure is that we're breaking the web's interoperability expectations and causing the bug which you initially reported here.
Wow, OK.  Sorry, I didn't realize you're talking about an *extension* API.  But this is literally the first time on this bug that you're mentioning you're using this API from an extension, or did I miss that before?

At any rate, for extensions, none of comment 38 applies, I said all of that since I assumed you're using this for a web page.  I'm not too familiar with the process for adding extension APIs, but I think there are certainly options for exposing more features to extension developers.

You can find some information about WebExtensions here: https://wiki.mozilla.org/WebExtensions.  From that page I found this guide for requesting new APIs: https://wiki.mozilla.org/WebExtensions/NewAPIs.  Please have a look, this will help get you in touch with the right folks about exposing this functionality to extensions.

Again, apologies for not understanding your use case properly before.
Oh no ... I really did not say that I was looking at this bug from the perspective of an extension developer. It was in my head all the time, but I never said it. Of course that makes my last comment completely unjustified. I am deeply sorry. Thanks for de-escalating this.
No offense taken here.  :-)
Nor here. In fact I'm genuinely interested in figuring out new APIs like this, and where they belong. After all, parsing complex header values in general (not just www-authenticate, but others like content-type and such) is an area that's not always easy to do right, and can be very prone to error. It seems like an area worth taking seriously to me, and not just for add-ons.
Interestingly, no one complained about the Set-Cookie response header which we treat the same way:

https://searchfox.org/mozilla-central/rev/b03a62c3c82316e733a3b09622c1cb7e59f64cc3/netwerk/protocol/http/nsHttpHeaderArray.h#267-274
FWIW: cookies are (intentionally) not exposed through fetch() or XMLHttpRequest so nobody would have any way of knowing (other than by reading our code). (And in theory cookies are the sole exception to the combining rule, so if we ever did expose them they'd require a special API.)
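
(For instance, response.headers.get("set-cookie") from fetch() simply yields null, no matter what the server sent.)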

(In reply to Honza Bambas (:mayhemer) from comment #10)
> Thanks Ehsan. After discussing this with the Necko team and reading comment
> 0 more carefully, we identify this as a DOM:Net bug specifically in fetch.
> We definitely provide the header(s), but delimited by \n and not by ',' for
> reasons that Ehsan explains. This has to be fixed in fetch.

This looks like the true cause of this bug. fetch() is becoming more and more unusable, as more and more servers/proxies send multiple WWW-Authenticate headers.

We have completely stepped back from the request that Firefox/fetch() should parse the headers for us; we just need the headers so we can parse them ourselves. Any chance fetch() will be fixed soon to return the headers joined by "," as stated by the spec, instead of nothing?

Chrome is following the spec and returns the list as expected.

Just so folks know, I'm currently working on a patch which does what Dragana's first idea (in her review notes on the previous patch) suggests, just making the results work with both Fetch and XHR.

I'd just like to see if we can drop the preference in bug 1504344 to help simplify things.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=89a62143b347df07805985fbb3f127036019567e

Blocks: 1528800
Attachment #9016521 - Attachment is obsolete: true

Have XHRs use the Fetch InternalHeaders code to handle GetResponseHeader and GetAllResponseHeaders, to reduce code duplication.

When filling InternalHeaders, do two passes over the originals: one using VisitOriginalResponseHeaders for the cookies, proxy-authenticate, and www-authenticate headers which require special newline-handling, and one using VisitResponseHeaders for the rest that do not require special handling.
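
Roughly, in JS-flavored pseudocode (the visitor names stand in for the Gecko channel APIs of the same names; everything else here is illustrative):

  const NEWLINE_HEADERS = ["set-cookie", "proxy-authenticate", "www-authenticate"];

  function fillInternalHeaders(channel, headers) {
    // Pass 1: newline-delimited headers come from the original, unmerged
    // list, so each on-the-wire line is appended individually.
    channel.visitOriginalResponseHeaders((name, value) => {
      if (NEWLINE_HEADERS.includes(name.toLowerCase())) headers.append(name, value);
    });
    // Pass 2: everything else uses the already comma-merged view.
    channel.visitResponseHeaders((name, value) => {
      if (!NEWLINE_HEADERS.includes(name.toLowerCase())) headers.append(name, value);
    });
  }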

Note that I'm not 100% sure whether I should be changing the MaybeSortList function to match our findings in bug 1540688 for both fetch and XHR, or should instead make an XHR-specific codepath and keep the fetch code the same. Since I don't see any failing tests from this change, I thought it might be good to aim for consistency here: XHR webcompat relies on that change, so perhaps fetch ought to do the same.

How are you guys doing on this one?
It's still a problem in version 73.0.

Severity: normal → --
Severity: -- → S3