Request fails with Windows Authentication and bigger payload
Categories
(Core :: Networking, defect, P2)
Tracking
()
People
(Reporter: martin.gubis, Assigned: valentin)
References
(Regression)
Details
(Keywords: regression, Whiteboard: [necko-triaged])
Attachments
(8 files)
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:81.0) Gecko/20100101 Firefox/81.0
Steps to reproduce:
- Configure Windows Authentication site hosted on Windows Server 2012 R2 on IIS8
- Create a page, which creates hidden field with lot of content (1 MB, it was aspx viewstate in my case)
- Load the page, authenticate properly and wait at least 2 minutes (the time is important - if I do the postback sooner, it works)
- Do the postback on the page
Actual results:
Postback ends with error page "connection resetted".
Expected results:
Postback should successfully reach the server.
Above mentioned setup works properly on Chrome and also when the Firefox uses Fiddler as the proxy.
Comment 1•4 years ago
|
||
Bugbug thinks this bug should belong to this component, but please revert this change in case of error.
Comment 2•4 years ago
|
||
Can you make a http log:
https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
please use a clean profile, because logs may contain cookies.
Reporter | ||
Comment 3•4 years ago
|
||
Please find the log on the MMIT cloud (it was too big to be attached, it will be available until 14th of December, 2020): https://cloud.moenkemoeller.it/index.php/s/QCMzqjqw5C7iAjE
Please let me know if there is anything else I can be helpful with.
Comment 4•4 years ago
|
||
The connection is reset by the server. Do you have a server log to look for reasons why the connection is reset.
Anyway, probably this is not firefox issue. Maybe the server log will give use some clue.
Reporter | ||
Comment 5•4 years ago
|
||
Reporter | ||
Comment 6•4 years ago
|
||
Reporter | ||
Comment 7•4 years ago
|
||
Reporter | ||
Comment 8•4 years ago
|
||
Reporter | ||
Comment 9•4 years ago
|
||
Reporter | ||
Comment 10•4 years ago
|
||
I have done some Wiresharking on the line between client and server. The problem seems to be somehow related to Firefox - the test works without problem on Chrome, only Firefox runs into connection reset.
First I have checked the Firefox access with smaller payload of the POST (Firefox - smaller payload.png). The actual post communication starts with No. 135, Firefox receives 401 several times (actual requests are not decoded well by Wireshark, that's the reason why they are not visible). Anyway, Firefox behaves correctly and receives 200 at the end.
Then I have checked the Chrome access with big payload of the POST (Chrome - bigger payload.png). After couple of 401s, Chrome receives correctly 200.
Then I have checked the Firefox access with big payload of the POST (Firefox - bigger payload.png). Communication stops prematurely after receiving 401 from the server. As you can see from Chrome, client is supposed to continue communication. I have drilled down to see what is actually happening in the point when the communication stops.
Drilldown of the communication end is in the "Firefox - Bigger payload - EOC drilldown.png". The actual communication is immediately preceded by the "401 Unauthorized, NTLMSSP_CHALLENGE" (so what you see in the drilldown is actually Firefox response to server). Firefox sends the POST request (packet No. 2594), but it seems, that the actual body is missing and is never sent by Firefox (the contents of the packet No. 2594 are attached in the "Firefox - Last sent request.txt"). The advertised Content-Length: 3000329 is never sent over the line. Server just acknowledges packet receive (TCP level) - Packet number 2595 and then waits for body to come. It never comes and server terminates connection (packet #2612) after the timeout.
This seems to be Firefox related problem - I think the body should be sent in the last request or it shouldn't be advertised (I am no expert in HTTP, so I am not sure whether this is possible at all).
Please let me know if you need more detailed information or Wireshark dumps. Thanks for looking into the issue.
Reporter | ||
Comment 11•4 years ago
|
||
Also, would it be possible to make 'Firefox - Last sent request.txt' private attachment? Or could it be deleted?
I am not sure whether the information in there are not sensitive. Thank you.
Updated•4 years ago
|
Comment 12•4 years ago
|
||
(In reply to Martin Gubis from comment #11)
Also, would it be possible to make 'Firefox - Last sent request.txt' private attachment? Or could it be deleted?
I am not sure whether the information in there are not sensitive. Thank you.
I've made it a private attachment.
In addition, could you try to make a http log as said in comment #2?
Note that the log should not have the content of authorization header, but could have cookie headers. If you want, you could send the log to my email address directly.
Thanks.
Reporter | ||
Comment 13•4 years ago
|
||
Thanks for the feedback. Http log was already created - please find it in comment #3 (I was not able to attach it since it is too big).
Comment 14•4 years ago
|
||
So, I can't find anything wrong from the log.
As said in comment #4, we need the information from the server side to tell us what's wrong.
Martin, could you try to figure out why the server reset the connection?
Reporter | ||
Comment 15•4 years ago
|
||
I have already done the analysis, please see complete analysis in comment #10.
To summarize:
The analysis was done on the TCP level using Wireshark. I am no expert in HTTP protocol, but it seems, that the Firefox simply doesn't send the whole request (please see Firefox - Last sent request.txt - it contains contents sent to the server - actually last packet sent to the server). Firefox advertises 3MB content length, but never sends the content. Server waits for content to arrive for 120s - nothing is sent from the Firefox, so it resets the connection.
Comment 16•4 years ago
|
||
The severity field is not set for this bug.
:valentin, could you have a look please?
For more information, please visit auto_nag documentation.
Assignee | ||
Comment 17•4 years ago
|
||
I just took a look at the logs. Indeed, there seems to be a problem in Firefox. Do you happen to know if this worked with older versions of Firefox?
It seems we send the request, upload the entire body, but then fail to send the entire body when the auth retry happens:
2020-11-13 08:59:14.299000 UTC - [Parent 11488: Main Thread]: D/nsHttp sending progress and status notification [this=00000218801CD000 status=804b000a progress=0/0]
804b000a = STATUS_WAITING_FOR
2020-11-13 08:59:14.300000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannel::OnStartRequest [this=00000218801CD000 request=000002187C9E7D00 status=0]
2020-11-13 08:59:14.300000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannel::ProcessResponse [this=00000218801CD000 httpStatus=401]
2020-11-13 08:59:14.301000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannel::ContinueProcessResponse3 [this=00000218801CD000, rv=804b0001]
2020-11-13 08:59:14.301000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannelAuthProvider::ProcessAuthentication [this=000002187F28D790 channel=00000218801CD750 code=401 SSLConnectFailed=0]
nsHttpChannel @218801cd000 --> nsHttpChannelAuthProvider @2187f28d790
2020-11-13 08:59:14.301000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannelAuthProvider::PrepareForAuthentication [this=000002187F28D790 channel=00000218801CD750]
2020-11-13 08:59:14.301000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannelAuthProvider::GetAuthenticator [this=000002187F28D790 channel=00000218801CD750]
2020-11-13 08:59:14.301000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannelAuthProvider::GetCredentialsForChallenge [this=000002187F28D790 channel=00000218801CD750 proxyAuth=0 challenges=Negotiate]
2020-11-13 08:59:14.301000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannelAuthProvider::GetIdentityFromURI [this=000002187F28D790 channel=00000218801CD750]
2020-11-13 08:59:14.301000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannelAuthProvider::GetAuthenticator [this=000002187F28D790 channel=00000218801CD750]
2020-11-13 08:59:14.301000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannelAuthProvider::GetCredentialsForChallenge [this=000002187F28D790 channel=00000218801CD750 proxyAuth=0 challenges=NTLM]
2020-11-13 08:59:14.301000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannelAuthProvider::GetIdentityFromURI [this=000002187F28D790 channel=00000218801CD750]
2020-11-13 08:59:14.301000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannel::ConnectionRestartable this=00000218801CD000, restartable=1
2020-11-13 08:59:14.301000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannel::OnDataAvailable [this=00000218801CD000 request=000002187C9E7D00 offset=0 count=1293]
2020-11-13 08:59:14.301000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannel::OnStopRequest [this=00000218801CD000 request=000002187C9E7D00 status=0]
2020-11-13 08:59:14.301000 UTC - [Parent 11488: Main Thread]: D/nsHttp OnStopRequest 00000218801CD000 requestFromCache: 0 mFirstResponseSource: 0
2020-11-13 08:59:14.301000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannel 00000218801CD000 has a strongly framed transaction: 1
2020-11-13 08:59:14.302000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannel::DoAuthRetry [this=00000218801CD000, aTransWithStickyConn=0000021878C2A810]
2020-11-13 08:59:14.302000 UTC - [Parent 11488: Main Thread]: V/nsHttp HttpBaseChannel::SetRequestHeader [this=00000218801CD000 header="Cookie" value="ASP.NET_SessionId=ca4gv4je2j3ho3yzqwel4n31" merge=0]
2020-11-13 08:59:14.302000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannel::ContinueDoAuthRetry [this=00000218801CD000]
2020-11-13 08:59:14.302000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannel::DoConnect [this=00000218801CD000]
2020-11-13 08:59:14.302000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannel::DoConnectActual [this=00000218801CD000, aTransWithStickyConn=0000021878C2A810]
2020-11-13 08:59:14.302000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannel::SetupTransaction [this=00000218801CD000, cos=64, prio=-20]
2020-11-13 08:59:14.302000 UTC - [Parent 11488: Main Thread]: E/nsHttp nsHttpChannel 00000218801CD000 created nsHttpTransaction 000002187A809810
nsHttpChannel @218801cd000 --> nsHttpTransaction @2187a809800
2020-11-13 08:59:14.302000 UTC - [Parent 11488: Main Thread]: D/nsHttp nsHttpChannel::ContinueOnStopRequestAfterAuthRetry [this=00000218801CD000, aStatus=0 aAuthRetry=1, aIsFromNet=1, aTransWithStickyConn=0000021878C2A810]
2020-11-13 08:59:14.303000 UTC - [Parent 11488: Main Thread]: D/nsHttp sending progress and status notification [this=00000218801CD000 status=804b0005 progress=693/3001014]
804b0005 = STATUS_SENDING_TO
2020-11-13 08:59:14.303000 UTC - [Parent 11488: Main Thread]: D/nsHttp sending progress and status notification [this=00000218801CD000 status=804b000a progress=0/0]
804b000a = STATUS_WAITING_FOR
Hopefully this is an HTTP bug, not an NTLM bug.
Provisionally taking the bug to have a look.
Assignee | ||
Comment 18•4 years ago
|
||
Martin, do you know if sending a smaller payload makes the use case work?
The answer is useful for tracking down the cause of the bug.
Reporter | ||
Comment 19•4 years ago
|
||
Valentin, smaller payload works correctly. Problem surfaces only when there is a bigger payload. Should I check any specific payload size?
Reporter | ||
Comment 20•4 years ago
|
||
I didn't check with older version of Firefox. I suspect, that Firefox ESR 78 has the similar behaviour (problem was originally reported on the customer's side where I wasn't able to do proper analysis), but I didn't verify it. Is there any special version I should check against?
Assignee | ||
Comment 21•4 years ago
|
||
It seems we do attempt to rewind the upload stream and send all of it, but maybe that fails for some reason?
It would be useful to know what the upload is. Is it a file? Just a bunch of text you embed into the form?
(In reply to Martin Gubis from comment #20)
I didn't check with older version of Firefox. I suspect, that Firefox ESR 78 has the similar behaviour (problem was originally reported on the customer's side where I wasn't able to do proper analysis), but I didn't verify it. Is there any special version I should check against?
There's the mozregression tool you can use to find a range:
https://mozilla.github.io/mozregression/
I would for something in the past 2-3 years - mozregression -g 2018-01-01 -M classic
The -M classic
option would first check if the build on that date is good or bad before starting the bisection. If that fails you can go backwards by a few years, but I doubt it will start working 🙂
Reporter | ||
Comment 22•4 years ago
|
||
First phase of bisection
Reporter | ||
Comment 23•4 years ago
|
||
Reporter | ||
Comment 24•4 years ago
|
||
Valentin, I have successfully done bisection on nightly builds (I have done it in two phases - both are attached). The problem is regression, which happened in the commits on 23rd of October 2019 (version from 22nd of October 2019 works well and version from 23rd of October 2019 fails).
Also, if you have a look on the bisection report, the version from 26th of October 2019 crashed during the postback (I had to skip it).
The most of the payload (like 99%) is one big input (hidden). The problem doesn't surface when the big file is part of the payload - it seems, that only the big FormData causes trouble (but I didn't test this thoroughly, so I might be wrong with this - I don't have a proper test setup for this scenario).
The customer (big German enterprise) is using Firefox ESR 78.5.0 (32-bit). It would be great to have this regression fixed also in this ESR release. They are relying heavily on the Windows Authentication and are facing issues with apps, which have big postback payloads. Can you arrange it to have the fix also in the ESR 78 or should I create separate ticket for it?
Assignee | ||
Comment 25•4 years ago
|
||
Thank you for doing the regression. Future note, it's better to post the output rather than a screenshot 🙂 I eventually managed to extract the link from the photo.
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=47d7c18620c845c3a42&tochange=f915da42169e70bdd67
I suspect the culprit might be Bug 1583700, but I can't be sure.
Could you help us further by testing with the build here:
https://archive.mozilla.org/pub/firefox/nightly/2019/10/2019-10-25-16-40-00-mozilla-central/
Check if setting the browser.tabs.documentchannel
pref to false
in about:config
makes the bug go away.
Thanks!
Reporter | ||
Comment 26•4 years ago
|
||
Thanks for the feedback. I have tested the version from the repository (I didn't know how to run it in portable mode, so I have installed it to virtual machine - it was reporting, that the update is ready, but I made sure, that about showed 72.0a.1 nightly version when I was doing the tests):
- Testing with default config (fresh install): didn't work
- Had to reinstall (updates got installed during Firefox restart) - made sure I am on 72.0a.1 after reinstall
- Switching browser.tabs.documentchannel to false: works correctly
Would be the disabling of the document channel valid workaround for customer? Does it anyhow influence browsing experience on other pages?
Thanks.
Assignee | ||
Comment 27•4 years ago
|
||
I don't think the pref is available in esr78.
Matt, I'm trying to understand what's going on here, but without access to a test setup I'm only speculating.
Can you think of any interaction between document channel and the upload stream that would cause us to fail when retrying?
I suspect that rewinding the upload stream fails for some reason, but I can't say why it only happens with the document channel, and only with a large payload.
Updated•4 years ago
|
Comment 28•4 years ago
|
||
One difference is that HttpChannelChild uses AutoIPCStream directly whereas DocumentChannel uses the IPDLParamTraits helpers for nsIInputStream*. That's caused issues before in bug 1595578, but I'm not aware of any remaining problems.
Iirc for larger payloads we switch to using an actor for blob serialization, rather than just streaming the bytes across inline, so that might be the difference as to when this becomes an issue.
Assignee | ||
Comment 29•4 years ago
|
||
Thanks for the info. It may be the blob serialization that is causing issues here 🤔
Andrea, are there any known issues with rewinding a serialized (blob) input stream?
Reporter | ||
Comment 30•4 years ago
|
||
I suspect that the code doesn't react properly to the situation when the mUploadStream doesn't implement ISeekableStream interface. It just continues silently without having the upload stream rewinded. Does the instance of mUploadStream used with big payload implement this interface?
Reporter | ||
Comment 31•4 years ago
|
||
Is there anything new about the topic?
Reporter | ||
Comment 32•4 years ago
|
||
Is there anything new about the topic? If Mozilla doesn't plan to fix this issue (or look into it), please mark this as won't fix so I can recommend to the customer (big German enterprise) to use Chrome instead.
Assignee | ||
Comment 33•4 years ago
|
||
Assignee | ||
Comment 34•4 years ago
|
||
Depends on D107068
Assignee | ||
Comment 35•4 years ago
|
||
Summary of the issue:
Content page is creating a blob larger that 256Kb in size and posting it.
In the parent process, we do an authRetry after the authentication completes so we try to rewind the stream and retry the request.
Due to the fact that the deserialized stream does not implement nsISeekableStream we fail to rewind the stream.
The channel will hang, because we attempt to read from the stream and there is no more data there.
To improve user experience my approach here is to fail the channel if we fail to rewind the streams.
This will cause the load to fail we stay on the original page and the user can submit again if necessary.
In a follow-up bug we can investigate whether it's possible to make this work. Introducing an API that allows us to rewind the stream asynchronously might be a solution, but it's not that easy to implement.
Updated•4 years ago
|
Updated•4 years ago
|
Comment 36•4 years ago
|
||
Comment 37•4 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/c8321424eb47
https://hg.mozilla.org/mozilla-central/rev/5fabd9f4bb61
Updated•4 years ago
|
Description
•