Closed Bug 468087 Opened 16 years ago Closed 16 years ago

Cannot add new https hosts to Mochitest

Categories

(Testing :: Mochitest, defect)

defect
Not set
minor

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mayhemer, Assigned: mayhemer)

References

(Blocks 1 open bug)

Details

(Keywords: autotest-issue, fixed1.9.1)

Attachments

(2 files, 3 obsolete files)

The problem is that Firefox sends just a domain part (the host name) in the Host header. Thanks that httpd.js looks up just the list of http hosts bound to port 80. Workaround for this problem is to besides a new https://XX:443 host also add its http://XX:80 mirror. Ted believes we should have smarter interaction between ssltunnel and httpd.js in this case.
I know Mossop hit this when trying to write a test that used SSL hosts. I think httpd.js should probably just add a mapping for host:80 for all the SSL-hosts on port 443, if one doesn't already exist.
You're sure there's no testing use case which requires an https server be present on port 443 but no http server be present on port 80? I need to look at this more closely, but I don't think we have to require http/https both be present for a host if at least the latter must be. Taking...
Assignee: nobody → jwalden+bmo
Status: NEW → ASSIGNED
(In reply to comment #2) > You're sure there's no testing use case which requires an https server be > present on port 443 but no http server be present on port 80? Hmm.. that is good point. But natural would IMHO be to allow situation when there is not any insecure server, though. I just tried to add an HTTP host with port 9000 and it is accessible. IMHO to look up also the https list of hosts would fix this problem, but it is just a guess.
http://mxr.mozilla.org/mozilla-central/source/netwerk/test/httpserver/httpd.js?mark=1262-1266#1250 We're going through a proxy here, so why don't we have an absolute path with scheme://hostport in the request line?
Actually, I guess the request generated from the PAC query wouldn't be a request of a proxy, so it wouldn't require the absolute path form.
Okay, here's an idea for a solution: modify ssltunnel to rewrite the Request-Line of incoming requests to use an absolute path with scheme, host, and port. The server will then know all the pertinent details to pick the right scheme. It's a little dodgy, but it's only one line of rewriting, not too much twiddling...
Hm, no, that's more than one line (of buffering), isn't it, because we have to pick out the Host header to know the right host, of course. Still doable, just need to buffer a bit more data...
That seems sensible. The base issue here is that for SSL hosts ssltunnel *is the proxy*, so httpd.js is confused.
(In reply to comment #4) > http://mxr.mozilla.org/mozilla-central/source/netwerk/test/httpserver/httpd.js?mark=1262-1266#1250 > > We're going through a proxy here, so why don't we have an absolute path with > scheme://hostport in the request line? The request that comes to the server has already passed ssltunnel and is not going through proxy (PAC function), its going through 'direct' connection to a secure server. (In reply to comment #6) > Okay, here's an idea for a solution: modify ssltunnel to rewrite the > Request-Line of incoming requests to use an absolute path with scheme, host, > and port. The server will then know all the pertinent details to pick the > right scheme. It's a little dodgy, but it's only one line of rewriting, not > too much twiddling... I was afraid of that.. What about this: xport = +port || 80; var scheme = identity.getScheme(host, xport); if (!scheme) { xport = +port || 443; var scheme = identity.getScheme(host, xport); if (!scheme) { dumpn("*** unrecognized hostname (" + hostPort + ") in Host " + "header, 400 time"); throw HTTP_400; } } I am not expect to the code, but something like that would solve the problem, no?
I don't think so; fundamentally, we don't have enough information in the incoming request itself to be able to determine a port (and from that, a scheme) if we only have the hostname and nothing else, if we have both http/https on a host with the default ports. In your code, with Host: example.com and https://example.com and http://example.com active, we'll always hit with xport == 80 even if it was an https://example.com request.
Ah, that's true. But that's exactly what we do now. I expect the goal is more then just to let the request 'pass through' i.e. check against a list of acceptable hosts. Other idea is to let ssltunnel also forward the whole proxy CONNECT request where the complete scheme-host-port is contained. It is relatively clear solution and not that complicated to implement. We just copy the proxy request to the target buffer.
Ok, stupid idea. CONNECT sends just the host and port pair as the request and also server would close the connection and following (real) request would not even get to the server. I have to think of this but it seems we really will have to let ssltunnel rewrite the request.
Keywords: autotest-issue
Hello gnarliness! The load-an-https URL smoketest works with this patch, and a full run of Mochitests works too (although I seem to leak a lot, but I don't think that's this patch causing it). It occurs to me (after writing this patch, alas) that the CONNECT Request-Line to ssltunnel already tells us what the target hostport is, so there's no need to buffer and grunge through sent headers for the hostport -- we just have to make ReadConnectRequest assign the hostport it finds to an outparam. Still, given how much trouble this was to write, I think I want at least this version backed up here to be safe.
Attached patch a bit simpler and still sufficient patch (obsolete) (deleted) — Splinter Review
Much simpler patch, but also not very nice. This overrides path in request line, not the host header. It modifies the request to look like the http proxy request. The patch doesn't count with potential support for pipelining. My upcoming set of mixed content tests works on it, adding new hosts works on it, simply accessing mochitest https hosts works.
Comment on attachment 351884 [details] [diff] [review] a bit simpler and still sufficient patch I realize now there should also be check for request line we parse to fit between bufferhead and buffertail.
Ah, saw my master comment at the top of ssltunnel.cpp, eh? I think we should add it in your patch, too, because while reading the code to understand it I stumbled across a number of ways to trigger it with the existing code. I don't think we can be bothered to care, so long as we add assertions in reasonable places and attempt to make it clear we know they exist. The out pointer strings in ReadConnectRequest should be references since they can't be NULL without crashing. Please check that the Request-URI begins with a '/' before mangling it, as it's possible incoming requests might already have absolute URIs or things like "*" that aren't absolute paths and shouldn't be touched. Don't restrict the methods that can be used in the request; a handler could interpret any syntactically valid HTTP method any way it wanted (and that does happen in the XXX tests, for one). Use memmove rather than memcpy when shifting the path and everything else downward; you're relying on undefined behavior. Please at least assert that you're in-bounds with the request-line shifting. At a glance this looks pretty nice, tho, and rather simpler -- wonder how much of that is not grunging for a Host head...
(In reply to comment #16) > Ah, saw my master comment at the top of ssltunnel.cpp, eh? I think we should > add it in your patch, too, because while reading the code to understand it I > stumbled across a number of ways to trigger it with the existing code. I don't > think we can be bothered to care, so long as we add assertions in reasonable > places and attempt to make it clear we know they exist. > So, as I understand you would like to add the "WARNING: DO NOT USE THIS CODE IN PRODUCTION SYSTEMS" comment? > The out pointer strings in ReadConnectRequest should be references since they > can't be NULL without crashing. > > Please check that the Request-URI begins with a '/' before mangling it, as it's > possible incoming requests might already have absolute URIs or things like "*" > that aren't absolute paths and shouldn't be touched. > Good points. > Don't restrict the methods that can be used in the request; a handler could > interpret any syntactically valid HTTP method any way it wanted (and that does > happen in the XXX tests, for one). > I added that more to check I am on the right place in the buffer. And yes, I also feel to restrict or just check the methods is not a good step. > Use memmove rather than memcpy when shifting the path and everything else > downward; you're relying on undefined behavior. Please at least assert that > you're in-bounds with the request-line shifting. > Yes, I realized that after I submitted the patch. > At a glance this looks pretty nice, tho, and rather simpler -- wonder how much > of that is not grunging for a Host head... What exactly means "grunging" ? :) Thanks for this feedback.
(In reply to comment #17) > So, as I understand you would like to add the "WARNING: DO NOT USE THIS CODE > IN PRODUCTION SYSTEMS" comment? Yeah -- something to make it clear that it shouldn't be used beyond its original purpose, at least not in its current state. > What exactly means "grunging" ? :) Making low-level, slightly dirty and hairy modifications, basically.
Attached patch simpler, v2 (obsolete) (deleted) — Splinter Review
- Added the master comment - Changed function headers to address comments and to take the buffer structure; need to access it - Added some assertions, highly improbable to hit false on them ..;) - Few small fixes Tested the same way as the previous patch. Now it is dependent on patch from bug 456001 but not a big problem to create an independent one.
Attachment #351884 - Attachment is obsolete: true
Attachment #351948 - Flags: review?(jwalden+bmo)
Comment on attachment 351948 [details] [diff] [review] simpler, v2 >diff --git a/testing/mochitest/ssltunnel/ssltunnel.cpp b/testing/mochitest/ssltunnel/ssltunnel.cpp >- * Ted Mielczarek <ted.mielczarek@gmail.com> >+ * Ted Mielczarek <ted.mielczarek@gmail.com> >+ * Honza Bambas <honzab@firemni.cz> Nitpicking, but indent another space there (so two spaces indent from the rest of the comment contents). >+const PRInt32 BUF_SIZE = 16384; >+const PRInt32 BUF_MARGIN = 1024; >+const PRInt32 BUF_TOTAL = (BUF_SIZE + BUF_MARGIN); Make these static const defs inside relayBuffer, and nix parens on BUF_TOTAL. >+struct relayBuffer Sometime we should make the pointer members private and expose access through methods like append or so, so that you don't have to manually manage every time you want to add data. Not here, tho... > bool ReadConnectRequest(server_info_t* server_info, >- char* bufferhead, char* buffertail, PRInt32* result, string* certificate, client_auth_option* clientauth) >+ relayBuffer& buffer, PRInt32* result, string& certificate, >+ client_auth_option* clientauth, string& host) > { >- if (buffertail - bufferhead < 4) >+ if (buffer.present() < 4) > return false; >- if (strncmp(buffertail-4, "\r\n\r\n", 4)) >+ if (strncmp(buffer.buffertail-4, "\r\n\r\n", 4)) > return false; > > *result = 400; > > char* token; >- token = strtok(bufferhead, " "); >+ token = strtok(buffer.bufferhead, " "); > if (!token) > return true; > if (strcmp(token, "CONNECT")) > return true; Should these two return false? I think so, but I'm not sure. >+ certificate = (char*)c; static_cast<> >+ host = "https://"; >+ host += token; Collapse to a single line and use string() on one or the other. > *clientauth = *(client_auth_option*)c; Same here while you're in here. > else > *clientauth = caNone; > > token = strtok(NULL, "/"); > if (strcmp(token, "HTTP")) > return true; Maybe return false again? >+bool AdjustRequestURI(relayBuffer& buffer, string *host) string& host >+ *(buffer.buffertail+1) = '\0'; buffer.buffertail[1] = '\0'; >+ // If the path doesn't start with a slash don't change it, it is probably '*' or a full >+ // path already. Return true, we are done with this requets adjustment. "request's", and change the comma after "it" to a period and capitalize the other "it". >+ // We have sent response (at least part of it) thus we expect >+ // there will be a new request from the client. If we are in >+ // https proxy mode allow Request-URI adjustment. >+ if (s == 0) >+ expect_request_start = do_http_proxy; I don't understand why this is guaranteed to only execute once and never happen again. Would you explain this, please?
(In reply to comment #20) > (From update of attachment 351948 [details] [diff] [review]) > > bool ReadConnectRequest(server_info_t* server_info, > >- char* bufferhead, char* buffertail, PRInt32* result, string* certificate, client_auth_option* clientauth) > >+ relayBuffer& buffer, PRInt32* result, string& certificate, > >+ client_auth_option* clientauth, string& host) > > { > >- if (buffertail - bufferhead < 4) > >+ if (buffer.present() < 4) > > return false; > >- if (strncmp(buffertail-4, "\r\n\r\n", 4)) > >+ if (strncmp(buffer.buffertail-4, "\r\n\r\n", 4)) > > return false; > > > > *result = 400; > > > > char* token; > >- token = strtok(bufferhead, " "); > >+ token = strtok(buffer.bufferhead, " "); > > if (!token) > > return true; > > if (strcmp(token, "CONNECT")) > > return true; > > Should these two return false? I think so, but I'm not sure. > Shouldn't. true means the CONNECT request was completely received (\r\n\r\n at the end). false means the CONNECT request is not yet completely received and therefor it's too soon to parse it, we have to wait for additional data. > > else > > *clientauth = caNone; > > > > token = strtok(NULL, "/"); > > if (strcmp(token, "HTTP")) > > return true; > > Maybe return false again? > Same as before. > >+ // We have sent response (at least part of it) thus we expect > >+ // there will be a new request from the client. If we are in > >+ // https proxy mode allow Request-URI adjustment. > >+ if (s == 0) > >+ expect_request_start = do_http_proxy; > > I don't understand why this is guaranteed to only execute once and never happen > again. Would you explain this, please? As I said this patch doesn't count with pipelinig, same as httpd.js doesn't (AFAIK). So, we expect request will be received on next read from the client socket 1) at the very start (the first request after the CONNECT request) that actually comes after ssltunnel responses with 200 OK Connected, 2) after a response has been returned by the server (the chain 'request, response, request, response'.) And it's actually not guaranteed to only execute once and never happen again, it doesn't have to be. Hope we understand each other...
Comment on attachment 351948 [details] [diff] [review] simpler, v2 (In reply to comment #21) > > > char* token; > > >- token = strtok(bufferhead, " "); > > >+ token = strtok(buffer.bufferhead, " "); Merge the definition and initialization into one. > Shouldn't. true means the CONNECT request was completely received (\r\n\r\n at > the end). false means the CONNECT request is not yet completely received and > therefor it's too soon to parse it, we have to wait for additional data. Oh, I was glossing over the 400 assignment (that should be to a reference, of course), which results in the proper error case. > As I said this patch doesn't count with pipelinig, same as httpd.js doesn't > (AFAIK). You FAIK correctly. :-) > And it's actually not guaranteed to only execute once and > never happen again, it doesn't have to be. Sorry, I still don't understand this. As I see it, if we're in HTTP proxy mode and sending data to the client, any time we empty all to-be-sent data from buffers[1], we'll go back to expecting a Request-Line. What happens if the response is a 5k file that's too big to fit into the buffer for sending back to the client? We'll fill up buffers[0], write out 5k to net, and if all 5k can be completely written, we'll then reenable expect_request_start and expect another Request-Line, inside the 5k file itself.
(In reply to comment #22) > (From update of attachment 351948 [details] [diff] [review]) > (In reply to comment #21) > > > > > char* token; > > > >- token = strtok(bufferhead, " "); > > > >+ token = strtok(buffer.bufferhead, " "); > > Merge the definition and initialization into one. > Hmm... why? It's nicer this way IMHO. It indicates token is about to be used later in the code. But as you wish. > Sorry, I still don't understand this. As I see it, if we're in HTTP proxy mode > and sending data to the client, any time we empty all to-be-sent data from > buffers[1], we'll go back to expecting a Request-Line. What happens if the > response is a 5k file that's too big to fit into the buffer for sending back to > the client? We'll fill up buffers[0], write out 5k to net, and if all 5k can > be completely written, we'll then reenable expect_request_start and expect > another Request-Line, inside the 5k file itself. Ok. The buffers work this way: buffers[0] is a buffer to read from the client socket into it and then write the content in the same buffer to the server socket (relay). buffers[1] is used in opposite direction - readings from server socket are stored to that buffer and then written to the client socket. I count on fact that client doesn't do following: Request -> Request -> <- Response <- Response but it always does: Request -> <- Response Request -> <- Response and on fact that server never responses until the complete request is received. When receiving a request from the client we are always at the start of the buffer (buffers[0]) and we know that at that start of the buffer is start of the request (the Request-line) to parse and we are not in the middle of e.g. a request content.
You shouldn't rely on the server not doing that. People are agitating more and more of late for the ability to generate and send a response asynchronously in controllable steps, and it's only a matter of time before we hit situations where the incoming request has to be processed incrementally too. (I can think of at least one, HTML5's eventsource element (bug 338583), coming down the pipeline, and I'm sure there are others.)
OK, agree, then there is one question: will httpd.js remember the host from the first requests or it has to be adjusted in all other subsequent requests? Also, there is some need to have keep-alive on httpd.js or covered by ssltunnel. See bug 466080. Based on what you say it seems to be nonsense to implement that in ssltunnel and rather wait for impl on httpd.js level, although implement it in ssltunnel would be relatively easy.
Writing these notes as a bug comment because I unfortunately lost my IRC history of chat with Jeff. We discussed, as I remember: - we would like to have support of keep alive in httpd.js; Jeff started to work on this - when it's done we should get ssltunnel be ready for that as well as for pipelining to cover this bug (be able to adjust all incoming requests) - ssltunnel doesn't completely conform the http(s) proxy spec (HTTP 1.1 RFC) and probably should be rewritten anyway but it it not a priority at all. In this light I suspect ssltunnel would become much more complex, we will probably have to use some necko components to handle transfer encoding etc to successfully locate Request-URI in requests to be adjusted. Other way is to let httpd.js listen on two distinct ports and make ssltunnel to forward to just one of them. This way httpd.js would recognize the request is a ssl request and not pure http. This way actually normal http/s servers recognize incoming https from http. Other option is to close this as WONTFIX. There is a very simple workaround for this bug.
Attachment #351948 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 351948 [details] [diff] [review] simpler, v2 >diff --git a/testing/mochitest/ssltunnel/ssltunnel.cpp b/testing/mochitest/ssltunnel/ssltunnel.cpp >- * Ted Mielczarek <ted.mielczarek@gmail.com> >+ * Ted Mielczarek <ted.mielczarek@gmail.com> >+ * Honza Bambas <honzab@firemni.cz> Uber-nit: one more space of indentation on these is most common, I think. >+const PRInt32 BUF_SIZE = 16384; >+const PRInt32 BUF_MARGIN = 1024; >+const PRInt32 BUF_TOTAL = (BUF_SIZE + BUF_MARGIN); No parens in last assignment. >+ certificate = (char*)c; static_cast<> >+ // Cannot use strnchr so add a null char at the end. There is always some space left >+ // because we preserve a margin. >+ *(buffer.buffertail+1) = '\0'; buffer.buffertail[1] = '\0'; >+ // path already. Return true, we are done with this requets adjustment. "request" >+ memmove(path + hostlength, path, buffer.buffertail - path); >+ memmove(path, host->c_str(), hostlength); This latter should be a memcpy. Not especially keep-alive-ready yet, of course, but good enough for now. BTW, when we do add that, I do think it would be easy for httpd.js to cache the hostname and port for future pipelined requests, so you shouldn't have to be mucking with the Request-Line every time. :-)
Attached patch simple fix v3 (deleted) — Splinter Review
Ready to land. (In reply to comment #27) > Not especially keep-alive-ready yet, of course, but good enough for now. BTW, > when we do add that, I do think it would be easy for httpd.js to cache the > hostname and port for future pipelined requests, so you shouldn't have to be > mucking with the Request-Line every time. :-) Great! This will make thinks easier for now. Under this light I removed the code to reengage the request line update flag when some data is sent back to the client. Now ssltunnel doesn't count with keep-alive and pipelining at all.
Assignee: jwalden+bmo → honzab.moz
Attachment #351871 - Attachment is obsolete: true
Attachment #351948 - Attachment is obsolete: true
Dep on bug 456001, the patch for that bug must land first.
Depends on: 456001
Blocks: 470963
Blocks: 466524
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Blocks: 466080
Status: REOPENED → RESOLVED
Closed: 16 years ago16 years ago
Resolution: --- → FIXED
Are there plans to land this on the 1.9.1 branch?
If so we should land bug 456001 first. The patch for this bug is dependent on it.
Attached patch v3 for 1.9.1 (deleted) — Splinter Review
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: