Closed Bug 84409 Opened 24 years ago Closed 23 years ago

URL: relative URLs break if base URL has ";" character

Categories

(Core :: Networking, defect, P4)

x86
Windows NT
defect

Tracking

()

RESOLVED FIXED
mozilla0.9.4

People

(Reporter: rail, Assigned: gagan)

References

Details

(Whiteboard: got r=dougt, seeking sr=)

Attachments

(3 files)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; WinNT4.0; en-US; rv:0.9) Gecko/20010505
BuildID: 2001050515

Mozilla seems to return an incorrect URL when attempting to access page components (images etc.) which are specified by their relative locations, if the page URL contains a ";" character. For example, while accessing a web page located in a directory on a server where the directory name contains a semi-colon, the requests received from Mozilla seem to skip the directory name. (Please refer to the 'Steps to Reproduce' and 'Additional Information' sections.)

Steps to setup Environment:
1. In your webserver's document root directory, create a temporary directory whose name includes a semi-colon (;), e.g. 'Test;Folder'.
2. In this folder, place a simple HTML file (say 'URLTest.html') which among other things accesses an image (I used a .gif file) in this folder. Note that the image must be accessed with its relative URL, e.g. '<IMG SRC="6jelly.gif" HEIGHT=300 WIDTH=400>'.
3. Now load the page in Mozilla by entering the following in the address bar: 'http://<hostname>/Test;Folder/URLTest.html'.

The page should show up with the image. (This seems to work with IE 5.5 and Netscape 4.08.)

Webserver Log Entries for requests:

From IE 5.5:
192.168.55.249 - - [06/Jun/2001:18:29:46 -0700] "GET /Test;Folder/URLTest.html HTTP/1.1" 200 195
192.168.55.249 - - [06/Jun/2001:18:29:46 -0700] "GET /Test;Folder/6jelly.gif HTTP/1.1" 200 245396

From Netscape Navigator 4.08:
192.168.55.249 - - [06/Jun/2001:18:30:34 -0700] "GET /Test;Folder/URLTest.html HTTP/1.0" 200 195
192.168.55.249 - - [06/Jun/2001:18:30:35 -0700] "GET /Test;Folder/6jelly.gif HTTP/1.0" 200 245396

From Mozilla M18:
192.168.55.249 - - [06/Jun/2001:18:21:15 -0700] "GET /Test;Folder/URLTest.html HTTP/1.1" 200 195
192.168.55.249 - - [06/Jun/2001:18:21:17 -0700] "GET /6jelly.gif HTTP/1.1" 404 323

Note that the 'Test;Folder' part of the URL has been skipped in the request for the image made by Mozilla.
Reproducible: Always

Steps to Reproduce:
1. In your webserver's document root directory, create a temporary directory whose name includes a semi-colon (;), e.g. 'Test;Folder'.
2. In this folder, place a simple HTML file (say 'URLTest.html') which among other things accesses an image (I used a .gif file) in this folder. Note that the image must be accessed with its relative URL, e.g. '<IMG SRC="6jelly.gif" HEIGHT=300 WIDTH=400>'.
3. Now load the page in Mozilla by entering the following in the address bar: 'http://<hostname>/Test;Folder/URLTest.html'.

Actual Results: The browser is not able to resolve the URL for the image. It just doesn't show the image.

Expected Results: The HTML file should be displayed as it is designed, i.e. I would expect the image to be shown. For that matter, any other relative URLs in it should resolve as well.
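The setup steps above could be scripted roughly as follows. This is only a sketch: the helper name make_testcase and the choice of root directory are assumptions for illustration, and the image file itself (6jelly.gif in the report) still has to be copied into the folder by hand.

```python
import tempfile
from pathlib import Path


def make_testcase(root):
    """Create 'Test;Folder/URLTest.html' under root and return the page path.

    Hypothetical helper; the directory and file names follow the report.
    """
    folder = Path(root) / "Test;Folder"
    folder.mkdir(parents=True, exist_ok=True)
    page = folder / "URLTest.html"
    # The image is referenced by a relative URL, as the report requires.
    page.write_text(
        '<html><body><img src="6jelly.gif" height="300" width="400">'
        "</body></html>\n",
        encoding="utf-8",
    )
    return page


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as doc_root:
        print(make_testcase(doc_root))
```

In a real run, root would be the web server's document root rather than a temporary directory.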
This works if you use "%3B" instead of ";" in the URL, as you should if it is part of a file/directory name and not a separator for parameters. Nevertheless, this is a bug in Mozilla. We seem to treat ";" as a terminator of the URI if it is used as a base URI. some.gif relative to http://host/x/a;b/ becomes http://host/x/some.gif , and the same happens for some.gif relative to http://host/x/a;b/y/ . Confirming (2001-06-06-04-trunk, Win NT). -> Networking
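To make the difference concrete, here is a minimal sketch of the two behaviours on the path part only. Both function names are hypothetical, and resolve_buggy is a guess at the observed behaviour (cutting the base at the first ";"), not a copy of the actual Mozilla code.

```python
def resolve_buggy(base_path, relative):
    """Mimic the reported behaviour: the base path ends at the first ';'.

    Assumption: this reproduces the symptom, not the real parser logic.
    """
    cut = base_path.find(";")
    if cut != -1:
        base_path = base_path[:cut]
    # Keep everything up to and including the last '/', then append.
    return base_path[: base_path.rfind("/") + 1] + relative


def resolve_expected(base_path, relative):
    """Treat ';' like any other character when finding the base directory."""
    return base_path[: base_path.rfind("/") + 1] + relative


if __name__ == "__main__":
    for base in ("/x/a;b/", "/x/a;b/y/", "/Test;Folder/URLTest.html"):
        print(base, resolve_buggy(base, "some.gif"),
              resolve_expected(base, "some.gif"))
```

With the base paths from this comment, resolve_buggy yields /x/some.gif in both cases, while resolve_expected keeps the a;b directory. Only simple relative names are handled here; "../" and absolute references are out of scope for the sketch.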
Assignee: asa → neeti
Status: UNCONFIRMED → NEW
Component: Browser-General → Networking
Ever confirmed: true
QA Contact: doronr → benc
A quick look at nsStdURLParser makes me think our URL parsing is completely broken with respect to parameters. We seem to assume parameters can only occur in the final path segment and only one parameter is possible.

RFC 2396:

3.3. Path Component

The path component contains data, specific to the authority (or the scheme if there is no authority component), identifying the resource within the scope of that scheme and authority.

   path          = [ abs_path | opaque_part ]
   path_segments = segment *( "/" segment )
   segment       = *pchar *( ";" param )
   param         = *pchar
   pchar         = unreserved | escaped | ":" | "@" | "&" | "=" | "+" | "$" | ","

The path may consist of a sequence of path segments separated by a single slash "/" character. Within a path segment, the characters "/", ";", "=", and "?" are reserved. Each path segment may include a sequence of parameters, indicated by the semicolon ";" character. The parameters are not significant to the parsing of relative references.
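As an illustration of the grammar quoted above, the following sketch splits a path into segments and gives each segment its own list of parameters. The function name split_segments is hypothetical, and the code does not validate pchar or handle escapes; it only shows that any segment, not just the last one, may carry several parameters.

```python
def split_segments(path):
    """Split a path into (name, [params]) pairs, one per '/'-separated segment.

    Follows 'segment = *pchar *( ";" param )' loosely: no character
    validation is done, this is only a structural sketch.
    """
    segments = []
    for raw in path.split("/"):
        name, *params = raw.split(";")
        segments.append((name, params))
    return segments


if __name__ == "__main__":
    print(split_segments("/x/a;b/y"))
    print(split_segments("/dir;p1;p2/file;type=a"))
```

Because the leading "/" produces an empty first segment, "/x/a;b/y" comes back as four pairs, with "b" recorded as a parameter of the third one.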
Keywords: qawanted
benc: could you set up a testcase for this bug? Thanks, neeti
Making a local testcase for this is simple...
1. Create a directory like C:\TEST\FOO;BAR\
2. Place an HTML file in this directory, say C:\TEST\FOO;BAR\hello.html
3. Open the browser and type "C:\TEST\FOO;BAR\hello.html" in the URL bar.

Actual Results: Browser cannot open this file... But if you use the "File -> Open" method to load that file, the ";" is escaped to "%3B" and you can open it successfully.
The test case is presented pretty clearly. (Thanks rail & doctor!) Do you mean you need this loaded on a system? Probably loading this on an NT system would be ideal; my flagship server is Solaris, and ";" means "end of shell command" in UNIX. If this is what you want, I'll put it on my list of things to do.
I'm not sure if *NIX would accept a directory name containing ";". I can certainly create such a directory in Win98.
I am using an NT system, so I guess we would have to check it on an NT system. However, it is possible to create a folder under Solaris that contains the ; character by escaping it with a \ . You can say: mkdir test\;folder Thanks
I have set up small testcases at http://clarence.de/tests/x;y/test.html and http://clarence.de/tests/x;y/x/test.html , but I can't give you access to my server logs. To compare the escaped URLs: http://clarence.de/tests/x%3By/test.html and http://clarence.de/tests/x%3By/x/test.html .
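For reference, the escaped form of such URLs can be produced with Python's standard urllib.parse.quote, which encodes ";" as "%3B" when it is not listed as safe. The helper names below are hypothetical; the path pieces are taken from the testcase URLs above.

```python
from urllib.parse import quote, unquote


def escape_segment(name):
    """Percent-encode one path segment, including ';' and '/'."""
    return quote(name, safe="")


def escape_path(segments):
    """Join already-split segments into an escaped relative path."""
    return "/".join(escape_segment(s) for s in segments)


if __name__ == "__main__":
    print(escape_path(["tests", "x;y", "test.html"]))
    print(unquote("x%3By"))
```

Escaping each segment separately keeps a literal ";" in a directory name distinct from a ";" that is meant to introduce a parameter.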
Priority: -- → P4
Target Milestone: --- → mozilla0.9.2
Target Milestone: mozilla0.9.2 → mozilla0.9.3
Target Milestone: mozilla0.9.3 → mozilla0.9.4
If I remember correctly, it was decided not to support params inside the directory part of the path, because nobody has ever used them that way and it is not easy to implement. But maybe we have to look at the detection algorithm that decides whether a ; starts the param or is part of the directory name because there are one or more / to the right of it in the string. I will look into that.
If we parse URIs according to RFC 2396 (as opposed to the obsolete RFC 1808) we don't need to care about parameters at all. We can treat ';' like any other char. It's only the server's responsibility to care about parameters.
I've attached a patch to get this kind of URL parsed in conformance with RFC 2396. I stayed with the param component, but we now ignore ; when there is a slash after it and it is not in the query or ref part.
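The rule described in this comment might be sketched as follows. This is a hypothetical reconstruction from the description only, not the attached patch: the function name param_start and the string-based approach are assumptions. A ";" counts as the start of the param component only if it sits in the last path segment, and anything after "?" or "#" is never inspected.

```python
def param_start(url_path):
    """Return the index of the ';' that starts the param component, or -1.

    url_path is the part of the URL after the authority, possibly
    followed by '?query' and/or '#ref'.
    """
    # Cut off the query and ref parts so ';' or '/' inside them is ignored.
    end = len(url_path)
    for marker in "?#":
        pos = url_path.find(marker)
        if pos != -1:
            end = min(end, pos)
    path = url_path[:end]
    # Only a ';' after the last slash can start the param component.
    last_slash = path.rfind("/")
    return path.find(";", last_slash + 1)


if __name__ == "__main__":
    for candidate in ("/tests/x;y/test.html", "/dir/file;type=a",
                      "/a;b/c?x=1;y/2"):
        print(candidate, param_start(candidate))
```

Under this rule "/tests/x;y/test.html" has no param component, so the directory x;y survives relative resolution, while "/dir/file;type=a" still exposes ";type=a" as a param.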
-> gagan: can you review this patch?
Assignee: neeti → gagan
This patch needs heavy testing on a number of test URLs. I will look into my old stack of fixed urlparser bugs to test it against some of the finest ;-) examples of URLs that use params. For example: are there params that have a / inside them? That could be a killer ...
Whiteboard: got r=dougt, seeking sr=
hey Andreas, I "think" the patch looks fine... It's one of those changes where it's correct if it's been tested and doesn't break anything :-) r/sr=rpotts
Thanks, Rick! I've gone through the old bugs, done the precheckin tests and haven't seen any problems ( after the latest update ;-) ).
fix is checked in
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Summary: Incorrect request for page components if main page URL contains ";" character → URL: relative URLs break if base URL has ";" character
Keywords: qawanted