Bug 85500
Opened 23 years ago
Closed 23 years ago
Mozilla includes # anchors in GET URI in some cases
Categories
(Core :: Networking: HTTP, defect, P4)
Status: VERIFIED FIXED
Target Milestone: mozilla0.9.4
People
(Reporter: sharding, Assigned: neeti)
Attachments (1 file): patch (deleted)
When Mozilla does an HTTP GET, if the URL includes more than one '#' character,
it includes everything before the last '#' in the URI sent to the server. For
example, loading http://foo.example.com/foo.html#bar# will result in:
GET /foo.html#bar HTTP/1.1
etc., etc.
Some servers handle this gracefully and ignore the anchor, but others spit back
a 404. Either way, this doesn't seem to be the correct behavior. Shouldn't it
be lopping off everything from the first '#' onward? That's what Netscape 4.7x,
IE and Opera appear to do.
Comment 1•23 years ago
Well... a URI with two # characters in it is illegal, no? The second # should
be escaped.
So you'll get different results depending on whether the browser looks for #
from the front (as IE/Opera/4.x seem to) or from the back (as we seem to).
Confirming bug, though. We should try to deal with this invalid case in my
opinion....
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: FreeBSD → All
Hardware: PC → All
Comment 2•23 years ago (Reporter)
Is it illegal? That wasn't clear to me. I'd agree that it should be avoided, but
I wasn't able to find anywhere (specifically looking in RFC 2396) that said
there can only be one '#'.
Either way, it does exist in the wild, so Mozilla might as well deal with it
cleanly.
Comment 3•23 years ago
RFC 2396. Section 2.4.3 -- Excluded US-ASCII Characters
The character "#" is excluded because it is used to delimit a URI from a
fragment identifier in URI references (Section 4)
Data corresponding to excluded characters must be escaped in order to
be properly represented within a URI.
I agree with you, though. We should consider being compatible with other
browsers on this.
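To illustrate what "escaped" means in the RFC text quoted above, here is a minimal sketch. EscapeHash is a hypothetical helper written for this example, not a Mozilla function; it only handles '#', which percent-encodes to %23, and ignores every other excluded character.

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration, not Mozilla code):
// percent-encode every literal '#' in a piece of data as "%23" so that it
// can appear inside a URI without being read as the fragment delimiter.
std::string EscapeHash(const std::string& data) {
    std::string out;
    out.reserve(data.size());
    for (char c : data) {
        if (c == '#') {
            out += "%23";   // 0x23 is the ASCII code for '#'
        } else {
            out += c;
        }
    }
    return out;
}
```

With this, a fragment whose data is bar# would be written as /foo.html#bar%23, leaving exactly one unescaped '#' in the reference.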
Comment 4•23 years ago (Reporter)
> The character "#" is excluded because it is used to delimit a URI from a
> fragment identifier in URI references (Section 4)
> Data corresponding to excluded characters must be escaped in order to
> be properly represented within a URI.
Right. I saw that, but I read that as saying that a "#" can't be in a URI. If
there are two "#"s, the first one would be delimiting the URI from the fragment
identifier and the second one would be part of the fragment identifier. So the
real question is whether or not "#" is allowed in fragment identifiers. It turns
out that it isn't; I just didn't see that part the first time I looked:
The character restrictions described in Section 2 for URI also apply to the
fragment in a URI-reference.
So, you're right. It's not legal. We agree that Mozilla's handling of it should
change, but I'll go ahead and contact the maintainers of the site I first saw
this on to let them know the problem as well.
If this could be a long discussion, let's discuss this in a newsgroup. If we
decide this is unsupported, I'll send this to evangelism.
Updated•23 years ago
Target Milestone: mozilla0.9.3 → mozilla0.9.4
Comment 6•23 years ago
I don't know where the observed behaviour would come from; the urlparser does
the right thing, as can be seen with urltest (updated version):
urltest http://foo.example.com/foo.html#bar#
gives
http://foo.example.com/foo.html#bar#
http,,,foo.example.com,-1,/,foo,html,,,bar#,http://foo.example.com/foo.html#bar#
Does anyone have any real live examples? The only possibility I see is that, if
this really happens, somewhere down in the http protocol someone does its
own parsing ... not good ...
Comment 7•23 years ago
The problem is inside nsHttpChannel::SetupTransaction()
...
    // use the URI path if not proxying (transparent proxying such as SSL proxy
    // or socks does not count here).
    nsXPIDLCString requestURIStr;
    const char* requestURI;
    if (!mConnectionInfo->ProxyHost() ||
        mConnectionInfo->UsingSSL() ||
        !PL_strcmp(mConnectionInfo->ProxyType(), "socks") ||
        !PL_strcmp(mConnectionInfo->ProxyType(), "socks4")) {
        rv = mURI->GetPath(getter_Copies(requestURIStr));
        if (NS_FAILED(rv)) return rv;
        requestURI = requestURIStr.get();
    }
    else
        requestURI = mSpec.get();
    // trim off the #ref portion if any...
    char *p = PL_strrchr(requestURI, '#');
    if (p) *p = 0;
...
This should be char *p = PL_strchr(requestURI, '#').
Any # that is legitimately part of the path or spec is escaped, so a search from the left will not find it; the first # it does find is the start of the ref.
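The difference between the two searches can be sketched in isolation. This is only an illustration using the standard std::strrchr and std::strchr in place of PL_strrchr and PL_strchr; TrimAtLastHash and TrimAtFirstHash are hypothetical names made up for this example and are not part of nsHttpChannel.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: cut the string at the LAST '#', which mirrors the
// right-to-left search in the excerpt above.
std::string TrimAtLastHash(std::string uri) {
    const char* start = uri.c_str();
    const char* p = std::strrchr(start, '#');
    if (p) uri.resize(static_cast<std::size_t>(p - start));
    return uri;
}

// Hypothetical helper: cut the string at the FIRST '#', which mirrors the
// left-to-right search proposed as the fix.
std::string TrimAtFirstHash(std::string uri) {
    const char* start = uri.c_str();
    const char* p = std::strchr(start, '#');
    if (p) uri.resize(static_cast<std::size_t>(p - start));
    return uri;
}
```

For the path from the original report, /foo.html#bar#, TrimAtLastHash leaves /foo.html#bar (what was being sent to the server), while TrimAtFirstHash leaves /foo.html.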
Comment 8•23 years ago
Comment 9•23 years ago
cc-ing darin, who last worked on that code
Keywords: review
Whiteboard: seeking r/sr
Comment 10•23 years ago
r=darin on the patch
Comment 11•23 years ago
sr=rpotts
Comment 12•23 years ago
a=dbaron (on behalf of drivers)
Comment 13•23 years ago
fix checked in.
Status: NEW → RESOLVED
Closed: 23 years ago
Keywords: review
Resolution: --- → FIXED
Whiteboard: seeking r/sr
Comment 14•22 years ago
Verified per andreas' comment.
Status: RESOLVED → VERIFIED
QA Contact: benc → junruh