Closed
Bug 483304
Opened 16 years ago
Closed 9 years ago
location.hash getter returns the hash value unescaped ("%7C" turns into "|")
Categories
(Core :: DOM: Core & HTML, defect)
Core
DOM: Core & HTML
Tracking
()
RESOLVED
DUPLICATE
of bug 1093611
People
(Reporter: szabobela1987, Unassigned)
References
Details
(Keywords: html5, testcase)
Attachments
(4 files, 4 obsolete files)
(deleted),
text/html
|
Details | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
jduell.mcbugs
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
bzbarsky
:
review+
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5
The encodeURIComponent doesn't encode the "|" character to "%7C", yet all the major browsers (Chrome, IE, Opera, Safari) does, this way makes cross-browser development harder.
Reproducible: Always
Steps to Reproduce:
1. Created a variable containing a hash (the "|" character's role in the hash is significant)
2. Used the mentioned function on a the variable.
3. Replaced the current hash with the variable
Actual Results:
The current URL changed to this -> "blog.php#page:1|1"
Expected Results:
The current URL should have changed to this -> "blog.php#page:1%7C1"
Updated•16 years ago
|
Assignee: nobody → general
Component: General → JavaScript Engine
OS: Windows 7 → All
Product: Firefox → Core
QA Contact: general → general
Hardware: x86 → All
Summary: encodeURIComponent function error → encodeURIComponent doesn't encode "|"
Comment 1•16 years ago
|
||
Works for me:
js1.9.0> encodeURIComponent("|")
%7C
js1.9.1> encodeURIComponent("|")
%7C
js1.9.2> encodeURIComponent("|")
%7C
Do you have a testcase that demonstrates the problem? Perhaps the difference in behavior is at some other level, or under different circumstances.
Comment 2•16 years ago
|
||
Need a testcase. Sounds like this involves URL processing by Necko or Gecko or even the Location Bar. It's not JS engine as Gavin shows.
/be
Assignee: general → nobody
Component: JavaScript Engine → General
QA Contact: general → general
Reporter | ||
Comment 3•16 years ago
|
||
Sorry, I just realized that the problem has a different nature: the function
works correctly but if I enter "%7C" to the URL bar it will convert it back to
"|".
Comment 4•16 years ago
|
||
We display unescaped values in the location bar on purpose - it's much more user friendly. That shouldn't have any impact on what the server sees, though. Are you suggesting it does?
Summary: encodeURIComponent doesn't encode "|" → "%7C" displayed unescaped ("|") in the location bar
Reporter | ||
Comment 5•16 years ago
|
||
No, on the server side it works correctly, my problem with it that I store my AJAX variables in the hash and separate them with the "|" character for example:
"blog.php#page:3|user:2|category:5"
This is only problematic when the variable itself contains the "|", in that case it should be stored as "%7C", so it doesn't cut the variable's content in half.
Actually this can be considered quite rare, but since I'm working on a CMS project I would like to know if this will change in the near future or I should remove the "|" characters from the variables if the user agent is Firefox?
Comment 6•16 years ago
|
||
The difference shouldn't be visible to page scripts either.
Can you provide example code that behaves differently in Firefox because of this difference?
Reporter | ||
Comment 7•16 years ago
|
||
I've uploaded an example script. I've tested it with Chrome, Firefox 3, IE 6/7/8beta, Opera 9 and Safari, all return the same result except Firefox.
Comment 8•16 years ago
|
||
Difference between RFC 2396 and RFC 3986 is relevant?
> http://www.faqs.org/rfcs/rfc2396.html
> 2.4.3. Excluded US-ASCII Characters
> (snip)
> unwise = "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"
> Data corresponding to excluded characters must be escaped in order to
> be properly represented within a URI.
> http://www.faqs.org/rfcs/rfc3986.html
> No special description on "|" in RFC 3986.
Bugs which are open && ( summary contains rfc && (2396 || 3986) ) :
> Bug 275689 same-document references should work according to RFC 3986
> Bug 309671 Support %-escaped hostnames per RFC 3986 (3.2.2) / Cannot open IDN from other applications(e.g., from Thunderbird)
> Bug 394537 Update the encodeURI and the decodeURI JavaScript functions to reflect the new reserved characters in RFC 3986
Comment 9•16 years ago
|
||
Ah, this is because we unescape the value in GetHash explicitly:
There's a relevant discussion in bug 135309, starting at bug 135309 comment 30.
Status: UNCONFIRMED → NEW
Component: General → DOM
Ever confirmed: true
QA Contact: general → general
Summary: "%7C" displayed unescaped ("|") in the location bar → location.hash getter returns the hash value unescaped ("%7C" turns into "|")
Comment 10•16 years ago
|
||
(In reply to comment #9)
> Ah, this is because we unescape the value in GetHash explicitly:
http://hg.mozilla.org/mozilla-central/annotate/29a714ae9d11/dom/base/nsLocation.cpp#l399
Comment 11•16 years ago
|
||
This testcase passes in:
Opera 10.00 Alpha 6166
Safari Version 3.2.1 (5525.27.1)
IE 7
It fails in Firefox trunk/branch.
Attachment #367493 -
Attachment is obsolete: true
Comment 12•16 years ago
|
||
Yeah, this is just a duplicate of bug 135309, no?
Until necko stops escaping refs on url parse, we can't really change this.
Updated•16 years ago
|
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → DUPLICATE
Comment 14•16 years ago
|
||
(In reply to comment #12)
> Yeah, this is just a duplicate of bug 135309, no?
Except that that bug is FIXED...
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Updated•16 years ago
|
Status: REOPENED → NEW
Comment 15•16 years ago
|
||
That bug should have been marked "wontfix"; it got hijacked, basically.
This is also a problem with .hash on the <a> element. HTML5 doesn't prescribe unescaping here.
Keywords: html5
gsnedders says Google Wave expects the current behavior from Firefox (browser sniffing going on).
Comment 20•15 years ago
|
||
ok, 8 years later and we're still waiting for a fix for this. any estimates on when we can expect this to be fixed, guys?
Comment 21•15 years ago
|
||
Probably right after you fix it? ;)
Seriously, just changing this behavior breaks other things; to change it one has to change how URI refs are handled throughout Gecko wholesale. In particular, it involves the mess in bug 436006. So once someone steps up and spends a few weeks of full-time work on it, it'll be fixed. Given the amount of work involved and the low impact, it's not a high priority for me personally at the moment.
Comment 24•13 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #21)
> Seriously, just changing this behavior breaks other things; to change it one
> has to change how URI refs are handled throughout Gecko wholesale. In
> particular, it involves the mess in bug 436006. So once someone steps up
> and spends a few weeks of full-time work on it, it'll be fixed. Given the
> amount of work involved and the low impact, it's not a high priority for me
> personally at the moment.
Is it not possible to fix the public interfaces of the <a> DOM element and the window/document.location object to expose the Right Thing? That sounds like it should cover the web-exposed side of this, be much easier, more contained, and take much less work. Or would that patch not get taken because it's Not the Right Thing? :-)
Comment 25•13 years ago
|
||
> Is it not possible to fix the public interfaces of the <a> DOM element and the
> window/document.location object to expose the Right Thing
No, because the "right thing" requires changing our internal storage of the URI. Right now we have to unescape because the URI is stored escaped. And escaping is always lossy.
And if we change the internal URI storage, then you end up with a cascade of other things that need to be fixed, as I said in the comment you quoted.
Comment 26•12 years ago
|
||
What about
1. storing the URI without escaping,
2. escaping the URI on retrieving it via nsIURI or nsIURL, and
3. adding a new interface nsIUnescapedURL or something and use it from Location and others expecting unescaped representation
to minimize the impact?
Comment 27•12 years ago
|
||
I do not want to add yet another hackaround for a new iid. Actual serialization format change will be in the following patch.
Comment 28•12 years ago
|
||
Attachment #650388 -
Flags: review?(bzbarsky)
Comment 29•12 years ago
|
||
Attachment #650390 -
Flags: review?(bzbarsky)
Comment 30•12 years ago
|
||
Attachment #650391 -
Flags: review?(bzbarsky)
Comment 31•12 years ago
|
||
Attachment #650392 -
Flags: review?(bzbarsky)
Comment 32•12 years ago
|
||
Attachment #650393 -
Flags: review?(bzbarsky)
Comment 33•12 years ago
|
||
I'm not going to get to these reviews at least until the week starting August 27, unfortunately.
Comment 34•12 years ago
|
||
But just so I'm clear on what's going on, is the .hash thing just working around the fact that <a name> stuff doesn't follow the spec right now? Or are there other consumers you're worried about?
Comment 35•12 years ago
|
||
I'm planning to implement the URL standard.
http://dvcs.w3.org/hg/url/raw-file/tip/Overview.html
nsURI.hash will be used from URL.hash.
Comment 36•12 years ago
|
||
I guess the real question is why we're adding .hash instead of changing behavior of .ref.
Comment 37•12 years ago
|
||
For compatibility with existing consumers. Note that ref (mRef) is used as a part of nsIURI comparison (by Equals method).
Comment 38•12 years ago
|
||
Right, but the question is what those consumers are. It's just a bit confusing to have two properties that basically do the same thing, and people will keep using the wrong one... I can live with it if we have to, but if we can avoid it that would be ideal.
mRef can be kept as an implementation detail, of course; it just doesn't have to be exposed in the API. Though comparing on escaped representations if the unescaped ones are supposed to be used can screw things up (e.g. not scrolling on anchor navigation even though two different anchors are pointed to), no?
Comment 39•12 years ago
|
||
<a name="è"> and <a name="%C3%A8"> are different per HTML5 spec while "#è" and "#%C3%A8" are supposed to be the same per IRI spec (RFC 3987). So we need to keep unescaped representation for DOM even if they are considered identical as URI/IRI. I added mHash for this purpose.
Comment 40•12 years ago
|
||
> while "#è" and "#%C3%A8" are supposed to be the same per IRI spec (RFC 3987)
Last I checked, this was one of the parts of RFC 3987 that doesn't quite match reality... In practice, we clearly can't canonicalize the former to the latter if they need to have different behavior!
Comment 41•12 years ago
|
||
IRI equivalence depends on the comparison level.
https://tools.ietf.org/html/rfc3987#section-5.3.1
https://tools.ietf.org/html/rfc3987#section-5.3.2.3
Some consumers may require IRI-to-URI mapping and some others may not. So the different properties will be required. No?
Comment 42•12 years ago
|
||
Hmm. Maybe. ccing some people who're actually up on all this network stuff.
Comment 43•12 years ago
|
||
This really needs another reviewer. I don't know when I'll have a chance to dig into all this gunk, if ever, and the necko change should really get review from someon who's an active necko peer.
Updated•12 years ago
|
Attachment #650387 -
Flags: review?(bzbarsky) → review?(benjamin)
Updated•12 years ago
|
Attachment #650388 -
Flags: review?(bzbarsky) → review?(jduell.mcbugs)
Updated•12 years ago
|
Attachment #650390 -
Flags: review?(bzbarsky) → review?(jduell.mcbugs)
Updated•12 years ago
|
Attachment #650391 -
Flags: review?(bzbarsky) → review?(jduell.mcbugs)
Updated•12 years ago
|
Attachment #650392 -
Flags: review?(bzbarsky) → review?(jduell.mcbugs)
Comment 44•12 years ago
|
||
Comment on attachment 650393 [details] [diff] [review]
Part 6: Use nsIURI.hash from DOM interfaces
r=me if the previous stuff is all OK, I guess.
Attachment #650393 -
Flags: review?(bzbarsky) → review+
Updated•12 years ago
|
Attachment #650388 -
Flags: review?(jduell.mcbugs) → review+
Comment 45•12 years ago
|
||
Comment on attachment 650391 [details] [diff] [review]
Part 4: Implement nsIURI.hash
OK, this attribute certainly needs more documentation on how it differs from .ref: http://mxr.mozilla.org/mozilla-central/source/netwerk/base/public/nsIURI.idl#220
How does it differ from ref?
Attachment #650391 -
Flags: review-
Comment 46•12 years ago
|
||
hmm so after reading some comments... .hash is the escaped version of .ref...?
Comment 47•12 years ago
|
||
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #46)
> hmm so after reading some comments... .hash is the escaped version of
> .ref...?
More precisely, .hash will not be escaped regardless of "network.standard-url.escape-utf8" state. If the input string is already escaped, it will be kept as-is.
I consider removing this pref because I doubt we can change the setting.
Comment 48•12 years ago
|
||
Comment on attachment 650390 [details] [diff] [review]
Part 3: Fix inconsistency between nsSimpleURI::SetSpec and nsSimpleURI::SetRef and add automated tests for non-ASCII hash
I'm hoping :biesi can take this--I've talked with him about it on IRC. Really only :bz or he are equipped to take this on without some learning curve, and I don't have time for it.
Attachment #650390 -
Flags: review?(jduell.mcbugs) → review?(cbiesinger)
Updated•12 years ago
|
Attachment #650391 -
Flags: review?(jduell.mcbugs) → review?(cbiesinger)
Updated•12 years ago
|
Attachment #650392 -
Flags: review?(jduell.mcbugs) → review?(cbiesinger)
Comment 49•12 years ago
|
||
Entering URL in the address bar of Iceweasel 10.0.7 on Debian unstable and using Web Console:
1. document.location.search
a. https://www.google.ca/search?q=test| => Enter => https://www.google.ca/search?q=test|
document.location.search => "?q=test|"
b. https://www.google.ca/search?q=test%7C => Enter => https://www.google.ca/search?q=test|
document.location.search => "?q=test%7C"
c. https://www.google.ca/search?q=testè ==> Enter ==> https://www.google.ca/search?q=testè (cut-and-paste shows %C3%A8)
document.location.search => "?q=test%C3%A8"
d. https://www.google.ca/search?q=test%C3%A8 ==> Enter ==> https://www.google.ca/search?q=testè (cut-and-paste shows %C3%A8)
document.location.search => "?q=test%C3%A8"
e. https://www.google.ca/search?q=test%3D ==> Enter ==> https://www.google.ca/search?q=test%3D
document.location.search => "?q=test%3D"
Firefox's safe-percent-decoding-of-queries in the address bar does not degenerate when re-submitting queries (b) thanks to preserving "%3F" "?", "%3D" "=" and "%26" "&" in user input (e). Only javascripts will see difference in .search between submitting the original input in the address bar and re-submitting the address bar contents converted by Firefox.
2. document.location.hash
a. https://www.google.ca/search?q=test#%7C ==> Enter ==> https://www.google.ca/search?q=test|
document.location.hash => "#|"
b. https://www.google.ca/search?q=test#| ==> Enter ==> https://www.google.ca/search?q=test|
document.location.hash => "#|"
c. https://www.google.ca/search?q=test#è ==> Enter ==> https://www.google.ca/search?q=test#è (cut-and-paste shows "#%C3%A8")
document.location.hash => "#è"
d. https://www.google.ca/search?q=test#%C3%A8 ==> Enter ==> https://www.google.ca/search?q=test#è (cut-and-paste shows "#%C3%A8")
document.location.hash => "#è"
e. https://www.google.ca/search?q=test#%3D ==> Enter ==> https://www.google.ca/search?q=test#%3D
document.location.hash => "#="
3. Questions.
a. How does this disagree with other browsers?
b. Regardless of existing implementations, what disagrees with expectations? Why would we expect other results?
c. How do standards such as W3C URL Living Draft disagree with the current behaviour?
Comment 50•12 years ago
|
||
Comment on attachment 650387 [details] [diff] [review]
Part 1: Add iid parameter to nsISeralizable::Read to determine a serialization format and add a type-safe utility function to nsIObjectInputStream.idl
I'm going to clear this request until biesi has looked at the other parts of this bug. I pinged him about it in IRC. Please rerequest review when appropriate and I'll do it quickly ;-)
Attachment #650387 -
Flags: review?(benjamin)
Comment 51•12 years ago
|
||
So, re. question b:
I'm writing a web application with ember.js, which stores part of its state in an url fragment. It goes like this:
location.hash = '#/graphs/' + encodeURIComponent(JSON.stringify(state));
This makes for bookmarkable URLs. At some later time, ember may parse the URL into routes and parameters - here the route is /graphs/:graph_json, and ember passes encoded data into my de-serialization function for deserialization.
That works great in Chrome. Due to this bug, it *fails horribly* in Firefox, because there's every chance that the data I'm serializing contains slashes.
I'm working around the problem by browser sniffing and double-escaping my URL fragments, which sort-of-but-not-quite works. Sometimes they end up double-unescaped, sometimes someone manages to get a double-escaped version in a chat message and send it to a Chrome user, where it'll fail, often the double-escaping means the URL is too long for the web server.
Basically, when I assign some properly escaped value to location.hash, I want to get the same value back later.
Please fix.
Updated•12 years ago
|
Keywords: dev-doc-needed
Comment 53•11 years ago
|
||
No mater what you guys thing about the correctness of encoding the fragment part of the URL, this behavior causes the `hash` to be changed after page reload, which is ridiculous and is definitely a BUG that requires a fix.
I have seen this discussion started like 7 years ago, right? So I think it's about time to fix it, stop looking for excuses and move on.
Comment 54•11 years ago
|
||
In case anyone had still some doubts, please consider the following example:
window.loaction.hash = "#%3F" // or "%3F", whatever
The url becomes "/#%3F" which is the expected behavior. However,
window.location.hash === "#?"
which I could understood if it wasn't for the fact that after I hit `Ctrl-R`, the url becomes "/#?". WHAT? Really? Can anyone explain to me what is the logic behind this, and what are the reasons for not considering it as a BUG?
Comment 55•10 years ago
|
||
This above behaviour in Firefox is wrong per URL spec (and not only this):
http://url.spec.whatwg.org/#fragment-state
Other borwser do this correct (test Chrome37 and IE11), so right now (and for a long time) Firefox for developers is problematic.
set:
window.loaction.hash = "#%3F"
get:
window.loaction.hash
Result:
Firefox: "#?"
Chrome and IE and Opera (Presto): "#%3F"
Comment 56•10 years ago
|
||
Hello. I also got some troubles with hash. See that: https://github.com/putaindecode/fix-all-the-things/issues/11
Updated•10 years ago
|
Assignee: VYV03354 → nobody
Status: ASSIGNED → NEW
Comment 57•10 years ago
|
||
The unescaping is the real issue with the current implementation:
window.location.hash = '%3F?';
console.log(window.location.hash);
> #??
This way I have to use the window.location.href to get back the original value, since the getter cannot return the same string I set with the setter.
Comment 58•10 years ago
|
||
The testcase passes in current nightly versions.
I think this bug was fixed by bug 1093611.
Comment 59•10 years ago
|
||
Sure.
If I copied a url with "#%7C" from the location bar or press Enter on the location bar, it will turn into "#|", but it is a different (Firefox) bug.
Status: NEW → RESOLVED
Closed: 16 years ago → 10 years ago
Resolution: --- → FIXED
Updated•10 years ago
|
Attachment #650390 -
Attachment is obsolete: true
Attachment #650390 -
Flags: review?(cbiesinger)
Updated•10 years ago
|
Attachment #650391 -
Attachment is obsolete: true
Attachment #650391 -
Flags: review?(cbiesinger)
Updated•10 years ago
|
Attachment #650392 -
Attachment is obsolete: true
Attachment #650392 -
Flags: review?(cbiesinger)
Comment 60•9 years ago
|
||
The bug seems to be wrongly marked as "RESOLVED FIX", the testcase still fails in Firefox 38.0.5 as well es in Nightly 41.0a1 (2015-06-08).
Comment 61•9 years ago
|
||
(In reply to Malte Wedel from comment #60)
> The bug seems to be wrongly marked as "RESOLVED FIX", the testcase still
> fails in Firefox 38.0.5 as well es in Nightly 41.0a1 (2015-06-08).
This was fixed by bug 1093611, which then got backed out because too much internet broke. I'm not sure it makes sense to keep both bugs open here.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 62•9 years ago
|
||
guys - could this please be fixed at some point soon? "because too much internet broke" as an explanation simply isn't good enough - Firefox is just one out of many browsers and by far not the one with the biggest market share. Some poorly written code that only works in Firefox should not stop progress.
Comment 63•9 years ago
|
||
Workaround (I think it is worth repeating idea from comment #57, which is commonly known and used, but wasn't clearly stated here):
For now consider `location.hash` write-only for cross-browser implementations. *Never ever READ its value*. So DO NOT USE constructs like `document.location.hash.slice(1)`.
Instead, for retrieval get it from `document.location.href` like with `document.location.href.split('#')[1]`.
Comment 64•9 years ago
|
||
a more fail-safe version would be:
var location_hash = location.href.split("#");
location_hash.shift();
location_hash = "#"+location_hash.join("#"); // equals location.hash as implemented in other browsers
the #-character could be in location.hash more than once. But yes, Michal's version should be sufficient for most cases.
Comment 65•9 years ago
|
||
Is this bug still happening? I can't seem to reproduce it in Fx 41.
Comment 66•9 years ago
|
||
current stable is 39 - can be reproduced in that version.
As :gijs wrote in #61 this was fixed by the resolution of bug 1093611 which was included in v41 - though as he states the fix was then removed again(?)
Comment 67•9 years ago
|
||
Ah yeah, thanks, I didn't read this comment properly.
Looks like Bug 1093611 was relanded! (Note it was originally backed out because it produced incorrect results -- not because "the internet broke" ;) ).
Status: REOPENED → RESOLVED
Closed: 10 years ago → 9 years ago
Resolution: --- → DUPLICATE
Updated•9 years ago
|
Keywords: dev-doc-needed
Assignee | ||
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•