Closed
Bug 237818
Opened 21 years ago
Closed 21 years ago
nsStandardURL should normalize UTF-8 hostnames
Categories
(Core :: Networking, defect, P1)
Core
Networking
Tracking
()
VERIFIED
FIXED
mozilla1.8alpha1
People
(Reporter: darin.moz, Assigned: dbaron)
References
Details
(Keywords: intl, Whiteboard: [IDN][patch])
Attachments
(2 files, 1 obsolete file)
(deleted),
text/html; charset=UTF-8
|
Details | |
(deleted),
patch
|
jshin1987
:
review+
darin.moz
:
superreview+
chofmann
:
approval1.7+
|
Details | Diff | Splinter Review |
nsStandardURL should normalize UTF-8 hostnames.
BuildNormalizedSpec, SetHost, and any other methods that modify the host field
of nsStandardURL should normalize the given value if it is UTF-8.
We might want to add a NormalizeUTF8 method on nsIIDNService.
Reporter | ||
Updated•21 years ago
|
Status: NEW → ASSIGNED
Whiteboard: [IDN]
Target Milestone: --- → mozilla1.8alpha
Assignee | ||
Comment 1•21 years ago
|
||
Comment 2•21 years ago
|
||
> We might want to add a NormalizeUTF8 method on nsIIDNService
Can we just use nsIUnicodeNormalizer? There may be some issues to think about,
though.
Keywords: intl
Reporter | ||
Comment 3•21 years ago
|
||
> Can we just use nsIUnicodeNormalizer? There may be some issues to think about,
> though.
We could do that, but personally I think we should use the same normalization
algorithm used by nsIIDNService. (Maybe nsIUnicodeNormalizer is what
nsIIDNService already uses?? -- I thought IDN required special normalization
rules.) Whatever works! :-)
Comment 4•21 years ago
|
||
Yeah, I vaguely remembered that IDN has its own (slightly different?)
normalization rules so that I wrote an 'escape clause' ("there may be issues to
think about, though") :-) We have to read the IDN standard document :-)
Assignee | ||
Comment 5•21 years ago
|
||
We currently just use the normalizer, it seems:
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/netwerk/dns/src/nsIDNService.cpp&rev=1.17&mark=393#351
Since we probably don't want to rip huge pieces of a URL out if the user enters
it and it's invalid according to IDN (so they can fix their typo, say?), perhaps
just normalization is best at this stage -- since it seems to be the only thing
we do other than check for invalid things.
Assignee | ||
Comment 6•21 years ago
|
||
Er, except that the "Map" bit is probably doing something. :-)
Assignee | ||
Comment 7•21 years ago
|
||
Reporter | ||
Comment 8•21 years ago
|
||
Comment on attachment 144199 [details] [diff] [review]
possible patch
first glance looks right on to me... i'll take a closer look tomorrow. thx
dbaron!
Attachment #144199 -
Flags: review?(darin)
Reporter | ||
Comment 9•21 years ago
|
||
Comment on attachment 144199 [details] [diff] [review]
possible patch
>Index: base/src/nsStandardURL.cpp
>+ // However, perform Unicode normalization on it, as IDN does.
>+ mHostEncoding = eEncoding_ASCII;
>+ if (mHost.mLen > 0) {
>+ const nsACString& tempHost =
>+ Substring(spec + mHost.mPos, spec + mHost.mPos + mHost.mLen);
>+ if (IsASCII(tempHost)) {
>+ approxLen += mHost.mLen;
>+ } else {
>+ mHostEncoding = eEncoding_UTF8;
>+ if (gIDNService &&
>+ NS_SUCCEEDED(gIDNService->Normalize(tempHost, encHost))) {
>+ approxLen += encHost.Length();
>+ } else {
>+ encHost.Truncate();
>+ approxLen += mHost.mLen;
>+ }
>+ }
>+ }
only some nits here...
(1) this file avoids brackets when not needed
(2) should not be necessary to Truncate encHost in the failure case.
an XPCOM method is not supposed to set an out-param value when it
fails. plus, it looks like nsIDNService::Normalize isn't going to
cause any trouble in this regard.
> nsStandardURL::GetAsciiHost(nsACString &result)
> {
>- if (mHostEncoding == eEncoding_Unknown) {
>- if (IsASCII(Host()))
>- mHostEncoding = eEncoding_ASCII;
>- else
>- mHostEncoding = eEncoding_UTF8;
>- }
>+ NS_ASSERTION(mHostEncoding != eEncoding_Unknown, "encoding should be set");
my only concern here is that someone could construct a nsStandardURL via
CreateInstance. they could decide not to initialize the nsStandardURL.
should we really assert in that case? perhaps we should initialize
mHostEncoding to eEncoding_ASCII in the nsStandardURL ctor. it should
be okay to regard an empty hostname as ASCII. then again, i'm not sure
that the rest of nsStandardURL works that well when not initialized :-/
sr=darin
Attachment #144199 -
Flags: superreview+
Attachment #144199 -
Flags: review?(jshin)
Attachment #144199 -
Flags: review?(darin)
Comment 10•21 years ago
|
||
(In reply to comment #9)
> my only concern here is that someone could construct a nsStandardURL via
> CreateInstance. they could decide not to initialize the nsStandardURL.
personally, I think they deserve whatever they get :)
Assignee | ||
Comment 11•21 years ago
|
||
This fixes the two numbered comments and changes where mHostEncoding is set (so
it's never Unknown).
Attachment #144199 -
Attachment is obsolete: true
Assignee | ||
Updated•21 years ago
|
Attachment #144426 -
Flags: superreview?(darin)
Attachment #144426 -
Flags: review?(jshin)
Assignee | ||
Updated•21 years ago
|
Attachment #144199 -
Flags: review?(jshin)
Comment 12•21 years ago
|
||
Comment on attachment 144426 [details] [diff] [review]
patch
Here we assert.
>+ if (gIDNService &&
>+ NS_SUCCEEDED(gIDNService->Normalize(tempHost, encHost)))
>+ approxLen += encHost.Length();
>+ else {
>+ NS_ASSERTION(encHost.IsEmpty(),
>+ "Normalize failed by modified encHost");
BTW, it's not documented in the idl file that the out parameter is empty when
normalize fails (although the only implementation does that)...
>+ approxLen += mHost.mLen;
However, we don't here.
>+ if (!IsASCII(flat)) {
>+ mHostEncoding = eEncoding_UTF8;
>+ if (gIDNService &&
>+ NS_SUCCEEDED(gIDNService->Normalize(flat, escapedHost))) {
>+ host = escapedHost.get();
>+ len = escapedHost.Length();
>+NS_IMETHODIMP nsIDNService::Normalize(const nsACString & input, nsACString & output)
>+{
>+ nsAutoString outUTF16;
>+ nsresult rv = stringPrep(NS_ConvertUTF8toUTF16(input), outUTF16);
>+ if (NS_SUCCEEDED(rv))
>+ CopyUTF16toUTF8(outUTF16, output);
>+ return rv;
How about moving the assertion here? Isn't stringPrep always supposed to
succeeed? Alternatively, we can assert at both places.
r=jshin
Attachment #144426 -
Flags: review?(jshin) → review+
Reporter | ||
Comment 13•21 years ago
|
||
Comment on attachment 144426 [details] [diff] [review]
patch
>+ "Normalize failed by modified encHost");
s/by/but/
sr=darin
Attachment #144426 -
Flags: superreview?(darin) → superreview+
Assignee | ||
Updated•21 years ago
|
Assignee: darin → dbaron
Status: ASSIGNED → NEW
Priority: -- → P1
Whiteboard: [IDN] → [IDN][patch]
Assignee | ||
Comment 14•21 years ago
|
||
(In reply to comment #12)
> BTW, it's not documented in the idl file that the out parameter is empty when
> normalize fails (although the only implementation does that)...
darin said in comment 9 that that's implied, although I'd never heard that rule
before.
> >+ approxLen += mHost.mLen;
>
> However, we don't here.
Don't what?
> How about moving the assertion here? Isn't stringPrep always supposed to
> succeeed? Alternatively, we can assert at both places.
No, stringPrep can fail for some cases of input text (I think there are some
rules on bidi text and an "isprohibited" function) -- and I'd rather leave the
URL unmodified in such cases so the user can correct it. Presumably the failure
is handled by the DNS lookup code in a way that causes the hostname lookup to
fail. I should probably write a testcase for this, though...
Assignee | ||
Comment 15•21 years ago
|
||
Comment on attachment 144426 [details] [diff] [review]
patch
Requesting approval. This should be pretty safe, and it fixes what could be
perceived as a minor security problem (strange things could appear in the URL
bar for well-known sites).
Attachment #144426 -
Flags: approval1.7?
Comment 16•21 years ago
|
||
Comment on attachment 144426 [details] [diff] [review]
patch
a=chofmann for 1.7
Attachment #144426 -
Flags: approval1.7? → approval1.7+
Assignee | ||
Comment 17•21 years ago
|
||
Fix checked in to trunk, 2004-04-02 23:32 -0800.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 18•19 years ago
|
||
This caused bug 307259, fwiw.
Comment 19•19 years ago
|
||
Verified with windows 1.0.7, 1.7.12, Mac 1.0.7, 1.7.12
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•