Closed
Bug 133016
Opened 23 years ago
Closed 17 years ago
[mozTXTToHTMLConv] Freetext url recognition stops at ")" and "'" (apostrophe)
Categories
(MailNews Core :: Backend, defect, P2)
MailNews Core
Backend
Tracking
(Not tracked)
VERIFIED
FIXED
mozilla1.9alpha6
People
(Reporter: jthg, Assigned: BenB)
References
()
Details
(Keywords: verified1.8.1.8)
Attachments
(8 files, 9 obsolete files)
(deleted),
message/rfc822
|
Details | |
(deleted),
application/octet-stream
|
Details | |
(deleted),
image/png
|
Details | |
(deleted),
patch
|
mscott
:
review+
Biesinger
:
superreview+
mscott
:
approval-thunderbird2+
|
Details | Diff | Splinter Review |
(deleted),
text/plain
|
Details | |
(deleted),
image/jpeg
|
Details | |
(deleted),
image/png
|
Details | |
(deleted),
patch
|
dveditz
:
approval1.8.1.8+
|
Details | Diff | Splinter Review |
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.9+)
Gecko/20020318
BuildID: 2002031803
When reading a e-mail message, Mozilla turns all web addresses to clickable
links. However, some links are not correctly highlighted; they are either
segmented leaving the hyperlinks useless or the hyper link is not extended enough.
Two examples:
http://www.finaid.vt.edu/flash/Mar02flash.pdf
http://filebox.vt.edu/users/jeisinge/frisbee/09%20(at%2013h07m02s).JPG
Reproducible: Always
Steps to Reproduce:
1. recieve an e-mail with an above mentioned link in the body
Actual Results: With http://www.finaid.vt.edu/flash/Mar02flash.pdf, the link is
segmented so that http://www.finaid becomes a hyper link, .vt. just becomes
text, and edu/flash/Mar02flash.pdf becomes a hyper link.
With http://filebox.vt.edu/users/jeisinge/frisbee/09%20(at%2013h07m02s).JPG,
http://filebox.vt.edu/users/jeisinge/frisbee/09%20(at%2013h07m02s becomes a
hyper link and ).JPG becomes just text.
Expected Results: the whole address should become a hyperlink
Comment 1•22 years ago
|
||
The following information may or may not be related. It may also be a problem at
the senders end. I received the following mail message from the bugzilla system
with a URL that was not fully recognised as a URL. Only the first line of the
field "URL:" was recognised as a clickable URL - the rest was just text.
http://bugzilla.mozilla.org/show_bug.cgi?id=140851
Summary: LDAP query in browser loads entries into address book
Product: Browser
Version: other
Platform: PC
URL: http://bugzilla.mozilla.org/buglist.cgi?bug_status=NEW&b
ug_status=ASSIGNED&bug_status=REOPENED&email1=&emailtype
1=substring&emailassigned_to1=1&email2=&emailtype2=subst
ring&emailreporter2=1&bugidtype=include&bug_id=&changedi
n=&votes=&chfieldfrom=&chfieldto=Now&chfieldvalue=&short
_desc=ldap&short_desc_type=allwordssubstr&long_desc=&lon
g_desc_type=allwordssubstr&bug_file_loc=&bug_file_loc_ty
pe=allwordssubstr&status_whiteboard=&status_whiteboard_t
ype=allwordssubstr&keywords=&keywords_type=anywords&fiel
d0-0-0=noop&type0-0-0=noop&value0-0-
0=&cmdtype=doit&order=Reuse+same+sort+as+last+time
OS/Version: Windows 2000
Status: UNCONFIRMED
Severity: enhancement
Priority: --
Component: URL Bar
AssignedTo: hewitt@netscape.com
ReportedBy: matthew.jurgens@solnet.com.au
QAContact: claudius@netscape.com
>From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.0.0+)
Gecko/20020428
BuildID: 2002042808
Comment 2•22 years ago
|
||
The following is truncated at ! also (Version 1.1 release)
http://linux.bkbits.net:8080/linux-2.5/user=akpm/ChangeSet@-4w?nav=!-|index.html|stats|!+|index.html
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 3•22 years ago
|
||
Please attach the complete message in question to this bug for further inspection.
Possibly a dupe of bug 90161.
pi
Reporter | ||
Comment 4•22 years ago
|
||
The first link seems to work now! However, when parenthesis are used, the link
terminates before the ending parenthesis ( ')' ).
Comment 5•22 years ago
|
||
*** Bug 145703 has been marked as a duplicate of this bug. ***
*** Bug 145473 has been marked as a duplicate of this bug. ***
Comment 7•22 years ago
|
||
Besides that, arbitrary text can be hilited when it should not be =( E.g. here
PDF::parse: cannot find pdf parser /usr/local/bin/acroread
PDF::parse: is shown as a link (since Mozilla 1.3a, 1.2 didn't have this
effect). I think protocol should be verified with getservbyname() or similiar call.
*** Bug 200216 has been marked as a duplicate of this bug. ***
To stop this bug from morphing I am changing the summary to reflect URL 2 in
comment O, which is what the dupes are about too. This bug affects normal ")",
square "]", and sqiugly "}" brackets.
Comment 7 is an other bug, which I belive is fixed.
Summary: mail message does not correctly identify URI / URL / web address / hyper link → mail message does not correctly identify URI / URL / web address / hyper link containing right bracket ")", "]", or "}"
Comment 10•22 years ago
|
||
Adding URL from dupped bug that demonstrates problem.
Also, placing a "<" and a ">" around the URL fixes the problem, as long as the
URL is not wrapped to multiple lines.
Comment 11•22 years ago
|
||
There was some work done with bug 183111, which may affect this bug as Torben
said. RFC 2396 at one point has this to say about delimiters:
2.4.3. Excluded US-ASCII Characters
Although they are disallowed within the URI syntax, we include here a
description of those US-ASCII characters that have been excluded and
the reasons for their exclusion.
The control characters in the US-ASCII coded character set are not
used within a URI, both because they are non-printable and because
they are likely to be misinterpreted by some control mechanisms.
control = <US-ASCII coded characters 00-1F and 7F hexadecimal>
The space character is excluded because significant spaces may
disappear and insignificant spaces may be introduced when URI are
transcribed or typeset or subjected to the treatment of word-
processing programs. Whitespace is also used to delimit URI in many
contexts.
space = <US-ASCII coded character 20 hexadecimal>
The angle-bracket "<" and ">" and double-quote (") characters are
excluded because they are often used as the delimiters around URI in
text documents and protocol fields. The character "#" is excluded
because it is used to delimit a URI from a fragment identifier in URI
references (Section 4). The percent character "%" is excluded because
it is used for the encoding of escaped characters.
delims = "<" | ">" | "#" | "%" | <">
Other characters are excluded because gateways and other transport
agents are known to sometimes modify such characters, or they are
used as delimiters.
unwise = "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"
Data corresponding to excluded characters must be escaped in order to
be properly represented within a URI.
The code that marks a url in a text does also recognize () and {} as delimiters,
maybe it can be made a little bit smarter.
Also there is the general problem with linebreaks, RFC 2396 is very clear about
that:
E. Recommendations for Delimiting URI in Context
URI are often transmitted through formats that do not provide a clear
context for their interpretation. For example, there are many
occasions when URI are included in plain text; examples include text
sent in electronic mail, USENET news messages, and, most importantly,
printed on paper. In such cases, it is important to be able to
delimit the URI from the rest of the text, and in particular from
punctuation marks that might be mistaken for part of the URI.
In practice, URI are delimited in a variety of ways, but usually
within double-quotes "http://test.com/", angle brackets
<http://test.com/>, or just using whitespace
http://test.com/
These wrappers do not form part of the URI.
In the case where a fragment identifier is associated with a URI
reference, the fragment would be placed within the brackets as well
(separated from the URI with a "#" character).
In some cases, extra whitespace (spaces, linebreaks, tabs, etc.) may
need to be added to break long URI across lines. The whitespace
should be ignored when extracting the URI.
No whitespace should be introduced after a hyphen ("-") character.
Because some typesetters and printers may (erroneously) introduce a
hyphen at the end of line when breaking a line, the interpreter of a
URI containing a line break immediately after a hyphen should ignore
all unescaped whitespace around the line break, and should be aware
that the hyphen may or may not actually be part of the URI.
Using <> angle brackets around each URI is especially recommended as
a delimiting style for URI that contain whitespace.
The prefix "URL:" (with or without a trailing space) was recommended
as a way to used to help distinguish a URL from other bracketed
designators, although this is not common in practice.
For robustness, software that accepts user-typed URI should attempt
to recognize and strip both delimiters and embedded whitespace.
For example, the text:
Yes, Jim, I found it under "http://www.w3.org/Addressing/",
but you can probably pick it up from <ftp://ds.internic.
net/rfc/>. Note the warning in <http://www.ics.uci.edu/pub/
ietf/uri/historical.html#WARNING>.
contains the URI references
http://www.w3.org/Addressing/
ftp://ds.internic.net/rfc/
http://www.ics.uci.edu/pub/ietf/uri/historical.html#WARNING
Mozilla does not conform to the linebreak handling described here.
Comment 12•22 years ago
|
||
> Mozilla does not conform to the linebreak handling described here (comment 11)
For the record, that is bug 5351.
Comment 13•22 years ago
|
||
OK, so RFC3296 has certain rules governing what can be in a URI. However, may I
suggest that if a user receives an eMail that contains some "unwise" characters
in the URI, that Mozilla should be able to link to the complete URI.
Of course, sending an eMail with these "unwise" characters is a totally
different matter.
Comment 14•22 years ago
|
||
As I said, our detection code probably needs to be a little bit smarter, for
example: If the url does not start with a bracket then counting brackets and
stopping when the count reaches zero makes no sense as it seems to happen with
the second example, but that's just a guess.
Assignee | ||
Comment 15•22 years ago
|
||
Thanks for ccing me Andreas. Most of the comments on this bug are actually wrong
or offtopic, so I am focussing on the problem mentioned in comment 9. Shortening
summary.
For the spec, see <http://www.bucksch.com/1/projects/mozilla/16507/>.
Andreas, what are you suggesting? My opinion is basically: if you're sending
freaky URLs like those with (), then wrap them in <>. If the sender didn't, fix
the sender (if possible) or copy the URL manually (happens rarely enough IMHO).
But, what would you do? IIRC, I didn't count spaces, I just stop at any bracket
when in freetext mode.
Summary: mail message does not correctly identify URI / URL / web address / hyper link containing right bracket ")", "]", or "}" → Freetext url recognition stops at right bracket ")", "]", or "}"
Assignee | ||
Comment 16•22 years ago
|
||
narrowing summary even further, because } and ] are illegal in URLs and should
IMHO not be linked. It's a recognizer, after all, it's not expected to get it
right in all cases, esp. not the abuse ones.
Summary: Freetext url recognition stops at right bracket ")", "]", or "}" → Freetext url recognition stops at ")"
Assignee | ||
Comment 17•22 years ago
|
||
Code is in mozTXTToHTMLConv::FindURLEnd
<http://lxr.mozilla.org/seamonkey/source/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp#299>
(sorry for the spam).
Comment 18•22 years ago
|
||
Interesting... I have a couple of observations/info.
I am the one who posted the newsgroup message currently listed in the URL of
this bug. I obtained this URL as follows. Using Mozilla, goto microsoft.com
navigate to the search and select search knowledgebase by id. I typed in the id#
815411, the corresponding page is loaded and the URL in Mozilla location bar is
the one in the message. I then simply selected and copied it from my location
bar, and pasted it to my news post, to get:
http://support.microsoft.com/default.aspx?scid=kb;[LN];815411
- I find it interesting that when posting this to bugzilla it handles it
"correctly" (i.e. it highlights the entire url). I pasted the above URI string
the same way.
- As one would expect, since Microsoft is the originator of above URL Outlook
also highlights the whole URL in the posting.
Comment 19•22 years ago
|
||
Perhapse there needs to be a "900lbs-gorilla" tag for bugs which highlight bugs
which are technically compliant with public specs but M$ ignores thus
promulgating bad behavior.
I leave it up to others to make the political/pragmatic decisions if the
prevalence needs to be accommodated or not.
Comment 20•21 years ago
|
||
Take a look at RFC 2396 (2.3):
mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
There is described that ´)' and ´'' are also allowed within an URI. But both
aren't working within MailNews.
Like the comment [1] tells me, it was forgotten to update and RFC2396E sounds
like a draft for me.
Within the Browser you see this url [2] completly because Bugzilla creates a
hyperlink and so its shown correctly. Instead MailNews renders the URI for itself.
[1]
http://lxr.mozilla.org/seamonkey/source/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp#308
[2] http://support.microsoft.com/default.aspx?scid=kb;[LN];815411
Comment 21•20 years ago
|
||
Bug 119963 is about the apostrophe case.
See also bug 218287.
Summary: Freetext url recognition stops at ")" → [mozTXTToHTMLConv] Freetext url recognition stops at ")"
Assignee | ||
Comment 22•20 years ago
|
||
*** Bug 119963 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 23•20 years ago
|
||
I marked bug 119963, which is about apostrophes, a dup of this, because they are
so similar.
I am tending towards WONTFIX. This is a recognizer, designed to get most valid
cases right.
Cases outside the spec (e.g. "[" as in comment 18) generally are not supported.
Comment 18 asks to allow "[" in URLs, while bug 218287 asks *not* to support it
and gives a good example. This shows that the recognizer is just guesswork and a
judgement call.
"(" and apostrophe are allowed in URL per spec. But we're not in a situation
where we know for sure that it's supposed a URL, we need to *find* it in a bunch
of human language text (like http://example.com). This is why I excluded
brackets from the URL and probably apostrophe as well. For consistency and for
bug 218287, I should probably actually disallow both opening and closing
brackets everywhere in the URL.
Summary: [mozTXTToHTMLConv] Freetext url recognition stops at ")" → [mozTXTToHTMLConv] Freetext url recognition stops at ")" and "'" (apostrophe)
Comment 24•20 years ago
|
||
A quick point of clarification (since I added comment #18):
I am not advocating support for parsing characters in URI's such as the "[" if
it is outside the spec.
I was simply adding contextual information on where I came across it in the
field and how from an end user perspective there "seemed to be" differing
behaviours between Mozilla, IE, an Bugzilla.
If end users dont come across these non-standard cases very often in practice,
its probably not a big deal, and Mozilla should do the right thing.
Comment 25•20 years ago
|
||
I know this is polluting the menus, but perhaps on selecting the full URL by
hand you could add 'open selection as link' on the context menu? Would be
less painful than the 'cut and paste and open a browser window and and
and ...' solution that's required otherwise.
Comment 26•20 years ago
|
||
we should not do that. but i can today select text and drag it to a tab or
window if i want it treated as a url. if i'm using x11 i can probably middle
click on my selection to open it somewhere.
Comment 27•20 years ago
|
||
*** Bug 230972 has been marked as a duplicate of this bug. ***
Comment 28•20 years ago
|
||
*** Bug 248319 has been marked as a duplicate of this bug. ***
Comment 29•20 years ago
|
||
*** Bug 260714 has been marked as a duplicate of this bug. ***
Comment 30•20 years ago
|
||
I think that & (ampersant) and | (pipe) are included in standard but it still
doesn't work with them.
Updated•20 years ago
|
Product: Browser → Seamonkey
Comment 31•20 years ago
|
||
*** Bug 271374 has been marked as a duplicate of this bug. ***
Comment 32•20 years ago
|
||
Moving a bunch of mozTXTToHTMLConv bugs into Core / MailNews Backend to live
with their bretheren.
Updated•20 years ago
|
Component: MailNews: Main Mail Window → MailNews: Backend
Product: Mozilla Application Suite → Core
Comment 33•20 years ago
|
||
*** Bug 275183 has been marked as a duplicate of this bug. ***
Comment 34•19 years ago
|
||
*** Bug 236915 has been marked as a duplicate of this bug. ***
Comment 35•19 years ago
|
||
*** Bug 303590 has been marked as a duplicate of this bug. ***
Comment 36•19 years ago
|
||
Domino produces links like:
https://lsd.memach.com/ConditionReports.nsf/(lookup)/CC8CB6274C8EEBE385257062005E54A6?OpenDocument
and the user has to copy the link and paste it into the browser in order to view
the link. It would really be nice if the behavior of thunderbird would use the
entire link vice stopping at the letter before the ")". I cant change the way
domino produces the links. Thanks for your efforts
Comment 37•19 years ago
|
||
(In reply to comment #32)
> Moving a bunch of mozTXTToHTMLConv bugs into Core / MailNews Backend to live
> with their bretheren.
What does this mean? That the bug is going to die for lack of attention or it is
going to be looked at and action taken. I was planning on deploying a bunch of
thunderbird apps but since we use domino for one of our main application and
domino insists on using ( and ) in the links I am stuck unless this is resolved.
I will have to find an alternative email program... How can I find out what is
being done. Do I need to look for an alternative? Thanks for all your efforts...
Lyman
Comment 38•19 years ago
|
||
(In reply to comment #37)
> (In reply to comment #32)
> > Moving a bunch of mozTXTToHTMLConv bugs into Core / MailNews Backend to live
> > with their bretheren.
>
> What does this mean? That the bug is going to die for lack of attention or it
> is going to be looked at and action taken. I was planning on deploying a bunch
> of thunderbird apps but since we use domino for one of our main application
> and domino insists on using ( and ) in the links I am stuck unless this is
> resolved.
Comment 23, written by the person who implemented URL recognition in the first
place, gives the arguments *against* allowing parentheses in a recognized URL.
The main issue here is that someone might write a message such as
---
... and you can visit my homepage (http://homepage.somewhere.tld) to see.
---
In this case, the closing parenthesis should *not* be part of the URL. Making
the code smart enough to recognize parentheses in context is harder than it
sounds -- particularly since there could be a legal URL which contains *only* a
closing or opening parenthesis, e.g.
http://paranoia.tld/kso03)cldke.html
You say you can't change how Domino works, but unless you can write some code
for Mozilla (or convince/pay someone to do so for you), you can't change how TB
works either. Why not file a bug with them and see if they can change the URL
generation to either enclose the URL within angle brackets, or escape the
parentheses? That change on their part would be much easier to implement than
the change required on Mozilla's part.
Comment 39•19 years ago
|
||
Why isn't it simple enough to include a paren if there's no whitespace after it?
Comment 40•19 years ago
|
||
Both (http:example.com) and http://paranoia.tld/kso03)cldke.html are correctly
recognized here. I guess that Bugzilla's recognition is better than Thunderbird's :p
Comment 41•19 years ago
|
||
*** Bug 301363 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 42•19 years ago
|
||
In response to Lyman Byrd, I created a simple patch which does what I think is
the only thing I can do here: URL recognition won't stop at "(", ")" or "'" in
the middle of the URL, but will exclude ")" and "'", if they appear at the end.
The patch is completely untested (it does compile, though).
I am still not sure, if this bug is a good idea in the first place or should be
WONTFIXed. Please read my comments above before applying the patch.
And please do contact IBM/Lotus to fix their broken software - it's highly
unwise to use things like brackets and apostrophs in URLs, exactly because of
cases like these.
Assignee: sspitzer → ben.bucksch
Status: NEW → ASSIGNED
Assignee | ||
Comment 43•19 years ago
|
||
Ops, last patch had an obvious error.
(Also: Remove "broken" from last paragraph of last comment. Rest stays.)
Attachment #194943 -
Attachment is obsolete: true
Comment 44•19 years ago
|
||
*** Bug 313252 has been marked as a duplicate of this bug. ***
Comment 45•19 years ago
|
||
I assume this is the catch-all bug for automatic URL detection problems. This URL is also segmented at a strange place:
http://www.modpython.org/FAQ/faqw.py?query=expat&querytype=simple&casefold=yes&req=search
Specifically, right before the "?". Now, obviously "?" is a pretty common character to be in a URL, so this really shouldn't happen. It is hard to imagine a regular sentence which is a question which ends with a URL... e.g.:
Have you looked at: http://www.example.com/?
In fact, if you wanted to catch that sort of thing, the logic should be fairly straight forward -- just see if there is any whitespace after the ?. I think the same goes for the parenthesis examples.
I'm using Thunderbird 1.5beta1.
Assignee | ||
Comment 46•19 years ago
|
||
> I assume this is the catch-all bug for automatic URL detection problems
It is most definitely not and has a specific summary.
> Specifically, right before the "?"
WFM, verbatim in the bugmail from your comment.
Comment 47•19 years ago
|
||
*** Bug 316428 has been marked as a duplicate of this bug. ***
Comment 48•19 years ago
|
||
*** Bug 326189 has been marked as a duplicate of this bug. ***
Comment 49•19 years ago
|
||
If ) brackets are not to be recognized as part of the URL, then why does this link exist?
http://kb.mozillazine.org/Reducing_memory_usage_(Firefox)
It cannot be copy/pasted or typed reliably with either Thunderbird, or reproduced with Firefox, because the final ) is rendered as text.
If ) closing parenthesis are not to be used in a link, then why has mozillazine.org used them?
The problem in Thunderbird 1.5 is even worse
none of the following show as links, all failing at the ) character in TB 1.5, yet they render in Firefox!
http://test(no.org/test(situation(
http://test(no.org/test(situation)
http://test(no.org/test)situation)
http://test)no.org/test(situation)
As read here in Firefox, items 2 thru 4 render incompletely on the LAST )
but in Thunderbird 1.5, the link breaks at the FIRST ) in each url.
Thunderbird 1.5 and Firefox render the links differently
Comment 50•19 years ago
|
||
It is clear that there are cases when a round bracket might be a legitimate part of a URL. The fact of the matter is that USUALLY it won't be, so the auto-sensing URL checker should go for the most regular use case. Basically, it should get it right *most* of the time. This, in my experience, will mean ignoring closing brackets (of any sort), or punctuation *other than* a slash as the last character in the string.
Comment 51•19 years ago
|
||
(In reply to comment #50)
That is understood as the rational behind not rendering the LAST ) as part of a link. But even mozillazine uses such as links.
And, in Thunderbird 1.5 even a ) within the body of a link breaks it.
while in Firefox ONLY a trailing ) breaks it.
According to RFC http://www.faqs.org/rfcs/rfc1738.html
From section 2.2
Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
reserved characters used for their reserved purposes may be used
unencoded within a URL.
typing http://test)this.is.a.test
here in Firefox, and the link is NOT broken
try sending that in Thunderbird 1.5 and it comes out as
http://test )this.is.a.test (space added after test for display only)
The program is ignoring a valid character in a URL, simply because some users MIGHT use the character as a delimiter. Would it not be better to include the ) in links, and correct the improper use of such as a delimiter when it arises?
And once again, the problem is more profound in Thunderbird 1.5 which renders ANY ) as an invalid character (in a URL)
Comment 52•18 years ago
|
||
Just a doubt: Am I wrong or with the rfc 3986 the chars ( ) and ' are considered as Reserved Characters (par. 2.2) and so must be escaped?
Assignee | ||
Comment 53•18 years ago
|
||
You are right that they are now "reserved" (they weren't in RFC2396). This means that they *must* be escaped when not used as delimiters, but *must not* be escaped when used in their origial purpose as delimiter. I'm not sure if the Lotus Notes use falls into the category of a proper use as delimiter.
Assignee | ||
Comment 54•18 years ago
|
||
Re last paragraph in comment 51: No, this is a *recognizer* mainly targetted at text that *humans* write. This is something totally different from an HTTP header parser. "Go to Google (http://www.google.com)" is far more often in emails than the rare URLs which make use of () inside the URL. See my other comments above.
Comment 55•18 years ago
|
||
*** Bug 348012 has been marked as a duplicate of this bug. ***
Updated•18 years ago
|
OS: Windows NT → All
Hardware: PC → All
Comment 56•18 years ago
|
||
*** Bug 349531 has been marked as a duplicate of this bug. ***
Comment 57•18 years ago
|
||
The workaround to send a good clickable link is to replace ) with %29
Compare:
http://en.wikipedia.org/wiki/O_(Cyrillic)
http://en.wikipedia.org/wiki/O_(Cyrillic%29
http://en.wikipedia.org/wiki/O_%28Cyrillic%29
Does anyone know the % codes for ' and " ?
Comment 58•18 years ago
|
||
Another workaround is to use <> brackets
observe
http://en.wikipedia.org/wiki/O_(Cyrillic)
<http://en.wikipedia.org/wiki/O_(Cyrillic)>
Firefox doesnt have this problem, what code do they use? Workarounds are good but they detract from the fact that in this case Thunderbirds actions are non-standard. a ) is a legitimate character in urls, and one shouldnt have to use a 'workaround' to get it to work in Thunderbird
Comment 59•18 years ago
|
||
*** Bug 353013 has been marked as a duplicate of this bug. ***
Comment 60•18 years ago
|
||
xref bug 281321
Comment 61•18 years ago
|
||
*** Bug 356291 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 62•18 years ago
|
||
It's not clear what to do there. traditionally (even netscape 4), we assumed
that if you write "my company (http://www.foobar.com) does ...", you would
not want the () included in the URL. wikipedia unfortunately makes very
liberal use of that, which I think is very unwise. But it poses the question:
which case do we favor?
The only compromise I see is to look whether there's a ( after the URL start,
and if so, include the ) as well. I like that.
That doesn't solve the apostrophe proble, though, but I guess that's less
severe, we could just include it? It (only) breaks people using ' as quote.
Assignee | ||
Comment 63•18 years ago
|
||
Comment on attachment 194945 [details] [diff] [review]
Possible Patch, v2 (not sure, if good idea or WONTFIX)
Example:
http://en.wikipedia.org/wiki/Bulb_(disambiguation) (very common in wikipedia)
http://en.wikipedia.org/wiki/Lemarchand's_box
Attachment #194945 -
Attachment is obsolete: true
Assignee | ||
Comment 64•18 years ago
|
||
OK, I want to get this fixed.
Severity: minor → normal
Priority: -- → P2
Comment 65•18 years ago
|
||
Its not only common in Wikipedia, Mozillazine uses it quite often as well i.e.
http://kb.mozillazine.org/Reducing_memory_usage_(Firefox)
The essence imho is; Thunderbird currently DOESNT follow web standards in that () are approved characters in urls.
Instead, it is 'fudging' the standards to meet developers expectations. Funny, when Microsoft does this, its propeity code and wrong, but here is Thunderbird doing the same exact thing.
Currently you have to tell people they CANNOT quote a legimate website
http://kb.mozillazine.org/Reducing_memory_usage_(Firefox)
because Thunderbird wont render it properly - even tho there is NOTHING wrong with the web site at all!
I would MUCH rather have Thunderbird properly render the () as authorized url characters which they in fact are, and then tell someone why HIS rendition of a url (http://mozilla.com/html) doesnt work (the extra )). Simply tell the person NOT to use () to quote.
Comment 66•18 years ago
|
||
(In reply to comment #65)
When this bug (and so many related to it) was logged I would have agreed with you wholeheartedly, since RFC 2396 [Aug 1998] was in effect and "(" and ")" characters were permissible as data characters in URI's.
Four years later, though, Thunderbird's behaviour is now "accidentally compliant" with 2396's successor, RFC 3986 [Jan 2005], which moves "(" and ")" into sub-delims, meaning that they should be pct-encoded as %28 and %29 when appearing as data characters in URI's. Ref: http://www.ietf.org/rfc/rfc3986.txt
It seems that we should now be chasing the web masters of popular sites like Google Maps, Wikipedia and Mozillazine, to get them to fix their URI encoding. Or we could wait another four years and see what standards are in effect then. :)
Assignee | ||
Comment 67•18 years ago
|
||
Guys, this is not about standards. They are the ground rule, but this is a fuzzy recognizer, see my comments above. Unless somebody finds a problem with my idea in comment 62 or has better ideas:
Please stop from making comments.
Assignee | ||
Comment 68•18 years ago
|
||
Here's another idea (in addition to comment 62): bug 356657. Only helps, if sender used Thunderbird/Seamonkey.
Comment 69•18 years ago
|
||
I frequently receive correspondance from companies which include ( and ) as part of their URL. I had a survey which I filled out this past week for a company wanting to know how I felt about their customer service, and I was directed to the below URL, and whether this is wise from a standards perspective seems to be moot when compared to the fact that the links do not get properly interpreted as would be reasonably expected by a human. I should add, though, that there was a bunch of other junk added to allow the browser to interpret it as a possible HTML link but since I display in plain text, I was unable to click without a bunch of cut and paste. (I just tested this with the message in question, and it was never intended to be an HTML e-mail, so perhaps the sender just tried to get fancy or something and Thunderbird didn't know what to properly do with it? Should Thunderbird even try to interpret such thing?) The link I received was as follows:
http://surveys.benchmarkportal.com/ebay/survey.taf?survey_id=1283&use
r_id=19B0E8D6-9BFE-4FB7-9996-90BF83F5AD10
Within its context, though, it actually looked like this mess:
<br><br><a
href="http://surveys.benchmarkportal.com/ebay/survey.taf?survey_id=1283&use
r_id=19B0E8D6-9BFE-4FB7-9996-90BF83F5AD10">http://surveys.benchmarkportal.c
om/ebay/survey.taf?survey_id=1283&user_id=19B0E8D6-9BFE-4FB7-9996-90BF83F5A
D10</a><br><br>
The URL was spread across multiple lines and even with the quotes around the URL, it was misinterpreted by thunderbird. I think it is PROBABLY safe to suggest that anything beginning with a quotation mark followed by the string http:// is a URL which should be interpreted as a URL until the next occurrence of a quotation mark, and one can probably make the same assumption with other network protocols like ftp:// or https:// for example. (Whether this is easy to code for or not, I don't know.) What I was able to click was only everything up until a line break occurred in Thunderbird.
Comment 71•18 years ago
|
||
This fixes it for all cases I have found.
Attachment #251951 -
Flags: review?(ben.bucksch)
Comment 72•18 years ago
|
||
Nit: separate keywords from parentheses with a space:
while (...) ...; if (....) ...;
^ ^
You could perform the same matching-parenthesis test at the end for matching apostrophe. I don't think this is common, so maybe it's not worth testing for, but it's possible someone would put in the URL as:
'http://mozilla.org'
Assignee | ||
Comment 73•18 years ago
|
||
Comment on attachment 251951 [details] [diff] [review]
proposed fix
Thanks for the patch, but the code in the |if (state[check] == endok)| doesn't fit the design of the code. Any reason not to put it in the code you commented as "back up a bit"?
No reason to change the loop from for to while. And it *is* iterating.
Comments are not entirely correct, but comments would make sense, yes.
Unless you have a reason why the above proposal won't work, I'll attach a patch for that.
Attachment #251951 -
Flags: review?(ben.bucksch) → review-
Assignee | ||
Comment 74•18 years ago
|
||
> Any reason not to put it in the code you commented as "back up a bit"?
Nevermind, I realized it. However, your code - apart from not being in FindURL where is belongs - would not recognize (for example http://www.yahoo.com) correctly, it only looks at the char before the URL start. See comment 62.
Assignee | ||
Comment 75•18 years ago
|
||
Here's what I mean:
- Changes for loop to be more understandable
- Allows ' in middle of URL, but not at end
- Allowes ) in middle of URL, but not at end, unless there was a ( earlier
However, it doesn't work as intended. I don't see why. Magnus, when looking at the code, can you see what's wrong?
Attachment #251951 -
Attachment is obsolete: true
Attachment #251988 -
Flags: review?(mkmelin+mozilla)
Assignee | ||
Comment 76•18 years ago
|
||
> However, it doesn't work as intended. I don't see why.
2 reasons:
- should be "&& !haveOpeningBracket" (forgot the ! )
- I stupidly built static, so my changes were not picked up. rebuilding.
Assignee | ||
Updated•18 years ago
|
Attachment #251988 -
Flags: review?(mkmelin+mozilla)
Assignee | ||
Comment 77•18 years ago
|
||
Yup, that was it. Fix works as intended. I couldn't find anything it breaks either.
Magnus, please review.
Attachment #251988 -
Attachment is obsolete: true
Attachment #251997 -
Flags: review?(mkmelin+mozilla)
Assignee | ||
Updated•18 years ago
|
Attachment #251997 -
Attachment is patch: true
Attachment #251997 -
Attachment mime type: text/x-patch → text/plain
Comment 78•18 years ago
|
||
I'll be away for a couple of days, so I wont be able to try it out until monday. Looks like it could work though. Assuming it works, it sure would be nice to get this in for tb2.0...
Do we want to fix the 'http://mozilla.org' case too? Don't know how usual that is, but it would be pretty straight forward fix.
Assignee | ||
Comment 79•18 years ago
|
||
With the patch v5, http://test)foo is recognized. This patch is an alternative, allowing ) only after a (. I.e. http://test_(Comp) and http://test(lookup)bla are recognized, but http://text)foo is not.
> Do we want to fix the 'http://mozilla.org'
Works before and after the patch (both versions).
Attachment #252040 -
Flags: review?(mkmelin+mozilla)
Assignee | ||
Updated•18 years ago
|
Attachment #252040 -
Attachment is patch: true
Attachment #252040 -
Attachment mime type: text/x-patch → text/plain
Assignee | ||
Updated•18 years ago
|
Attachment #251997 -
Flags: review?(mkmelin+mozilla)
Assignee | ||
Comment 80•18 years ago
|
||
Here are my old test mails. Quite a lot. For thorough testing, they should be checked with all relevant pref combinations.
Assignee | ||
Comment 81•18 years ago
|
||
Ops, the last attachment was actually intended for another bug, but may be useful here as well.
Here's a mail with testcases specifically for the brackets and similar. Clicking on it should open it in the browser and render it with the browser's libmime, if available. You can also copy the file to LocalFolders and open it in Thunderbird.
Assignee | ||
Comment 82•18 years ago
|
||
Assignee | ||
Comment 83•18 years ago
|
||
Comment on attachment 251997 [details] [diff] [review]
Fix, v5
Yes, I like v6 better
Attachment #251997 -
Attachment is obsolete: true
Comment 84•18 years ago
|
||
Comment on attachment 252040 [details] [diff] [review]
Fix, v6. Alternative to v5: ) only after (
Seems good except for one case I happend to stumble upon. In a bug it said: per foo@bar.com's suggestion. For emails apostrophes should mark the end, no? And hypothetical cases like foo@(bar).com probably shouldn't be links either.
I'm not a reviewer, but with that fixed, seems ok to me.
Comment 85•18 years ago
|
||
This is fix v6 + don't allow ' and ( in email address recognition.
Attachment #252040 -
Attachment is obsolete: true
Attachment #253854 -
Flags: review?(ben.bucksch)
Attachment #252040 -
Flags: review?(mkmelin+mozilla)
Comment 86•18 years ago
|
||
Ben: have you had time to look at this?
Assignee | ||
Comment 87•18 years ago
|
||
Yes. The 2-line check you added was in the wrong place (there's a similar check later, should be added there). I just fixed that.
I'll also disallow [ and {. Currently, ] and } are disallowed (reasons see above), so including the opening ones doesn't make any sense whatsoever.
Attachment #253854 -
Attachment is obsolete: true
Attachment #254729 -
Flags: review?(mscott)
Attachment #253854 -
Flags: review?(ben.bucksch)
Assignee | ||
Comment 88•18 years ago
|
||
Attachment #252195 -
Attachment is obsolete: true
Updated•18 years ago
|
Attachment #254729 -
Flags: review?(mscott) → review+
Assignee | ||
Updated•18 years ago
|
Attachment #254729 -
Flags: superreview?(mscott)
Assignee | ||
Updated•18 years ago
|
Attachment #254729 -
Flags: superreview?(mscott) → superreview?(bienvenu)
Assignee | ||
Comment 89•18 years ago
|
||
Comment on attachment 254729 [details] [diff] [review]
Fix, v8
ops, this is netwerk, so I guess I need a netwerker sr. trying biesi.
Attachment #254729 -
Flags: superreview?(bienvenu) → superreview?(cbiesinger)
Comment 90•18 years ago
|
||
Comment on attachment 254729 [details] [diff] [review]
Fix, v8
sr=biesi
Attachment #254729 -
Flags: superreview?(cbiesinger) → superreview+
Assignee | ||
Comment 91•18 years ago
|
||
Fix v8 checked into trunk.
Will bake on trunk for a few days, then ask for permission for branch.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 92•18 years ago
|
||
Can somebody QA this on trunk, please, so that we have a chance to get it into TB2.0? For expected results, see screenshot in comment 82.
Updated•18 years ago
|
Attachment #254730 -
Attachment mime type: message/rfc822 → text/plain
Comment 93•18 years ago
|
||
Tested with Thunderbird Version 3 alpha 1 (20070226) on Windows. Ben, have a look at the screenshot if the output is that what we want to have.
Assignee | ||
Comment 94•18 years ago
|
||
Comment on attachment 254729 [details] [diff] [review]
Fix, v8
Thanks Henrik. Yes, that output is as expected.
Requesting TB2 branch approval.
Attachment #254729 -
Flags: approval-thunderbird2?
Updated•18 years ago
|
Attachment #256518 -
Attachment description: Screenshot with patch v6 on fixed trunk → Screenshot with patch v8 on trunk
Updated•18 years ago
|
Attachment #254729 -
Flags: approval-thunderbird2? → approval-thunderbird2+
Updated•18 years ago
|
Whiteboard: [checkin needed]
Updated•18 years ago
|
QA Contact: olgam → backend
Whiteboard: [checkin needed]
Assignee | ||
Comment 95•18 years ago
|
||
checked into branch.
Updated•18 years ago
|
Keywords: fixed1.8.1.3
Comment 96•18 years ago
|
||
WFM with Tb 2.0pre 2007030604. Good job guys! - especially as this bug was about to get Wontfix'ed 3 years ago :)
Comment 97•18 years ago
|
||
^ wrong clipboard content: the build id is 2007030703 (Lin)
Comment 98•17 years ago
|
||
the checkin to branch
http://bonsai.mozilla.org/cvsview2.cgi?file=mozTXTToHTMLConv.cpp&branch=1.77.18.2&subdir=/mozilla/netwerk/streamconv/converters&command=DIFF_FRAMESET&rev1=1.77.18.1&rev2=1.77.18.2
looks quite a bit different from the checkin to trunk
http://bonsai.mozilla.org/cvsview2.cgi?file=mozTXTToHTMLConv.cpp&branch=1.85&subdir=/mozilla/netwerk/streamconv/converters&command=DIFF_FRAMESET&rev1=1.83&rev2=1.84
is that intentional?
Depends on: 390779
Assignee | ||
Comment 99•17 years ago
|
||
No, IIRC that was not intentional.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 100•17 years ago
|
||
Branch apparently got an older version of the fix. Trunk seems OK.
I don't know what the branch rules are. Are correctness fixes allowed or only critical and security fixes?
If not allowed, I'll mark the bug fixed again.
Comment 101•17 years ago
|
||
Ben, if trunk is fine we could leave it as fixed. Due to the wrong patch checked into 1.8.1.3 we should ask for approval 1.8.1.7. Could you attach the patch which corrects the mistake?
Status: REOPENED → RESOLVED
Closed: 18 years ago → 17 years ago
Flags: blocking1.8.1.7?
Keywords: fixed1.8.1.3
Resolution: --- → FIXED
Assignee | ||
Comment 102•17 years ago
|
||
Bug 390779 comment 8 is the result of the bad branch check-in.
Assignee | ||
Comment 103•17 years ago
|
||
This is the diff between branch and trunk on my disk, only the relevant part for this bug.
I have *not tested* it yet, will do later, or somebody else can if you want to.
Assignee | ||
Comment 105•17 years ago
|
||
Assignee | ||
Updated•17 years ago
|
Attachment #275242 -
Flags: approval-thunderbird2?
Assignee | ||
Comment 106•17 years ago
|
||
Tested Branch fixup, v1 with current branch, it indeed brings it on par with trunk, see screenshot.
mscott: please approve. This is to fix bug 390779.
Comment 107•17 years ago
|
||
Comment on attachment 275242 [details] [diff] [review]
Branch fixup, v1
>--- 2.0-branch/mozilla/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp 2007-01-21 04:53:17.806002000 +0100
>+++ trunk/mozilla/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp 2007-06-17 13:27:56.999825000 +0200
Dunno if we need a real diff against CVS 1.8 branch here. You only supplied your local diff between trunk and 1.8 branch.
Adjusting approval flag to 1.8.1.7. No review needed?
Attachment #275242 -
Flags: approval-thunderbird2? → approval1.8.1.7?
Assignee | ||
Comment 108•17 years ago
|
||
Same thing, but if you insist on it.
The patch already had review and checkin-approval for branch. I simply checked in the wrong patch.
Attachment #275242 -
Attachment is obsolete: true
Attachment #275313 -
Flags: approval1.8.1.7?
Attachment #275242 -
Flags: approval1.8.1.7?
Updated•17 years ago
|
Flags: blocking1.8.1.7? → blocking1.8.1.7+
Comment 109•17 years ago
|
||
Comment on attachment 275313 [details] [diff] [review]
Branch fixup, v1
approved for 1.8.1.7, a=dveditz for release-drivers
Attachment #275313 -
Flags: approval1.8.1.7? → approval1.8.1.7+
Updated•17 years ago
|
Keywords: checkin-needed
Comment 110•17 years ago
|
||
Ben, can you check-in attachment 275313 [details] [diff] [review] on MOZILLA_1_8_BRANCH, please?
Assignee | ||
Comment 111•17 years ago
|
||
Checked in on branch.
Assignee | ||
Updated•17 years ago
|
Keywords: checkin-needed
Updated•17 years ago
|
Keywords: fixed1.8.0.7
Target Milestone: --- → mozilla1.9alpha6
Updated•17 years ago
|
Keywords: fixed1.8.1.8
Comment 112•17 years ago
|
||
hskupin: Ben checked this into the 1.8 (.1) branch, not the 1.8.0 branch. Also, please do not change the target milestone unless it's your bug, that's for developer's use. But I'm not sure what you were getting at anyway since this was apparently already fixed on the trunk before you set it.
Keywords: fixed1.8.0.7
Target Milestone: mozilla1.9alpha6 → ---
Comment 113•17 years ago
|
||
Daniel, sorry for the fixed1.8.0.7 keyword. Yes, I ment fixed1.8.1.7. Something got wrong. I set the target milestone after the bug was fixed and which should give a hint for which version this issue is resolved. I don't think that this is inconvenient because Ben forgot that, isn't it? At least it will give us better results when doing QA.
Comment 114•17 years ago
|
||
Is the target milestone really used that way? Ok, guess that's all right if the bug's actually fixed.
Target Milestone: --- → mozilla1.9alpha6
Comment 115•17 years ago
|
||
Verified in nightly version 2.0.0.7pre (20071010) on Mac OS 10.4.9. All these cases work great:
http://www.finaid.vt.edu/flash/Mar02flash.pdf
http://filebox.vt.edu/users/jeisinge/frisbee/09%20(at%2013h07m02s).JPG,
http://www.mozilla.org'u
http://www.mozilla.org)u
http://www.mozilla.org()
http://www.mozilla.org()u
http://www.mozilla.org(u)u
http://www.mozilla.org)u(
http://www.mozilla.org[]
http://www.mozilla.org{}
Status: RESOLVED → VERIFIED
Keywords: fixed1.8.1.8 → verified1.8.1.8
Updated•16 years ago
|
Product: Core → MailNews Core
You need to log in
before you can comment on or make changes to this bug.
Description
•