Open Bug 694856 Opened 13 years ago Updated 2 years ago

"Open link in ..." context menus are missing for url with /

Categories

(Firefox :: Menus, defect, P3)

defect

Tracking

()

People

(Reporter: alice0775, Unassigned, Mentored)

References

Details

(Keywords: regression, Whiteboard: [lang=js])

Attachments

(4 files)

Build Identifier: http://hg.mozilla.org/mozilla-central/rev/3b58a9df4c8c Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0a1) Gecko/20111016 Firefox/10.0a1 ID:20111016031010 "Open link in ..." context menus are missing for urls with / Reproducible: Always Steps to Reproduce: 1. Start Firefox with clean profile 2. Select following text www.mozilla.org/products/firefox 3. Right click Actual Results: "Open link in ..." context menus are missing Expected Results: Open link in ..." context menus should pop up Regression Window(m-c) Works: http://hg.mozilla.org/mozilla-central/rev/bf714fffdfdf Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0a1) Gecko/20110621 Firefox/7.0a1 ID:20110621003810 Fails: http://hg.mozilla.org/mozilla-central/rev/854df0c7239e Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0a1) Gecko/20110621 Firefox/7.0a1 ID:20110621083640 Pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=bf714fffdfdf&tochange=854df0c7239e Triggered By: e92f3523a97c Dão Gottwald — Bug 591953 - Show the "Open Link" context menu item for URLs with hyphens. r=gavin
Confirmed on Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1 It happens also with urls ending with any file extension. Reproduce: select the plain link text and choose "open link" in the context menu. This one works (there is an option in the context menu): www.mozilla.com Those are not working (no option in the context menu): www.mozilla.com/ www.mozilla.com/page.html www.mozilla.com/page.php www.mozilla.com/page.htm www.mozilla.com/page.asp and so on.
Confirmed on Mozilla/5.0 (Windows NT 5.1; rv:9.0.1) Gecko/20100101 Firefox/9.0.1
Bug 720023 comment 1 had a patch [1] to fix the RegExp for this case (trailing slash), and that also fixed other cases (like IP urls). The only problems I could see with the new RegExp (already present in the old one) are: (In reply to Chad Freeman from Bug 720023 comment #1) > With this RegExp we: > 1) Ensure that the top-level domain is 2-4 characters in length, and > contains only alphabetic characters. This encompasses all of the well-known > TLDs that I could find or think of. The .museum tld [2] (6 chars) is uncommon, but does exist. :) And some TLDs that are becoming more common are the IDN TLDs (International Domain Names) that use the punycode encoding for domain names (in FQDNs as well as in TLDs). This can make a TLD longer. For example, .xn--p1ai (.рф [3]) is 8 characters long. There are also TLDs in arabic, greek, and maybe more. > 2) Ensure that the domain and subdomains do not begin or end with hyphens, > as this is not allowed by domain registrars. I originally disallowed the > use of consecutive hyphens as well, although it seems that some registrars > allow for consecutive hyphens in a domain name, provided they meet the > former rule as well. The consecutive hyphen should also be allowed in the TLD, as explained above. Punycode encoding always begins with 'xn--'. Anyway, even in it's current state, the patch from Bug 720023 comment #1 would be an improvement over the current situation. :) [1] https://bugzilla.mozilla.org/attachment.cgi?id=603082&action=diff#a/browser/base/content/nsContextMenu.js_sec2 [2] http://en.wikipedia.org/wiki/.museum [3] https://www.gandi.net/domaine/xn--p1ai/info http://en.wikipedia.org/wiki/.xn--p1ai http://кц.рф/en/
Gavin had pointed me to this bug when I was working on bug 720023. I've been meaning to submit a proper patch, and will be later tonight or tomorrow. (In reply to Yann Brelière from comment #6) > The .museum tld [2] (6 chars) is uncommon, but does exist. :) > And some TLDs that are becoming more common are the IDN TLDs (International > Domain Names) that use the punycode encoding for domain names (in FQDNs as > well as in TLDs). This can make a TLD longer. For example, .xn--p1ai (.рф > [3]) is 8 characters long. There are also TLDs in arabic, greek, and maybe > more. I realized that shortly after I had posted the original patch (.TRAVEL is another case). Perhaps the TLD length should just be >2, unless we can come up with some upper restriction on character length? > The consecutive hyphen should also be allowed in the TLD, as explained > above. Punycode encoding always begins with 'xn--'. Wasn't aware of this type of TLD encoding; I'll read more into it and adjust the RegExp accordingly.
(In reply to Chad Freeman from comment #7) > I realized that shortly after I had posted the original patch (.TRAVEL is > another case). Perhaps the TLD length should just be >2, unless we can come > up with some upper restriction on character length? Misspoke here: TLD length should be >=2. Currently, the longest TLD seems to be 22 characters in length [1], although, limiting the length to there wouldn't future-proof the solution. [1] http://data.iana.org/TLD/tlds-alpha-by-domain.txt
Attached patch Ping (deleted) — Splinter Review
This is Chad Freeman's regex patch, with the TLD check relaxed to >=2
Ping (2) - Chad? FWIW I've been using this locally for a few weeks and it seems to work fine.
Still happening, I can reproduce on Mozilla/5.0 (Windows NT 5.1; rv:17.0) Gecko/20100101 Firefox/17.0
I see it in latest 22.0a1 nightlies too. There's a restartless addon that fixes it: https://addons.mozilla.org/en-US/firefox/addon/bug694856open-selected-text
Mozilla/5.0 (X11; Linux x86_64; rv:57.0) Gecko/20100101 Firefox/57.0 Also tried it on and Nightly 59.0 Still happening, I can reproduce it. It is in browser/base/content/nsContextMenu.js file, that uses makeURI function in /services/common/utils.js

To fix this someone would want to update the regular expression at https://searchfox.org/mozilla-central/rev/840881e1232f664a58b39caaae6284c7bcf121df/toolkit/modules/SelectionUtils.jsm#94-99

The attached 10-year-old patch would be a place to start, but I'd really want to see the regular expression split up - we can check for the ip address cases separately, for instance - because as it is, the regex is not very readable.

Mentor: gijskruitbosch+bugs
Severity: normal → S3
OS: Windows 7 → All
Priority: -- → P3
Hardware: x86 → All
Whiteboard: [lang=js]

Hi there! I'd love to take this! I was just curious as to why we don't use the URL api, or am I being naive? (does it have to do with it being implemented in C++ and being a part of the dom?). Alternatively, could we use UrlbarTokenizer.jsm (https://searchfox.org/mozilla-central/source/browser/components/urlbar/UrlbarTokenizer.jsm) for SelectionUtils.jsm?

(In reply to gliu20 from comment #16)

Hi there! I'd love to take this! I was just curious as to why we don't use the URL api, or am I being naive?

I'm not sure I understand the suggestion. I think without a protocol, www.mozilla.org does not parse as a URL (ie if you do new URL("www.mozilla.org"), it throws an exception). If you force-add a protocol, practically anything parses as a URL (in particular, any single English word, when prefixed with http:// or https:// will parse as a URL, cf http://cheese is a perfectly valid URL). We don't want to show the link context menu items unless the selection really is plausibly a URL, so checking whether it is strictly parsable as a full URL would cause either false negatives or false positives, depending on which way we go.

(does it have to do with it being implemented in C++ and being a part of the dom?). Alternatively, could we use UrlbarTokenizer.jsm (https://searchfox.org/mozilla-central/source/browser/components/urlbar/UrlbarTokenizer.jsm) for SelectionUtils.jsm?

This is a good idea. I wasn't aware that module existed; it post-dates the filing of this bug and the regexes in the existing code.

We couldn't quite do that directly, because SelectionUtils.jsm is a toolkit module (so is also usable from Firefox for Android, Thunderbird, etc.) and UrlbarTokenizer is desktop-Firefox-only. We could talk to the address bar team if it's worth moving some of the functionality into a toolkit module, or conversely, we could potentially change the browser context menu code to just talk to UrlbarTokenizer directly, not using SelectionUtils for this usecase. Marco, WDYT?

Flags: needinfo?(mak)

The behavior of the UrlbarTokenizer is deeply customized to the urlbar needs.
If we want to extract some regex or code from it and put it into toolkit that's fine, then the tokenizer itself could use that code, but I'd be wary of reusing the module in a non-urlbar context.

Flags: needinfo?(mak)

The severity field for this bug is relatively low, S3. However, the bug has 4 duplicates.
:jaws, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jaws)
Flags: needinfo?(jaws)

(In reply to Marco Bonardo [:mak] from comment #18)

The behavior of the UrlbarTokenizer is deeply customized to the urlbar needs.
If we want to extract some regex or code from it and put it into toolkit that's fine, then the tokenizer itself could use that code, but I'd be wary of reusing the module in a non-urlbar context.

Can we extract the looksLikeUrl and/or looksLikeOrigin methods? Certainly the second one doesn't look particularly URL-bar-specific, but I'm not very familiar with the code - moving the first one would perhaps be cleaner than moving the second one because there's some overlap in the regexes they both rely on, but I guess if we move to e.g. BrowserUtils we could in principle make the UrlbarTokenizer just depend on BrowserUtils...

Flags: needinfo?(mak)

yes we could. What I meant mostly is that we made best guesses that usually work in the urlbar but may not be the best choice for everything. But probably the javadoc is already clear enough about that.

Flags: needinfo?(mak)

(In reply to Marco Bonardo [:mak] from comment #21)

yes we could. What I meant mostly is that we made best guesses that usually work in the urlbar but may not be the best choice for everything. But probably the javadoc is already clear enough about that.

OK, then this sounds workable - gliu, does that sound like something you'd be happy to work on?

Flags: needinfo?(gliu10000)

Yeah for sure! It might take me a while though--I'm also working on 1118553, which is a bit more complex than what I've previously worked on.

Flags: needinfo?(gliu10000)

Refactoring to a separate module to promote code reuse since
UrlbarTokenizer.jsm has useful utility functions and regexps
for determining whether a string looks like a url.

Assignee: nobody → gliu10000
Status: NEW → ASSIGNED
Attachment #9271198 - Attachment description: Bug 694856 - Refactor looksLikeUrl and looksLikeOrigin to UrlUtils.jsm r?Gijs,adw → WIP: Bug 694856 - Refactor looksLikeUrl and looksLikeOrigin to UrlUtils.jsm r?Gijs,adw
Attachment #9271198 - Attachment description: WIP: Bug 694856 - Refactor looksLikeUrl and looksLikeOrigin to UrlUtils.jsm r?Gijs,adw → WIP: Bug 694856 - Refactor looksLikeUrl and looksLikeOrigin to UrlUtils.jsm r?mak,adw
Attachment #9271199 - Attachment description: Bug 694856 - Extend SelectionUtils.jsm to detect more classes of plaintext links r?mccr8,Gijs → WIP: Bug 694856 - Extend SelectionUtils.jsm to detect more classes of plaintext links r?mak,Gijs
Attachment #9271198 - Attachment description: WIP: Bug 694856 - Refactor looksLikeUrl and looksLikeOrigin to UrlUtils.jsm r?mak,adw → Bug 694856 - Refactor looksLikeUrl and looksLikeOrigin to UrlUtils.jsm r?mak,adw
Attachment #9271199 - Attachment description: WIP: Bug 694856 - Extend SelectionUtils.jsm to detect more classes of plaintext links r?mak,Gijs → Bug 694856 - Extend SelectionUtils.jsm to detect more classes of plaintext links r?mak,Gijs
Attachment #9271199 - Attachment description: Bug 694856 - Extend SelectionUtils.jsm to detect more classes of plaintext links r?mak,Gijs → WIP: Bug 694856 - Extend SelectionUtils.jsm to detect more classes of plaintext links r?mak,Gijs

Hey gliu! Are you still working on this? Can we help you get unstuck in some way?

Flags: needinfo?(gliu10000)

Hey Gijs! Thanks for checking in and for your patience. I was hoping to work on this between studying and working part-time, but it looks like I don't have much bandwidth. If others would like to take this, feel free to do so.

=== Info in case others want to take this ===
I'm not sure how stale the current commits are, but I think what's there should be current with all the changes I made. However, the module is in the old .jsm style modules (and I think that would need to be migrated to the new module system? (I haven't followed too closely with the new module system news))

In terms of what would be required to make this commit ready for merging:

  • bring back using the word boundaries of selections as a heuristic for whether link selection was intentional to avoid regressing. (see lines 91 onward in the old SelectionUtils.jsm)
  • implementing some sort of solution for failure in toolkit/modules/tests/browser/browser_BrowserUtils.js caused by a float being recognized as url due to changes in SelectionsUtils.jsm
  • merging the test for browser/base/content/test/general/browser_plainTextLinks.js with browser_contextmenu_linkselect.js in D143117

The other aspect is updating the open dependencies on the patches.

The part that could be a bit tricky is understanding browser_plainTextLinks.js and merging the tests.

=== Other thoughts ===
Right now, I'm eyeing around September (when I will be back in Toronto for university) for when I can continue work on this patch.

Flags: needinfo?(gliu10000)

Sorry, there was a problem with the detection of inactive users. I'm reverting the change.

Assignee: nobody → gliu10000
Status: NEW → ASSIGNED

The bug assignee is inactive on Bugzilla, so the assignee is being reset.

Assignee: gliu10000 → nobody
Status: ASSIGNED → NEW
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: