Closed Bug 45739 Opened 24 years ago Closed 17 years ago

Extract first well formed URL in URL field for URL link

Categories

(Bugzilla :: Creating/Changing Bugs, enhancement, P3)

2.10
enhancement

Tracking

()

RESOLVED WONTFIX

People

(Reporter: sidr, Unassigned)

References

Details

The URL fields of some bugs contain other text instead of, or along with, well-formed URLs. Such text should not be part of the _URL_ link displayed by show_bug.cgi before the URL field. Examples:
1. "all pages at Disney"
2. "www.netscape.com"
3. "any catalog page at http://www.xyz.corp/"
4. "http://unicode.org/iuc/iuc10/x-utf8.html and http://home.netscape.com/ko and http://www.joins.com" (taken from bug 45543)
Since examples 3 and 4 contain valid URLs, the _URL_ link would be more useful if it had an HREF consisting of the first valid URL in the field rather than the entire field.
RFE bug 37684, "[RFE] automatic http:// in bugzilla URL field", would have enter_bug.cgi amend example 2 to make it valid, so new URL fields of that sort would no longer appear, but existing bugs with similar URL fields would remain.
RFE bug 42609, "Only allow well formed URLs in URL field of bug report", would have enter_bug.cgi reject all of the above because they all contain text that is not part of a single well-formed URL. Again, existing bugs would be unchanged.
This RFE would apply to show_bug.cgi, to make a useful link out of the contents of the URL field, whatever it may contain, if possible. The corollary would be that the effective _URL_ link HREF for example 1 would be "", rather than the amusing "http://bugzilla.mozilla.org/all pages at Disney".
IIRC, this could be fixed with a single regex assignment in the right place, although if there is not already a distinction between the field contents and the HREF contents, a new variable would be needed.
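The extraction proposed above can be sketched roughly as follows. This is an illustrative Python sketch, not Bugzilla code: the function name first_url, the pattern URL_RE, and the working definition of "well formed" (an http, https, or ftp scheme followed by non-whitespace characters) are all assumptions made for the example.

```python
import re

# Assumption: a "well-formed URL" is approximated as an http/https/ftp
# scheme followed by "://" and a run of non-whitespace characters.
URL_RE = re.compile(r"\b(?:https?|ftp)://\S+", re.IGNORECASE)


def first_url(bug_file_loc: str) -> str:
    """Return the first URL-like substring of the field, or "" if none."""
    match = URL_RE.search(bug_file_loc)
    return match.group(0) if match else ""
```

Under these assumptions, examples 1 and 2 yield an empty HREF, while examples 3 and 4 yield the first URL in the field.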
The original proposal would make a usable _URL_ link from URL fields (bug_file_loc) resembling examples 3 and 4 above. A simple extension could improve it to handle most cases resembling example 2 by doing something like the following iff no well formed URL was found:
- prepend "http://" to the first match for www\.[-\w]+\.*+ and assign to bug_file_loc_url
- prepend "ftp://" to the first match for ftp\.[-\w]+\.*+ and assign to bug_file_loc_url
where bug_file_loc_url would be the value for the HREF of the _URL_ <A> element.
Amending Summary from "RFE: Restrict URL field link to first well formed URL in field" to "RFE: Extract first well formed URL in URL field for URL link" for clarity.
Summary: RFE: Restrict URL field link to first well formed URL in field → RFE: Extract first well formed URL in URL field for URL link
Gack. For www\.[-\w]+\.*+ please read www\.[-[[:alnum:]]]+\..*+([[:>:]]|$) The latter should match only the (non-well-formed) URL in something like "www.some-domain.tld/~jqpublic/cgi/cgi.cgi?query=query+x;n=1#end don't work!"
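The fallback for scheme-less hosts could look something like the sketch below. It is a hedged Python illustration only: link_href, SCHEME_URL_RE, and BARE_HOST_RE are hypothetical names, and the bare-host pattern is a simplified stand-in for the POSIX-style expression quoted above (Python's re module has no [[:alnum:]] or [[:>:]] syntax), so it simply stops at the first whitespace.

```python
import re

# Assumption: same rough notion of "well formed" as a scheme plus "://".
SCHEME_URL_RE = re.compile(r"\b(?:https?|ftp)://\S+", re.IGNORECASE)
# Simplified stand-in for the patterns in the comments: a host starting
# with "www." or "ftp.", a label, a dot, then everything up to whitespace.
BARE_HOST_RE = re.compile(r"\b(www|ftp)\.[-\w]+\.\S+", re.IGNORECASE)


def link_href(bug_file_loc: str) -> str:
    """Pick an HREF: first full URL, else first bare host with a scheme added."""
    match = SCHEME_URL_RE.search(bug_file_loc)
    if match:
        return match.group(0)
    match = BARE_HOST_RE.search(bug_file_loc)
    if match is None:
        return ""
    # Hosts beginning with "ftp." get ftp://, hosts beginning with "www." get http://.
    scheme = "ftp" if match.group(1).lower() == "ftp" else "http"
    return f"{scheme}://{match.group(0)}"
```

With this sketch, a field such as "www.some-domain.tld/~jqpublic/cgi/cgi.cgi?query=query+x;n=1#end don't work!" would link to the host and path only, leaving the trailing words out of the HREF.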
Target Milestone: --- → Future
reassign
Assignee: tara → myk
Component: Bugzilla → Creating/Changing Bugs
Product: Webtools → Bugzilla
Version: other → 2.10
Depends on: 95065
Summary: RFE: Extract first well formed URL in URL field for URL link → Extract first well formed URL in URL field for URL link
QA Contact: mattyt-bugzilla → default-qa
Target Milestone: Future → ---
Assignee: myk → create-and-change
Strings passed to the URL field may be quite complex; see, e.g., the one in bug 413652. Parsing them reliably is almost impossible (and probably error-prone). Problems like the one described in the 4th example of comment 0 are so rare that I wouldn't waste my time writing a complex parser in Bugzilla::Bug->_check_bug_file_loc().
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → WONTFIX