Closed Bug 261608 Opened 20 years ago Closed 20 years ago

Google search (default) in location bar with : or . in terms fails

Categories

(Core :: DOM: Navigation, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: Waldo, Assigned: Waldo)

References

Details

Attachments

(1 file, 6 obsolete files)

Type 'chewbacca site:starwars.com' into the location bar. Assuming the user doesn't have 'chewbacca' defined as a keyword, the expected result is to visit: http://www.google.com/search?&q=chewbacca%20site:starwars.com&sourceid=mozilla-search However, the characters ":" and "." fool Firefox into thinking that the stuff entered is a URL, and the user gets the message: The URL is not valid and cannot be loaded. This becomes a problem if I often visit a long URL that's the top result on Google within a specific site. For example, as I go through Firefox nightlies I often need to download a special CA and personal certificate in order to use parts of MIT's website. Within Google I can access the webpage by typing "certificates site:mit.edu" in Google. (I could also type site:mit.edu certificates", but I wouldn't expect that to work because it should be interpreted as a URL within the nonexistent site: protocol.) Firefox, however, restricts me to using the workaround "google certificates site:mit.edu" within the location. In addition to being extra typing, it requires a second mouse click to get where I want to go. Incidentally, this is annoying enough that I've done some searching to find the problem. The problem is here: http://lxr.mozilla.org/aviarybranch/source/docshell/base/nsDefaultURIFixup.cpp#679 I've cced vlad in case he thinks this is something he might be able to handle reasonably quickly.
This bug was surprisingly easy for me to fix, given that my knowledge of C++ is only maybe an hour's worth of learning past JavaScript. I spent most of the time just tracking through the code to verify that I'd found the function that contained the faulty logic. For the solution, instead of outright rejecting all strings that contain a ":" or ".", the code simply searches to determine whether the suspect characters are before or after the first space. If they're after the first space or aren't in the string, the code continues to evaluate the URL as a keyword search. I also added some examples of what this would enable to the function's documentation. In my testing of a Linux trunk build with this fix I haven't found any difference in behavior with both normal keyword searches and with the special type that this bug allows. Now, who to review this? (Ideally I'd like to see this fixed in 1.0 because it's personal dogfood as I described above, so I need someone who'll respond relatively quickly.)
Assignee: bugs → jwalden+bmo
Status: NEW → ASSIGNED
Component: Location Bar and Autocomplete → XPCOM
Product: Firefox → Browser
Version: unspecified → Trunk
Comment on attachment 160213 [details] [diff] [review] Exec keyword search iff first space-separated substring doesn't contain [.:] This patch fixes a bug that causes some useful strings that can be used in default searches to fail in Firefox. The fix is pretty simple, and as long as during a review you don't see any holes in my logic it should be pretty much good to go. I've tested on Linux, and it seems to work well.
Attachment #160213 - Flags: review?(dougt)
Component: XPCOM → Embedding: Docshell
Comment on attachment 160213 [details] [diff] [review] Exec keyword search iff first space-separated substring doesn't contain [.:] Ben, can you review this and hopefully approve it for aviary checkin? This is something that would be really helpful for me in day-to-day use.
Attachment #160213 - Flags: review?(dougt) → review?(bugs)
Comment on attachment 160213 [details] [diff] [review] Exec keyword search iff first space-separated substring doesn't contain [.:] I just noticed this breaks "?mozilla". Patch coming up ASAP...
Attachment #160213 - Attachment is obsolete: true
Attachment #160213 - Flags: review?(bugs)
I've checked every string mentioned in the documentation for the bug and received the results I expected (by comparing them with an aviary branch build's results). This patch should work correctly.
This patch is relatively trivial in that the only *real* change occurs in one line. I've tested the usage cases outlined in docs, and they function properly. This is also something I could really use as I switch between nightlies and erase and create new profiles. It should be safe for the aviary branch, because the function I change is only called once in code, and that's for the specific situation I tracked here.
Flags: blocking-aviary1.0?
Comment on attachment 160315 [details] [diff] [review] Fixes the case '?mozilla', which was broken w/prev. patch Ben, can you review this? It enables using the Location Bar in Firefox for a search like 'firefox site:mozilla.org'. I've tested the usage cases listed in source (and added a few new ones as well), and I fixed the bug in the previous version. This is also something that would be a big help to me day-to-day as I erase and create new profiles for nightly builds.
Attachment #160315 - Flags: review?(bugs)
Flags: blocking-aviary1.0? → blocking-aviary1.0+
I don't review changes to that code.
Flags: blocking-aviary1.0+ → blocking-aviary1.0-
Comment on attachment 160315 [details] [diff] [review] Fixes the case '?mozilla', which was broken w/prev. patch Okay, let's try someone else then. Still hoping to get this before 1.0...
Attachment #160315 - Flags: review?(bugs) → review?(caillon)
*** Bug 262605 has been marked as a duplicate of this bug. ***
(In reply to comment #11) > This is bug 79655. It would seem so, although some of the problems described there seem to have been worked around in the urlbar handling logic of Firefox or have been fixed in the meantime. This bug is far newer, and *usually* it would be marked as a dup of that one. However, because the patch is here, it makes more sense to keep both open, get r+sr+a (this would need sr now that I think of it), and then mark both fixed (assuming all of Seamonkey's issues get fixed by this -- that bug might need to be morphed to cover Seamonkey-specific urlbar handling logic if the problems still exist). This prevents bugspam from dup notification, reposting the patch, and asking for a review there.
I agree, whish is why I didn't dupe it. Just thought I'd let you know. This patch looks to me like it addresses the issues of that bug in one way, but how does it compare to attachment 75174 [details] [diff] [review]?
Comment on attachment 160315 [details] [diff] [review] Fixes the case '?mozilla', which was broken w/prev. patch Are you sure you attached the right patch? Looks like "?mozilla" is still broken here. Anyway, use kNotFound instead of -1
Attachment #160315 - Flags: review?(caillon) → review-
Attached patch The real updated patch (obsolete) (deleted) — Splinter Review
(In reply to comment #14) > Are you sure you attached the right patch? Gah! Curses! How did that happen!? > Anyway, use kNotFound instead of -1 Done, including in one spot I'd previously not touched but was within the area I was tweaking.
Attachment #160315 - Attachment is obsolete: true
Attachment #161108 - Flags: review?(caillon)
Comment on attachment 161108 [details] [diff] [review] The real updated patch >Index: mozilla/docshell/base/nsDefaultURIFixup.cpp >=================================================================== >RCS file: /cvsroot/mozilla/docshell/base/nsDefaultURIFixup.cpp,v >retrieving revision 1.40 >diff -u -8 -r1.40 nsDefaultURIFixup.cpp >--- mozilla/docshell/base/nsDefaultURIFixup.cpp 9 Sep 2004 14:27:52 -0000 1.40 >+++ mozilla/docshell/base/nsDefaultURIFixup.cpp 5 Oct 2004 02:49:55 -0000 >@@ -696,33 +696,43 @@ > nsresult nsDefaultURIFixup::KeywordURIFixup(const nsACString & aURIString, > nsIURI** aURI) > { > // These are keyword formatted strings > // "what is mozilla" > // "what is mozilla?" > // "?mozilla" > // "?What is mozilla" >+ // "?docshell site:mozilla.org" - first space-separated string contains no colons or dots >+ // "docshell site:mozilla.org" > > // These are not keyword formatted strings >- // "www.blah.com" - anything with a dot in it >- // "nonQualifiedHost:80" - anything with a colon in it >+ // "www.blah.com" - anything with the first space-separated substring containing a dot >+ // "www.blah.com stuff" >+ // "?www.blah.com" >+ // "nonQualifiedHost:80" - anything with the first space-separated substring containing a colon >+ // "nonQualifiedHost:80 args" >+ // "?site:intranetSite" It's unclear to me why the above line is not a keyword search. I don't think that we would do anything with treating "?site:nonQualifedHost" as a non-keyword anyway (the load will probably throw an error). However, you are not mandating the characters to be "site" so someone could enter something like "?Magic: The Gathering" which would get treated as a non-keyword using your algorithm. I'd argue that anything starting with a ? should get treated as a keyword search, though I could perhaps be persuaded to treat "?www.mozilla.org" as a non-keyword. > // "nonQualifiedHost?" > // "nonQualifiedHost?args" > // "nonQualifiedHost?some args" > >- if(aURIString.FindChar('.') == -1 && aURIString.FindChar(':') == -1) >+ PRInt32 dotLoc = aURIString.FindChar('.'); >+ PRInt32 colonLoc = aURIString.FindChar(':'); >+ PRInt32 spaceLoc = aURIString.FindChar(' '); >+ >+ if(((dotLoc == kNotFound) || ((spaceLoc > 0) && (spaceLoc < dotLoc))) && >+ ((colonLoc == kNotFound) || ((spaceLoc > 0) && (spaceLoc < colonLoc)))) > { > PRInt32 qMarkLoc = aURIString.FindChar('?'); >- PRInt32 spaceLoc = aURIString.FindChar(' '); > > PRBool keyword = PR_FALSE; > if(qMarkLoc == 0) > keyword = PR_TRUE; >- else if((spaceLoc > 0) && ((qMarkLoc == -1) || (spaceLoc < qMarkLoc))) >+ else if((spaceLoc > 0) && ((qMarkLoc == kNotFound) || (spaceLoc < qMarkLoc))) > keyword = PR_TRUE; > > if(keyword) Hm, the boolean and two ifs can be combined into one larger if statement.
Attachment #161108 - Flags: review?(caillon) → review-
Attached patch Patch, the next iteration (obsolete) (deleted) — Splinter Review
(In reply to comment #16) > >+ // "?site:intranetSite" > > It's unclear to me why the above line is not a keyword search. I don't think > that we would do anything with treating "?site:nonQualifedHost" as a > non-keyword anyway (the load will probably throw an error). However, you are > not mandating the characters to be "site" I didn't/wouldn't mandate the characters to be site or any such string, because Google also has at least intitle, inurl, filetype, cache, link, related, and define. It's also not worth bloating code to fix only a specific set of cases. As for not doing anything with "?site:nonQualifiedHost", I was trying to modify the original algorithm as little as possible to allow a few specific safe usages, and this just got banned with the modifications I made; the addition to the list was to make that fact clear. > so someone could enter something like > "?Magic: The Gathering" which would get treated as a non-keyword using your > algorithm. I'd argue that anything starting with a ? should get treated as a > keyword search, though I could perhaps be persuaded to treat "?www.mozilla.org" > as a non-keyword. Google redirects to the URL if you do an I'm Feeling Lucky search for a URL, so I don't see a real reason not to treat it as a keyword. Incidentally, do you know why are searches beginning with "?" considered keywords? Some forgotten historical precedent, perhaps? > > PRBool keyword = PR_FALSE; > > if(qMarkLoc == 0) > > keyword = PR_TRUE; > >- else if((spaceLoc > 0) && ((qMarkLoc == -1) || (spaceLoc < qMarkLoc))) > >+ else if((spaceLoc > 0) && ((qMarkLoc == kNotFound) || (spaceLoc < qMarkLoc))) > > keyword = PR_TRUE; > > > > if(keyword) > > Hm, the boolean and two ifs can be combined into one larger if statement. Done, although the if() gets really ugly (probably why it was split up initially) in my opinion (which is, admittedly, the opinion of one who isn't really familiar with C++ yet, and so might not be the best judge of ugliness). Also, in making "?" at beginning always trigger keyword search, there's some repetition of (qMarkLoc == 0) in both if()s, but I don't know of a good way to move qMarkLoc into a single if() in a way that doesn't lower the coding clarity or increase the probable compiled code size.
Attachment #161108 - Attachment is obsolete: true
Attachment #161327 - Flags: review?(caillon)
Comment on attachment 161327 [details] [diff] [review] Patch, the next iteration >@@ -696,34 +697,36 @@ > nsresult nsDefaultURIFixup::KeywordURIFixup(const nsACString & aURIString, > nsIURI** aURI) > { > // These are keyword formatted strings > // "what is mozilla" > // "what is mozilla?" >- // "?mozilla" >- // "?What is mozilla" >+ // "docshell site:mozilla.org" - has no dot/colon in the first space-separated substring >+ // "?mozilla" - anything that begins with a question mark >+ // "?site:mozilla.org docshell" > > // These are not keyword formatted strings >- // "www.blah.com" - anything with a dot in it >- // "nonQualifiedHost:80" - anything with a colon in it >+ // "www.blah.com" - first space-separated substring contains a dot, doesn't start with "?" >+ // "www.blah.com stuff" >+ // "nonQualifiedHost:80" - first space-separated substring contains a colon, doesn't start with "?" >+ // "nonQualifiedHost:80 args" > // "nonQualifiedHost?" > // "nonQualifiedHost?args" > // "nonQualifiedHost?some args" > >- if(aURIString.FindChar('.') == -1 && aURIString.FindChar(':') == -1) >+ PRInt32 dotLoc = aURIString.FindChar('.'); >+ PRInt32 colonLoc = aURIString.FindChar(':'); >+ PRInt32 spaceLoc = aURIString.FindChar(' '); >+ PRInt32 qMarkLoc = aURIString.FindChar('?'); >+ >+ if((((dotLoc == kNotFound) || ((spaceLoc > 0) && (spaceLoc < dotLoc))) && >+ ((colonLoc == kNotFound) || ((spaceLoc > 0) && (spaceLoc < colonLoc)))) || >+ qMarkLoc == 0) Hm, I think this is right. Please do testing against all the cases above, though. Also note, you are over-bracing everything. The above would be re-written as: if ((dotLoc == kNotFound || (spaceLoc > 0 && spaceLoc < dotLoc)) && (colonLoc == kNotFound || (spaceLoc > 0 && spaceLoc < colonLoc)) || qMarkLoc == 0) >+ if(qMarkLoc == 0 || >+ ((spaceLoc > 0) && ((qMarkLoc == kNotFound) || (spaceLoc < qMarkLoc)))) And, if you add the above if, everything would combine into: if ((dotLoc == kNotFound || (spaceLoc > 0 && spaceLoc < dotLoc)) && (colonLoc == kNotFound || (spaceLoc > 0 && spaceLoc < colonLoc)) && (spaceLoc > 0 && (qMarkLoc == kNotFound || spaceLoc < qMarkLoc)) || qMarkLoc == 0) There probably is a way to whittle this down further, but I have been looking at this too long and have other things to do. Chopping off braces helps make this much more readable though.
Attachment #161327 - Flags: review?(caillon) → review+
Attached patch Tweaked version of prev. patch (obsolete) (deleted) — Splinter Review
I retested all the testcases again, and they all work properly. I tested the latter set of testcases with a valid nonQualifiedHost, so when each permutation is typed into Google the first result is a completely different site (because an invalid nonQualifiedHost is later passed on as a Google search).
Attachment #161327 - Attachment is obsolete: true
Attached patch Tweaked version of prev. patch (diff -uw) (obsolete) (deleted) — Splinter Review
Comment on attachment 163656 [details] [diff] [review] Tweaked version of prev. patch Carrying over r+ from previous patch, asking for sr...
Attachment #163656 - Flags: superreview?(darin)
Attachment #163656 - Flags: review+
Comment on attachment 163656 [details] [diff] [review] Tweaked version of prev. patch >Index: mozilla/docshell/base/nsDefaultURIFixup.cpp >+ nsCAutoString keywordSpec("keyword:"); >+ char *utf8Spec = ToNewCString(aURIString); // aURIString is UTF-8 >+ if(utf8Spec) >+ { >+ char* escapedUTF8Spec = nsEscape(utf8Spec, url_Path); >+ if(escapedUTF8Spec) > { >+ keywordSpec.Append(escapedUTF8Spec); >+ NS_NewURI(aURI, keywordSpec.get(), nsnull); >+ nsMemory::Free(escapedUTF8Spec); >+ } // escapedUTF8Spec >+ nsMemory::Free(utf8Spec); >+ } // utf8Spec >+ } // dotLoc This could all be simplified down to: nsCAutoString keywordSpec("keyword:"); keywordSpec.Append(aURIString); NS_NewURI(aURI, keywordSpec); The implementation of NS_NewURI will perform the escaping itself, so there is no need to escape here. All you have to do is make sure that the string passed to NS_NewURI is UTF-8, which in this it is.
Attachment #163656 - Flags: superreview?(darin) → superreview-
Attached patch Addresses sr comments (deleted) — Splinter Review
This patch changes the indicated code as requested above. I removed the "// dotLoc" comment after the closing brace because it's not needed for such a short code block. Hopefully, that's it now. Aah, the sound of a smaller codebase...
Attachment #163656 - Attachment is obsolete: true
Attachment #163657 - Attachment is obsolete: true
Comment on attachment 167370 [details] [diff] [review] Addresses sr comments Moving forward the r+, asking for sr again...
Attachment #167370 - Flags: superreview?(darin)
Attachment #167370 - Flags: review+
Comment on attachment 167370 [details] [diff] [review] Addresses sr comments sr=darin
Attachment #167370 - Flags: superreview?(darin) → superreview+
Checking in docshell/base/nsDefaultURIFixup.cpp; /cvsroot/mozilla/docshell/base/nsDefaultURIFixup.cpp,v <-- nsDefaultURIFixup.cpp new revision: 1.42; previous revision: 1.41 done
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
*** Bug 273157 has been marked as a duplicate of this bug. ***
*** Bug 79655 has been marked as a duplicate of this bug. ***
*** Bug 293764 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: