Closed Bug 12493 Opened 25 years ago Closed 22 years ago

layout does not canonicalize URIs for global history lookup (color of visited links) ( :visited :link )

Categories

(Core :: CSS Parsing and Computation, defect, P1)

defect

Tracking

()

VERIFIED FIXED
mozilla1.2alpha

People

(Reporter: chrispetersen, Assigned: alecf)

References

()

Details

(Keywords: testcase, topembed+, Whiteboard: fix in hand, suntrak-n6)

Attachments

(4 files, 7 obsolete files)

Version: Apprunner Build: 1999082412 (Aug 24th M9) Platform: All Expected Results: The vlink attribute should set the visited link to purple What I got: The link remains blue. Steps to reproduce: 1) Open http://slip/projects/marvin/html/body_vlink.html 2) Click on the link. This should load the cnn web page. 3) After the site loads, click on the back arrow. 4) The body_vlink page appears but the link remains as blue.
Assignee: troy → nisheeth
The style system is calling the web shell GetLinkState() function, so I guess the problem is not layout Nisheeth, I think you're still the current web shell owener
Assignee: nisheeth → scc
Re-assigning this to Scott, the new web shell owner...
Status: NEW → ASSIGNED
Summary: Vlink attribute not supported → [Webshell] Vlink attribute not supported
(Mass-) Re-assigning [Webshell] bugs to travis, on the advice of dp, et al. Many of these may be invalid in the new world of the re-written webshell, but he certainly needs to be aware of these issues. Hope this helps you, travis.
Status: NEW → ASSIGNED
Depends on: 13374
Priority: P3 → P1
Target Milestone: M15
Move to M15. Not required for beta 1 ... I think. Correct me if I'm wrong.
I believe this is a beta 1 issue since the vlink attribute is failing. Alink and link body attributes are functioning from my testing.
Keywords: beta1
per PDT, need more info. On the test page, the LINK is indeed not shown in the visited color. However, on lots of other web pages, visited links are shown correctly. Is the test case broken, or is this really a bug that lots of users will see?
Whiteboard: [NEED INFO]
Travis: Please comment on this. Why is this webshell related???
Ok, this is actually a Global History bug. Basically the problem is that when the URL is added to Global History, it is added usually fully qualified "http://www.cnn.com/". However the link has the href= to "http://www.cnn.com". Therefore the history code thinks there isn't a match when it tries to find the row. Global History should try and match up URLS that are actually the same URL, but with slightly different formatting. I've talked with Waterson as he is the original owner of this code and he thinks he can probably fix it pretty easily. I'm flipping it over to him.
Assignee: travis → waterson
Status: ASSIGNED → NEW
Removing dependency on 13374.
No longer depends on: 13374
PDT: This bug will cause some links to appear visited, while others won't, depending on the way that the page author has coded their links. My guess is that the impact of *not* fixing this will lead to several bug reports filed that "history doesn't work", as well as a bit of confusion with respect to what links are colored and what links aren't. I do not have an immediate fix, but am thinking that this is like a day's worth of work to sort out, get tested, and get checked in. IMO, this isn't a beta-stopper, but, there have been other link-coloring bugs marked PDT+, so I'll leave it to you to decide. By the way, Travis, thanks for tracking this down. Global history bugs like this have been popping up now and then for the last couple of months!
Whiteboard: [NEED INFO]
PDT-
Whiteboard: [PDT-]
Status: NEW → ASSIGNED
Target Milestone: M15 → M16
Attached patch proposed fix (obsolete) (deleted) — Splinter Review
I changed nsILinkHandler::GetLinkState() to take nsIURI. The style system now calls NS_NewURI() to create a canonical URI, which can be used to look up an entry in history. For example, NS_NewURI() canonifies "http://www.cnn.com/" and "http://www.cnn.com" to the same string.
fix checked in.
uh, let me try that again. FIX CHECKED IN.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
reopening bug; my changes caused performance to regress too seriously.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
restate bug to be what it really means.
Summary: [Webshell] Vlink attribute not supported → layout does not canonicalize URIs for global history lookup
add style and necko folks to cc list.
Target Milestone: M16 → M30
so, who wants to own this problem?
Isn't the reason this causes performance problems due to the other bug where style is calling GetLinkState 12 times for just one link? If that was fixed, would this still be a performance problem?
Yes, this is a performance problem because the style system calls GetLinkState() so many times, and for each time it calls, it needs to re-resolve the link's "href" attribute to a canonical URI. It is likely that if we fixed the style resolution code to call this fewer times, it would not be as much of a performance concern.
I think that problem seems to be fixed as well.
You can assign a bug to me (or Andreas actually) to make nsIURI::Resolve return a canonical format. I think it should.
Added keyword nsbeta2.
Keywords: nsbeta2
Whiteboard: [PDT-]
Putting on [nsbeta2+] radar. Moving to M16.
Whiteboard: [nsbeta2+]
Target Milestone: M30 → M16
pierre, I'm giving this back to you to figure out. I think it's rather silly to have this be a Beta2 blocker, as we currently color links exactly like 4.x does, but who am I to say.
Assignee: waterson → pierre
Status: REOPENED → NEW
Component: Layout → Style System
waterson: we are *not* coloring links exactly like 4.x does. The testcase above works in Nav4x but not in Moz. However I agree that this is not a beta blocker. Removing 'nsbeta2' keyword, moving to M20, marked dependency on bug 29611 ("GetLinkState is called too often").
Status: NEW → ASSIGNED
Depends on: 29611
Keywords: beta1, nsbeta2
Priority: P1 → P3
Whiteboard: [nsbeta2+]
Target Milestone: M16 → M20
Yes, you are right: this test case still does not work.
*** Bug 23210 has been marked as a duplicate of this bug. ***
Added nsbeta2 keyword from bug 23210. Fix needed for a:visited to work in Beta 2.
Keywords: nsbeta2
[nsbeta2-]
Whiteboard: [nsbeta2-]
An additional note that came up in related bug 37416: We need to make sure that whatever we do we treat the link as a link if it has an href, even if we cannot understand the protocol associated with the href.
ALERT! This bug occurs on almost every link on http://www.yahoo.com/. Putting under the beta2 radar again. Marking top100.
Keywords: top100
Whiteboard: [nsbeta2-]
Target Milestone: M20 → M17
Pierre, I think the Yahoo! situation is different. Their top-level links are to virtual diorectories that map to completly different URLs. For example, the link on www.yahoo.com for Arts & Humanities has an href of http://www.yahoo.com/r/ar but that causes a redirect to http://dir.yahoo.com/Arts/ I think IE and Nav treat the link as visited because they track the URL's that have been navigated through whereas we track the URLs that have actually been arrived at. My understanding of the original problem in this bug is where www.foo.bar is not matching www.foo.bar/ - the Yahoo! problem seems quite different and I think the solutions are very different as well. Maybe we really need a new bug for the Yahoo! problem?
You're right. I wasn't looking at the URL field... Putting back [nsbeta2-]. I'm going to open a separate bug for Yahoo.
Keywords: top100
Whiteboard: [nsbeta2-]
Target Milestone: M17 → M20
What is this bug currently covering? Pierre: did you ever file the separate bug?
David, the problem is that some links don't show as visited because they don't have a '/' at the end. Waterson checked in a fix but he had to remove it because of performance problems that are dependent on bug 29611. For Yahoo, I opened bug 41973.
Then how is this bug different from bug 41973? The addition of a slash when a directory name is given without a trailing slash is a standard feature of most web servers, accomplished through a redirect (just like any other redirect). The browser can't know if it's the right thing to do.
Pierre was referring only to slashes at the end of hostnames, as in http://www.cnn.com ...vs http://www.cnn.com/ ...which is not done by the server, it's done by canonicising the URI.
More exactly, the problem is that Web pages contain URLs in any form while the History only contains canonized URLs. Bug 41973 comes from the assumption that a page that causes a redirection isn't added to the History before causing the redirection. But as I'm writing this, I'm realizing that Marc and I could have been wrong because the URLs on Yahoo are not canonized either. We need to write a testcase with canonized URLs that cause a redirection. If it works, then bug 41973 is in fact a dup of bug 12493.
Ian Hickson: That's wrong. Servers do redirect http://www.cnn.com to http://www.cnn.com/ (with the trailing slash). We just can't know.
warren: I stand by my original remark. Compare the following HTTP logs: http://webtools.mozilla.org/web-sniffer/view.cgi?url=http%3A%2F%2Fwww.cnn.com http://webtools.mozilla.org/web-sniffer/view.cgi?url=http%3A%2F%2Fwww.cnn.com%2F http://webtools.mozilla.org/web-sniffer/view.cgi?url=http%3A%2F%2Fwww.cnn.com% 2F2000 http://webtools.mozilla.org/web-sniffer/view.cgi?url=http%3A%2F%2Fwww.cnn.com% 2F2000%2F The first pair is cnn.com without the slash and then with the slash, and the results from www.cnn.com are identical. The second pair are from the same server, but asking for the directory /2000, without and then with a slash. There you see the server redirecting if the slash is missing. Thus for THIS bug the problem is presumably that we do not canonicalize the href URI when comparing it to the global history, which presumably *is* canonicalized. However, there is another, separate bug which is about the problem of remembering the original URI and/or the URI after a redirect.
Nominating for nsbeta3. Vlink attribute (in this case) is not working properly in this case.
Keywords: nsbeta3
Denying for beta3. May be reevaluated if resources free up before the end of the beta cycle.
Whiteboard: [nsbeta2-] → [nsbeta2-][nsbeta3-]
*** Bug 49236 has been marked as a duplicate of this bug. ***
*** Bug 47333 has been marked as a duplicate of this bug. ***
Bug #49236 also relates to this roughly (similar but not equal), per Hixie. I'll still check it out later, and since I will anyway, Petersen, I'll take this one off your hands.
QA Contact: petersen → lorca
Whiteboard: [nsbeta2-][nsbeta3-] → [nsbeta2-][nsbeta3-] Must Check Bug#49236 When Fixed
Let me state for the record that this bug is extremely annoying since I use a single file to link from for my HTML tests. Grunt.
Nominating for rtm. Are we going to ship with vlink not working? Gerv
Keywords: rtm
Marking [rtm-] since we don't have a patch and the risk is too high.
Whiteboard: [nsbeta2-][nsbeta3-] Must Check Bug#49236 When Fixed → [nsbeta2-][nsbeta3-] Must Check Bug#49236 When Fixed [rtm-]
FWIW, I've noticed that the problem goes away when I delete my history.dat (I'm running Win98), and stays fixed until I install the next nightly build. (I'm currently running 20001030; if anyone wants me to, I'll write back after I switch up again.) Is the format of history.dat being changed nightly, or am I corrupting it when I install? Unless it's the latter, it's something testers may want to keep in mind.
Do not run two browsers at once!
My bad, it doesn't last until I update my version (I haven't updated yet, but it's back). It does still seem to go away temporarily if I delete history.dat , but not permanently.
*** Bug 61223 has been marked as a duplicate of this bug. ***
*** Bug 49974 has been marked as a duplicate of this bug. ***
Bug 49974 has patches too. http://bugzilla.mozilla.org/showattachment.cgi?attach_id=20978 http://bugzilla.mozilla.org/showattachment.cgi?attach_id=21154 I don't think i like the patches from that bug, but Can we please get traction on this bug? My theory and i'm probably dead wrong, since i don't speak protocols: <q src="bugzilla?bug_49974#2000-12-21_12:52"> We make link mappings a => a/ and then resolve(a) => a/; check(a/) => visited, or just mark(a)=visited when we visit it.</q> In real cases this means the following behavior is acceptable to me. V=visits, S=sees, X=appears visited, I=intermediate R=redirected v:www.cnn.com I:http://www.cnn.com I:http://www.cnn.com/ Xs:http://www.cnn.com Xs:/ v:www.cnet.com/ I:http://www.cnet.com/ Xv:http://www.cnet.com/ Xs:http://www.cnet.com v:http://www.cnet.com/new/ s:http://www.cnet.com/new v:http://www.cnn.com/redirect?http+www.time.com R:http://www.time.com I:http://www.time.com/ R:http://www.time.com/time/ Xs:http://www.time.com s:http://www.time.com/time Xs:http://www.time.com/ someone from QA should make a complete test description. It seems that history should include all intermediate steps. Then the only special cases are: root [/?] for which we can certainly hack support. and MixedCaseDomains which i guess a correct fix for this bug will have to address :-( A few reminders: unicode is now possible for domains. I wonder when WhiteHouse.com and whitehouse.com will become separate entities, since case can matter in other parts of unicode. Status clobbered because old referenced bug was marked as a dupe of this one. Borrowing all tags from Bug 49974. Marking correctness and borrowing (1) goat from hixie.
Whiteboard: [nsbeta2-][nsbeta3-] Must Check Bug#49236 When Fixed [rtm-] → suntrak-n6
Martin Horwath suggested I include a comment I wrote in bug 61223 (on a not- entirely-unrelated issue), and I said to myself "Why not?:)": Here's a sample URL that did fail for me at yahoo: http://srd.yahoo.com/drst/1036697/*http://home.earthlink.net/~hartmanc/ That said, it can be seen that the issue here is not simply that the HTML is not working here...its a joint failure of cache/history and HTML follow-through. Hope this makes sense, but check out the test case. [The above comment being relevant to this bug I think.] Thinking about it, I'm marking a dependency, since I think these two issues are related, while not necessarily being duplicates. The test case from the other bug: http://bugzilla.mozilla.org/showattachment.cgi?attach_id=21798
Blocks: 61223
Where does the responsibility for fixing this lie? Currently, SelectorMatches calls nsStyleUtil::IsHTMLLink, which calls nsHTML{Anchor,Area,Link}Element::GetHrefCString, which either calls NS_MakeAbsoluteURIWithCharset, which calls nsStdURL::Resolve. Who in that chain is responsible for changing http://www.cnn.com to http://www.cnn.com/ ? (Apparently the issue of redirects is now covered by bug 61223.)
*** Bug 29080 has been marked as a duplicate of this bug. ***
Adding mostfreq and nsbeta1 keywords.
Keywords: mostfreq, nsbeta1
*** Bug 66894 has been marked as a duplicate of this bug. ***
Reassigning QA Contact for all open and unverified bugs previously under Lorca's care to Gerardo as per phone conversation this morning.
QA Contact: lorca → gerardok
I kind of think that global history itself should be handling this, not layout.. (though the other stuff in chris's patch look valuble)
*** Bug 67195 has been marked as a duplicate of this bug. ***
*** Bug 73437 has been marked as a duplicate of this bug. ***
waterson: In four days, it will be the first anniversary of the patch you've attached to this most-frequently-reported and reasonably-high-visibility bug. It would be cool if we could celebrate it by closing the bug. Is there any chance of you tarting it up and reattaching it for review? Gerv
The important part of the patch is the NS_NewURI() call in ns[HTML|CSS]StyleSheet(). The problem is that this code is *the* hotspot for layout, and NS_NewURI() is slower than molasses. Now, attinasi moved a lot of the slow code out of SelectorMatches(), so maybe we could put it there, but I still think it'd cause a slowdown. This needs some more thought, and I'm not sure who the person to give that thought is...
is the only case the difference between http://<something> and http://<something>/? if so, we really could hardcode this in history much more easiliy - we just have to fix nsIGlobalHistory::GetIsVisited() to try appending or removing a trailing "/" from the passed-in URI when looking up the entry in the history DB.
No, this bug isn't restricted only to the final-slash vs no-final-slash distinction. For another example, visit this URL: http://www.wuffings.co.uk/Wuffingas/WuffingBText.html The URL for "an unnamed queen" will tickle the bug.
*** Bug 77463 has been marked as a duplicate of this bug. ***
Verified on build: 2001-05-15-04-Trunk platform: Win XP The link also does not change color in this platform.
this is a cross-platform bugs - we don't need any more "it doesn't work for me either" reports..
*** Bug 82406 has been marked as a duplicate of this bug. ***
*** Bug 84505 has been marked as a duplicate of this bug. ***
Priority: P3 → P2
Summary: layout does not canonicalize URIs for global history lookup → layout does not canonicalize URIs for global history lookup (color of visited links)
Target Milestone: --- → mozilla1.0
Build 2001-07-27-branch on WinNT and Linux Same is the behavior with test case of no DOCTYPE.
*** Bug 95687 has been marked as a duplicate of this bug. ***
*** Bug 96231 has been marked as a duplicate of this bug. ***
*** Bug 96437 has been marked as a duplicate of this bug. ***
*** Bug 96903 has been marked as a duplicate of this bug. ***
*** Bug 99189 has been marked as a duplicate of this bug. ***
Blocks: 104166
A contributor wrote me: "When canonicalizing URLs also URL-encoding has to be considered, so that non-encoded hrefs are matched against encoded history entries... Apparently some "special characters" (don't know which ones exactly) are stored in the history URL-encoded ('%3D' for '=', '%25' for '%' etc.) while in links they may appear as the original characters - and thus the unencoded characters in the href won't match with those in the history. Possibly this also applies the other way around - if 'non-special-chars' that are encoded in the href are decoded before they are stored in the history, they won't match either."
Pushing to 1.1 because a fix will likely affect performance.
Target Milestone: mozilla1.0 → mozilla1.1
I think this issue has a fix in bug 108988. Might be duping this bug to that one even though this one is much older. Jake
Bulk moving from Moz1.1 to future-P1. I will pull from this list when scheduling work post Mozilla1.0.
Priority: P2 → P1
Target Milestone: mozilla1.1 → Future
Again, isn't this fixed by bug 108988???? The testcase in the url works fine now since that bug was fixed. Marking dup of bug 108988 (fixed by 108988). jake *** This bug has been marked as a duplicate of 108988 ***
Status: ASSIGNED → RESOLVED
Closed: 25 years ago23 years ago
Resolution: --- → DUPLICATE
Bug 49236 was marked as a duplicate of this bug, but bug 108988 doesn't cover it.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
*** Bug 125644 has been marked as a duplicate of this bug. ***
*** Bug 128999 has been marked as a duplicate of this bug. ***
Attached file testcase (deleted) —
Severity: normal → major
Keywords: testcase
Summary: layout does not canonicalize URIs for global history lookup (color of visited links) → layout does not canonicalize URIs for global history lookup (color of visited links) ( :visited :link )
cc'ing myself
*** Bug 146248 has been marked as a duplicate of this bug. ***
*** Bug 152860 has been marked as a duplicate of this bug. ***
Assigning pierre's remaining Style System-related bugs to myself.
Assignee: pierre → dbaron
Status: REOPENED → NEW
*** Bug 152497 has been marked as a duplicate of this bug. ***
*** Bug 133443 has been marked as a duplicate of this bug. ***
*** Bug 150496 has been marked as a duplicate of this bug. ***
Keywords: mozilla1.2
oops, I thought this was still assigned to me.. back to me :) I've got a working patch, I'm just trying to figure out the best way to fix one minor thing (figuring out the correct location of the "/" in case the given url has a query, such as http://www.cnn.com?foo - we can't just append a "/" here.
Assignee: dbaron → alecf
Attached patch Canonicalize the path with SetSpec/GetSpec (obsolete) (deleted) — Splinter Review
I also slightly updated the nsILinkHandler interface to use nsACString rather than char* - it will save us link calculation
Attachment #7286 - Attachment is obsolete: true
looking for reviews/super-reviews...
Keywords: topembed+
Comment on attachment 93156 [details] [diff] [review] Canonicalize the path with SetSpec/GetSpec You should really use nsIProtocolHandler::NewURI to construct a nsIURI. nsStandardURL is not used by all protocol handlers, so this is likely to cause problems. Normalization algorithms, etc, may very well differ across protocols. I know it can be costly to call NewURI so many times, but I don't think it'd be that much more costly than this solution. Perhaps we can look at making nsStandardURL allocator use a recycling allocator or something to reduce the cost of calling NewURI in this patch.
Attachment #93156 - Flags: needs-work+
Comment on attachment 93156 [details] [diff] [review] Canonicalize the path with SetSpec/GetSpec r=radha.
Attachment #93156 - Flags: needs-work+
Comment on attachment 93156 [details] [diff] [review] Canonicalize the path with SetSpec/GetSpec r=radha.
Attachment #93156 - Flags: review+
Comment on attachment 93156 [details] [diff] [review] Canonicalize the path with SetSpec/GetSpec ok, thanks darin (putting back the needs-work, removing radha's r=)
Attachment #93156 - Flags: review+ → needs-work+
Attached patch use DefaultFixupURI (obsolete) (deleted) — Splinter Review
ok, I figured if we're going to use NS_NewURI, we might as well go all out and use nsIURIFixup - it will do all the special things like make sure local paths become file urls, and so forth. its the same URI resolution mechanism that the URL bar uses as well. in the process I also changed nsIURIFixup to use nsAString.
Attachment #93156 - Attachment is obsolete: true
adding darin and radha for followup reviews.. new patch attached! thanks.
Status: NEW → ASSIGNED
hmm... fixup is only used for stuff entered in the URL bar. that means for example that '\' will be converted to '/' on windows. that conversion is technically incorrect according to RFC2396, and for that reason the necko URL parsers do not do this kind of fixup. we only do it in the URL bar so as to accommodate those users that are accustomed to using '\' as a delimiter. so, i'm not sure that this is the right thing to do for history.
I have to agree with Darin, using urifixup on links is not what we want to do.
heh, I'll get it right one of these times! patch coming up.
Attached patch just use NS_NewURI (obsolete) (deleted) — Splinter Review
well, here's the updated patch - anyone mind if I switch nsIURIFixup while I'm there anyway, even if I'm not going to use it? its not frozen, and this does allow us to do some smart string stuff and avoid an extra copy or three...not to mention there is a one-line overlap with changing that API and changing nsWebShell.cpp anyway...
Attachment #93200 - Attachment is obsolete: true
Target Milestone: Future → mozilla1.2alpha
Comment on attachment 93210 [details] [diff] [review] just use NS_NewURI >Index: docshell/base/nsWebShell.cpp >+ // get the cached IO service >+ if (!mIOService) >+ mURIFixup = do_GetService(NS_IOSERVICE_CONTRACTID, &rv); oops! looks like you also added UTF8 <-> UCS2 conversions where there were previously ISOLatin1 <-> UCS2 conversion. from what i could tell, it seems like added goodness. sr=darin with that "oops" fix.
Attachment #93210 - Flags: needs-work+
Attached patch juse use NS_NewURI v1.01 (obsolete) (deleted) — Splinter Review
ok, finally got it right :) And yeah, I did get rid of two ToNewCString()'s which were doing Latin1->UCS2 conversions (Bad!) - so its added value :)
Attachment #93210 - Attachment is obsolete: true
Comment on attachment 93311 [details] [diff] [review] juse use NS_NewURI v1.01 sr=darin
Attachment #93311 - Flags: superreview+
Comment on attachment 93311 [details] [diff] [review] juse use NS_NewURI v1.01 >@@ -689,14 +696,35 @@ > } > > NS_IMETHODIMP >-nsWebShell::GetLinkState(const char* aLinkURI, nsLinkState& aState) >+nsWebShell::GetLinkState(const nsACString& aLinkURI, nsLinkState& aState) > { > aState = eLinkState_Unvisited; > > if(mGlobalHistory) > { (Argh -- 3-space indentation and wacky if( -- travis code?) >+ This newline seems better below, after rv's initialized declaration -- but see below for whether rv can be localized to an inner block. >+ // default to the given URI >+ nsCAutoString resolvedPath(aLinkURI); >+ >+ nsresult rv = NS_OK; >+ // get the cached IO service >+ if (!mIOService) >+ mIOService = do_GetService(NS_IOSERVICE_CONTRACTID, &rv); >+ >+ if (NS_SUCCEEDED(rv)) { >+ >+ // clean up the url using the right parser >+ nsCOMPtr<nsIURI> uri; >+ rv = NS_NewURI(getter_AddRefs(uri), aLinkURI, nsnull, nsnull, >+ mIOService); >+ >+ // now get the fully canonicalized path >+ if (NS_SUCCEEDED(rv)) >+ rv = uri->GetSpec(resolvedPath); >+ } >+ The second if should be in the first's then clause, no? // get the cached IO service if (!mIOService) { nsresult rv; mIOService = do_GetService(NS_IOSERVICE_CONTRACTID, &rv); if (NS_SUCCEEDED(rv)) { // clean up the url using the right parser nsCOMPtr<nsIURI> uri; rv = NS_NewURI(getter_AddRefs(uri), aLinkURI, nsnull, nsnull, mIOService); // now get the fully canonicalized path if (NS_SUCCEEDED(rv)) rv = uri->GetSpec(resolvedPath); } } No need to initialize rv, too. >@@ -78,9 +78,8 @@ > nsCOMPtr<nsIURI> uri; > PRUint32 newFixupFlags = aFixupFlags & ~FIXUP_FLAG_ALLOW_KEYWORD_LOOKUP; > >- nsAutoString tempString; >- tempString = Substring(uriString, 12, uriString.Length() - 12); >- rv = CreateFixupURI(tempString.get(), newFixupFlags, getter_AddRefs(uri)); >+ rv = CreateFixupURI(Substring(uriString, 12, uriString.Length() - 12), >+ newFixupFlags, getter_AddRefs(uri)); > if (NS_FAILED(rv) || !uri) > return NS_ERROR_FAILURE; Shouldn't uri always be non-null if NS_SUCCEEDED(rv)? >+ if (NS_FAILED(rv)) >+ *_retval = PR_FALSE; > else > *_retval = PR_TRUE; > How about just *_retval = !NS_FAILED(rv); (equivalent to *_retval = NS_SUCCEEDED(rv); but explicitly creates a boolean value)? Sorry for all the nit-picky whining :-). /be
just wondering if this handles the special case of port 80 tacked onto the end of the url (since that could be considered a default)... a page contains three urls http://www.yahoo.com/ http://www.yahoo.com:80/ http://www.yahoo.com:5000/ click either of the first two, and both the first and second url will show as visited, but not the third. This is just something I've noticed the competition does in their GH.
yes, NS_NewURI canonicalizes the port. that's one more reason why simply constructing a nsStandardURL and re-using that doesn't cut it. going through NS_NewURI causes the underlying nsIURI implementation to know the default port and remove the port from the URL string if the port matches the default port.
Attached patch just use NS_NewURI v1.02 (deleted) — Splinter Review
ok, brendan's nit-picky nits picked :) I re-indented GetLinkState entirely, and made it if (!mGlobalHistory) so the entire function didn't get an extra indentation. (and I'm automatically transferring darin's sr=)
Attachment #93311 - Attachment is obsolete: true
Comment on attachment 93323 [details] [diff] [review] just use NS_NewURI v1.02 carrying over sr=darin
Attachment #93323 - Flags: superreview+
Comment on attachment 93323 [details] [diff] [review] just use NS_NewURI v1.02 r=radha
Attachment #93323 - Flags: review+
Thanks everyone - I'm going to wait until the tree opens for 1.2 on the chance that this slows page loading.
Keywords: approval, helpwanted
Whiteboard: suntrak-n6 → fix in hand, suntrak-n6
*** Bug 161255 has been marked as a duplicate of this bug. ***
ok, this has been checked in. Cross your fingers this doesn't hurt pageload.. if not we'll back it out and maybe try a new approach.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago22 years ago
Resolution: --- → FIXED
amazingly, either this patch either sped up page load time by about 3msec, or bz's code consolidation did it. In any case, pageloading is actually faster, not slower, after this landed.. yay!
Actually, this might be the result of another patch that landed last night from bug 81724. There is said to be a 3% increase in page loading from that patch. See http://bugzilla.mozilla.org/show_bug.cgi?id=81724#c26 Either way, it is all good news :-) Nice work! Jake
Removing the double-lookup could have been a speedup.
This isn't working for me in 2002 081209 (Win2k, trunk). I have the following links on my home page, http://www.cs.hmc.edu/~jruderma/s/ : 1. http://www.google.com (missing trailing slash) 2. http://bugzilla.mozilla.org/ 3. http://www.dictionary.com (missing trailing slash) 4. http://www.m-w.com (missing trailing slash) #1 is only marked as visited if http://www.cs.hmc.edu/~jruderma/s/ is the first page loaded in a browser window (as the home page or after "open link in new window"). #2 is marked as visited, correctly. #3 and #4 are never marked as visited. In a build from last week, all of these links are marked as visited.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
http://bugzilla.mozilla.gr.jp/show_bug.cgi?id=208 I can reproduce with 2002081209-trunk/WinXP. I reproduced this case when I click "1" and back. But, I can't reproduce when I opened this page on new tab.
Hmm... this patch works first-time only: + // default to the given URI + nsCAutoString resolvedPath(aLinkURI); + nsresult rv; + + // get the cached IO service + if (!mIOService) { + mIOService = do_GetService(NS_IOSERVICE_CONTRACTID, &rv); + [... won't come here next-time to re-resolve |resolvedPath| ...] + } + + PRBool isVisited; + NS_ENSURE_SUCCESS(mGlobalHistory->IsVisited(resolvedPath.get(), &isVisited), + NS_ERROR_FAILURE); No wonder that it made pageloading faster :-)
oh good god I'm an idiot. nice catch - that was a result of trying to clean up from the earlier suggestions. new patch coming up. (That would also explain how I remember it working perfectly at SOME point :))
Status: REOPENED → ASSIGNED
*** Bug 161007 has been marked as a duplicate of this bug. ***
Attached patch actually call NS_NewURI/etc (obsolete) (deleted) — Splinter Review
ok, this is the fix - this makes it so that every time through, we call NS_NewURI() instead of only on the first time on the first link. this is a patch to the code that is in the tree (since the previous patch has been checked in) most of this patch is just re-indenting lines..the key is that it now looks like this: // get the cached IO service if (!mIOService) mIOService = do_GetService(NS_IOSERVICE_CONTRACTID); if (mIOService) { // clean up the url using the right parser nsCOMPtr<nsIURI> uri;
Attachment #95264 - Flags: superreview+
Comment on attachment 95264 [details] [diff] [review] actually call NS_NewURI/etc sr=darin
Comment on attachment 95264 [details] [diff] [review] actually call NS_NewURI/etc r=radha
Attachment #95264 - Flags: review+
ok, fix has landed - hopefully it stays fixed this time :) I'll be watching for page-load regressions.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago22 years ago
Resolution: --- → FIXED
reopening, this caused about a 2.1% increase in pageload times - yikes! darin, I'm thinking about going back to the standardurl method - I know its not entirely correct, but it seems like the cheapest solution - the only other option I could come up with is to keep a cache of schema->URIs that I could use. Though, you mentioned using a recycling allocator to make standardurl allocation faster.. any thoughts there? which protocols use standardurl? That would be a win if HTTP used standardurl - there are rarely links on web pages beyond HTTP and FTP, and maybe mailto:, no?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
here, I'll answer my own question: about: -> simple data: -> simple datetime: -> simple file: -> standard finger: -> simple ftp: -> standard gopher: -> standard http: -> standard https: -> http: -> standard jar: -> custom (nsJARURI) keyword: -> simple resource: -> custom (nsResURL, derives from standard) theme: -> simple (theme? new to me!) view-source: -> simple So the thing is, it looks the ones that use standard url (file, ftp, gopher, http, https) Looking over this, I almost wonder if we could somehow leverage nsIOService's cache of protocol handlers for each schema. Obviously nsIIOService is now frozen, but on a new interface, it would be really cool to see: AUTF8String normalizeURI(in AUTF8String uri); and store a single URI in each entry in the protocol handler cache - i.e. instead of storing a direct pointer to a protocol handler, we could store a combination of {handler, uri} and the uri could be reused during normalizeURI() Anyway, I'm stuck. This bug needs to be fixed but I need a performant means to do it.
Status: REOPENED → ASSIGNED
nsIIOService isn't frozen yet.. I had to backout my patch because of a smoketest blocker the following day. that said, where's the real cost. your code basically constructs a nsIURI and then gets nsIURI::spec. for nsStandardURL, that amounts to the following steps: 1) allocate nsStandardURL 2) parse URL string 3) normalize URL string (string copy) 4) get spec (string copy) A NormalizeURI function if implemented by the protocol handler could eliminate the allocation of a nsStandardURL object and the string copy in step 3. It'd be interesting to see what kind of performance gain you get by just reusing a global nsStandardURL object.
ok, after a suggestion from darin to optimize for web pages, which means optimizing for http and https (and I'm adding FTP) I've got a new version which caches a URL for each of http, https, and ftp - then it tries those 3 first, before continuing on. As for cost of each of these in the previous attempt at fixing this: > 1) allocate nsStandardURL I think this is where the bulk of our expense lies - mostly because it goes through the component manager, does a bunch of addref/release/etc - caching these 3 uris will hopefully speed us up. > 2) parse URL string seems cheap? > 3) normalize URL string (string copy) again, probably cheap, but the normalization could involve shifting the string, looking up ports, etc > 4) get spec (string copy) this is unavoidable if we're going through necko at all - we have to get the string back somehow :) Anyhow, darin why don't you let me know what you think.
Attachment #95264 - Attachment is obsolete: true
Comment on attachment 95437 [details] [diff] [review] custom NormalizeWellKnownURI for http, https, ftp sr=darin (though i might have just put everything inside one function called NormalizeURI)
Attachment #95437 - Flags: superreview+
yeah, funny I realized that while I was eating lunch this one combines it all into one function, and making it much easier to understand whats going on. I'm avoiding setting the spec back to "" in the non-cached state, figuring a stack variable test is cheaper than sending a parser to parse an empty string. man I hope this is the last attempt!
Attachment #95437 - Attachment is obsolete: true
Comment on attachment 95454 [details] [diff] [review] custom NormalizeURI for http, https, ftp r=radha
Attachment #95454 - Flags: review+
Comment on attachment 95454 [details] [diff] [review] custom NormalizeURI for http, https, ftp sr=darin
Attachment #95454 - Flags: superreview+
I forgot, I needed this to just to quiet down some warnings from necko about empty URLs, from calling uri->SetSpec(""); can I get reviews on this too? it just make sure that when we set the spec to an empty string, that we don't spit out a warning every time (i.e. otherwise we spit one out for every link on the page)
Comment on attachment 95466 [details] [diff] [review] make simple/standard URIs handle empty strings >Index: nsStandardURL.cpp >+ if (!spec || input.Length() == 0) nit: since we've already gone through the trouble of getting a pointer to the flat buffer, how about checking !*spec instead? e.g.: if (!(spec && *spec)) with that, sr=darin
doh, checked this in a few days back.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago22 years ago
Resolution: --- → FIXED
verified
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: