Closed Bug 103916 Opened 23 years ago Closed 23 years ago

nsStdURL::GetSpec() is 2.6% of main1()

Categories

(Core :: Networking, defect, P1)

x86
Windows 2000
defect

Tracking


RESOLVED FIXED
mozilla0.9.7

People

(Reporter: dp, Assigned: darin.moz)

References

Details

(Keywords: perf)

Attachments

(2 files, 19 obsolete files)

(deleted), text/plain
(deleted), patch (darin.moz: review+, darin.moz: superreview+)
Func                #calls   F ms   F+D ms    Module
-----------------------------------------------------------
main1                    1      0   3,279ms   mozilla.exe
nsStdURL::GetSpec     2832      0      86ms   necko.dll

That is about 2.6% of main1's time. In general, nsStdURL:: allocates way too many times (small quantities, but too many allocations).
Blocks: 7251
Keywords: perf
-> 0.9.7
Priority: -- → P2
Target Milestone: --- → mozilla0.9.7
-> ok.. 0.9.6
Target Milestone: mozilla0.9.7 → mozilla0.9.6
Status: NEW → ASSIGNED
Priority: P2 → P1
from browser startup to displaying http://mozilla.org/start/, this is what i found:

count_GetSpec = 3801           count_SetSpec = 875
count_GetPrePath = 1           count_SetPrePath = 0
count_GetScheme = 680          count_SetScheme = 0
count_GetPreHost = 1           count_SetPreHost = 0
count_GetUsername = 0          count_SetUsername = 0
count_GetPassword = 0          count_SetPassword = 0
count_GetHost = 246            count_SetHost = 0
count_GetPort = 8              count_SetPort = 0
count_GetPath = 4044           count_SetPath = 0
count_Equals = 197             count_SchemeIs = 595
count_Clone = 43               count_Resolve = 996
count_GetFilePath = 0          count_SetFilePath = 0
count_GetParam = 0             count_SetParam = 0
count_GetQuery = 1             count_SetQuery = 0
count_GetRef = 4               count_SetRef = 0
count_GetDirectory = 4         count_SetDirectory = 0
count_GetFileName = 0          count_SetFileName = 0
count_GetFileBaseName = 0      count_SetFileBaseName = 0
count_GetFileExtension = 232   count_SetFileExtension = 0
count_GetEscapedQuery = 0
count_GetFile = 54             count_SetFile = 0

this suggests to me that storing the URL in decomposed form is really not worth it. we should really be storing the URL as a single string with pointers to the particular sections. that way, we'd be optimizing for the Getter methods, and in particular GetSpec.
Totally. I came to the same conclusion by looking at quantify data. Go go go...
This storing method might be difficult. You have to distinguish between a nonexistent component and an empty component; for example, there is a difference between http://host/path?query and http://host/path? . Both have a query, but in the latter it is empty. That is different from http://host/path, which has no query at all.
The more I think about this, I'm pretty sure it will not work, or at least be very difficult. We currently store the components as we get them and return isolated components unescaped (safe), but composed components accessible through GetSpec or GetPath, for example, escaped. That is necessary to retain url syntax, and it will be very difficult to achieve with the proposed solution, I think. Instead I would propose to move the old structure to the string world and see if that alone helps. If not, it might be possible to cache the whole spec inside, in addition to the components, to speed up GetSpec.
andreas: i'm not convinced that it's not possible to store the whole string with simple offsets to each component. i was thinking of storing a starting position integer and a string length integer. if the component is present but zero-length, then its length value will be 0. if its delimiter is not present, then it will have a length of -1. the important thing is to store a normalized version of the spec, so that Equals can be a simple strcmp.
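A minimal sketch of that representation (URLSegment is the name the later patches in this bug use; the surrounding class layout is assumed):

    // each component is an (offset, length) pair into one normalized spec
    struct URLSegment {
        PRUint32 mPos;  // start offset into mSpec
        PRInt32  mLen;  // -1 = delimiter/component not present,
                        //  0 = present but empty
    };

    class nsStandardURL {
        nsCString  mSpec;    // the full, normalized spec
        URLSegment mScheme;  // e.g. for "http://host/path", (0, 4)
        URLSegment mHost;
        URLSegment mPath;
        // GetSpec just returns mSpec; Equals is a strcmp of two
        // normalized specs; getters return slices of mSpec.
    };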
about GetSpec(): we should determine what the breakdown is within GetSpec(). I think that quite a bit of it (and its callers) is in doing malloc and free. I'll go quantify and get some numbers. the caller will do a malloc and free, and within GetSpec() we'll do a malloc and free on GetPath(). see bug http://bugzilla.mozilla.org/show_bug.cgi?id=104929 . I want to fix GetPath() and GetSpec() to take a nsCAutoString reference, so that the caller can use a stack based string and we can pass that through.
seth: nsIURI/nsIURL are nearly frozen interfaces... we may run into some serious resistance if we make changes to those interfaces. perhaps we need to invent some new internal interfaces for working with URI/L's within mozilla?
Seth: Don't we have an nsACString& idl type somewhere? Discussions on irc imply that the answer is no. We definitely don't have an nsCAutoString& idl type. You definitely don't want to hard code a specific string type. darin: internal interfaces would suck. If it's not frozen, then let's change it before it is - why not give external consumers the perf increase? Of course, there are a whole lot of callers, and all of them would have to be individually modified...
I met with darin, and we looked at the quantify data. we can get a win if we add a *non scriptable* method: [noscript] void GetSpecConst([shared] out wstring aConstValue); This is similar in spirit to GetValueConst() that we have on nsIRDFLiteral. That will prevent all the allocation, copying, and freeing that goes on within GetSpec(), GetPath() and the callers of GetSpec(). following darin's plans, GetSpecConst() and GetSpec() would no longer call GetPath(), as we'd have the address of the entire spec to return. adding GetPathConst() might also be a win; we should look into the callers of GetPath()
bbaetz: i believe that there may be external implementations of nsIURI/nsIURL.
GetValueConst will vanish once we get COW strings, I believe (although I'm not certain). And you're still going to have to change all the callers... If it's a wstring, then you're going to have to convert it anyway, so you'll lose any perf gain. I presume that's just a typo, though. If you don't want to change the callers, what about adding a: bool specIs(in OurNsACStringType foo); method? Again, this assumes that we have such a type.
darin: Its not a frozen interface. And updating the implementation isn't as much work as updating all the callers. It is a reasonably large perf win, so it needs to be considered, I think.
Copy-on-write strings will only benefit this case if the interface is converted to use |AString| or |ACString| (if/when we add this type to xpidl). |getSpecConst| (or how about |getSpecShared|?) is a solution I can live with.
bbaetz: yeah, seth meant 'string' not 'wstring'. i'm not saying that we shouldn't consider adding GetSpecConst and GetPathConst to nsIURI... we just need to keep in mind that changing the interface could mean surprises for some people. moreover, we must not require that all URI implementations support GetSpecConst/GetPathConst, since these methods require the implementation to hold onto a null terminated spec/path... it just so happens that StdURL could do this if we changed its internal representation as i have suggested. also, isSpec and isPath would not satisfy the problem here. consider imgCache, which calls GetSpec in order to pass the spec as a cache key. if we replaced that call site with GetSpecConst, we'd see a noticeable perf improvement.
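For illustration, a hedged sketch of what such a call site change would look like (the cache's Lookup method is hypothetical; a [shared] out string maps to a borrowed const char** in C++):

    // today: GetSpec hands back an allocated copy the caller must free
    nsXPIDLCString spec;
    uri->GetSpec(getter_Copies(spec));   // malloc + copy inside GetSpec
    cache->Lookup(spec.get());           // freed when spec goes out of scope

    // with GetSpecConst: borrow a pointer into the URI's own buffer
    const char *sharedSpec;
    uri->GetSpecConst(&sharedSpec);      // no allocation, no copy
    cache->Lookup(sharedSpec);           // valid only while the URI lives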
|GetSpecShared| sounds better to me too.
darin: Hmm. You mean that you want an nsIURIConstGetter interface? You'd lose almost as much from the QI as you would from saving a copy, wouldn't you? If we're going to go for a [noscript] class, then the method could take an nsSomethingCString& and do string-fu to either hold an owning or a non-owning reference. xpidlcstrings used to be able to do this, but it looks like getter_shares vanished some time ago. jag: Is there a replacement?
bbaetz: Nope, no replacement for getter_Shares.
note: nsIURI is not frozen. It is under review (http://lxr.mozilla.org/seamonkey/source/netwerk/base/public/nsIURI.idl#70 ). It is definitely an interface that people are using, so mods to it will cause bustage, but non frozen interfaces are used at the user's own risk until frozen. GetSpec() may be 2.6% of main1(), but what does that amount to in "time"? Are we talking microseconds here? If so, we need to consider whether or not the bustage is worth the optimization.
2.6% of 20 sec = 0.52 sec. Either way is acceptable to me. Keeping the old way working means code bloat and maintenance issues, so changing the interface seems the cheaper option.
no matter what, there is a lot of work that can be done inside of nsStdURL and ns*URLParser to improve performance. so, i'm not convinced that we absolutely need to change the interfaces to see a significant perf win... it is however an option that would improve things further... how much further remains to be seen. so, my current plan is to work with the existing api.. get a better nsStdURL impl, and then rerun the perf tests. then we'll know for sure if we want to change the interfaces.
just for reference: GetSpec is called at least once for each and every chrome element loaded, since the spec is used by imagelib as the cache key. opening a new window, opening the preferences panel... opening anything having to do with chrome (and of course loading web pages), will show GetSpec taking a significant slice of time. with a nsStdURL impl that stores the spec as a single string, and with a GetSpecShared method, we could nearly eliminate this cost.
Blocks: 105040
with this patch, we pass all the tests in urlparse.dat as well as the builtin Resolve tests, and i can start the browser and surf the web. things that still need to be done:
1- implement nsIURI, nsIURL setters (SetSpec already done)
2- implement nsISerializable methods
3- cleanup xpcom/io/nsEscape
4- rename interfaces, classes, etc -- replacing nsStdURL and the URL parsers
5- test, test, test ;-)
some preliminary testing shows 5% improvement with a debug linux build. however, my optimized win32 build isn't showing much of an improvement :-( GetSpec[Const/Shared] would definitely eliminate the 2.6% reported by dp. oh yeah, and one more thing that still needs to be done... put back the code for caching mFile, and make sure it actually works! (unlike the nsStdURL impl)
This mFile stuff ... can't say I like it at all. Why not use this opportunity to get rid of it? These GetFile/SetFile methods are highly platform dependent; for every new platform we need to change nsStdURL. I would very much like to have Get/SetURL on the different nsIFile implementations. That makes much more sense to me.
The patch seems to be missing the implementations of nsIURLParser2. Also, I don't like the nsStandardURL::Resolve method. It seems wrong to me to use the noAuthURLParser to find out if there is a scheme or not. ExtractURLScheme should be used instead; a full parse seems too much for that.
just this morning, i thought of exactly the same thing... so roughly we'd have:

nsStandardURL::SetFile(nsIFile *file)
{
    nsXPIDLCString url;
    file->GetURL(getter_Copies(url));
    if (url)
        SetSpec(url);
}

and

nsStandardURL::GetFile(nsIFile **result)
{
    nsCOMPtr<nsIFile> file = do_CreateInstance(...);
    file->SetURL(mSpec);
    NS_ADDREF(*result = file);
}

and then all of the platform specific code would be moved into nsLocalFileXXX
whoops.. you're right.. i forgot to add nsURLParser2.{h,cpp} ... i'll add them along with a revised patch tonight.
Attachment #54320 - Attachment is obsolete: true
Attached patch v0.8 - nearing completion (obsolete) (deleted) — Splinter Review
Attachment #54602 - Attachment is obsolete: true
Attachment #54786 - Attachment is obsolete: true
With this patch i'm seeing anywhere from 0-4% (averaging about 2%) at startup on a PIII 850MHz w/ 256Mb RAM running Win2k. Drilling down to the low level details, here are some of the numbers for the actual nsIURI methods of interest:

method    current impl   new impl
------    ------------   --------
SetSpec           1821       1595
GetSpec           2151        150
Resolve           1537        334
SetPath            938        181
GetPath           1342        164

These numbers are times in milliseconds and correspond to the amount of time required to run the specified method 100,000 times on an average URL with average arguments to each method. So, the numbers aren't exact by any measure, but they do give some indication of improvement across the various key methods. I have two implementations of SetSpec: one using the string code and another simply using malloc and memcpy. The latter is roughly 20% faster. I've asked some of the string folks to take a look to see if this isn't just due to some mistake in my usage of the string facilities. At any rate, this implementation seems to give the level of improvement I was hoping for, and it still passes all of the URL tests. So, now it's just a matter of testing it out to make sure that it isn't missing something major -- andreas?
*** Bug 105040 has been marked as a duplicate of this bug. ***
Attached patch v1.0 final draft, ready for reviews (obsolete) (deleted) — Splinter Review
Attachment #55176 - Attachment is obsolete: true
Comment on attachment 55329 [details] [diff] [review]
v1.0 final draft, ready for reviews

@@ -70,6 +67,7 @@
 		nsStreamProviderProxy.cpp \
 		nsURIChecker.cpp \
 		nsURLHelper.cpp \
+		nsURLParsers.cpp \
 		$(NULL)

Should use tabs to indent to align (copied from elsewhere, I know, but...).

     if (urlPart)
-        *urlPart = nsCRT::strdup(scheme.get());
+   *urlPart = nsCRT::strdup(scheme);
     return NS_OK;

Looks like a 3-space indent crept in on the + line, somehow.

+        case url_Port:
+            if (port != -1) {
+                nsCAutoString buf;
+                buf.AppendInt(port);
+                ExtractUrlPart_Helper(buf, 0, buf.Length(), startPos, endPos, urlPart);
+            }
+            else
+                ExtractUrlPart_Helper(nsnull, 0, -1, startPos, endPos, urlPart);
+            break;

These calls seem silly -- why not set *urlPart = ToNewCString(buf) or whatever the latest incantation is? PL_strndup is not *necessarily* compatible with nsMemory, btw.

+void
+nsStandardURL::CoalesePath(char *path)

"Coalesce" is misspelled.

+    nsCRT::memcpy(buf + i, escapedStr->get(), seg.mLen);

What's wrong with memcpy? gcc among others will optimize it, many libcs optimize it too.

Sorry for the nits, more later -- our plain text editor bugs are killing me in this tiny textarea (grrrrrrr!).

/be
Attachment #55329 - Flags: needs-work+
I still don't like this part of nsStandardUrl::Resolve:

+    // NOTE: there is no need for this function to produce normalized
+    // output.  normalization will occur when the result is used to
+    // initialize a nsStandardURL object.
+
+    if (mScheme.mLen < 0) {
+        NS_ERROR("unable to Resolve URL: this URL not initialized");
+        return NS_ERROR_NOT_INITIALIZED;
+    }
+
+    nsresult rv;
+    URLSegment scheme;
+    URLSegment path;
+
+    // relative urls should never contain a path, so we always want to use
+    // the noauth url parser.

Why path? Do you mean host? path or host, I don't think this assumption is valid.

+    rv = gNoAuthParser->ParseURL(relpath, -1,
+                                 &scheme.mPos, &scheme.mLen,
+                                 nsnull, nsnull,
+                                 &path.mPos, &path.mLen);
+    if (NS_FAILED(rv)) return rv;

Why use the whole parser when you just want to know if there is a scheme? We have ExtractURLScheme for that, which does a much better job. And if you must use the full parser, please don't use the noAuthParser; you can't make any assumptions about the url given to Resolve.
brendan: thanks for the review comments. andreas: yeah, i meant "host" not "path." thanks for catching that. i'm calling the parser instead of ExtractURLScheme because i need to know both the scheme as well as the path. if there is something missing from the scheme parsing logic in nsBaseURLParser then that should be fixed. i don't see the point of having ExtractURLScheme.
andreas: let me answer your question a bit more directly: i want to optimize for the case when the relativeURL is actually relative. normally we wouldn't expect there to be a scheme, but there would always be a path. so, i'm calling the parser to extract these portions, and it isn't parsing the whole URL.. it is only looking for the scheme, auth, and path sections. i use the no-auth parser, since i know that relative URLs should not have an authority section. isn't this the case? nsStandardURL::Resolve passes all of the relative url tests... is there some case that i'm not considering?
darin: I haven't found the usage of the path inside Resolve on first scanning, are you sure? ExtractURLScheme looks a bit at the characters inside the scheme and returns false if there is an invalid character. Maybe that can be moved into the urlparser too. My guess is that 90% of all calls to Resolve on startup through MakeAbsolute are done with already absolute URLs; we should optimize for that. And since most of the time we get an absolute URL there (with every thinkable scheme!), it is dangerous in my opinion to scan with the noAuthParser.
andreas: you're right... |path| isn't even being used! but it was meant to be, and that is an error. it is needed in case there is some extra junk at the end of relpath that the parser wants to trim off (such as a trailing \n). so references to relpath should really be references to something akin to Substring(relpath, path.mPos, path.mLen). regarding the appropriateness of the no-auth url parser... i don't see why it would matter which url parser is invoked. all of the basic url parsers make the same assumptions when parsing the scheme... in fact they are implemented by the same piece of code (see nsBaseURLParser::ParseURL). the distinction between a no-auth url parser and an auth url parser lies in what follows the scheme, right? so there is no reason to really care which parser is invoked, except in this case, we know that we also want the path section of relpath, so as i said, i intended to use the |path| result to extract the url path from relpath.
actually, the old nsStdURL::Resolve didn't do any cleansing of trailing whitespace or control characters, so as a first pass i'm just going to eliminate the |path| local variable, and i'll make ParseURL return early if both |authLen| and |pathLen| are null (ie. if the caller isn't interested in the authority section or the path section, then don't bother parsing them). i'll also move the extra scheme validation checks from ExtractURLScheme into ParseURL. i'll make ParseURL return NS_ERROR_MALFORMED_URI if it detects invalid chars in the scheme.
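A hedged sketch of those two changes (placement inside nsBaseURLParser::ParseURL is assumed; |spec| and |colon| are hypothetical locals, with |colon| pointing at the ':' terminating the scheme):

    // scheme validation moved here from ExtractURLScheme: per RFC 2396,
    // scheme = alpha *( alpha | digit | "+" | "-" | "." )
    for (const char *p = spec; p < colon; ++p) {
        if (!nsCRT::IsAsciiAlpha(*p) && !nsCRT::IsAsciiDigit(*p) &&
            *p != '+' && *p != '-' && *p != '.')
            return NS_ERROR_MALFORMED_URI;  // invalid char in scheme
    }

    // early return: if the caller passed null for both out-params,
    // don't bother parsing the authority and path sections
    if (!authLen && !pathLen)
        return NS_OK;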
Yes, nsStdURL::Resolve's goal was to create a good looking url string with minimal effort (without parsing too much). Everything else was left to the SetSpec call that follows most of the time.
Attachment #55329 - Attachment is obsolete: true
Comment on attachment 55674 [details] [diff] [review]
v1.1 revised per comments from brendan and andreas

accidentally left some debugging crap in this patch... ugh!
Attachment #55674 - Attachment is obsolete: true
Attachment #55674 - Flags: needs-work+
so, i tested this patch on Cathleen's P200, and got the following startup times (in seconds):

[new code]  [old code]
   18.62       18.95
   19.11       19.01
   18.67       19.01
   18.57       18.95

removing the worst time from each and averaging, this comes out to 18.62 vs. 18.97, so the new code amounts to approximately 1.9% savings at startup. the next thing to test is how this patch affects page load times.
Okay, I applied the patch on a linux build, it seems to work. A few comments:

1) There is an indentation problem in mozilla/netwerk/base/src/Makefile.in on the entry for nsStandardURL.cpp. Please use 2 tabs instead of 8 spaces. Emacs marks this line suspicious with good reason.

2) I see a new assertion:

###!!! ASSERTION: NS_ENSURE_TRUE(unknownOther) failed: 'unknownOther', file nsStandardURL.cpp, line 1039
###!!! Break: at file nsStandardURL.cpp, line 1039

Someone tries to Equal with a null pointer. Not a problem of the new code, but it points to a problem somewhere else.

3) I checked with urltest and see some changes. Doing urltest -file res/urlparse.dat I see two more failures than usual:

Testing: http://username:password@hostname.com:80/pathname/./more/stuff/../path
Result:   http,username,password,hostname.com,80,/pathname/more/,path,,,,,http://username:password@hostname.com:80/pathname/more/path
Expected: http,username,password,hostname.com,80,/pathname/more/,path,,,,,http://username:password@hostname.com/pathname/more/path
FAILED

Testing: http://username:password@hostname:80/pathname
Result:   http,username,password,hostname,80,/,pathname,,,,,http://username:password@hostname:80/pathname
Expected: http,username,password,hostname,80,/,pathname,,,,,http://username:password@hostname/pathname
FAILED

We no longer suppress the port when it is given but is the default port.

Doing urltest -file res/urlparse_unx.dat I see 2 failures:

Testing: file://home
Result:   file,,,,-1,/,home,,,,,file:///home
Expected: file,,,home,-1,/,,,,,,file://home/
FAILED

Testing: file:home
Result:   file,,,,-1,/,@(,k+@,,,,file:///@(%C9%05%08%03%00%00%00D%EB%FF%B... ...(very, very long line)...00%00%00file%3Ahome
Expected: file,,,,-1,/,home,,,,,file:///home
FAILED

The first problem points to a loss in functionality in the noAuthParser. Although there are usually no hosts in file urls, the syntax allows it and it is sometimes used, so there are some special cases where there are "authorities" in file urls. I started the same way as this patch does, but had a bunch of bugs filed against it. All those bugs will have to be reopened ... The second problem is either in urltest.cpp or in the parser, haven't figured that out yet. Something points to nowhere ...
It's a parser problem. When searching for the rightmost / you don't take into account that there might be none. I changed ParseFilePath to get it working:

    // search backwards for filename
    //                      ========    <= better: directory delimiter
    for (p = end - 1; *p != '/' && (p - filepath) > 0 ; --p)
    //                             =====================

The default port stuff is also an easy fix; just use the lines from nsStdURL::GetSpec.
The new assertion is caused when this url chrome://navigator/content/navigator.xul is compared with a null pointer.
Attachment #55677 - Flags: needs-work+
I just found that you removed nsStdUnescape from nsEscape.cpp and replaced it with the old unescape stuff. That's bad, because the old stuff contains at least one nasty error when dealing with a % followed by chars that are not from the hex range. You should resurrect it ...
andreas: thanks for all the feedback! the port problem looks like a regression... i must have regressed that when i was trying to optimize SetSpec. thanks for pointing out the other issues... i hate the fact that file://home shouldn't be interpreted with home as a hostname. that just seems wrong. do you have any idea which bugs would regress if we didn't "fix" this?
I have a "patched" patch in my local tree that now performs well with the exception of the file://home stuff. Instead of resurrecting the contents of StdUnescape I fixed the old stuff. file://home In the old nsNoAuthURLParser there was some special stuff to deal with this case. That is now gone, take a look. On first search I could find bug 32997.
I just found a big regression in mail/news. I can't read any mail stored in folders, and that includes the inbox. I get a message that the file could not be found. Darin, could you please check if this happens in your tree too.
andreas: i have patches in my tree to handle everything but file://home and the unescape stuff... could you please submit your revised patch for nsEscape.{h,cpp}? thanks!
i'm also experiencing problems with imap... nntp works fine though. investigating...
i just played around with standard url using xpcshell, and it looks like nsStandardURL::GetPreHost is busted. it just returns empty on a URL like: imap://darin@nsmail-2.mcom.com/ however url.username comes out correctly.
Attached patch patch to nsEscape.cpp, fixing nsUnescape (obsolete) (deleted) — Splinter Review
With pop3 mail it seems we are trying to read a file including the query part. The mailbox url is simply turned into a file url.
nevermind... turns out that i was typing "url.prehost" instead of "url.preHost"
nevermind ... I see the same file/mailbox urls in an unpatched build. Must be something else ... Seth, any ideas where to look?
as far as file://home is concerned, we are currently kind of broken anyway. nsLocalFile::SetURL looks for a host and simply prepends it to the path; it does not handle the host as a UNC hostname. so in some sense bug 32997 still exists today. looks like nsStdURL::GetFile simply ignores the hostname section, which makes file:/// == file://localhost/ == file://xyz/ ... how confusing is that?!? to do the same we could simply make the file protocol use the auth parser, but it hardly seems right. perhaps file://localhost/ should map to file:////localhost/, but then we'd need to add logic elsewhere to ignore the UNC-hostname part of the path.
ok, i discovered the problem with Imap... it was actually caused by a miscalculation of the path/filePath length in BuildNormalizedSpec. the fix is trivial and now Imap is working again.
Attached patch v1.3 (obsolete) (deleted) — Splinter Review
Attachment #55677 - Attachment is obsolete: true
this latest patch handles everything andreas raised except for the file://home issue. i'm still thinking about other ways of dealing with this problem.
(p - filepath) > 0 => p > filepath ? benc said elsewhere that file://foo/etc/bar should only work from computers that are AKA foo. e.g. file://www.netscape.com/etc/passwd should only work from computers that think/know they are www.netscape.com, and not my computer, since mine knows it isn't. a reminder that microsoft has a preferred interpretation for e.g. file:foo/bar.txt, which says it's a relative path. so if you encounter it in c:\windows\desktop\something.url, microsoft will go to c:\windows\desktop\foo\bar.txt. speaking of nonstandard behaviors, try this in lynx: http://www.mozilla.org:http/ having just read the spec today, i can tell you that this isn't valid per at least one iteration of the spec. of course, implementing that could make urls really interesting: http://user:password@host:protoportname/path@idontcare
p - filepath > 0 => p - filepath + filepath > 0 + filepath => p > filepath

ok, so file: URLs aren't exactly no-auth URLs. there is an authority section. but we can't just use the auth parser for file: URLs either, because it makes different assumptions about file:/foo/bar/ and file:foo/bar. what we need is an URL parser that makes the following normalization:

  file:foo/bar    => file:///foo/bar
  file:/foo/bar   => file:///foo/bar
  file://foo/bar  => file://foo/bar
  file:///foo/bar => file:///foo/bar

so, for all but the 3rd, the authority section is empty. this to me sounds like another specialization of nsBaseURLParser::ParseAfterScheme. perhaps we need a nsIStandardURL::InitWithParser method, so we can implement a custom URL parser for file: URLs in netwerk/protocol/file/src. or, we could just add another standard URL parser for file: URLs.
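A minimal sketch of that slash-counting rule (hypothetical helper; the real work would live in a ParseAfterScheme specialization as described above):

    // normalize the part after "file:" per the table above
    static void NormalizeFileURL(const char *afterScheme, nsCString &result)
    {
        int slashes = 0;
        while (slashes < 3 && afterScheme[slashes] == '/')
            ++slashes;
        result.Assign("file:");
        if (slashes == 2) {
            result.Append(afterScheme);            // file://host/path: keep authority
        } else {
            result.Append("///");                  // 0, 1, or 3 slashes:
            result.Append(afterScheme + slashes);  // empty authority
        }
    }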
NoAuthUrlparser was a bad name choice, but at that point in time I really thought there was no authority in file urls. I should have called this thing LocalUrlParser, which would have fit much better. I don't think we need a new specialisation; all noAuth urls are most likely really local urls. If you want to make that distinction, just add the new parser, put it into the global objects part and change the New_URI part in the file protocol handler to use that parser by default. No need for any new methods.
but file: urls are not only able to reference local resources. isn't that the point of the authority section: that it can reference any host? of course, we only support local file urls, but that doesn't restrict the meaning of the host section for a file url. i think that no-auth still makes sense. auth always tries to err on the side of ensuring that there is an authority section; noauth always treats what may look like an authority section as part of the path. std is in between. and file needs something else that is also in between auth and noauth, but not equal to std. so i really think that file needs a separate url parser. i'm just not sure what it should be called... perhaps file?? local just seems incorrect.
Hmmm ... we could call it LocAuth, for LocalAuth. If we do this and use it for file, NoAuth becomes unused at the moment, as far as I know.
when i started working on this bug i promised myself that the first revision wouldn't involve any api changes... now it seems we're talking about api changes, so i think that i should forget renaming/replacing no-auth for now, and just concentrate on getting the performance improvements in. there can be another bug later on about cleaning up the api.
But then let the current noAuth have the original functionality of the old NoAuth, put in the file:// handling and the drive detection code for XP_PC.
right.. v1.3 has the drive detection code (though it's not quite in the right place). i'll upload new nsURLParsers.{h,cpp} as well as a new complete patch.
Attached file url.tgz containing new nsURLParsers.{h,cpp} (obsolete) (deleted) —
Attached patch v1.4 (obsolete) (deleted) — Splinter Review
v1.3 including url.tgz = v1.4 (?) looks good; it passes all the urlparser tests it should. But I still see the problem in mailnews: I can't access any mail stored in mail folders of pop3 mail accounts or the Local Folders.
andreas: v1.4's nsStandardURL.cpp fixes the mailnews problem. please give it a try. thx!
actually, no.. v1.3 nsStandardURL.cpp and v1.4 nsStandardURL.cpp are identical. andreas: can you enable NSPR_LOG_MODULES=nsStandardURL:5, try loading your pop3 folders, and submit the log? this might reveal the problem. thx!
hmm.. there's nothing odd about those URLs.. i also can't detect any difference in the parse results. i'm going to try setting up a pop3 account myself to see i can reproduce the problem.
i set up a pop3 test account and am likewise noticing a problem... but the problem seems a bit different from what you describe. in my case i get an alert dialog that says it can't find a file of the form: /C|/Documents%20and%20Settings/darinf/Application%20Data/Mozilla/Profiles/test/ ... the rest of the text exceeds the width of the dialog. anyways, it looks like someone is getting the raw escaped path when they shouldn't. i suspect it may be a problem related to my changes to nsStandardURL::[GS]etFile().
I think it's the same thing as what I see; maybe you should try putting in the old Get/SetFile stuff for now. Get/SetFile is a really good candidate; it makes sense, since the url itself looks fine. I'm signing off for today, it's close to 2AM over here and I'm too tired to get anything useful done.
Okay, the mailnews problem is that we get the query part of the url as part of the FileBaseName inside SetURL. Looking more into it.
Attached patch patch to fix my problem with the mail folders (obsolete) (deleted) — Splinter Review
andreas: thanks for figuring out the mailbox bug... i had mistakenly thought that i could get away with not calling ParsePath.
What are the RFCs after 1738 I should be reading? I need to catch up on some of the terms used here. re: file://string, that is probably illegal, or, if you are being nice, it would be mapped to file://string/, which would presumably be the top level of the filesystem if the browser is running on a host named "string". Our current file URL implementation always ignores the hostname, so if I'm reading your comments correctly, things might be working fine.
darin, I'll sr= as soon as you're happy with testing results and have an r=. Cc'ing myself so I can see the bugmail. /be
darin: Are your problems with the pop3 account on windows gone with the change to nsLocalFileCommon.cpp?
andreas: yes, your change to nsLocalFileCommon.cpp fixed the problem i was seeing with POP3. thanks again! uploading new "final" patch shortly.
Attached patch v1.5 (obsolete) (deleted) — Splinter Review
Attachment #56206 - Attachment is obsolete: true
Attachment #56295 - Attachment is obsolete: true
darin: which files changed between v1.4 and v1.5? Only nsEscape.cpp and nsLocalFileCommon.cpp? What about the unimplemented setters in nsStandardUrl.cpp? Will you do them?
andreas: yes, i changed those two files according to your latest diffs and then added a bit of documentation to nsIURL. otherwise, it is equivalent to the v1.4 patch. the setters that are unimplemented in nsStandardURL were also unimplemented in nsStdURL, so there should be no loss of functionality.
Comment on attachment 56600 [details] [diff] [review]
v1.5

r=andreas.otte@debitel.net (tested on linux, please get some more r=)
Attachment #56600 - Flags: review+
Comment on attachment 56615 [details] [diff] [review]
patch to nsXULDocument.cpp to rev the fastload version number

got a rs=brendan on this change
Attachment #56615 - Flags: superreview+
Comment on attachment 56600 [details] [diff] [review]
v1.5

Underindented by 1, or a (somehow hidden from me) tab?

     if (urlPart)
-        *urlPart = nsCRT::strdup(scheme.get());
+       *urlPart = nsCRT::strdup(scheme);

Why not combine cases below, one Extract... call under the 3 case labels?

+    else {
+        switch (flag) {
+        case url_Param:
+            ExtractUrlPart_Helper(urlString, paramPos, paramLen, startPos, endPos, urlPart);
+            break;
+        case url_Query:
+            ExtractUrlPart_Helper(urlString, paramPos, paramLen, startPos, endPos, urlPart);
+            break;
+        case url_Ref:
+            ExtractUrlPart_Helper(urlString, paramPos, paramLen, startPos, endPos, urlPart);
+            break;

This char-at-a-time (after the first \r, \n, or \t) Append loop can hurt:

+    PRBool writing = PR_FALSE;
+    for (const char *p = str; len && *p; ++p, --len) {
+        if (PL_strchr("\r\n\t", *p)) {
+            writing = PR_TRUE;
+            result.Assign(str, p - str);
+        }
+        else if (writing)
+            result.Append(*p);
+    }
+    return writing;

How about this instead?

+    PRBool writing = PR_FALSE;
+    for (const char *p = str, *q = p + len; p < q; str = ++p) {
+        while (*p && *p != '\r' && *p != '\n' && *p != '\t')
+            ++p;
+        if (!writing) {
+            if (*p == '\0')
+                break;
+            writing = PR_TRUE;
+        }
+        result.Assign(str, p - str);
+    }
+    return writing;

Why not nsCAutoStrings here and below?

+    // buffers for holding escaped url segments (these will remain empty unless
+    // escaping is required).
+    nsCString escapedScheme;
+    nsCString escapedUsername;
+    nsCString escapedPassword;

Maybe jag knows of a way here:

+    mSpec = buf; // XXX too bad there isn't some way of avoiding this strdup!
+    return NS_OK;

Sorry, tired and I don't trust the stinking plaintext (textarea) editor. More tomorrow.

/be
Attachment #56600 - Flags: superreview+
I guess those could be nsCAutoString, stack allocation is pretty cheap and it'll save us an alloc in most cases when escaping is needed, since I suspect most parts to be less than 64 characters. I've looked at the |mSpec = buf;| before. Basically what this needs is |Adopt| on nsSharableString. dbaron has a patch for this in bug 104663. That doesn't quite help nsCString right now, so maybe we should hack this method on nsCString for the time being.
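For reference, a hedged sketch of what an |Adopt| would buy at that spot (the nsCString signature is assumed, per the nsSharableString patch jag mentions):

    // instead of:  mSpec = buf; free(buf);   // copy + free
    mSpec.Adopt(buf);  // take ownership of the malloc'd buffer, no copy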
i went with nsCString thinking that i was optimizing for the case when escaping wasn't necessary, but i guess at 64 bytes a piece nsCAutoString's wouldn't be too costly. thanks for all the good review comments... i'll put out a new patch right away.
Attached patch v1.5 (obsolete) (deleted) — Splinter Review
the latest patch fixes the problems pointed out by brendan. i went with the following algorithm for FilterString:

PRBool nsStandardURL::FilterString(const char *str, nsCString &result)
{
    PRBool writing = PR_FALSE;
    result.Truncate();
    for (const char *p = str; *p; ++p) {
        if (*p == '\t' || *p == '\r' || *p == '\n') {
            writing = PR_TRUE;
            // append chars up to but not including *p
            result.Append(str, p - str);
            str = p + 1;
        }
    }
    if (writing)
        result.Append(str, p - str);
    return writing;
}

tested this out using urltest.exe and it seems to work correctly.
I looked over the v1.5 patch and spoke with Darin about any questions I had. I didn't look at the code in context, but what I did see looks okay. r=gordon
bug 108734 blocks this patch... i think i can work around that bug for now.
nsCString::Replace can also cause a hang... see bug 108738. so, i definitely should not be using it.
i talked with jag... he's going to try to fix these bugs, but we're not going to wait on those fixes before landing this patch. so, i'm going to replace all instances of string.Replace with an inline static helper function that instead calls Cut followed by Insert. none of the Replace calls are on the critical path, so i don't think we need to worry too much about the perf hit of not using Replace. later, when the string bugs are fixed, we can easily revert to the current patch.
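A minimal sketch of such a helper (the name ReplaceSegment is hypothetical):

    // work around the nsCString::Replace hang (bug 108738) by
    // splicing with Cut + Insert instead
    static inline void
    ReplaceSegment(nsCString &str, PRUint32 pos, PRUint32 len, const char *newVal)
    {
        str.Cut(pos, len);        // drop the old segment
        str.Insert(newVal, pos);  // insert the replacement at the same offset
    }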
i discovered a crash (unrelated to the string.Replace problem) while running the page loader test. the stack trace on my MOZ_PROFILE win32 build is all screwed up, but it looks like someone is calling nsIURI::Resolve with a junk |this| pointer. time to break out the debug build... this patch unfortunately might not make it for 0.9.6 :(
For the record, I've pointed out to darin that the |for| loop he's using won't build everywhere since the |p| is declared within the scope of the |for| loop and won't be accessible below it (though some compilers get the scoping wrong, which is why it compiled for him).
Attached patch v1.6 (obsolete) (deleted) — Splinter Review
Attachment #56600 - Attachment is obsolete: true
Attachment #56743 - Attachment is obsolete: true
Attached file diff -u v1.5 v1.6 # slightly annotated (obsolete) (deleted) —
patch v1.6 accidentally includes changes to netwerk/test/makefile.win to build foo.exe -- this'll be removed before committing the patch. andreas, brendan, gordon: can i get updated r='s and sr='s from you guys on the latest patch? thx! -> 0.9.7 (since there's no reason to risk destabilizing 0.9.6 for this patch)
Target Milestone: mozilla0.9.6 → mozilla0.9.7
Blocks: 109179
darin: could you include a complete final patch? v1.6 is only some stuff about nsAString.
Attached patch v1.6 (for real this time) (obsolete) (deleted) — Splinter Review
Attachment #56133 - Attachment is obsolete: true
Attachment #56286 - Attachment is obsolete: true
Attachment #56411 - Attachment is obsolete: true
Attachment #56413 - Attachment is obsolete: true
Attachment #56615 - Attachment is obsolete: true
Attachment #56864 - Attachment is obsolete: true
Blocks: 109255
Attachment #57254 - Flags: review+
what I want is an URL parser that makes the following mangling:

  file:foo/bar => foo/bar

(./foo/bar if you prefer, it doesn't matter to me, as long as we can provide a compatible relative path form for ms .url files; "bare/word" alone doesn't work in them, so i think we pretty much have to support "foo:notreallybare/word" => "notreallybare/word") I know that no one likes it, but ms uses it. how much harm would come about from us doing that?
A canonical URL? Let's make a new bug for this...
I agree, that should be a new bug, but I wouldn't give it a good chance, because those kinds of urls violate rfc2396 when interpreted as relative urls. For these .url files, what about having a base url like file:///wherever and a relative url like bare/word?
Blocks: 100212
So I'm willing to sr= this because I want to see it land...

* NS_EscapeURL() should use nsACString, not nsCString, as its result.
* shaver sez that true out parameters are not "optional" - meaning you can't pass in nsnull and expect it to work.. what you can do instead is just pass in one dummy PRInt32 to each of these.
* it's a little late now, but your URLSegment is essentially the same as an nsDependentSingleFragmentSubstring, which means you could be passing them around as nsAString-level classes instead of constantly pulling out mPos and mLen.
* doesn't look like you're freeing "buf" in BuildNormalizedSpec.
* there may be some cool way of avoiding that extra malloc() in BuildNormalizedSpec() with CBufDescriptor.. there may be some way to pass off the malloced buffer to the nsCString... but I'm not an expert with that. the code itself might be cleaner if you make buf into an nsCString (or maybe nsCAutoString) and do a SetCapacity(approxLen+32) - that will also make sure you never overflow that buffer, because nsCString ought to do it for you..
* a comment just to say that the URLSegments in nsStandardURL are relative to mSpec might make it a little clearer what they are for :)
Comment on attachment 57254 [details] [diff] [review] v1.6 (for real this time) walked through the v1.6 changes with brendan today and got a verbal sr=brendan
Attachment #57254 - Flags: superreview+
alec: thanks for the comments... didn't see them before sitting down with brendan.

1- i'll make the change to NS_EscapeURL.
2- i disagree, what's the point?
3- nsDependentSingleFragmentCSubstring wouldn't work because i need to be able to change the underlying buffer out from under the offsets/lengths. in other words, the URLSegments cannot be replaced with |char *start|, |char *end|.
4- yikes! ...you're right. bad me!
5- originally i used SetCapacity and Append to build up the normalized spec, and i found it to be really really slow... 8-20% slower than the current method. i spent some time with jag trying to understand where the cost is, but we weren't able to get a good understanding of it... seems like Append is much much more expensive than it should be. and since the cost of the additional malloc is only 1.5% of SetSpec, i figured it would be sufficient until we get support for nsCString::Adopt.
6- yeah, i'll add that.
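A small sketch of the point in (3), with hypothetical member names: segments are (offset, length) pairs, so swapping the spec buffer leaves them valid, whereas dependent substrings would keep pointing into the freed buffer.

    // rebuild the spec in a new buffer; mScheme/mHost/mPath (URLSegments)
    // were computed against the normalized layout, so they still index
    // correctly into the new buffer -- no fixups needed
    char *newBuf = BuildNormalizedSpec(spec);  // assumed to return malloc'd text
    free(mSpecBuf);
    mSpecBuf = newBuf;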
I disagree with shaver as well, thought I'd bring it up, see what other folks say :) Anyway, all sounds good (esp. good to hear about the perf analysis)
1- made NS_EscapeURL take a nsACString
2- added free(buf) after mSpec = buf;
3- switched to use malloc instead of PR_Malloc per discussion with brendan
4- added comment above URLSegment member variables: "relative to mSpec"

assuming r=andreas, sr=brendan,alecf
Attachment #56866 - Attachment is obsolete: true
Attachment #57254 - Attachment is obsolete: true
Attachment #57990 - Flags: superreview+
Attachment #57990 - Flags: review+
landed-on-trunk
marking FIXED.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED