Closed Bug 31770 Opened 25 years ago Closed 23 years ago

Find on Page [and in view source] extremely slow

Categories

(SeaMonkey :: UI Design, defect, P1)

Tracking

(Not tracked)

VERIFIED FIXED
mozilla0.9.5

People

(Reporter: spam, Assigned: mozeditor)

References

Details

(Keywords: perf, topperf, Whiteboard: [PDT-] ETA: fixed on trunk; waiting for approval for 094. [perf][nav+perf])

Attachments

(19 files)

(deleted), text/html
(deleted), text/html
(deleted), text/plain
(deleted), patch
(deleted), text/html
(deleted), patch
(deleted), patch
(deleted), patch
(deleted), patch
(deleted), patch
(deleted), patch
(deleted), patch
(deleted), patch
(deleted), patch
(deleted), patch
(deleted), patch
(deleted), patch
(deleted), patch (jesup: review+)
(deleted), patch (kinmoz: review+; kinmoz: superreview+)
To avoid filing duplicates I searched Bugzilla for new, reopened, verified, fixed and unverified bugs and got around 3600 hits. (The resulting page loaded in 5 minutes.) Searching that page with "find on page" got slower and slower: finding the next occurrence of a word mentioned twice in one sentence took around 15 seconds once I was halfway down the page, almost as if the whole page were searched from the start each time the "find" button was clicked. In effect it's so slow it's useless; I can read the page myself faster than the search tool can. Build ID: 2000031318
search dialog bug
Assignee: matt → law
reporter - what sort of system (processor, etc.) are you on? When you search, how big is the CPU usage?
Intel P120, 96 MB RAM, RedHat 6 (somewhat upgraded), Gnome/Sawmill. No problems otherwise. I did a comparison, a little manual benchmark. This query was performed first in Netscape 4.7, then in Mozilla build ID 2000031418: http://bugzilla.mozilla.org/buglist.cgi?bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&bug_status=VERIFIED&rep_platform=PC&op_sys=Linux&email1=&emailtype1=substring&emailassigned_to1=1&email2=&emailtype2=substring&emailreporter2=1&bugidtype=include&bug_id=&changedin=&votes=&chfieldfrom=&chfieldto=Now&chfieldvalue=&short_desc=&short_desc_type=substring&long_desc=&long_desc_type=substring&bug_file_loc=&bug_file_loc_type=substring&status_whiteboard=&status_whiteboard_type=substring&keywords=&keywords_type=anywords&field0-0-0=noop&type0-0-0=noop&value0-0-0=&cmdtype=doit&order=Reuse+same+sort+as+last+time (that got long, heh...) Well: Netscape found all 26 occurrences of the word "ftp" (no quotes) in that page in 45 seconds. Mozilla found them in 6 minutes, 35 seconds. The test was performed by clicking "search" again as soon as each word was found. Netscape 4.7 was almost 9 times faster, in other words. What is very striking is that Mozilla gets slower and slower as it searches further down the page. In the end it takes around 15 seconds to find the second occurrence of "ftp" in a sentence looking like this: ftp://ftp.... (etc.) And yes, CPU load was high while Mozilla searched: it uses what it can; figures like 87% and 95% showed up in "top" when I tested it afterwards. (No "top" was running while the comparative search was done, and the apps were running alone, under the same conditions.) The only difference was that this was done manually, and since Netscape found the word "ftp" very quickly at times, I may have been a little slow hitting the find button again. If that made any difference, it's to Mozilla's discredit, unfortunately.
Confirming bug. I tried it on WinNT with a 450MHz P2 and the lag is still noticeable. Very telling is the case where "ftp" occurs again only a few characters away from the last occurrence: 4.x finds it instantaneously, while SeaMonkey takes just as long as for the others. I get the impression SeaMonkey searches the whole page up to the point of the last match each time through, whereas 4.x just picks up searching wherever it left off (at the last occurrence). I'm also changing the component and QA.
Status: UNCONFIRMED → NEW
Component: Search → XPApps
Ever confirmed: true
QA Contact: claudius → sairuh
Summary: Searh/Find on Page extremely slow → Find on Page extremely slow
Hm, I didn't really see much of a lag between hitting the Find button and the match appearing (searched for the string 'ftp' in the link above from dark@c2i.net):
linux [2000.03.15.06-nb1b, mozilla] -- 2 seconds (okay, that's slow)
mac [2000.03.15.05-nb1b, mozilla] -- less than 1 second
winNT [2000.03.15.06-nb1b, commercial] -- less than 1 second
However, I did notice when I first went to that page and initially brought up the Find On Page dialog that it would take 2-6 seconds for the dialog to appear. Did anyone else see that?
Yes :) But that's the least of the problems..
sairuh: A second thought: did you go through the whole procedure, finding *all* (26?) occurrences of the word "ftp" in that page?
This is probably a performance issue with nsITextServices, which kin wrote; cc:ing kin. FYI, spellcheck in editor uses the same service, so it may also show performance problems on long documents.
ah...i think i see what you mean: i only searched a few times, ie, the first 3 or so instances of "ftp". so, i just tried searching through the whole page, and what i notice is that searches for the first half of the page take only ~1sec...but once i'm searching through the latter half of the page, it takes more like 2-3sec btwn each search (longer on linux than on mac or winNT). but not extremely slow from my perspective...
I oughta look you up in the Silmarillion... Regardless of horsepower, the factors here speak for themselves, I think: a 6.5 minute search versus 45 seconds. This can be done much better.
Giving this to kin. The problem has to be in the document text services.
Assignee: law → kin
Keywords: perf
Target Milestone: --- → M16
I've noticed a considerable slowdown of find as well. It would seem to be related to an insane amount of memory usage. After the first find, virtual memory usage jumps up about 60 or 70 MB, and then slowly creeps up with each additional find. I don't think it's leaks, per se, as all of the memory is immediately freed when you close the find dialog. This is with Linux on a Celeron 500, 128MB RAM.
*** Bug 36005 has been marked as a duplicate of this bug. ***
Accepting bug.
Status: NEW → ASSIGNED
Target Milestone: M16 → M17
moving to m17
*** Bug 45782 has been marked as a duplicate of this bug. ***
Keywords: nsbeta3
Component: XP Apps → XP Apps: GUI Features
setting to m18, nsbeta3
Target Milestone: M17 → M18
I did a jprof profile (to be attached) of find-in-page in nsBlockFrame.cpp (for a string that wasn't in the file). The basic problem here is that nsContentIterator::NextNode (for post-order traversal) calls |parent->IndexOf(cN, indx)|, which is O(N) on the length of the array. On a huge pre element, this is really really bad, because when you iterate through the whole list, the iteration is O(N^2) when it should be O(N). nsContentIterator needs to store the index of the current node. If there's a possibility that the list changes during iteration, then it should double check that the index is still valid, and *if not*, do the O(N) search. It's probably worth going through nsContentIterator and seeing if other things need cleaning up too...
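The asymptotic point above can be illustrated with a small stand-alone sketch (hypothetical types; the real nsIContent/nsContentIterator interfaces differ): re-deriving the child index with an O(N) IndexOf on every step makes a flat traversal O(N^2), while carrying the index along and revalidating it in O(1) keeps the walk linear, falling back to the O(N) search only if the cache goes stale.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal stand-in for a content node with children (hypothetical; the real
// nsIContent API differs).
struct Node {
    std::vector<Node*> children;
    Node* parent = nullptr;
};

// O(N) lookup, like nsIContent::IndexOf at the time of this bug.
static std::size_t IndexOf(const Node* parent, const Node* child) {
    for (std::size_t i = 0; i < parent->children.size(); ++i)
        if (parent->children[i] == child) return i;
    return static_cast<std::size_t>(-1);
}

// Naive next-sibling step: re-derives the index each time, so iterating a
// whole flat list is O(N^2) overall.
static Node* NextSiblingSlow(Node* cur) {
    std::size_t i = IndexOf(cur->parent, cur);
    return i + 1 < cur->parent->children.size() ? cur->parent->children[i + 1]
                                                : nullptr;
}

// Cached-index step: keep the index alongside the node and revalidate in
// O(1); only repair with the O(N) search if the cache went stale.
struct CachedPos {
    Node* node;
    std::size_t index;
    Node* NextSibling() {
        Node* p = node->parent;
        // O(1) revalidation: does the cached slot still hold our node?
        if (index >= p->children.size() || p->children[index] != node)
            index = IndexOf(p, node);  // stale cache: rare O(N) repair
        if (index + 1 >= p->children.size()) return nullptr;
        node = p->children[++index];
        return node;
    }
};
```

Walking N siblings with CachedPos costs N constant-time steps, versus N IndexOf scans for the naive version, which is exactly the difference visible in the profile on a huge flat PRE element.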
Note that since that's a realtime profile (rather than time spent by the app), g_main_poll shows up. It should be ignored.
Cc jfrancis@netscape.com since he is the author of the content iterator.
I have no idea how to read that profile. I'll have to reprofile this with a tool whose output I can read. Caching the index will make the iterator faster, but does that explain a 15 second delay between finding two occurrences of a string in the same line? I smell another problem...
To see how to read the profile: http://lxr.mozilla.org/mozilla/source/tools/jprof/README.html What the profile is showing is that 73% of the time in Find in Page is spent in the call to IndexOf() within nsContentIterator::NextNode. That's clearly a problem.
Right. But while IndexOf is slow, that doesn't mean that's the whole problem. The real problem may be that we are iterating unnecessarily, etc. It's hard to see why two occurrences of the string within the same node should need to call IndexOf a bunch of times.
moving to m19
Target Milestone: M18 → M19
For instance, nsTextServicesDocument::GetCurrentTextBlock() is spending all of its time in the iterator. That smells bad. Do you want to look at that, Kin? On another front, I'm wondering if I can get away with the caching David suggests. I can if all the callers are keeping their contract not to use the iterator while changing the content they are iterating. I think right now you are only hosed if you delete content, but with that change I think you might be hosed when adding content as well. Maybe I should just try it and see what breaks. :-P
jfrancis: If you're worried about users changing the content model while iterating, you might want to try one of: a) cache *both* the index and the current node, and when you want the next node, call IndexOf (which is O(1)) on the cached index, and if it's *not* the current node, do the O(N) search b) in DEBUG mode only, cache the current node, and always cache the index, and in DEBUG mode only do the O(1) check and Assert if it fails. This would prevent callers from triggering lots of O(N) searches.
dbaron: I don't understand "call IndexOf (which is O(1)) on the cached index". IndexOf takes nodes, not indexes. And it's never O(1) unless you are asking for the index of the first child. I think you mean something else, but I'm not sure what...
Sorry, I meant "ChildAt" instead of "IndexOf" in part (a) of my last comment.
Keywords: nsbeta3
beppe: Please do not remove nsbeta3 nominations made by others.
Keywords: nsbeta3
well, perf bugs are in m19 because we are addressing them after the correctness and polish bugs, hence I removed the nsbeta3 keyword, since you want the keyword in -- marking nsbeta3- NOTE: will readdress after correctness and polish bugs are addressed
Whiteboard: [nsbeta3-]
This profile still shows that most of the time is spent in the iterator. It looks like we probably are doing unnecessary work on every successive find. In the original profile, of searching a large document for a string it did not contain, most of the time was spent in nsContentIterator::Next, which was called (roughly equally) by:
nsTextServicesDocument::FirstTextNodeInNextBlock(nsIContentIterator *)
nsTextServicesDocument::CreateOffsetTable(nsString *)
In the new profile, finding text later on the same line, about 2/3 of the iteration time is spent in nsContentIterator::Next, all of it within nsTextServicesDocument::CreateOffsetTable(nsString *), and about 1/3 in nsContentIterator::Prev, all of it within nsTextServicesDocument::FirstTextNodeInCurrentBlock(nsIContentIterator *). This makes me think that the usage of the iterator could probably be improved too...
Find is rendered useless the further down the file you go; it basically comes to a halt. For messages or composed pages, if a find is invoked and the search word is repeated through the file, each find makes the next one slower. One test with a page of approximately 700 words, with the word "help" repeated 50 times, resulted in a wait of 15-17 seconds between finds. That is a major problem.
Severity: normal → major
Keywords: rtm
Priority: P3 → P2
Whiteboard: [nsbeta3-] → [nsbeta3-][rtm+]
Find needs to be entirely rearchitected anyway.
Priority: P2 → P1
Whiteboard: [nsbeta3-][rtm+] → [nsbeta3-][rtm+][p:1]
Not setting rtm +/- until i understand the scope of the fix, if a viable 'patch' that is low risk can be written to resolve this, then it will be considered. if a complete rearchitecture needs to be done, then this will need to wait until post rtm.
Whiteboard: [nsbeta3-][rtm+][p:1] → [nsbeta3-][p:1]
just talked with kin, he believes it is a low risk fix for a very visible problem with a highly used feature. Kin, please include the required information per the rtm checkin rules
Whiteboard: [nsbeta3-][p:1] → [nsbeta3-][p:1][rtm+ NEED INFO]
Simon and I did some looking at this. It turns out the find dialog is stateless, so every time you press find, the find code uses the TextServicesDocument::PrevBlock() method to count how many blocks from the beginning of the doc the current block is, just in case it has to wrap around. Going backwards in the document is very expensive because the content iterator can't find the previous content node without calling IndexOf. IndexOf can get really expensive especially in big flat documents. I'll need to provide some method that can give the find code the current block index without having to resort to going backwards through the document.
I believe the search engine never "wraps around", if by that you mean it would at some point start searching from the beginning again; that never happens. To test: search for a common word, through all its occurrences in a page. Then change the word you search for and search again. It will only beep, even if the word is present. When the first search reaches its end, it never leaves the last spot it got to; subsequent searches are only performed in the content *after* it, never wrapping to search from the start of the page again.
It should wrap around, and certainly worked. Did you check the 'wrap' checkbox in the dialog?
Oops..silly me. I assumed it would autowrap. (4.7* almost does that, prompts first though)
removing + per pdt sw rules
Whiteboard: [nsbeta3-][p:1][rtm+ NEED INFO] → [nsbeta3-][p:1][rtm NEED INFO]
Moving milestone to Future for now.
Target Milestone: M19 → Future
removed need info and set to -
Whiteboard: [nsbeta3-][p:1][rtm NEED INFO] → [nsbeta3-][p:1][rtm-]
*** Bug 61965 has been marked as a duplicate of this bug. ***
adding myself to CC, hope I can help out.
Keywords: nsbeta3, rtmnsbeta1
OS: Linux → All
Hardware: PC → All
Whiteboard: [nsbeta3-][p:1][rtm-]
*spam* m0.9
Keywords: mozilla0.9
Target Milestone: Future → mozilla0.9
FYI, when I added the replace feature, I made it so that the find code doesn't iterate backwards to figure out where in the doc it is, if the wrap around checkbox is unchecked. This makes searching much quicker. I will look into fixing the wrap around case.
Goodies. Because a major problem with the search feature is that I have to CLICK in a page before search even knows where to search. If I don't give a web page focus with a click before a search, search will find nothing, regardless of whether wrap is checked or not. The problem with this is that a search after a click in a page will treat that click as an insertion point of sorts, and starts searching FROM there onwards, missing all previous occurrences of a word.
The focus problem you describe is a front end (nsFindComponent) problem, which belongs to law@netscape.com. Perhaps you should file a bug/RFE on that? Note there are some interesting issues related to just doing a find automatically in the browser content area... for example, which frame do you give it to if the content area contains multiple frame sets?
How about "if focus is not indicated, search them all"? And then let focus be indicated by a left-click in a frame, or by which frame the user brings up the context menu in to "search in frame".
This is a performance-oriented bug; please file a separate bug for focus issues.
I'll try to get to this next milestone. (Mozilla0.9.1)
Target Milestone: mozilla0.9 → mozilla0.9.1
Target Milestone: mozilla0.9.1 → mozilla0.9.2
Whiteboard: [perf]
moving to 1.0 for perf work
Target Milestone: mozilla0.9.2 → mozilla1.0
(I thought I mentioned this earlier) As dbaron mentioned, nsContentIterator needs to be rewritten, I suspect. This is really obvious on stuff like lxr, because we have a very wide and flat content tree. I tried a one element cache, but that didn't help, because the links are two levels deep (the <a> + the text node). We need to keep a stack of {offset, ptr} pairs, and revalidate (in O(1), from the nsVoidArray), to deal with dynamic content. Does that sound about right?
We don't have to do anything for dynamic content, because the iterator is documented not to be safe to use in the face of DOM changes while iterating. If we need to speed up the iterator, it can be done similarly to the way the subtree iterator was improved (and similar to the way that copying a range was done in nsDocumentEncoder). But it's the statelessness of Find that is making it slow. It could be quick to find adjacent matches with the current iterator.
*** Bug 84652 has been marked as a duplicate of this bug. ***
Whiteboard: [perf] → [perf][nav+perf]
I made a test... Trying to look for "Rosen" in the LXR view of nsCSSFrameConstructor.cpp, I left the machine running. The string is on the 21st line or so. What a hard thing to find... It took 17 seconds to find it! And I think I know why (suspense :-) The LXR view of the source is mostly made of a ***13476*** line (2001-06-06) PRE element. |nsFindAndReplace::DoFind| calls |nsTextServicesDocument::GetCurrentTextBlock|, itself calling |nsTextServicesDocument::CreateOffsetTable|. And that last method just seems to build a string made of the aggregation of ALL the text nodes in the PRE element!!!! We should only aggregate text nodes until the total length is for the first time >= the length of the string we look for. If the string we look for is not found in that aggregation, remove the first text node from the aggregation and add the next text node in traversal order. Of course, generated nodes are excluded, and the algorithm should reset the aggregation at block-level element and replaced element boundaries. That should reduce the time spent finding "Rosen" in nsCSSFrameConstructor.cpp from 17 seconds to something quite negligible.
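The windowing idea in that comment can be sketched in isolation (FindAcrossNodes is a hypothetical helper, not the actual nsTextServicesDocument code): instead of concatenating every text node in the PRE up front, stream the nodes through a buffer that only ever carries the last pattern-length-minus-one characters, so matches spanning node boundaries are still found.

```cpp
#include <string>
#include <vector>

// Sketch of the sliding-window aggregation proposed above (hypothetical
// helper name). Streams the block's text nodes through a small buffer rather
// than building one giant string; keeps the last pattern.size()-1 characters
// as carry so a match split across two text nodes is still detected.
static bool FindAcrossNodes(const std::vector<std::string>& textNodes,
                            const std::string& pattern) {
    if (pattern.empty()) return true;
    std::string carry;  // tail of previously seen text, < pattern.size() chars
    for (const std::string& node : textNodes) {
        std::string window = carry + node;
        if (window.find(pattern) != std::string::npos) return true;
        if (window.size() >= pattern.size())
            carry = window.substr(window.size() - (pattern.size() - 1));
        else
            carry = window;
    }
    return false;
}
```

The memory held at any moment is bounded by one text node plus the pattern length, instead of the whole 13476-line PRE element.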
excellent debugging
We *should* search in generated content, since the user doesn't know whether it is real content or not.
Blocks: 93204
This is a VERY user-visible performance problem (and not a minor problem; it makes Find close to useless and almost always aggravating); adding topperf (IMHO). Nominating for mozilla 0.9.4. There's lots of good analysis here. Someone needs to make this get fixed.
Ok, here's some analysis: Obviously IndexOf is the main culprit here. The question is why (since IndexOf is pretty optimized). The answer is that nsContentIterator::Next() has some optimizations to cache indexes for the pre-order case (parent, then children), but none for the post-order case (children, then parent (depth-first)). And, of course, there appear to be NO uses of pre-order traversal. The uses in the jprof all come from nsTextServicesDocument.cpp (which is part of the Find and Replace code). I may have some sort of patch for this tomorrow.
I have a patch completed. It doesn't help find much because nsTextServicesDocument::GetFirstTextNodeInNextBlock() calls PositionAt() constantly. My change required PositionAt to use IndexOf() to rebuild the stack of indexes. ARGH. I'm going to fix GetFirstTextNodeInNextBlock() to clone the iterator (then throw away the clone) instead of using PositionAt() to rewind it. (and add cloning code to nsContentIterator). Sigh.
The patch I attached is work-in-progress (though pretty good). It seems to work perfectly and is O(1) for Next/Prev, however Find isn't much faster - because nsTextServicesDocument calls PositionAt all the time, and it forces me to regenerate the stack. (*Sob*) I'm going to solve this two ways: first add ability to clone the iterator (generally useful), and modify nsTextServicesDocument to use a clone instead of using PositionAt to rewind the iterator. Second, I'm going to design (and maybe even implement) a smarter PositionAt rebuilder that rebuilds until it hits a common ancestor. This may not be needed if PositionAt isn't used often after I make the first fix.
*** Bug 93204 has been marked as a duplicate of this bug. ***
Worse yet: nsTextServicesDocument uses PositionAt EVERYWHERE. So I'm going to have to implement a fast PositionAt. Still working on it.
Summary: Find on Page extremely slow → Find on Page [and in view source] extremely slow
Really, even once that is fixed, there will still be fundamental problems with the current code, since IIRC what it's using the iterator to do is copy the entire HTML document into a string and build up data for going in reverse from a position in the string back to the content nodes. What we really need is a faster iterator that can be used to lazily walk through the document while looking for the required string. (This algorithm should be designed so that the whitespace-condensing that needs to happen for Find is easy to do.) It would also be good if the searching algorithm were better than O(N*M) in the worst case. Perhaps it would be good to use a Unicode-friendly variant of the sublinear algorithm (|js_BoyerMooreHorspool|) used in jsstr.c (which is O(N/M) in the usual case, and O(N) in the worst case), although see also bug 82054.
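For reference, the Horspool idea mentioned above looks roughly like this over plain 8-bit text (a sketch only; js_BoyerMooreHorspool in jsstr.c differs in detail, and a Unicode-ready variant needs a larger or hashed skip table): compare from the end of the pattern, and on a mismatch shift by how far the text character under the pattern's last position is from the pattern's end.

```cpp
#include <cstddef>
#include <string>

// Minimal Boyer-Moore-Horspool search (sketch, ASCII/8-bit only).
// Returns the offset of the first match, or std::string::npos.
static std::size_t Horspool(const std::string& text, const std::string& pat) {
    const std::size_t n = text.size(), m = pat.size();
    if (m == 0) return 0;
    if (m > n) return std::string::npos;
    // Skip table: how far we may shift when the text character aligned with
    // the last pattern position is 'c'. Characters not in the pattern allow
    // a full shift of m.
    std::size_t skip[256];
    for (std::size_t i = 0; i < 256; ++i) skip[i] = m;
    for (std::size_t i = 0; i + 1 < m; ++i)
        skip[static_cast<unsigned char>(pat[i])] = m - 1 - i;
    std::size_t pos = 0;
    while (pos + m <= n) {
        std::size_t j = m - 1;  // compare right to left
        while (text[pos + j] == pat[j]) {
            if (j == 0) return pos;
            --j;
        }
        pos += skip[static_cast<unsigned char>(text[pos + m - 1])];
    }
    return std::string::npos;
}
```

On typical text most alignments shift by close to the full pattern length, which is where the O(N/M) average case comes from.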
Just to let you know, guys, that the owner of the Text Services code, Kin, is on vacation until next week.
This is a current jprof (without any improvements to ContentIterator). This only covers the actual searching. As you can see, it's _quite_ clear. (I wanted to re-verify the older jprofs, and as a baseline for improvements.)
NOTE: that's not a final (or even compilable) patch, it's meant to be for discussion/review. Ignore (for now) the #if 0'd out save/restorestate code, and the code for Clone() (though that may be useful in the end). The base change is to keep a stack of Indexes. The trick is to make PositionAt() fast, since it's called constantly by nsTextservicesDocument. This patch does a smart rebuild of the index stack, looking for the lowest common parent node and stopping the rebuild there. Since almost all uses have the new position near the old one, this should be close to optimal. (Even better would be to not have to do it, but that would require a fancy save/restore state interface (the one commented out isn't fancy enough), or extensive or total rewrite of nsTextServicesDocument.) Note: the last set of changes to this have been compiled but not run. (The previous version has been run.) I may be able to jprof this later tonight; tomorrow at the latest.
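The "stop the rebuild at the lowest common parent" idea can be modeled with hypothetical types (the real patch works on nsIContent and an nsVoidArray of indexes): the index stack mirrors the ancestor chain from the root to the current node, and a reposition only needs to recompute the entries below the deepest ancestor the old and new positions share.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the smart PositionAt rebuild described above.
// The iterator keeps one index-stack entry per ancestor of the current node;
// everything at or above the common ancestor of the old and new positions
// can be reused, so only the remainder needs the IndexOf-based rebuild.
struct Node {
    Node* parent = nullptr;
};

static std::vector<Node*> AncestorChain(Node* n) {  // ordered root..n
    std::vector<Node*> chain;
    for (; n; n = n->parent) chain.push_back(n);
    std::reverse(chain.begin(), chain.end());
    return chain;
}

// Number of leading index-stack entries that survive a move from oldNode to
// newNode (the depth of their shared ancestor path).
static std::size_t ReusableDepth(Node* oldNode, Node* newNode) {
    std::vector<Node*> a = AncestorChain(oldNode);
    std::vector<Node*> b = AncestorChain(newNode);
    std::size_t i = 0;
    while (i < a.size() && i < b.size() && a[i] == b[i]) ++i;
    return i;
}
```

Since nsTextServicesDocument mostly repositions to nearby nodes, the shared prefix is usually almost the whole stack, which is why this rebuild is close to optimal in practice.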
Success! I now have a patch that's _dramatically_ faster. On a bunch of searches, IndexOf didn't even show up on the jprof, and the response to a search was basically instantaneous, even at the end of a 350K jprof file. NOTE: Something EVIL is being done in nsTextServicesDocument when you click Wrap Around. Click that, and performance goes down the tubes - even if it doesn't wrap over the end in the search. I'm going to profile that and file a bug against TextServices; that's not a bug in the iterator. I'm cleaning up my previous patches (there was a minor bug in that last upload, by the way) and will post here for r=/sr=.
I was so happy to see progress here that I briefly looked at the code, and here are some comments (this is not a full review, and this is not my area):
1. You might want to guard against null parameters to functions.
2. In nsContentIterator::Clone() I think you should return an out-of-memory error if new did not succeed.
3. I believe typecasting 0 to (void*) is not needed, but it does no harm either; more a question of style.
4. Some functions return a not-implemented error. If they are meant to never have an implementation it would be nice to say so in a comment in the function, and if they are supposed to never be called you'd better add an assertion.
Thanks Heikki. Null parameters: I didn't explicitly add any checks, and I don't think I added any vulnerabilities to null parameters; I'll make a pass over it and add any checks that seem to make sense. Out-of-memory error from Clone(): probably; I'll check it. The typecast ((void *) 0) is to quiet compilers about passing an int to something that wants a void*. The not-implemented errors are for Clone(), which we may want to implement someday for those classes. I may add an assertion, though, to warn someone they need to implement it if they happen to start calling it.
rjesup: a couple of comments on your comments replying to heikki's comments :-)

- Null parameter checks: no need if you trust callers to be C++-only code where there is no interface confusion or programmer incompetence (and the latter can't be cured completely with null checks, as there are many bad pointer values!). If a caller might be JS or another such language, in params that should be interface pointers might be null, so it pays to check in scriptable interface method implementations. In no case should inout or out parameter pointers (NB: not pointer parameters!) be null-checked.

- (void *)0: please use nsnull, When In Rome. nsnull expands to 0, which is always valid in a pointer context. If you need to disambiguate the type of an actual argument to an overloaded method, please raise a red flag -- we should unoverload such methods on general principles. Also, (void*)0 is not compatible with all null pointers, according to the standard; function pointers, IIRC, may not be compatible with void pointers. Not germane to your uses of (void*)0 here, but worth mentioning to spread the news.

- Not-implemented failures: use NS_NOTREACHED("method-name-here"), FYI, in case that assertion variant wasn't familiar to you.

My comments on the patch:
1. Why nsAutoVoidArray mIndexes? What are the stats (mean, variance) on its population? I don't doubt that ancestor lines are typically short, but I'd like to see numbers for top100 or similar docs.
2. Why strongly type the nsAutoVoidArray *aIndexes parameter, when it could be an nsVoidArray*? Should it not be const nsVoidArray& instead, to better show that it is an in param? Ok, if the answer to 1 is "the population almost always fits in an nsAutoVoidArray", then it makes sense to use the narrowest type, but nsAuto* in a formal parameter type is usually over-narrow.
3. NS_ERROR_OUT_OF_MEMORY, not NS_ERROR_FAILURE (er, what heikki said -- sorry, didn't mean to repeat).
4.
Ah, I'm slow: you want to cast small-ish integers to void* to store them in an nsVoidArray. Too bad we never unified the Uint32Array alecf and/or roc were hacking on under xpcom/ds. There are nasty 64-bit portability issues here, see bug 20860; maybe you can beat that patch in and help it by including the NS_PTR_TO_INT32 and NS_INT32_TO_PTR macros in your patch here?

5. Style nit: bracing then against dangling else may be prudent (I think not), but if the else clause returns, why not invert, so that instead of

+ if (!mPre)
+ {
+   *aSibling = parent;
+ }
+ else
+   return GetNextSibling(parent, aSibling);

you have

+ if (mPre)
+   return GetNextSibling(parent, aSibling);
+ *aSibling = parent;

Same for the GetPrevSibling case later, and maybe for similar if/else structures that do not contain early returns, but which might want to test mPre rather than !mPre for symmetry.

6. Another style nit, already in the code before your patch, but I thought I would ask: why is nsCOMPtr<nsIContent>() better than nsnull, these days? Long ago, one needed null_nsCOMPtr() or an otherwise default-constructed temporary, but no longer.

7. Any thoughts on whether nsDeque should be revised to extend nsVoidArray? At the least, the stack-like uses in your patch could use some Push, Pop, Top, and ReplaceTop sugar-methods.

Thanks for digging into this -- keep the patches coming! /be
More information on why wrap-around is slow: it's spending ~85% of its time in nsTextServicesDocument::PrevBlock for some unknown reason. I can't imagine why it would do this, since I'm searching forward, and I never actually wrapped. I'll file a separate bug against that. At this point, unless I believe there's reason not to, I use nsAutoVoidArrays; in this case I think it's fairly likely it'll fit, and I doubt we'll have more than a couple of nsContentIterators live at any given time. The strong typing on the constructor method probably isn't needed, but that's really an internal-only constructor for use from Clone() (in fact, I should make it protected or private, I think). I'll look at using those macros. nsVoidArrays are used for integers elsewhere as well; and yes, I agree it's annoying. (Better would be for it to be a template class to provide type-checking, etc.; sicking was going to spec out a design and post it to n.p.m.xpcom.) Style nits: I'll look at them. I coded it that way because it fit the old logic better, so it was easier for me to check the correctness of my rewrite. I'll add the nsnulls. I've never had a chance to look at nsDeque... sounds like it could be a good merge with nsVoidArray, perhaps. Thanks brendan.
If 'wrap around' is on, it has to call nsTextServicesDocument::PrevBlock a bunch of times to get the index of the starting block, which is remembered so that searching can stop when that index is reached again after searching the entire document.
This is the culprit for wrap-around: nsFindAndReplace::GetCurrentBlockIndex(), which is basically:

do {
  aDoc->PrevBlock();
  blockIndex++;
} while (!at start of document);

As you can see, that's REALLY bad when you're deep in a large document. It's accentuated a LOT by the implementation of nsTextServicesDocument::PrevBlock, which calls the various {Get,Find,}FirstTextNodeIn{Prev,Current,Next}Block() methods many times to do a PrevBlock. Even a trivial (and still expensive) recoding of the above to iterate forwards from the start until it reached the current block would be vastly faster. The best solution would be to modify the interface to nsTextServicesDocument to support a wrap search more directly, so we don't have to find the current index in nsFindAndReplace the very hard way.
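The suggested recoding can be illustrated with a toy model (a hypothetical Block sequence, not the real nsITextServicesDocument API): counting forwards takes one cheap step per preceding block, whereas the backwards PrevBlock walk pays an expensive traversal at every step.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for the document's sequence of text blocks (hypothetical model).
struct Block { int id; };

// Forward recoding of GetCurrentBlockIndex: walk from the first block with
// cheap "next" steps until the current block is reached, instead of calling
// the expensive PrevBlock repeatedly until the start of the document.
static std::size_t BlockIndexForward(const std::vector<Block>& blocks,
                                     const Block* current) {
    std::size_t index = 0;
    for (const Block& b : blocks) {
        if (&b == current)
            return index;
        ++index;
    }
    return index;  // current not found: one past the last block
}
```

The better fix proposed above, letting nsTextServicesDocument report the current block index directly, removes even this linear walk.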
Brendan: I left the if (!mPre)/etc code alone. I think in context it makes more sense (to the eye, given the clauses above and below) the way it is. It also should be just as efficient given a reasonable compiler. Ready for r/sr I believe. Just about everything else mentioned was done; I couldn't find any additional places null checks are needed (there are quite a few already).
Why no 'const nsAutoVoidArray& aIndexes' parameter type change for nsContentIterator's ctor? This is a little more substantive a nit than the others I picked, I think, although it's not a big deal. /be
Forgot that one, sorry. imoT found a bug introduced in selection by this (nsContentSubtreeIterator didn't properly update the index stack). Fix for that is in testing.
The subtree bug is semi-fixed. No longer crashes, generally works but produces one weird behavior. I'll update the patch tomorrow once I've shaken it out for a day or so.
Ok, that patch updates all the subtree code; it was making assumptions about how GetNextSibling etc. work that I broke. I've verified that drag selection works correctly again (imoT's problem), and all highlighting. I also ran debug code (in there, compiled out) to verify all nodes returned in subtree Init and Next against an unmodified nsContentIterator, and they all check out correctly. r=/sr=?
Blocks: 91351
Updated per kin's comments on IRC. I also found by inspection a very subtle issue about how end nodes are handled in subtrees that may have been entirely accidental. I restored the old behavior and marked it with XXX and comments.
kin is reviewing the latest patch (emailed to him); he'll respond tomorrow. If it's ok by him I'll add it here. Sorry about all the patches, I was modifying and uploading as he made requests.
A couple of questions/comments/issues I see so far in the 08/17/01 09:46 attachment:

* First() and Last() don't reset mIndexes. They can be called at any time, just like PositionAt().

* nsContentIterator::Init() doesn't build up the mIndexes array necessary for an immediate call to Next() or Prev(). Imagine the following example:

<div><b>bold</b> some text <i>italics</i></div>

The DOM tree for the above example would look like this:

                 <div>
                   |
    +--------------+--------------+
    |              |              |
   <b>       " some text "       <i>
    |                             |
 "bold"                      "italics"

Now suppose the start container and offset in the range were the text node containing "bold" and the end container and offset were the text node containing "italics". Walking through the Init() code by hand, I think when we exit, we have mCurNode == the "bold" text node and mIndexes contains only one item, a zero. Wouldn't that create an error situation if Next() were then called, since there aren't enough indexes in the array to get to " some text "?

* In nsContentSubtreeIterator::Prev() we can avoid the expensive calls to RebuildIndexStack() if we just used a temp index array which starts out as a copy of mIndexArray. Then we could just override the contents of mIndexArray with the contents of the temp array in the places where things succeeded. Isn't copying the index array much cheaper on average than rebuilding the indexes?

* GetTopAncestorInRange() seems to remove everything except the first item in aIndexes while parent is in the range. I'm not sure that's correct? We need to keep in mind that mCommonParent can be higher in the parent hierarchy than the outAncestor returned by this method, so the number of items in the index array when we return should be equivalent to the number of parents between and including mCommonParent and outAncestor, right? This would ensure that we had enough info in the index array to avoid calling IndexOf() when using the various utility methods.
* The more I look at this, the more I understand jfrancis' (the original author's) code grouping. GetNext/PrevNode() was supposed to be a utility method that decided what to return next based on Pre/Post iteration, and GetNext/PrevSibling() was supposed to be a simple utility method that got the equivalent of the next sibling as if the entire tree were flattened. With your changes some of the Pre/Post iteration is still done in GetNext/PrevNode and some of it is done in GetNext/PrevSibling() which I think kind of clouds the logic. Also, when used by the SubtreeIterator, most of this Pre/Post code in GetNext/PrevSibling() is turned off, basically restoring it to what jfrancis originally had.
First, Last: Obvious mistake, sorry. Fixed to basically just call PositionAt(), which does an optimal partial rebuild of the index stack. Note: this meant that the SubtreeIterator needed to support PositionAt() (or rather not override it). I couldn't see any reason not to. If we want to keep PositionAt() non-public in the subtree iterator, we could make it protected, or override First() and Last() and do rebuilds, or just decide it's not important and do rebuilds for First()/Last() in all cases.

nsContentIterator::Init(): I think you're right, because of the call to find the common ancestor. Now it ignores the index array until it either calls MakeEmpty(), or it succeeds and calls RebuildIndexStack().

nsContentSubtreeIterator::Prev(): the RebuildIndexStack() calls are more expensive than copying an array, but they should only happen when we hit the beginning of the subtree or something else goes wrong. Copying the index array on every Prev() call would be more (total) overhead than rarely rebuilding the index stack, IMO.

GetTopAncestorInRange(): cut-and-paste typo. It was only being called with an actual array parameter from Prev(), and most code doesn't call Prev() on the subtree iterator. That should have been:

    aIndexes->RemoveElementAt(aIndexes->Count()-1); // remove the last element

The reasoning for moving code from Next/PrevNode() to GetNext/PrevSibling() was to unify the code; at the time I did it I was concentrating on nsContentIterator, and didn't realize that the SubtreeIterator used them as well. I did it mostly to unify the logic, which was almost identical for Pre and Post, with just minor differences. BTW, there is NO code that uses Pre that I can find. Given that Subtree needs what amounts to the Pre version of GetNextSibling() (and the Post version of GetPrevSibling()), it may make sense to use just the one parameter, so NextNode becomes: ...
    // else next sibling is next
    return GetNextSibling(cN, mIndexes, ioNextNode, mPre);
  }

and change the conditionals in GetNextSibling() from (!mPre && !aNoParents) to (!aNoParents). (Similar for GetPrevSibling().) Or we can simply re-split GetNextSibling() and have a bunch of (almost) duplicate code in NextNode(). (Ditto for GetPrevSibling() and PrevNode().) I tend to prefer parameterized code to duplicated slightly-variant code, but that's just personal preference. (I find it easier to maintain when you don't have to make similar changes in several different places.)

My apologies for missing some of these things and making so many go-rounds. I was very focused on finding some way to save cycles in what was a really nasty hotspot, and didn't take as much time to look over the whole usage as I should have; plus I started out not understanding how this class was used. The patch is much better now thanks to all your comments, I believe. (And imoT's testing.)
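As a side note for readers following the thread, the RebuildIndexStack() operation discussed above essentially walks from the current node up to the root, recording each node's position within its parent. The sketch below is a hypothetical, minimal illustration (the real code operates on nsIContent, and these names are illustrative): each level costs an O(n) IndexOf() scan, which is why the patch tries to call the rebuild as rarely as possible.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical minimal content node: children kept in a vector, so
// positional lookup is O(1) while IndexOf() is an O(n) linear scan.
struct Node {
    Node* parent = nullptr;
    std::vector<Node*> children;

    int IndexOf(const Node* child) const {
        for (size_t i = 0; i < children.size(); ++i)
            if (children[i] == child) return static_cast<int>(i);
        return -1;
    }
};

// Full rebuild: one IndexOf() per ancestor level, so O(depth * fanout).
// The returned stack has indexes[0] for the topmost level, matching the
// "one index per level" layout discussed in the comments above.
std::vector<int> RebuildIndexStack(const Node* cur) {
    std::vector<int> indexes;
    while (cur->parent) {
        indexes.push_back(cur->parent->IndexOf(cur));
        cur = cur->parent;
    }
    std::reverse(indexes.begin(), indexes.end());  // built bottom-up
    return indexes;
}
```

Using the <div><b>bold</b> some text <i>italics</i></div> example from the review comment, the stack for the "italics" text node would be {2, 0}: index 2 for <i> within <div>, then index 0 for the text within <i>.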
Reading through the comments here, I think one assumption people (including me) have mostly been making - that no one changes the content tree out from under the iterator - is incorrect, at least in the editor and maybe elsewhere (compose). DeleteSelection(), InsertText(), etc. modify the content tree, and we have to adjust any cached indexes when that happens.

I'm working on this. I plan to add a RebuildState() method to be called after modifying content (such as from nsTextServicesDocument::AdjustContentIterator() or ::InsertText() - add a call to Adjust, etc.). Alternatively, I could add some double-checking all the time (verify the indexes using GetChildAt() on _every_ call, all the way back up the stack). This would obviously be expensive, though not as bad as the original code (IndexOf). Or we could set a bit in the content iterator telling it to rebuild on the next call (a bit of a pain). Or allow the iterator's creator to specify whether we keep an index stack (and thus guarantee not to change the content, or to tell us if it does).

The question becomes: do we know who may modify the document while we're searching it? What about DHTML? Find in a still-loading page? How does deletion/insertion affect subtree iterators (and in particular their other start/end arrays)? TextServicesDocument should be not too hard (modulo the range issue).

MAN O MAN do I wish that content nodes were doubly linked to siblings. Then this would ALL be moot.
BTW, thinking about it, the "check on each call" isn't _that_ expensive. It's like this:

    if (GetChildAt(parent, mIndexes[mIndexes.Count()-1]) == mCurNode)
      indx = mIndexes[mIndexes.Count()-1];
    else
      indx = parent->IndexOf(mCurNode);
    indx++;
    ...
    mIndexes->ReplaceElementAt(mIndexes.Count()-1, indx);

This would only call IndexOf() if our cached index was wrong, and GetChildAt() isn't _too_ bad (though it still has a cost). It's also guaranteed safe in the face of content changes, so long as the current node (and its parents) isn't deleted, which is already dealt with in TextServicesDocument. BTW, that issue (deletion of the current item requiring resetting the iterator position) means that the only concerns are insertion (which is always safe) and places that are already calling PositionAt() (or equivalent) in case the current node is no longer valid. This means basically it's only TextServicesDocument we need to worry about. I think.
While I still haven't actually looked at the patch, that suggestion seems fine for verifying the tree. I forget who I discussed this with on IRC a couple of months ago (smfr, maybe?), but I think that's what we came up with.

GetElementAt should be O(1) - it's a simple array lookup. You really can't get much faster than that, I think. If nsVoidArray::ElementAt was moved into the header file so that it could be inlined, you'd probably even eliminate the function call and indirection (mImpl), turning this into one addition plus one pointer dereference. Is there a reason that that's not done?

What if the current node is reparented, so that the cached parent is still at the same place, but we are not its child anymore? Can this happen? (Drag and drop?)

Wrt generated content, see the 2001-06-03 01:05 comment.
Since we'd only be verifying the current node's index, there's no problem with the parent's index being moved. The parent's index in the array would be wrong, but that doesn't matter until we go up a level - and then we'd end up reverifying that index with its parent. This would be O(1), as you say, though with a larger K - but not large.

Note that if there were a chance that the current node were deleted, code using the iterator would have to use PositionAt() (as DeleteSelection() does). The only users of PositionAt() are TextServicesDocument (which we understand - there are two things that modify the content there, DeleteSelection() and InsertText(), though it's possible that callers might modify it) and nsHTMLEditor.cpp, which is just using it to set an initial position. However, as I mentioned, callers to TextServicesDocument might add things (or delete things other than the current node).

The safest solution is as per my comments above - reverify the index before each increment. Patch to be posted. (I'm afraid there might be an award for the bug with the most versions of a patch...)
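The reverify-before-each-increment approach described above can be sketched in isolation. This is a hypothetical stand-in (a toy node type, not nsIContent): trust the cached index only if a cheap O(1) positional lookup confirms it still points at the current node, and fall back to the O(n) IndexOf() scan only on a cache miss.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal content node: children are stored in a vector, so
// GetChildAt() is an O(1) array lookup while IndexOf() is an O(n) scan --
// the cost the cached index stack is trying to avoid.
struct Node {
    std::vector<Node*> children;

    Node* GetChildAt(size_t i) const {
        return i < children.size() ? children[i] : nullptr;
    }
    int IndexOf(const Node* child) const {
        for (size_t i = 0; i < children.size(); ++i)
            if (children[i] == child) return static_cast<int>(i);
        return -1;
    }
};

// "Check on each call": verify the cached index with one cheap lookup;
// only rescan if content changed underneath us and the cache is stale.
size_t VerifiedIndex(const Node* parent, const Node* cur, size_t cached) {
    if (parent->GetChildAt(cached) == cur)
        return cached;                                    // cache hit: O(1)
    return static_cast<size_t>(parent->IndexOf(cur));     // cache miss: O(n)
}
```

If a sibling is inserted before the current node, the stale cached index fails the GetChildAt() check and one IndexOf() resynchronizes it; this stays safe for insertions, while deletion of the current node itself still needs the PositionAt() handling described above.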
Patch as per my last comment; it reverifies the index on each call. Note that this does not change the fact that if you remove the current node from the tree, the iterator will become confused. The only code I know of that deletes while iterating (nsTextServicesDocument::DeleteSelection) checks for that and PositionAt()s to move the iterator to a valid node. A separate bug to inline ElementAt()/IndexOf()/etc. has been filed as bug 96108.
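For context on the inlining suggestion above, here is roughly what an inlined, header-defined ElementAt() boils down to. The class and member names loosely mirror nsVoidArray, but this is an illustrative sketch of the idea, not the actual Mozilla implementation: once inlined, the lookup is a bounds check plus one addition and one pointer dereference.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative stand-in for a void-pointer array whose ElementAt() lives
// in the header so the compiler can inline it.  The body compiles down to
// a bounds check plus *(mImpl + aIndex): one addition, one dereference.
class VoidArraySketch {
public:
    VoidArraySketch(void** aStorage, int aCount)
        : mImpl(aStorage), mCount(aCount) {}

    void* ElementAt(int aIndex) const {
        return (aIndex < 0 || aIndex >= mCount) ? nullptr : mImpl[aIndex];
    }

private:
    void** mImpl;   // heap block holding the elements
    int mCount;
};
```

With the call and the extra indirection eliminated, the per-iteration cost of the cached-index verification becomes negligible next to an IndexOf() scan.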
Assignee: kin → jfrancis
Status: ASSIGNED → NEW
jfrancis will take ownership of this bug. discussed in performance meeting today.
095
Status: NEW → ASSIGNED
Target Milestone: mozilla1.0 → mozilla0.9.5
I'm still plugging away at this. Should have something worth testing in the next day. I'm making RJ's index stack approach apply only to nsContentIterator. For nsSubtreeIterator it shouldn't yield a very big win over a trivial "one level" cache, and it complicates the code quite a bit. The reason it shouldn't make a big difference is that the subtree iterator doesn't pop down and then back up the way the regular content iterator does. Example: you are iterating over:

    0---0---0---0---0---0---0---0---0---0---0---0---0---0---0---0---0---0---0
    |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    |       |       |       |       |       |       |       |       |
    0       0       0       0       0       0       0       0       0

With the content iterator, you will be constantly diving down into the children and then back up to the parents (assuming a post-order iterator). With the subtree iterator, you will never touch the children, and will just hit the top-level parents in order. So a simple cached index (instead of a stack) should be sufficient and less complicated.
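The "one level" cache idea for the subtree case can be sketched as follows. This is a hypothetical illustration (a toy node type and iterator, not the real nsContentSubtreeIterator): because the iterator only moves sideways among top-level candidates and never descends, a single cached index suffices, and a content change is repaired with one IndexOf() in PositionAt().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal node: children in a vector, so positional lookup
// is O(1) and IndexOf() is an O(n) scan.
struct Node {
    std::vector<Node*> children;
    int IndexOf(const Node* child) const {
        for (size_t i = 0; i < children.size(); ++i)
            if (children[i] == child) return static_cast<int>(i);
        return -1;
    }
};

// Subtree-style iterator: walks only the top-level children of `parent`,
// never diving down, so one cached index replaces a whole index stack.
class FlatIterator {
public:
    explicit FlatIterator(Node* parent) : mParent(parent), mCachedIndex(0) {}

    Node* Current() const {
        return mCachedIndex >= 0 &&
               mCachedIndex < static_cast<int>(mParent->children.size())
                   ? mParent->children[mCachedIndex] : nullptr;
    }
    void Next() { ++mCachedIndex; }

    // If content changed underneath us, one IndexOf() resyncs the cache.
    void PositionAt(Node* child) { mCachedIndex = mParent->IndexOf(child); }

private:
    Node* mParent;
    int mCachedIndex;
};
```

Advancing is a plain increment with no stack maintenance, which is the simplification argued for above.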
This patch is a more minimal implementation of RJ's index caching scheme. I get underflows when testing it with Find. Selection and editor actions (which both beat on the iterators) seem to work fine, though. I'll debug the underflow issue, but I wanted to post the patch now so others could begin commenting.
The underflow is from PositionAt(), I'll bet, since that changes mCurNode, and I don't see any mods to it. That might be ok - the code can rebuild the stack in a lazy manner after PositionAt(); it might even be more efficient given the frequent PositionAt() calls. However, since PositionAt() usually doesn't move the iterator far, my PositionAt() code was pretty efficient at rebuilding the stack. My tests indicated that PositionAt() is called so often that even a simple "IndexOf() back to the root" on each one was way too expensive, which was why I made the more complex "rebuild only the changed part of the index stack" version. If you do a lazy rebuild, though, you'll need to at least clear the index array in PositionAt(). I'd advise benchmarking it against mine before finalizing on that.
Thanks for feedback. I hadn't examined kin's textservices code and didn't know he was using PositionAt. I'll mix back in your work there - don't want to remove juicy bits when we need the juice!
latest patch. This one is pretty real, functionally. I have incorporated RJ's PositionAt() optimizations. Thanks to RJ for doing all the heavy lifting here. I also removed the two places where I accidentally left unneeded IndexOf() calls in my previous patch. Seat-of-the-pants testing on large lxr files reveals that the performance of this patch is on par with RJ's final patch. I also didn't have any correctness problems in the editor or selection, or even hit any cache misses doing Find. Kin has some changes he wants that I have not yet incorporated. If folks would start testing and putting any requested changes into the bug, that would be great.
Looks good to me; I'd be willing to r= it (having been through the code thoroughly before).
*** Bug 99881 has been marked as a duplicate of this bug. ***
this latest patch incorporates review requests from Kin. I think we are ready to rock. Kin, are you ready to sr?
Whiteboard: [perf][nav+perf] → [perf][nav+perf] fixinhand; need r,sr
Comment on attachment 49530 [details] [diff] [review] content/base/src/nsContentIterator.cpp take#3 Looks good to me r=rjesup@wgate.com
Attachment #49530 - Flags: review+
above patch has some final tweaks from Kin's super-review. Cleanup, comments, and minor changes only.
Comment on attachment 49844 [details] [diff] [review] content/base/src/nsContentIterator.cpp sr=kin@netscape.com
Attachment #49844 - Flags: superreview+
Attachment #49844 - Flags: review+
Whiteboard: [perf][nav+perf] fixinhand; need r,sr → [perf][nav+perf] fixinhand; ready for checkin
Can we commit this soon? I think we're done with r/sr...
fix landed on trunk
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
I tested this with great results on:

- mozilla.org
- View Source of mozilla.org
- slashdot.org
- View Source of slashdot.org
- lxr.mozilla.org/mozilla/source/widget/src/windows/nsWindow.cpp

It takes about a second to search for "e" on the lxr page, but that's almost 6000 lines of code. It took the same amount of time to search for "ScheduleHookTimer", which doesn't appear until line 5634. I'm running a PII-500. I find the time acceptable, given the enormous size of the page. I wanted to test this on View Source of the lxr page, but couldn't, because View Source of that page maxes out my CPU.

So, looks really good on Windows using 2001092403. Anyone for verifying this using Linux and Mac? Nice work, guys! This fixes one of the largest usability complaints I have with Mozilla.
A vast improvement, testing on Linux, new CVS. Ran a query in bugzilla that returned 6583 hits: bugs of all resolutions except closed, with "window" in the summary. (Btw, there was no hang after it had loaded!) Then searched for the word "work". Unlike before, there was now no slowdown when finding the next occurrence of the word: the time it took between finding the 399th and the 400th occurrence was as short as between finding the 99th and the 100th. And the time between clicking "find" and the word being found was comfortably close to instant every time. This is fixed. Thank you all.
(Damn mid-air collisions...) On Linux (K6-2 300 w/ 256MB) a search for "for" on this page took less than 1/4 of a second per click of "find", regardless of which half of the page it was in. In View Source of this page, the same search took three seconds to find each "for", again consistent throughout the length of the "page" being searched. I wonder if Find in View Source could be sped up? I noticed just opening View Source on this page took several seconds (for the content to be displayed), so maybe it's an entirely different animal. Thoughts?
A view source document, with highlighting on, is on the order of 10 times bigger (just raw source wise) than the document it's the source of. For documents with lots of markup, 20 times is not uncommon. There's just a lot more nodes to deal with there....
Just tested on linux on view-source:http://lxr.mozilla.org/mozilla/source/widget/src/windows/nsWindow.cpp with syntax highlighting _on_ Searching for "ScheduleHookTimer" Without patch: 3 mins, 40 secs With patch: 6 secs I'd say verified on Linux. :)
(Damnit another mid-air...) Yup with highlighting turned off it's lightning fast in view source. Perhaps something the View Source maintainers can think about some time...
Nominating for nsbranch+. This is a significant user-noticeable difference. (Will PDT notice this if it's marked Resolved? I'm guessing not.) jfrancis: if you're willing to entertain putting it on the branch, please reopen to put it on PDT's radar. If you're not willing, just remove nsbranch+. Thanks.
Keywords: nsbranch+
I'm just wondering: I tried searching this bug 31770 page in the browser for the word "the", and with today's build on W2K it is much faster than the same search in view source. Could this be because the page is in memory or cached?
reopening to get PDT attention for the 094 nomination
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [perf][nav+perf] fixinhand; ready for checkin → [perf][nav+perf] fixed on trunk; not fixed on 094
Much as I agree this should land on the branch, it hasn't been awarded any official nsbranch+ yet. Still in nomination so fixing the keyword. CCing jaimejr for nsbranch+
Keywords: nsbranch+nsbranch
Marking nsbranch+
Keywords: nsbranchnsbranch+
Note that bug 101710 may be a regression from these changes.
bug 101710 is a regression due to this patch; I have attached a patch to fix bug 101710 to it.
Updated status
Whiteboard: [perf][nav+perf] fixed on trunk; not fixed on 094 → ETA: fixed on trunk; waiting for approval for 094. [perf][nav+perf]
Note: the patch which fixes bug 101710 must be checked in to the 0.9.4 branch if the patch for this bug is checked in.
It's a little too late to take this one. nsbranch-
Keywords: nsbranch+nsbranch-
Correcting for the right process. Leaving as nsbranch+, but this is PDT-.
Keywords: nsbranch-nsbranch+
Whiteboard: ETA: fixed on trunk; waiting for approval for 094. [perf][nav+perf] → [PDT-] ETA: fixed on trunk; waiting for approval for 094. [perf][nav+perf]
marking fixed again since we are not going to land on 094 branch
Status: REOPENED → RESOLVED
Closed: 23 years ago23 years ago
Resolution: --- → FIXED
okay, adding vtrunk kw, since this won't be checked into the branch. unless i'm misunderstanding joe's last comment. :) if somehow this *does* get checked into the branch, do remove 'vtrunk' so that it can get back onto my radar for branch verification. thx!
Keywords: vtrunk
verified using boris' test case described in 2001-09-24 12:20 --tested Find in both the web page and the source windows, for grins. linux, 2001.10.18.08-trunk comm: 1.97s [page], 10.65s [source] winNT, 2001.10.18.06-trunk comm: 1.34s [page], 6.28s [source] mac 10.1, 2001.10.22.13-trunk comm: 1.44s [page], 6.19s [source]
Status: RESOLVED → VERIFIED
Keywords: vtrunk
Product: Core → Mozilla Application Suite
Component: XP Apps: GUI Features → UI Design