Closed
Bug 74656
Opened 24 years ago
Closed 20 years ago
Optimizations possible in AppendLines
Categories
(Core :: Layout: Images, Video, and HTML Frames, defect, P5)
Core
Layout: Images, Video, and HTML Frames
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: hyatt, Assigned: shaver)
References
Details
(Keywords: perf)
Attachments
(17 files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
text/plain
|
Details |
Frame lists need to keep a pointer to their lastChild. Because they don't do this, content insertion and appending is preposterously slow (especially when a large # of insertions/appends are done).
Comment 1•24 years ago
|
||
I foggily recall that we hashed through this before and decided it would be a Bad Thing to add more pointers to nsContainerFrame. Would it be possible to detect that the child list has grown past a certain limit (say, 16 children) and add a frame property?
Reporter | ||
Updated•24 years ago
|
Status: NEW → ASSIGNED
Summary: 20% of reply time spent in O(n^2) LastChild retrieval in frame code → 10% of reply time spent in O(n^2) line walking algorithm in nsLineBox
Target Milestone: --- → mozilla0.9.1
Reporter | ||
Comment 2•24 years ago
|
||
nsLineBox needs to be improved. The implementation is a study in inefficiency.
Reporter | ||
Comment 3•24 years ago
|
||
I can hack around the frame problem, but nsLineBox has to be patched up. Its implementation is ridiculous.
Comment 4•24 years ago
|
||
nsLineBox has to be rewritten. In fact, how about nsBlockFrame too while we are at it? Considering where these live in the food-chain, I think we would do well to get them as fast and clean as possible. Chris, I like your idea of creating a lastChild pointer when the child list gets long (I was thinking much longer than 16, but it could be tuned). Isn't accessing frame properties pretty slow though? I know accessing the View for a frame is via a frame property and it shows up occassionally on profiles or really large documents, which was initially surprising to me. Also, would the lastChild property have to be per-child-list, or would it just be for the default child list?
Comment 5•24 years ago
|
||
(Tangent Alert.) Accessing frame properties should be O(1), but it isn't, thanks to troy's baloney-headed nsDST. I think (but haven't proven) that although intended to be O(log n), it degenerates because of the way it uses the pointer bits. (Using pointer bits to generate a tree is silly. If you do it left-to-right, you're screwed, because the high bits are likely to all be the same: locality. If you go right-to-left, you're screwed, because the lowest bits are all going to be very similar: alignment.) Anyway, I've been looking for an excuse to replace it with pldhash. Point me to a profile!
Reporter | ||
Comment 6•24 years ago
|
||
I can show you plenty of profiles where GetPrimaryFrameFor shows up as a significant amount. It's obviously *not* O(1) because of the dumbass desire to not use a hash table.
Comment 7•24 years ago
|
||
You correct, we did hash through this before, a few times. I pointed this out in August and in January. Both times, all involved thought that the extra bytes were not worth the O(n^2) (I have the email logs, in case anyone wants the info from that heated debate). I disagreed then, and am I glad to see that people are finally seeing the light (even if I am not credited with the find anymore). Go for it, fix this problem (there are a few O(N^2) pieces of code, all of which will fall at some point, I hope).
Comment 8•24 years ago
|
||
hyatt: recall that GetPrimaryFrameFor() often needs to search for frames. Not all frames are hashed (er, dst-ed) from the get-go. It would be interesting to see how much of GetPrimiaryFrameFor() is spent in nsDST::Search() (or whatever).
Assignee | ||
Comment 9•24 years ago
|
||
I dunno about redoing nsLineBox, but I am right now working on making frame properties much cheaper (in time and space), and then making nsLineBox use a frame property to track its lastChild/lastLine. Hyatt: feel free to assign this to me if you want.
Comment 10•24 years ago
|
||
[Just want to re-iterate that if rewriting ns[Block/Inline]Frame is in the air anytime, a fundamental change that I suggest to incorporate is bug 64763 so as to get rid of the cruff within the PerFrameData stuff.]
Assignee | ||
Comment 11•24 years ago
|
||
Waterson and I are no longer certain that this will be a space win, but I have a patch anyway. I'm going to Do The Math on this stuff shortly to see what our real-world frame-property memory usage is with the current DST stuff and compare to a pldhash implementation. Waterson: as far as the effects of alignment on the bit patterns, it looks like the DST code tries to skip the lower few bits of the pointer.
Assignee | ||
Comment 12•24 years ago
|
||
Assignee | ||
Comment 13•24 years ago
|
||
<hyatt> take it off my list! OK, Dave. The patch that I'm about to attach WORKSFORME, though I haven't run the layout block-frame regressions, and I haven't run any profiles yet. Waterson, honey, can you profile for me?
Assignee: hyatt → shaver
Status: ASSIGNED → NEW
Assignee | ||
Comment 14•24 years ago
|
||
Assignee | ||
Comment 15•24 years ago
|
||
Oh yeah: I looked at the frame properties stuff, and we usually don't get more than a few dozen properties. We _do_ often ask for the same (frame,prop) pair repeatedly in succession, sometimes more than 900 times (!), so I have a little patch that adds a one-entry quick cache to avoid consulting the DST for those cases. I'm going to clean it up, and make PropertyListFor move the found list to the head, and then I'll post it here. It won't change the world, but every bit counts, right?
Status: NEW → ASSIGNED
Target Milestone: mozilla0.9.1 → mozilla0.9
Assignee | ||
Comment 16•24 years ago
|
||
Marc: actually ViewProperty was one of the hot spots here, so my little patch could help those cases a fair bit. Do you have test cases we should use for before/after profiling?
Assignee | ||
Comment 17•24 years ago
|
||
Assignee | ||
Comment 18•24 years ago
|
||
The latest patch is for the 1-entry cache I mentioned, which is only tangentally related to this bug. If people want, I can break it out into another. Marc: if you're running profiles on those big pages again, do you want to see if this helps take GetFrameProperty out of the picture? I was tempted to make the FrameHashTable stuff use dhash, but I think it's better to wait until brendan's done with bug 72722 and just convert to nsHashtable there. Comments and review solicited for my latest two patches. (I haven't Done The Math on memory usage for dhash vs. dst for frame properties yet, so I'm not ready to put the first one in.)
Assignee | ||
Comment 19•24 years ago
|
||
Assignee | ||
Comment 20•24 years ago
|
||
The first block-frame patch didn't invalidate the cache when we removed frames, which caused quite a bit of trouble. This one should work much better (I was editing documents with it for a while before I destroyed my tree).
Comment 21•24 years ago
|
||
Looks pretty good. - You add two words to each block frame. It'd be better if we could conditionally cache this when we knew it'd be a big win (i.e., block frame has many lines). (That's my biggest complaint.) - Not sure why you needed to add GetLastChild(). Couldn't you have just used the extant LastChild() method? (Also means less damage to nsBlockFrame.h) - In this part: + if (appending) { + prevSibLine = mCachedLastLine; it's not immediately obvious that mCachedLastLine is guaranteed to be valid because of your previous call to GetLastChild() to compute the boolean ``appending''. You should put a bright red comment here. - Rename GetLastLine() to ComputeLastLineFrom() or something. Heck you only call it from one place. Why does it need to be a function at all? Instead of... + // make sure we're (still) tracking the last line + nsLineBox *line; + if (mCachedLastLine) + line = mCachedLastLine; + else + line = mLines; + line = GetLastLine(line); how about: nsLineBox line = mCachedLastLine ? mCachedLastLine : mLines; while (line->mNext) line = line->mNext; - Local custom... . names pointers as |nsIFoo* foo|, not |nsIFoo *foo| . uses |nsnull| for null pointers, as opposed to bare zero . is not sparse on vertical whitespace (e.g., ``We check here'')
Assignee | ||
Comment 22•24 years ago
|
||
GetLastChild and LastLineFrom are both remnants of earlier revs, with much more debugging and validation cruft in them. They can (and should) go. I'll comment and morph to dominant code style (bad me!). After I do that (and post another patch), I'll look at using a frame property if someone can tell me how big ``big enough''is, and whether we just count lines or lines+children-in-last-line, etc.
Comment 23•24 years ago
|
||
Well, the problem hasn't been noticed except in editor and in a DOM case that edward cooked up, that inserted/appended one node at a time. So, that makes me believe that mainline layout is not typically suffering here, and is why I'm being stingy. :-) Could you mangle your existing patch you could play with different values of `n', and empirically determine ``when you start to notice the evil behavior in editor''? If I had to pick a number out of my ass, I'd say something like 32 or 64 should be a reasonable threshold at which to start caching the last child.
Comment 24•24 years ago
|
||
...and I think we should just count lines. You can count the lines in nsBlockFrame::ReflowDirtyLines(), and when the number of lines exceeds `n', set a new frame state bit (maybe NS_BLOCK_HAS_CACHED_LAST_CHILD) that tells your caching code to keep the properties up-to-date.
Assignee | ||
Comment 25•24 years ago
|
||
OK, that's a good tip. I'll whack up some environment setting that gets used to set `n' and you can try different values with your choice of test sets!
Assignee | ||
Comment 26•24 years ago
|
||
So, as I work on making all of this stuff tunable, I start to wonder if counting lines is the right approach at all: aren't we more concerned with the access model (lots of appends causing lots of LastChild calls, and therefore lots of list-walking) than the number of lines? If we just do a few big AppendFrames calls, we can put together a large document without too much pain, right? (I also wonder if perhaps we shouldn't just be inverting the list and storing the _tail_, since appending would seem to be a much more frequent operation than prepending. But I haven't thought much about that yet.) I'll post a patch in an hour or so that only caches the last line -- caching the last child would cost us an additional two words per caching-enabled block frame, and waterson says that lines are usually pretty short anyway -- and we can discuss better heuristics.
Assignee | ||
Comment 27•24 years ago
|
||
Assignee | ||
Comment 28•24 years ago
|
||
Best patch yet! Set BLOCK_FRAME_LINE_CACHE_THRESHOLD in your environment to tune the point at which we switch to using a frame property (for last line, not last child -- I now see that I should rename the frame-state flag). Seth, Edward, do you want to play with settings here and see what effect they have on your specific test cases?
Comment 29•24 years ago
|
||
collision: http://lxr.mozilla.org/seamonkey/source/layout/html/base/src/nsHTMLParts.h#44 +#define NS_BLOCK_HAS_CACHED_LAST_CHILD 0x00100000 #define NS_BLOCK_SHRINK_WRAP 0x00100000
Comment 30•24 years ago
|
||
Seems to me to be a rather large patch for what I thought to be a simple change. I guess I do not understand why we just do not add the 4 bytes for a tail pointer and be done with it. The amount of code needed for this patch and the time overhead for the extra calculations seem to more than outweigh adding 4 bytes. Please correct me if I am wrong, for I just want to know why? Thanks.
Comment 31•24 years ago
|
||
Well, the patch at 04/10/01 07:21 isn't *that* much larger than the patch from 04/07/01 20:47 (which just keeps a `last' pointer, all the time). And, it's probably a bit larger simply because shaver added some stuff to help pick a sweet spot. In defense of not doing it all the time (since I lobbied for it), a large web page could have tens or even hundreds of thousands of block frames. So, keeping block frames as small as possible is important. We don't run in to this performance problem in the ``normal'' course of layout. So why pay the penalty in space unless there's a win in time?
Comment 32•24 years ago
|
||
I like it, I like it. Nits below. +static int CachedLineThreshold = 0; + gCachedLineThreshold? +#ifdef DEBUG + fprintf(stderr, "BLOCK_FRAME_LINE_CACHE_THRESHOLD is %d\n", + CachedLineThreshold); +#endif All kipps other debug printf()'s just go to stdout. (Or is this code just in there until we figure out what the ``sweet spot'' is?) +inline void +nsBlockFrame::SetCachedLastLine(nsIPresShell *aPresShell, + nsIFrameManager *aFrameManager, + nsLineBox *aLine) Why inline? + // If we fail to set it, clear the cached flag, because something + // is screwy in the cache and we'd rather be slow than crash. + mState &= ~NS_BLOCK_HAS_CACHED_LAST_CHILD; Is this really necessary or even sufficient? Seems like LastLine() can recover gracefully without this. + if (!mLines) + return nsnull; + + // If we don't have a cache, we walk lines + if (!(mState & NS_BLOCK_HAS_CACHED_LAST_CHILD)) + return nsLineBox::LastLine(mLines); + + nsCOMPtr<nsIFrameManager> frameManager; + aPresShell->GetFrameManager(getter_AddRefs(frameManager)); + if (!frameManager) + return nsLineBox::LastLine(mLines); + + nsLineBox* cachedLine; + if (NS_FAILED(frameManager->GetFrameProperty(this, + nsLayoutAtoms::cachedLastLine, + 0, (void **)&cachedLine))) { + return nsLineBox::LastLine(mLines); Flow of control a bit muddy here. Why not just do nested if()? (It'll only end up being three deep, and we've got a c-basic-offset of `2'...) + // the call to LastChild will ensure that the line cache is up to date + PRBool appending = (mState & NS_BLOCK_HAS_CACHED_LAST_CHILD && + aPrevSibling == LastChild(presShell)); + Does bitwise-and have precedence over logical-and? I always forget. Parens? + * Frame state flag indicating that this frame is using a frame property to + * cache its last line and child. See LastChild below and in nsBlockFrame.cpp + * for details. + */ +#define NS_BLOCK_HAS_CACHED_LAST_CHILD 0x00100000 + +/* what rbs said; this should really be in nsHTMLParts.h (it's not obvious to me *why*, but that's where the other flags are). + void InvalidateLastLineCache(nsIPresShell* aPresShell) { + if (!(mState & NS_BLOCK_HAS_CACHED_LAST_CHILD)) + return; + nsCOMPtr<nsIFrameManager> frameManager; + aPresShell->GetFrameManager(getter_AddRefs(frameManager)); + if (frameManager) + frameManager->SetFrameProperty(this, nsLayoutAtoms::cachedLastLine, + nsnull, nsnull); + } Does nested if make flow-of-control simpler?
Comment 33•24 years ago
|
||
Also, when you check in, should we just guess at a sweet spot for [g]CachedLineThreshold? Say, 64 to start?
Reporter | ||
Comment 34•24 years ago
|
||
This seems overly complex to me. Wouldn't it be simpler to not worry about the threshold stuff?
Comment 35•24 years ago
|
||
For the reasons I cite above, I will accept the *very marginal* increase in code complexity.
Assignee | ||
Comment 36•24 years ago
|
||
gCachedLineThreshold it is. Of course it's only there until we find the sweet spot! Who would leave debug spew like that in for any length of time? =) Dunno why inline. Not inline now! LastLine _can't_ recover from the case where it has a line cached that is no longer part of the block frame at all, and if I don't know what the state is in the DST, I don't want to trust it. Maybe that can't really happen, but I'm treading very lightly here, and I had a spare belt that matched my suspenders. I'll wield the mighty Nested If, yes. And the mighty parens, just for you! When (if?) I check in, I'm leaning towards leaving gCachedLineThreshold unset, but only because I don't understand where the number ``64'' comes from. Here's a question I want to ask of you: what %age space-waste is acceptable for this? Then I can set gCachedLineThreshold such that the frame property is < n% of the BlockFrame + its LineBox children. 3%? 5%? 10%? (Don't cheat and work backwards from 64!) New patch coming right up.
Assignee | ||
Comment 37•24 years ago
|
||
Assignee | ||
Comment 38•24 years ago
|
||
Anyone interested in this patch? Anyone dislike it? Mozilla 0.9 is coming soon, and I want to know if I should just walk away from this or spend precious end-of-release cycles making it appropriate for checkin.
Comment 39•24 years ago
|
||
Looked at the patch and it is good to go. Will give r=rbs so that waterson can give sr for you to checkin. Some nits: ===== +static int gCachedLineThreshold = 0; Since the whole point of the patch is to improve the perceived perf, maybe a non-zero value should be used to start with. It is not going to benefit the end-user if 0 is kept (the env variable is undiscoverable). In the meantime, the hunt for the optimal value could continue. ========== + // XXX Use cached last-line data to speed up the line and other checks + // XXX below, instead of just invalidating + InvalidateLastLineCache(presShell); Since 'XXX' more often than not implies: a todo, an uncertainty or doubt, a problem with an edge-case, I will remove the 'XXX' and just say 'This uses the...'
Assignee | ||
Comment 40•24 years ago
|
||
If you can tell me a good number (or, better, answer the acceptable-bloat-%age question I posed to waterson), I'll gleefully set gCachedLineThreshold to it. In the meantime, I was hoping to tell .performance about it and get feedback from people like varada and stephend about the effects of various settings, before we make everyone pay the bloat cost. The XXX does, in fact, indicate a todo: we should use the cached line data to speed up those other checks, rather than simply invalidating the cache there (as we do on the line below the comment). Want to r= it anyway, or wait until we settle the gCachedLineTheshold debate?
Comment 41•24 years ago
|
||
r=rbs. I don't see much that can be reliably added to this in this time-frame. re:threshold, what about trying 64 to start-with as waterson suggested. Any value >> 1 is okay to me as a start. As an additional comment as to why it is better in the long run to add these few lines of code so as not to add the 4 bytes all the time. The block frame is pervasive and is also used elsewhere as a wrapper frame (e.g., in table) to provide functionalities not supported by other frames. Quite often, a block frame can contain lineboxes with a block frame as sole children. That contributes to the thousands of block frames that could be around in a page.
Assignee | ||
Comment 42•24 years ago
|
||
This 64 number seems really popular. Are there data out there about the line-count for nsBlockFrames in general? Do people just like powers of two? I'd rather not just guess a number and check it in, because, as we saw with the reflow interval stuff, it can stay that way for an unpleasantly long time.
Comment 43•24 years ago
|
||
The main mistake that was made with the interval stuff is that no non-futured bug was filed in bugzilla. In order not to repeat the same mistake, a bug should be entered this time around. As I was writting these comments, I thought of filing that before completing my comments, but this code is yet to be checked-in. Also since a shake-up will do most good to layout, this code may perhaps not appear at some later stage in its present form.
Comment 44•24 years ago
|
||
The one-entry cache in the frame manager that is saving from the +900 successive requests of the same property is good stuff. The uncertainty about the threshold is blurring that that one is nice.
Comment 45•24 years ago
|
||
I just debugged an issue where frame-state flags needed to be carried forward into a continuing frame (bug 74184). Let me convince myself that this won't cause problems if we need to create a continuing frame for the block. (I think this'd only happen while printing, which might make it doubly fun to debug!) As far as picking a ``good value'', I've tried to think of some way to statically analyze the problem (to impress brendan, of course), but I can't. So, I think that the way to pick a good value is to pick the largest value that doesn't empirically create a ``noticable delay'' in the test case that's described above on a low-end machine. I've got a crappy machine at work (133MHz Win98) that I'll apply the patch on and try to cough up the juice.
Comment 46•24 years ago
|
||
I tryed that patch on linux and i get crash in large page: http://www.student.oulu.fi/stats/Jan.wwwstats.html Stacktrace is like this: #0 0x41ac11e3 in nsBlockFrame::LastLine(nsIPresShell*) (this=0x91bc744, aPresShell=0x87f1200) at ../../../../../layout/html/base/src/nsBlockFrame.cpp:5351 #1 0x41c60db3 in nsBlockFrame::LastChild(nsIPresShell*) (this=0x91bc744, aPresShell=0x87f1200) at ../../../../../layout/html/base/src/nsBlockFrame.cpp:5373 #2 0x41ac130e in nsBlockFrame::AppendFrames(nsIPresContext*, nsIPresShell&, nsIAtom*, nsIFrame*) (this=0x91bc744, aPresContext=0x86f0bd0, aPresShell=@0x87f1200, aListName=0x0, aFrameList=0x947d338) at ../../../../../layout/html/base/src/nsBlockFrame.cpp:5400 #3 0x41adae6e in FrameManager::AppendFrames(nsIPresContext*, nsIPresShell&, nsIFrame*, nsIAtom*, nsIFrame*) ( this=0x86a6398, aPresContext=0x86f0bd0, aPresShell=@0x87f1200, aParentFrame=0x91bc744, aListName=0x0, aFrameList=0x947d338) at ../../../../../layout/html/base/src/nsFrameManager.cpp:777 #4 0x41b91af7 in nsCSSFrameConstructor::AppendFrames(nsIPresContext*, nsIPresShell*, nsIFrameManager*, nsIContent*, nsIFrame*, nsIFrame*) (this=0xbfffe440, aPresContext=0x86f0bd0, aPresShell=0x87f1200, aFrameManager=0x86a6398, aContainer=0x872eab8, aParentFrame=0x91bc744, aFrameList=0x947d338) at ../../../../../layout/html/style/src/nsCSSFrameConstructor.cpp:7596 #5 0x41b93d86 in nsCSSFrameConstructor::ContentAppended(nsIPresContext*, nsIContent*, int) (this=0x8753310, aPresContext=0x86f0bd0, aContainer=0x872eab8, aNewIndexInContainer=57) at ../../../../../layout/html/style/src/nsCSSFrameConstructor.cpp:8142 code at crash is: 5349 // ...then we can use the cache! 5350 nsLineBox* line = cachedLine; 5351 while (line->mNext) <-- crash here 5352 line = line->mNext;
Comment 47•24 years ago
|
||
Weird. In theory, you shouldn't get there at all because gCachedLineThreshold=0. Did you set a non-zero value in the env variable? If so, which value?
Comment 48•24 years ago
|
||
I set BLOCK_FRAME_LINE_CACHE_THRESHOLD=64 when it crashes, didnt try other values.
Comment 49•24 years ago
|
||
retracting my r pending further investigations.
Assignee | ||
Comment 50•24 years ago
|
||
Dammit. This isn't for 0.9, obviously. I'll investigate that crash when I'm finished being sick.
Target Milestone: mozilla0.9 → mozilla0.9.1
Assignee | ||
Comment 51•24 years ago
|
||
Assignee | ||
Comment 52•24 years ago
|
||
So, here's what was happening: when we remove frames, we invalidate the cache, by storing a null value in the property. I guess I expected that a subsequent retrieval would fail, but that's not the case (I think it was with my dhash-based frame property code, which overloaded null). Much better now. I don't crash on Tomi's page (though it _is_ still heart-breakingly slow :-((((((((( ).
Assignee | ||
Comment 53•24 years ago
|
||
(On my happier days, I dream that someone who really cares about this, like perhaps the mail-compose performance guys, will test it and tell me if I've been wasting my time and everyone else's.)
Assignee | ||
Comment 54•24 years ago
|
||
Assignee | ||
Comment 55•24 years ago
|
||
Even on the "full patch", I am mortified by my own incompetence. I managed to get a bit of hyatt's painting-delay patch in there, and also to forget the nsLayoutAtomList.h change. I'm going to go soak my head now, after I post another patch. Special apologies to stephend and blake, who I have thwarted repeatedly in their attempts to help me profile.
Assignee | ||
Comment 56•24 years ago
|
||
Comment 57•24 years ago
|
||
let's get some of those mail compose guys you dream about cc'd on this bug. It looks like stephend is helping out with testing perf, but maybe JF or varada can also comment on whether or not this speeds up reply.
I'm going to Quantify this as soon as Rational comes through with my keys.
Comment 59•24 years ago
|
||
Wondering if there is any connection with this bug an bug 29584: Exponential time to open text files in the Editor.
Comment 60•24 years ago
|
||
Is profiling really necessary here? Can't someone just apply the darn patch and see if it makes any difference on the test case described in bug 22486?
The two output logs I attached showed the results of being unpatched and patched, but I haven't yet enabled the pref. I'll test BLOCK_FRAME_LINE_CACHE_THRESHOLD values of 0, 1, 16, 64 and 128, and post logs for each run.
My machine used to generate this data is a Pentium III 750 with 128 megs of RAM, running Windows 2000 SP1 with 1 GIG of virtual memory. The build was a debug trunk build pulled at around noon.
Comment 70•24 years ago
|
||
Cool, thanks for doing this Stephen. To tabularize this inline: n Total Insertion 0 (off) 8734 681 1 8164 678 16 7939 678 64 8253 683 128 8679 689 I presume that the times are given in msec? How many lines did the message that you replied to have? I was under the impression that insertion time contributed much more significantly to the overall time to open the window than is indicated by your data. (It looks like the delta that this makes to insertion time is in the noise, frankly. But maybe you tested with a small message?)
Oh, stone me senseless. I knew this was for message compose, but didn't realize this was intended for replying to messages. I thought this was just for initializing the editor widget. I'm also assuming that this is milliseconds, as http://www.mozilla.org/mailnews/performance/msgcompose_loggin.html doesn't nail that down. What I tested with was actually just a blank document, with a 2.43 meg attachment. Chris/Shaver, what size of messages should I be *replying* to to test this properly? Let me know, I'll run the tests again.
Comment 72•24 years ago
|
||
> The build was a debug trunk build pulled at around noon.
Wouldn't it be better to test this way in an optimized build?
John is absolutely right, same procedure as with Quantify, etc, (setting moz_debug to be null). So, with the patches in my tree, the right variable unset, and ready to rock and roll on some more testing, can I receive some scenarios to help me get this tested properly in anticipation of Mike landing his patch and helping out mail/news? Thanks.
Comment 74•24 years ago
|
||
If I understand that code correctly, the code-section being exercised is when a _long_ content needs to be broken in several lines. Mail&News provide a good example for this because the message text has to be broken in several lines in the content window. But in addition to testing Mail&News, another really good example is "viewsource". This is because all of the viewsource content is inside a <pre>...</pre> block. Hence some interesting data can also be obtained by viewing source of http://lxr.mozilla.org/seamonkey/source/layout/html/style/src/nsCSSFrameConstructor.cpp. But this might be an extreme case and other cases could be found. As far as testing is concerned, viewsource might be easier to deal with.
Comment 75•24 years ago
|
||
To test this, I created a 6,000 line plain-text message and mailed it to myself. Then, I tried to reply-to this message. I ended up with about 31s to reply to the message without the patch, and 31s to reply to the message with the patch, and with BLOCK_FRAME_LINE_CACHE_THRESHOLD=16. I also tried plain clipboard insertion, and that didn't seem to show any difference either. Maybe something in editor has changed? Anyway, I'll re-profile this and report back soon.
Comment 76•24 years ago
|
||
Ok, so I don't have a ton of time to do a detailed analysis of the profile, but it looks like nsBlockFrame::AppendFrames() is called _once_, and accounts for a tiny fraction of the time. nsBlockFrame::InsertFrames() is called 11K times, and accounts for 14% of the time, which this patch won't help (I don't think). (I'm profiling clipboard insertion, too...) There seems to be quite a bit of low-hanging fruit here; I'll attach a text version of the profile to the original bug (bug 22486).
Assignee | ||
Comment 77•23 years ago
|
||
This patch has rotted, and I'm sick of chasing the conflicts in my tree, so I hereby declare it dormant. If someone thinks it's worthwhile (and I'm pretty sure it is for large pages, but hey), they can update it and live the adventure. Untargetting and updating summary to keep mail-perf folks from suffering under false hopes. (Someone may want to remove the dependency, but I'm not going to ass-u-me.) Thanks for your time, everyone.
OS: Windows 2000 → All
Priority: -- → P5
Hardware: PC → All
Summary: 10% of reply time spent in O(n^2) line walking algorithm in nsLineBox → Optimizations possible in AppendLines
Target Milestone: mozilla0.9.1 → ---
Comment 78•23 years ago
|
||
mjudge and I were looking at some selection stuff yesterday (bug 62971), and stumbled across this thing that kipp allegedly hacked up days before his departure. It's a ``line iterator''. Here's how it works: it walks the line list once to count the lines, then allocates an array big enough to hold pointers to all of them, then walks the line list again to put the pointers into the array. Now the lines can be accessed like an array. Which made me wonder...why aren't the lines just stored in an array? That would solve this problem (getting to the last line quickly). And it would also solve some of our hit-test headaches (e.g., in selection), since each line knows its bounds, lines don't overlap, and each line's y-coordinate increases monotonically. (I think, somebody call my bluff if this isn't the case.) I think that there's very little code (if any) that inserts lines ``in the middle'' of the line array -- and that which does ends up throwing away all the subsequent lines anyway IIRC. Anyway, something else to think about.
Comment 79•23 years ago
|
||
I did some profiling with this, with long html page time spend in nsBlockFrame::AddFrames() drop from 1.9sec to 0.02 (5.7%->0.1% total time) I'll attach interesting parts from gprof output.
Comment 80•23 years ago
|
||
Comment 81•23 years ago
|
||
The work on bug 86947 should also fix the nsBlockFrame::AddFrames problem, since it makes the line list doubly-linked (for other reasons) and searches the list from the end in the AddFrames case.
Comment 83•23 years ago
|
||
so is this fixed?
Comment 84•23 years ago
|
||
remove self
Comment 85•23 years ago
|
||
What's the current status of this one?
Comment 86•22 years ago
|
||
Well, dbaron fixed the block frame; however, there are other places where it'd be nice to have a last-child member. For example, see bug 145425.
Assignee | ||
Updated•20 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → WORKSFORME
Updated•6 years ago
|
Product: Core → Core Graveyard
Updated•6 years ago
|
Component: Layout: HTML Frames → Layout: Images
Product: Core Graveyard → Core
You need to log in
before you can comment on or make changes to this bug.
Description
•