28783 - [PERF] Tracking: Performance improvements plaintext paste and insert

Reporter

Description

•

25 years ago

On a debug windows PII-450 MHZ machine with 256 MB of RAM, it takes 3 minutes to insert 130K of text into the compose window. That is, I'm replying to a 100K text message and it's taking about that long before I see the body inserted into the editor instance. I've done a lot of analysis on where the hotspots were for this operation and one of the big ones was here in WillInsertText. I sent a more detailed message to Joe and some others but here's the gist of where we can kill some low hanging fruit: 1) The method is taking an nsString which is over 130K (corresponding to the message body) as an input argument and it isn't passing the nsString by reference. This causes an extra allocation of the string. 2) This method extracts one line at a time, copying that line into a new nsString before calling DoTextInsertion (which appears to be an expensive operation). A couple things we do here, searching for the next line is actually expensive using nsString::FindCharInSet on a string as large as this one. Can DoTextInsertion take larger chunks than one line at time? Also, I replaced the search code with FindChar which is a little faster and this helped a bit. Also, we don't need to copy each extracted line into a new nsString and passing that string to DoTextInsertion. Instead, we could pass an index into the original string into DoTextInsertion. 3) We make a couple copies of the input String in this method. At 130K per copy they are expensive. I wonder if we can't somehow re-examine if they are all necessary. Thanks!

Scott MacGregor

Reporter

Comment 1

•

25 years ago

Nominating for beta1, adding perf to the keyword.

Keywords: beta1, perf

Scott MacGregor

Reporter

Comment 2

•

25 years ago

Just for analysis sake (I'm definetly not saying this is the right thing to do). I modified nsTextEditRules::WillInsertText to insert the entire message body in one step instead of trying to find a line, copy that line out into a new nsString then insert that string into the document and repeat. Time BEFORE: 3-4 minutes Time AFTER: 20 seconds The remaining 20 seconds is time spent in nsHTMLToTXTSinkStream::Write searching for new line characters in this huge nsString. This gives us an idea of how much of a performance win we are looking at in here. Keep in mind that my numbers for a high end machine too.

Kathleen :Brade

Comment 3

•

25 years ago

m15; cc sfraser

Target Milestone: M15

Scott MacGregor

Reporter

Comment 4

•

25 years ago

Actually I'm nominating this for beta1. I'm not sure we can triage it to m15 which is after beta1 until PDT determines if they will rule it PDT+.

Joe Francis

Comment 5

•

25 years ago

*** Bug 26388 has been marked as a duplicate of this bug. ***

Joe Francis

Comment 6

•

25 years ago

acceptin bug

Status: NEW → ASSIGNED

daver

Updated

•

25 years ago

Whiteboard: [PDT+]

Joe Francis

Comment 7

•

25 years ago

*** Bug 19273 has been marked as a duplicate of this bug. ***

Joe Francis

Comment 8

•

25 years ago

pasting in my comments from email: Scott MacGregor wrote: > We've got a really bad performance problem [...] > 1) nsTextEditRules:: WillInsertText and DoTextInsertion are costing us > a lot. [...] Mailnews is giving the editor instance a nsString, > in my example this string was 130K. WillInsertText makes several > copies of the string . You correctly point out that this routine is slow. We can certainly speed it up, though if it's really taking 4 minutes for you then we may be doomed. > First, we should pass the nsString by reference in the method > signature for WillInsertText to reduce an extra copy of the string > when calling the method. Second, we immediately copy the entire string > into aOutString as part of the intialization work even though later we > may not end up using that copied value. Can we avoid that? I would like to get rid of the outString (I've wanted to do this before). In some cases outString simply makes no sense. But there are other cases where some old code depends on it. I'm hopeful that we can ditch it with a little work, though. > However I think the most expensive part of this method is the while > loop which determines how we call DoTextInsertion. [...] Ideally, > it'd be great if DoTextInsertion could handle chunks larger than just > one line a time. Any way you slice it, I have to break it up into runs of text separated by newlines. This is because I have to use <break> nodes instead of newlines, even when inside of preformatted text. The selection & caret code cant deal with blank lines caused by newlines in preformatted text, but it can deal with break nodes. At the bottom level, we have to insert a separate text node for each run of chars between breaks. > I'll file a bug so we can get the ball rolling on this part for beta1 > if we can. I can cut down on string copying for beta1. And we can tune the rest. But the basic problem (needing to scan through/break up huge text strings) remains. :-( If you can talk the layout folks into generating frames for blank lines caused by newlines in preformatted text, then we can avoid this problem.

Scott MacGregor

Reporter

Comment 9

•

25 years ago

>If you can talk the layout folks into generating frames for blank lines caused >by newlines in preformatted text, then we can avoid this problem. I agree completely. I've been running with that little hack in my tree where i just insert the entire string in WillInsertText instead of slicing line by line and it is so much faster. Although I did it as just a hack, I don't see any side effects! I must be missing something =) In any case I will definetly help push on the layout folks with you after beta1 so you can get out of the business of reading out line by line. Even if you have to insert one line at a time, you could get a pretty big performance boost by not copying each line into a new nsString. If you just passed a ptr into the string buffer and a length parameter into DoTextInsertion instead of a new nsString you'd save a lot. I briefly looked at doing this and it looks like it would involve changing several method signatures underneath DoTextInsertion to also take a PRUnichar * + a length instead of a nsString. But I didn't see that any of these methods were modifying the input nsString in any way so it may be as easy as just changing the signatures. This would help save a lot as then you are only bounded by: 1) the time it takes to search out the position of the next new line 2) the time it takes for the right frame creation and text insertion for each line of text processed by DoTextInsertion. Just a few thoughts...

kinmoz

Comment 10

•

25 years ago

Reassigning to kin@netscape.com. I'll take a look at rewriting some of the code to use nsSubsumeStr to see if that speeds some things up.

Assignee: jfrancis → kin

Status: ASSIGNED → NEW

leger

Comment 11

•

25 years ago

Must have a fix by 03/03 or will punt for beta1.

Whiteboard: [PDT+] → [PDT+] Need fix by 03/03

bijals (gone)

Comment 12

•

25 years ago

Is this performance an end user issue? If not, can we push this to post beta 1?

Scott MacGregor

Reporter

Comment 13

•

25 years ago

bijals: yes it is. Please read my comments at the top of the bug report where I report that inserting a 100K document into an ender widget takes over 4 minutes on a high end machine. This is very common when replying to large emails. I'd say this effects the end user =).

leger

•

25 years ago

I agree completely with the numbers you saw =). I've been running with a hacked patch in my tree where we always insert the entire nsString in one shot instead of extracting line by line. I know this is supposed to be wrong but I've been running with it in my tree all week and haven't seen anything bad happen. Granted I'm only testing mailnew compose and not Editing documents.

Simon Fraser [no longer active]

Comment 17

•

25 years ago

•

25 years ago

cc: troy, since I'm going to be mentioning GetPrimaryFrameFor

Status: NEW → ASSIGNED

troy

Comment 27

•

25 years ago

I don't want to hear any mention of GetPrimaryFrameFor(). It's exactly the same issue as for BR frames. We don't add text frames to the map by default, because it drastically increases the size of the map, and text nodes are anonymous so normal DOM usage doesn't involve doing a content->frame lookup If you want text nodes added to the map by default in editable documents, then go ahead

Simon Fraser [no longer active]

Comment 28

•

25 years ago

Here's a rundown of where I think we're spending time, after applying fixes for the performance problems described above. These data are for pasting a single line of text (text, then a <br>) into the editor (taken from the middle of pasting 100 lines). Times are microseconds. Two main top-level calls themselves make up almost 100% of the total pasting time: Num calls Time (us) -------------------------------------------------------------------------- nsHTMLEditor::InsertBreak 1 1690 nsEditor::InsertTextImpl (enclosing call only) 1 1862 -------------------------------------------------------------------------- Total 3352 Here are the (non-inclusive) calls that take significant time within the above calls (and together account for around 50% of the time) -------------------------------------------------------------------------- PresShell::ContentInserted (frame construction) 2 434 (12%) nsDOMSelection::Collapse 5 395 (11%) nsEditor::GetPriorNode ( ->IsEditable -> GetPrimaryFrameFor) 3 359 (10%) nsGenericDOMDataNode::ReplaceData ( -> GetPrimaryFrameFor) 1 373 (10%) InsertTextTxn::Init (a string dup) 1 203 (6%)

Simon Fraser [no longer active]

Comment 29

•

25 years ago

Here are those data non-mangled: Num calls Time (us) -------------------------------------------------------------------------- nsHTMLEditor::InsertBreak 1 1690 nsEditor::InsertTextImpl (enclosing call only) 1 1862 -------------------------------------------------------------------------- Total 3352 Here are the (non-inclusive) calls that take significant time within the above calls (and together account for around 50% of the time) -------------------------------------------------------------------------- PresShell::ContentInserted (frame construction) 2 434 (12%) nsDOMSelection::Collapse 5 395 (11%) nsEditor::GetPriorNode ( ->IsEditable -> GetPrimaryFrameFor) 3 359 (10%) nsGenericDOMDataNode::ReplaceData ( -> GetPrimaryFrameFor) 1 373 (10%) InsertTextTxn::Init (a string dup) 1 203 (6%)

Joe Francis

Comment 30

•

25 years ago

removing PDT+ for reconsideration. I recommend we go with M15 for this. We've identified a lot of performance improvements, but there is no silver bullet. We need a lot of small and medium sized changes. I'd feel better about running with this code throughout M15 + M16 and having it be in beta2.

Whiteboard: [PDT+] w/b minus on 03/03

Joe Francis

Comment 31

•

25 years ago

i'm expirementing with a rewrite of much of the insert text codepath, designed to avoid the pitfalls that we have discovered with the current code. Right now my first quick+dirty cut at the rewrite takes around the same amount of time as "fast" version of the old code (ie, the old code with all the improvements we have discovered). I'm trying to write the new version in such a manner as to avoid triggering expensive GetPrimaryFrameFor() calls and also avoid any selection related calls until the paste is completely finished.

kinmoz

Comment 32

•

25 years ago

I know Joe is rewriting the editor code to not call collapse so much ... but I went ahead and modified nsDOMSelection so that it caches the iterators used in selectFrames(). This seems to shave about a second off the time it takes to paste the 500 and 2250 line case. I'll attatch the updated diffs for nsRangeList.cpp.

kinmoz

Comment 33

•

25 years ago

Attached patch Updated nsRangeList.cpp diff with cached iterators. (deleted) — Details — Splinter Review

Phil Peterson

Comment 34

•

25 years ago

PDT-

Whiteboard: [PDT-]

Joe Francis

Comment 35

•

25 years ago

reassigning to self

Assignee: sfraser → jfrancis

Status: ASSIGNED → NEW

rubydoo123

Comment 36

•

25 years ago

moving to M17

Target Milestone: M15 → M17

Joe Francis

Comment 37

•

25 years ago

per my recent work and re-profiling, adding dependencies on 5 bugs.

Status: NEW → ASSIGNED

Depends on: 35292, 35293, 35294, 35295, 35296

rubydoo123

Comment 38

•

25 years ago

moving to future

Keywords: beta1

Whiteboard: [PDT-]

Target Milestone: M17 → Future

rubydoo123

Comment 39

•

25 years ago

moving this back to m19 since it blocks 29584 -- I didn't see the dependency earlier

Target Milestone: Future → M19

rubydoo123

Comment 40

•

24 years ago

35292 has been futured 35293 is futured (just now) 35294 is dependent on 43863 (which is futured) and 35294 is futured 35295 is nsbeta3+ 35296 is nsbeta3+ since over half of the dependency bugs are futured, moving this to future

Target Milestone: M19 → Future

kinmoz

Updated

•

24 years ago

Blocks: 53252

kinmoz

Comment 41

•

24 years ago

Removing dependencies on 35292,35293,35294,35295,35296 since they really aren't dependencies. Bug 53252 is now the tracking bug for all areas that need tuning.

No longer blocks: 53252

No longer depends on: 35292, 35293, 35294, 35295, 35296

kinmoz

Updated

•

24 years ago

Blocks: 53252

rubydoo123

Updated

•

24 years ago

Severity: normal → major

Keywords: rtm

Priority: P3 → P1

Whiteboard: [rtm+]

Target Milestone: Future → M19

rubydoo123

Comment 42

•

24 years ago

removed future now that a parent paste bug has been opened, paste performance must be fixed before rtm

rubydoo123

Updated

•

24 years ago

Whiteboard: [rtm+] → [rtm+][p:1]

Joe Francis

Comment 43

•

24 years ago

per beppe removing status whiteboard

Keywords: rtm

Whiteboard: [rtm+][p:1]

Target Milestone: M19 → Future

Simon Fraser [no longer active]

Comment 44

•

•

23 years ago

updating summary, milestone, dependencies

Depends on: 35292, 35293, 35294, 35295, 35296

Summary: Performance improvements for nsTextEditRules::WillInsertText → [PERF] Tracking: Performance improvements plaintext paste and insert

Target Milestone: mozilla1.0 → mozilla0.9.7

Joe Francis

Comment 48

•

23 years ago

awaiting 35294 work.

Joe Francis

•

23 years ago

a lot of work has been done here. We need to start from scratch on perf investigation since many of the hotspots have received attention. On fix that hasn't yet landed is at the front of the first attached patch: kin speeded things up by reordering the switch statement in the editor factory method.

Joe Francis

Comment 52

•

23 years ago

The trunk is the wave of the future!

Target Milestone: mozilla1.1alpha → mozilla1.1beta

Joe Francis

Comment 53

•

•

3 years ago

Closing this as resolved:incomplete, it has been a long time since the last update, and the reporter's account + other parties in this discussions are disabled so it's hard to get a status update on this. The used hardware and dependencies for this performance issue are also critically old and not relevant today.

Status: NEW → RESOLVED

Closed: 3 years ago

Resolution: --- → INCOMPLETE

Patch set containing fixes from kin, mjudge, jfrancis, and sfraser 25 years ago kinmoz (deleted), patch		Details \| Diff \| Splinter Review
Updated nsRangeList.cpp diff with cached iterators. 25 years ago kinmoz (deleted), patch		Details \| Diff \| Splinter Review