Closed Bug 92709 Opened 23 years ago Closed 22 years ago

FindCharInSet 10x faster

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 75081

People

(Reporter: bratell, Assigned: bratell)

References

Details

(Keywords: perf)

Attachments

(3 files)

Well, there are improvements to make among the obsolete string classes. I know they are supposed to be replaced, but since I thought of an algorithm for FindCharInSet that seems to work well in the parser, I wanted to try it for ns[C]String. Before my patch Quantify reported 10569ms spent in FindCharInSet, afterwards 866ms (12.2 times faster). I hope you like the patch coming up. It cuts ~5% of the tokenizer time for the table stress case.
Attached patch Patch to FindCharInSet (deleted) — Splinter Review
If one wants to improve the patch even further, one could make the caller supply the filter, but even though that was simple with an help class in the tokenizer, it probably complicates the API too much for the string library. Now 35% of the remaining time is spent calculating the filter. scc, what do you think?
Keywords: perf, review
I noticed that Mozilla didn't use most of these functions at all so I took the liberty of removing them. What is the meaning of searching for a set of Unicode characters in a C string anyway? So left is ns[C]String::[R]FindCharInSet(char *) and nsString::[R]FindCharInSet(PRUnichar *) Gone is ns[C]String::[R]FindCharInSet(nsStr&), nsCString::[R]FindCharInSet(PRUnichar *) and nsStr::[R]FindCharInSet(...) The functions that are left are very similar so maybe could they be implemented someway smarter. I don't want to introduce function calls though. So, scc, can you review?
One more patch that doesn't hang mail. I had misunderstood the meaning of the offset parameter when searching backwards.
Assignee: scc → bratell
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9.4
Attached patch With working RFindCharInSet (deleted) — Splinter Review
OS: Windows 2000 → All
Hardware: PC → All
scc, could you comment on this. Is it worthy of a review?
No review yet so I have to move it to the next milestone.
Target Milestone: mozilla0.9.4 → mozilla0.9.5
I want this in, but have no time to push scc or jag or bryner or someone else for reviews so I will clear the milestone.
Target Milestone: mozilla0.9.5 → ---
jag, is this worth pursuing? I don't have the mastar plan for the Mozilla strings so I don't know if this will fit. It's a smart and fast fix even though I don't know how rotten it has become.
Daniel, I think it'll still be worth it, but cc'ing alecf who's been making some changes in this code too.
this is cool, though I've whacked the hell out of this code in my tree. Basically, I've change around some routines and broken the toplevel nsStr routines into FindCharInSet1in1 FindCharInSet1in2 FindCharInSet2in1 FindCharInSet1in2 as a part of bug 114450 Would you mind waiting until I land the patch there, and then reapply your changes? I haven't bothered to see which ns*String:: routines are unused in the tree, so after I land we can probably remove the same functions.
Depends on: 114450
ok, that bug has been fixed.. feel free to make a 2nd attempt, though you'll notice that we now have 4 implementation of each *FindCharInSet The good news is that 1) I'm probably getting rid of some of them (FindCharInSet2in1 is probably going away, for example) 2) case sensitivity is going away from all of the ones that involve PRUnichar, such as FindCharInSet2in1, FindCharInSet1in2, etc.. unless the particular code involves changing the case of the ascii string/character. so you may want to stay tuned to bug 107575 before you make another attempt.
Depends on: 107575
Now that bug 107575 is fixed I want to make another go on this. One problem though, the nsStr::(R)FindCharInSet routines have a aIgnoreCase parameter that I want to remove. It isn't used so it should not be a problem. Can I do that? jag?
if its really not used, please do..again, nsStrPrivate is private, so you can go tweaking that all you want, without breaking binary compatibility. (besides, nobody ever said nsString was frozen, so we can continue to change that API, though people hate it if it affects too many files :))
Go ahead an remove that parameter.
There's a patch for this attached in bug 75081.
should we just mark this one a dupe of the other, since the other covers this bug, and more?
I was waiting for the reviewer's objects but I can always reopen this if I would like to split it again. :-) *** This bug has been marked as a duplicate of 75081 ***
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → DUPLICATE
Component: String → XPCOM
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: