Closed
Bug 91437
Opened 23 years ago
Closed 23 years ago
Speedup of nsScanner::{ReadUntil,SkipWhitespace}
Categories
(Core :: DOM: HTML Parser, defect)
Tracking
()
VERIFIED
FIXED
mozilla0.9.4
People
(Reporter: bratell, Assigned: bratell)
Details
(Keywords: perf, Whiteboard: [fix in hand] awaiting review)
Attachments
(5 files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
Details | Diff | Splinter Review |
You thought that nsScanner::ReadUntil couldn't be faster? Well, you were wrong.
I hade an idea which I tested and it brought a slight (~10-20%) speedup when
scanning for short (~5 chars) strings. The patch will work better the longer the
string since it for 99% of the cases replaces the inner loop with a bit AND and
a != 0 check. In the worst case there will be the bitwise AND, the != 0 check
and the original loop. (I wonder if this magic should work itself into other
places too)
Patch coming when I'm back home with access to my mozilla tree.
Assignee | ||
Comment 1•23 years ago
|
||
In the scanner there was two functions were Mozilla spent time. That was
ReadUntil and SkipWhitespace. They were taking 0.5% to 2% of the page load time
for some pages I tested. With this patch that is down to 0.4% to 1% in my tests.
The main reason for that is that SkipWhitespace don't call SetPosition when it
hasn't skipped anything. That made SkipWhitespace about 70% faster. The rest is
from ReadUntil which:
1) Has one iterator copy less
2) Does the search faster by first checking against a bitmask. That makes the
inner loop faster.
3) Reuses the character from the Peek reducing iterator accesses by one.
harishd, what do you think about this?
Assignee: harishd → bratell
Keywords: patch
Summary: Speedup of nsScanner::ReadUntil → Speedup of nsScanner::{ReadUntil,SkipWhitespace}
Assignee | ||
Comment 2•23 years ago
|
||
Assignee | ||
Updated•23 years ago
|
Whiteboard: [fix in hand] awaiting review
Target Milestone: --- → mozilla0.9.3
Daniel: I'm currently working on another important issue...will review your
patch once I'm done with it. Thanx for the patch.
Status: NEW → ASSIGNED
Assignee | ||
Comment 4•23 years ago
|
||
The review won't be in time for this milestone. Moving to 0.9.4.
Target Milestone: mozilla0.9.3 → mozilla0.9.4
Assignee | ||
Comment 5•23 years ago
|
||
I hope you haven't started reading the patch, because here comes one even
faster. There is not much time left within ReadUntil with this one. The time in
ReadUntil is now to 80% spent in SetPosition and AppendUnicode which I can't do
much about.
The last patch moves the filter calculation to an external class,
nsReadEndCondition which is passed in to the ReadUntil functions. That allows
the callers to precalculate the filter (by instantiating the
nsReadEndCondition) and by storing the nsReadEndCondition in a static variable,
there is no cost (~100 ns for the whole application runtime) for that
calculation anymore.
The total benefit of this change is about 5% of the ReadUntil time. And the
total win for the patch is less than (or equal to) 1% of the load time, 10-15%
of the tokenizer time.
The patch also removes some unused code, and removes two ReadUntil methods that
are no longer used so I guess the code size might be reduced.
(~60 fewer code lines)
Assignee | ||
Comment 6•23 years ago
|
||
Assignee | ||
Comment 7•23 years ago
|
||
So the last patch contains a bug and I _will_ find it. Hang on.
Assignee | ||
Comment 8•23 years ago
|
||
Assignee | ||
Comment 9•23 years ago
|
||
Found it.
Not very good to initialize a static variable with an argument that varies for
every call.
Saw another bug too, that is removed by the patch:
- const nsDependentString theTerminals(aTerminalChars,
- sizeof(aTerminalChars)/sizeof(aTerminalChars[0]) - 1);
aTerminalChars is no array, it's a pointer, so only the first char in
aTerminalChars will be copied. Well, accept my patch and that bug is gone (if it
ever has affected anything :-) ).
Assignee | ||
Comment 10•23 years ago
|
||
So harishd, you are welcome to look at this now. I promise to not detect any
more defects. :-)
Comment 11•23 years ago
|
||
Daniel, your patch will conflict with my patch for bug 57724. I want to move
newline counting to scanner to avoid iterating over the string again in the
tokenizer. Thus I need versions of ReadUntil() and SkipWhitespace() counting
newlines (current counting in SkipWhitespace() is not good enough).
I've written functions basing on your patch which count newlines:
nsresult nsScanner::SkipWhitespace(PRInt32& aNewlineCount) {
if (!mSlidingBuffer) {
return kEOF;
}
nsresult result;
PRUnichar ch;
PRUnichar filter = PRUnichar(~0x2f);
while (1) {
if (!mCountRemaining) {
result = Eof();
if (NS_FAILED(result))
return result;
}
ch = *mCurrentPosition;
if (ch & filter)
// this cannot be a whitespace char
return NS_OK;
switch (ch) {
case kCR:
++aNewlineCount;
++mCurrentPosition;
--mCountRemaining;
if (!mCountRemaining) {
result = Eof();
if (NS_FAILED(result))
return result;
}
ch = *mCurrentPosition;
if (ch == kLF)
break;
continue;
case kLF:
++aNewlineCount;
case kSpace:
case kTab:
case kBackSpace:
break;
default:
return NS_OK;
}
++mCurrentPosition;
--mCountRemaining;
}
}
/**
* Consume characters until you encounter one contained in given
* input set. Counts newlines and does NOT stop reading on CR or LF,
* even if specified in the terminal set.
*/
nsresult nsScanner::ReadUntil(nsReadingIterator<PRUnichar>& aStart,
nsReadingIterator<PRUnichar>& aEnd,
const nsReadEndCondition &aEndCondition,
PRInt32& aNewlineCount,
PRBool addTerminal)
{
if (!mSlidingBuffer) {
return kEOF;
}
nsresult result = NS_OK;
PRUnichar ch;
PRUnichar filter = aEndCondition.mFilter & PRUnichar(~0x0f);
const PRUnichar* setstart = aEndCondition.mChars;
const PRUnichar* setcurrent;
aStart = mCurrentPosition;
while (1) {
if (!mCountRemaining) {
result = Eof();
if (NS_FAILED(result))
break;
}
ch = *mCurrentPosition;
// Filter out completely wrong characters
// Check if all bits are in the required area
if (!(ch & filter)) {
if (ch == kCR) {
++aNewlineCount;
++mCurrentPosition;
--mCountRemaining;
if (!mCountRemaining) {
result = Eof();
if (NS_FAILED(result))
goto done;
}
ch = *mCurrentPosition;
if (ch != kLF)
goto check;
}
else if (ch == kLF) {
++aNewlineCount;
}
else {
check:
// Check a character that resemble the right ones
for (setcurrent = setstart; *setcurrent; ++setcurrent) {
if (*setcurrent == ch) {
if (addTerminal) {
++mCurrentPosition;
--mCountRemaining;
}
goto done;
}
}
}
}
++mCurrentPosition;
--mCountRemaining;
}
done:
aEnd = mCurrentPosition;
return result;
}
Some Notes:
- Calling SetPosition() is not neccessary. It will iterate over the chars again
and we have just done that.
- Everywhere in nsScanner there seem to be bugs with Eof(). If Eof() fills the
buffer, we should not abort. I think the way I do it above would be the
correct way to handle Eof(), but in practice it does not make a difference
because mInputStream is always (?) null and thus Eof() returns kEOF.
- We'll not win that much by filtering chars with my changes to nsHTMLTokenizer
and nsHTMLTokens because ReadUntil() will typically be called with only 2
terminal chars ('<' and '&' for text tokens). But I think it's still an
improvement.
- The patch triggers 2 warnings on gcc:
operator=(const nsReadEndCondition& aOther); // No assigning
should return a value (e.g. void) and for
mFilter(~PRUnichar(0))
a conflict between signed and unsigned is reported (you can say
mFilter(PRUnichar(~0))
instead).
- I'll post a new patch to bug 57724 including a modified version of your patch
in case you want to test it (but note that it's still work in progress).
Assignee | ||
Comment 12•23 years ago
|
||
I think I believe I understand what you want to do.
But, do you think we can keep it as seperate issues? If we merge them it will be
even harder for harishd and company to review and understand what's going on. I
would like to see this relatively small patch in and then work from that point.
You could still build on it and your own patch might be a little smaller, or
what do you think?
(Comment to your changes meant for the other bug but I take it here:
0x0f: too magic, let the compiler do it for you '\n' & '\r'. )
I do the changes that removes the warnings though before the checkin (adding
void before operator= and moving the ~ into the cast.
Believe it or not, it is a win when searching for only two chars. The speed is
highly dependent on the size of the inner loop and with the filter that becomes
only about ~10 machine instructions. The only times we lose is when the char to
be found is within the first 1-3 chars or when the filter becomes 0x0 so that
nothing is filtered, but that doesn't happen.
Comment 13•23 years ago
|
||
I do not want to merge the bugs, but you should know that SkipWhitespace() and
ReadUntil() without line counting won't be called much after my changes. If you
want to offer functions with line counting in your patch, I'll use them, if not
I will include them in my patch. For now I had to include parts of your patch
in my patch because I already changed it to use nsReadEndCondition and that's
not in the tree yet.
If you don't take my suggestions (avoiding calls to SetPosition() and issues
with Eof()), I won't include them in bug 57724 either, but file another bug
later. I just noticed that while writing my patch and wanted to tell you here.
My comment saying there'd be only 2 terminal chars was wrong. Another 2 will be
added inside ReadUntil() (yes, it should be ~kCR & ~kLF, not ~0x0f).
Comment 14•23 years ago
|
||
Daniel: Scanned thro' your patch ( haven't done a complete reivew yet! ) and I
like your idea. Will update more on your patch soon.
Comment 15•23 years ago
|
||
daniel: why are numerals filtered out?
Ex.
<html>
<head>
<title>01234</title>
</head>
<body>
test
</body>
</html>
In the above test case 01234 will not be a part of mFilter.
Assignee | ||
Comment 16•23 years ago
|
||
They shouldn't be. Can you explain what happens?
Or do you mean that they matches mFilter when searching for \r, >, & or
something similar? That's just because the bit patterns happen to that way.
'0' has ASCII value 00110000 and '<' has 00111100 so the '0' "match" the '<'.
But the filter is only used to shortcut search so the loop after the filtering
will see that it's not right.
Now, digits are not that common in html pages so the extra (very small) cost for
them is outweight by the shortcut whenever a letter with ascii value greater
than 0x3f, i.e. A, B...a, b...
Comment 17•23 years ago
|
||
In ReadUntil() digits don't match the following condition:
if(theChar & aEndCondition.mFilter) {
...
}
and therefore end up in the while loop looking for a match in the terminal set.
>Now, digits are not that common in html pages so the extra (very small) cost
>for them is outweight by the shortcut whenever a letter with ascii value
>greater than 0x3f, i.e. A, B...a, b...
Agreed. So for digits the performance would be no better than what we currently
have...right?
Assignee | ||
Comment 18•23 years ago
|
||
Correct. There would be like 3-4 extra machine instructions* for a false match
in the filter compared to no filter at all. Though, I removed an unnecessary
iterator copy that was visible in the Quantify runs which may compensate for
that so it _might_ be faster even for false matches.
*) at least an AND, a CMP and a JEQ
Comment 19•23 years ago
|
||
Daniel: How much testing have you done and how confident are you that this will
not break anything? Please test throughly...with that r=harishd
Assignee | ||
Comment 20•23 years ago
|
||
I've been running with it for some time without seeing anything strange but is
hard to test everything. Do the parser have any regression tests?
I will check the changes thoroughly before checking it in.
jst, since you're listed as a super reviewer for the parser, can you take a
look?
Assignee | ||
Comment 21•23 years ago
|
||
Comment 22•23 years ago
|
||
Style nit:
switch(theChar) {
case '\n': mNewlinesSkipped++;
case ' ' :
@@ -555,23 +574,26 @@
default:
found=PR_FALSE;
break;
- }
+ }
The above indentation change makes the end brace not line up with the line that
starts the indentation block.
+ for (;current != mEndPosition; ++current, theChar = *current) {
For clarity, move theChar = *current into the loop since theChar is not really
part of the loop expression (this same pattern occires in a few places).
- if (current == end) {
- AppendUnicodeTo(origin, current, aString);
- return Eof();
- }
+ return result;
+
//DoErrTest(aString);
return result;
-}
+}
Eh, many return's at the end of a method (nsScanner::ReadUntil)?
With that fixed and a r= from Harish, sr=jst.
I believe Harish has some regression tests for the parser.
Assignee | ||
Comment 23•23 years ago
|
||
Assignee | ||
Comment 24•23 years ago
|
||
jst thanks for the look. I have incorporated your changes. I thought that I
could trick the compiler into doing a better optimization the way I did it but
it didn't work. When comparing the rough code:
for (; end_condition; statement1, statement2) {
if (check) continue;
<more code>
}
to the better looking code with
while (end_condition) {
if (!check) {
<more code>
}
statement1;
statement2;
}
the generated machine code was identical. Not optimal but identical.
I have now run the relevant parts of the smoke tests and haven't seen anything
strange anywhere.
Harish, can you take a new look and see if this is still (or even more) worthy
of your r= ?
Keywords: review
Comment 25•23 years ago
|
||
if (!skipped)
+ return result; // from Peek
instead why can't we have something like
+ PRUnichar theChar=0;
+ nsresult result=Peek(theChar);
+ NS_ENSURE_SUCCESS(result,result);
Other than that I'm ok with your changes. I did run my regression test suite (
very rudimentary! ) and it passed :-). If you could run browser buster
http://komodo.mozilla.org/buster/ I'd greatly appreciate it.
Assignee | ||
Comment 26•23 years ago
|
||
That would change the way things work. Right now, even if the Peek fails we
still scan for whitespace. I only added the comment "// from Peek" because I
thought it was difficult to see the origin of the return value.
Do you still want that change? If so I guess we should just return NS_OK if
there was nothing to skip.
Comment 27•23 years ago
|
||
If Peek() returns an error code then the while(condition) should also fail
wouldn't it?
Assignee | ||
Comment 28•23 years ago
|
||
Yes, I too think so, so I added the NS_ENSURE_SUCCESS(result, result) and return
NS_OK where result used to be returned.
jst is that change ok with you so that I can check it in?
Comment 29•23 years ago
|
||
Ok with me.
Assignee | ||
Comment 30•23 years ago
|
||
Fix is in. Thanks for your patience.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 31•23 years ago
|
||
Judging from the output in the tinderbox logs the improvement was 9.836%.
Comment 32•23 years ago
|
||
great job daniel. thanx.
Comment 33•23 years ago
|
||
daniel: could you please verify this.
Assignee | ||
Comment 34•23 years ago
|
||
Well, I can confirm that the code is still in the tree and that my measurements
at that time showed a total speedup of ~1% and a speedup of ~10% in the
tokenizer as measured by Quantify. In
http://bugzilla.mozilla.org/show_bug.cgi?id=91437#c31 I noticed that a tinderbox
showed a tokenizer improvement of the same size.
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•