Closed Bug 744830 Opened 13 years ago Closed 12 years ago

Implement fast serializer for innerHTML/outerHTML

Categories

(Core :: DOM: Core & HTML, defect)

12 Branch
x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: smaug, Assigned: smaug)

References

(Depends on 1 open bug, )

Details

Attachments

(7 files, 4 obsolete files)

Currently we use the slow and generic serializer. We could do better.
I suggest doing the following: Don't have a heap-allocated serializer object. Instead, use a method with a signature like static nsresult Serialize(nsINode* aRoot, bool aDescendantsOnly, nsAString& aOut) where the return value signals OOM (fallible string appends should probably be used in order to avoid OOM crashes when a page gets the innerHTML of the root of a huge document), aRoot is the root of the subtree to serialize, aDescendantsOnly indicates whether tags for the aRoot node itself should be included and aOut is the resulting serialization. Don't recurse. Instead, use the iterative algorithm seen in https://hg.mozilla.org/projects/htmlparser/file/5b026546075d/src/nu/validator/htmlparser/dom/Dom2Sax.java#l57 . However, instead of calling out to other methods, just inline the stuff that's calls out to contentHandler or lexicalHandler in the Dom2Sax code above. When the Java code above is used for serialization, both contentHander and lexicalHandler point to an instance of https://hg.mozilla.org/projects/htmlparser/file/5b026546075d/src/nu/validator/htmlparser/sax/HtmlSerializer.java
Another option would be to change nsINode::GetNextNode() to make a callback (to an inline function template parameter) when stepping out of an element. Not sure whether that would be better than just redoing the iterative walk or not.....
(In reply to Boris Zbarsky (:bz) from comment #3) > Another option would be to change nsINode::GetNextNode() to make a callback > (to an inline function template parameter) when stepping out of an element. > Not sure whether that would be better than just redoing the iterative walk > or not..... To me, that seems more complicated than letting the tree walk loop appear in the serialization method itself. (I didn't suggest GetNextNode, because it's not nice for writing end tags.)
I'm playing around with implementing comment 2.
WIP.
Assignee: nobody → josh
Comment on attachment 616606 [details] [diff] [review] Implement fast serializer for innerHTML/outerHTML. Review of attachment 616606 [details] [diff] [review]: ----------------------------------------------------------------- Some nits... ::: content/html/content/src/nsGenericHTMLElement.cpp @@ +672,5 @@ > + aBuf.BeginReading(start_iter); > + aBuf.EndReading(end_iter); > + > + const PRUnichar* c = start_iter.get(); > + const PRUnichar* end = end_iter.get(); const PRUnichar* c = aBuf.BeginReading(); const PRUnichar* end = aBuf.EndReading(); @@ +694,5 @@ > + break; > + } > + c++; > + } > + aBuf = modified; We probably only want to do this if anything changed. @@ +738,5 @@ > + return true; > +} > + > +static nsresult > +StartElement(const nsAString& aLocalName, nsIContent* aContent, Element* aElement Does this need to return nsresult, or is a bool sufficient? @@ +763,5 @@ > + PRInt32 attNs = name->NamespaceID(); > + nsIAtom* attName = name->LocalName(); > + > + if (attNs == kNameSpaceID_None) { > + NS_ENSURE_TRUE(FallibleAppend(aOut, NS_LITERAL_STRING(" ")), NS_ERROR_OUT_OF_MEMORY); Mm... A macro for NS_ENSURE_TRUE(FallibleAppend(foo, bar), NS_ERROR_OUT_OF_MEMORY);? @@ +799,5 @@ > + static const char* voidElements[] = { > + "area", "base", "basefont", "bgsound", "br", "col", "command", "embed", "frame", > + "hr", "img", "input", "keygen", "link", "meta", "param", "source", "track", "wbr" > + }; > + for (PRUint32 i = 0; i < mozilla::ArrayLength(voidElements); i++) { Drop "mozilla::" @@ +800,5 @@ > + "area", "base", "basefont", "bgsound", "br", "col", "command", "embed", "frame", > + "hr", "img", "input", "keygen", "link", "meta", "param", "source", "track", "wbr" > + }; > + for (PRUint32 i = 0; i < mozilla::ArrayLength(voidElements); i++) { > + if (aLocalName.EqualsASCII(voidElements[i])) { Atoms? @@ +816,5 @@ > + static const char* nonEscapingElements[] = { > + "iframe", "noembed", "noframes", "noscript", "plaintext", "script", "style", "xmp" > + }; > + for (PRUint32 i = 0; i < mozilla::ArrayLength(nonEscapingElements); i++) { > + if (aLocalName.EqualsASCII(nonEscapingElements[i])) { Same @@ +841,5 @@ > + case nsIDOMNode::ELEMENT_NODE: { > + nsString name = current->LocalName(); > + if (name.IsEmpty()) > + name = current->NodeName(); > + nsresult rv = StartElement(name, static_cast<nsIContent*>(current), current->AsElement() @@ +857,5 @@ > + break; > + } > + > + case nsIDOMNode::COMMENT_NODE: { > + const nsTextFragment* text =static_cast<nsIContent*>(current)->GetText(); ' = ' @@ +868,5 @@ > + break; > + } > + > + case nsIDOMNode::DOCUMENT_NODE: > + buf = NS_LITERAL_STRING("<!DOCTYPE html>"); Eh? @@ +876,5 @@ > + NS_ENSURE_SUCCESS(FallibleAppend(aOut, buf), NS_ERROR_OUT_OF_MEMORY); > + > + nsIContent* next; > + if ((next = current->GetFirstChild())) { > + current = static_cast<nsINode*>(next); No cast @@ +883,5 @@ > + > + while (true) { > + if (current->NodeType() == nsIDOMNode::ELEMENT_NODE) { > + if (escapeLevel > 0) > + escapeLevel--; {} @@ +902,5 @@ > + return NS_OK; > + } > + > + if ((next = current->GetNextSibling())) { > + current = static_cast<nsINode*>(next); No need for the cast
> Atoms? We should file a followup to consider moving that boolean into nodeinfo.
This patch needs to handle moz-dirty based on the work being done in bug 599983.
Depends on: 599983
(In reply to Josh Matthews [:jdm] from comment #10) > Smaug says this is a good testcase: > https://bug-81214-attachments.webkit.org/attachment.cgi?id=132034 (Note to everybody, this will hang your browser for a while)
Attachment #616606 - Attachment is obsolete: true
Comment on attachment 616711 [details] [diff] [review] Implement fast serializer for innerHTML/outerHTML. Review of attachment 616711 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/html/content/src/nsGenericHTMLElement.cpp @@ +743,5 @@ > + > +#define APPEND_OR_FAIL(a, b) NS_ENSURE_TRUE(FallibleAppend(a, b), false) > + > +static bool > +StartElement(nsIAtom* aLocalName, Element* aContent, Why pass aLocalName separately? @@ +782,5 @@ > + // Filter out special case of <br type="_moz"> or <br _moz*>, > + // used by the editor. Bug 16988. Yuck. > + // > + PRInt32 tagNS = aContent->GetNameSpaceID(); > + if (aLocalName == nsGkAtoms::br && tagNS == kNameSpaceID_XHTML && aContent->IsHTML(nsGkAtoms::br) @@ +941,5 @@ > nsGenericHTMLElement::GetMarkup(bool aIncludeSelf, nsAString& aMarkup) > { > aMarkup.Truncate(); > > nsIDocument* doc = OwnerDoc(); Move this below the if
Builds should be available for profiling at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/josh@joshmatthews.net-9d168c7b6001 in a few hours. I fixed up the problems in the last try run, so I would expect all tests to be passing with this build.
Try run for 4cbda5817a8c is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=4cbda5817a8c Results (out of 15 total builds): exception: 14 failure: 1 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/josh@joshmatthews.net-4cbda5817a8c
Try run for ccc14e9dc24d is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=ccc14e9dc24d Results (out of 296 total builds): exception: 20 success: 139 warnings: 81 failure: 56 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/josh@joshmatthews.net-ccc14e9dc24d
Try run for 9d168c7b6001 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=9d168c7b6001 Results (out of 288 total builds): success: 246 warnings: 40 failure: 2 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/josh@joshmatthews.net-9d168c7b6001
Comment on attachment 616711 [details] [diff] [review] Implement fast serializer for innerHTML/outerHTML. Thanks for working on this! Some comments: >+static bool >+StartElement(nsIAtom* aLocalName, Element* aContent, >+ PRUint32& aEscapeLevel, PRUint32& aIgnoreLevel, >+ nsAString& aOut) >+{ >+ if (aEscapeLevel > 0) { >+ aEscapeLevel++; >+ } Hmm. Looks like the spec now only considers the parent--not ancestors for deciding when not to escape. See http://www.whatwg.org/specs/web-apps/current-work/multipage/the-end.html#html-fragment-serialization-algorithm >+ if (attNs == kNameSpaceID_None) { >+ APPEND_OR_FAIL(aOut, NS_LITERAL_STRING(" ")); >+ } else if (!aContent->IsHTML() && attNs == kNameSpaceID_XLink) { >+ APPEND_OR_FAIL(aOut, NS_LITERAL_STRING(" xlink:")); The spec says to do this even if the element is an HTML element. >+ } else if (attNs == kNameSpaceID_XML) { >+ if (aContent->IsHTML()) { >+ if (attName == nsGkAtoms::lang) { >+ APPEND_OR_FAIL(aOut, NS_LITERAL_STRING(" ")); This lang stuff is not needed per spec. It's again for differing use cases of the Java code. >+ } else { >+ continue; Again, the spec says to do this even if the element is an HTML element. >+ } Additionally, the xmlns cases need to be handled here. (The Java code doesn't handle them, because Java SAX doesn't represent namespace declarations as attributes.) >+ } else { >+ APPEND_OR_FAIL(aOut, NS_LITERAL_STRING(" xml:")); >+ } >+ } else { >+ continue; The spec says to use the attributes qualified name in the anything else case. >+ static const nsIAtom* voidElements[] = { >+ nsGkAtoms::area, nsGkAtoms::base, nsGkAtoms::basefont, nsGkAtoms::bgsound, nsGkAtoms::br, nsGkAtoms::cols, >+ nsGkAtoms::command, nsGkAtoms::embed, nsGkAtoms::frame, nsGkAtoms::hr, nsGkAtoms::img, nsGkAtoms::input, >+ nsGkAtoms::keygen, nsGkAtoms::link, nsGkAtoms::meta, nsGkAtoms::param, nsGkAtoms::source, nsGkAtoms::track, >+ nsGkAtoms::wbr >+ }; >+ for (PRUint32 i = 0; i < ArrayLength(voidElements); i++) { >+ if (aLocalName == voidElements[i]) { >+ aIgnoreLevel++; >+ return true; >+ } >+ } Instead of having an ignore level integer here and having the caller method walk the descendants of void elements, you could inline the start tag handling into the tree walking loop and make the tree walking skip the next sibling when the current node is a void element. >+ /*if (aLocalName == nsGkAtoms::pre || aLocalName == nsGkAtoms::textarea || >+ aLocalName == nsGkAtoms::listing) { >+ APPEND_OR_FAIL(aOut, NS_LITERAL_STRING("\n")); >+ }*/ This needs to check the element namespace, too. >+ static const nsIAtom* nonEscapingElements[] = { >+ nsGkAtoms::iframe, nsGkAtoms::noembed, nsGkAtoms::noframes, nsGkAtoms::noscript, nsGkAtoms::plaintext, >+ nsGkAtoms::script, nsGkAtoms::style, nsGkAtoms::xmp >+ }; >+ for (PRUint32 i = 0; i < ArrayLength(nonEscapingElements); i++) { >+ if (aLocalName == nonEscapingElements[i]) { >+ aEscapeLevel++; >+ } >+ } Per spec, escaping only considers the parent of the text node, so there's no need to maintain an escape level integer. >+ case nsIDOMNode::COMMENT_NODE: { >+ const nsTextFragment* text = static_cast<nsIContent*>(current)->GetText(); >+ text->AppendTo(buf); >+ if (escapeLevel == 0 && ignoreLevel == 0) { >+ buf = NS_LITERAL_STRING("<!--") + buf + NS_LITERAL_STRING("-->"); >+ } else { >+ buf = EmptyString(); The spec serializes comments unconditionally even if the result doesn't round trip. >+ case nsIDOMNode::DOCUMENT_NODE: >+ buf = NS_LITERAL_STRING("<!DOCTYPE html>"); >+ break; Again, the Java code intentionally deviates from the spec here due to different use cases and different APIs. In Gecko, instead of having a case for the document node, there should be a spec-compliant case for doctype nodes. Curiously, the spec requires handling processing instruction nodes, even though they don't round trip. >+ nsString name = NS_LITERAL_STRING("</"); >+ nsString nodeName = current->LocalName(); For elements that aren't in the HTML, SVG or MathML namespaces, the spec requires using the qualified name as the tag name. (I'm slightly uncomfortable with using the qualified name when the qualified name is the same as the local name, since the effects will be surprising when round-tripping, but that's what the spec says currently.)
Attached patch a stringbuilder (deleted) — Splinter Review
I was just testing something based on a stringbuilder, but still couldn't get the speed of the current serializer.
This is my most recent work from before I left on my travels. I optimized the 'always append everything' codepath fairly well based on profiles; as I recall, the majority of the work seemed to be in SetCapacity and the actual DOM grubbing last time I checked. I don't recall how the speed stacked up against the current serializer. I'm happy to let somebody run with this, as I'm focusing on per-window pb in my limited free time.
Attachment #616711 - Attachment is obsolete: true
Comment on attachment 620064 [details] [diff] [review] Implement fast serializer for innerHTML/outerHTML. Review of attachment 620064 [details] [diff] [review]: ----------------------------------------------------------------- A couple of nits for when you get back ::: content/html/content/src/nsGenericHTMLElement.cpp @@ +681,5 @@ > + if (!found) > + return; > + > + nsAutoString modified; > + modified.SetCapacity(aBuf.Length() + 5 * found); I would like a constant for the 5 (MAX_ENTITY_LENGTH?) @@ +747,5 @@ > + > +static inline bool > +FallibleAppend(nsAString& aBuf, const nsAString& aExtra) > +{ > + if(NS_UNLIKELY(!aBuf.SetCapacity(aBuf.Length() + aExtra.Length()))) 'if (', and {}s @@ +766,5 @@ > + APPEND_OR_FAIL(aOut, nsDependentAtomString(localName)); > + //toAppend.AppendElement(NS_LITERAL_STRING("<")); > + //toAppend.AppendElement(nsDependentAtomString(localName)); > + > + PRInt32 count = aContent->GetAttrCount(); PRUint32 @@ +787,5 @@ > + // > + // Filter out special case of <br type="_moz"> or <br _moz*>, > + // used by the editor. Bug 16988. Yuck. > + // > + PRInt32 tagNS = aContent->GetNameSpaceID(); Why is this in the loop? @@ +885,5 @@ > +Serialize(nsINode* aRoot, bool aDescendentsOnly, nsAString& aOut) > +{ > + nsINode* current = aDescendentsOnly ? aRoot->GetFirstChild() : aRoot; > + if (!current) > + return true; {} @@ +893,5 @@ > + while (true) { > + nsAutoString buf; > + switch (current->NodeType()) { > + case nsIDOMNode::ELEMENT_NODE: { > + mozilla::dom::Element* elem = current->AsElement(); This is in content/, so drop the 'mozilla::dom::' @@ +935,5 @@ > + buf = NS_LITERAL_STRING("<?"); > + buf.Append(current->NodeName()); > + buf.Append(NS_LITERAL_STRING(">")); > + break; > + } default case with MOZ_NOT_REACHED? @@ +943,5 @@ > + APPEND_OR_FAIL(aOut, buf); > + } > + > + nsIContent* next; > + if ((next = current->GetFirstChild())) { Either if (nsIContent* next = current->GetFirstChild()) { or nsIContent* next = current->GetFirstChild(); if (next) { @@ +955,5 @@ > + ignoreLevel--; > + } else { > + nsString name = NS_LITERAL_STRING("</"); > + nsString nodeName; > + mozilla::dom::Element* elem = current->AsElement(); Same @@ +971,5 @@ > + if (current == aRoot) { > + return true; > + } > + > + if ((next = current->GetNextSibling())) { Same
I can't figure out how the magical newline handling should work. Per spec document.body.innerHTML = null; document.body.appendChild(document.createElement("textarea")); document.body.firstChild.appendChild(document.createTextNode("\nhello")); console.log(document.body.innerHTML) should report "<textarea>\n\nhello</textarea>" But that is not what current browsers do. (tested N, C, O)
In the testcase this gives same or a bit better performance than trunk-C. https://tbpl.mozilla.org/?tree=Try&rev=9911221e883d
(In reply to Olli Pettay [:smaug] from comment #22) > I can't figure out how the magical newline handling should work. > Per spec > document.body.innerHTML = null; > document.body.appendChild(document.createElement("textarea")); > document.body.firstChild.appendChild(document.createTextNode("\nhello")); > console.log(document.body.innerHTML) > should report "<textarea>\n\nhello</textarea>" > > But that is not what current browsers do. (tested N, C, O) Current browsers are broken -- document.body.innerHTML = document.body.innerHTML will eat the newline in this case. I.e., the DOM doesn't round-trip through innerHTML. There's no reason this should happen here. Same for <pre>. Browsers should change to match the spec. Also, <xmp> needs to have its contents not HTML-escaped, same as <style> and <script>. The spec requires this and WebKit does it, but Gecko gets it wrong. (This is obviously a corner case, but my editing tests hit it, FWIW.) See bug 608322 and dependencies.
(In reply to Aryeh Gregor from comment #24) > Current browsers are broken -- document.body.innerHTML = > document.body.innerHTML will eat the newline in this case. I.e., the DOM > doesn't round-trip through innerHTML. There's no reason this should happen > here. Same for <pre>. Browsers should change to match the spec. Ok, but then this is a *very* risky change, and need to happen at the beginning of a cycle.
Actually, I think we should not change newline handling in this bug. There are enough changes and risk for regressions even without the change.
Comment on attachment 627305 [details] [diff] [review] without newline changes Nothing sticks out at me as an obvious optimization, but the linked list made my brain explode. You could look into Henri's suggestion of the way to get rid of ignoreLevel.
Attachment #627305 - Flags: feedback?(josh) → feedback+
Wouldn't simply writing to a string but using a doubling strategy for growing the string length result in significantly simpler code but with just as good performance?
(In reply to Jonas Sicking (:sicking) from comment #31) > Wouldn't simply writing to a string but using a doubling strategy for > growing the string length That is what our string code does atm. > result in significantly simpler code but with just > as good performance? And no, it doesn't give good enough performance. At least I haven't managed to do so.
(In reply to Henri Sivonen (:hsivonen) from comment #18) > Instead of having an ignore level integer here and having the caller method > walk the descendants of void elements, you could inline the start tag > handling into the tree walking loop and make the tree walking skip the next > sibling when the current node is a void element. Hmm, actually I don't understand the spec. "If current node is an area, base, basefont, bgsound, br, col, command, embed, frame, hr, img, input, keygen, link, meta, param, source, track or wbr element, then continue on to the next child node at this point." The next child node of what? current node? or the node?
I wonder which behavior the spec defines here. At least not Gecko nor Webkit behavior.
Since Gecko creates <input><div></div> if div is a child of an input element. Webkit creates <input></input>
What should happen if DOM is <input><div></div></input> and one does inputElement.innerHTML?
(In reply to Olli Pettay [:smaug] from comment #36) > What should happen if DOM is <input><div></div></input> and one does > inputElement.innerHTML? I would say '<input>'.
For a DOM like <input><div/></input> returning "<input>" for myInputElement.innerHTML wouldn't make sense. The question is if we should return "" or "<div></div>". (Or throw, but I think that would be a bad idea)
Comment on attachment 627640 [details] [diff] [review] without ignoreLevel Apparently I broke something with this.
Attachment #627640 - Attachment is obsolete: true
Attached patch patch (deleted) — Splinter Review
Assignee: josh → bugs
Attachment #627682 - Flags: review?(hsivonen)
Comment on attachment 627682 [details] [diff] [review] patch >+ var PI = document.createProcessingInstruction('foo', 'bar="1.0"'); >+ t.appendChild(PI); >+ is(t.innerHTML, '<span>hi</span> there <!-- mon ami --><?foo bar="1.0">', >+ "pi nodes should be included"); Please add tests for: 1) An SVG element 2) A MathML element 3) An element that's none of HTML, MathML or SVG and has a prefixed node name 4) An element that's none of HTML, MathML or SVG and has a prefixless node name 5) An xml:* attribute 6) An xlink:* attribute 7) An xmlns="foo" attribute 8) An xmlns:foo="bar" attribute 9) A namespaced attribute that's none of the above 10) A CDATA section. 11) HTML script and style with <, >, & and no-break space in the content 12) SVG script and style with <, >, & and no-break space in the content (different escaping from the HTML case) 13) MathML script and style with <, >, & and no-break space in the content (same as the SVG case) >+#define STRING_BUFFER_UNITS 1020 Please add a comment that explains where this magic number comes from. The magic number looks like it's trying to avoid a clown-shoe allocation. If possible, please add a static assertion about sizeof(StringBuilder) to make sure that subsequent changes don't accidentally push the size of the object over a clown shoe boundary. >+ enum Type >+ { >+ eUnknown, >+ eAtom, >+ eString, >+ eStringWithEncode, >+ eLiteral, >+ eTextFragment, >+ eTextFragmentWithEncode, >+ }; There are plenty of cases where the unit ends up containing a single-character string literal. Would it be worthwhile to save a pointer dereference and have a unit type for a single character and passing e.g. '<' instead of "<" to Append()? >+ template<int N> >+ void Append(const char (&aLiteral)[N]) I haven't seen this template pattern before, but I trust that it does what it seems to do. >+ void Append(const nsAString& aString) >+ { >+ Unit* u = AddUnit(); >+ u->mString = new nsAutoString(aString); Using nsAutoString instead of nsString seems to be against the rules. However, using nsString doesn't make sense, because append to the final output string always copies the buffer of the string instead of adopting it. It seems to me that it would make sense to use PRUnichar[] instead of nsAutoString here. >+ PRUint32 mLength; Please add a comment that explains that this field is only meaningful on the first StringBuffer object in the linked list. In general, it would be nice to have a comment that explains the linked list/unit array duality. >+ if (attName == nsGkAtoms::xmlns) { >+ aBuilder.Append(" xmlns"); ... >+ aBuilder.Append(attName); The xmlns="foo" case looks wrong. >+static void >+AppendEncodedCharacters(const nsTextFragment* aText, StringBuilder& aBuilder) ... >+static void >+AppendEncodedAttributeValue(nsAutoString* aValue, StringBuilder& aBuilder) >+ ++found; >+ aBuilder.AppendWithAttrEncode(aValue, aValue->Length() + (5 * found)); Instead of getting an inaccurate result my multiplication here, I think it would make more sense to count the exact number of extra characters generated by each escape by incrementing |found| by 3 in the case of &lt; and &gt;, by 4 in the case of &amp; and by 5 in the case of &nbsp; and &quot;. Please also add a comment that explains why it's worthwhile to iterate over the strings twice in order to make a better-sized allocation for the final output string. r=me with the above points addressed.
Attachment #627682 - Flags: review?(hsivonen) → review+
Comment on attachment 627682 [details] [diff] [review] patch Review of attachment 627682 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/html/content/src/nsGenericHTMLElement.cpp @@ +667,5 @@ > + nsAutoString* mString; > + const nsTextFragment* mTextFragment; > + }; > + Type mType; > + PRUint32 mLength; Worth using mozilla/StandardInteger.h types here and other places? I realize this doesn't match existing style of the file, but this is mostly new code anyway. @@ +811,5 @@ > + const PRUnichar* end = aValue.EndReading(); > + while (c < end) { > + switch (*c) { > + case '"': > + aOut.Append(NS_LITERAL_STRING("&quot;")); AppendLiteral("&quot;") here and other places is slightly clearer, IMHO.
(In reply to Nathan Froyd (:froydnj) from comment #43) > > + aOut.Append(NS_LITERAL_STRING("&quot;")); > > AppendLiteral("&quot;") here and other places is slightly clearer, IMHO. Oops. Yeah, consider that part of my review, too.
(In reply to Henri Sivonen (:hsivonen) from comment #42) > >+ template<int N> > >+ void Append(const char (&aLiteral)[N]) > > I haven't seen this template pattern before, but I trust that it does what > it seems to do. That is how AppendLiteral works in strings. > >+ void Append(const nsAString& aString) > >+ { > >+ Unit* u = AddUnit(); > >+ u->mString = new nsAutoString(aString); > > Using nsAutoString instead of nsString seems to be against the rules. > However, using nsString doesn't make sense, because append to the final > output string always copies the buffer of the string instead of adopting it. > It seems to me that it would make sense to use PRUnichar[] instead of > nsAutoString here. Well, the whole idea to use nsAutoString here is that autostring has internal buffer which is enough for common cases. Append(const nsAString& aString) is quite rare case. I don't see any good way to get PRUnichar[] attr values, and there shouldn't be need for special case PI and DTD nodes.
(In reply to Olli Pettay [:smaug] from comment #45) > Append(const nsAString& aString) is quite rare case. OK. > I don't see any good way to get PRUnichar[] attr values, and there shouldn't > be need > for special case PI and DTD nodes. I see. Let's go with the code you already wrote, then.
(In reply to Henri Sivonen (:hsivonen) from comment #42) > > There are plenty of cases where the unit ends up containing a > single-character string literal. Would it be worthwhile to save a pointer > dereference and have a unit type for a single character and passing e.g. '<' > instead of "<" to Append()? Not sure worth the slight complexity. Could do after profiling.
Attached patch v7 (deleted) — Splinter Review
https://tbpl.mozilla.org/?tree=Try&rev=8f58f13a9e00 couldn't figure out how to add a truly useful check for jemalloc bucket size, but added a comment.
Attachment #628790 - Flags: review?(hsivonen)
Comment on attachment 628790 [details] [diff] [review] v7 Looks good. Thanks. If you have benchmark numbers that compare the new and old serializers, it would probably be a good idea to ping the Hacks blog team about those numbers to get them into a "what's new" article.
Attachment #628790 - Flags: review?(hsivonen) → review+
https://hg.mozilla.org/mozilla-central/rev/7bf0125b26b5 I'll get the numbers from Nightlies. I hope this change doesn't break too many websites.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
(In reply to Josh Matthews [:jdm] (travelling until June 25th) from comment #10) > Smaug says this is a good testcase: > https://bug-81214-attachments.webkit.org/attachment.cgi?id=132034 Tested Nightly builds on an old Windows machine: a build w/o the patch with the patch div.innerHTML : 2905 1350 div.outerHTML : 15513 5101 body.innerHTML : 1119 790 body.outerHTML : 1116 758
Depends on: 770867
Depends on: 788444
Depends on: 789081
Depends on: 813441
Depends on: 986913
Blocks: 578141
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: