Closed
Bug 264412
(innertext)
Opened 20 years ago
Closed 9 years ago
Add support for element.innerText
Categories
(Core :: DOM: Core & HTML, defect)
Core
DOM: Core & HTML
Tracking
()
RESOLVED
FIXED
mozilla45
People
(Reporter: Paul_Adams, Assigned: roc)
References
()
Details
(Keywords: dev-doc-needed, feature, Whiteboard: [compat])
Attachments
(3 files)
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
(deleted),
text/x-review-board-request
|
MatsPalmgren_bugz
:
review+
MarcoZ
:
review+
|
Details |
(deleted),
text/x-review-board-request
|
MatsPalmgren_bugz
:
review+
smaug
:
review+
|
Details |
(deleted),
text/x-review-board-request
|
MatsPalmgren_bugz
:
review+
|
Details |
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20040913 Firefox/0.10
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20040913 Firefox/0.10
It seems that for some reason, the innerText property is not updatable with the
script in the above URL.
Reproducible: Always
Steps to Reproduce:
1. Go to the URL listed in this bug report
2.
3.
Actual Results:
The innerText properties of the p, td, div and span elements are undefined when
the javascript starts, and they never get updated by the script.
The input element does actually get updated if the script is downloaded and
modified to take out the null checking.
Expected Results:
The innerText property of the p, div, span, and td elements should have been
updated via the javascript.
If the innerText property in the 'clock.js' script is replaced by innerHTML,
then everything works OK
Comment 1•20 years ago
|
||
Gecko (and thus the Mozilla suite, Firefox, Netscape, Camino, Galeon, ...) never
has supported innerText -
http://www.mozilla.org/docs/web-developer/upgrade_2.html#dom_unsupp
If you actually meant to file an enhancement bug (on Browser - DOM Level 0) for
innerText support, well, in an era when we support document.all as long as you
don't ask first, who knows?
Status: UNCONFIRMED → RESOLVED
Closed: 20 years ago
Resolution: --- → INVALID
Reporter | ||
Comment 2•20 years ago
|
||
This does beg the question why do you continue to support innerHTML?!!
Comment 3•20 years ago
|
||
*** Bug 282317 has been marked as a duplicate of this bug. ***
Comment 4•19 years ago
|
||
*** Bug 310661 has been marked as a duplicate of this bug. ***
Comment 5•19 years ago
|
||
*** Bug 338801 has been marked as a duplicate of this bug. ***
Comment 6•19 years ago
|
||
*** Bug 287395 has been marked as a duplicate of this bug. ***
Comment 7•19 years ago
|
||
*** Bug 317330 has been marked as a duplicate of this bug. ***
Comment 8•19 years ago
|
||
*** Bug 339017 has been marked as a duplicate of this bug. ***
Comment 9•19 years ago
|
||
*** Bug 254174 has been marked as a duplicate of this bug. ***
Updated•19 years ago
|
Alias: innertext
Updated•19 years ago
|
Status: RESOLVED → VERIFIED
Updated•14 years ago
|
Assignee: bugzilla → nobody
Component: General → DOM
Product: Firefox → Core
QA Contact: general → general
Comment 10•14 years ago
|
||
Is Gecko interested in implementing this if a detailed spec is written? It's complicated, but considerably more useful than textContent for getting a plaintext version of an HTML element, and every other browser has it. WebKit seems not to be interested in dropping it, and it would be nice if we could get interop here.
Status: VERIFIED → REOPENED
Ever confirmed: true
Resolution: INVALID → ---
Updated•14 years ago
|
Component: DOM → DOM: Core & HTML
OS: Windows XP → All
Hardware: x86 → All
Version: unspecified → Trunk
Comment 11•14 years ago
|
||
Why innerText is more useful than textContent?
Based on http://msdn.microsoft.com/en-us/library/ms533899%28v=vs.85%29.aspx
it is quite bizarre API.
Comment 12•14 years ago
|
||
innerText returns the element's HTML contents converted to plaintext. Conceptually, it works the same way as Selection stringification -- in fact, WebKit uses the same algorithm for both, and I'm going to specify both as the same algorithm. textContent just concatenates the text node descendants in tree order, so it preserves indentation and so forth. innerText collapses runs of whitespace to a single space, inserts newlines after block elements, etc.
I don't have specific use-cases offhand for why anyone would want to use innerText instead of textContent. The large majority of authors seem to only set it, in which case aliasing it to textContent would work, and in fact Opera treats it similarly to textContent (although they apparently have some compat bugs as a result). But given that everyone needs to implement the plaintext conversion algorithm for Selection stringification anyway, I don't see any reason not to provide innerText too. The additional implementation burden should be low, and it makes things easier for authors if all browsers implement it instead of all but Firefox.
Comment 14•14 years ago
|
||
I got a list of webpages using innerText and went through a whole bunch. Most of them use it either for setting (where it behaves the same as textContent), or for getting for elements where it behaves basically the same as textContent anyway (e.g., only text node children). The most interesting use I found was to transform a comment to plaintext for a quote, in two places:
1) http://www.cristianofino.net/post/Comment-Toolbar-plugin-per-WordPress.aspx
This uses innerText for a quotation feature -- it's part of an actively-maintained Wordpress plugin, <http://wordpress.org/extend/plugins/comment-toolbar/>. The relevant JS is at <http://www.cristianofino.net/User%20Controls/CommentToolbar.js>. To get the quote when you click the "Quota" button on a comment, it basically stringifies the Selection if there is one, otherwise takes the innerText of the post you clicked on, otherwise takes the textContent. It so happens that this works okay, because there's no block-level markup in the comments and <br>'s and newlines are paired up exactly. But given more general markup, Firefox (and Opera and IE9) would look much worse here, with lots of extra whitespace stuck in.
2) http://blog.livedoor.jp/geinin_matome/archives/1080186.html
Similar to (1) -- hover over the links like ">>66" to get a tooltip with a plaintext quote in WebKit or IE. Again, no interesting markup that would make a big difference between innerText and textContent, but there could have been.
Comment 15•13 years ago
|
||
I filed a bug asking for this to be specced in HTML5 using a relatively trivial algorithm (textContent but with <br> converted to "\n"):
http://www.w3.org/Bugs/Public/show_bug.cgi?id=13145
I did this because I was reminded of the research and discussion I did about this last February:
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-February/030179.html
Notable from that link is that I found some pages that do break on Firefox because of missing innerText, e.g.:
http://beihai.gov.cn/5863/2010_4_28/5863_98289_1272423962000.html?open=1
At the time of this writing, a large chunk of content is missing in Firefox 6.0a2 which is present in Chrome 14 dev and Opera 11.50:
"4月26日上午,北海船检局召开中心组成员《新跨越》专题学习会。 会议根据广西区港航管理局《关于在全区港航系统开展推动广西水运事业新跨越专题学习讨论活动的通知》(港航政治[2010]59号文精神,围绕推动广西水运事业新跨越主题,认真组织学习了广西区人民政府..."
That's because of the following script:
<script>var strCol=document.getElementById("contentID").innerText;if(strCol.length>150){document.write(strCol.substring(0,150)+"...");}else{document.write(strCol);}</script>
The script does nothing in Gecko because the conditional checks the length of undefined, which is a fatal error. It works in all other browsers.
Comment hidden (advocacy) |
Updated•13 years ago
|
Blocks: ringmark-ring1
Comment hidden (advocacy) |
Updated•12 years ago
|
Keywords: dev-doc-needed
Comment 18•12 years ago
|
||
As I see innerText is in spec TODO list (http://wiki.whatwg.org/wiki/Specs_todo).
Looks like nobody is going to write it. So, we neve get innerText in Firefox ? :(
Comment 22•10 years ago
|
||
This really needs to be fixed. Here is the test case just in case http://jsfiddle.net/azpnmxuc/
Otherwise people will keep creating duplicates, like they did before.
Comment 23•10 years ago
|
||
WebCompat issue created because of lack of support of innerText
https://github.com/webcompat/web-bugs/issues/633
Comment 24•10 years ago
|
||
Other previously reported Webcompat issues (some are now invalid or resolved):
* Bug 942989
* Bug 281825
* Bug 187698
* Bug 131793
* Bug 358349
* Bug 25918
* Bug 148812
* Bug 169598
* Bug 220538
* Bug 471755
Comment 25•10 years ago
|
||
Adding for context, an interesting blog post with textContent and innerText, performances computations, etc.
http://perfectionkills.com/the-poor-misunderstood-innerText/
Comment 26•10 years ago
|
||
innerText should be faster than innerHtml, because the former doesn't need any parsing. Seems like browsers should be motivated to support it well as a good practice.
Comment 27•10 years ago
|
||
Would Mozilla be interested implementing innerText after all? As Karl mentioned, I've outlined use-cases in a recent blog post. I believe innerText is more than just a pesky webcompat offender. I'll try to work on spec in the meantime.
Comment 28•10 years ago
|
||
Another Web Compatibility issue https://webcompat.com/issues/1051
Comment 29•10 years ago
|
||
Another Web Compatibility issue https://webcompat.com/issues/1071#issuecomment-101837467
Comment 30•9 years ago
|
||
Apart from the web compatibility gains, Juriy has written a great summary of innerText and a possible way forward in terms of a (pseudo) spec. If Chrome and IE are willing to converge, great. If not, we should pick one of the flavors and implement it.
http://perfectionkills.com/the-poor-misunderstood-innerText/#naive-spec
http://kangax.github.io/jstests/innerText/
Summary: innerText property on various elements not updatable with javascript → Add support for element.innerText
Updated•9 years ago
|
tracking-p11:
--- → +
Whiteboard: [compat]
Comment 31•9 years ago
|
||
Comment 32•9 years ago
|
||
Breaks Google Calendar (and probably GMail) navigation rendering: https://bugzilla.mozilla.org/show_bug.cgi?id=914252#c16
Updated•9 years ago
|
See Also: → https://github.com/whatwg/compat/issues/5
Updated•9 years ago
|
Comment 33•9 years ago
|
||
Breaks Facebook status update field resizing - bug 1197166
Status: REOPENED → NEW
Assignee | ||
Comment 34•9 years ago
|
||
Breaks clicking on erroneous words in xkcd SimpleWriter. http://blog.xkcd.com/2015/09/22/a-thing-explainer-word-checker/
Comment 35•9 years ago
|
||
For what is worth, please note that the original URL ( http://www.sharepointcustomization.com/resources/codesamples/TimeZoneClocks/clock_example.htm ) is gone.
Assignee | ||
Comment 36•9 years ago
|
||
We need to implement something here, because regardless of whether innerText is a desirable feature, Web developers are using it and sites are broken in Firefox without it.
I propose we implement the last version of Aryeh's proposed spec, here: https://rawgit.com/timdown/rangy/master/fiddlings/spec/innerText.htm
Olli, does that sound OK to you?
Assignee: nobody → roc
Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(bugs)
Comment 37•9 years ago
|
||
Given that this is so bizarre feature, I'd like to hear feedback from anne and bz too.
But yes, I agree, we need to implement something here.
Flags: needinfo?(bzbarsky)
Flags: needinfo?(annevk)
Comment 38•9 years ago
|
||
What makes the most sense to me is that getting behaves equivalently to stringifying a selection. So once selection is defined, this feature would be too. That matches what WebKit is doing if memory serves me.
Flags: needinfo?(annevk)
Comment 39•9 years ago
|
||
Yeah, that has just the issue that selection isn't properly specified.
How differently would we behave if we just used our selection stringification?
Flags: needinfo?(bugs)
Comment 40•9 years ago
|
||
Good question. I have no time now but if someone could run it through my test — http://kangax.github.io/jstests/innerText/ (which also shows how other implementations differ among each other; and which I'm planning to update with couple more options soon)
Comment 41•9 years ago
|
||
I think in the short term we should do whatever is simplest here. If just stringifying a selection is simpler for us than implementing Aryeh's somewhat stillborn spec, we should do that. Compat for this stuff is pretty much nonexistent anyway. :(
Flags: needinfo?(bzbarsky)
Assignee | ||
Comment 42•9 years ago
|
||
My priorities are in this order:
0) Don't break stuff that currently works.
1) Implement a reasonably Web-compatible version of innerText in Gecko.
2) Have a spec for innerText that browsers converge on.
3) Make innerText consistent with selection.toString().
I think perhaps the best way to proceed is to extend nsPlainTextSerializer with new options until it supports Aryeh's draft spec, then implement innerText using nsPlainTextSerializer, and iterate on the implementation and draft spec until we've shipped something Web-compatible enough. Then we can try changing selection.toString() to use the same options.
Juriy, do you know of any issues with the 2011 draft? https://rawgit.com/timdown/rangy/master/fiddlings/spec/innerText.htm
Flags: needinfo?(kangax)
Assignee | ||
Comment 43•9 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #41)
> I think in the short term we should do whatever is simplest here. If just
> stringifying a selection is simpler for us than implementing Aryeh's
> somewhat stillborn spec, we should do that. Compat for this stuff is pretty
> much nonexistent anyway. :(
While that's true, I would like to have a spec, and I'm not confident that just pushing our current selection.toString() behavior onto the Web would be good for the Web or us.
Comment 44•9 years ago
|
||
> Juriy, do you know of any issues with the 2011 draft?
Unfortunately, I haven't had a chance to look into Aryeh's test thoroughly. Glancing at it now, the algorithm seems correct but it's probably missing few things. I see that Aryeh spec'd tabs between table-cell elements — this mimics webkit behavior but not IE. It also looks like inline-block elements don't create newlines — this has been the case in IE<=10 but not anymore (including webkit). There's no mention of what iframes resolve to (webkit ignores them, IE doesn't); or <canvas>, or <video>, or <select>. Well, <select> would resolve to its content (due to display:inline?) so that's IE's behavior but not WebKit. Finally, visibility:hidden mimics WebKit's but not IE behavior.
Yeah, this is a sad state of affairs.
Like I mentioned in a post, my understanding is that we need to:
1) Write a spec
2) Try to converge IE and WebKit/Blink behavior
3) Implement spec'd behavior in Firefox
The hardest part is — of course — choosing which behavior to go with. Considering drastic difference between IE and WebKit, I doubt there's any serious compat issues lurking here. Most of the problems are from not implementing _any_ version of innerText.
I'd love to see what FF's selection behavior is — is it closer to WK or IE's innerText implementations? Perhaps that's the starting point.
Flags: needinfo?(kangax)
Comment 45•9 years ago
|
||
*Aryeh's spec thoroughly
Assignee | ||
Comment 46•9 years ago
|
||
When in doubt we should probably just follow Chrome, which I expect they mostly got from Webkit.
Assignee | ||
Comment 47•9 years ago
|
||
Here are the results I get on Juriy's tests using selection.toString():
Newlines between paragraphs: 2
Newlines after table caption: 0
Newlines after block, non-paragraph elements (e.g. <div>): 1
Includes display:none elements: no
Includes visibility:hidden elements: no
Includes <select> content: no
Includes <button> content: no
Includes <canvas>, <video> fallback content: canvas no, video yes
Includes <iframe> fallback content: no
Includes <script>, <style> content: no
Creates leading space for anonymous inline boxes (including <li>): yes
Creates newline around inline, block-styled elements (e.g. <span>): no
Respects white-space:pre: yes
Creates (only) spaces between block, inline-styled elements (e.g. <div>): no
Puts tabs between <td> tags: yes
Preserves upper/lower-cased text: no
Assignee | ||
Comment 48•9 years ago
|
||
Looking at nsPlainTextSerializer, it's pretty complex. Some changes we'd want to make would be quite disruptive, e.g. right now many HTML elements (such as <span>) ignore IsElementBlock() in nsPlainTextSerializer::DoOpenContainer. I'd feel a lot more comfortable about having a separate implementation for innerText that we can evolve independently. I also think it would be good to deploy an implementation of innerText whose behavior is as simple as possible, so that the spec can be as simple as possible while still be being Web-compatible. Maybe we should try picking the simpler behavior everywhere IE and Chrome differ.
Comment 49•9 years ago
|
||
Roc,
First of all. Very cool. That's a very good news. Thanks for taking this.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) (offline Oct 1-4) from comment #43)
> While that's true, I would like to have a spec
That's partly the goal of the opened issue on the webcompat spec
https://github.com/whatwg/compat/issues/5
You are welcome to suggest prose or writing along the implementation and we can adjust step by step.
I'm pretty sure Mike, Hallvord and I also can dig out old Web Compat issues and see how we fare with regards to them with the implementation you will be doing. That would be nice test cases. And specifically how it would work in the face of code like these ones which try to address the Web Compatibility issue.
----
> function innerText (elem) {
> return elem.innerText || elem.textContent;
> }
module.exports = innerText;
In https://github.com/joshwnj/marki/blob/475b400cb4ad9a2c1d43fe46bebb0f518c397c05/src/inner-text.js
----
> if (el.textContent !== undefined)
> el.textContent = string;
> else
> el.innerText = string;
In https://github.com/HubSpot/youmightnotneedjquery/blob/ef987223c20e480fcbfb5924d96c11cd928e1226/comparisons/elements/set_text/ie8.js
----
> function text (el, value) {
> if (arguments.length === 2) {
> el.innerText = el.textContent = value;
> }
> if (typeof el.innerText === 'string') {
> return el.innerText;
> }
> return el.textContent;
> }
In https://github.com/bevacqua/insignia/blob/53742c6929332b7c776d85dade8d7d82f524195b/text.js
----
> // innerText polyfill for Firefox
> ((document) => {
> const dummy = document.createElement('div');
> Object.defineProperty(HTMLElement.prototype, 'innerText', {
> get: function() { return this.textContent },
> set: function(v) { this.textContent = v; }
> });
> }
> })(document);
In https://github.com/ng-kyoto/ng-kyoto.github.io/blob/9315fb7b24baec5a1614b4692bfedf9c0fc267dc/app/utils/innertext-polyfill.ts
Comment 50•9 years ago
|
||
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) (offline Oct 1-4) from comment #48)
> Looking at nsPlainTextSerializer, it's pretty complex.
It is, which is one reason why we for innerHTML, which is quite well spec'ed, moved out from using ns*Serializer (at least in HTML case). Performance was another reason (bug 744830 comment 51).
FragmentOrElement::GetMarkup does the innerHTML serialization.
> I'd feel a lot more comfortable
> about having a separate implementation for innerText that we can evolve
> independently. I also think it would be good to deploy an implementation of
> innerText whose behavior is as simple as possible, so that the spec can be
> as simple as possible while still be being Web-compatible. Maybe we should
> try picking the simpler behavior everywhere IE and Chrome differ.
This all sounds good
Assignee | ||
Comment 51•9 years ago
|
||
(In reply to Karl Dubost :karlcow from comment #49)
> I'm pretty sure Mike, Hallvord and I also can dig out old Web Compat issues
> and see how we fare with regards to them with the implementation you will be
> doing. That would be nice test cases. And specifically how it would work in
> the face of code like these ones which try to address the Web Compatibility
> issue.
So far I have not been able to find any specific Web compat issues that require more than just making innerText an alias for textContent. I suspect they exist since IE's behavior has changed over time; I've asked Travis if he can find out. That's the information we really need.
(I looked at Webkit history for information about how and why their implementation has evolved, but changes to their plaintext serializer (TextIterator) do not typically mention why the change was made, and linked Webkit Bugzilla bugs are not helpful. Webkit's TextIterator is used for clipboard and other features so it's unclear which, if any, of their behavior is actually important for Web apps.)
Assignee | ||
Comment 52•9 years ago
|
||
Though actually, we need to not just come up with a spec that's the minimum required for Web compatibility, but something that IE and Chrome at least are willing to implement, so we might need something more complicated than the minimum to satisfy their perceived needs rather than just their actual needs...
Assignee | ||
Comment 53•9 years ago
|
||
I have created a proposed spec plus testsuite here: https://github.com/rocallahan/innerText-spec
Assignee | ||
Comment 54•9 years ago
|
||
Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats
Attachment #8675486 -
Flags: review?(mats)
Assignee | ||
Comment 55•9 years ago
|
||
Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Attachment #8675487 -
Flags: review?(mats)
Attachment #8675487 -
Flags: review?(bugs)
Comment 56•9 years ago
|
||
Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
https://reviewboard.mozilla.org/r/22369/#review20043
::: layout/generic/nsIFrame.h:1907
(Diff revision 1)
> * @param aSkippedMaxLength Maximum number of characters to return
The GetRenderedText() documentation needs to be updated.
::: layout/generic/nsIFrame.h:1922
(Diff revision 1)
> + virtual RenderedText GetRenderedText(uint32_t aFlags = 0,
It's a bit hard to understand what the zero means at the call site:
GetRenderedText(0, aContentOffset, aContentOffset + 1);
I think you should add a CONTENT_TEXT_OFFSETS to make it clear that the flag passed matches the offsets:
GetRenderedText(CONTENT_TEXT_OFFSETS, aContentOffset, aContentOffset + 1);
Do you think it's likely that we'll add additional flags here in the future? If not, then I suggest you make it an enum class to avoid typos like:
GetRenderedText(aContentOffset, aContentOffset + 1);
Actually, I think we should do that now anyway. Then if we need more flags in the future we add "operator|" etc.
Alternatively, could we bundle these three params into an object, DOMOffsets/RenderedOffsets subclassed of SomeOffsets, or something like that? e.g.
GetRenderedText(DOMOffsets(aContentOffset, aContentOffset + 1));
GetRenderedText(RenderedOffsets(aRenderedOffset, aRenderedOffset + 1));
::: layout/generic/nsTextFrame.h:264
(Diff revision 1)
> - virtual nsresult GetRenderedText(nsAString* aString = nullptr,
> + virtual RenderedText GetRenderedText(uint32_t aFlags = 0,
aFlags = CONTENT_TEXT_OFFSETS would make this clearer too
::: layout/generic/nsTextFrame.cpp:8888
(Diff revision 1)
> - uint32_t validCharsLength = 0;
> + bool setOffsets = false;
|setOffsets| is a confusing name. It's used like so:
if (!setOffsets) {
// set the offsets
setOffsets = true;
}
Which appears to be the opposite of what the name suggests.
Perhaps |haveOffsets| is better?
Attachment #8675486 -
Flags: review?(mats)
Comment 57•9 years ago
|
||
Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
https://reviewboard.mozilla.org/r/22371/#review20051
r=mats with those nits addressed as you see fit
::: dom/base/nsRange.cpp:3201
(Diff revision 1)
> + int mRequiredLineBreakCount;
Why use a signed type for mRequiredLineBreakCount?
And why int instead of int32_t?
::: dom/base/nsRange.cpp:3206
(Diff revision 1)
> + while (mRequiredLineBreakCount > 0) {
This loop seems silly. Why not just do:
if (mRequiredLineBreakCount > 0) {
if (!mString.IsEmpty()) {
mString.Append('\n');
}
mRequiredLineBreakCount = 0;
}
::: dom/base/nsRange.cpp:3228
(Diff revision 1)
> +static bool IsVisibleAndNotInReplacedElement(nsIFrame* aFrame)
add a newline after |bool| please
::: dom/base/nsRange.cpp:3279
(Diff revision 1)
> + if (aFrame->GetType() != nsGkAtoms::tableCellFrame) {
And nsGkAtoms::bcTableCellFrame ?
::: dom/base/nsRange.cpp:3325
(Diff revision 1)
> + nsIContent* currentNode = static_cast<nsIContent*>(mStartParent.get());
Can you use AsContent() here?
::: dom/base/nsRange.cpp:3328
(Diff revision 1)
> + nsGenericDOMDataNode* t = static_cast<nsGenericDOMDataNode*>(mStartParent.get());
'auto' makes this pattern more readable, IMO.
::: dom/base/nsRange.cpp:3358
(Diff revision 1)
> + nsIFrame::RenderedText text = f->GetRenderedText(0, 0, UINT32_MAX);
Just GetRenderedText() seems clearer.
::: dom/base/nsRange.cpp:3371
(Diff revision 1)
> + result.Append(NS_LITERAL_STRING("\n"));
Append('\n') is shorter (three places)
::: dom/base/nsRange.cpp:3373
(Diff revision 1)
> + uint8_t display = f->StyleDisplay()->mDisplay;
No need to declare a variable for this, just use the expression directly in the 'switch'.
Attachment #8675487 -
Flags: review?(mats) → review+
Comment 58•9 years ago
|
||
FYI, nsDocumentEncoder has some special code for handling ShadowRoot
and IsHTMLElement(nsGkAtoms::rp) here:
http://mxr.mozilla.org/mozilla-central/source/dom/base/nsDocumentEncoder.cpp#107
did you consider those?
Comment 59•9 years ago
|
||
It would be good to have test that checks <table><tfoot>x</tfoot><tbody>y</tbody></table>
since those are reordered in rendering. And some tests using shadow dom.
Assignee | ||
Comment 60•9 years ago
|
||
https://reviewboard.mozilla.org/r/22369/#review20043
> It's a bit hard to understand what the zero means at the call site:
> GetRenderedText(0, aContentOffset, aContentOffset + 1);
> I think you should add a CONTENT_TEXT_OFFSETS to make it clear that the flag passed matches the offsets:
> GetRenderedText(CONTENT_TEXT_OFFSETS, aContentOffset, aContentOffset + 1);
>
> Do you think it's likely that we'll add additional flags here in the future? If not, then I suggest you make it an enum class to avoid typos like:
> GetRenderedText(aContentOffset, aContentOffset + 1);
> Actually, I think we should do that now anyway. Then if we need more flags in the future we add "operator|" etc.
>
> Alternatively, could we bundle these three params into an object, DOMOffsets/RenderedOffsets subclassed of SomeOffsets, or something like that? e.g.
> GetRenderedText(DOMOffsets(aContentOffset, aContentOffset + 1));
> GetRenderedText(RenderedOffsets(aRenderedOffset, aRenderedOffset + 1));
I thought I would have to add more flags but in the end I didn't. So making this an enum class parameter, and moving it to the end, seems like the way to go.
> |setOffsets| is a confusing name. It's used like so:
> if (!setOffsets) {
> // set the offsets
> setOffsets = true;
> }
> Which appears to be the opposite of what the name suggests.
> Perhaps |haveOffsets| is better?
Renamed to haveOffsets.
Assignee | ||
Comment 61•9 years ago
|
||
https://reviewboard.mozilla.org/r/22371/#review20065
::: dom/base/nsRange.cpp:3201
(Diff revision 1)
> + int mRequiredLineBreakCount;
It can only be 0, 1 or 2. I'll make it int8_t.
::: dom/base/nsRange.cpp:3206
(Diff revision 1)
> + while (mRequiredLineBreakCount > 0) {
Because we need to add as many newlines as mRequiredLineBreakCount says. In particular sometimes (when a <p> is present) we need to add two newlines.
::: dom/base/nsRange.cpp:3279
(Diff revision 1)
> + if (aFrame->GetType() != nsGkAtoms::tableCellFrame) {
Yes, good catch.
::: dom/base/nsRange.cpp:3325
(Diff revision 1)
> + nsIContent* currentNode = static_cast<nsIContent*>(mStartParent.get());
Sure.
::: dom/base/nsRange.cpp:3328
(Diff revision 1)
> + nsGenericDOMDataNode* t = static_cast<nsGenericDOMDataNode*>(mStartParent.get());
OK
::: dom/base/nsRange.cpp:3358
(Diff revision 1)
> + nsIFrame::RenderedText text = f->GetRenderedText(0, 0, UINT32_MAX);
Sure.
::: dom/base/nsRange.cpp:3371
(Diff revision 1)
> + result.Append(NS_LITERAL_STRING("\n"));
OK, though I have had to add an Append(char) overload for that.
::: dom/base/nsRange.cpp:3373
(Diff revision 1)
> + uint8_t display = f->StyleDisplay()->mDisplay;
OK
Assignee | ||
Comment 62•9 years ago
|
||
(In reply to Mats Palmgren (:mats) from comment #58)
> FYI, nsDocumentEncoder has some special code for handling ShadowRoot
I don't understand what that code is for. It was added in bug 806506 without explanation. I think we should ignore that for now, but let's ask William. Unless I hear otherwise I'd like to completely ignore shadow DOM, like we're ignoring CSS anonymous content. I'll write some tests to check that it's ignored. Chrome excludes shadow DOM content from "innerText".
> and IsHTMLElement(nsGkAtoms::rp) here:
> http://mxr.mozilla.org/mozilla-central/source/dom/base/nsDocumentEncoder.
> cpp#107
Yes ... clearly <rp> contents should be included in innerText, though Chrome doesn't do it. I'll fix that in the spec and here.
> It would be good to have test that checks <table><tfoot>x</tfoot><tbody>y</tbody></table>
> since those are reordered in rendering. And some tests using shadow dom.
Sure.
Flags: needinfo?(wchen)
Assignee | ||
Updated•9 years ago
|
Attachment #8675486 -
Flags: review?(mats)
Assignee | ||
Comment 63•9 years ago
|
||
Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats
Assignee | ||
Comment 64•9 years ago
|
||
Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Updated•9 years ago
|
Attachment #8675486 -
Flags: review?(mats) → review+
Comment 65•9 years ago
|
||
Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
https://reviewboard.mozilla.org/r/22369/#review20107
r=mats with the issue below fixed
::: layout/generic/nsIFrame.h:1907
(Diff revision 2)
> * @param aSkippedMaxLength Maximum number of characters to return
The updated documentation is still missing.
Comment 66•9 years ago
|
||
https://reviewboard.mozilla.org/r/22371/#review20111
::: dom/base/nsRange.cpp:3267
(Diff revision 2)
> +static int
You're still using 'int' here. I guess you want int8_t?
::: dom/base/nsRange.cpp:3386
(Diff revision 2)
> + result.Append('\n');
Oh, right, 'result' is the accumulator object here. I misread it as a string type the last time which is why I suggested to use Assign(char). Adding an overload is fine, but don't feel like you have to if you want to revert that bit.
Comment 67•9 years ago
|
||
(It's fine, but I still don't get why you want to use a signed type though.)
Comment 68•9 years ago
|
||
We're not compatible with Chrome on this test (table.innerText):
<table>
<tfoot><tr><td>footer</tfoot>
<thead><tr><td style="visibility:collapse">thead</thead>
<tbody><tr><td>tbody</tbody>
</table>
You may want to fix that, and perhaps also update the spec?
Please also add tests with visibility:collapse on (each) row-group, row, and cell.
Comment 69•9 years ago
|
||
We should probably also add tests for grid/flex items that use 'order':
<div style="display:grid"><x style="order:1">1</x><x>2</x></div>
<div style="display:flex"><x style="order:1">1</x><x>2</x></div>
"2" is rendered before "1", but the innerText should be "1\n2"
I suppose (since the <x>'s are blockified).
Probably also worth having tests with visibility:collapse on some items,
since it's supported at least for flexbox:
http://www.w3.org/TR/css-flexbox-1/#visibility-collapse
(and Grid might support it too in the future)
Assignee | ||
Comment 70•9 years ago
|
||
https://reviewboard.mozilla.org/r/22369/#review20197
::: layout/generic/nsIFrame.h:1907
(Diff revision 2)
> * @param aSkippedMaxLength Maximum number of characters to return
Fixed
Assignee | ||
Comment 71•9 years ago
|
||
https://reviewboard.mozilla.org/r/22371/#review20199
::: dom/base/nsRange.cpp:3267
(Diff revision 2)
> +static int
Yes.
I'm using a signed type because especially for limited-range types, it's safer. Bugs like for (...; x >= 0; ...) can't bite.
Assignee | ||
Comment 72•9 years ago
|
||
(In reply to Mats Palmgren (:mats) from comment #68)
> We're not compatible with Chrome on this test (table.innerText):
> <table>
> <tfoot><tr><td>footer</tfoot>
> <thead><tr><td style="visibility:collapse">thead</thead>
> <tbody><tr><td>tbody</tbody>
> </table>
>
> You may want to fix that, and perhaps also update the spec?
I'd rather not change anything here. I don't think there's anything clearly wrong with what the spec and our implementation do, changing this to match Chrome would add complexity, and I'd be surprised if it causes Web compatibility problems in practice.
> Please also add tests with visibility:collapse on (each) row-group, row, and
> cell.
Done.
(In reply to Mats Palmgren (:mats) from comment #69)
> We should probably also add tests for grid/flex items that use 'order':
> <div style="display:grid"><x style="order:1">1</x><x>2</x></div>
> <div style="display:flex"><x style="order:1">1</x><x>2</x></div>
>
> "2" is rendered before "1", but the innerText should be "1\n2"
> I suppose (since the <x>'s are blockified).
Done.
> Probably also worth having tests with visibility:collapse on some items,
> since it's supported at least for flexbox:
> http://www.w3.org/TR/css-flexbox-1/#visibility-collapse
> (and Grid might support it too in the future)
Done.
Assignee | ||
Comment 73•9 years ago
|
||
Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats
Assignee | ||
Comment 74•9 years ago
|
||
Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Assignee | ||
Comment 75•9 years ago
|
||
Assignee | ||
Comment 76•9 years ago
|
||
Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Assignee | ||
Comment 77•9 years ago
|
||
https://reviewboard.mozilla.org/r/22371/#review20203
I've updated <rp> handling a little bit to ensure that visiblity:hidden <rp>s don't contribute to the result of innerText.
Comment 78•9 years ago
|
||
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #62)
> (In reply to Mats Palmgren (:mats) from comment #58)
> > FYI, nsDocumentEncoder has some special code for handling ShadowRoot
>
> I don't understand what that code is for. It was added in bug 806506 without
> explanation. I think we should ignore that for now, but let's ask William.
> Unless I hear otherwise I'd like to completely ignore shadow DOM, like we're
> ignoring CSS anonymous content. I'll write some tests to check that it's
> ignored. Chrome excludes shadow DOM content from "innerText".
If I recall correctly, the code is to make serializing from a selection in shadow DOM work. In some cases, selection within a shadow DOM is described by a range where the root container is the shadow root and there is code that checks if the root container is visible before serializing the contents. It looks like innerText is supposed to work similar to serializing a selection, and given that selection with shadow DOM is not well defined and behaves poorly in all implementations (spec only says user agents provide a best attempt), it's not clear what behavior we actually want for textContent so whatever is easiest is probably fine for now. It should be more clear once selection and shadow DOM is specified.
Flags: needinfo?(wchen)
Assignee | ||
Comment 79•9 years ago
|
||
OK great, thanks.
Assignee | ||
Comment 80•9 years ago
|
||
These patches caused some accessibility test failures that I'm fixing up now.
Assignee | ||
Comment 81•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/ab657569f554a1b1dae9575fb95ca5a94949442a
Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
https://hg.mozilla.org/integration/mozilla-inbound/rev/a396f4262479a5695c918f59bacd542ba21448b0
Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Comment 82•9 years ago
|
||
Hmm, this landed already? Is my review needed here?
(Sorry, I've had some other stuff to review too)
Comment 83•9 years ago
|
||
Roc landed by mistake, it seems:
b70e89c03c56 Robert O'Callahan — Revert incorrectly committed changes ab657569f554 and a396f4262479
Assignee | ||
Comment 84•9 years ago
|
||
I landed it by mistake and backed it out immediately. So it still needs your review.
Flags: needinfo?(bugs)
Assignee | ||
Comment 85•9 years ago
|
||
Assignee | ||
Comment 86•9 years ago
|
||
Assignee | ||
Comment 87•9 years ago
|
||
Assignee | ||
Comment 88•9 years ago
|
||
Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
The test changes here are to adjust for the fact that
nsTextFrame::GetRenderedText can now trim whitespace from the end of lines
that end in a hard line break.
Attachment #8675486 -
Attachment description: MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats → MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
Attachment #8675486 -
Flags: review?(mzehe)
Assignee | ||
Comment 89•9 years ago
|
||
Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Assignee | ||
Comment 90•9 years ago
|
||
https://reviewboard.mozilla.org/r/22369/#review20995
mats: I'm adding aPushedFrame = true; in nsLineLayout.cpp. This fixes a bug that was revealed by the accessibility tests, where a line was not getting SetLineWrapped(true) even though we had a soft line break, because nsBlockFrame::ReflowInlineFrame did not see a pushed frame. So innerText could strip spaces before soft line breaks when it shouldn't.
This also fixes a bug in GetRenderedText where whitespace trimming was not being taken into account when we compute the length of the rendered text for a frame, and so our rendered text offsets were off sometimes. Also revealed by the accessibility tests.
marcoz: The accessibility test changes are entirely due to the new code trimming trailing whitespace when it's before a hard line break and collapsible.
Assignee | ||
Comment 91•9 years ago
|
||
https://reviewboard.mozilla.org/r/22371/#review20997
The changes here are very minor, just a static-analysis fix and a couple of tests to catch the bug about lines not being marked as having a soft line break.
Assignee | ||
Comment 92•9 years ago
|
||
Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
The test changes here are to adjust for the fact that
nsTextFrame::GetRenderedText can now trim whitespace from the end of lines
that end in a hard line break.
Assignee | ||
Comment 93•9 years ago
|
||
Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Assignee | ||
Comment 94•9 years ago
|
||
Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
The test changes here are to adjust for the fact that
nsTextFrame::GetRenderedText can now trim whitespace from the end of lines
that end in a hard line break.
Assignee | ||
Comment 95•9 years ago
|
||
Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Assignee | ||
Comment 96•9 years ago
|
||
Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
The test changes here are to adjust for the fact that
nsTextFrame::GetRenderedText can now trim whitespace from the end of lines
that end in a hard line break.
Assignee | ||
Comment 97•9 years ago
|
||
Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Assignee | ||
Updated•9 years ago
|
Attachment #8675486 -
Flags: review+ → review?(mats)
Assignee | ||
Updated•9 years ago
|
Attachment #8675487 -
Flags: review+ → review?(mats)
Comment 98•9 years ago
|
||
Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
https://reviewboard.mozilla.org/r/22369/#review21029
r=marcoz
::: accessible/generic/HyperTextAccessible.cpp:1989
(Diff revision 4)
> - // Only get info up to original offset, we know that will be larger than skipped offset
> + *aRenderedOffset = text.mOffsetWithinNodeRenderedText;
Can we keep the comment, please?
::: accessible/generic/HyperTextAccessible.cpp:2013
(Diff revision 4)
> - // We only need info up to skipped offset -- that is what we're converting to original offset
> + *aContentOffset = text.mOffsetWithinNodeText;
Same here.
r=me with the two nits fixed. Thanks!
Attachment #8675486 -
Flags: review?(mzehe) → review+
Comment 99•9 years ago
|
||
Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
https://reviewboard.mozilla.org/r/22369/#review21059
r=mats
::: layout/generic/nsTextFrame.cpp:9009
(Diff revisions 3 - 4)
> + NS_ASSERTION(result.mOffsetWithinNodeRenderedText >= offsetInRenderedString &&
Please use MOZ_ASSERT instead.
Attachment #8675486 -
Flags: review?(mats) → review+
Comment 100•9 years ago
|
||
Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
https://reviewboard.mozilla.org/r/22371/#review21061
Attachment #8675487 -
Flags: review?(mats) → review+
Comment 101•9 years ago
|
||
+struct InnerTextAccumulator {
{ goes to its own line
+ nsString& mString;
+ int8_t mRequiredLineBreakCount;
Could you put these to the end of the struct.
void Append(char ch)
s/ch/aCh/
+static bool
+ElementIsVisible(nsIContent* aContent)
+{
+ Element* elem = aContent->AsElement();
+ if (!elem) {
+ return false;
+ }
You must check aContent->IsElement() before using AsElement(). AsElement just does cast.
(and in DOM world its name would be GetAsElement() if it might return null.)
+ RefPtr<nsStyleContext> sc = nsComputedDOMStyle::GetStyleContextForElement(
+ elem, nullptr, nullptr);
2 spaces for indentation.
How does ElementIsVisible behave if the element is inside display: none; ? And how should it work?
GetRequiredInnerTextLineBreakCount
So <p<>'s linebreak handling never depends on its styling.
+enum TreeTraversalState {
+ AT_NODE,
+ AFTER_NODE
+};
Could you please add some comment what the states are about.
+void
+nsGenericHTMLElement::GetInnerText(mozilla::dom::DOMString& aValue,
+ mozilla::ErrorResult& aError)
you can drop mozilla::dom and mozilla:: here
+{
+ if (!GetPrimaryFrame(Flush_Layout)) {
+ RefPtr<nsStyleContext> sc =
+ nsComputedDOMStyle::GetStyleContextForElementNoFlush(this, nullptr, nullptr);
2 space for indentation
+ RefPtr<nsRange> range;
+ nsresult rv = nsRange::CreateRange(static_cast<nsINode*>(this), 0, this,
+ GetChildCount(), getter_AddRefs(range));
2 spaces for indentation
Hmm, GetInnerText looks a bit malloc heavy. Did you compare the performance to other engines?
One simple optimization would be to not create Range object all the time. Either reuse one (and let Document to keep it alive) or
don't use Range at all but just pass the right 'startParent' and 'endParent' to GetInnerTextNoFlush.
In fact, since you just pass 'this' as start and end, why do we need Range at all?
GetInnerTextNoFlush could be just a static method taking startParent and endOffset, or if we want to support using if for other cases too,
make it take start/endParent and also start/endOffset. But no need for creating Range object all the time.
+ nsString str;
+ const char16_t* s = aValue.BeginReading();
+ const char16_t* end = aValue.EndReading();
nsAutoString for str.
+ if (!str.IsEmpty()) {
+ RefPtr<nsTextNode> textContent =
+ new nsTextNode(NodeInfo()->NodeInfoManager());
2 spaces for indentation
+ already_AddRefed<mozilla::dom::NodeInfo> ni =
+ NodeInfo()->NodeInfoManager()->GetNodeInfo(nsGkAtoms::br,
ditto
+ str.SetLength(0);
.Truncate()
So I'd like to see a bit less malloc heavy (especially given that Range is cycle collectable object) setup for the getter, and it should be rather easy to do.
Flags: needinfo?(bugs)
Updated•9 years ago
|
Attachment #8675487 -
Flags: review?(bugs) → review-
Assignee | ||
Comment 102•9 years ago
|
||
(In reply to Marco Zehe (:MarcoZ) from comment #98)
> ::: accessible/generic/HyperTextAccessible.cpp:1989
> (Diff revision 4)
> > - // Only get info up to original offset, we know that will be larger than skipped offset
> > + *aRenderedOffset = text.mOffsetWithinNodeRenderedText;> Can we keep the comment, please?
I don't think we should keep it, because the fact "we know that will be larger than skipped offset" is no longer relevant. The new code doesn't depend on that assumption.
> ::: accessible/generic/HyperTextAccessible.cpp:2013
> (Diff revision 4)
> > - // We only need info up to skipped offset -- that is what we're converting to original offset
> > + *aContentOffset = text.mOffsetWithinNodeText;
>
> Same here.
I don't think this comment really makes sense anymore either.
Flags: needinfo?(mzehe)
Assignee | ||
Comment 103•9 years ago
|
||
(In reply to Mats Palmgren (:mats) from comment #99)
> ::: layout/generic/nsTextFrame.cpp:9009
> (Diff revisions 3 - 4)
> > + NS_ASSERTION(result.mOffsetWithinNodeRenderedText >= offsetInRenderedString &&
>
> Please use MOZ_ASSERT instead.
I'm not a fan of this change, since I think an error in rendered text offsets is not worth interrupting a test run for, but I don't really care so I'll do it.
Assignee | ||
Comment 104•9 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #101)
> +struct InnerTextAccumulator {
> { goes to its own line
Done.
> + nsString& mString;
> + int8_t mRequiredLineBreakCount;
> Could you put these to the end of the struct.
Done.
> void Append(char ch)
> s/ch/aCh/
Done
> +static bool
> +ElementIsVisible(nsIContent* aContent)
> +{
> + Element* elem = aContent->AsElement();
> + if (!elem) {
> + return false;
> + }
> You must check aContent->IsElement() before using AsElement(). AsElement
> just does cast.
> (and in DOM world its name would be GetAsElement() if it might return null.)
I made ElementIsVisible take Element* and moved AsElement() to the caller. The caller only calls it if IsHTMLElement(nsGkAtoms::rp) is true, which I guess is enough to justify calling AsElement().
> + RefPtr<nsStyleContext> sc = nsComputedDOMStyle::GetStyleContextForElement(
> + elem, nullptr, nullptr);
> 2 spaces for indentation.
Done.
> How does ElementIsVisible behave if the element is inside display: none; ?
> And how should it work?
It returns true. It's only supposed to check the visiblity style. This is because <rp> is normally display:none but we want to include it in innerText --- but we *don't* want to include it if it's visibility:hidden.
> GetRequiredInnerTextLineBreakCount
> So <p<>'s linebreak handling never depends on its styling.
Correct. I'm guessing that making linebreak handling depend on CSS style (e.g. 'margin') would be unnecessarily complicated.
> +enum TreeTraversalState {
> + AT_NODE,
> + AFTER_NODE
> +};
> Could you please add some comment what the states are about.
Sure.
> +void
> +nsGenericHTMLElement::GetInnerText(mozilla::dom::DOMString& aValue,
> + mozilla::ErrorResult& aError)
> you can drop mozilla::dom and mozilla:: here
Done.
> +{
> + if (!GetPrimaryFrame(Flush_Layout)) {
> + RefPtr<nsStyleContext> sc =
> + nsComputedDOMStyle::GetStyleContextForElementNoFlush(this, nullptr,
> nullptr);
> 2 space for indentation
Done.
> + RefPtr<nsRange> range;
> + nsresult rv = nsRange::CreateRange(static_cast<nsINode*>(this), 0, this,
> + GetChildCount(), getter_AddRefs(range));
> 2 spaces for indentation
Done.
> Hmm, GetInnerText looks a bit malloc heavy. Did you compare the performance
> to other engines?
No. I don't know of any use-cases where performance would matter, though I suppose that won't stop people writing silly benchmarks.
> One simple optimization would be to not create Range object all the time.
> Either reuse one (and let Document to keep it alive) or
> don't use Range at all but just pass the right 'startParent' and 'endParent'
> to GetInnerTextNoFlush.
> In fact, since you just pass 'this' as start and end, why do we need Range
> at all?
Because I'd like to reuse this to implement Selection.toString().
> GetInnerTextNoFlush could be just a static method taking startParent and
> endOffset, or if we want to support using if for other cases too,
> make it take start/endParent and also start/endOffset. But no need for
> creating Range object all the time.
OK, I'll do that.
Assignee | ||
Comment 105•9 years ago
|
||
Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
The test changes here are to adjust for the fact that
nsTextFrame::GetRenderedText can now trim whitespace from the end of lines
that end in a hard line break.
Assignee | ||
Comment 106•9 years ago
|
||
Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Attachment #8675487 -
Flags: review- → review?(bugs)
Comment 107•9 years ago
|
||
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #102)
> (In reply to Marco Zehe (:MarcoZ) from comment #98)
> > ::: accessible/generic/HyperTextAccessible.cpp:1989
> > (Diff revision 4)
> > > - // Only get info up to original offset, we know that will be larger than skipped offset
> > > + *aRenderedOffset = text.mOffsetWithinNodeRenderedText;> Can we keep the comment, please?
>
> I don't think we should keep it, because the fact "we know that will be
> larger than skipped offset" is no longer relevant. The new code doesn't
> depend on that assumption.
OK, thanks for the clarification! Just wanted to make sure we don't lose any important information. This goes for the second one, too.
Flags: needinfo?(mzehe)
Comment 108•9 years ago
|
||
Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
Note
struct RenderedText {
nsString mString;
uint32_t mOffsetWithinNodeRenderedText;
int32_t mOffsetWithinNodeText;
RenderedText() : mOffsetWithinNodeRenderedText(0),
mOffsetWithinNodeText(0) {}
};
that mString shows up pretty badly in a silly testcase like
http://mozilla.pettay.fi/moztests/innertext_nonsimple_nostyle_test.html
That test has two parts, and it is the latter part where mString handling shows up.
The first 500 runs is reflow bounded (can we do anything to that?), and the latter part is malloc heavy.
Using nsAutoString and not nsString reduces allocations quite a bit. Still behind blink's performance, but at least a bit closer.
Comment 109•9 years ago
|
||
Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
+IsVisibleAndNotInReplacedElement(nsIFrame* aFrame)
+{
+ if (!aFrame || !aFrame->StyleVisibility()->IsVisible()) {
+ return false;
+ }
+ for (nsIFrame* f = aFrame->GetParent(); f; f = f->GetParent()) {
+ if (f->GetContent() && f->GetContent()->IsHTMLElement(nsGkAtoms::button)) {
+ continue;
+ }
+ if (f->IsFrameOfType(nsIFrame::eReplaced)) {
+ return false;
+ }
+ }
+ return true;
+}
This shows up in performance profiles quite a bit (less than the nsString though). Anything we can do to it?
+nsRange::GetInnerTextNoFlush(DOMString& aValue, ErrorResult& aError,
+ nsINode* aStartParent, uint32_t aStartOffset,
+ nsINode* aEndParent, uint32_t aEndOffset)
I think aStartParent and aEndParent should be nsIContent objects since that is what the algorithm assumes.
(aEndParent->AsContent() isn't safe if aEndParent isn't nsIContent)
+void
+nsGenericHTMLElement::GetInnerText(mozilla::dom::DOMString& aValue,
+ mozilla::ErrorResult& aError)
+{
+ if (!GetPrimaryFrame(Flush_Layout)) {
+ RefPtr<nsStyleContext> sc =
+ nsComputedDOMStyle::GetStyleContextForElementNoFlush(this, nullptr, nullptr);
+ if (!sc || sc->StyleDisplay()->mDisplay == NS_STYLE_DISPLAY_NONE ||
+ !GetCurrentDoc()) {
GetCurrentDoc() is deprecated, and I think we want this to deal with shadow dom too, so IsInComposedDoc() should work fine.
+ // Batch possible DOMSubtreeModified events.
+ mozAutoSubtreeModified subtree(nullptr, nullptr);
+
+ // Scope firing mutation events so that we don't carry any state that
+ // might be stale
+ {
+ // We're relying on mozAutoSubtreeModified to keep a strong reference if
+ // needed.
+ nsIDocument* doc = OwnerDoc();
+
+ // Optimize the common case of there being no observers
+ if (nsContentUtils::HasMutationListeners(doc, NS_EVENT_BITS_MUTATION_NODEREMOVED)) {
+ subtree.UpdateTarget(doc, nullptr);
+ // Keep "this" alive during mutation listener firing
+ kungFuDeathGrip = this;
+ nsCOMPtr<nsINode> child;
+ for (child = nsINode::GetFirstChild();
+ child && child->GetParentNode() == this;
+ child = child->GetNextSibling()) {
+ nsContentUtils::MaybeFireNodeRemoved(child, this, doc);
+ }
+ }
+ }
// Batch possible DOMSubtreeModified events.
mozAutoSubtreeModified subtree(doc, nullptr);
FireNodeRemovedForChildren();
should work here.
Attachment #8675487 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 110•9 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #108)
> Note
> struct RenderedText {
> nsString mString;
> uint32_t mOffsetWithinNodeRenderedText;
> int32_t mOffsetWithinNodeText;
> RenderedText() : mOffsetWithinNodeRenderedText(0),
> mOffsetWithinNodeText(0) {}
> };
> that mString shows up pretty badly in a silly testcase like
> http://mozilla.pettay.fi/moztests/innertext_nonsimple_nostyle_test.html
> That test has two parts, and it is the latter part where mString handling
> shows up.
> The first 500 runs is reflow bounded (can we do anything to that?)
Not the way it's currently specced, no.
> and the
> latter part is malloc heavy.
> Using nsAutoString and not nsString reduces allocations quite a bit. Still
> behind blink's performance, but at least a bit closer.
OK. The question is, is this something worth optimizing?
Assignee | ||
Comment 111•9 years ago
|
||
According to my profile 2/3 of the time is actually in nsTextFrame::GetRenderedText for your benchmark. 1/7 is in IsVisibleAndNotInReplacedElement. The string memory overhead (Linux) seems to be quite low, maybe 1/15.
Assignee | ||
Comment 112•9 years ago
|
||
Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
The test changes here are to adjust for the fact that
nsTextFrame::GetRenderedText can now trim whitespace from the end of lines
that end in a hard line break.
Assignee | ||
Comment 113•9 years ago
|
||
Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
Attachment #8675487 -
Flags: review+ → review?(bugs)
Assignee | ||
Comment 114•9 years ago
|
||
Bug 264412. Optimize GetRenderedText. r=mats
With these changes we're slightly faster than Chrome on the non-reflowing part of
Olli's testcase.
Attachment #8681097 -
Flags: review?(mats)
Assignee | ||
Comment 115•9 years ago
|
||
I(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #111)
> According to my profile 2/3 of the time is actually in
> nsTextFrame::GetRenderedText for your benchmark. 1/7 is in
> IsVisibleAndNotInReplacedElement. The string memory overhead (Linux) seems
> to be quite low, maybe 1/15.
I misread the profile. A lot of the time in GetRenderedText was string processing; Append(char) seems to have significant fixed overhead. The third patch here avoids that. I also made changes to IsVisibleAndNotInReplacedElement to make it somewhat faster, and made a tweak so we call it roughly half as often.
Comment 116•9 years ago
|
||
Comment on attachment 8681097 [details]
MozReview Request: Bug 264412. Optimize GetRenderedText. r=mats
https://reviewboard.mozilla.org/r/23769/#review21253
::: layout/generic/nsTextFrame.cpp:8906
(Diff revision 1)
> + if (ch == '\n') {
Perhaps this is more readable?
if ((ch == '\n' && !aStyle->NewlineIsSignificant(aFrame)) ||
(ch == '\t' && !aStyle->TabIsSignificant())) {
ch = ' ';
}
::: layout/generic/nsTextFrame.cpp:8918
(Diff revision 1)
> + if (aStyle->mTextTransform != NS_STYLE_TEXT_TRANSFORM_NONE) {
Why exclude TRANSFORM_NONE up front?
The 'switch' is a NOP for that value already so it doesn't seem necessary.
Is it just an optimization for the common case?
If so, you should probably point that out for clarity, and perhaps also
make it an early return instead?
::: layout/generic/nsTextFrame.cpp:9058
(Diff revision 1)
> + trimmedOffsets.GetEnd() - iter.GetOriginalOffset());
iter.GetOriginalOffset() hides a subtraction and this expression is
repeated in a couple of places (also in a loop).
I think you should make this a local var, say originalOffset.
That should make the std::min fit on one line?
Also, you can avoid the addition inside the loop, like so:
auto end = originalOffset + runLength;
for (int32_t i = originalOffset; i < end; ++i) {
... CharAt(i)
It might be worth getting rid of iter.GetOriginalOffset()
in the while-loop condition too unless it complicates
the code too much.
Attachment #8681097 -
Flags: review?(mats)
Comment 117•9 years ago
|
||
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #110)
> (In reply to Olli Pettay [:smaug] from comment #108)
> > and the
> > latter part is malloc heavy.
> > Using nsAutoString and not nsString reduces allocations quite a bit. Still
> > behind blink's performance, but at least a bit closer.
>
> OK. The question is, is this something worth optimizing?
Well, we've spent quite some time to optimize innerHTML for example since many or probably most of
DOM related benchmarks have some silly micro benchmarks for it.
Now with innerText supported I wouldn't be surprised benchmarks adding tests for it.
The reason I tested the performance here was to make sure we aren't way of the performance of others.
Updated•9 years ago
|
Attachment #8675487 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 118•9 years ago
|
||
(In reply to Mats Palmgren (:mats) from comment #116)
> ::: layout/generic/nsTextFrame.cpp:8906
> (Diff revision 1)
> > + if (ch == '\n') {
>
> Perhaps this is more readable?
>
> if ((ch == '\n' && !aStyle->NewlineIsSignificant(aFrame)) ||
> (ch == '\t' && !aStyle->TabIsSignificant())) {
> ch = ' ';
> }
Sure
> ::: layout/generic/nsTextFrame.cpp:8918
> (Diff revision 1)
> > + if (aStyle->mTextTransform != NS_STYLE_TEXT_TRANSFORM_NONE) {
>
> Why exclude TRANSFORM_NONE up front?
> The 'switch' is a NOP for that value already so it doesn't seem necessary.
> Is it just an optimization for the common case?
> If so, you should probably point that out for clarity, and perhaps also
> make it an early return instead?
Sorry, that was left over from previous work. Removed.
> ::: layout/generic/nsTextFrame.cpp:9058
> (Diff revision 1)
> > + trimmedOffsets.GetEnd() - iter.GetOriginalOffset());
>
> iter.GetOriginalOffset() hides a subtraction and this expression is
> repeated in a couple of places (also in a loop).
> I think you should make this a local var, say originalOffset.
> That should make the std::min fit on one line?
> Also, you can avoid the addition inside the loop, like so:
>
> auto end = originalOffset + runLength;
> for (int32_t i = originalOffset; i < end; ++i) {
> ... CharAt(i)
>
> It might be worth getting rid of iter.GetOriginalOffset()
> in the while-loop condition too unless it complicates
> the code too much.
I'd rather not do these things because they complicate the code and I very much doubt there's any measurable performance impact.
Comment 119•9 years ago
|
||
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #118)
> I'd rather not do these things because they complicate the code and I very
> much doubt there's any measurable performance impact.
Well, I didn't quite like the formatting here:
runLength = std::min(runLength,
trimmedOffsets.GetEnd() - iter.GetOriginalOffset());
so I figured a local var would fix that:
auto originalOffset = iter.GetOriginalOffset();
runLength = std::min(runLength, trimmedOffsets.GetEnd() - originalOffset);
and when you have that you might as well use it in the for-loop.
Oh well, it's a matter of opinion I guess. :-)
So r=mats with the TRANSFORM_NONE check removed.
Assignee | ||
Comment 120•9 years ago
|
||
Comment on attachment 8681097 [details]
MozReview Request: Bug 264412. Optimize GetRenderedText. r=mats
Bug 264412. Optimize GetRenderedText. r=mats
With these changes we're slightly faster than Chrome on the non-reflowing part of
Olli's testcase.
Attachment #8681097 -
Flags: review?(mats)
Updated•9 years ago
|
Attachment #8681097 -
Flags: review?(mats) → review+
Comment 121•9 years ago
|
||
Comment on attachment 8681097 [details]
MozReview Request: Bug 264412. Optimize GetRenderedText. r=mats
https://reviewboard.mozilla.org/r/23769/#review21259
r=mats if my question below isn't an issue
::: dom/base/nsRange.cpp:3403
(Diff revision 2)
> - } else {
> + if (currentNode == endNode && currentState == endState) {
> + break;
Hmm, doesn't this skip the AFTER_NODE action for the endNode?
::: layout/generic/nsTextFrame.cpp:9050
(Diff revision 2)
> + runLength = std::min(runLength,
> + trimmedOffsets.GetEnd() - iter.GetOriginalOffset());
I'd prefer if the second arg to std::min lined up with the first.
Assignee | ||
Comment 122•9 years ago
|
||
(In reply to Mats Palmgren (:mats) from comment #121)
> ::: dom/base/nsRange.cpp:3403
> (Diff revision 2)
> > - } else {
> > + if (currentNode == endNode && currentState == endState) {
> > + break;
>
> Hmm, doesn't this skip the AFTER_NODE action for the endNode?
That's the same behavior as before. We stop as soon as we reach the state endNode/endState. And that's OK. For example, when HTMLElement calls GetInnerTextNoFlush, we pass the subject element and its GetChildCount() for aEndParent/aEndOffset. So we stop when currentNode is aEndParent and currentState is AFTER_NODE, which is before we've done the AFTER_NODE actions for aEndParent, which is correct per spec.
Comment 123•9 years ago
|
||
OK, good.
Assignee | ||
Comment 124•9 years ago
|
||
Assignee | ||
Comment 125•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/95346f49d048f5abb3d60df4beec4fe4ef412017
Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz
https://hg.mozilla.org/integration/mozilla-inbound/rev/5ebc59281c25fbb8ea288f24797b9ece1fdb21a5
Bug 264412. Implement HTMLElement.innerText. r=smaug,mats
https://hg.mozilla.org/integration/mozilla-inbound/rev/9160f08660b8290559e427fd80d617edd86fe2a6
Bug 264412. Optimize GetRenderedText. r=mats
Comment 126•9 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/95346f49d048
https://hg.mozilla.org/mozilla-central/rev/5ebc59281c25
https://hg.mozilla.org/mozilla-central/rev/9160f08660b8
Status: NEW → RESOLVED
Closed: 20 years ago → 9 years ago
status-firefox45:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
Comment 127•9 years ago
|
||
Please update the compatibility table in https://developer.mozilla.org/en-US/docs/Web/API/Node/innerText - supported since Firefox 45.
Comment 128•9 years ago
|
||
I think there's a bug.
<p style="visibility:hidden"><span style="visibility:visible">p</span></p><div>div</div>
This chunk produces "p\ndiv" in FF but "p\n\ndiv" in Chrome.
Chrome behavior seems to make more sense to me. Also, removing visibility:hidden/visibility:visible from <p> produces more sensible double \n in FF right away.
Comment 129•9 years ago
|
||
Besides the above mentioned combination (which I assume to be a bug), I have so far found 2 differences in Chrome and FF: Chrome doesn't respect "whitespace: pre-line" (actually none of the browsers do besides FF) which I think is a bug in Chrome (and IE/Edge). Chrome puts 1 newline between paragraph and non-paragraph (like div) whereas FF puts 2, which I think is correct, and which IE10+/Edge do too.
The updated compat table — http://kangax.github.io/jstests/innerText/
I'm planning to add few more tests (from the ones Robert added with this commit) to find out if there's more deviations among implementations. And will try to file tickets with Chrome as well.
Comment 130•9 years ago
|
||
Quick note about performance.
On my test page, I get 25-30ms vs. 35-40ms. in Chrome vs. Firefox (jsperf is unusable so my benchmark is crude) — so a pretty significant/noticeable difference. This might be because Chrome cuts corners.
The reason this is important is because some apps need to request innerText quite frequently on large documents when editing text and having to calculate position of cursor in formatted text.
P.S. Edge seems to perform really bad (~100ms) although I'm checking it through VirtualBox so perhaps that's what's affecting it.
Comment 131•9 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/95346f49d048
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/5ebc59281c25
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/9160f08660b8
status-b2g-v2.5:
--- → fixed
Comment 132•9 years ago
|
||
removing the b2g 2.5 flag since this commit has been reverted due to an incorrect merge, sorry for the confusion
status-b2g-v2.5:
fixed → ---
Assignee | ||
Comment 133•9 years ago
|
||
(In reply to Juriy "kangax" Zaytsev from comment #128)
> <p style="visibility:hidden"><span
> style="visibility:visible">p</span></p><div>div</div>
>
> This chunk produces "p\ndiv" in FF but "p\n\ndiv" in Chrome.
>
> Chrome behavior seems to make more sense to me. Also, removing
> visibility:hidden/visibility:visible from <p> produces more sensible double
> \n in FF right away.
The FF/spec behavior here is intentional.
Handling CSS 'visibility' in innerText is tricky because in CSS 'visibility' does not affect layout, but that doesn't make sense in the context of 'innerText'. (The most faithful mimicry would be to replace text with spaces, perhaps, but that seems crazy and no browser does it.) So I've chosen to take a simple approach in FF and the spec where visibility:visible elements are just ignored for innerText. Chrome has a halfway approach where visibility:hidden suppresses some related text but not a <p>'s extra newlines *if the <p> has visible descendants* (or something like that).
We could make the spec a little more complicated to match Chrome here, but I'm reluctant to do that without a strong reason. Chrome's <p> handling has some definitely undesirable features, e.g. abc<p>123</p>def gives "abc\n123\n\ndef", i.e. you get a "bottom margin" but not a "top margin", which makes no sense, so even if Chrome's <p>-related behavior had a simple spec (which I doubt), we wouldn't really want to follow it. I very much doubt this case will matter in practice.
Assignee | ||
Comment 134•9 years ago
|
||
(In reply to Juriy "kangax" Zaytsev from comment #129)
> Besides the above mentioned combination (which I assume to be a bug), I have
> so far found 2 differences in Chrome and FF: Chrome doesn't respect
> "whitespace: pre-line" (actually none of the browsers do besides FF) which I
> think is a bug in Chrome (and IE/Edge). Chrome puts 1 newline between
> paragraph and non-paragraph (like div) whereas FF puts 2, which I think is
> correct, and which IE10+/Edge do too.
>
> The updated compat table — http://kangax.github.io/jstests/innerText/
>
> I'm planning to add few more tests (from the ones Robert added with this
> commit) to find out if there's more deviations among implementations. And
> will try to file tickets with Chrome as well.
Thanks!!!
Assignee | ||
Comment 135•9 years ago
|
||
(In reply to Juriy "kangax" Zaytsev from comment #130)
> On my test page, I get 25-30ms vs. 35-40ms. in Chrome vs. Firefox (jsperf is
> unusable so my benchmark is crude) — so a pretty significant/noticeable
> difference. This might be because Chrome cuts corners.
>
> The reason this is important is because some apps need to request innerText
> quite frequently on large documents when editing text and having to
> calculate position of cursor in formatted text.
OK ... it would be great if you can file a bug with a specific testcase on this.
Flags: needinfo?(kangax)
Comment 136•9 years ago
|
||
From a spec standpoint, just to confirm, this from 2011-02-03 is the latest there is right?
https://rawgit.com/timdown/rangy/master/fiddlings/spec/innerText.htm
Comment 137•9 years ago
|
||
Tantek,
here by Roc
https://github.com/rocallahan/innerText-spec
Comment 138•9 years ago
|
||
http://rocallahan.github.io/innerText-spec/ is WHATWG approved innerText spec (https://github.com/whatwg/compat/issues/5#issuecomment-168049752)
Firefox Nightly 46 passes all tests (http://rocallahan.github.io/innerText-spec/setter-tests.html, http://rocallahan.github.io/innerText-spec/getter-tests.html).
Chrome issue https://code.google.com/p/chromium/issues/detail?id=573309
Comment 139•9 years ago
|
||
When we try to fetch innerText when it is not set at all, it used return undefined and now it is returning "" (empty string), can this be changed so that in the case that innerText is not set then it comes as undefined and not as an empty string. Technically if you have not set any thing it should be undefined, or was there a reason to actually have the default be "" empty string.
Comment 140•9 years ago
|
||
innerText is not supposed to return something that was set before. It returns element's textual representation, which always exists and should always be a string.
Flags: needinfo?(kangax)
Updated•4 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•