Closed
Bug 68411
Opened 24 years ago
Closed 24 years ago
Patch to avoid ignorable whitespace in a general manner
Categories
(Core :: Layout, defect)
Core
Layout
Tracking
()
VERIFIED
FIXED
mozilla0.9
People
(Reporter: rbs, Assigned: rbs)
References
(Blocks 1 open bug)
Details
(Keywords: perf)
Attachments
(5 files)
Currently, frames are created for ignorable whitespace even when these are not
necessary. For example, the whitespace between <table>\n<tr>\n<td>
should be excluded during the construction of child frames. And indeed,
the table code includes a logic at the frame construction level to avoid
these ignorable whitespace. Unfortunately, other elements like math (or even
xul/xbl) are left to themselves.
As a consequence, I was led in the mathml code (see nsMathMLContaineFrame.cpp,
for example) to temporarily avoid these whitespace in a _very innefficient_
manner, doing each loop in the form:
while {
if (!IsOnlyWhiteSpace(childFrame)) {
...
}
}
On the other hand, xbl took the drastic but safe measure of getting rid of
the whitespace nodes directly from the content model ! (see
nsXBLService::StripWhitespaceNodes()].
This patch aims at extending the solution used by the table code so that
other frames can benefit from it too! Essentially, a frame which doesn't
want ignorable whitespace as children just has to set the newly added bit:
state |= NS_FRAME_EXCLUDE_IGNORABLE_WHITESPACE;
Hence, any frame container (most block elements?) will be able to set the bit to
get the optimization. (I recollect that the optimization in memory and
reflow-time by avoiding ignorable whitespace is something that has been in the
air for quite some time, but nobody has taken the time to do it.)
The addition of this bit has meant that I had to update the table code
to set it there. (But these remained straightforward statements in the
corresponding Init() of table related elements.)
While adding this bit, I collected the bits that were elsewhere and added
them to nsIFrame.h. I am not sure that everybody will be agreable with
this collocation. Please let me know if this isn't okay, and I will put
them back where they came from. I collected the bits there to avoid collisions
in their use.
Attachements coming. Of note: the second change in the diff of
nsCSSFrameConstructor.cpp is the place where the bit is set for mathml
frames, i.e., it doesn't affect other frames (and it is in #ifdef).
Comment 5•24 years ago
|
||
This patch, of course, doesn't include the removal of the unneeded code we can
rip out once this is done. :-) It might be useful to make sure (by dumping the
frame tree in certain cases) that this is really taking out all the unneeded
whitespace that we have hacks all over the place to ignore.
Indeed it is not the intention of this bug. But that could well be a follow-up
bug. During development, I had the following additional debug radars:
+ nsFrameState state;
+ aParentFrame->GetFrameState(&state);
+ if (NS_FRAME_EXCLUDE_IGNORABLE_WHITESPACE & state) {
nsFrame::ListTag(stdout, aParentFrame)
printf(" is avoiding ignorable whitespace\n");
...
}
else {
nsFrame::ListTag(stdout, aParentFrame)
printf(" is taking ignorable whitespace\n");
}
It helped to see exactly the tags that were (or were not) filtering. I could
see numerous tags like <div>, <p>, etc, taking the whitespace.
Once the patch is checked in, another bug can be opened and someone familiar
with where the hacks are could go ahead and set the bit for the cases, and then
remove the hacks. A word of caution though: the present exclusion is made in
ConstrutFrameByTag() which caters for the normal case of inter-tag relationship
<tag>\n<tag>, like table and mathml. It would therefore be necessary to check
that NS_NewTextFrame() is not getting called in any unexpected place (e.g.,
generated content that is only whitespace, assuming this case is possible).
Keywords: perf
Comment 7•24 years ago
|
||
Does this patch make sure that you're not excluding whitespace when
'white-space: pre'? If it doesn't, you'll break PRE and 'white-space: pre'.
(Also, once that's done, the style hint for white-space will probably need to
be NS_STYLE_HINT_FRAMECHANGE.)
Actually, the patch is not excluding whitespace on the frames. Its intention is
only to provide a basis to do so. The whitespace that are excluded at those that
the table code was already excluding. I just also added the exclusion of the
whitespace not needed for mathml. This is in view of removing the numerous
hacks "if (!IsOnlyWhitespace(childFrame))" that are polluting the mathml code.
So, the patch by itself is low-risk, while the other much wider task involving
other block/inline frames is high-risk and would need more care and scrutiny,
_if_ it has to be carried out.
Filed follow-up bug 68489 on the issue of activating the bit in other frames.
Assignee | ||
Comment 10•24 years ago
|
||
r=? sr=?
Assignee | ||
Comment 11•24 years ago
|
||
BTW, apart from four cases from the renowned false positives in the table
regression tests, I couldn't detect any weirdness with my changes. Moreover,
I fortuitously uncovered, by inspection, a bug that was lingering in the
<caption> case and which is going to be fixed by the patch.
The previous filtering code was ignoring the whitespace even in the <caption>.
This means that <caption> <i>italic</i> \n <b>bold></b> </caption> would have
rendered "italicbold" jammed together. However, this situation never happened
because of the other bug 58197 that causes nested elements in a <caption> to be
dropped altogether.
The <caption> (nsTableCaptionFrame) is a nsBlockFrame, and I didn't include the
setting of the bit NS_FRAME_EXCLUDE_IGNORABLE_WHITESPACE in the Init() of
nsTableCaptionFrame. For completeness, the bit is also not set in
nsTableCellFrame -- the previous code didn't exclude whitespace there
(<td><td> is very different from <td>\n</td>, isn't it). Apart from
these two, the bit is set on all other table elements.
The bit is set on all math elements because whitespace is ignorable
everywhere within the <math>...</math> environment. (The inter-frame
spacing is defined in a custom way, e.g., via the lspace="." or rspace="."
attributes which have some default values in the Operator Dictionary.)
Comment 12•24 years ago
|
||
I'll let karnaze comment on the table diffs, but this certainly seems like a
great idea!
As to the consolidation of the frame state flags, I could go either way on this.
It's nice to have them in one place; OTOH, I don't think we need to worry about
collisions (we shouldn't be running through code that cares about
NS_BLOCK_MARGIN_ROOT while reflowing an inline frame, e.g.) Was there a place
that you actually saw a collision? (I seem to recall a message indicating that
you did, but can't remember where...)
Comment 13•24 years ago
|
||
overall, not a bad idea. we'll discuss it this afternoon in our group mtg to
see if there are any issues.
I want to check with the editor folks because I recall some issue with
whitespace and frames having to do with caret painting. I also want to verify
that the output system doesn't need these frames to preserve whitespace on
output. We just walk the content model, right? Editor folks, please remind us
what the issues are here.
one mistake: please do not consolidate all the flags into nsIFrame.h. The
flags in nsIFrame.h are those that are universal to all frames...
// The low 16 bits of the frame state word are reserved by this API.
#define NS_FRAME_RESERVED 0x0000FFFF
All the other flags are for the use of the specific implementation. By
definition, there cannot be meaningful collision between high-order bits between
frame types. The place where we have to be careful is if a frame class uses
some bits, and other frames are derived from it, then the derived class has to
respect the bits reserved by it's parent class.
Comment 14•24 years ago
|
||
I assume this will be checking with the style system, and omitting frames only
for whitespace which is insignificant given the current style and context? If
so, it shouldn't cause any problems I'm aware of (and sounds like a good idea).
The output system only cares about the content model, and doesn't know anything
about frames (at least currently). So this shouldn't cause a problem with
output, as long as the whitespace still has nodes in the content model. (We've
talked about having a smarter output system that could output what we're really
showing, style and all, but it's not likely that's going to happen in the
forseeable future, and even if it did, this would still be okay as long as
significant whitespace still had frames.)
The caret issue is that we always need a frame on which to draw the caret. As
long as there will still be frames for *significant* whitespace, this shouldn't
create an editor or caret problem. There have been editor/caret issues
involving frames for whitespace, but it's always been related to significant
whitespace -- e.g. newlines inside <pre> tags (if we could get frames for those
instead of having to translate them to <br> tags, we could make huge performance
gains).
Comment 15•24 years ago
|
||
what akkana sed
Comment 16•24 years ago
|
||
if this is done, be sure to test the case where a block is dynamically
attributed with whitespace: pre, to make sure we construct the whitespace frames
correctly. If this doesn't work, it's probably because the hint for that
particular style change isn't strong enough. Marc A. knows all about this.
Assignee | ||
Comment 17•24 years ago
|
||
[collision of bits] (waterson, buster)
I later understood as buster explained, that there wasn't need to worry about
clashes in different frames so long as the meanings of bits are preserved across
descedants.
+#define NS_INLINE_FRAME_CONTAINS_PERCENT_AWARE_CHILD 0x00010000
+#define NS_BLOCK_SHRINK_WRAP 0x00010000
[consolidation] (buster)
Since collisions are okay, I can indeed let each header file with its own set of
relevant bits. buster, all the 16 lower bits have been taken... are you okay to
reserve 4 additional bits, and shift implementations to the 12 higher bits?
[style system] (akkana)
yep, only ignorable whitespace are meant to be excluded. The decision is made
by the parent frame. By avoiding children that are whitespace, no style
contexts will be created for these. So, some potential gains may arise from
that too.
(I suspect also that the style system has its own separate hacks to avoid
whitespace, in order, for example, to match things like :first-node ...)
[what comes next] (dbaron, buster)
This is just an enabling patch in the sense that it doesn't remove the
whitespace. The actual bug to decide what/how to do is bug 68489. In the
meantime, the patch enables frames (like table or math) that have no concern
about whitespace to ignore them via the simple setting of that bit.
Comment 18•24 years ago
|
||
RE: reserve 4 additional bits, and shift implementations to the 12 higher bits?
The 16/16 split is arbitrary, so it's fine to reserve a few more, as long as no
implementation is already over the 12 bit limit. You'll have to check, if you
haven't already. If so, we'd have to get creative.
Assignee | ||
Comment 19•24 years ago
|
||
I just did a search for "nsGetFrameState", then "nsFrameState", then
"nsSetFrameState", then each of the bits 0x????0000 themselves, with comparative
testings of the search results (in a contextual sense) to increase the
unlikeness of leaving a nasty bug that will strike someday... Producers of
additional bits that I could find were nsBlockFrame/nsInlineFrame (which are
already in the patch of nsIFrame.h), and the others given below. Fortunately,
nobody appears to be using more than 12 bits.
nsTextFrame.cpp
=============================================
1079 // Flag indicating that whitespace was skipped
1080 #define TEXT_SKIP_LEADING_WS 0x01000000
1081 #define TEXT_HAS_MULTIBYTE 0x02000000
1082 #define TEXT_IN_WORD 0x04000000
1083 // This bit is set on the first frame in a continuation indicating
1084 // that it was chopped short because of :first-letter style.
1085 #define TEXT_FIRST_LETTER 0x08000000
1086 #define TEXT_WAS_TRANSFORMED 0x10000000
1087
1088 // Bits in mState used for reflow flags
1089 #define TEXT_REFLOW_FLAGS 0x1F000000
1090
1091 #define TEXT_TRIMMED_WS 0x20000000
1092
1093 #define TEXT_OPTIMIZE_RESIZE 0x40000000
1094
1095 #define TEXT_BLINK_ON 0x80000000
nsTextBoxFrame.cpp
============================================
51 #define NS_STATE_NEED_LAYOUT 0x01000000
nsBox.h
============================================
51 // flags from box
52 #define NS_STATE_IS_HORIZONTAL 0x00400000
53 #define NS_STATE_AUTO_STRETCH 0x00800000
54 #define NS_STATE_IS_ROOT 0x01000000
55 #define NS_STATE_CURRENTLY_IN_DEBUG 0x02000000
56 #define NS_STATE_SET_TO_DEBUG 0x04000000
57 #define NS_STATE_DEBUG_WAS_SET 0x08000000
58 #define NS_STATE_IS_COLLAPSED 0x10000000
59 #define NS_STATE_DEFAULT_HORIZONTAL 0x20000000
60 #define NS_STATE_STYLE_CHANGE 0x40000000
61 #define NS_STATE_EQUAL_SIZE 0x80000000
nsTableCellFrame.h
============================================
#define NS_TABLE_CELL_FRAME_CONTENT_EMPTY 0x80000000
nsTableRowGroupFrame.h
============================================
#define NS_ROWGROUP_REPEATABLE 0x80000000
Assignee | ||
Comment 20•24 years ago
|
||
Assignee | ||
Comment 21•24 years ago
|
||
The rtests (block and table) looked fine (no additional changes in the table).
(My other changes on this topic are mathml-specific, mass removal of those
!IsOnlyWhitespace(), and the ensuing consolidation/relief from knowing that
these whitespace frames are never going to be seen again...)
Assignee | ||
Comment 22•24 years ago
|
||
rbs@localhost: ping -r -sr karnaze | waterson
OS: Windows 2000 → All
Hardware: PC → All
Comment 23•24 years ago
|
||
rbs: sr=waterson for
02/10/01 01:40 Patch in the table land
02/16/01 03:52 Collated diffs with implementation bits shifted in the upper
12 positions
(If you'd meant to check in something else, please let me know!)
Comment 24•24 years ago
|
||
Roger, I did overlook a serious problem with the frame state bit you selected.
Generic frame bits are the low order 16. Specific frame bits are the high order
16. You need to pick one of the remaining low order bits.
Assignee | ||
Comment 25•24 years ago
|
||
That's okay chrisK, since all the initial 16 bits were already taken, I
suggested reserving 4 more bits and buster said it was okay, provided I first
checked that nobody was already using more than 12 bits. That's why after
searching/verification [in my comments of 2001-02-12 17:19], I shifted
implementations to the 12 higher bits, leaving room for 4 more universal bits.
I also updated the comments in nsIFrame.h to reflect this change of policy in
the collated patch.
Assignee | ||
Comment 26•24 years ago
|
||
Checked-in. Resolving as FIXED.
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•