Closed
Bug 88952
Opened 23 years ago
Closed 20 years ago
Remove record trailing content
Categories
(Core :: DOM: HTML Parser, defect)
Tracking
()
RESOLVED
FIXED
Future
People
(Reporter: harishd, Assigned: mrbkap)
References
Details
(Whiteboard: [fixinhand?])
Attachments
(1 file, 9 obsolete files)
(deleted),
patch
|
jst
:
review+
rbs
:
superreview+
|
Details | Diff | Splinter Review |
Comment 5•23 years ago
|
||
What's the status on this?
This bug needs a description...
Whiteboard: [fixinhand?]
Target Milestone: mozilla1.2alpha → mozilla1.2beta
Futuring since I don't have the time right now.
Target Milestone: mozilla1.2beta → Future
Assignee | ||
Comment 8•20 years ago
|
||
Stealing this so I can update it to the current tree. bz, this patch would make all of our <textarea> bugs go away and I think I could update it. The patch makes a lot of sense. Would a patch this large be considered, though? I'm a bit confused about how the patch handles <noscript>, but I need to investigate more to find out how the DTD handles those sorts of tags. I'll give this bug a better description (per comment 6) sometime when it isn't 2:00am.
Assignee: harishd → mrbkap
Status: ASSIGNED → NEW
Assignee | ||
Comment 9•20 years ago
|
||
Ok, I understand the NOSCRIPT/NOFRAMES stuff now: if script/frames are disabled, we want to parse it as actual HTML content so we can display it on the screen. Otherwise, we want to parse it as text for easy discarding (this has bad effects on view-source, for tags like this, I'm going to need to special-case view-source).
Comment 10•20 years ago
|
||
Blake, if we can fix all the textarea stuff, that would be wonderful. As for size, if it's large but understandable, we can take it in an alpha.... ;)
Assignee | ||
Comment 11•20 years ago
|
||
This is a first pass at making this all work. There are a couple of problems still (i.e., this isn't the final patch, I'm attaching it so that whoever's reviewing can take a look). There are some questions remaining about this: 1) This needs testing, especially <noframes> and <noscript>. <xmp> should work (thanks to the elementtable changes). <script>/<style> should "just work", but they should be tested also. The logic around there changed fairly dramatically, and I'm worried about edge cases. 2) IE has a bug where if you don't close <title>, it doesn't display the document. However, if you don't close <textarea> everything down to EOF becomes part of the <textarea>. Currently, I emulate these exactly, but this regresses bug 42945, so I think ConsumeParsedCharacterData() needs to deal with both cases (since the CNavDTD.cpp solution will no longer apply). Any opinions? Comments welcome.
Assignee | ||
Updated•20 years ago
|
Attachment #40989 -
Attachment is obsolete: true
Attachment #45651 -
Attachment is obsolete: true
Attachment #45655 -
Attachment is obsolete: true
Attachment #45697 -
Attachment is obsolete: true
Comment 12•20 years ago
|
||
We don't really want to regress bug 42945
Assignee | ||
Comment 13•20 years ago
|
||
Comment on attachment 165855 [details] [diff] [review] patch v1 In the interests of sanity, I'm going to make new return values to return from ConsumeParsedCharacterData(). I'm also going to add another parameter to it to indicate which behavior to follow (either stopping at the first < as for <title> or consuming all the way to EOF as for <textarea>). I'll mark this obsolete when I have a new patch.
Assignee | ||
Comment 14•20 years ago
|
||
Updated patch. We now handle the lack of a </title> the same way as before. As a note: I think "skipped content" in CNavDTD could possibly go away with this patch, but that's another bug for another time. I removed CopyState from nsITokenizer as it's no longer needed (the only state to copy was the trailing content target). I've commented things the best I can. We also now handle <xmp> not followed by </xmp> the same way IE does (correctly). I'm not sure about <noscript>, <noframes> they seem to do the right thing (i.e., not show their content when frames, script are enabled. <noscript> does show when scripts are disabled too. Turning frames off messes up my browser too much to test). I can't think of anything else except to test and get reviews. Comments are welcome.
Attachment #165855 -
Attachment is obsolete: true
Assignee | ||
Comment 15•20 years ago
|
||
Comment on attachment 165870 [details] [diff] [review] patch v2 After running through the regression tests, I'm pretty confident with this patch. There were differences because with this patch we parse the contents of <noscript> and friends as actual content instead of PCDATA, and some stray <newline> tokens hanging around (which are fine). The only real difference was with our parsing of: <title><!-- </title> --></title> With this patch, we have a title of "<!--" and text content "-->". Is this a problem? I could make ConsumePCDATA() respect comments, but I don't feel it's worth it. Is this a common problem?
Comment 16•20 years ago
|
||
> Turning frames off messes up my browser too much to test Erm... it should not... please file bugs? > I could make ConsumePCDATA() respect comments, but I don't feel it's worth it. Agreed.
Hixie's comment page is always a good thing to test against: http://www.hixie.ch/tests/evil/mixed/comments-evil.html
Assignee | ||
Comment 18•20 years ago
|
||
This fixes all known problems with the previous patch. The only differences shown by the regression tests are the <!-- </title> --> difference mentioned above and some differences handling newlines (more correctly!) around </xmp>. <noscript> works as expected. As mentioned above, turning frames off in the browser breaks it too much to test, but <noframes> *should* have precisely the same behavior. I'm going to ask for review on this patch, but I have two questions outstanding: I've added a parameter to CParserContext::GetTokenizer() so that the HTMLTokenizer can know the status of the script/frames prefs. Is this really the right place to do this and what should theFlags be if the QI fails (can it fail in that branch?)? Note: The change to nsLoggingSink.h is useful so that aResult can't go back uninitialized.
Attachment #165870 -
Attachment is obsolete: true
Assignee | ||
Comment 19•20 years ago
|
||
Comment on attachment 165970 [details] [diff] [review] patch v3 rbs, could you give this a look?
Attachment #165970 -
Flags: review?(rbs)
Assignee | ||
Comment 20•20 years ago
|
||
The last patch's nsHTMLTokenizer.cpp had a problem with <script />content</script> (we'd lose the content). I cleaned up the /> code some and fixed that problem. I also made us copy our old behavior on missing </iframe>, </xmp>, </noscript>, </noframes> (consume until the end of the document, don't act as if we had <xmp></xmp>).
Assignee | ||
Comment 21•20 years ago
|
||
Comment on attachment 166026 [details] [diff] [review] new nsHTMLTokenizer changes rbs, please look at these nsHTMLTokenizer.cpp changes instead of the ones in patch v3.
Attachment #166026 -
Flags: review?(rbs)
Comment 22•20 years ago
|
||
I won't be able to review your patch(es) in the next few days. I have to deal with something urgent in my plate now. Feel free to request other reviewers in the meantime.
Assignee | ||
Comment 23•20 years ago
|
||
Comment on attachment 165970 [details] [diff] [review] patch v3 Obsoleting both this and the next patch since rbs won't be able to review them for a couple of days. In the meantime I've given PCDATA the ability to respecct comments, and we shouldn't regress that behavior anyway. I'll attach a new patch tomorrow.
Attachment #165970 -
Attachment is obsolete: true
Attachment #165970 -
Flags: review?(rbs)
Assignee | ||
Updated•20 years ago
|
Attachment #166026 -
Attachment is obsolete: true
Attachment #166026 -
Flags: review?(rbs)
Assignee | ||
Comment 24•20 years ago
|
||
This is the latest and greatest version of the patch. It handles <title> <!-- </title> --> Hi! </title> properly, it handles missing close tags properly, and it passes all of the parser regression tests. It also fixes the bug where the <! of an unterminated comment in strict mode gets dropped (since that behavior is bad if we're going to be using it to display PCDATA).
Assignee | ||
Comment 25•20 years ago
|
||
Comment on attachment 166246 [details] [diff] [review] latest and greatest jst, could you give this a look?
Attachment #166246 -
Flags: review?(jst)
Assignee | ||
Comment 26•20 years ago
|
||
The changes required to make this merge with bryner's string changes were non-trivial, and there was a slight "bug" where we'd assert if we didn't find a </script>, so I've updated this patch. Sorry for the bugspam!
Attachment #166246 -
Attachment is obsolete: true
Assignee | ||
Updated•20 years ago
|
Attachment #167406 -
Flags: review?(jst)
Assignee | ||
Updated•20 years ago
|
Attachment #166246 -
Flags: review?(jst)
Comment 27•20 years ago
|
||
Comment on attachment 167406 [details] [diff] [review] updated to tip +nsHTMLTokenizer::nsHTMLTokenizer(PRInt32 aParseMode, + eParserDocType aDocType, + eParserCommands aCommand, + PRUint16 aPrefs) : nsITokenizer(), mTokenDeque(0) { + mFlags = aPrefs; Use member initializer syntax here, and shouldn't aPrefs be named aFlags? Other than that this looks good, but given a change of this size in this code the only way to really know if it does the right thing is to land it and keep an eye on incoming parser bugs. r=jst
Attachment #167406 -
Flags: review?(jst) → review+
Assignee | ||
Comment 28•20 years ago
|
||
Comment on attachment 167406 [details] [diff] [review] updated to tip Looking for sr= now.
Attachment #167406 -
Flags: superreview?(rbs)
Comment 29•20 years ago
|
||
Comment on attachment 167406 [details] [diff] [review] updated to tip sr=rbs + if (eHTMLTag_iframe == theTag && (mFlags & NS_IPARSER_FLAG_FRAMES_ENABLED) || + eHTMLTag_noframes == theTag && (mFlags & NS_IPARSER_FLAG_FRAMES_ENABLED) || + eHTMLTag_noscript == theTag && (mFlags & NS_IPARSER_FLAG_SCRIPT_ENABLED)) { + isCDATA = PR_TRUE; } Missing some parentheses there. You want: if ( (a && b) || (c && d) ... The problems you reported with <noframes> may be due to the fact that it has "display:none" in html.css, and is not toggled back to really display when frames are disabled... (see bug 236450 for pointers where this is done for <noscript>).
Attachment #167406 -
Flags: superreview?(rbs) → superreview+
Assignee | ||
Comment 30•20 years ago
|
||
Fix checked in.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
This could be unrelated, but since you're touching textarea parsing anyway... Right now the content in textareas doesn't get passed to the sink as a normal textnode. Instead the sink has to call CollectSkippedContent and then manually shuffel it to the right place. The 'shuffeling to the right place' actually ends up just creating a normal textnode as a child of the textarea. see the comment at http://lxr.mozilla.org/mozilla/source/content/html/document/src/nsHTMLContentSink.cpp#902 as well as a few lines further down. It would be great if this whole dance was removed and the 'skipped content' just got passed as a textnode.
Assignee | ||
Comment 32•20 years ago
|
||
Yeah, we can easily disable skipped content for <textarea>. It would be nice to remove skipped content altogether... but there are a couple of more issues to be cleared up before doing that (like how to handle <script>). I'll open a bug tomorrow on these things.
Comment 33•20 years ago
|
||
Great! Please cc me on the new bug!
me too
You need to log in
before you can comment on or make changes to this bug.
Description
•