Closed Bug 150323 Opened 22 years ago Closed 20 years ago

htmlparser screws up on finding ETAGO

Categories

(Core :: DOM: HTML Parser, defect, P4)

x86
All
defect

Tracking

()

RESOLVED DUPLICATE of bug 153753
mozilla1.5beta

People

(Reporter: will, Assigned: harishd)

References

()

Details

(Keywords: testcase)

Attachments

(2 files)

The javascript engine appears to regard the end-of-file condition as a valid script terminator, like </script>. This is wrongful behavior as it lets an improperly terminated dynamically generated script run, and the page will seem to display as intended. I don't know if there is a security problem with this. I do know that this behavior can allow a buggy script to pass QA, and cause any subsequent script in the file to appear to fail in a very odd way. The URL provided gives a clear simple example and more background and conjecture than perhaps you really want to know.
Will has a very interesting testcase. Here it is, reduced to a minimum. The script is trying to set the background color of the page to yellow: <script type="text/javascript"> document.write("<script>"); document.write("</" + "s c r i p t>"); //<<<------------ MALFORMED document.write("<style>body {background-color: #FFFF80;}</style>"); </script> If we load this in NN4.7 or IE6, the background color remains white. The malformed closing script tag prevents the rest of the script from executing. However, if we load this in Mozilla (using trunk binary 20020611xx), the background color turns yellow. The malformed closing script tag did NOT stop the rest of the script from executing. Not sure what the correct component is, but it's not JS Engine AFAICT. The proper opening and closing of script tags has to be determined by the browser, before the JS gets compiled by the Engine - I'm going to guess this is a Parser issue, but will cc Brendan, Mike, jst, rginda on this issue. Also cc'ing mstoltz for possible Security issues here. Will attach the reduced testcase below - Note: the URL that Will has given states that some DHMTL template sites (such as Dreamweaver?) can make malformed script-closing tags like this, so this issue does have real-world relevance, unfortunately...
Assignee: rogerl → harishd
Component: JavaScript Engine → Parser
QA Contact: pschwartau → moied
Attached file Reduced HTML testcase (deleted) —
Note: it is interesting to look at the content model of the reduced testcase in the Mozilla DOM Inspector -
Status: UNCONFIRMED → NEW
Ever confirmed: true
Phil's comments triggered an insight and I devised a more simple test case that shows several aspects of the problem more clearly. After uploading to my web site, I found another very disturbing aspect of this situation. Here is the minimal test case: http://bbd.thornhenge.net/TestCase20020611.html 1: <html> 2: <head> 3: <title>Reduced HTML Test Case: wbw20020611</title> 4: <script> 5: <body> 6: <p>1st paragraph: above the end-of-script in a comment</p> 7: 8: <!-- ============================================= 9: remove the spaces between the "</" and the "script>" to see the page change 10: </ script> 11: ============================================= --> 12: 13: <p>2nd paragraph: below the end-of-script comment</p> 14: 15: <p>Note that the display of this page changes when the contents of an HTML comment are altered [<i>Fiddle with line #10 in the source</i>]. Both NN47 and IE6 do not display this page at all.</p> 16: 17: </body> 18: </html> 19: However if you look at the source from my web site, both Moz1.0 and NS6.2 show the fourth line as "<script></script>". But if you use NN4.7 or IE6 to view the source, the fourth line remains as shown above. The Geckos display the page contents in the browser window; the other two browsers show an empty page. I think this behavior is unfortunate in NS6.2 and I think it is a *critical error* in Moz1.0, since Moz is intended for development work. I'd like to see this situation given a high priority. I would also like assurance that Moz accurately displays existing source code so that it can be proofread. If you put this test case on a local drive (making sure that line #4 is as stated and not "corrected" by the browser), it will exhibit the other incorrect behaviors I reported earlier.
Fixing summary. The better way to make a closing script tag that won't be recognized as having an ETAGO prefix is "<" + "/script>" or even "<\/script>" (JS treats \/ as / -- \ is "literal next" if the next char is not magic, e.g. not n or r). Harish, are we treating "</" as ETAGO? My memory of the SGML rules says that we should treat "</@" as ETAGO iff @ is alphabetic. /be
Summary: javascript engine screws up on finding end-of-file → htmlparser screws up on finding ETAGO
I think we don't quite do proper ETAGO recognition in CDATA-containing elements (such as <script>), but only recognize the full termination of the tag. (i.e., <script> is closed by </script>, not </@). Would it were not so, but I presume it's a concession to all the document.write(</foo>) floating around.
Keywords: testcase
>Harish, are we treating "</" as ETAGO? My memory of the SGML rules says that we >hould treat "</@" as ETAGO iff @ is alphabetic. Parser _does_not_ treat "</" as ETAGO. <script> can only be closed by </script(whitespace*)>. In other words, in the above example, "</" + "s c r i p t>" is not mistaken for an </script>. It must be something else. Will investigate.
btw, I don't believe it's a JS engine problem.
It turns out that since the document.written content does not have an </script> the parser ends up treating document.written <script> as <script/>!. And because of that the rest of the content, i.e., <style>...</style>, is not wrapped in the script and hence the bug. If it has to be fixed it must be fixed in the parser :-(
Status: NEW → ASSIGNED
Will: wrt your testcase, a <script> element has CDATA content, so no markup, including comments is recognized in it, except for (theoretically) </ (and practically) </script>.
Attached file Testcase without any document.write (deleted) —
Re comment #11 from harishd@netscape.com: Good test case. Please review my Comment #4 and note that when your test case is downloaded from a web site, Moz "corrects" the page by closing the script with a "</script>" that it generates on its own. I'm pretty sure that this is the cause of the problems. I think the geckos need to be dummied down-- it is better if they don't correct this kind of error than the current situation, where their corrections can cause other problems.
Priority: -- → P4
Target Milestone: --- → mozilla1.5beta
Alas, compat. overrules the specs here. *** This bug has been marked as a duplicate of 153753 ***
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: