Closed Bug 190955 Opened 22 years ago Closed 15 years ago

html parser turns non significant CRLF into LFLF on Windows instead of LF

Categories

(Core :: DOM: HTML Parser, defect)

defect
Not set
major

RESOLVED FIXED

People

(Reporter: glazou, Unassigned)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [fixed by the HTML5 parser])

Attachments

(1 file)

1. Launch Mozilla ***on Windows***.
2. Open http://daniel.glazman.free.fr/tmp/htmlparser.html
3. See the result on the page.

Expected result: 1 byte, LF (decimal code 10).
Actual result: 2 bytes LF...

A direct side effect of this bug is that the Editor inserts unwanted blank lines when the document is serialized. Each LF is output according to platform specifics, so we end up with CRLFCRLF, i.e. **2** line breaks instead of one.
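The serialization side effect described above can be pictured with a small sketch. It only models the output step and makes no claim about how the parser produced the extra LF. The function name `serialize_text` and the default `platform_newline` value are hypothetical and chosen for illustration; they are not Mozilla APIs.

```python
def serialize_text(text: str, platform_newline: str = "\r\n") -> str:
    """Write each LF in a text node using the platform's line ending.

    Assumption: the serializer maps every LF independently, as the
    report describes for Windows.
    """
    return text.replace("\n", platform_newline)


# A text node that should hold one LF serializes to a single line break.
expected_node = "\n"
# The reported text node holds two LFs, so the output has two line breaks.
actual_node = "\n\n"

expected_output = serialize_text(expected_node)
actual_output = serialize_text(actual_node)
```

With the Windows line ending, the two-LF node is written as CRLFCRLF, which shows up as an extra blank line in the saved document.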
I see a 2-char textnode on Linux too...
OS: Windows 2000 → All
Hardware: PC → All
Blocks: 97278
Blocks: 174361
The newlines after the </head> are getting dropped on the floor by the parser. The newlines in the testcase are coming directly from the newlines before the <head>. The fix is to lazily append the <head> to the root element instead of appending it in HTMLContentSink::Init(). Note that the testcase will report *zero* text after the </head> when this is fixed. That should be addressed in another bug, however.
Assignee: harishd → mrbkap
Attached file testcase (deleted) —
This is the testcase. I'm attaching it so that it will still be accessible once Daniel decides to take the version on his site down.
QA Contact: moied → parser
This seems to be working now.
Except the underlying bug here isn't fixed yet :-(.
Assignee: mrbkap → nobody
This is now handled per HTML5: 1) LF, CR and CRLF turn to LF. 2) Space characters at the start of the document are discarded.
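As a rough illustration of the two rules in the comment above, the sketch below applies them to plain strings. It is a simplification under stated assumptions: the real HTML5 parser handles newlines while processing the input stream and drops leading whitespace during tree construction, not through string helpers like these. The names `normalize_newlines` and `discard_leading_space` are hypothetical.

```python
# Characters treated as "space characters" here: space, tab, LF, FF, CR.
HTML_SPACE_CHARACTERS = " \t\n\f\r"


def normalize_newlines(text: str) -> str:
    """Rule 1: LF, CR and CRLF all become a single LF."""
    # Replace CRLF first so it does not turn into two LFs.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def discard_leading_space(text: str) -> str:
    """Rule 2: space characters at the start of the document are dropped."""
    return text.lstrip(HTML_SPACE_CHARACTERS)


mixed = "a\r\nb\rc\nd"
normalized = normalize_newlines(mixed)

document = "\r\n \t<html><head></head>\r\n<body></body></html>"
cleaned = discard_leading_space(normalize_newlines(document))
```

The order of the two `replace` calls matters: handling CRLF before lone CR is what keeps a Windows line ending from becoming LFLF.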
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Whiteboard: [fixed by the HTML5 parser]