Bug 85184: [serializer]Composer breaks lines at inappropriate positions
Status: VERIFIED FIXED
Opened 23 years ago, closed 23 years ago
Categories: (Core :: DOM: Serializers, defect, P3)
Target Milestone: mozilla1.2alpha
People: (Reporter: biro.arpad, Assigned: t_mutreja)
Whiteboard: [C][patch needs a=]
Attachments (2 files)
1. (deleted), text/html
2. (deleted), patch; akkzilla: review+, jst: superreview+, asa: approval+
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.1) Gecko/20010607
BuildID: 2001060703
1. Occasionally Composer introduces an unwanted space by breaking the line after a
tag (see example file).
2. In certain cases, Mozilla displays an extra "<" character (again, see the
example file).
(these are two bugs, but can be reproduced in the same way)
Reproducible: Always
Steps to Reproduce:
1. start Composer (Mozilla 0.9.1)
2. open the sample HTML file that is included in this bug report (it's 7607
bytes long with Windows EOLs)
3. save the document under a different name
4. exit Composer and open both files with Mozilla
5. scroll to the end of both documents
Actual Results: Now, compare the last paragraphs of the two documents visually.
You should see two differences in the last paragraph.
In the new document there's an unwanted space between "classic-852-16.psf.gz"
and the closing parenthesis (which the original document does not have).
The second difference: in the new document there's a "<" sign before
"iso02_cp852.trans", which is not there in the original document.
Expected Results: 1. Composer: do not add extra space
2. Mozilla: do not display that "<" sign
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html lang="hu">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-2">
<meta http-equiv="Content-Language" content="hu">
<title>Magyarul-HOGYAN</title>
</head>
<body bgcolor="#ffffff">
<div align=center>
<img src="Magyarul-HOGYAN.gif" alt="Magyarul-HOGYAN">
</div>
<!-- h1 align=center>Magyarul-HOGYAN</h1 -->
<h3 align=center>
avagy tippek és trükkök magyarul beszél?
Linux-felhasználóknak
</h3>
<p>
<center><div align=center>
Verzió: <tt>0.5.6</tt><br>
1999. július 19.
</div></center>
<p>
<table border=0 bgcolor="#e0e0e0" cellspacing=8 cellpadding=9> <tr><td>
Ha valaki olyan böngész?vel
rendelkezik, amelyikben ez a lap nem jól olvasható, kérem írja meg.
</td></tr> </table>
<p>
Ez a lap a Linux operációs rendszercsalád magyarosításával foglalkozik.
F?bb témái: magyar bet?k, billenty?zet használata; magyar nyelv? dokumentumok
az Interneten, különféle programok beállítása. A felsorolt megoldások,
beállítások általában RedHat változat használatát feltételezik, ez azonban
csak azt jelenti, hogy a szerz? ilyet használ, és nem azt hogy más
Linux változatok (pl. Debian, SuSe, Slackware, stb.) ne lennének ugyanolyan jók
a magyar felhasználók számára, és a közölt megoldások némi változtatással
ne lennének használhatóak az utóbb említett változatokban.
A szerz? örömmel venné a nem RedHat Linux-hoz készült leírásokat, megoldásokat
melyeket közzé is tenne ezen a helyen.
<p>
<i>vbzoli<i>@</i>vbzo.li</i>
<p>
<h2>TARTALOM</h2>
<ul>
<li><a href="#hol">0. Hol érhet? el a Magyarul-HOGYAN?</a>
<li><a href="#bl2">1. Magyar billenty?kiosztás és
latin-2 (ISO-8859-2) bet?k használata</a>
<ul>
<li><a href="#bl2-lat2">1.1. A latin-1 és latin-2 kódkiosztás</a>
<li><a href="#bl2-lin">1.2. A Linuxon használt kódkiosztások</a>
<li><a href="#bl2-kl2">1.3. Latin-2 kiosztás használata konzolon</a>
<b><FRISSÍTETT></b>
<li><a href="#bl2-kl2-r">1.4. Latin-2 kiosztás használata konzolon (régi
megoldás, nem ajánlott)</a>
<li><a href="#bl2-kbill">1.5. Magyar billenty?kiosztás használata konzolon</a>
<li><a href="#bl2-kdeb">1.6. Konzol-beállítás Debian 1.2 alatt</a>
<li><a href="#bl2-krh6">1.7. Konzol-beállítás RedHat 6.0 alatt</a>
<b><ÚJ></b>
<li><a href="#bl2-krh5">1.8. Konzol-beállítás RedHat 5.x alatt</a>
<li><a href="#bl2-kslak">1.9. Magyar Slackware csomag (régi megoldás, nem
ajánlott)</a>
<li><a href="#bl2-l2x">1.10. Latin-2 kiosztás használata X-Window felületen</a>
<li><a href="#bl2-kx">1.11. Magyar billenty?zet használata X-Window felületen</a>
</ul>
<li><a href="#p">2. Egyes programok beállításai</a>
<ul>
<li><a href="#p-shell">2.1. bash, tcsh</a>
<li><a href="#p-less">2.2. less</a>
<li><a href="#p-tex">2.3. TeX, LaTeX</a>
<li><a href="#p-lyx">2.4. LyX</a>
<li><a href="#p-joe">2.5. joe</a>
<li><a href="#p-emacs">2.6. emacs</a>
<li><a href="#p-netscape">2.7. netscape</a>
<li><a href="#p-nedit">2.8. nedit</a>
<li><a href="#p-lynx">2.9. lynx</a>
<li><a href="#p-xterm">2.10. xterm</a>
<li><a href="#p-ls">2.11. ls</a>
<li><a href="#p-pgsql">2.12. postgresql</a>
</ul>
<li><a href="#tipp">3. Tippek</a>
<ul>
<li><a href="#tipp-pstxt">3.1. Hogyan nyomtassunk latin-2 kódolású
szövegállományokat bármilyen - Ghostscript
által támogatott, vagy postscript - nyomtatón?</a>
<li><a href="#tipp-l2html">3.2. Hogyan készítsünk magyar
(latin-2-ben kódolt) WWW (HTML) oldalakat?</a>
</ul>
<li><a href="#rotsuveg">4. R?t Süveg, azaz magyar Linux</a>
<li><a href="#inet">5. Magyar vonatkozású Linuxos INTERNET források</a>
<ul>
<li><a href="#inet-ftp">5.1. ftp-helyek</a>
<li><a href="#inet-www">5.2. www-helyek</a>
<li><a href="#inet-lev">5.3. Levelez?listák</a>
<li><a href="#inet-news">5.4. Újság</a>
</ul>
<li><a href="#doksi">6. Magyar nyelv? Linux (UNIX) leírások</a>
<li><a href="#tanf">7. Linux tanfolyamok</a>
<li><a href="#kozre">8. Közrem?köd?k</a>
</ul>
<p><hr><p>
<h2><a name="hol">
0. Hol érhet? el a Magyarul-HOGYAN?
</a></h2>
<ul>
<li><a href="http://vbzo.li/linux/Magyarul-HOGYAN.html">
http://vbzo.li/linux/Magyarul-HOGYAN.html</a>
<li><i>Ha valaki tükrözné ezt a lapot, jelezze nekem, hogy ide
felvehessem!</i>
</ul>
<p><hr><p>
<h2><a name="bl2">
1. Magyar billenty?kiosztás és latin-2 (ISO-8859-2) bet?k használata
</a></h2>
<p><h4><a name="bl2-lat2">
1.1. A latin-1 és latin-2 kódkiosztás
</a></h4>
A <code>UNIX</code> világában elterjedt 8-bites kódkiosztás
amely tartalmazza a magyar ékezetes bet?ket is az <code>ISO-8859-2</code>,
azaz a latin-2 kiosztás.
Ez a kódkiosztás tartalmazza a latin bet?s szláv
nyelvek (horvát, szlovén, szlovák, cseh, lengyel), és a
magyar, román, német nyelv ékezetes bet?it.
<p>
A Nyugat-Európai országok az <code>ISO-8859-1</code> kódkiosztást
használják (latin-1). A latin-1 kiosztás tartalmazza a magyar
ékezetes bet?ket is (ugyanazon kóddal) az ? (o") és ? (u") kivételével.
Az ? (o") és ? (u") bet?k helyén a latin-1-es kiosztásban az o~ és az u^
szerepel (o tetején hullámvonal, u tetején kalap);
így a latin-2-ben kódolt magyar szövegek olvashatóak
latin-1-es kiosztás használatával is (jobb híján).
<p><h4><a name="bl2-lin">
1.2. A Linuxon használt kódkiosztások
</a></h4>
A Linux két legjobban elterjedt felhasználói felülete a konzol és az
X-Window grafikus rendszer. Konzol alatt a szöveges üzemmódú
képerny?t (általában VGA-monitor) és billety?zetet értjük. Az esetleges
küls? terminálok beállításaival (egyel?re még) nem foglalkozunk.
<p>
A linux rendszermag alapesetben a konzolon a latin-1-es kiosztást használja úgy,
hogy a latin-1-es kódokat leképezi a PC-s <tt>437</tt>-es kódlapra.
(A 437-es kódlapot egyébként a monitorvezérl?-kártya tartalmazza PC-n.)
Ezzel a módszerrel csak olyan latin-1 bet?ket tud megjeleníteni,
melyek szerepelnek a <tt>437</tt>-es lapon.
<p>
Az X-Window rendszer alapesetben a latin-1 (<code>ISO-8859-1</code>)
kódolást használja, de rendelkezésre állnak latin-2 és egyéb kódú
bet?készletek is már szép számban. Pl. az 5.2-es RedHat-ben már
az alap-telepít?készlet része néhány latin-2-es bet?csomag.
<p><h4><a name="bl2-kl2">
1.3. Latin-2 kiosztás használata konzolon
</a></h4>
Latin-2-es bet?k használatát legcélszer?bben (hasonlóan az alapeset
<tt>437</tt>-es kódlapjához) 852-es kódlap szerint kódolt
bet?készlettel valósíthatjuk meg.
A 852-es kódlapot csak a képrny?fontok kódolására használjuk,
és a latin-2 (ISO8859-2) kódokat leképezzük a 852-es kódokra.
<p>
Két ok miatt célszer? a 852 kódlap használata a latin-2-ben kódolt
bet?készlethez képest:
<ul>
<li>Nem kell átírni a <tt>termcap</tt>, <tt>terminfo</tt> bejegyzéseket;
<li>A VGA kártyák a 9. bit kiegészítését csak a megfelel? helyen lev?
vonal(keret)rajzoló karaktereknél támogatják.
<li>A fentiek miatt pl. az <code>mc</code> és egyéb konzolon
futó programok rendes vonalrajzoló karaktereket írnak ki.
</ul>
<p>
<b>Újabb <tt>console-tools</tt> csomag (konzol-eszközök) használatánál</b>
<p>
A megfelel? bet?kiosztást tartalmazó állományt
(pl. <tt>classic-852-16.psf.gz</tt>)
a <tt>/usr/lib/kbd/consolefonts</tt> könyvtárba, míg a képerny?-leképezést
(pl. <tt>iso02_cp852.trans</tt>)
tartalmazó állományt a <tt>/usr/lib/kbd/consoletrans</tt> könyvtárba másoljuk.
</html>
Comment 1•23 years ago
I see the problem -- there is a <p> just before the table toward the top of the
file. After the document has gone through the parser, the file is normalized and
the </p> is inserted after the table. When I moved the </p> to before the table,
it all worked fine.
went from this (snippet):
<center><div align=center>
Verzió: <tt>0.5.6</tt><br>
1999. július 19.
</div></center>
<p>
<table border=0 bgcolor="#e0e0e0" cellspacing=8 cellpadding=9> <tr><td>
Ha valaki olyan böngész?vel
rendelkezik, amelyikben ez a lap nem jól olvasható, kérem írja meg.
</td></tr> </table>
to this:
<center><div align=center>
Verzió: <tt>0.5.6</tt><br>
1999. július 19.
</div></center>
<p> </p> !!!!!!!!!!!!!!NOTE THE END </P>
<table border=0 bgcolor="#e0e0e0" cellspacing=8 cellpadding=9> <tr><td>
Ha valaki olyan böngész?vel
rendelkezik, amelyikben ez a lap nem jól olvasható, kérem írja meg.
</td></tr> </table>
Comment 2•23 years ago
actually this is a parser issue, reassigning to parser
Assignee: beppe → harishd
Status: UNCONFIRMED → NEW
Component: Editor → Parser
Ever confirmed: true
QA Contact: sujay → bsharma
Reporter
Comment 3•23 years ago
Status: NEW → ASSIGNED
Ok, I'm not able to reproduce the problem. Marking WFM.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → WORKSFORME
Reporter
Comment 6•23 years ago
One of the two bugs has been fixed. But the "extra space" bug remains
(also seen it in the 2001071104 Win32 build).
Try with this small sample (162 bytes with Windows EOLs):
<html>
<head>
<title>test</title>
</head>
<body>
<p>
1 123456789 1234567890123 1234567890 123456789
(123 <tt>123456789012345678901</tt>) 123456.
</html>
Composer saves this as:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>test</title>
</head>
<body>
<p> 1 123456789 1234567890123 1234567890 123456789 (123
<tt>123456789012345678901</tt>
) 123456. </p>
</body>
</html>
So a space is introduced before the ")".
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Content model for the above testcase ( Ref. comment 2001-07-12 02:22 ):
***********************************************************************
docshell=0135E310
html@02D02B30 refcount=8<
head@02D02A40 refcount=2<
title@02D23140 refcount=2<
Text@02D23420 refcount=2<test>
>
>
Text@02D1FBB0 refcount=3<\n>
body@02D233C0 refcount=3<
Text@01394A70 refcount=3<\n>
p@013949A0 refcount=3<
Text@01394940 refcount=3<\n1 123456789 1234567890123 1234567890 123456789\n(123 >
tt@013947E0 refcount=3<
Text@01394780 refcount=3<123456789012345678901>
>
Text@01394640 refcount=4<) 123456.> <<<<<<<< NO NEW LINE <<<<<<<<
>
Text@0137A650 refcount=3<\n>
>
>
Content model after saving the testcase through composer:
*******************************************************
docshell=0135E310
html@0251CA90 refcount=8<
head@0251B200 refcount=2<
title@0245E6E0 refcount=2<
Text@0245AA90 refcount=2<test>
>
>
Text@02458C70 refcount=3<\n>
body@0245AA30 refcount=3<
Text@0250A7E0 refcount=3<\n \n>
p@0250A730 refcount=3<
Text@0250A6D0 refcount=3< 1 123456789 1234567890123 1234567890 123456789 (123 >
tt@0250A570 refcount=3<
Text@0250A510 refcount=3<123456789012345678901>
>
Text@0250A3D0 refcount=4<\n) 123456.> <<<<< THERE IS THE NEW LINE <<<<
>
Text@0250A090 refcount=3<\n \n>
>
>
Looks like composer has added the extra new line. Back to Beppe.
Assignee: harishd → beppe
Status: REOPENED → NEW
Comment 8•23 years ago
not sure why we are inserting a space. I would normally give this to Joe, but
handing over to akkana and cc kin. reducing from P1 to P3
Assignee: beppe → akkana
Priority: P1 → P3
This extra space/newline is coming from
nsHTMLContentSerializer::AppendToStringWrapped() because the particular line
containing the </tt> exceeds 72 characters.
Akk, are we supposed to be enforcing 72 col hard wraps in composer output, or
just MsgCompose?
Perhaps we should be setting some flag in composer to avoid this
Serializer behavior.
On a side note, the extra new line after <body> is
happening, because "body" is listed
in nsHTMLContentSerializer::LineBreakAfterClose().
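To make the mechanism concrete, here is a minimal Python sketch of the behavior described in this comment. It is a simplified model, not the actual nsHTMLContentSerializer code: `serialize_naive`, `rendered_text` and `SAMPLE_CHUNKS` are hypothetical names, the source newlines of the test case are written as spaces, and the assumption is that a newline is emitted before any chunk that starts at or past column 72, whether or not whitespace exists there.

```python
import re

WRAP_COLUMN = 72  # wrap width mentioned in the comment above

# The reporter's small test case, split roughly the way the content model
# splits it: text node, <tt> element, text node (assumed simplification).
SAMPLE_CHUNKS = [
    "<p>1 123456789 1234567890123 1234567890 123456789 (123 ",
    "<tt>123456789012345678901</tt>",
    ") 123456.</p>",
]


def serialize_naive(chunks, wrap_column=WRAP_COLUMN):
    """Append chunks in order; start a new line before any chunk that
    begins at or past wrap_column, even if no whitespace exists there."""
    out = []
    column = 0
    for chunk in chunks:
        if column >= wrap_column:
            out.append("\n")
            column = 0
        out.append(chunk)
        column += len(chunk)
    return "".join(out)


def rendered_text(html):
    """Strip tags and collapse whitespace runs, roughly as a browser
    does when it displays inline text."""
    text = re.sub(r"<[^>]+>", "", html)
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    saved = serialize_naive(SAMPLE_CHUNKS)
    print(saved)
    print(rendered_text("".join(SAMPLE_CHUNKS)))
    print(rendered_text(saved))
```

In this model the newline lands directly after `</tt>`, and because a newline is displayed as a space, the displayed text gains a space before the closing parenthesis.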
Comment 10•23 years ago
We definitely want to have wrapping of composer output, not just mail,
otherwise the source of composer-generated documents will be very difficult to
read and edit.
But we need to be smarter about wrapping just before or just after a tag when
there's no adjacent whitespace, apparently.
Status: NEW → ASSIGNED
Comment 11•23 years ago
This is basically the same as bug 56921: the nsHTMLContentSinkStream had
wrapping code that worked, and for some reason it was replaced with new wrapping
code which doesn't work. Handing over to Anthonyd for when the wrapping code is
rewritten (or the previous code is plugged back in). These bugs should go
together, since there's no point in fixing this in the current code if it's
going to be replaced. I note that we're no longer calling the nsLineBreaker
interfaces needed for I18n, either.
For the record, the offending newline is coming from the last AppendToString
call in nsHTMLContentSerializer::AppendToStringWrapped, currently line 560.
Assignee: akkana → anthonyd
Status: ASSIGNED → NEW
Comment 12•23 years ago
moving to 0.9.4
Status: NEW → ASSIGNED
Target Milestone: mozilla0.9.3 → mozilla0.9.4
Comment 13•23 years ago
after mucho discussion with kin, this bug is not going to be easy to fix, if it
should even be fixed at all. more investigation with the mail news team needs to
be done to figure out a solution to this.
part of the problem is that when we are feeding out the document from the
stream, by the time we get a line that is longer than 72 characters, we can't go
back to break at an earlier spot. Not sure how we are going to get around that
small technical problem.
anthonyd
Whiteboard: [C]
Comment 15•23 years ago
-->DOM to Text Conversion module owner
Component: Parser → DOM to Text Conversion
Summary: Composer breaks lines at inappropriate positions → [serializer]Composer breaks lines at inappropriate positions
Comment 17•23 years ago
according to syd, harishd is the new module owner of serializer.
-->harishd
Assignee: anthonyd → harishd
Comment 18•23 years ago
And when did that happen! I wasn't aware of it until I spoke to heikki.
FYI: The decision is not final. I need to talk to people before accepting
ownership. For now I'm not the owner.
In any case this is not going to get fixed for 0.9.4. Moving to 0.9.5
Target Milestone: mozilla0.9.4 → mozilla0.9.5
Comment 19•23 years ago
Reassigning plaintext serializer bugs to Peter ;-)
Assignee: harishd → peterv
Comment 20•23 years ago
All these missed the bus/train/plane/boat/whatever. Sad.
Target Milestone: mozilla0.9.5 → mozilla0.9.6
Updated•23 years ago
Target Milestone: mozilla0.9.6 → mozilla0.9.8
Assignee
Comment 21•23 years ago
Assigning to myself after a discussion with Nisheeth.
Assignee
Comment 23•23 years ago
In the existing serializer code (nsHTMLContentSerializer.cpp), whenever a new
node appears at or after column 72, we insert a line break. This makes the
source easier to read, but at times it also changes how the original HTML
renders.
Since all newline characters and whitespace are collapsed to a single space
when the HTML is displayed, the patch inserts a break only at places that
already have a newline or whitespace. The 'view source' output is then
formatted to some extent, but the additional breaks do not affect how the
original HTML looks. Additionally, since we also add line breaks before/after
certain tags like <p>, breaking at such points should work in most cases.
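The idea in this comment, breaking only where whitespace already exists, can be sketched in Python as follows. This is an illustration under stated assumptions, not the code from the patch: `BufferedWrapper` is a hypothetical class, it holds the current line in a buffer so that a break can still be placed at an earlier space, and it treats every space as a legal break position (it does not special-case spaces inside quoted attribute values or preformatted content).

```python
class BufferedWrapper:
    """Hypothetical line wrapper that only turns an existing space into a
    newline, so the whitespace-collapsed text stays the same."""

    def __init__(self, wrap_column=72):
        self.wrap_column = wrap_column
        self.lines = []    # completed output lines
        self.pending = ""  # current line, not yet committed

    def write(self, text):
        self.pending += text
        while len(self.pending) > self.wrap_column:
            # Prefer the last space that keeps the line within the limit.
            cut = self.pending.rfind(" ", 0, self.wrap_column + 1)
            if cut <= 0:
                # Nothing usable before the limit: take the first space
                # after it and accept an overlong line.
                cut = self.pending.find(" ", self.wrap_column + 1)
            if cut <= 0:
                break  # no space yet; wait for more input
            self.lines.append(self.pending[:cut])
            self.pending = self.pending[cut + 1:]

    def finish(self):
        """Flush the pending line and return the wrapped output."""
        if self.pending:
            self.lines.append(self.pending)
            self.pending = ""
        return "\n".join(self.lines)


if __name__ == "__main__":
    wrapper = BufferedWrapper()
    for chunk in [
        "<p>1 123456789 1234567890123 1234567890 123456789 (123 ",
        "<tt>123456789012345678901</tt>",
        ") 123456.</p>",
    ]:
        wrapper.write(chunk)
    print(wrapper.finish())
```

With the small test case from the 2001-07-12 comment, the break falls on the space after "(123", so `</tt>` and ")" stay adjacent.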
Comment 24•23 years ago
Nisheeth and I just went through the patch. It looks like it should fix the
problem. While understanding the logic, we also found the cause of bug 56921:
the "else" clause (line 581 in the patched file) starts at the wrap column and
then searches forward to the next space, hence always makes lines longer than
the wrapcol.
That's an old bug, not made any worse by this patch, so it doesn't block
acceptance of this patch. I'll add more comments in that bug.
The only thing I worry about with the present fix is what happens if we create a
file in the editor with a long block of open/close tags with no spaces between
them, only newlines (which might happen in a table, for example). I'm going to
apply the patch in my tree and do some testing, but if you've already tested
this sort of case and are confident that it works, please say so.
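The cause of bug 56921 described above, searching forward from the wrap column instead of backward, can be illustrated with a small Python sketch. The function names are hypothetical and the code is not taken from nsHTMLContentSerializer.cpp; it only assumes that a break position is a plain space character.

```python
def break_forward(line, wrap_column=72):
    """Start at wrap_column and search forward for the next space.
    Any break found this way sits at or beyond the wrap column."""
    return line.find(" ", wrap_column)


def break_backward(line, wrap_column=72):
    """Search backward from wrap_column for the last space that still
    keeps the line within the limit."""
    return line.rfind(" ", 0, wrap_column + 1)


def wrap(line, find_break, wrap_column=72):
    """Split line at the positions chosen by find_break, replacing the
    chosen space with a line boundary."""
    lines = []
    while len(line) > wrap_column:
        cut = find_break(line, wrap_column)
        if cut <= 0:
            break  # no usable space; leave the rest as one line
        lines.append(line[:cut])
        line = line[cut + 1:]
    lines.append(line)
    return lines


if __name__ == "__main__":
    text = " ".join(["abcdefg"] * 10)
    for strategy in (break_forward, break_backward):
        widths = [len(piece) for piece in wrap(text, strategy, 20)]
        print(strategy.__name__, widths)
```

Both strategies break only at existing spaces, so neither changes the displayed text; the forward search simply overshoots the wrap column whenever no space sits exactly on it.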
Comment 25•23 years ago
Oh, one other issue: you should probably run this by someone in the intl group
to see if you should be using nsLineBreaker instead of explicitly searching for
a space and nothing else. Naoki, can you take a look at this patch, or pass
this bug along to someone else who can look and tell us if this is okay?
Comment 26•23 years ago
I am not familiar with this bug, so I'll just comment with general info.
Some languages can break without a space. The line breaker interface can return
possible break positions. Shanjian is the owner of the code, cc to him.
I think a patch that looks for spaces only will not do anything wrong for
languages which do not require a space for line breaking. But using the
linebreaker would make the code benefit more languages.
Reporter
Comment 27•23 years ago
Tried the 0.9.8 final build (mozilla-win32-0.9.8-talkback.zip). The extra space
bug remains. Just try it with the small sample previously (2001-07-12) posted.
Reporter
Comment 28•23 years ago
When testing, please be sure to have the
Reformat ("pretty print") HTML source
option selected (Preferences/Composer).
Past 0.9.8, moving forward.
Target Milestone: mozilla0.9.8 → mozilla0.9.9
Assignee
Comment 30•23 years ago
*** Bug 101755 has been marked as a duplicate of this bug. ***
Assignee
Comment 31•23 years ago
*** Bug 104144 has been marked as a duplicate of this bug. ***
Assignee
Comment 32•23 years ago
Biro, the patch is not yet checked-in. Would you please verify it once the bug
is closed!
Updated•23 years ago
Whiteboard: [C] → [C][patch needs r/sr=]
Assignee
Updated•23 years ago
Status: NEW → ASSIGNED
Target Milestone: mozilla0.9.9 → mozilla1.0
Comment 33•23 years ago
Moving Netscape owned 0.9.9 and 1.0 bugs that don't have an nsbeta1, nsbeta1+,
topembed, topembed+, Mozilla0.9.9+ or Mozilla1.0+ keyword. Please send any
questions or feedback about this to adt@netscape.com. You can search for
"Moving bugs not scheduled for a project" to quickly delete this bugmail.
Target Milestone: mozilla1.0 → mozilla1.2
Comment 34•23 years ago
Comment on attachment 64804 [details] [diff] [review]
Patch fot fixing it...
sr=jst. Akkana should r=.
Attachment #64804 -
Flags: superreview+
Assignee
Updated•23 years ago
Whiteboard: [C][patch needs r/sr=] → [C][patch needs r=/a=]
Updated•23 years ago
Attachment #64804 -
Flags: review+
Updated•23 years ago
Whiteboard: [C][patch needs r=/a=] → [C][patch needs a=]
Comment 35•23 years ago
Comment on attachment 64804 [details] [diff] [review]
Patch fot fixing it...
I checked the review box but bugzilla didn't list a comment that I'd done it.
Trying again: r=akkana imeanitthistime.
Comment 36•23 years ago
Comment on attachment 64804 [details] [diff] [review]
Patch fot fixing it...
a=asa (on behalf of drivers) for checkin to the 1.0 trunk
Attachment #64804 -
Flags: approval+
Comment 37•23 years ago
Checking in for tmutreja
Fixed with checkin
D:\mozilla\content\base\src>cvs commit
cvs commit: Examining .
Checking in nsHTMLContentSerializer.cpp;
/cvsroot/mozilla/content/base/src/nsHTMLContentSerializer.cpp,v <-- nsHTMLContentSerializer.cpp
new revision: 1.41; previous revision: 1.40
done
Status: ASSIGNED → RESOLVED
Closed: 23 years ago → 23 years ago
Resolution: --- → FIXED
Comment 38•23 years ago
Using build 04-01, I am unable to reproduce the original problem, or the problem
discussed in comment #6.
Marking VERIFIED.
If anyone is still able to reproduce this problem, feel free to reopen this bug.
Status: RESOLVED → VERIFIED
Comment 39•22 years ago
*** Bug 149200 has been marked as a duplicate of this bug. ***