Closed Bug 75283 Opened 24 years ago Closed 23 years ago

Pasted text from beginning of line always appends a newline

Categories

(Core :: DOM: Serializers, defect, P4)

x86
All
defect

Tracking

()

VERIFIED FIXED
mozilla1.0

People

(Reporter: jesup, Assigned: t_mutreja)

References

()

Details

(Keywords: dataloss, Whiteboard: [cr/lf] [needs sr=/a=])

Attachments

(3 files, 4 obsolete files)

FreeBSD 4.1 Mozilla 20010409xx When pasting text from a page I browsed to in mozilla, it always has a newline appended to the paste. For example, on the page above I select "NSS 3.2 source and binary distributions" (which is all on one line, and is followed by more text), and then paste it into an xterm, emacs, etc, and it always has a newline appended to it. If I do the same thing from ns4.x, I don't see this. Probably applies to all Xwindows, maybe all platforms. Annoying and quite unexpected (caught me by surprise once when I planned to edit the pasted text before committing to running the command I'd cut out). This _could_ cause dataloss in theory if someone got surprised in just the wrong way.
Nominating for 1.0. Note: this may be related to bug 55661.
Keywords: mozilla1.0
assigning to akkana to help in narrowing it down. This also happens on win98, if I select the paste into notepad I get the linefeed. If I paste into Composer, no linefeed. If I paste into the plaintext editor, I get a space at the end. I don't know if any of that helps or not.
Assignee: mjudge → akkana
Priority: -- → P4
Target Milestone: --- → mozilla0.9.1
Here's what appears to be going on: this happens if your selection includes the beginning of a paragraph, but if you start the selection one character later (e.g. "SS 3.2 source and binary distributions" you don't get the newline. I suspect this is a dup of something else Joe is working on, so I'm sending it to him (Joe, if I'm wrong, feel free to send it back). I'm trying to investigate further to be sure, but my current build (from last night) can't load any urls either from the command line or the urlbar, so I'll have to update and hope the problem is fixed.
Assignee: akkana → jfrancis
Summary: Pasted text from Mozilla html always appends a newline → Pasted text from beginning of line always appends a newline
Target Milestone: mozilla0.9.1 → mozilla0.9.2
Keywords: correctness
Whiteboard: [cr/lf]
Target Milestone: mozilla0.9.2 → mozilla1.0
I see this for all pasted elements where the selection either started at the begining of the line, - OR - for elements like this: <pre class="admin"> Example root command $lt;/pre> (I'm not sure if Bugilla munges gt and lt as characters, so the above may look wonky, but hopefully you'll get the idea...) Every character from the end of the first > to the closing < will do the linefeed thing. In case it matters here's the appropriate chunk of the stylesheet: .admin {font-family: fixed; font-weight: bold; color: #00aa00;} This is especially annoying to me personally since a lot of the documentation I write uses stylesheets like the one above to illustrate root vs. user level commands. 4.77 doesn't like the (validated) CSS, and Mozilla now has a hard time pasting it... :(
Interesting, Linux build 2001060713 was appending *two* new lines to each paste, but now Linux build 2001062508 is only apending one, and it happens only when copying from certain html tags (h2 is one of them). Btw, I originally complained about this in the MailNews reader but I now see it is ubiquitous. Will the mailnews GUI folks get this report too?
*** Bug 90932 has been marked as a duplicate of this bug. ***
Assignee: jfrancis → anthonyd
Component: Selection → DOM to Text Conversion
QA Contact: blakeross → sujay
It doesn't always add a newline. You copied an entire html paragraph. the serializer needs to change: instead of emitting a br when you get to the end of a block, it needs to set some state and only emit a br if other content gets serialized after that. over to serializer.
For selection at the end of a paragraph, I agree - we need to see if anything is added after it before adding the newline/etc. However, I've found that some other selections append a newline even if they end in the middle of a word. Go to www.mozilla.org, and select (say) "Bugzilla 2.1" (from the header of a paragraph in bold). Paste it. Note that a newline was appended. Select "ugzilla 2.1", paste it. Note that there's no appended newline. Odd. Go into a <pre> section (I used a jprof.html file). Select some text in the middle of a word. Paste it. 2 newlines are appended. So, we have _lots_ of problems with selections.... Upping severity because this has screwed me over innumerable times trying to paste data out of <pre> blocks into search widgets or command-lines. OS->All; it happens everywhere.
Severity: minor → major
OS: FreeBSD → All
The newlines resulting from copying out of a pre block are covered by another bug (jfrancis) and are a different issue. As regards this bug: why aren't we doing what we do for many other tags, saying that we only include the enclosing <p> tag if the entire contents of the tag is selected? Wouldn't that be a better fix? If it is fixed in the serializer, EnsureVerticalSpace() might do the job. But let's make sure that's what we want to do, and that we really want to include the enclosing <p> in a case where only the first bits of a paragraph are copied.
Nominating for 0.9.6. This (and the associated <pre> bug) are a MASSIVE annoyance to using mozilla as a primary browser. I have to paste text into a text editor, then clip it out of there to paste into other places. This also can cause dataloss when the user is constructing command lines using paste (since the newline pasted in starts the command before you expected it to.)
Bugs targeted at mozilla1.0 without the mozilla1.0 keyword moved to mozilla1.0.1 (you can query for this string to delete spam or retrieve the list of bugs I've moved)
Target Milestone: mozilla1.0 → mozilla1.0.1
-->myself
really -->myself
Assignee: anthonyd → tmutreja
Yhere are things we can do in the serializer (see my earlier comment) to improve the situation. However, there are other things which probably have to happen in the document encoder to fix some of the other issues raised here. In particular Randell's "Bugzilla" vs "ugzilla" result. The only way the serializer would treat these differently is if the encoder was passing the H1 (or whatever block tag it is) to the serializer. I agree with Akkana: that seems wrong. After you do what you can in the serializer you may need to assign the remaining issues to me.
Thanks for taking over this bug, Tanu. Removing old target and nomination to get this back on radar (since we have a new owner). This _is_ a dataloss bug (potentially), and a truely, intensely annoying bug in all cases. I for one would not accept as finished a browser with this bug in it. I live with the bug for now because I know about it and can work around it (with pain), but I wouldn't let this loose on unsuspecting people with this bug. Web pages often have shell/command-window commands in them that people select and paste, expecting to be able to edit them. Not to mention constantly having to delete extra newlines when pasting jprof data into things, etc, etc.
Keywords: mozilla0.9.6mozilla1.0
Target Milestone: mozilla1.0.1 → ---
Following diff solves this bug. I tried it for few more test cases but still need to make sure it's not breaking anything else. Index: nsPlainTextSerializer.cpp =================================================================== RCS file: /cvsroot/mozilla/content/base/src/nsPlainTextSerializer.cpp,v retrieving revision 1.39 diff -u -r1.39 nsPlainTextSerializer.cpp --- nsPlainTextSerializer.cpp 2002/01/10 14:06:15 1.39 +++ nsPlainTextSerializer.cpp 2002/01/20 14:46:44 @@ -836,7 +836,7 @@ // This is hard. Sometimes 0 is a better number, but // how to know? EnsureVerticalSpace((mFlags & nsIDocumentEncoder::OutputFormatted) - ? 1 : 0); + ? mEmptyLines + 1 : mEmptyLines); }
I don't understand the reasoning behind the patch. In copy/paste, the formatted flag should not be set, so we would have been calling EnsureVerticalSpace with an argument of 0. What is mEmptyLines set to in the example? What happens if you make a selection that includes the beginning of a paragraph, then goes through the end of it and partway into the next paragraph, then paste into plaintext? Do we get a proper break between the two paragraphs?
The reason why I replaced hard coded '0' with mEmptyLines was to avoid the situations where we are already in the middle of a line and hence we have mEmptyLines as -1. In that case, EnsureVerticalSpace(0) calls EndLine() once which adds a linebreak to the content. However, I've some concerns here: o In general <p> is kinda autoclose tag. So in cases where only the partial content of tag <p> is selected and copied, I think during serialization also the parser should close the "p" tag as it does for the content being passed by the netlib. o Now as the HTML specifications at http://www.w3.org/TR/html4/appendix/notes.html#notes-line-breaks says that "The following two HTML examples must be rendered identically: <P>Thomas is watching TV.</P> <P> Thomas is watching TV. </P>", we should be able to ensure a line break before we close the "p" tag (in formatted cases) but without effecting the content of the p tag. In EndLine() we see that we add a linebreak to the current content by " mCurrentLine.Append (mLineBreak);" which actually adds a line break to the content of the tag. This cause quite a few bugs including this one too. The only solution seems to be able to find some way which actually attaches the linebreak to the closing tag but not to the current content i.e. mCurrentLine here. This seems to effect only the DOM-TXT serialization but not otherwise probably because tokenizer takes care of it(not sure!) in other cases. Additionally, it will not cause any problem for the situations where we cross a complete "p" tag and select some part of the next tag too as in that case the line break attached to the closing tag will also come in the picture and provide a separation between the two tags. o Another solution for this bug may be obtained by treating "p" tag in a distinct manner in DoCloseContainer(). There if we can check that we have some more content after this "p" tag(which so far I don't know how to do :-)), we should ensure a line break otherwise we should call EnsureVerticalSpace (mEmptyLines).
The p tags will be autoclosed for us by the parser -- a DOM representation has no way of expressing a p tag that doesn't close. Regarding html source that has newlines in it: there shouldn't be any difference to us -- we should ignore newlines in html source, as they aren't significant except as whitespace (i.e. they implement a word break and are otherwise equivalent to other whitespace). If we're ever putting out a linebreak in the plaintext serializer just because we saw a line break in the html source (not in a <pre> block), that's a bug and should be fixed. I don't know of any cases where we're doing that, but maybe you do. Thanks for the explanation about mEmptyLines -- it makes sense, and your patch works well in all the cases I've tested. r=akkana
Keywords: patch
Whiteboard: [cr/lf] → [cr/lf] [has-review] [needs SR=]
sr=jst PS. In the future, please attach the diff files to bugs, even if they're tiny.
Fixed with checkin C:\mozilla\content\base\src>cvs commit cvs commit: Examining . Checking in nsPlainTextSerializer.cpp; /cvsroot/mozilla/content/base/src/nsPlainTextSerializer.cpp,v <-- nsPlainTextS erializer.cpp new revision: 1.42; previous revision: 1.41 done
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
This checkin caused some of the dom to text conversion tests to fail (simplecopy.out simplefmt.out mailquote.out). See http://tinderbox.mozilla.org/showlog.cgi?tree=SeaMonkey&errorparser=unix&logfile=1014040740.29214.gz&buildtime=1014040740&buildname=comet%20Linux%20Depend&fulltext=1 , look for ----------- Output from DomToTextConversionTest ------------. The tests are in mozilla/source/htmlparser/tests/outsinks/.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
1. For failure of simplefmt.out, accroding to Akkana's comments(#19) it looks that Test Cases should be changed to behave properly . I would expect "H" instead of a new line here. Full report for this failure is as below: Testing simple html to plaintext formatting ... Type Manifest File: D:\mozilla_Feb17\mozilla\dist\WIN32_D.OBJ\bin\components\xpti.dat nsNativeComponentLoader: autoregistering begins. nsNativeComponentLoader: autoregistering succeeded nNCL: registering deferred (0) Comparison failed at char 521: generated was 72, file had 10 Comparison failed at char 521: ----- Simple html page Here is a link to the mozilla.org <http://www.mozilla.org> page. Here is some _underlined and *bold*ened_ified text plus some <angle bracket entities>. Here is a line ending with a space followed by a line break. Plaintext output should contain only one space (and no line breaks) between "space" and "followed". Here is a /list/: * An item * A nested ordered list: * 1. item one 2. item two * last item Here is a paragraph after the list. H ----- Simple formatting test failed. 2. For failure of SimpleCopy.out I feel that we need to treat <p> tag a bit different from other BlockLevel tags(for which IsBlockLevel() returns true). In other cases we may still need to ensure one line break after the tag closes. Though we still need to change this TestCase for the same reason as above i.e. at "Here is a paragraph after the list. H" I would expect "H" instead of a new line. Testing simple copy case ... Type Manifest File: D:\mozilla_Feb17\mozilla\dist\WIN32_D.OBJ\bin\components\xpti.dat nsNativeComponentLoader: autoregistering begins. nsNativeComponentLoader: autoregistering succeeded nNCL: registering deferred (0) Comparison failed at char 17: generated was 32, file had 10 Comparison failed at char 17: ----- Simple html page ----- Simple copy test failed. 3. For mailquote.out also, I think we need to change the test case. Will upload a proper patch tomorrow.
Um, shouldn't this be backed out until we have a fix for the tests? We don't want tinderbox to be orange all day because of this...
Looks like this has already been abcked out by rev 1.43. Thanks and sorry for the confusion.
Attached patch Fixing differently... (obsolete) (deleted) — Splinter Review
I tested the behavior of IE and found that earlier patch was not providing a compatible solution, in the sense that using that patch --> <p>abc</p> <p>xyz</p> was getting pasted as: abc xyz whereas IE does it as: abc xyz So tried a different approach to solve this bug and make the behavior compatible with IE. Now with every closing <p> tag, setting a new member variable to TRUE. Later, before handling the next tag, checking if that variable is TRUE, which actually means that addition of a line break is due. If TRUE, adding a line break before going any further. With this fix, I need to modify just one TestCase "mailquote.out". In "mailquote.out", <P> is the last tag before </body></html>. So as there is no content after <P>, there should be just one new line at the end. Next I shall be attaching diff for "mailquote.out" too.
Removing a new line from the end.
Whiteboard: [cr/lf] [has-review] [needs SR=] → [cr/lf] [needs review]
Comment on attachment 70285 [details] [diff] [review] Fixing differently... This is something that should have been fixed long ago. I'm glad that you're looking at it. But this problem isn't limited to <p>, is it? The same problem should be there for all block level tags, so it would better to not seperate out <p>. Instead you should use the same solution for all blocks. I never see a reason to add a newline unless there's something else coming afterwards, or am I mistaken?
Tamu, you may want to look at what I did for bug 98572, which is a little more general and allows for arbitrary amounts of vertical whitespace.
Attached patch Generalising the solution... (obsolete) (deleted) — Splinter Review
Generalizing the solution. As other closing tags are handled in specific way based on the mFlag value, this is the only way I could find for generic solution. (It passes the test cases successfully)
Whiteboard: [cr/lf] [needs review] → [cr/lf] [needs r=]
I like the generalized version, the code is pretty straightforward. For whatever it's worth, r=rjesup@wgate.com
Comment on attachment 71662 [details] [diff] [review] Generalising the solution... Yes, this looks fine. There is only one thing that I'm not sure of. You set mFloatingLines to 0 after a </li> and then, if it's already set to 1 after </pre> (for instance), you miss one line break. If that is so (I think it is), you should check and set mFloatingLines only if it is less than 0. That would also require you to reset mFloatingLines to -1 after using it. Otherwise, and with the above considered r=bratell@lysator.liu.se
Daniel, I'm using mFloatingLines just to pass a varying value to EnsureVerticalSpace(). Inside EnsureVerticalSpace(), noOfRows are compared to mEmptyLines. So it should not miss required new lines. Please see if I'm missing something!!!
*** Bug 98572 has been marked as a duplicate of this bug. ***
Scenario: You get a </pre> : mLineBreakIsDue = TRUE, mFloatingLines = 1 Then, directly you get a </li> : mLineBreakIsDue = TRUE, mFloatingLines = 0 Then you get a start tag, since mLineBreakIsDue = TRUE, you call EnsureVerticalSpace(mFloatingLines), i.e. EnsureVerticalSpace(0). EnsureVerticalSpace compares the number of current line breaks to 0 and ensures that we are on a new line. But since we had a </pre> we should have called EnsureVericalSpace(1). The </li> made us forget that.
Attached patch Including Daniel's Suggestions. (obsolete) (deleted) — Splinter Review
o. Before setting mFloatingLines to 0, included a check if (mFloatingLines < 0). o. Setting mFloatingLines to -1 everytime it's job is done for a new element.
Attachment #70285 - Attachment is obsolete: true
Attached file Test case to verify latest patch(#71863) (obsolete) (deleted) —
Sorry, attached a wrong test case last time.
Attachment #71865 - Attachment is obsolete: true
Comment on attachment 71863 [details] [diff] [review] Including Daniel's Suggestions. r=bratell@lysator.liu.se
Attachment #71863 - Flags: review+
Status: REOPENED → ASSIGNED
Whiteboard: [cr/lf] [needs r=] → [cr/lf] [needs sr=/a=]
Target Milestone: --- → mozilla1.0
Attachment #71662 - Attachment is obsolete: true
This is going to make a lot of people happy :-)
Comment on attachment 71863 [details] [diff] [review] Including Daniel's Suggestions. sr=jst
Attachment #71863 - Flags: superreview+
Comment on attachment 71863 [details] [diff] [review] Including Daniel's Suggestions. a=asa (on behalf of drivers) for checkin to 0.9.9 and the trunk!
Attachment #71863 - Flags: approval+
// The wrap column is how many standard sized chars (western languages) // should be allowed on a line. There could be less chars if the chars @@ -195,6 +196,11 @@ PRPackedBool mInWhitespace; PRPackedBool mPreFormatted; PRPackedBool mStartedOutput; // we've produced at least a character + + /*While handling a new tag, this variable should remind if any line break + is due because of a closing tag. Setting it to "TRUE" while closing the tags. + Hence opening tags are guaranteed to start with appropriate line breaks.*/ + PRPackedBool mLineBreakDue; Use a C++ // ... comment style above, with a space between each // and the first word after it -- in other words, imitate the multi-line // comment shown above, in the "The wrap column ..." comment. Thanks for fixing this, let's get it into 0.9.9. /be
Attached patch With Brendan's Suggestions. (deleted) — Splinter Review
Attachment #71863 - Attachment is obsolete: true
Fixed for trunk with details as: C:\moz099\mozilla\content\base\src>cvs commit cvs commit: Examining . Checking in nsPlainTextSerializer.cpp; /cvsroot/mozilla/content/base/src/nsPlainTextSerializer.cpp,v <-- nsPlainTextSerializer.cpp new revision: 1.46; previous revision: 1.45 done Checking in nsPlainTextSerializer.h; /cvsroot/mozilla/content/base/src/nsPlainTextSerializer.h,v <-- nsPlainTextSerializer.h new revision: 1.17; previous revision: 1.16 done Fixed for 099 branch with details as: C:\moz099\mozilla\content\base\src>cvs commit cvs commit: Examining . Checking in nsPlainTextSerializer.cpp; /cvsroot/mozilla/content/base/src/nsPlainTextSerializer.cpp,v <-- nsPlainTextSerializer.cpp new revision: 1.45.2.1; previous revision: 1.45 done Checking in nsPlainTextSerializer.h; /cvsroot/mozilla/content/base/src/nsPlainTextSerializer.h,v <-- nsPlainTextSer ializer.h new revision: 1.16.10.1; previous revision: 1.16 done
Status: ASSIGNED → RESOLVED
Closed: 23 years ago23 years ago
Resolution: --- → FIXED
*** Bug 128890 has been marked as a duplicate of this bug. ***
Tanu: will this also fix bug 128890 ?
Yes Peter, it fixes 128890 also and that has been marked as a dupe of it.
Reporter: please verify and mark verified-fixed...thanks!
sujay@netscape.com suggests: > Reporter: please verify and mark verified-fixed...thanks! wish I could! But the Linux version (of RC2 even) has another annoying little paste problem - after mozilla has been running for a few days, I can only paste to xclipboard and no where else!
verified fixed
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: