Closed Bug 84261 Opened 24 years ago Closed 24 years ago

insert linefeeds into messages so that saving as draft and sending will always succeed

Categories

(MailNews Core :: Backend, defect, P1)

defect

Tracking

(Not tracked)

VERIFIED FIXED
mozilla0.9.2

People

(Reporter: sspitzer, Assigned: sspitzer)

References

Details

(Whiteboard: [nsbeta1+][PDT+])

Attachments

(6 files)

quoting sfraser: "I think the content serializer should have a flag that enforces a line length (which, as seth should know, should be < 1000 octets so that NNTP works)." with out this, long lines will cause saving as draft, copying to sent folder, posting emails and sending emails to fail. I'll start investigating a fix.
putting into 0.9.2.
Keywords: nsbeta1
Priority: -- → P1
Summary: content serializer should have a flag that enforces a line length → content serializer should have a flag that enforces a line length
Whiteboard: [nsbeta1+]
Target Milestone: --- → mozilla0.9.2
OS: Windows 2000 → All
QA Contact: esther → stephend
Hardware: PC → All
adding PDT+
Whiteboard: [nsbeta1+] → [nsbeta1+][PDT+]
spun off from bug #83381 investigating this now...
Status: NEW → ASSIGNED
after investigating, there are some issues that are making me lean towards putting the fix in the mailnews code and not changing the content serializer. please comment. in the mailnews code, after we've gotten the body from editor (by calling nsIEditorShell::GetContentsAs()], we tweak the body. the tweaks I see are: 1) structure / link conversion. this is everybody's favorite code that turns *foo* into <b>foo</b> and http://foo.com into <a href="http://foo.com">http://foo.com</a> 2) charset conversions I'm not sure if the charset conversions can alter the body's length, but I know the first one will. that's important because if I were to fix the html content serializer to have linebreaks every x bytes, the post processing we do could alter the body and break us. here's what I propose to do: I'll add some code that will check the "final" body, and insert linebreaks if necessary. this code will only need to be called in two places in nsMsgCompose.cpp and nsMsgSend.cpp, after we've gotten and tweaked the body. this approach is simpler (and less risky) since I won't have to keep track of the number of chars since the linebreak in nsHTMLContentSerializer. that gets a little hairy, for example, when the message has links. we over cross into nsXMLContentSerializer::SerializeAttr(). the code I plan to add will scan the final body, making sure there is a linefeed every so many bytes. in most cases, we should be ok and no action will be required. when action is required, we'll pay the price with string copy and I'll insert the linebreak. while I wait for feedback, I'll start working on a fix. I think this fix will be low risk / high reward enough to take it for 0.9.2
Attached patch patch (deleted) — Splinter Review
I've attached a patch. this patch will we'll insert CRLFs if necessary. in the current test case (HTML reply to a message with a large <pre> block) we insert the CRLFs and I'm able to save (as draft) and send the message. ducarroz, mscott please review. I'm also looking for feedback about the decision to do this in mailnews, instead of in the HTML content serializer.
Yes, charset conversions alter the length. It also true for the HTML entities like &nbsp;.
Comments: 1. This patch looks like it will always strdup the body (though I guess it was being copied before your patch). It's unforunate that this copy has to take place if no modifications are needed. If you have to insert CRLF, then you'll allocate and copy the entire string twice. 2. It looks like the old copy code did not assume a nul-terminated 'attachment1 [details] [diff] [review]_body' string (since it passes the length), but your new PL_strdup call does. 3. + // XXX TODO + // march backwards and determine the "best" place for the CRLF How important do you think it is to do this? Currently, your code may cause unexpected linebreaks. 4. Can you always assume that the string has CRLF breaks at this point? 5. In the + if ((body[i] == nsCRT::CR) && (body[i+1] == nsCRT::LF)) { clause, don't you want to ++i to account for the LF? 6. + body[i] = '\0'; + newBody.Append(body+lastPos); + newBody.Append(temp); Rather than nuking chars in body, can't you just append with a length?
to answer point 1. The copy is needed because we don't have control on the life of the original data! That's why we are keeping our own internal copy. Vive mime :-) Simon is right about point 2, you cannot presume the buffer received would be always null terminated. Please use a strndup instead.
Comments: > 1. This patch looks like it will always strdup the body (though I guess it was > being copied before your patch). It's unforunate that this copy has to take > place if no modifications are needed. If you have to insert CRLF, then > you'll allocate and copy the entire string twice. as ducarroz pointed out, we already have to do the copy (without rewriting more code to do less copies.) yes, to insert a CRLF will cause another copy. > 2. It looks like the old copy code did not assume a nul-terminated > 'attachment1 [details] [diff] [review]_body' string (since it passes the length), but your new > PL_strdup call does. since it used memcpy(), I worried about that. but I checked: the length (attachment1 [details] [diff] [review]_body_length) is achieved with PL_strlen(). line 4263, nsMsgSend.cpp: attachment1 [details] [diff] [review]_body_length = PL_strlen(attachment1 [details] [diff] [review]_body); and line 630, nsMsgCompose.cpp: bodyLength = PL_strlen(bodyString); > + // XXX TODO > + // march backwards and determine the "best" place for the CRLF > How important do you think it is to do this? Currently, your code > may cause unexpected linebreaks. when this bug bites you, it is most likely going to be when quoting large <pre> blocks. since we removed newlines from the original <pre> block, there is going to be some unexpected line breaks. jfrancis logged a bug on karnaze to get the ball rolling. I think he depends on layout fixing something before he can remove his "\n" -> <br> code (to work around cursor placement.) I don't want to spend too many cycles on a work around to a work around. I'd rather fix the send / save problems first and worry about the bad CRLF > 4. Can you always assume that the string has CRLF breaks at this point? good question. I don't know for sure. I better go check what happens on mac and linux. > 5. In the if ((body[i] == nsCRT::CR) && (body[i+1] == nsCRT::LF)) { > clause, don't you want to ++i to account for the LF? I could, but it would be a trade: one compare for one add. after finding CRLF, we know that in the next iteration of the loop, body[i] will be LF. or am I missing something? > 6. + body[i] = '\0'; > + newBody.Append(body+lastPos); > + newBody.Append(temp); > > Rather than nuking chars in body, can't you just append with a length? I didn't know there was a string API for that. looks like there is, I'll rewrite to use it. thanks for the feedback, sfraser.
the second patch is the same as the first, except I've addressed sfraser's issue #6. I'm still investigating the CRLF (on all platforms) issue.
small nit: can we get rid of the nsCAutoString and make it a nsCString. Why waste the extra copying. Messages bodies are most frequently going to be larger than 29 bytes (or whatever the initial array allocation is these days for the auto string). Also, can't we pre-allocate the new body string based on the size of the original body length? This will reduce the amount of re-allocations that have to go on as we append to the body strings. nsString isn't very efficient in this regards. It's just going to be adding 64 bytes or so at a time when it needs more space. Then it reallocates again. So let's avoid all that.....
mscott: good points, not small nits at all. if the default size for nsCAutoString is that small, it doesn't pay to put it on the stack, just to copy it off to the heap. your pre-allocate idea is slick. I can do use something like: original length + ((original length / limit between CRLFs) * 2) which should give me how many bytes I'll need in the worst case. (2 bytes for each CRLF) I'll go look at the nsCString apis to see how to specify a size. I'll also finish investigating if it is always CRLF (on all platforms) new patch coming.
patch (version 3) addresses mscott's points: use nsCString instead of nsCAutoString, and use SetCapacity() to avoid needless copies and re-allocations. sfraser is right, I messed up on the CRLF issue. at first, I thought I was ok looking at the comments in nsHTMLContentSerializer and nsPlainTextSerializer: // Set the line break character: if ((mFlags & nsIDocumentEncoder::OutputCRLineBreak) && (mFlags & nsIDocumentEncoder::OutputLFLineBreak)) // Windows/mail mLineBreak.AssignWithConversion("\r\n"); but, it turns out we aren't doing that for mail. (I'll fix the comments) instead, we end up using the platform's line break. this is correct because later in the mailnews code, we'll turn the platform line break into CRLF when we need to. new fix coming up. thanks for the catch, smfr.
I don't really see the point of doing: + // body will not have any null bytes, so we can use PL_strdup + char *body = PL_strdup(bodyStr); + if (!body) return NS_ERROR_OUT_OF_MEMORY; at the entry of the function, why not doing the copy only at the end when we detect that we can use the body as it (in the else part of the if statment)?
originally I had to do it at the beginning because I was inserting null bytes so that I could use Append(str). but now that I'm using Append(str,length) [as smfr suggested] I can delay that allocation. good catch, ducarroz. I can also delay the setting of the nsCString's capacity until I know I'll need it. new patch coming.
there might be a way to save another string allocation / copy when I have to insert linebreaks. (currently I allocate once to get a nsCString and then again when I strdup that string at the end.) I'll work on removing that extra copy.
ok, I'm testing a patch that reduces the allocations to one (in all cases) and copying down to a minimum. (at worst, we only copy the string once.) I'll attach it. I've tested on linux, next up win32.
Attached patch patch, v4 (deleted) — Splinter Review
working good on win32, also. I'll send mail asking for re-review from sfmr, ducarroz and mscott. thanks for the excellent feedback.
R=ducarroz
sr=sfraser
updating summary. this doesn't have to do with the html content serializer anymore.
Summary: content serializer should have a flag that enforces a line length → insert linefeeds into messages so that saving as draft and sending will always succeed
a= asa@mozilla.org for checkin to the trunk. (on behalf of drivers)
Blocks: 83989
fixed. I'll attach a test case for QA to use when verifying.
fixed. stephend, let me know if you have questions about how to verify with that sample email.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Verified FIXED using builds: Windows 2K - 2001-06-18-04 Mac OS 9.1 - 2001-06-18-08 RedHat 7.1 - 2001-06-18-08 When using the message that Seth attached (saved in the Local Folders as a .eml that contains the HTML message), I can use Reply, Reply All, and Save (which I verified by looking in my Drafts folder).
Status: RESOLVED → VERIFIED
*** Bug 77062 has been marked as a duplicate of this bug. ***
Product: MailNews → Core
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: