Closed
Bug 119441
Opened 23 years ago
Closed 23 years ago
Problems importing Outlook & Eudora Mail with "From" in the beginning of a line
Categories
(MailNews Core :: Backend, defect, P2)
Tracking
(Not tracked)
VERIFIED
FIXED
mozilla0.9.9
People
(Reporter: victor.wong, Assigned: cavin)
References
(Blocks 1 open bug)
Details
(Whiteboard: nab-imp)
Attachments
(3 files, 2 obsolete files)
(deleted),
application/octet-stream
|
Details | |
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
patch
|
bugzilla
:
review+
sspitzer
:
superreview+
asa
:
approval+
|
Details | Diff | Splinter Review |
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Q312461)
BuildID: 2002011003
I'm trying to import a 12,000+ message inbox from Outlook XP to Mozilla. After
the import, when I view the list of messages in the Inbox in Mozilla, there are
some messages without a subject and all with dates on 12/31/1969 at 4:00pm.
After checking the original messages, I found that these were all were
fragments of original messages. The original messages had lines in the message
body that started like "From now on...".
Reproducible: Always
Steps to Reproduce:
1. Import mail from Outlook
2. Look at list of messages
Actual Results: Some messages with lines in the message body starting with the
word "From..." are misinterpreted as a separate message.
Expected Results: No messages should be misinterpreted.
I'm guessing it's one of two things:
1. If Mozilla stores mail in the MBOX format then the import utility might have
forgotten to perform From_ line quoting as specified in the RFC.
2. The messages are misparsed by the procedure that greps out the header
information for the message list.
Reporter | ||
Comment 1•23 years ago
|
||
This was one of the messages that when imported from Outlook would be
recognized as two messages. One message would have all the header information
and the first part of the message. The other message would have no subject and
have a sender of something like "CPU".
Reporter | ||
Comment 2•23 years ago
|
||
The attachment is a .msg file saved from Outlook. You can just drag it back
into Outlook and then do the import to see the bug.
Comment 3•23 years ago
|
||
yes, this would be a bug in the import utility. I'm not sure who owns it, so I'm
cc'ing Scott and reassigning to Cavin since he's the only one I know who has
worked on import issues in the past.
Assignee: bienvenu → cavin
Status: UNCONFIRMED → NEW
Component: Mail Database → Mail Back End
Ever confirmed: true
Comment 4•23 years ago
|
||
Trunk build 2002-01-11-03:WinMe
Change summary from "Problems with importing mail from Outlook XP" to "Problems
with importing mail from Outlook", since I was able to reproduce the problem
using Outlook on WinMe.
Whiteboard: nab-imp
Reporter | ||
Comment 6•23 years ago
|
||
I decided to test a bit more and found a possibly related problem in build
200201103.
Try this:
1. Click on 'Compose' on the toolbar.
2. Type in 'From me' in the message body.
3. Close the window.
4. It will ask you if you want to save it. Click yes.
5. Select the message in the drafts folder.
You will see that the message has '>From me' as it's message body.
This looks to me as if the display code doesn't understand From_ quoting. So
there seems to be a problem with the import code not performing From_ quoting
and then also a problem with the display code misunderstanding the quoting.
Updated•23 years ago
|
Summary: Problems with importing mail from Outlook XP → Problems with importing mail from Outlook
Assignee | ||
Comment 7•23 years ago
|
||
Yes, it's caused by the import code not performing the "From_" line quoting (in
the body text). Reporter, if you like, you can fix the problem now by manually
adding '>' in front of all the "from ..." lines in the imported INBOX file.
Then, remove the associated INBOX.msf file, restart mailnews program and reopen
the INBOX. Just a workaround for now until this is fixed.
For the "display code doesn't understand From_ quoting" problem, we should
address it in a separate bug.
Comment 8•23 years ago
|
||
Changing summary to "Problems with importing mail from Outlook & Subjects
starting with "From".
Summary: Problems with importing mail from Outlook → Problems with importing mail from Outlook & Subjects starting with "From"
Comment 9•23 years ago
|
||
Sorry for the noise. Changing summary to more closely reflect reality.
Changing from 'Problems with importing mail from Outlook & Subjects starting
with "From"' to 'Problems importing Outlook Mail with "From" in the beginning of
a line'.
Note: the problem also occurs if the attachment has text that has "from" in the
begging of a line.
Summary: Problems with importing mail from Outlook & Subjects starting with "From" → Problems importing Outlook Mail with "From" in the beginning of a line
Assignee | ||
Comment 11•23 years ago
|
||
Added code to prefix lines starting with "From " with ">" in the msg body.
Comment 12•23 years ago
|
||
Comment on attachment 65324 [details] [diff] [review]
(Av1) proposed patch
The patch looks good but I have a little concern about performance especially
on Mac with the fact your are doing a bunch of small write to disk. Maybe you
should try to write to a buffer and flush it to disk when it's full. A size of
4K should be fine. Import is not a frequent operation but it could be very time
consuming when you are importing a hudge inbox. R=ducarroz
Attachment #65324 -
Flags: review+
Comment 13•23 years ago
|
||
+ if (dataLen == 1)
+ {
+ pDst->Write(start, 1, &written);
+ return NS_OK;
+ }
if (dataLen < 5)
{
pDst->Write(start, dataLen, &written);
return NS_OK;
}
:-)
Also what is dataLen?
And you use NS_ENSURE_TRUE(rv) on other writes...
+ // Find a line so check for "From ".
+ *pChar = 0;
+ if ( nsCRT::strlen(start) > 4 && (*(start) == 'F') && (*(start + 1) ==
'r') &&
+ (*(start + 2) == 'o') && (*(start + 3) == 'm') && (*(start + 4) == '
') )
+ rv = pDst->Write(">", 1, &written);
+ *pChar = 0x0D;
simply if ((pChar - start) > 4 && .....)
rv = pDst->Write(">", 1, &written);
?
+ rv = pDst->Write(start, pChar-start+2, &written);
^^^
Any chance that message has unix-style linebreaks? (Just speculating, I'm not
windows guy)
Comment 14•23 years ago
|
||
What about smth like
pChar = start;
while (start < end)
{
while (pChar <= end && pChar != 'F')
pChar++;
if (pChar < end) {
if ((pChar == body.m_pBuffer + body.m_writeOffset || (pChar[-1] == '\n' &&
pChar[-2] == '\r')) &&
pChar[1] == 'r' && pChar[2] == 'o' && pChar[3] == 'm' && pChar[4] ==' ')
{
if (start < pChar)
{
pDst->Write(start, pChar - start, &written);
start = pChar;
pChar += 5;
}
rv = pDst->Write(">", 1, &written);
}
}
else if (start < end)
{
// Flush out the remaining data and we're done.
rv = pDst->Write(start, end-start, &written);
NS_ENSURE_SUCCESS(rv,rv);
break;
}
}
Assignee | ||
Comment 15•23 years ago
|
||
+ if (dataLen == 1)
+ {
+ pDst->Write(start, 1, &written);
+ return NS_OK;
+ }
I guess I attached the wrong patch file. Good catch. I'll attach the final one
and the main logic stays the same.
Assignee | ||
Comment 16•23 years ago
|
||
Assignee | ||
Updated•23 years ago
|
Attachment #65324 -
Attachment is obsolete: true
Comment 17•23 years ago
|
||
Comment on attachment 66138 [details] [diff] [review]
(Av2) The right patch
R=ducarroz
Attachment #66138 -
Flags: review+
Comment 18•23 years ago
|
||
Comment on attachment 66138 [details] [diff] [review]
(Av2) The right patch
some general comments:
1) did you log a bug on us not handling ">From_" quoting?
2) I noticed that when I composed a plain text draft message, wrote "From ",
and save it,
it got saved as " From ". So we appear to be doing some sort of "From "
escaping, but it doesn't
seem like we are doing the right type.
3) Proper "From " quoting also means that you quote ">From " as ">>From ",
">>From ", as ">>>From ", etc.
4) What about other methods of import? (does this bug affect Eudora import?)
>+ char *start = body.m_pBuffer + body.m_writeOffset;
>+ char *end = body.m_pBuffer+body.m_bytesInBuf;
end should be a const char *, since it doesn't change, right?
>+ while ((pChar <= end) && (*pChar != 0x0D) && (*(pChar+1) != 0x0A))
if pChar == end, pChar+1 might be off the end of the buffer, right?
>+ if ( nsCRT::strlen(start) > 4 && (*(start) == 'F') && (*(start + 1) == 'r') &&
>+ (*(start + 2) == 'o') && (*(start + 3) == 'm') && (*(start + 4) == ' ') )
doing a strlen here is bad for performance.
I think you are going to need a while loop, to eat leading '>', looking for the
first 'F'.
once you find it, then you can do strncmp(ptr, "From ", 5);
you should find out where and why the compose (or local folder?) code is
quoting "From " as " From ".
We should be fixing that code as well. Investigate if it is worth having this
code live under local,
and we call it from the import code.
The next step, after this is fixed, is to properly handle ">From" quoting on
display. This should be
easy, as we currently handle "." escaping in various places.
Attachment #66138 -
Flags: needs-work+
Comment 19•23 years ago
|
||
I think you can break it up into two bugs:
1) escaping (make sure the eudora and outlook import code works and fix the
code that is escaping "From " -> " From " when writing to disk. Make sure to
see do the right thing if you have a message on the imap server, and you copy
it to a local folder, as well as when you save drafts, templates, send later,
etc.) I think it will make sense to write the escaping code once (in
mozilla/mailnews/local or in mozilla/mailnews/base/util I'm not sure yet), and
then call it from all various places that need it. There will be at least two.
2) unescaping (make sure that the local folder parsing code
escapes ">From ", ">>From ", ">>>From ", etc.). This can be a seperate bug.
Comment 20•23 years ago
|
||
I think you can break it up even further, into three bugs:
1) escaping, part 1. (make sure the eudora and outlook import code works)
It will make sense to write the escaping code once (in
mozilla/mailnews/local or in mozilla/mailnews/base/util I'm not sure yet), and
then call it from all various places that need it. There will be at least two.
2) escaping, part 2. fix the mozilla mbox code to escape the same way and fix
the import code does. Make sure to see do the right thing if you have a
message on the imap server, and you copy
it to a local folder, as well as when you save drafts, templates, send later,
etc.)
3) unescaping (make sure that the local folder parsing code
escapes ">From ", ">>From ", ">>>From ", etc.). This can be a seperate bug.
Assignee | ||
Comment 21•23 years ago
|
||
Assignee | ||
Updated•23 years ago
|
Attachment #66138 -
Attachment is obsolete: true
Assignee | ||
Comment 22•23 years ago
|
||
Outlook Express has its own way of escaping "From " so I left it alone. The code
is quite different from those of Outlook and Eudora import (looks like written
by different people). Let me know if that needs to be changed as well (it'll
take some more time to incorporate the util functions in the code).
ducarroz, I'm not sure if the makefile for Mac will need to be changed.
Assignee | ||
Comment 23•23 years ago
|
||
Adding "Eudora" to Summary because it has the same problem.
Summary: Problems importing Outlook Mail with "From" in the beginning of a line → Problems importing Outlook & Eudora Mail with "From" in the beginning of a line
Assignee | ||
Comment 24•23 years ago
|
||
Per comment #20, "escaping, part2" and "unescaping" are logged as bug 121946 and
bug 121947 respectively.
Updated•23 years ago
|
Comment 25•23 years ago
|
||
Comment on attachment 66559 [details] [diff] [review]
(Av3) Put the util functions in nsLocalUtils.cpp, fix Outlook & Eduora problem
[Checked in: Comment 26]
looks good. some minor comments:
>+ if ( (*start == 'F') && (end-start > 4) && !nsCRT::strncmp(start, "From ", 5) )
do strncmp() instead of nsCRT::strncmp(), it's faster.
>+ rv = PR_TRUE;
>+ return rv;
>+}
>+
>+//
>+// This function finds all lines starting with "From " or "From " preceeding
>+// with one or more '>' (ie, ">From", ">>From", etc) in the input buffer
>+// (between 'start' and 'end') and prefix them with a ">" .
>+//
>+nsresult EscapeFromSpaceLine(nsIFileSpec *pDst, char *start, const char *end)
>+{
>+ nsresult rv;
>+ char *pChar;
>+ PRInt32 written;
>+
>+ pChar = start;
>+ while (start < end)
>+ {
this code is now not in a windows only file. (that import code was windows
only, right?)
so is this code "correct" for all platforms?
>+ while ((pChar < end) && (*pChar != 0x0D) && (*(pChar+1) != 0x0A))
>+ pChar++;
instead of 0x0A and 0x0D, consider using nsCRT::CR and nsCRT::LF, to make the
code more readable.
(if you want to fix the rest of the import code, that would be great, but that
can be spun off to a new bug.)
these functions might be better in msgbaseutil, as local and the import
libraries (at least eudora?) already links that in,
and adding local (which is large static library) will eat up more RAM at run
time.
address those nits, make sure we are doing the line endings correct for all
platforms, and investigate moving this code to baseutil, and then sr=sspitzer
Assignee | ||
Comment 26•23 years ago
|
||
Fix checked in.
Tested the patch on Mac Eudora import without problems. On all platforms we do
end lines with CRLF when, for example, saving draft msgs to local folders. So
the new function should be fine.
The new code was moved to msgbaseutil and two minor things were fixed.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Comment 27•23 years ago
|
||
Reopening
Trunk build 2002-03-01: WinMe
After importing Eudora Mail I see two messages where the Subject is blank, the
Sender state "the", and the date is 1969. I'm sending Cavin my Eudora Mail
folder to determine the cause.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 28•23 years ago
|
||
Ok, I see where the problem is. Working on a patch.
Assignee | ||
Comment 29•23 years ago
|
||
Changes are:
1. Recplace pDst->Write() call with EscapeFromSpaceLine() when body text is >
8K. This is where the problem was.
2. Replace "NS_ENSURE_SUCCESS(rv,rv)" with "if (NS_SUCCEEDED(rv))" so that the
code will follow through and the source file stream will be closed in case of
errors.
Both Eudora and Outlook have the same problem.
Comment 30•23 years ago
|
||
Comment on attachment 72630 [details] [diff] [review]
(Bv1) Fixed not escaping from line problem when body text is > 8K
[Checked in: Comment 33]
R=ducarroz
Attachment #72630 -
Flags: review+
Comment 31•23 years ago
|
||
Comment on attachment 72630 [details] [diff] [review]
(Bv1) Fixed not escaping from line problem when body text is > 8K
[Checked in: Comment 33]
sr=sspitzer
Attachment #72630 -
Flags: superreview+
Comment 32•23 years ago
|
||
Comment on attachment 72630 [details] [diff] [review]
(Bv1) Fixed not escaping from line problem when body text is > 8K
[Checked in: Comment 33]
a=asa (on behalf of drivers) for checkin to the 1.0 trunk
Attachment #72630 -
Flags: approval+
Assignee | ||
Comment 33•23 years ago
|
||
Fix checked into trunk.
Status: REOPENED → RESOLVED
Closed: 23 years ago → 23 years ago
Resolution: --- → FIXED
Comment 34•23 years ago
|
||
Trunk build 2002-03-22: WinMe, Mac 9.1
Verified Fixed, checked Eudora, Outlook Express and Outlook.
Status: RESOLVED → VERIFIED
Updated•21 years ago
|
Attachment #65324 -
Attachment description: proposed patch, v1 → (Av1) proposed patch
Updated•21 years ago
|
Attachment #66138 -
Attachment description: The right patch. → (Av2) The right patch
Updated•21 years ago
|
Attachment #72630 -
Attachment description: Fixed not escaping from line problem when body text is > 8K. → (Bv1) Fixed not escaping from line problem when body text is > 8K
[Checked in: Comment 33]
Updated•21 years ago
|
Attachment #66559 -
Attachment description: Put the util functions in nsLocalUtils.cpp, fix Outlook & Eduora problem. → (Av3) Put the util functions in nsLocalUtils.cpp, fix Outlook & Eduora problem
[Checked in: Comment 26]
Updated•20 years ago
|
Product: MailNews → Core
Updated•16 years ago
|
Product: Core → MailNews Core
You need to log in
before you can comment on or make changes to this bug.
Description
•