Closed
Bug 973316
Opened 11 years ago
Closed 11 years ago
TEST-UNEXPECTED-FAIL + PROCESS-CRASH | xpcshell/tests/mailnews/local/test/unit/test_nsIMsgParseMailMsgState.js
Categories
(MailNews Core :: MIME, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
Thunderbird 31.0
People
(Reporter: mkmelin, Assigned: jcranmer)
References
Details
(Keywords: intermittent-failure)
Attachments
(2 files)
(deleted),
patch
|
rkent
:
review+
|
Details | Diff | Splinter Review |
(deleted),
patch
|
Details | Diff | Splinter Review |
Seen on Linux x64 Opt
https://tbpl.mozilla.org/php/getParsedLog.php?id=34733807&tree=Thunderbird-Trunk&full=1#error1
TEST-UNEXPECTED-FAIL | /builds/slave/test/build/xpcshell/tests/mailnews/local/test/unit/test_nsIMsgParseMailMsgState.js | test failed (with xpcshell return code: -11), see following log:
>>>>>>>
TEST-INFO | (xpcshell/head.js) | test MAIN run_test pending (1)
<<<<<<<
mozcrash INFO | Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/thunderbird/tinderbox-builds/comm-central-linux64/1392462351/thunderbird-30.0a1.en-US.linux-x86_64.crashreporter-symbols.zip
PROCESS-CRASH | /builds/slave/test/build/xpcshell/tests/mailnews/local/test/unit/test_nsIMsgParseMailMsgState.js | application crashed [@ nsDependentCString::nsDependentCString(char const*)]
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
It only seems to happen on linux 64bit and the stack is:
Thread 0 (crashed)
0 libxul.so!nsDependentCString::nsDependentCString(char const*) [nsCharTraits.h:1ba07b762e2e : 505 + 0x9]
rbx = 0x00007fffaeab1048 r12 = 0x00007fffaeab1040
r13 = 0x00007f1a479eb640 r14 = 0x00007f1a56499460
r15 = 0x00007f1a4b445880 rip = 0x00007f1a551415dd
rsp = 0x00007fffaeab0e70 rbp = 0x00007fffaeab1030
Found by: given as instruction pointer in context
1 libxul.so!NS_MsgStripRE(char const**, unsigned int*, char**) [nsMsgUtils.cpp:1ba07b762e2e : 652 + 0x16]
rbx = 0x00007fffaeab1048 r12 = 0x00007fffaeab1040
r13 = 0x00007f1a479eb640 r14 = 0x00007f1a56499460
r15 = 0x00007f1a4b445880 rip = 0x00007f1a5637975c
rsp = 0x00007fffaeab0e80 rbp = 0x00007fffaeab1030
Found by: call frame info
2 libxul.so!nsParseMailMessageState::InternSubject(message_header*) [nsParseMailbox.cpp:1ba07b762e2e : 1234 + 0x4]
rbx = 0x00007f1a466d3400 r12 = 0x00007fffaeab1050
r13 = 0x00007fffaeab1168 r14 = 0x0000000000001003
r15 = 0x00007f1a4b445880 rip = 0x00007f1a56465dd1
rsp = 0x00007fffaeab1040 rbp = 0x00007fffaeab1080
Found by: call frame info
3 libxul.so!nsParseMailMessageState::FinalizeHeaders() [nsParseMailbox.cpp:1ba07b762e2e : 1466 + 0xe]
rbx = 0x00007f1a466d3400 r12 = 0x00007f1a466d3480
r13 = 0x00007fffaeab1168 r14 = 0x0000000000001003
r15 = 0x00007f1a4b445880 rip = 0x00007f1a56469497
rsp = 0x00007fffaeab1090 rbp = 0x00007fffaeab1310
Found by: call frame info
4 libxul.so!nsParseMailMessageState::ParseFolderLine(char const*, unsigned int) [nsParseMailbox.cpp:1ba07b762e2e : 676 + 0x7]
rbx = 0x00007f1a466d3400 r12 = 0x0000000000000002
r13 = 0x00007f1a551828fd r14 = 0x00007fffaeab1520
r15 = 0x00007fffaeab1506 rip = 0x00007f1a56469d47
May it be related to patch http://hg.mozilla.org/comm-central/rev/c7ecb1bd06ae that last did changes in nsParseMailMessageState::FinalizeHeaders , just before the InterSubject call? Even though it does not seem to touch the subject variable. May there be some in-memory variable aligning problem only affecting 64bit?
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 17•11 years ago
|
||
I looked into this since the tests were very erratic to actually work. Turns out, we have a nice heap overflow in here.
nsParseMsgMailState uses non-null terminated strings to pass into NS_MsgStripRE. NS_MsgStripRE assumes its strings are null-terminated in places. I'd blame my attempts to nsCString-ify nsIMimeConverter, but the old (and, well, current) implementation still used string in the xpidl, so the original versions were still wrong.
I've pushed an attempted fix to try: <https://tbpl.mozilla.org/?tree=Thunderbird-Try&rev=09eca902f9a3>.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 21•11 years ago
|
||
Sob sob sob sob.
The attempted fix(es) didn't work. The real problem is that nsParseMsgMailState thinks it's giving out nul-terminated strings (it tries hard to), but it fails to do so. If I add an assert that fails if we give out a non-null-terminated string, this crash is so persistently common that I can actually debug it.
nsParseMsgMailState only manages to null-terminate if buf points to a \r or a \n character. Which it always will, unless the last header line consists of a header line containing only FWS. Confirmation that this is what is occuring in the tests is obvious when you look at code coverage results: <http://www.tjhsst.edu/~jcranmer/c-ccov/mailnews/local/src/nsParseMailbox.cpp.html> (a branch that is necessary for null termination that is not always taken!?).
I once played with rewriting this code in JS. After all the effort I've gone through to get this damn thing to pass a newly-added assertion that it only hands out null-terminated strings, I'm going to move that from "fun side experiment" to "serious attempts to make a reviewable patch and get it checked in:" the C-style string handling here is baroque, full of errors (this makes, by my count, the 5th patch in as many years fixing a string overrun in this code).
Newer try push: <https://tbpl.mozilla.org/?tree=Thunderbird-Try&rev=c53367503e8f>.
Assignee | ||
Comment 22•11 years ago
|
||
This adds the code that satisfies the checks and also fixes other potential buffer overruns (buf_end isn't safely dereferenceable, although it requires a very precise message to hit such a scenario due to the growth strategy of the underlying buffer).
Comment 23•11 years ago
|
||
May this cause dataloss, or it crashes immediately by buffer overflow (access to somebody elses memory)?
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 26•11 years ago
|
||
IRC description:
11:44:29 AM - rkent: jcranmer: can you give me a short description of what is the key to your patch fixing the current crash? I haven;t tried to figure that out yet, still trying to recall how all of this was supposed to work.
11:45:51 AM - jcranmer: okay
11:46:30 AM - jcranmer: the key issue is that the only way the header gets null terminated is if buf points to \r or \n at the end (line 1118 or so)
11:47:55 AM - jcranmer: the error scenario is if you the last line of the header blocks is nothing but whitespace
11:48:36 AM - jcranmer: in that scenario, when the code hits the if(header) {} if (*buf == '\r' || *buf == '\n') {} checks
11:48:51 AM - jcranmer: buf is pointing to buf_end, which is random uninitialized memory
11:49:09 AM - jcranmer: well, actually, it's not random uninitialized memory
11:49:22 AM * jcranmer thinks
11:49:35 AM - jcranmer: uh... maybe it is
11:49:55 AM - jcranmer: I think I was following the wrong pointer
11:50:09 AM - jcranmer: in any case, wherever buf is pointing to, it isn't a CR or a LF character
11:50:33 AM - jcranmer: so the null terminator is never added
11:50:48 AM - jcranmer: and we pass around a string to everybody that is assumed to be null terminated that isn't
11:50:57 AM - jcranmer: and very very many bad things happen
11:51:35 AM - jcranmer: the main addition block on lines 1092-1102 fixes the main error
11:52:18 AM - jcranmer: I added the latter MOZ_ASSERT to guarantee null-termination, and the first MOZ_ASSERT is a precondition for the well-working of the entire code block
11:53:29 AM - jcranmer: (namely, if there is no trailing CRLF to the header block, then the code to add the null terminator doesn't have anything to land on and safely change)
11:53:35 AM - Archae|mobile has left the room (Quit: Bye).
11:54:05 AM - jcranmer: the change on line 1110 is used to fix an issue found in another test
11:54:25 AM - jcranmer: where value runs out of bounds and everything blows up
11:54:40 AM - jcranmer: [well, same test, other test data]
11:55:25 AM - jcranmer: the other two changes were visual inspections to try to ensure that we don't ever dereference buf_end
11:56:22 AM - jcranmer: [the test data that causes the second failure is a completely empty message header, so the attempt to strip the whitespace fails]
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 28•11 years ago
|
||
Comment on attachment 8410683 [details] [diff] [review]
Fix the buffer overruns
Review of attachment 8410683 [details] [diff] [review]:
-----------------------------------------------------------------
::: mailnews/local/src/nsParseMailbox.cpp
@@ +1055,5 @@
>
> SEARCH_NEWLINE:
> // move past any non terminating characters, rewriting them if folding white space
> // exists
> + while (buf < buf_end && *buf != '\r' && *buf != '\n')
This is reverting a change from bug 877831. I agree that the reversion is correct, and it was the later code in that patch (that checked for negative header length) that actually fixed the crash.
@@ +1106,5 @@
> if (header)
> {
> value = colon + 1;
> // eliminate trailing blanks after the colon
> + while (value < (buf - writeOffset) && (*value == ' ' || *value == '\t'))
Note for reference: this change really is a fix for the same issue that bug 877831 was trying to address. After this fix, the later code "if (header->length < 0)" is no longer needed, but there is no harm in keeping it in. What is happening is that, when there are both folded lines as well as a blank header, these trailing blanks after the colon are incorporated into the folding.
@@ +1114,5 @@
> header->length = buf - header->value - writeOffset;
> if (header->length < 0)
> header->length = 0;
> }
> if (*buf == '\r' || *buf == '\n')
Rather than do your --buf above, which strikes me as odd, couldn't you instead add here "|| buf == buf_end"? It seems to me that the real issue here is that buf == buf_end is also a terminating condition, and we need to allow for that possibility at this point.
Attachment #8410683 -
Flags: feedback?(Pidgeot18)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 33•11 years ago
|
||
Here's my try push:
https://tbpl.mozilla.org/?tree=Thunderbird-Try&rev=4b369acd38d6
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 35•11 years ago
|
||
Comment 36•11 years ago
|
||
Comment on attachment 8413263 [details] [diff] [review]
RKJ variant using buf>=buf_end as terminal condition
Here is my variant to the patch. Although I prefer this version, I think that either this or the previous version is acceptable. I'll let jcranmer make the final call, but I'll r+ his patch.
Attachment #8413263 -
Attachment description: RJV variant using buf>=buf_end as terminal condition → RKJ variant using buf>=buf_end as terminal condition
Updated•11 years ago
|
Attachment #8410683 -
Flags: review?(kent) → review+
Updated•11 years ago
|
Attachment #8410683 -
Flags: feedback?(Pidgeot18)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 39•11 years ago
|
||
https://hg.mozilla.org/comm-central/rev/9ee372f0a6b8
After discussion on IRC, the problem with the alternative version of the patch, using buf_end as a separate error condition, is that *buf_end isn't necessarily valid memory to stomp on, although the growth strategy of nsByteArray makes that hard to execute.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 31.0
Updated•10 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•