Closed Bug 973316 Opened 11 years ago Closed 11 years ago

TEST-UNEXPECTED-FAIL + PROCESS-CRASH | xpcshell/tests/mailnews/local/test/unit/test_nsIMsgParseMailMsgState.js

Categories

(MailNews Core :: MIME, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 31.0

People

(Reporter: mkmelin, Assigned: jcranmer)

References

Details

(Keywords: intermittent-failure)

Attachments

(2 files)

Seen on Linux x64 Opt https://tbpl.mozilla.org/php/getParsedLog.php?id=34733807&tree=Thunderbird-Trunk&full=1#error1 TEST-UNEXPECTED-FAIL | /builds/slave/test/build/xpcshell/tests/mailnews/local/test/unit/test_nsIMsgParseMailMsgState.js | test failed (with xpcshell return code: -11), see following log: >>>>>>> TEST-INFO | (xpcshell/head.js) | test MAIN run_test pending (1) <<<<<<< mozcrash INFO | Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/thunderbird/tinderbox-builds/comm-central-linux64/1392462351/thunderbird-30.0a1.en-US.linux-x86_64.crashreporter-symbols.zip PROCESS-CRASH | /builds/slave/test/build/xpcshell/tests/mailnews/local/test/unit/test_nsIMsgParseMailMsgState.js | application crashed [@ nsDependentCString::nsDependentCString(char const*)]
It only seems to happen on linux 64bit and the stack is: Thread 0 (crashed) 0 libxul.so!nsDependentCString::nsDependentCString(char const*) [nsCharTraits.h:1ba07b762e2e : 505 + 0x9] rbx = 0x00007fffaeab1048 r12 = 0x00007fffaeab1040 r13 = 0x00007f1a479eb640 r14 = 0x00007f1a56499460 r15 = 0x00007f1a4b445880 rip = 0x00007f1a551415dd rsp = 0x00007fffaeab0e70 rbp = 0x00007fffaeab1030 Found by: given as instruction pointer in context 1 libxul.so!NS_MsgStripRE(char const**, unsigned int*, char**) [nsMsgUtils.cpp:1ba07b762e2e : 652 + 0x16] rbx = 0x00007fffaeab1048 r12 = 0x00007fffaeab1040 r13 = 0x00007f1a479eb640 r14 = 0x00007f1a56499460 r15 = 0x00007f1a4b445880 rip = 0x00007f1a5637975c rsp = 0x00007fffaeab0e80 rbp = 0x00007fffaeab1030 Found by: call frame info 2 libxul.so!nsParseMailMessageState::InternSubject(message_header*) [nsParseMailbox.cpp:1ba07b762e2e : 1234 + 0x4] rbx = 0x00007f1a466d3400 r12 = 0x00007fffaeab1050 r13 = 0x00007fffaeab1168 r14 = 0x0000000000001003 r15 = 0x00007f1a4b445880 rip = 0x00007f1a56465dd1 rsp = 0x00007fffaeab1040 rbp = 0x00007fffaeab1080 Found by: call frame info 3 libxul.so!nsParseMailMessageState::FinalizeHeaders() [nsParseMailbox.cpp:1ba07b762e2e : 1466 + 0xe] rbx = 0x00007f1a466d3400 r12 = 0x00007f1a466d3480 r13 = 0x00007fffaeab1168 r14 = 0x0000000000001003 r15 = 0x00007f1a4b445880 rip = 0x00007f1a56469497 rsp = 0x00007fffaeab1090 rbp = 0x00007fffaeab1310 Found by: call frame info 4 libxul.so!nsParseMailMessageState::ParseFolderLine(char const*, unsigned int) [nsParseMailbox.cpp:1ba07b762e2e : 676 + 0x7] rbx = 0x00007f1a466d3400 r12 = 0x0000000000000002 r13 = 0x00007f1a551828fd r14 = 0x00007fffaeab1520 r15 = 0x00007fffaeab1506 rip = 0x00007f1a56469d47 May it be related to patch http://hg.mozilla.org/comm-central/rev/c7ecb1bd06ae that last did changes in nsParseMailMessageState::FinalizeHeaders , just before the InterSubject call? Even though it does not seem to touch the subject variable. May there be some in-memory variable aligning problem only affecting 64bit?
I looked into this since the tests were very erratic to actually work. Turns out, we have a nice heap overflow in here. nsParseMsgMailState uses non-null terminated strings to pass into NS_MsgStripRE. NS_MsgStripRE assumes its strings are null-terminated in places. I'd blame my attempts to nsCString-ify nsIMimeConverter, but the old (and, well, current) implementation still used string in the xpidl, so the original versions were still wrong. I've pushed an attempted fix to try: <https://tbpl.mozilla.org/?tree=Thunderbird-Try&rev=09eca902f9a3>.
Sob sob sob sob. The attempted fix(es) didn't work. The real problem is that nsParseMsgMailState thinks it's giving out nul-terminated strings (it tries hard to), but it fails to do so. If I add an assert that fails if we give out a non-null-terminated string, this crash is so persistently common that I can actually debug it. nsParseMsgMailState only manages to null-terminate if buf points to a \r or a \n character. Which it always will, unless the last header line consists of a header line containing only FWS. Confirmation that this is what is occuring in the tests is obvious when you look at code coverage results: <http://www.tjhsst.edu/~jcranmer/c-ccov/mailnews/local/src/nsParseMailbox.cpp.html> (a branch that is necessary for null termination that is not always taken!?). I once played with rewriting this code in JS. After all the effort I've gone through to get this damn thing to pass a newly-added assertion that it only hands out null-terminated strings, I'm going to move that from "fun side experiment" to "serious attempts to make a reviewable patch and get it checked in:" the C-style string handling here is baroque, full of errors (this makes, by my count, the 5th patch in as many years fixing a string overrun in this code). Newer try push: <https://tbpl.mozilla.org/?tree=Thunderbird-Try&rev=c53367503e8f>.
Attached patch Fix the buffer overruns (deleted) — Splinter Review
This adds the code that satisfies the checks and also fixes other potential buffer overruns (buf_end isn't safely dereferenceable, although it requires a very precise message to hit such a scenario due to the growth strategy of the underlying buffer).
Assignee: nobody → Pidgeot18
Status: NEW → ASSIGNED
Attachment #8410683 - Flags: review?(kent)
May this cause dataloss, or it crashes immediately by buffer overflow (access to somebody elses memory)?
IRC description: 11:44:29 AM - rkent: jcranmer: can you give me a short description of what is the key to your patch fixing the current crash? I haven;t tried to figure that out yet, still trying to recall how all of this was supposed to work. 11:45:51 AM - jcranmer: okay 11:46:30 AM - jcranmer: the key issue is that the only way the header gets null terminated is if buf points to \r or \n at the end (line 1118 or so) 11:47:55 AM - jcranmer: the error scenario is if you the last line of the header blocks is nothing but whitespace 11:48:36 AM - jcranmer: in that scenario, when the code hits the if(header) {} if (*buf == '\r' || *buf == '\n') {} checks 11:48:51 AM - jcranmer: buf is pointing to buf_end, which is random uninitialized memory 11:49:09 AM - jcranmer: well, actually, it's not random uninitialized memory 11:49:22 AM * jcranmer thinks 11:49:35 AM - jcranmer: uh... maybe it is 11:49:55 AM - jcranmer: I think I was following the wrong pointer 11:50:09 AM - jcranmer: in any case, wherever buf is pointing to, it isn't a CR or a LF character 11:50:33 AM - jcranmer: so the null terminator is never added 11:50:48 AM - jcranmer: and we pass around a string to everybody that is assumed to be null terminated that isn't 11:50:57 AM - jcranmer: and very very many bad things happen 11:51:35 AM - jcranmer: the main addition block on lines 1092-1102 fixes the main error 11:52:18 AM - jcranmer: I added the latter MOZ_ASSERT to guarantee null-termination, and the first MOZ_ASSERT is a precondition for the well-working of the entire code block 11:53:29 AM - jcranmer: (namely, if there is no trailing CRLF to the header block, then the code to add the null terminator doesn't have anything to land on and safely change) 11:53:35 AM - Archae|mobile has left the room (Quit: Bye). 11:54:05 AM - jcranmer: the change on line 1110 is used to fix an issue found in another test 11:54:25 AM - jcranmer: where value runs out of bounds and everything blows up 11:54:40 AM - jcranmer: [well, same test, other test data] 11:55:25 AM - jcranmer: the other two changes were visual inspections to try to ensure that we don't ever dereference buf_end 11:56:22 AM - jcranmer: [the test data that causes the second failure is a completely empty message header, so the attempt to strip the whitespace fails]
Comment on attachment 8410683 [details] [diff] [review] Fix the buffer overruns Review of attachment 8410683 [details] [diff] [review]: ----------------------------------------------------------------- ::: mailnews/local/src/nsParseMailbox.cpp @@ +1055,5 @@ > > SEARCH_NEWLINE: > // move past any non terminating characters, rewriting them if folding white space > // exists > + while (buf < buf_end && *buf != '\r' && *buf != '\n') This is reverting a change from bug 877831. I agree that the reversion is correct, and it was the later code in that patch (that checked for negative header length) that actually fixed the crash. @@ +1106,5 @@ > if (header) > { > value = colon + 1; > // eliminate trailing blanks after the colon > + while (value < (buf - writeOffset) && (*value == ' ' || *value == '\t')) Note for reference: this change really is a fix for the same issue that bug 877831 was trying to address. After this fix, the later code "if (header->length < 0)" is no longer needed, but there is no harm in keeping it in. What is happening is that, when there are both folded lines as well as a blank header, these trailing blanks after the colon are incorporated into the folding. @@ +1114,5 @@ > header->length = buf - header->value - writeOffset; > if (header->length < 0) > header->length = 0; > } > if (*buf == '\r' || *buf == '\n') Rather than do your --buf above, which strikes me as odd, couldn't you instead add here "|| buf == buf_end"? It seems to me that the real issue here is that buf == buf_end is also a terminating condition, and we need to allow for that possibility at this point.
Attachment #8410683 - Flags: feedback?(Pidgeot18)
Comment on attachment 8413263 [details] [diff] [review] RKJ variant using buf>=buf_end as terminal condition Here is my variant to the patch. Although I prefer this version, I think that either this or the previous version is acceptable. I'll let jcranmer make the final call, but I'll r+ his patch.
Attachment #8413263 - Attachment description: RJV variant using buf>=buf_end as terminal condition → RKJ variant using buf>=buf_end as terminal condition
Attachment #8410683 - Flags: review?(kent) → review+
Attachment #8410683 - Flags: feedback?(Pidgeot18)
https://hg.mozilla.org/comm-central/rev/9ee372f0a6b8 After discussion on IRC, the problem with the alternative version of the patch, using buf_end as a separate error condition, is that *buf_end isn't necessarily valid memory to stomp on, although the growth strategy of nsByteArray makes that hard to execute.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 31.0
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: