Closed Bug 1523818 Opened 6 years ago Closed 6 years ago

Crash in nsMapiHook::PopulateCompFieldsW

Categories

(Thunderbird :: General, defect)

x86
Windows 7
defect
Not set
critical

Tracking

(thunderbird66 fixed, thunderbird67 fixed)

RESOLVED FIXED
Thunderbird 67.0
Tracking Status
thunderbird66 --- fixed
thunderbird67 --- fixed

People

(Reporter: wsmwk, Assigned: jorgk-bmo)

References

(Blocks 1 open bug)

Details

(Keywords: crash)

Crash Data

Attachments

(2 files, 6 obsolete files)

#1 crash for 65.0b4. 50 crashes in 7 hours for 20 users.

This bug is for crash report bp-81c27caa-267b-46eb-b740-7a1b00190130.

Top 10 frames of crashing thread:

0 xul.dll nsMapiHook::PopulateCompFieldsW comm/mailnews/mapi/mapihook/src/msgMapiHook.cpp:701
1 xul.dll CMapiImp::SendMailW comm/mailnews/mapi/mapihook/src/msgMapiImp.cpp:264
2 rpcrt4.dll Invoke
3 rpcrt4.dll _imp_load__FreeAddrInfoW
4 ole32.dll ole32.dll@0x1407f5
5 ole32.dll ChannelProcessInitialize
6 ole32.dll ChannelWrapper_QueryInterface
7 ole32.dll CCtxComChnl::ContextInvoke
8 ole32.dll MTAInvoke d:\w7rtm\com\ole32\com\dcomrem\callctrl.cxx:2097
9 ole32.dll STAInvoke

Attached patch 1523818-crash.patch (obsolete) (deleted) — Splinter Review

Looks like we forgot to zero out this structure. I'm not sure what the {} were meant to do.

Assignee: nobody → jorgk
Status: NEW → ASSIGNED
Attachment #9040038 - Flags: review?(mikekaganski)
Comment on attachment 9040038 [details] [diff] [review] 1523818-crash.patch Review of attachment 9040038 [details] [diff] [review]: ----------------------------------------------------------------- ::: mailnews/mapi/mapihook/src/msgMapiImp.cpp @@ -247,5 @@ > > MOZ_LOG(MAPI, mozilla::LogLevel::Debug, ("CMapiImp::SendMailW using flags %d\n", aFlags)); > > // Handle possible nullptr argument. > - nsMapiMessageW Message{}; https://en.cppreference.com/w/cpp/language/zero_initialization

Interesting, I've never seen it. So why would msgMapiHook.cpp:701 crash? if (aMessage->lpOriginator). From the crash stats and our own experience, Windows prefers to use SendMailW so we get into this code patch.

Strange that it doesn't crash for us.

Wayne, do you have access to user comments? Maybe someone is stating how exactly they've been invoking this, it wouldn't have been through "Sent to > Mail recipient".

A bit sad that we stopped crashes in SendMail (without the W) for 64bit users and created a new crash elsewhere. The reports I looked at were all for x86, so 32bit :-(

Flags: needinfo?(vseerror)
MOZ_LOG(MAPI, mozilla::LogLevel::Debug, ("CMapiImp::SendMailW flags=%x subject: %s sender: %s\n",
        aFlags,

Parameter aFlags is of type long. Should the format string use "flags=%lx" ?
Not sure if that can cause a side effect here.

Is NS_ConvertUTF16toUTF8(NULL).get() allowed? That's what
NS_ConvertUTF16toUTF8(aMessage->lpszSubject).get(),
will do, if aMessage is NULL, and Message is zero-initialized.

Right, thanks for looking for possible crash reasons. I think all that stuff is only evaluated if logging is switched on. That said, we can fix this issue. The crash is in PopulateCompFieldsW and that doesn't have logging.

I guess after being so happy about getting 64bit going, we need to compile 32bit again and check it. Maybe the crash comes from "normal" "Sent to > Mail recipient". That would be bad.

701 if (aMessage->lpOriginator)
702 aCompFields->SetFrom(nsDependentString(aMessage->lpOriginator->lpszAddress));

I wonder if a crash on line 702 could get reported as crashing on line 701. Maybe that's possible with compiler optimizations?

If yes, could we crash because aMessage->lpOriginator is non-NULL, but aMessage->lpOriginator->lpszAddress is NULL?

Also, it makes me a bit nervous that we pass a pointer to a local variable around, and hope that it has been correctly initialized. It might be fine, but maybe it makes it more difficult for us to see a side effect. I suggest to simply the code to avoid that local variable.

Attached patch no-local-pointer.patch (deleted) — Splinter Review

(In reply to Kai Engert (:kaie:) from comment #7)

I suggest to simply the code to avoid that local variable.

Like in this attachment.

Thanks, I guess you found the crash :-) - Likely that we're one line off and it's here:
aCompFields->SetFrom(nsDependentString(aMessage->lpOriginator->lpszAddress));
If you pass in a null, you crash.

I'd take that hunk and the improvements to the logging, but not the other, well, overzealous checking. I'll do a patch, OK?

If you want to try the simpler change first, that's fine, but if it doesn't fix it, we'll try the overzealous checking next, ok? ;-)

Attached patch no-local-pointer.patch (v2) (obsolete) (deleted) — Splinter Review

How about this. Just fixing the crash and the logging (and the tabs).

Attachment #9040038 - Attachment is obsolete: true
Attachment #9040038 - Flags: review?(mikekaganski)
Attachment #9040247 - Flags: feedback?(kaie)
Attached patch no-local-pointer.patch (v2b) (obsolete) (deleted) — Splinter Review

Or like this. Dumping empty strings is never helpful.

Attachment #9040247 - Attachment is obsolete: true
Attachment #9040247 - Flags: feedback?(kaie)
Flags: needinfo?(vseerror)
Attachment #9040248 - Flags: feedback?(kaie)

If you keep
aMessage=&Message;
then you don't need the initial aMessage&& in the log statements.

Use flags=%lx in both places.

If PopulateCompFieldsW and PopulateCompFields require that aMessage is non-null, then maybe they should check on entry, and return with failure if NULL. Yes, it's unlikely that it ever gets called with a NULL parameter given the current code, but it would document the function's expectation. Also, if we'll still crash at the same place, we know it cannot be a null pointer being passed in, but a corrupt pointer.

Attached patch no-local-pointer.patch (v2c) (obsolete) (deleted) — Splinter Review

Tweaks as per previous comment.

Attachment #9040248 - Attachment is obsolete: true
Attachment #9040248 - Flags: feedback?(kaie)
Attachment #9040252 - Flags: feedback?(kaie)
Attached patch no-local-pointer.patch (v2d) (obsolete) (deleted) — Splinter Review

Sorry, there needed to be one more check.

Attachment #9040252 - Attachment is obsolete: true
Attachment #9040252 - Flags: feedback?(kaie)
Attachment #9040254 - Flags: feedback?(kaie)
Comment on attachment 9040254 [details] [diff] [review] no-local-pointer.patch (v2d) r=kaie I'd prefer to add if (!aMessage) return NS_ERROR_FAILURE; to the beginning of both PopulateCompFieldsWithConversion and PopulateCompFieldsW, but up to you, r+ with or without this change.
Attachment #9040254 - Flags: feedback?(kaie) → feedback+
Attached patch no-local-pointer.patch (v3) (obsolete) (deleted) — Splinter Review

Well, what you're suggesting will change the behaviour of the system. How about returning early with OK since that's the net effect anyway.

I was going to land this with you as the author since you spotted the problem. Hence the f? to make sure you're happy with it since it's put in your name :-)

Attachment #9040254 - Attachment is obsolete: true
Attachment #9040267 - Flags: review+
Attachment #9040267 - Flags: feedback?(kaie)

(In reply to Jorg K (GMT+1) from comment #17)

Well, what you're suggesting will change the behaviour of the system.

No. Currently, if PopulateCompFieldsW* gets called with aMessage==NULL, it will crash, because the code will dereference the bad aMessage pointer.

My suggestion to return immediately will prevent such a crash, and ensure we'll fail/abort the activity in a safe way

If we still crash inside PopulateCompFieldsW*, then we will know that it isnn't because of a NULL pointer, but because of a corrupted pointer.

Some comments

  • send an email from Quickbooks
  • trying to attach photo's from windows photo program
  • trying to send an attachment to an email, TB said someone else was using my email, I said yes and Tbird crashed
Comment on attachment 9040267 [details] [diff] [review] no-local-pointer.patch (v3) (In reply to Kai Engert (:kaie:) from comment #18) > No. Currently, if PopulateCompFieldsW* gets called with aMessage==NULL, it will crash, because the code will dereference the bad aMessage pointer. > > My suggestion to return immediately will prevent such a crash, and ensure we'll fail/abort the activity in a safe way With the trickery of passing the address of local variable, null will never be passed. So my check is nonsense. I'll move the check into the caller.
Attachment #9040267 - Attachment is obsolete: true
Attachment #9040267 - Flags: feedback?(kaie)
Attached patch no-local-pointer.patch (v4) (deleted) — Splinter Review

How about this? Close to your original suggestion.

Attachment #9040322 - Flags: feedback?(kaie)
Comment on attachment 9040322 [details] [diff] [review] no-local-pointer.patch (v4) Ok, this is a good simplification of the original suggestion. It has a slight side effect, v4 will skip calling the SetTo/SetCc/SetBcc functions with the empty string. However, those fields should already be empty, so the change should be fine.
Attachment #9040322 - Flags: feedback?(kaie) → feedback+

Perhaps related crash sig nsTSubstring<T>::Assign | nsTSubstring<T>::Assign | nsMapiHook::PopulateCompFieldsW ?

bp-b34ad4f5-bf95-4f69-b414-049850190131 bp-b1ae66d4-234a-437e-8625-5c9b50190131 bp-8efc6d9d-774c-403c-869f-ca7810190130 bp-cc1d4526-7d05-4e8b-a418-6847b0190130

(In reply to Kai Engert (:kaie:) from comment #22)

However, those fields should already be empty, so the change should be fine.

Yes, they should be initialised.

Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/bab34c7694e5
Don't pass null to nsDependentString() to fix crash in nsMapiHook::PopulateCompFieldsW. r=jorgk

Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 67.0
Attachment #9040322 - Flags: review+
Attachment #9040322 - Flags: approval-comm-esr60+
Attachment #9040322 - Flags: approval-comm-beta+

I'm really sorry for the mistakes. Thank you for fixing my bug(s)!

(In reply to Wayne Mery (:wsmwk) from comment #24)

Perhaps related crash sig nsTSubstring<T>::Assign | nsTSubstring<T>::Assign | nsMapiHook::PopulateCompFieldsW ?

I've been staring at that code for a while.

Those crashes on line 764 suggest that a valid pointer, which points to invalid/random memory locations, is passed to the function.

However, if that's what happened, then it's very surprising it didn't crash earlier in the function, like on line 714, 755, or inside HandleAttachmentsW. If the contents were really random, it's unlikely that most fields were null (and caused these earlier lines to be skipped).

Could there be a problem with the Unicode/UTF-16 processing on line 764 ? Could the arriving data be in an unexpected encoding, that we treat incorrectly? The string code crashes while trying to obtain the length of the input.

FWIW, we had 893 crashes in three days, but only 4 crashes like those from comment 24. Maybe we shouldn't worry too much about comment 24, and check later, after we have crash data on a build with the attached fix.

behavior change when sending mail via MAPI

Before this bugfix was introduced, when sending a letter via MAPI, a third-party application opened the form for editing the letter before sending it and it was possible to add a text or make changes. And you had to press the "Send" button to send it.

Now sending the letter immediately (or if not turned off, it shows a warning that a third-party application is trying to send a letter without your knowledge)

Let's continue the discussion in bug 1527450. Meanwhile we had lots of MAPI "fun" in bug 1530820 where users reported problems after an auto-upgrade from 60.5.1 to 60.5.2 which shipped a corrected MAPI library. The solution seems to be to install again over the top.

In your case I suggest you reinstall TB 66 beta 2 again which might solve the issue. We have withdrawn MAPISendMailW() and all should return to normal now.

Attachment #9040322 - Flags: approval-comm-esr60+ → approval-comm-esr60?
Comment on attachment 9040322 [details] [diff] [review] no-local-pointer.patch (v4) Let's leave some of the fun for TB 68. We had enough MAPI chagrin in TB 60.
Attachment #9040322 - Flags: approval-comm-esr60?
Blocks: 1580994
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: