Closed Bug 1680914 Opened 4 years ago Closed 4 years ago

Shutdown crash in [@ md_UnlockAndPostNotifies] (nss) via ldap_msgdelete (ldap60.dll) use-after-free

Categories

(Thunderbird :: General, defect)

Unspecified
Windows
defect
Not set
critical

Tracking

(thunderbird_esr78+ fixed, thunderbird85 fixed)

RESOLVED FIXED
86 Branch
Tracking Status
thunderbird_esr78 + fixed
thunderbird85 --- fixed

People

(Reporter: wsmwk, Assigned: benc)

References

Details

(Keywords: crash, regression, topcrash-thunderbird)

Crash Data

Attachments

(2 files)

Attached image md_UnlockAndPostNotifies crash rate (deleted) —

Ben ready for another crack at this? Still a topcrash after Bug #1625677.

Example bp-0f6d7776-eb21-4c44-8ced-fdeb50201203 Crash Address 0xe5e5e5e5

Crash rate on beta is 2-3 per week. Example bp-aaa32df1-12fe-4841-b07e-8421b0201204 Crash Address 0xffffffffffffffff

Flags: needinfo?(benc)

I can still confirm this bug, but my subjective feeling is, that it happened a bit less often in my setup now.

Chris, if you had good steps to reproduce, please add them here.

Hadrian2002 do you agree with the following steps to reproduce?

Chris, if you had good steps to reproduce, please add them here.

The STR are already in bug 1680492 comment 0

Thunderbird 78.5.1 crashes every time an LDAP connection is closed.
In Thunderbird 78.5.0 everything is ok.

Steps to reproduce:

Open a command line window
Set environment variable NSPR_LOG_MODULES = LDAP: 5
Start Thunderbird from the command line (to get standard output).
Configure the new LDAP address book using the ldaps protocol and simple binding.
Open the Address books window.
Select the newly created address book.
search for something in this address book.
Close the Address books window.
Close the main window.

Output :
[(null) 8388: Main Thread]: D/LDAP nsLDAPOperation::SimpleBind(): called; bindName = '';
[(null) 8388: Socket Thread]: D/LDAP Operation id=1 added (1 now pending)
[(null) 8388: Socket Thread]: D/LDAP pending operation removed; total pending operations now = 0
[(null) 8388: Main Thread]: D/LDAP nsLDAPOperation::SearchExt(): called with aBaseDn = 'xxxxxxxxx'; aFilter = 'xxxxxxxxx'; aAttributes = xxxxxxxxx; aSizeLimit = 100
[(null) 8388: Socket Thread]: D/LDAP Operation id=2 added (1 now pending)
[(null) 8388: Socket Thread]: D/LDAP pending operation removed; total pending operations now = 0
[(null) 8388: Main Thread]: D/LDAP nsLDAPOperation::SimpleBind(): called; bindName = '';
[(null) 8388: Socket Thread]: D/LDAP Operation id=1 added (1 now pending)
[(null) 8388: Socket Thread]: D/LDAP pending operation removed; total pending operations now = 0
[(null) 8388: Main Thread]: D/LDAP nsLDAPOperation::SearchExt(): called with aBaseDn = 'xxxxxxxxx'; aFilter = 'xxxxxxxxx'; aAttributes = xxxxxxxxx; aSizeLimit = 100
[(null) 8388: Socket Thread]: D/LDAP Operation id=2 added (1 now pending)
[(null) 8388: Socket Thread]: D/LDAP pending operation removed; total pending operations now = 0
[(null) 8388: Main Thread]: D/LDAP unbinding

The application exits with code 1.
The crashreporter is launched.

You can see that the "D/LDAP unbound" is missing from the standard output.

Flags: needinfo?(Hadrian2002)

Steps to reproduce it each time :

  • Install Thunderbird 78.5.1
  • Launch Thunderbird
  • No need to configure any account
  • Display the Address Books window
  • If needed, add a new LDAP address book (no need to be really accessible : host must exist but credential can be wrong)
  • Select a LDAP address book
  • Search for "test" contacts (no result required)
  • Close the Address Books window
  • Close the main window
  • Wait for the crash reporter

I can reproduce it with 78.5.1, 84.0b3 and 85.0a (20201207103830).
LDAP host must exist in each case.

It seems OK using Thunderbird 78.6.0.
Thank you so much.

Could you confirm the issue has been fixed into the nsLDAPSecurityGlue.cpp file ?
Do you consider that this issue has been fixed ?

Flags: needinfo?(benc853)
Flags: needinfo?(benc853)
Assignee: nobody → benc
Flags: needinfo?(benc)

I can't reproduce the bug using the latest code and the steps in Comment 4 (although that was a sure-fire a way to crash it prior to the fix in Bug 1625677).
But the crash logs show that something bad is happening.

I've had a poke about and I can see a place where such borkage is likely:
When the app is shut down, an attempt is made to tell the server to cancel any pending LDAP operations, by issuing LDAP AbandonExt (Cancel?) requests. However, because the request is sent from another thread, the LDAP connection is already gone by the time that thread gets around to it.
And the crash log we see is consistent with trying to send the abandon on a closed LDAP connection.
Quick fix is to just not bother sending the Abandon requests. This is valid LDAP, even if it could be considered a little rude: the server has to be able to cope with clients crashing without notice anyway.
We'd still send an Abandon for any address lookup operation the user explicitly cancels (at least, there is some code there - not sure how that's accessed in the UI ;- ). We're not savages.

Not sure if this will fix this particular issue, but I'm pretty sure it addresses something that was all wrong :-)

This patch changes the shutdown handler in the LDAP connection class to not call AbandonExt() on any outstanding operations.

It also adds an LDAP log line when AbandonExt() is called. Which might be helpful for future diagnostics.

There's also a couple of lines of lint-churn in nsLDAPConnection::InvokeErrorCallback() (no idea why it got rearranged now, but hey).

Attachment #9193840 - Flags: review?(mkmelin+mozilla)
Attachment #9193840 - Flags: review?(mkmelin+mozilla) → review+
Status: NEW → ASSIGNED
Target Milestone: --- → 86 Branch

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/96609a7b8976
Don't abandon pending LDAP ops on shutdown because the connection will already be gone. r=mkmelin

Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED

Comment on attachment 9193840 [details] [diff] [review]
1680914-avoid-ldap-abandon-on-shutown-1.patch

[Approval Request Comment]
Fix for top crash.

Attachment #9193840 - Flags: approval-comm-esr78?
Attachment #9193840 - Flags: approval-comm-beta?

Comment on attachment 9193840 [details] [diff] [review]
1680914-avoid-ldap-abandon-on-shutown-1.patch

[Triage Comment]
Approved for beta

Attachment #9193840 - Flags: approval-comm-beta? → approval-comm-beta+

sorry for my late answer ... I use now the 78.6.0 release and the bug, which was creating high cpu usage seems to be gone. Unfortunately, the actual ldap functionality also stopped to work.

Altough, it's not a bug crashing thunderbird, I shortly describe my situation:

I work at a university, and we use a Microsoft Exchange/AD Server. I wanted to use the global contact directory via ldap.

If I'm searching directly in the address book at the ldap repository, all my searches return instantaneously. If I try to use the search in the "To:" Field, when start at an empty email, nothing happens. I get no responses even, if I use known names of the global address book. I recorded logs for first, an address book search for "abc" and "asd" and then the same in "To:" field of a new mail.:

[(null) 32744: Main Thread]: D/LDAP nsLDAPOperation::SimpleBind(): called; bindName = 'uid=**,ou=people,o=**,c=de'; 
[(null) 32744: Socket Thread]: D/LDAP Operation id=1 added (1 now pending)
[(null) 32744: Socket Thread]: D/LDAP pending operation removed; total pending operations now = 0
[(null) 32744: Main Thread]: D/LDAP nsLDAPOperation::SearchExt(): called with aBaseDn = 'ou=people,o=**,c=de'; aFilter = '(&(|(cn=*abc*)(givenName=*abc*)(sn=*abc*)(mozillaNickname=*abc*)(mail=*abc*)(mozillaSecondEmail=*abc*)(&(description=*abc*))(o=*abc*)(ou=*abc*)(title=*abc*)(mozillaWorkUrl=*abc*)(mozillaHomeUrl=*abc*)))'; aAttributes = mozillaCustom2,custom2,l,locality,cn,commonname,mozillaHomeLocalityName,mozillaHomeState,mail,mozillaHomeUrl,homeurl,o,company,mozillaNickname,xmozillanickname,nsAIMid,nscpaimscreenname,mobile,cellphone,carphone,modifytimestamp,pager,pagerphone,birthyear,mozillaHomeStreet,mozillaWorkUrl,workurl,labeledURI,mozillaCustom1,custom1,mozillaHomeCountryName,postalCode,zip,c,countryname,st,region,title,mozillaHomePostalCode,mozillaSecondEmail,xmozillasecondemail,mozillaHomeStreet2,facsimiletelephonenumber,fax,description,notes,mozillaCustom3,custom3,homePhone,mozillaWorkStreet2,mozillaUseHtmlMail,xmozillausehtmlmail,givenName,telephoneNumber,birthday,street,streetaddress,postOfficeBox,mozillaCustom4,custom4,sn,surname,birthmonth,ou,department,departmentnumber,orgunit,objectClass; aSizeLimit = 1000
[(null) 32744: Socket Thread]: D/LDAP Operation id=2 added (1 now pending)
[(null) 32744: Socket Thread]: D/LDAP pending operation removed; total pending operations now = 0
[(null) 32744: Main Thread]: D/LDAP nsLDAPOperation::SimpleBind(): called; bindName = 'uid=**,ou=people,o=**,c=de'; 
[(null) 32744: Socket Thread]: D/LDAP Operation id=1 added (1 now pending)
[(null) 32744: Socket Thread]: D/LDAP pending operation removed; total pending operations now = 0
[(null) 32744: Main Thread]: D/LDAP nsLDAPOperation::SearchExt(): called with aBaseDn = 'ou=people,o=**,c=de'; aFilter = '(&(|(cn=*asd*)(givenName=*asd*)(sn=*asd*)(mozillaNickname=*asd*)(mail=*asd*)(mozillaSecondEmail=*asd*)(&(description=*asd*))(o=*asd*)(ou=*asd*)(title=*asd*)(mozillaWorkUrl=*asd*)(mozillaHomeUrl=*asd*)))'; aAttributes = mozillaCustom2,custom2,l,locality,cn,commonname,mozillaHomeLocalityName,mozillaHomeState,mail,mozillaHomeUrl,homeurl,o,company,mozillaNickname,xmozillanickname,nsAIMid,nscpaimscreenname,mobile,cellphone,carphone,modifytimestamp,pager,pagerphone,birthyear,mozillaHomeStreet,mozillaWorkUrl,workurl,labeledURI,mozillaCustom1,custom1,mozillaHomeCountryName,postalCode,zip,c,countryname,st,region,title,mozillaHomePostalCode,mozillaSecondEmail,xmozillasecondemail,mozillaHomeStreet2,facsimiletelephonenumber,fax,description,notes,mozillaCustom3,custom3,homePhone,mozillaWorkStreet2,mozillaUseHtmlMail,xmozillausehtmlmail,givenName,telephoneNumber,birthday,street,streetaddress,postOfficeBox,mozillaCustom4,custom4,sn,surname,birthmonth,ou,department,departmentnumber,orgunit,objectClass; aSizeLimit = 1000
[(null) 32744: Socket Thread]: D/LDAP Operation id=2 added (1 now pending)
[(null) 32744: Socket Thread]: D/LDAP pending operation removed; total pending operations now = 0
[(null) 32744: Main Thread]: D/LDAP nsLDAPOperation::SimpleBind(): called; bindName = 'uid=**,ou=people,o=**,c=de'; 
[(null) 32744: Socket Thread]: D/LDAP Operation id=1 added (1 now pending)
[(null) 32744: Socket Thread]: D/LDAP pending operation removed; total pending operations now = 0
[(null) 32744: Main Thread]: D/LDAP nsLDAPOperation::SearchExt(): called with aBaseDn = 'ou=people,o=**,c=de'; aFilter = '(|(cn=abc**)(mail=abc**)(sn=abc**))'; aAttributes = cn,commonname,mail,objectClass; aSizeLimit = 1000
[(null) 32744: Socket Thread]: D/LDAP Operation id=2 added (1 now pending)
[(null) 32744: Socket Thread]: D/LDAP pending operation removed; total pending operations now = 0
[(null) 32744: Main Thread]: D/LDAP nsLDAPOperation::SearchExt(): called with aBaseDn = 'ou=people,o=**,c=de'; aFilter = '(|(cn

I removed my uid and our organisational tag and replaced them with "*".

First observation: the address book ldap searches use a (maybe intentionally) totally different search filter then the "To:"-field searches.
Second: I need to quit Thunderbird to create any line in the logger (maybe also intentionally), but the logs even get truncated, which seems to be odd.

My guess is, that there is somehow another locking bug between different ldap operations.

I haven't searched for related bugs, maybe you are already aware of this.

Flags: needinfo?(Hadrian2002)
Flags: needinfo?(benc)

Comment on attachment 9193840 [details] [diff] [review]
1680914-avoid-ldap-abandon-on-shutown-1.patch

[Triage Comment]
Approved for esr78

Attachment #9193840 - Flags: approval-comm-esr78? → approval-comm-esr78+
Flags: needinfo?(benc)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: