Open Bug 1667047 Opened 4 years ago Updated 1 year ago

Crash in [@ ssl3_SendClientHello | ssl_BeginClientHandshake | ssl_SecureRecv | ssl_Recv | PSMRecv]

Categories

(Core :: Networking: HTTP, defect, P2)

Unspecified
All
defect

Tracking

()

People

(Reporter: gsvelto, Unassigned, NeedInfo)

References

Details

(Keywords: crash, csectype-nullptr, Whiteboard: [necko-triaged][necko-priority-review])

Crash Data

Crash report: https://crash-stats.mozilla.org/report/index/3e4be348-0a4b-40dc-bdc2-203a40200923

Top 10 frames of crashing thread:

0 nss3.dll ssl3_SendClientHello security/nss/lib/ssl/ssl3con.c:5151
1 nss3.dll ssl_BeginClientHandshake security/nss/lib/ssl/sslcon.c:189
2 nss3.dll ssl_SecureRecv security/nss/lib/ssl/sslsecur.c:810
3 nss3.dll ssl_Recv security/nss/lib/ssl/sslsock.c:3135
4 xul.dll PSMRecv security/manager/ssl/nsNSSIOLayer.cpp:1232
5 xul.dll mozilla::net::nsSocketTransport::IsAlive netwerk/base/nsSocketTransport2.cpp:2714
6 xul.dll mozilla::net::nsHttpConnection::IsAlive netwerk/protocol/http/nsHttpConnection.cpp:1208
7 xul.dll mozilla::net::nsHttpConnection::CanReuse netwerk/protocol/http/nsHttpConnection.cpp:1146
8 xul.dll mozilla::net::nsHttpConnection::CanDirectlyActivate netwerk/protocol/http/nsHttpConnection.cpp:1171
9 xul.dll mozilla::net::nsHttpConnectionMgr::UpdateCoalescingForNewConn netwerk/protocol/http/nsHttpConnectionMgr.cpp:853

This is a NULL-pointer dereference crash, happening on both Linux and Windows.

Dragana, you reviewed bug 1556491 which has the same signature. Is there any relation?

Flags: needinfo?(dd.mozilla)

Thy are not related. Something very strange is happening here, I will take a look.

Assignee: nobody → dd.mozilla
Status: NEW → ASSIGNED
Flags: needinfo?(dd.mozilla)
Severity: -- → S4
Priority: -- → P2
Whiteboard: [necko-triaged]

Hi Thyla,

most probably the reason we are crashing is in necko, but I need help to understand what conditions can lead to this crash.

Is this the same as the analysis from this comment. The crash in that bug is old so I do not have stack trace to look at.
Can you find someone to look at it?

thanks!

Flags: needinfo?(tvandermerwe)

Hi Dragana,

I'm going to ask Kevin to take a look -- he'll be back next week (and I'll be out for a bit).

Thyla

Flags: needinfo?(tvandermerwe) → needinfo?(kjacobs.bugzilla)

To my eyes, this looks like the same issue as bug 1556491 making the analysis in https://bugzilla.mozilla.org/show_bug.cgi?id=1556491#c4 still valid. The higher stack frames differ, but the conditions look the same.

Dragana, are you sure they're not related?

There is one thing I don't understand from Martin's analysis: SSL_ResetHandshake doesn't reset ss->ssl3.cwSpec->macDef. While it will be overwritten by the next handshake, ISTM we have a problem unless we either

  1. Add a check to tolerate a NULL macDef (and don't reset it in SSL_ResetHandshake) to follow the 1.3 case, OR
  2. Reset it to ssl_mac_null in SSL_ResetHandshake so that's never NULL in ssl3_SendClientHello.
Flags: needinfo?(kjacobs.bugzilla) → needinfo?(dd.mozilla)
Assignee: dd.mozilla → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(dd.mozilla)

Probably should re-triage this

Severity: S4 → --
Priority: P2 → --
Severity: -- → S4
Priority: -- → P3

(In reply to Kevin Jacobs [:kjacobs] from comment #5)

To my eyes, this looks like the same issue as bug 1556491 making the analysis in https://bugzilla.mozilla.org/show_bug.cgi?id=1556491#c4 still valid. The higher stack frames differ, but the conditions look the same.

Dragana, are you sure they're not related?

There is one thing I don't understand from Martin's analysis: SSL_ResetHandshake doesn't reset ss->ssl3.cwSpec->macDef. While it will be overwritten by the next handshake, ISTM we have a problem unless we either

  1. Add a check to tolerate a NULL macDef (and don't reset it in SSL_ResetHandshake) to follow the 1.3 case, OR
  2. Reset it to ssl_mac_null in SSL_ResetHandshake so that's never NULL in ssl3_SendClientHello.

Dana, is this something we can fix in NSS?

Flags: needinfo?(dkeeler)

I imagine so. John?

Flags: needinfo?(dkeeler) → needinfo?(jschanck)

Looks like some relevant changes landed in 105 as part of Bug 1772201. I see 4 instances of this crash signature in 105 (with unusual stacks) and none for 106. Are we sure this is still a problem?

Flags: needinfo?(jschanck)

Maybe you're right. All of the 105 instances have tmmon.dll on the stack. It's possible there's nothing we can do here anymore.

Crash Signature: [@ ssl3_SendClientHello | ssl_BeginClientHandshake | ssl_SecureRecv | ssl_Recv | PSMRecv] → [@ ssl3_SendClientHello | ssl_BeginClientHandshake | ssl_SecureRecv | ssl_Recv | PSMRecv] [@ ssl3_SendClientHello | ssl_BeginClientHandshake | ssl_Do1stHandshake | ssl_SecureRecv | ssl_Recv | PSMRecv]

This is still lurking around across all platforms. Can we please re-triage?

Severity: S4 → --
Priority: P3 → --

Not totally certain on the severity, but marking for review. Please let me know if this is more urgent/severe.

Severity: -- → S3
Priority: -- → P2
Whiteboard: [necko-triaged] → [necko-triaged][necko-priority-review]

Only perhaps 1/2 the current instances seem to have tmmon on the stack (tmmon.dll or tmmon64.dll)

The others have near the base of the stack:
mozilla::interceptor::FuncHook<mozilla::interceptor::WindowsDllInterceptor<mozilla::interceptor::VMSharingPolicyShared>, void ()(int, void, void*)>::operator()(int&, void*&, void*&) const
From: toolkit/xre/dllservices/mozglue/nsWindowsDllInterceptor.h:150

For reference, tmmon64.dll is part of Trend Micro. See https://bugs.chromium.org/p/chromium/issues/detail?id=882982
Chrome did block them for a short while in 2019, but no longer: https://source.chromium.org/chromium/chromium/src/+/main:chrome/chrome_elf/third_party_dlls/hardcoded_blocklist.cc

Flags: needinfo?(kershaw)

John, sorry that I am going to ask again: could we try to fix this in NSS?
As per comment#13, this crash is still happening and half of the reports don't have tmmon.dll.

I think bug 1556491 should have already fixed the problem at the necko side, so I am out of ideas here. Could you take a look? Thanks.

Flags: needinfo?(kershaw) → needinfo?(jschanck)

(In reply to christopher.staite from bug 1848013 comment #6)

I have created a server so you can reproduce the crash, however I believe this is a separate issue and should have a separate bug report.

Steps to reproduce crash:
Set HTTP/HTTPS proxy to h2.menlotest.com port 3128
Navigate to https://h2.menlotest.com/websocket-test/
Note that Firefox crashes immediately

Interestingly, if the page is loaded on startup it doesn't crash, only if you navigate to it.

Following Christopher's instructions, I was able to reproduce the crash in a debug build. I hit an assertion failure in ssl_SetClientHelloSpecVersion.

[Parent 1261615: Socket Thread]: V/nsHttp proxy CONNECT succeeded! endtoendssl=1 onlyconnect=0
[Parent 1261615: Socket Thread]: V/nsHttp TlsHandshaker::InitSSLParams [mOwner=7fe05ad7da00] connectingToProxy=0
[Parent 1261615: Socket Thread]: V/nsHttp nsHttpConnection::GetTLSSocketControl trans=7fe052c4d600 socket=7fe059345d90
[Parent 1261615: Socket Thread]: V/nsHttp nsHttpConnection::SetupSSL Allow SPDY NPN selection
[Parent 1261615: Socket Thread]: I/nsHttp Http2Session::ALPNCallback sslsocketcontrol=7fe052e3aa00
[Parent 1261615: Socket Thread]: I/nsHttp Http2Session::ALPNCallback version=304
[Parent 1261615: Socket Thread]: V/nsHttp TlsHandshaker::SetupNPNList 7fe05ad7da00 0
[Parent 1261615: Socket Thread]: V/nsHttp InitSSLParams Setting up SPDY Negotiation OK mOwner=7fe05ad7da00
[Parent 1261615: Socket Thread]: V/nsHttp InitSSLParams [rv=0]
[Parent 1261615: Socket Thread]: V/nsHttp OutputStreamTunnel::AsyncWait [this=7fe056f76610]
[Parent 1261615: Socket Thread]: I/nsHttp Http2Session::TransactionHasDataToWrite 7fe052c4cc00 stream=7fe059345c40 ID=0x11
[Parent 1261615: Socket Thread]: V/nsHttp nsHttpConnection::ResumeSend [this=7fe0538b4100]
[Parent 1261615: Socket Thread]: V/nsHttp nsHttpConnection::ForceSend [this=7fe0538b4100]
[Parent 1261615: Socket Thread]: V/nsHttp resetting transaction's response head
[Parent 1261615: Socket Thread]: V/nsHttp nsHttpResponseHead::Reset
[Parent 1261615: Socket Thread]: V/nsHttp nsHttpConnection::OnSocketReadable 7fe05ad7da00 trans->ws rv=0 n=0 socketin=0
[Parent 1261615: Socket Thread]: V/nsHttp nsHttpConnection::GetTLSSocketControl trans=7fe052c4d600 socket=7fe059345d90
[Parent 1261615: Socket Thread]: E/nsHttp nsHttpTransaction::OnSocketStatus [this=7fe052c4d600 status=4b000c progress=0]
[Parent 1261615: Socket Thread]: V/nsHttp TlsHandshaker::EnsureNPNComplete [mOwner=7fe05ad7da00] drive TLS handshake
[Parent 1261615: Socket Thread]: V/nsHttp nsHttpConnection::OnSocketReadable 7fe05ad7da00 return due to inactive tunnel setup but incomplete NPN state
[Parent 1261615: Socket Thread]: I/nsHttp Http2StreamBase::WriteSegments 0
[Parent 1261615: Socket Thread]: V/nsHttp nsHttpConnection::OnSocketReadable 7fe0538b4100 trans->ws rv=0 n=0 socketin=0
[Parent 1261615: Socket Thread]: I/nsHttp Http2Session::WriteSegments 7fe052c4cc00 InternalState 1
[Parent 1261615: Socket Thread]: V/nsHttp nsHttpConnection::ResumeSend [this=7fe0538b4100]
1261615: SSL[1368674848]: sending client-hello
1261615: SSL[1368674848]: using external token
1261615: SSL3[1368674848]: send initial ClientHello handshake
Assertion failure: spec->cipherDef->cipher == cipher_null, at /home/john/repos/mozilla-unified-2/security/nss/lib/ssl/ssl3con.c:5010

The log shows a call to TlsHandshaker::InitSSLParams. Stepping through in a debugger, I see that call has proxyStartSSL set to true, and this ultimately causes SSL_ResetHandshake to be called. This puts us on the path to sending a client hello, but SSL_ResetHandshake does not reset ss->ssl3.cwSpec (spec in the assertion failure).

I'm not sure what's supposed to happen here. Either we shouldn't be calling SSL_ResetHandshake, or SSL_ResetHandshake isn't doing its job.

Dennis, any thoughts?

Flags: needinfo?(jschanck) → needinfo?(djackson)
You need to log in before you can comment on or make changes to this bug.