socket thread consuming 80% CPU doing SSL polling
Categories
(Core :: Security, defect, P3)
Tracking
()
People
(Reporter: dbaron, Assigned: dragana)
References
(Blocks 2 open bugs)
Details
(Keywords: perf, power)
Crash Data
Attachments
(13 files, 1 obsolete file)
(deleted),
image/png
|
Details | |
(deleted),
image/png
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
application/gzip
|
Details | |
(deleted),
text/plain
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details | |
(deleted),
text/x-phabricator-request
|
Details |
Reporter | ||
Comment 1•7 years ago
|
||
Reporter | ||
Comment 2•7 years ago
|
||
Reporter | ||
Comment 3•7 years ago
|
||
Comment 4•7 years ago
|
||
Reporter | ||
Comment 5•7 years ago
|
||
Comment 6•7 years ago
|
||
Updated•7 years ago
|
Updated•7 years ago
|
Comment 7•7 years ago
|
||
Comment 8•7 years ago
|
||
Comment 9•7 years ago
|
||
Comment 10•7 years ago
|
||
Comment 11•7 years ago
|
||
Comment 12•7 years ago
|
||
Comment 13•7 years ago
|
||
Comment 14•7 years ago
|
||
Comment 16•7 years ago
|
||
Updated•7 years ago
|
Comment 17•7 years ago
|
||
Comment 18•7 years ago
|
||
Comment 19•7 years ago
|
||
Comment 20•7 years ago
|
||
Comment 21•7 years ago
|
||
Comment 22•7 years ago
|
||
Comment 23•7 years ago
|
||
Updated•7 years ago
|
Updated•7 years ago
|
Comment 24•7 years ago
|
||
Comment 25•7 years ago
|
||
Comment 26•7 years ago
|
||
Comment 27•7 years ago
|
||
Comment 28•7 years ago
|
||
Comment 29•7 years ago
|
||
Comment 30•7 years ago
|
||
Comment 31•7 years ago
|
||
Comment 32•7 years ago
|
||
Comment 33•7 years ago
|
||
Updated•7 years ago
|
Reporter | ||
Comment 34•7 years ago
|
||
Reporter | ||
Comment 35•7 years ago
|
||
Comment 36•7 years ago
|
||
Comment 37•7 years ago
|
||
Updated•3 years ago
|
Assignee | ||
Comment 38•3 years ago
|
||
I know what the problem is:
Polling for WRITE has 2 meanings: 1) it is use to drive handshake 2) it is use when necko wants to write data
In times before 0RTT this was all correct and meant the same thing, because the client always needs to write first and it poll for WRITE until the socket is ready.
In case of 0RTT: necko polls for WRITE and it is able to write 0RTT -> necko does not have any more data to write, but it continues to poll for WRITE to drive handshake -> nss returns from poll immediately because it can write more 0RTT data. In this case polls for WRITE actually has 2 meaning.
I will try to change the behavior of necko and make it not poll for WRITE if it does not have data to write. Let's see how difficult it is, this is more than decade old code (probably 2 decates)....
Updated•3 years ago
|
Assignee | ||
Comment 39•3 years ago
|
||
Polling for write will in most cases always return immediately which causes necko to basically busy wait for the handshake to finish. To avoid thiis we will poll for READ and call OnSocketWritable to drive the haandshake.
Comment 40•3 years ago
|
||
Comment 41•3 years ago
|
||
bugherder |
Updated•3 years ago
|
Assignee | ||
Comment 42•3 years ago
|
||
Can we back this out. I want to resolve this in a different way. Sorry for making troubles.
Assignee | ||
Comment 43•3 years ago
|
||
Comment 44•3 years ago
|
||
Backed out as requested by Dragana.
Updated•3 years ago
|
Updated•3 years ago
|
Assignee | ||
Comment 45•3 years ago
|
||
Extract Check0RttEnabled
The old code checks 0RTT state then does a DriveHandshake then checks 0RTT again. This is done in this way because before DriveHandshake is called for the first time 0RTT states are not set. DriveHandshake is sometimes called as a side effect by IsAlive() check. The new code makes this more complex and just calls DriveHandshaek before checking 0RTT.
Extract code for setting 0RTT telemetry values.
Remove some code that set timing because the same code is called a bit later again.
Assignee | ||
Comment 46•3 years ago
|
||
Assignee | ||
Comment 47•3 years ago
|
||
HandshakeDone will be called after a handshake is finished and also after the certificate verifications are done.
The code relies on HandshakeDone to signal that the handshake is done. When early-data is not available HandshakeDone is responsible for setting up a Http2 session if needed. There are 2 outcomes when early-data is used:
- early-data is accepted and transaction continues polling for read,
- early-data is rejected. In this case, the transaction is restarted as well as polling flags, i.e. the connection will stop polling for read and start polling for write.
Another difference is that a transaction that is started during the early-data period will behave as a normal transaction, i.e. it will write data and continue polling for read to receive response. The special cases during early-data(mWaitingFor0RTTResponse==true) are removed from nsHttpConnection::OnSocketWritable().
EnsureNPNComplete is only responsible for driving handshake and checking the early-data availability. All logic for finishing a handshake (i.e. checking whether early-data is accepted and checking alpn value) has been moved to HandshakeDone.
The patch also extracts FinishNPNSetup that is responsible for the bookkeeping after a handshake is done or fails, e.g. resetting transactions if 0Rtt is used but handshake fails, updating timings and sending telemetry.
HandshakeDone needs to be dispatched so that it is not called inside nss locks. The side effect of this is that nsHttpConnection::OnSocketWritable() may be called in between HandshakeDone being dispatched and executed. Therefore we still need to keep CheckCanWrite0RTTData(). This can be fixed in a follow up patch.
Side cleanups:
Remove mNotTrustedMitmDetected - his was used for ESNI, but it is not used anymore
Assignee | ||
Comment 48•3 years ago
|
||
Updated•3 years ago
|
Assignee | ||
Comment 49•3 years ago
|
||
In this case necko should poll for read (not for write) and reset the poll flags when the handshake is done.
The other option is to inspect the resumption ticket before adding it to the nss socket and find out which alpn will be used and disable the early-data if the version is http/1.1 and the transaction cannot send early-data. This currently only works on Nightly. When we roll out the necko’s token cache we can consider making this change.
Additional changes:
Consolidate mEarlyDataNegotiated and mWaitingFor0RTTResponse into mEarlyDataState
Assignee | ||
Comment 50•3 years ago
|
||
Before this bug TLS handshake was only driven by forcing writes. SecurtyInfo was set during a write code path. That is not anymore true and the TLS handshake can be driven by reading from a socket. That causes an issue where the SecurtyInfo was not set in case a TLS handshake fails. This bug added the setting of the SecurtyInfo to the read code path, but that causes problems when the transaction is closed due to corrupted response.
This patch fixes this by moving the setting of SecurtyInfo to Close() function.
Do not call HandshakeDoneInternal if the connectioon has been closed between posting the HandshakeDoneInternal runnable aand executing it.
Comment 51•3 years ago
|
||
Comment 52•3 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/fe6b8f41ca83
https://hg.mozilla.org/mozilla-central/rev/632eefa7e529
https://hg.mozilla.org/mozilla-central/rev/0c049a8ee4a8
https://hg.mozilla.org/mozilla-central/rev/7f2260b49e60
https://hg.mozilla.org/mozilla-central/rev/c922a30d444e
https://hg.mozilla.org/mozilla-central/rev/1db60c600b97
https://hg.mozilla.org/mozilla-central/rev/e0d76ea1bdd2
Comment 53•3 years ago
|
||
Backed out for causing very frequent networking crashes:
https://hg.mozilla.org/mozilla-central/rev/02d0875ec13db627ae89c6393718a1f853ac8e7b
Signatures:
- mozilla::net::nsHttpConnection::OnInputStreamReady
- mozilla::net::nsHttpConnection::FinishNPNSetup
- mozilla::net::nsHttpConnection::OnReadSegment
- mozilla::net::nsHttpNTLMAuth::GenerateCredentials
- mozilla::net::nsHttpConnection::TryTakeSubTransactions
User erosman in the Nightly channel on Element mentioned it might be related to proxy autoconfig.
Assignee | ||
Comment 54•3 years ago
|
||
Also set mEarlyDataState to done earlier.
Updated•3 years ago
|
Comment 55•3 years ago
|
||
Comment 56•3 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/e9635ec4198f
https://hg.mozilla.org/mozilla-central/rev/f69657a7e597
https://hg.mozilla.org/mozilla-central/rev/971bda7b5986
https://hg.mozilla.org/mozilla-central/rev/f8d31e3ebc93
https://hg.mozilla.org/mozilla-central/rev/18651e373837
https://hg.mozilla.org/mozilla-central/rev/4150724f355f
https://hg.mozilla.org/mozilla-central/rev/7f0065ff7b82
https://hg.mozilla.org/mozilla-central/rev/40e64e1009be
Assignee | ||
Updated•3 years ago
|
Description
•