Closed Bug 1278071 Opened 9 years ago Closed 7 years ago

increase number of iterations for export to PKCS #12

Categories

(NSS :: Libraries, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dveditz, Assigned: KaiE)

References

(Depends on 1 open bug)

Details

(Keywords: csectype-other, sec-moderate)

Attachments

(3 files, 2 obsolete files)

When bug 640625 was fixed, the number of iterations chosen for exporting to PKCS #12 (2000) was already known to be at the low end of acceptable. We need to increase that NSS default, and it would be better still if clients like Firefox could specify an increased number (bug 670462).
Microsoft chose to use 2000 iterations in 1997 -- naive Moore's law (one doubling every 18 months since then, i.e. roughly 2000 * 2^12.7) says we should be using about 13 million iterations now. Or, if 2000 was still "OK" five years ago, at a minimum we'd need to be using 20,000 today; but since we're not likely to change it again any time soon, we should raise it enough to leave some breathing room.
Who wants to suggest a good base number? Also, is it better to use a constant iteration count, or should we randomize it, e.g. by making it up to 1% smaller or larger?
I'd say a million. The best solution would be to detect the speed of the CPU and decide based on that, so that the number scales with the hardware. But we'd still need a reasonable minimum for cases like overprovisioned VM hosts, slow hardware, etc. Randomising the iteration count doesn't give us anything that the already-used random salt doesn't give; it's just a work factor.
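As an aside, here is a minimal sketch of what the "scale with the hardware" idea above could look like; calibrate_iterations and run_kdf_batch are hypothetical names invented for this illustration, not existing NSS functions:

    #include <time.h>

    /* Time a small probe batch of KDF iterations, then extrapolate how many
     * iterations fit into the target delay, never going below a fixed floor
     * (for overprovisioned VMs, slow hardware, etc.). */
    unsigned int
    calibrate_iterations(void (*run_kdf_batch)(unsigned int),
                         double target_seconds,
                         unsigned int floor_iterations)
    {
        const unsigned int probe = 10000; /* small, quick probe batch */
        clock_t start = clock();
        double elapsed;
        unsigned int scaled;

        run_kdf_batch(probe);
        elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
        if (elapsed <= 0.0)
            return floor_iterations; /* timer too coarse; fall back to floor */
        scaled = (unsigned int)((double)probe * (target_seconds / elapsed));
        return scaled > floor_iterations ? scaled : floor_iterations;
    }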
This isn't fixed, right? Can you please help triage?
Flags: needinfo?(dveditz)
Priority: -- → P3
(In reply to Hubert Kario from comment #3)
> The best solution would be to detect the speed of the CPU and decide based
> on that, so that the number scales with the hardware.

The attacker may be using their own hardware while the user is on something fairly slow (a phone?), but on the plus side at least the number would generally scale in the future. Honestly, though, I'd rather just pick a number now and use it so we can fix this ASAP. Self-adjusting work factors sound slick, but I bet no one ever writes that code, and meanwhile we've got a known-bad factor in use. If we pick a number, it's a one-line change:
https://searchfox.org/mozilla-central/rev/6326724982c66aaeaf70bb7c7ee170f7a38ca226/security/nss/lib/pkcs7/p7create.c#21
A million sounds good to me too, since IIRC we're still using 3DES for this. But if that's too slow, at least 100,000.
Flags: needinfo?(dveditz)
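For illustration, a minimal sketch of the kind of one-line change being discussed above. The macro name and the values shown are assumptions based on the comments, not copied from the actual p7create.c source:

    /* Hypothetical sketch of the proposed one-line change in
     * security/nss/lib/pkcs7/p7create.c (macro name assumed). */
    #define NSS_PBE_DEFAULT_ITERATION_COUNT 1000000  /* previously 2000 */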
Attached patch pkcs12-iterations.patch (obsolete) (deleted) — Splinter Review
Thanks Dan for pointing us to the code, and for the suggestion. Simple patch attached. Hubert also said today that we should try a million and see if it's sufficiently fast. I've started an NSS try build to check whether that causes any issues or timeouts:
https://treeherder.mozilla.org/#/jobs?repo=nss-try&revision=5a90bc1560c4459465b92451e311d3ec657d94c0
Hubert, would you like to do a performance review of this change?
Flags: needinfo?(hkario)
Yes (retaining the needinfo as a reminder).
Attached image openssl-vs-nss-performance.png (deleted) —
So it looks like a million iterations is about at the limit of usability. The test was performed on a single-core VM with a 2.6GHz Haswell CPU. The default configuration on such a machine requires about 2s to export a key to a PKCS #12 file; an extreme configuration (AES-256-CBC for both key and certificate) requires about 3s.

What is quite interesting is that for NSS there is a significant correlation between the cipher (or PBKDF) used to encrypt the key and the time taken. For PBES2 ciphers that need more than 128 bits of keying material (AES-192-CBC, AES-256-CBC, Camellia-256-CBC), NSS requires about 2.4s of processing time; for PBES2 ciphers that use 128 bits of keying material (AES-128-CBC, SEED-CBC), about 1.7s. Those values are respectively 1.7 and 1.1 times more than what OpenSSL requires to decrypt those files. For PBES1 ciphers, the times depend on the cipher: 1.7s for RC2 (both 40-bit and 128-bit), 1.16s for RC4 (both 40-bit and 128-bit), 2s for 3DES. OpenSSL is able to process those files in a relatively constant 0.9s.

Graph attached.
Flags: needinfo?(hkario)
Comment on attachment 8919846 [details] [diff] [review]
pkcs12-iterations.patch

Franziskus, do you think this patch and the performance change are acceptable?
Hubert, thanks for the performance analysis.
Attachment #8919846 - Flags: review?(franziskuskiefer)
Comment on attachment 8919846 [details] [diff] [review]
pkcs12-iterations.patch

Review of attachment 8919846 [details] [diff] [review]:
-----------------------------------------------------------------

I think the performance is ok (exporting PK12 isn't something people do all the time).
Attachment #8919846 - Flags: review?(franziskuskiefer) → review+
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → 3.34
With this change, the taskcluster "tools" step runs into timeouts after 60 minutes. I'm reopening, because we'll need a solution for the test systems.

To get a better understanding of the times before and after the change, I've run the test locally, restricted to the "sharedb" cycle, which should be around 25% to 30% of the entire time. Elapsed times for the sharedb cycle:

with old iteration count: 16.23 minutes
with new iteration count: 41.23 minutes

Franziskus, Tim, can you reconfigure the taskcluster timeouts accordingly? Would we need a timeout roughly 25 * 4 = 100 minutes longer than before? Alternatively, should we introduce a compile-time variable to keep the iteration count small when building in a testing environment?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Note: my measured times are for NSS_CYCLES=sharedb ./all.sh (I didn't measure the tools step).
Franziskus said that if this change were limited to PKCS #12 export, it shouldn't have caused these timeouts, because we only do about 137 exports; yet the time increased from 7 minutes to over 60 minutes. Looking at the place where this constant is defined in the sources, it's in a file named p7create, which suggests PKCS #7. There might be additional operations involving PKCS #7 that are now slower with this higher constant.

As a temporary measure against the timeouts, I suggested decreasing it again from 1 million to 100 thousand, and Franziskus gave r+ on IRC.

More investigation should probably be done to better understand what exactly we slowed down.
Looks to me like Hubert has a pretty fast machine. On my mid-2015 MacBook Pro an export with -c DES-EDE3-CBC -C RC2-CBC needs 8.6 seconds. Given that this is a relatively fast computer, 1000000 doesn't sound like a reasonable limit. Without more comprehensive testing, 100000 and 200000 give me numbers between 1.5 and 2.5 seconds, which would be reasonable.
(In reply to Kai Engert (:kaie:) from comment #15)
> Franziskus said that if this change were limited to PKCS #12 export, it
> shouldn't have caused these timeouts, because we only do about 137 exports;
> yet the time increased from 7 minutes to over 60 minutes.

Importing such files later will also take longer.

> As a temporary measure against the timeouts, I suggested decreasing it
> again from 1 million to 100 thousand, and Franziskus gave r+ on IRC.
>
> More investigation should probably be done to better understand what
> exactly we slowed down.

+1. If 100k is good enough for testing for now, we can keep it like this and investigate later, though we should change it in the next release. Adding the ability to change the iteration count for testing is probably a good idea to speed things up, especially given the difference at hand.

(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #17)
> Looks to me like Hubert has a pretty fast machine. On my mid-2015 MacBook
> Pro an export with -c DES-EDE3-CBC -C RC2-CBC needs 8.6 seconds.
> Given that this is a relatively fast computer, 1000000 doesn't sound like a
> reasonable limit. Without more comprehensive testing, 100000 and 200000
> give me numbers between 1.5 and 2.5 seconds, which would be reasonable.

Did you run those on battery or AC power? It could also have been thermally throttled. From what I can see, the MBP could have been configured with a 2.2GHz Haswell CPU on the low end, so a 4-times performance difference is unexpected... Like I said, I ran it on a 2.6GHz Haswell, so the same microarchitecture. And that's a 3-year-old CPU now.
I think I know where the difference is coming from. The results from comment #9 were from an optimised production build. When I re-ran the tests with a debug build on a bare Skylake i7-6600U 2.6GHz, I got performance similar to what you saw:

-c DES-EDE3-CBC -C RC2-CBC: 8.04s
-c DES-EDE3-CBC -C RC4: 5.90s
-c AES-256-CBC -C AES-256-CBC: 12.84s

But on an optimised build I'm getting:

-c DES-EDE3-CBC -C RC2-CBC: 1.69s
-c DES-EDE3-CBC -C RC4: 1.24s
-c AES-256-CBC -C AES-256-CBC: 2.54s

On a 12-year-old bare Irwindale Xeon 3.4GHz [1] I'm getting:

-c DES-EDE3-CBC -C RC2-CBC: 3.75s
-c DES-EDE3-CBC -C RC4: 2.73s
-c AES-256-CBC -C AES-256-CBC: 5.45s

So I still think that 1 million iterations is fast enough. Can you recheck with an optimised build?

1 - https://ark.intel.com/products/28018/64-bit-Intel-Xeon-Processor-3_40-GHz-2M-Cache-800-MHz-FSB
Flags: needinfo?(franziskuskiefer)
You're of course absolutely right, Hubert... For opt builds, 1,000,000 is absolutely fine, but for debug-build tests it will take too long. We should set a lower iteration count for tests (better still, have the ability to change it generally and use that for tests).
Flags: needinfo?(franziskuskiefer)
Bob suggested that we could set the number based on #ifdef DEBUG.

Try build running:
https://treeherder.mozilla.org/#/jobs?repo=nss-try&revision=163bcb638f60d33cad0c60e6a10ae848892196b8
"#ifdef DEBUG" depends on MOZ_DEBUG_SYMBOLS or on BUILD_OPT make environment variable? if on the former, then it's not ok - distributions commonly build optimised packages with debug symbols
Assignee: nobody → kaie
(In reply to Hubert Kario from comment #22)
> Does "#ifdef DEBUG" depend on MOZ_DEBUG_SYMBOLS or on the BUILD_OPT make
> environment variable? If the former, then it's not OK - distributions
> commonly build optimised packages with debug symbols.

The former is a Firefox application build symbol that NSS knows about. It's the latter. From https://hg.mozilla.org/projects/nss/file/tip/coreconf/UNIX.mk :

ifdef BUILD_OPT
    OPTIMIZER += -O
    DEFINES += -UDEBUG -DNDEBUG
else
    OPTIMIZER += -g
    USERNAME := $(shell whoami)
    USERNAME := $(subst -,_,$(USERNAME))
    DEFINES += -DDEBUG -UNDEBUG -DDEBUG_$(USERNAME)
endif
Attached patch 1278071-v2.patch (obsolete) (deleted) — Splinter Review
Try build looks good, no timeouts.
Attachment #8919846 - Attachment is obsolete: true
Attachment #8926857 - Flags: review?(franziskuskiefer)
Comment on attachment 8926857 [details] [diff] [review]
1278071-v2.patch

Review of attachment 8926857 [details] [diff] [review]:
-----------------------------------------------------------------

r+
Comment on attachment 8926857 [details] [diff] [review]
1278071-v2.patch

Review of attachment 8926857 [details] [diff] [review]:
-----------------------------------------------------------------

I'm OK with the change as is, but we should increase the maxRunTime for some machines. It's currently at 1h for most platforms, and most opt builds and Linux ASAN get pretty close with this patch. They might time out if something runs a little slower. You can increase maxRunTime in extend.js.
Attachment #8926857 - Flags: review?(franziskuskiefer)
(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #26)
> I'm OK with the change as is, but we should increase the maxRunTime for
> some machines. It's currently at 1h for most platforms, and most opt builds
> and Linux ASAN get pretty close with this patch. They might time out if
> something runs a little slower.

Another alternative is to decrease the count in debug builds to a smaller number. This would also speed up development/test cycles. How about using just ten thousand in debug mode?
> How about using just ten thousand in debug mode?

Let's do that.
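For illustration only, a minimal sketch of a debug-conditional default along the lines discussed here; the macro name is assumed, and the actual patch may structure this differently:

    /* Sketch (macro name assumed): optimized builds keep the strong default,
     * while DEBUG builds use a small count so the test suites stay fast. */
    #ifdef DEBUG
    #define NSS_PBE_DEFAULT_ITERATION_COUNT 10000
    #else
    #define NSS_PBE_DEFAULT_ITERATION_COUNT 1000000
    #endif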
Attached patch 1278071-v3.patch (deleted) — Splinter Review
Attachment #8926857 - Attachment is obsolete: true
Attachment #8930552 - Flags: review?(franziskuskiefer)
Attachment #8930552 - Flags: review?(franziskuskiefer) → review+
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
Target Milestone: 3.34 → 3.35
https://bugzilla.mozilla.org/show_bug.cgi?id=1436873#c11 informs us that Windows' maximum PKCS #12 iteration count is 600k, which turns out to be an active compatibility issue.

I propose we pull this down from 1M to 600k and uplift that into 60 Beta. 600k is a lot lower than 1M, but it's still much better than the previous iteration count.
(In reply to J.C. Jones [:jcj] from comment #31)
> https://bugzilla.mozilla.org/show_bug.cgi?id=1436873#c11 informs us that
> Windows' maximum PKCS #12 iteration count is 600k, which turns out to be an
> active compatibility issue.
>
> I propose we pull this down from 1M to 600k and uplift that into 60 Beta.
> 600k is a lot lower than 1M, but it's still much better than the previous
> iteration count.

We've increased it by 3 orders of magnitude; less than halving it for compatibility with Windows is entirely reasonable.
Status: RESOLVED → UNCONFIRMED
Ever confirmed: false
Resolution: FIXED → ---
Attachment #8961837 - Flags: review?(kaie)
Comment on attachment 8961837 [details] [diff] [review]
1278071-decrease_iterations.patch

r=kaie
Attachment #8961837 - Flags: review?(kaie) → review+
We should fix this for the enterprise world prior to releasing Firefox 60 ESR. We'd need to ship an NSS 3.36.1 release with this fix.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Blocks: 1448404
OK, landed on the current branch. And I failed to mark r= because I expect MozReview-level automation and was careless. Ugh. Must do better here in NSS.

https://hg.mozilla.org/projects/nss/rev/dedf5290c679

Kaie, can you handle planning this for a 3.36.1 for 60 ESR?
Status: NEW → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
Target Milestone: 3.35 → 3.37
(In reply to J.C. Jones [:jcj] from comment #36)
> Kaie, can you handle planning this for a 3.36.1 for 60 ESR?

Yes. Landed into branch:
https://hg.mozilla.org/projects/nss/rev/ba3f1cc8a8e6

I'll request uplift into ff60-beta in bug 1448404.
Target Milestone: 3.37 → 3.36.1
Blocks: 1452670
We should use separate tracking bugs for regressions. The reason is that we publish bug queries that identify bugs based on the target milestone setting, and moving the target milestone would change the results of queries contained in older release notes. I'm restoring this bug's target to 3.35 and have filed bug 1452670, which records the 3.36.1 target.
Target Milestone: 3.36.1 → 3.35
No longer blocks: 1448404