Detect, report and repair Thunderbird key storage corruption
Categories
(MailNews Core :: Security, defect, P1)
Tracking
(thunderbird_esr102 fixed, thunderbird106 fixed)
People
(Reporter: KaiE, Assigned: KaiE)
References
Details
Attachments
(5 files, 4 obsolete files)
(deleted),
text/x-phabricator-request
|
wsmwk
:
approval-comm-beta+
|
Details |
(deleted),
image/png
|
Details | |
(deleted),
image/png
|
Details | |
(deleted),
text/x-phabricator-request
|
wsmwk
:
approval-comm-beta+
wsmwk
:
approval-comm-esr102+
|
Details |
(deleted),
patch
|
wsmwk
:
approval-comm-esr102+
|
Details | Diff | Splinter Review |
The Thunderbird profile directory contains several files that depend on each other.
A very important file is: key4.db
If you lose key4.db, you can no longer access logins.json (saved passwords)
If you lose key4.db, you can no longer decrypt file encrypted-openpgp-passphrase.txt
And the consequence of that is, you can no longer access the OpenPGP secret keys stored in secring.gpg
If you lose key4.db, you have lost the secret keys for the personal S/MIME certificates.
The Thunderbird profile importer feature shipped with Thunderbird 102 may overwrite file key4.db, which has the consequences explained above.
We have several reports from users who are no longer able to use the OpenPGP feature at all, and their problems are caused by this inconsistency. The current Thunderbird code isn't prepared for this kind of inconsistency, and when it occurrs, the code simply runs into an exception and stops to work.
As a first step, I suggest to disable the broken import in 1790605, to avoid that more users will be affected.
However, we also must help our users who are already affected.
In my opinion, we cannot expect users to clean this situation up themselves.
We need to improve Thunderbird to perform a consistency check, that discovers this kind of inconsistency, and assists the user to repair the broken configuration.
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 1•2 years ago
|
||
Let's focus on the inconsistency between key4.db and encrypted-openpgp-passphrase.txt and secring.gpg, because that is a known possible consequence of the code from bug 1720042.
We should perform a once-per-session consistency check, and offer assistance if the check fails.
I suggest to perform the check at the time we attempt to read file encrypted-openpgp-passphrase.txt - as implemented in mail/extensions/openpgp/content/modules/masterpass.jsm
Currently, that code will attempt to decrypt encrypted-openpgp-passphrase.txt, and simply throw an exception on failure (the exception is invisible to the user, and results in an attempt action to not work).
We should instead use the following new logic:
- find out if the application primary password is set
- if a primary password is set, find out if the user has unlocked using the PP. If not, we aren't experiencing corruption (return an exception, we'll check again if the user retries after having unlocked with the PP)
- if a primary password is set, and the user has already unlocked using the PP, and we're unable to decrypt, we have corruption
- if no primary password is set, and we're unable to decrypt, we have corruption
If we have corruption, check if the corruption is bad, or not problematic.
Check how many secret keys are stored in file secring.pgp.
If there are zero secret keys, we could automatically repair the corruption. It should be fine to delete the contents of encrypted-openpgp-passphrase.txt and receate it (using a fresh automatic passphrase, that will be used for future secret OpenPGP keys).
If there are nonzero secret keys, the corruption means dataloss, we're no longer able to access the secret keys in secring.gpg.
We have to inform the user. We should tell the user that the secret OpenPGP keys are lost.
We could offer two choices:
- automatically delete the unaccessible secret OpenPGP keys to restore a working configuration with no OpenPGP keys
- tell the user to quit Thunderbird, and to restore a backup of the profile directory (we should show the full path of the corrupted profile on screen to assist the user in identifying which folder needs to be restored from a backup)
Assignee | ||
Comment 2•2 years ago
|
||
I found another scenario, and I think this scenario is a good argument for implementing automatic reparing:
- create a fresh profile
- don't setup an email account, just quit
- start again, same profile
- import from another profile
- quit and start again
- confirm that you can send and receive email (imported account, imported passwords)
- try to configure openpgp, and you get a failure
In this scenario, the user wouldn't notice the profile corruption immediately.
The user might start to invest into configuring and using this profile.
At a later time, the user might decide to try OpenPGP, and will notice with frustration that it doesn't work, with no good indication why.
Assignee | ||
Comment 3•2 years ago
|
||
Assignee | ||
Comment 4•2 years ago
|
||
Assignee | ||
Comment 5•2 years ago
|
||
Even if we decide to not automatically repair (for example, because we're worried about potentially make a situation worse by deleting files), in my opinion, at the very least we should automatically detect the corruption and inform the user. The attached patch implements that, see also the screenshot.
Assignee | ||
Comment 6•2 years ago
|
||
I realize simply reporting isn't ideal... We'd inform users who aren't using OpenPGP. And it will be more work to report only when users attempt to use OpenPGP with a broken profile.
I think automatic repairing really would be better.
Updated•2 years ago
|
Assignee | ||
Comment 7•2 years ago
|
||
Comment on attachment 9294620 [details]
fatal-corruption-report.png
I suggest, only show this warning if there really is a bad problem, if we find existing secret keys that we can no longer access.
Assignee | ||
Comment 8•2 years ago
|
||
I suggest to show this simpler warning, if we detect the corruption, but the corruption isn't a problem. If there are no secret keys stored, it should be safe to delete the passphrase file.
I suggest to still inform the user, and require the user to confirm repairing (as done in this screenshot and updated patch).
Updated•2 years ago
|
Comment 9•2 years ago
|
||
I would strongly prefer we avoid any messaging if possible.
Updated•2 years ago
|
Assignee | ||
Comment 10•2 years ago
|
||
Patch updated, automatic repair, no prompts.
Comment 11•2 years ago
|
||
Based on Kai's patch.
Updated•2 years ago
|
Comment 12•2 years ago
|
||
(In reply to Andrei Hajdukewycz [:sancus] from comment #9)
I would strongly prefer we avoid any messaging if possible.
Why ?
Comment 13•2 years ago
|
||
Just to reiterate what I wrote on phab:
Looks OK to me.... but that's coming in blind without knowing anything about OpenPGP and just reading over the bug comments and patch.
So I don't have a great understanding of the bigger picture here.
Kai: I can come back to this tomorrow and take a deeper look (i.e. try and exercise all the failure cases) if you'd prefer!
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 14•2 years ago
|
||
Comment on attachment 9294618 [details]
Bug 1790610 - Detect and automatically repair OpenPGP key storage corruption. r=BenC
[Approval Request Comment]
Regression caused by (bug #): profile import
User impact if declined: corrupted profile that breaks users attempting to use openpgp
Testing completed (on c-c, etc.):
Risk to taking this patch (and alternatives if risky): low theoretical risk for additional dataloss, because we're moving important files, but the code should only do it when really necessary
Comment 15•2 years ago
|
||
Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/08d9b3319b64
Detect and automatically repair OpenPGP key storage corruption. r=BenC
Updated•2 years ago
|
Comment 16•2 years ago
|
||
Comment on attachment 9294618 [details]
Bug 1790610 - Detect and automatically repair OpenPGP key storage corruption. r=BenC
[Triage Comment]
Approved for beta
Assignee | ||
Comment 17•2 years ago
|
||
Please don't ship this yet. We need to check if this makes things worse.
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 18•2 years ago
|
||
This code is broken on comm-central. We have a regression that destroys more. Please back out.
Comment 19•2 years ago
|
||
Assignee | ||
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Assignee | ||
Comment 20•2 years ago
|
||
I'm trying to find a way to test the regression bug scenario, which is difficult.
The regression happened, when we already had files from a previous Thunderbird session, and our code unnecessarily replaced it.
To test that our code wouldn't replace existing information, we'd need to run a test that reuses profile files from a different session.
This is difficult with our automated tests apparently. I cannot find any test that restarts Thunderbird, rather, I find tests with comments, stating that they avoid a restart, because that would be tricky.
Another idea is to copy prepared files into the profile directory, prior to executing the tests. This is also difficult. If our process starts with an empty profile, then NSS init will create a fresh key4.db very early during startup - which means we cannot combine it with prepared secring.gpg / encrypted-openpgp-passphrase.txt files, because they would mismatch.
I'm also not aware of a way to copy a prepared file key4.db into the profile folder, prior to starting up.
I found tests using that approach in mailnews/compose/test/unit/ - a function setupForPassword() does this copying. But are those tests used by Thunderbird, or are they old tests from SeaMonkey? Even if they are Thunderbird tests, they are unit tests. I'm not sure if the same action would work with a mochitest.
Any ideas are welcome.
Assignee | ||
Comment 21•2 years ago
|
||
I have a simpler idea, which could simply test the flow of the related code.
Currently the related code is executed only once. As soon as we have succeeded init, we have the password from encrypted-openpgp-passphrase.txt cached in memory. If it's cached, we don't attempt to read it again, which means we also won't run through that repair code again.
We could use a test that erases the cached password from memory, thereby forcing us to run through that code twice. We could ensure that after the second execution we're still using the same password. This would confirm that our second re-init doesn't replace the earlier data.
Updated•2 years ago
|
Assignee | ||
Comment 22•2 years ago
|
||
Assignee | ||
Comment 23•2 years ago
|
||
I've implemented the idea from comment 21 in the separate patch.
With the original, broken repair patch, this test fails.
The updated repair patch contains two changes, the logic fix, and the safety consistency check. I have tested that each of those two changes is sufficient to prevent the regression. Having both improves safety further against accidental, future code changes.
Assignee | ||
Comment 24•2 years ago
|
||
fix backported to 102, I'm running my local thunderbird with this patch
Assignee | ||
Comment 25•2 years ago
|
||
Ben, do you have time to review the updated patch?
Comment 26•2 years ago
|
||
The revised patch looks good to me, but I'd recommend getting a second opinion too.
Assignee | ||
Comment 27•2 years ago
|
||
During additional testing I found that the updated patch is still insufficient.
Most users will have both a secring.gpg and an automatic backup file secring.gpg.old, which we create every time we save an updated secring.gpg file.
In addition, we have automatic fallback loading old the secring.gpg.old file.
With the current patch if corruption is detected, we move file secring.gpg away. Then we automatically restart. Then our code attempts to fall back loading from the secring.gpg.old file, and it reports corruption. (This report is from older repair code, from a past corruption scenario, bug 1656287, which we haven't yet removed.)
Assignee | ||
Comment 28•2 years ago
|
||
I suggest that we do two things in addition, to prevent that misinterpretation:
- also rename file secring.gpg.old to secring.gpg.old.corrupt
- as a second precaution, create an empty file secring.gpg as part of repairing (to avoid falling back)
Assignee | ||
Comment 29•2 years ago
|
||
While the repair code helps to repair the original corruption in stable Thunderbird caused by bug 1720042,
it cannot repair the corruption that was caused by the earlier patch in this bug (only in Thunderbird Daily).
We currently don't have detection for that scenario.
The scenario is: We are able to decrypt the passphrase from encrypted-openpgp-passphrase.txt, but key stored in file secring.gpg cannot be decrypted using it.
I suggest that we add an additional patch, only for Daily, that reports this inconsistency.
(It could also happen if a user modifies file secring.gpg using external OpenPGP software, which the user shouldn't do.)
Assignee | ||
Comment 30•2 years ago
|
||
Depends on D158146
Assignee | ||
Comment 31•2 years ago
|
||
Assignee | ||
Comment 32•2 years ago
|
||
Assignee | ||
Comment 33•2 years ago
|
||
(In reply to Kai Engert (:KaiE:) from comment #29)
While the repair code helps to repair the original corruption in stable Thunderbird caused by bug 1720042,
it cannot repair the corruption that was caused by the earlier patch in this bug (only in Thunderbird Daily).We currently don't have detection for that scenario.
The scenario is: We are able to decrypt the passphrase from encrypted-openpgp-passphrase.txt, but key stored in file secring.gpg cannot be decrypted using it.I suggest that we add an additional patch, only for Daily, that reports this inconsistency.
(It could also happen if a user modifies file secring.gpg using external OpenPGP software, which the user shouldn't do.)
I'll move this patch to bug 1792450, because that bug introduced the need for this additional patch on c-c.
Comment 34•2 years ago
|
||
Comment on attachment 9296454 [details]
Bug 1790610 - Report if a secret key cannot be decrypted using our automatic passphrase. r=mkmelin
Revision D158250 was moved to bug 1792450. Setting attachment 9296454 [details] to obsolete.
Assignee | ||
Updated•2 years ago
|
Updated•2 years ago
|
Comment 36•2 years ago
|
||
Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/7cb73c32cc14
Detect and automatically repair OpenPGP key storage corruption. r=mkmelin,BenC
https://hg.mozilla.org/comm-central/rev/bbe2423152f3
Automated test to ensure OpenPGP key storage repair code doesn't contain a logic error. r=BenC
Assignee | ||
Comment 37•2 years ago
|
||
I've done additional testing using daily-broken-import from 09-23, and with today's daily, and the repairing worked.
Also, if good openpgp already exist, today's daily don't corrupt them.
I've also been running the backported patch in my local 102 build for the last days, and it's been working fine.
I think the patch is good and can be added to beta.
Assignee | ||
Comment 38•2 years ago
|
||
Comment on attachment 9294618 [details]
Bug 1790610 - Detect and automatically repair OpenPGP key storage corruption. r=BenC
[Approval Request Comment]
Regression caused by (bug #): 1720042 (need repair)
User impact if declined: corrupted profiles and users not knowing about the need to repair, experiencing broken functionality at the time they attempt to use OpenPGP without indication what's wrong
Testing completed (on c-c, etc.): yes
Risk to taking this patch (and alternatives if risky): medium
Assignee | ||
Updated•2 years ago
|
Comment 39•2 years ago
|
||
Comment on attachment 9296244 [details]
Bug 1790610 - Automated test to ensure OpenPGP key storage repair code doesn't contain a logic error. r=BenC
[Triage Comment]
Approved for beta
Comment 40•2 years ago
|
||
Comment on attachment 9294618 [details]
Bug 1790610 - Detect and automatically repair OpenPGP key storage corruption. r=BenC
[Triage Comment]
Approved for beta
Comment 41•2 years ago
|
||
bugherder uplift |
Comment 42•2 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #12)
(In reply to Andrei Hajdukewycz [:sancus] from comment #9)
I would strongly prefer we avoid any messaging if possible.
Why ?
Kai and I had discussed this right before I made the comment, so it was more summarizing the result of our meeting, but my view was basically: If we screwed it up, and it's worth fixing, then it should be done as automatically as possible. Complicated popup messages will confuse many users, and doing messages also means adding strings and localizing them as well. It's better to avoid all that if at all possible.
Assignee | ||
Comment 44•2 years ago
|
||
Magnus, FYI, I think this bug shouldn't be responsible for scenarios where stored passwords cannot be read.
Unusable stored passwords could be seen it key4.db doesn't contain the symmetric key used for decrypting logins.json.
For the scenario handled in this bug (as caused by the code that was backed out in bug 1790605), if file logins.json already existed, file key4.db was not copied.
In other words, this bug here only affected users who did not yet have saved passwords.
(It affected openpgp and s/mime key storage, it did not affected stored passwords.)
Assignee | ||
Comment 46•2 years ago
|
||
I would like to see this uplifted to esr102, for the reason explained in comment 38.
I have tested the fix myself with my local esr102 build (which I'm running as my primary Thunderbird) , and I didn't see any problems.
Should we consider my own testing, plus no more bad reports from daily/beta sufficient for uplifting to esr102?
Or would you like to see additional testing by someone else?
I think it's most important to double check that there aren't any other lingering regressions like bug 1792450.
It would be good to get testing assistance from users who are already using OpenPGP with esr102, who own one or more OpenPGP secret keys.
Those users should backup their profile, then start the esr102-test build that I would provide, and confirm they are not experiencing any regressions, and no key dataloss.
Assignee | ||
Comment 47•2 years ago
|
||
I can ask for volunteers in the openpgp matrix channel.
Assignee | ||
Comment 48•2 years ago
|
||
If you're running Thunderbird 102 with your profile, and you have at least one OpenPGP secret key, and you'd like to volunteer to test the fix for this bug:
- quit thunderbird
- find your profile folder, and create a full backup of the folder
- download one of the builds from the below links, for your respective platform
- unpack and start that Thunderbird build
Then please test that you can still use the OpenPGP keys managed by Thunderbird (e.g. decrypt a message and send an encrypted an signed message.)
Windows 64bit:
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/TVg9JX1uRbmhd_wpjsqpQA/runs/0/artifacts/public/build/setup.exe
or
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/TVg9JX1uRbmhd_wpjsqpQA/runs/0/artifacts/public/build/target.zip
Windows 32bit:
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/K7PDJpUzTSCzu-yvYGWuGg/runs/0/artifacts/public/build/setup.exe
or
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/K7PDJpUzTSCzu-yvYGWuGg/runs/0/artifacts/public/build/target.zip
macOS: build not yet ready
Comment 49•2 years ago
|
||
Downloaded Linux 64bit, imported profile from TB 91. Allowed several certificate exceptions (as I run a private server with self-signed certificate) and passwords as prompted. The file encrypted-openpgp-passphrase.txt existed, with time of original run of downloaded daily. File secring.gpg did not exist.
Account settings showed OpenPGP key ID, but said key could not be found. Imported ~/.gnupg/secring.gpg and all keys from ~/.gnupg/pubring.gpg.
Tested sending signed and encypted email, and receiving reply.
All OK.
Ubuntu 20.04 LTS
File encrypted-openpgp-passphrase.txt still had original timestamp.
Assignee | ||
Comment 50•2 years ago
|
||
Comment on attachment 9296457 [details] [diff] [review]
backport-1790610-esr102-v3.patch
see comment 38.
Requesting approval, to get this on Wayne's radar.
Comment 51•2 years ago
|
||
Comment on attachment 9296457 [details] [diff] [review]
backport-1790610-esr102-v3.patch
[Triage Comment]
Approved for esr102
Comment 52•2 years ago
|
||
bugherder uplift |
Thunderbird 102.4.0:
https://hg.mozilla.org/releases/comm-esr102/rev/ff1606851536
Assignee | ||
Comment 53•2 years ago
|
||
Comment on attachment 9296244 [details]
Bug 1790610 - Automated test to ensure OpenPGP key storage repair code doesn't contain a logic error. r=BenC
please uplift the test, too
Comment 54•2 years ago
|
||
Comment on attachment 9296244 [details]
Bug 1790610 - Automated test to ensure OpenPGP key storage repair code doesn't contain a logic error. r=BenC
[Triage Comment]
Approved for esr102
Comment 55•2 years ago
|
||
bugherder uplift |
Thunderbird 102.4.0:
https://hg.mozilla.org/releases/comm-esr102/rev/7c3c6daef45b
Comment 56•2 years ago
|
||
bugherder uplift |
Thunderbird 102.4.0:
https://hg.mozilla.org/releases/comm-esr102/rev/2184a2e410bd
Assignee | ||
Comment 57•2 years ago
|
||
(In reply to Kai Engert (:KaiE:) from comment #48)
macOS: build not yet ready
Updated•1 year ago
|
Assignee | ||
Updated•1 year ago
|
Description
•