Closed Bug 1687852 Opened 4 years ago Closed 4 years ago

Resolve how we handle duplicate logins when importing from CSV file.

Categories

(Toolkit :: Password Manager, task, P2)

task

Tracking

()

VERIFIED FIXED
87 Branch
Tracking Status
firefox87 --- verified

People

(Reporter: sfoster, Assigned: petcuandrei)

References

(Blocks 1 open bug)

Details

(Whiteboard: [passwords:import])

Attachments

(1 file)

The current logic in LoginCSVImport.jsm's importFromCSV method throws an exception when it encounters a duplicate login. In many cases this is a recoverable error and we should be complete the import and note any skipped/consolidated duplicates in the report.

STR:
Import a CSV with duplicated entries, e.g. create a file with contents:

name,url,username,password
somesite,https://example.com/,user@example.com,asdasd123123
somesite,https://example.com/,user@example.com,asdasd123123

then, in the browser console:

Cu.import("resource://gre/modules/LoginCSVImport.jsm");
await LoginCSVImport.importFromCSV("/path/to/your/logins.csv");

Expected result:

  • The 2 duplicated logins resolve to a single imported login

Actual:

  • An exception is thrown and no logins are imported.

There are other input variations where logins should be treated as dupes, which we are normally able to resolve using LoginHelper.dedupeLogins, however it accepts nsLoginInfo objects, not the plain javascript objects we have at this preprocessing stage in the import.

Blocks: 1303176
Severity: -- → S3
Priority: -- → P2
Whiteboard: [passwords:import]
Assignee: nobody → petcuandrei
Status: NEW → ASSIGNED

Lets capture some decisions and open questions here as phabricator is not a good place for that stuff.

We want to figure out what logic we'll apply when attempting to import logins from a user-provided .csv file. We want to end up with a summary of how many logins were added and updated, and how many were ignored as being duplicates, or ignored for being invalid in some way.

First, the whole-file import errors in which we entirely abort and import no logins, showing the Import Error dialog

  • File couldn't be read for various reasons: I don't think there are any open questions here.
  • Missing column headers: If we don't have columns for at least origin, username and password we abort.
  • Ambiguous/duplicated column headers: There is more than one column which maps to one of the fields we support. We can't easily disambiguate this case and could end up importing every login with the wrong data, or updating existing logins with bad data. So we abort.

Per-entry import errors. We complete the attempt to import logins from the file and show a summary + detailed report of the outcome

  • When there are matching guids within the .csv recordset, this is a strong signal that this is bad data and something unexpected may happen if we proceed to update the previous login with this 2nd login's data. This is a very specific failure case, I guess this is most likely to happen if the same data was concatenated to a .csv file more than once. I think we should skip importing this line and mark it as an error in the summary and detailed report.
  • Note: We explicitly support importing data with guids that match saved logins. This is a way for the user to rollback or restore a backup of logins. Import should proceed.
  • I can't think of other kinds of data conflict errors we should catch before we attempt to import.

Duplicated logins

  • There are 2 source of duplication: pre-import, within the set of entries from this .csv file, and a duplication between a csv entry and an existing saved login.
    • For the purposes of the summary and report, I don't think it matters where the duplication occurs, though it might be useful to have debug logging in place to help troubleshoot future bugs.
  • What constitutes a duplicate?
    • Any entry with exact matching fields, where modifying the first login with the values of the second would result in no change is a duplicate. This is also true where we don't have the information necessary to resolve conflicting values - e.g. 2 different passwords on an otherwise matching login, with no timestamps to know which supercedes the other. In these case we leave the first login as-is nd mark the line in the report for the 2nd login as unchanged so the user can manually update if they want to.

I know there was some discussion of this already and I hope this doesn't create more confusion, but these are the assumptions I'm using in the review. Lets chat if you disagree or anything is still unclear.

Everything you said makes sense, Sam. I implemented this work over this weekend. It's ready for another review.

Pushed by sfoster@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/4da871dee153 Resolve how we handle duplicate logins when importing from CSV file r=sfoster,tgiles
Backout by dluca@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/ea0966587b4a Backed out changeset 4da871dee153 for Xpcshell failures in browser/components/migration/tests/unit/test_Chrome_passwords.js. CLOSED TREE
Flags: needinfo?(petcuandrei)

here is a try with fuzzy including other tests like chrome/ie https://treeherder.mozilla.org/jobs?repo=try&revision=fa05404c276b370810b713fdff434a2ab79af195

Pushed by archaeopteryx@coole-files.de: https://hg.mozilla.org/integration/autoland/rev/b4d18605a424 Resolve how we handle duplicate logins when importing from CSV file r=sfoster,tgiles
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 87 Branch

I have verified this issue using the latest Firefox Nightly 87.0a1 (Build ID: 20210210215051) on Windows 10 x64, Ubuntu 20.04 and macOS 11.1.

  • In order to verify this bug and check if there is any regression, I have tested all the scenarios mentioned in comment 2:
  1. The “Import Error” modal is displayed after trying importing logins from a non-CSV file.
  2. The “Import Error” modal is displayed after importing logins from a CSV file if there are missing username/password columns header.
  3. After importing a CSV file that contains duplicate logins, one is correctly imported and the other one is recognized as an error in the “Import Complete” modal and is not imported.
  4. After importing a CSV file that contains the same logins as in the “about:logins” page, these logins are recognized as duplicates in the “Import Complete” modal and are not imported.
  5. However, for the “Ambiguous/duplicated column headers”, it seems that there is a problem. I have tested the following scenarios:
    • I have created a CSV file with 3 username and 3 password columns, where 2 of them contain the same password and the other one a different password. After importing logins from this file, the “Import Error” modal was not displayed and the logins were imported with the last username and password created in the CSV file.
    • I have created a CSV file with multiple username and password column headers in which I have created logins with multiple usernames and passwords and logins that have a single username and a single password. After importing this CSV file, it seems that all logins with only one password/username are recognized as errors and the other ones are imported as normal.

@Sam, @Andrei should I log a new bug for the scenario described in point 5? Based on comment 2, the “Import Error” modal should be displayed if the CSV contains duplicate or extra columns header.

Flags: needinfo?(sfoster)
Flags: needinfo?(petcuandrei)

Yes. Please report a new bug. Thank you.

Flags: needinfo?(petcuandrei)
Flags: needinfo?(sfoster)

I have logged the issue: Bug 1692481, and I'm marking this issue as Verified Fixed. Thank you!

Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: