Closed
Bug 1235269
Opened 9 years ago
Closed 8 years ago
Changes made during a Sync may be ignored causing items to be missing or deleted items to be resurrected
Categories
(Firefox :: Sync, defect, P1)
Firefox
Sync
Tracking
()
RESOLVED
DUPLICATE
of bug 1258127
People
(Reporter: rnewman, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: dataloss, Whiteboard: [data-integrity])
Attachments
(1 file)
(deleted),
image/png
|
Details |
In August I created two bookmarks, numbers 49 and 50 in my 'Recipes' folder.
These bookmarks were created on desktop, seven seconds apart, clearing open tabs:
Mon Aug 10 12:52:02 PDT 2015
Mon Aug 10 12:52:09 PDT 2015
Firefox for iOS's consistency checker just told me those weren't present in my iOS server bookmarks buffer.
And sure enough, they're not on the server.
Desktop places.sqlite:
---
$ sqlite3 places.sqlite "select * from moz_bookmarks where guid in ('FtN9dOhARjHD', 'r1jDGaKZ_003');"
id type fk parent position title keyword_id folder_type dateAdded lastModified guid
---------- ---------- ---------- ---------- ---------- --------------------------------------------- ---------- ----------- ---------------- ---------------- ------------
6545 1 365326 4760 49 Red Braised Pork Belly – China Sichuan Food 1439236322479000 1439236325532000 FtN9dOhARjHD
6546 1 365324 4760 50 Hong Shao Rou Recipe—(Red Braised Pork Bell 1439236329874000 1439236332351000 r1jDGaKZ_003
---
There's no mention of these two in $profiledir/weave/ -- not in failed, not in logs, nothing.
They're not on the server if I try to fetch them:
Sync.RecordManager TRACE Importing record: https://sync-221-us-west-2.sync.services.mozilla.com/1.5/23528203/storage/bookmarks/FtN9dOhARjHD?full=1
Sync.Resource DEBUG mesg: GET fail 404 https://sync-221-us-west-2.sync.services.mozilla.com/1.5/23528203/storage/bookmarks/FtN9dOhARjHD?full=1
Sync.Resource DEBUG GET fail 404 https://sync-221-us-west-2.sync.services.mozilla.com/1.5/23528203/storage/bookmarks/FtN9dOhARjHD?full=1
but their parent record on the server claims to have them as children!
{"id":"zU14bWY9AY2q","type":"folder","parentName":"Food","title":"Recipes","description":null,"children":["gPtraaq3VWS5","7LnanjWSE2dR","-qt32FFduyTJ","Vcu0b7Yhyr4n","1UIsVacGOpN_","M8-C6xNEPi01","lEEm3OoVkJck","beGBJU_GID35","1UOFRd58a7EM","-mhx_iQ8mvSu","cUpFZYps3PgB","3loYBiI1WMQA","o-Oyai_TDzkI","gGRhexfu1H6Y","sLOW0Rd12Kfz","RLn7qrosrL29","nKcHznZnX6-x","Dt4Jcn2F1cJn","WebaD9RvROZJ","KSzu2knU5LMT","LQnsH-pBfuAW","eeaW4iEjXaBJ","O934Gm6wjHIz","92m9yf5C_Aet","TIBhr9etGCZm","d5d0GJyKc7gA","I2Ls3ozDhLt4","37pp7btj9nuz","iNkz8Op0q-g9","bp4y7IEMguuy","YcZz37xZzZvy","3vxmJS25Aq0Q","UjF0VSTfajsk","9QPx1xnJGBQi","NZlA1kRy4RKk","BDT-_LLi4Ssy","lqutaAqQcOfx","RQgvat9frau-","OD8_0Hl74Bgr","unsk_8WEhPiw","E94LZbmeVlBh","HJminkGHuc1M","f7NG7DDqqOjX","aFMOyHIfvVIn","IgUlW4QKuYK6","_PbdCkXyuEZa","KFHHY3ht0rba","ZrSoa9iPr81O","97MIzlWVx6cO","FtN9dOhARjHD","r1jDGaKZ_003","4g43-Q8BE4VV","kWQOZGTngK8Y"],"parentid":"5lQz_glXhEtn"}
These logs imply one of two things:
* The server accepted a POST containing the changed parent folder, but dropped the two new records on the floor.
* The client never uploaded the two new records, either because they were skipped, because we thought they were uploaded successfully, or because they were never entered into the tracker.
I'll bet on the latter. Another record, 97MIzlWVx6cO, was created 4 seconds earlier:
Mon Aug 10 12:51:58 PDT 2015
so my hypothesis is this:
* The user saves a bookmark.
* We begin a sync. This sets ignoreAll so we don't trigger ourselves.
* The user bookmarks another link. We ignore it. They bookmark another, and we ignore that, too.
* All three bookmarks are in the same folder. We eventually come to upload that, and -- because we go to the DB to compute the record -- all three bookmarks appear in the uploaded parent record!
This bug worries me, because my sample size has reached *1* and I've already found bookmark corruption that would cause iOS to back off for fear of doing more damage.
All you need to do to hit this bug is bookmark two pages faster than Sync can complete -- four seconds seems to be enough.
One more check mark next to "blind trust in observer notifications is bad". This explains another slice of bookmark quality issues over the past few years.
Fixes or mitigations for this bug might be:
* Narrow the ignoreAll window.
* Don't use observer notifications for change tracking. See <http://160.twinql.com/syncing-and-storage-on-three-platforms/>.
* If we're uploading a folder, make sure -- through any means; fetching IDs from the server, checking timestamps against when our sync began, annotating bookmarks that we've uploaded them ever -- that we've uploaded all of its children. Eventual consistency is better than none.
* …
Reporter | ||
Comment 1•9 years ago
|
||
Note that fixes that are scoped to an individual folder within a sync aren't sufficient. The same problem exists at longer timescales:
* Hit the bookmark star to add an Unsorted Bookmark. A sync begins.
* Drag a URL to the toolbar. A bookmark is created, but neither the new bookmark nor the changed toolbar is marked as changed.
* Wait for the sync to finish.
* Weeks later, drag another bookmark to the toolbar. The uploaded toolbar record now includes two children, only one of which is new and only one of which will be uploaded to the server.
When another client downloads these incomplete sets, it'll do one of three things:
* Quietly screw up until it makes a local change, then screw up other clients. (Desktop)
* Actively fix up the damage, then push the changes. (Android)
-- This will screw up the original desktop, which will move its local-never-synced bookmarks to Unsorted Bookmarks as it realizes the magnitude of its error.
* Refuse to sync (either the whole tree or the smallest inconsistent subtree) until the server gets its shit in order. (iOS)
-- I might end up having the iOS client send reset commands to try to force other clients to fix things, but that doesn't make me happy.
We've been seeing these symptoms as a quiet hum of complaints for years, so it's nice to have another explanation beyond Bug 814801.
Reporter | ||
Comment 2•9 years ago
|
||
Here's some chrome JS that'll see if your account is missing records.
//////////
function removeAll(xs, ys) {
ys.forEach((y) => xs.delete(y));
return xs;
}
let bme = Weave.Service.engineManager.get("bookmarks");
let bms = bme._store;
let local = Object.keys(bms.getAllIDs());
let remote = JSON.parse(Weave.Service.resource(Weave.Service.storageURL + "bookmarks").get());
let missing = removeAll(new Set(local), remote);
console.log(missing);
// Set [ "FtN9dOhARjHD", "r1jDGaKZ_003" ]
//////////
A reasonable fix might be to run this and add the results to the tracker:
//////////
missing.forEach((id) => bme._tracker.addChangedID(id, Date.now()));
//////////
Comment 3•9 years ago
|
||
(In reply to Richard Newman [:rnewman] from comment #2)
> Here's some chrome JS that'll see if your account is missing records.
This reports 58 records for my real profile :(
Comment 4•9 years ago
|
||
(In reply to Mark Hammond [:markh] from comment #3)
> (In reply to Richard Newman [:rnewman] from comment #2)
> > Here's some chrome JS that'll see if your account is missing records.
>
> This reports 58 records for my real profile :(
I have 605 bookmarks and none reported missing. Victory!
Updated•9 years ago
|
Flags: firefox-backlog+
Priority: -- → P2
Comment 5•9 years ago
|
||
FTR, the same issue obviously exists with deletions, resulting in the server still having records that don't exist locally (because we were syncing when they were deleted). When the next client for that account does a full sync (eg, a new device) those "deleted" bookmarks spring back into life. My test profiles have thousands of such records.
Comment 6•9 years ago
|
||
I'm going to steal this as a meta bug for the general problem discussed above.
Blocks: 1214117
Summary: Desktop fails to upload some new bookmarks, leaving the server inconsistent → Changes made during a Sync may be ignored causing items to be missing or deleted items to be resurrected
Reporter | ||
Comment 7•9 years ago
|
||
I've also seen the opposite, just now: a record that's present on the server, not deleted, yet not present in its parent's child list. The consistency checker catches this, too.
Comment 8•9 years ago
|
||
To be clear, isn't this the same issue as comment 5? ie, a bookmark manually deleted during a sync causes it to be removed locally but left on the server, and thus resurrected next time a client does a full sync?
Reporter | ||
Comment 9•9 years ago
|
||
(In reply to Mark Hammond [:markh] from comment #8)
> To be clear, isn't this the same issue as comment 5? ie, a bookmark manually
> deleted during a sync causes it to be removed locally but left on the
> server, and thus resurrected next time a client does a full sync?
I think in your scenario the server is still consistent: the deletion is missed, and neither the deleted record nor the changed parent is uploaded, right?
In this situation a record is present on the server but its parent on the server doesn't mention it. That means we either:
* Uploaded the new record, but didn't upload its changed parent.
* Raced with another client uploading that parent, losing that write.
* Deleted the record locally, uploaded the parent but not the deleted record, and then redownloaded the record and didn't update the server parent.
In theory, the last scenario should mark the parent as changed when we find a parent folder for the new record, so the server parent should end up consistent. Dunno if it works that way in practice, but that's the theory :)
Updated•9 years ago
|
Priority: P2 → P1
Updated•9 years ago
|
Updated•9 years ago
|
Reporter | ||
Comment 10•9 years ago
|
||
Mark: would it be worth splitting this bug into "don't fail to track changes" and "recover from having missed changes"?
The latter might be a once-per-week hook before/after a sync or something like that -- fairly tractable -- and I'd love for clients to be able to recover from this automatically.
The hook might be as simple as fetching info/collection_counts, fetching and processing the ID list as in Comment 2 if the counts don't match.
Flags: needinfo?(markh)
Reporter | ||
Comment 11•9 years ago
|
||
We now have preliminary telemetry data from iOS users. This is a very small sample size: about 63 users.
Above 0 bookmarks, your chance of your server data being correct is 68%.
Above 222 bookmarks, your chance is 50-50.
Above 1000 bookmarks, your chance is just 37%.
20 users have bookmark data that isn't correct. Only one user recovered on a subsequent attempt.
Of these 20 users, there were:
9 users with parentid disagreements -- where a record's parentid did not match up to the children array in a folder.
12 users with missing records -- where a record is mentioned as a child of a folder but not present on the server.
16 users with missing structure -- where a folder isn't present but a record claims that's where it lives.
8 users with overlapping structure -- where more than one folder claims to own a child.
Probably many of these are caused by the same root issue: failing to track a change, and an ongoing sync seeing partial state.
In short: our frequent bookmark users have a very high chance of failure, and we need to do significant work to make sure that desktop clients behave more correctly.
OS: Mac OS X → All
Reporter | ||
Comment 12•9 years ago
|
||
Comment 13•9 years ago
|
||
How would I have gotten into a state of my remote having 342 records and my local having 207 records?
Comment 14•9 years ago
|
||
Orphaned records could be one. Richard do you have a JS script which does the analysis in comment 11?
Comment 15•9 years ago
|
||
(In reply to Chris Karlof [:ckarlof] from comment #13)
> How would I have gotten into a state of my remote having 342 records and my
> local having 207 records?
Deleted records?
Comment 16•9 years ago
|
||
> Deleted records?
Yeah. I should write a script to figure this out. :)
Reporter | ||
Comment 17•9 years ago
|
||
(In reply to Nick Alexander :nalexander from comment #15)
> (In reply to Chris Karlof [:ckarlof] from comment #13)
> > How would I have gotten into a state of my remote having 342 records and my
> > local having 207 records?
>
> Deleted records?
Most likely exactly that: every bookmark you delete locally remains as a deleted record on the server, forever and ever, until you get assigned to a new server or change your password.
Something like this should print "135" for you:
---
Components.utils.import("resource://services-sync/main.js");
Components.utils.import("resource://services-sync/record.js");
let collection = "bookmarks";
let bme = Weave.Service.engineManager.get(collection);
let recordType = bme._recordObj;
let coll = new Collection(Weave.Service.storageURL + collection, recordType, Weave.Service);
var deleted = [];
let collKey = Weave.Service.collectionKeys.keyForCollection(collection);
coll.full = true;
coll.limit = 500;
coll.recordHandler = function(item) {
item.collection = collection;
item.decrypt(collKey);
if (item.cleartext.deleted) {
deleted.push(item.id);
}
};
coll.get();
console.log("Deleted: " + deleted.length);
---
Reporter | ||
Comment 18•9 years ago
|
||
(In reply to Chris Karlof [:ckarlof] from comment #14)
> Richard do you have a JS script which does
> the analysis in comment 11?
Sadly not; this relies on the structure extracted by the iOS bookmark mirrorer. The validation is expressed in SQL here:
https://github.com/mozilla/firefox-ios/blob/master/Storage/SQL/SQLiteBookmarksSyncing.swift#L677
It would be easy enough to write a bit of desktop JS to download everything and do similar analysis in memory.
Reporter | ||
Updated•9 years ago
|
Updated•9 years ago
|
Updated•8 years ago
|
Whiteboard: [sync-data-integrity]
Comment 19•8 years ago
|
||
this is essentially a dupe of bug 1258127 , just setting it as blocks for now.
Whiteboard: [sync-data-integrity] → [data-integrity]
Reporter | ||
Comment 20•8 years ago
|
||
Probably should check the other engines for similar problems, too.
Comment 21•8 years ago
|
||
(In reply to Richard Newman [:rnewman] from comment #20)
> Probably should check the other engines for similar problems, too.
Yep, we will probably only get to bookmarks this quarter.
Updated•8 years ago
|
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(markh)
Resolution: --- → DUPLICATE
Assignee | ||
Updated•6 years ago
|
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox
You need to log in
before you can comment on or make changes to this bug.
Description
•