Consider storing the bookmarks mirror as a JSON file instead of a SQLite DB
Categories
(Firefox :: Sync, task, P3)
People
(Reporter: lina, Unassigned)
References
Details
We currently keep remote bookmarks in a separate SQLite DB, which we attach to the main Places database when we merge.
In https://phabricator.services.mozilla.com/D39890, Mak and Nan (CCed) suggested getting rid of SQLite for this, and storing them in a JSON file. When we merge in Rust, we'd slurp the entire file into memory, and update Places directly from that state instead of using a temp table:
> Did we ever evaluate keeping the mirror in a large memory object, and just write it down periodically, maybe even as json? I mean, Chrome keeps bookmarks in a json file, they just load and manipulate that object in memory, and write it to disk periodically. Would something similar work for the mirror? If manipulation is done from the async thread, it would still block sqlite, ensuring data coherence, but it would be faster (no I/O). Writing could happen separately and not block the db.
And:
> For the future I would like to extend the discussion of alternatives, also considering memory a viable alternative. That would obviously require some rewrite of the existing code, because it would no longer be a matter of copying across tables, but rather just binding memory values and executing writes. Considering where the technology is going (memory is becoming extremely cheap; even low-end laptops and mobiles come with 8GB) and that this merge process is not continuous, I think it would be OK to use more memory for short timeframes. In the end we are worried about continuous memory usage and growth, not burst usage.
And:
> Per mak's comments above, would it be possible to use an in-memory structure here instead of the persisted store? If the whole merge can be done in a short period of time with a high success rate (so we don't need to persist it), transient memory consumption sounds like a decent trade-off here.
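For scale, the burst-usage point can be checked with a quick back-of-envelope. The collection size and per-record cost below are illustrative assumptions, not measurements:

```rust
// A rough back-of-envelope for the burst-memory cost of an in-memory
// mirror. The 100,000-record collection size and ~500 bytes per record
// (URL, title, GUIDs, metadata) are assumptions for illustration, not
// measured numbers.
fn mirror_burst_mib(records: u64, bytes_per_record: u64) -> f64 {
    (records * bytes_per_record) as f64 / (1024.0 * 1024.0)
}

fn main() {
    let mib = mirror_burst_mib(100_000, 500);
    // Tens of MiB of transient usage, squarely "burst" territory
    // rather than continuous growth.
    println!("~{:.0} MiB", mib);
}
```

Even a very large bookmarks collection lands in the tens of MiB, which supports treating the merge's memory as acceptable burst usage.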
We don't really take advantage of SQL for any mirror queries. Fetching the remote bookmarks tree scans items and structure, and the one query we run in Store::prepare can be rewritten with Rust hash maps. Ditto for reupload telemetry, when we add it.
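The hash-map idea can be sketched with plain std collections. The GUIDs, the simplified shapes of `items` and `structure`, and the `children_of` helper are all illustrative assumptions, not the mirror's actual schema:

```rust
use std::collections::HashMap;

// A sketch of replacing the mirror's SQL scans with in-memory maps.
// `items` maps GUID -> title; `structure` maps parent GUID -> ordered
// child GUIDs. The shapes are simplified stand-ins for the real
// `items` and `structure` tables.
fn children_of<'a>(
    structure: &'a HashMap<&'a str, Vec<&'a str>>,
    parent: &str,
) -> &'a [&'a str] {
    structure.get(parent).map(Vec::as_slice).unwrap_or(&[])
}

fn main() {
    let mut items = HashMap::new();
    items.insert("folderBBBBBB", "Folder");
    items.insert("bookmarkAAAA", "Example");

    let mut structure = HashMap::new();
    structure.insert("menu________", vec!["folderBBBBBB"]);
    structure.insert("folderBBBBBB", vec!["bookmarkAAAA"]);

    // Walk the tree from the menu root, much as fetch_remote_tree
    // would, but without touching SQLite at all.
    for child in children_of(&structure, "menu________") {
        println!("{} -> {}", child, items[child]);
    }
}
```

Ordering within a folder falls out of the `Vec` of child GUIDs, so no `ORDER BY` is needed either.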
Comment 1•5 years ago
There's a mention that chrome uses raw JSON for all bookmarks - are we considering that too? If not, why not?
It seems like whenever we start using raw json on disk, we end up wishing we'd used sqlite, and now when using sqlite, we end up wishing we'd used raw json :(
For my info, what's the key benefit here? What should the relative priority be? Should it be prioritized at roughly the same priority as "use raw json for all bookmarks"?
Reporter
Comment 2•5 years ago
> It seems like whenever we start using raw json on disk, we end up wishing we'd used sqlite, and now when using sqlite, we end up wishing we'd used raw json :(
😅
> Should it be prioritized at roughly the same priority as "use raw json for all bookmarks"?
Unless we're planning to move to JSON for all bookmarks soon, this bug is a comfortable P3. If we see slow merges caused by fetch_remote_tree and update_local_items_in_places that block the main Places connection, we can also revisit. And there are other things we can try before moving the mirror into JSON, like using a read-only connection to fetch the tree, and running a transaction on the Places connection's thread when we're ready to write.
Comment 3•4 years ago
mark sayz wontfix