Closed
Bug 548382
Opened 15 years ago
Closed 15 years ago
audit Places API usage in bookmarks/history sync engines
Categories
(Firefox :: Sync, defect)
Tracking
()
RESOLVED
FIXED
1.3b1
People
(Reporter: mconnor, Assigned: Mardak)
References
(Blocks 1 open bug)
Details
This is not immediate-priority, but the sooner the better.
http://mxr.mozilla.org/labs-central/source/weave/source/modules/engines/history.js
http://mxr.mozilla.org/labs-central/source/weave/source/modules/engines/bookmarks.js
http://mxr.mozilla.org/labs-central/source/weave/source/modules/type_records/bookmark.js
Comment 2•15 years ago
|
||
http://mxr.mozilla.org/labs-central/source/weave/source/modules/engines/bookmarks.js
let kSpecialIds = {};
[["menu", "bookmarksMenuFolder"],
["places", "placesRoot"],
["tags", "tagsFolder"],
["toolbar", "toolbarFolder"],
["unfiled", "unfiledBookmarksFolder"],
so, unless you need it for some special think, i'm unsure what you need placesRoot for...
You must pay attention to avoid walking it, since it also contains roots we don't want to save and restore, like the Library left pane folder
also in observers you should probably skip changes to any root that is not wanted (the left pane folder for example)... this is not easy, though, till we implement getting all ancestors of them :(
function GUIDForId(placeId) {
for (let [guid, id] in Iterator(kSpecialIds))
if (placeId == id)
return guid;
return Svc.Bookmark.getItemGUID(placeId);
}
never use placeid name unless it's a place id, bookmarks have an itemId, an what is returned by addVisit is a visitId.
place id is never exposed by the API.
_sync: Utils.batchSync("Bookmark", SyncEngine),
_syncStartup: function _syncStart() {
SyncEngine.prototype._syncStartup.call(this);
// Lazily create a mapping of folder titles and separator positions to GUID
this.__defineGetter__("_lazyMap", function() {
i'm not completely sure about the creation of this lazyMap object, looks like it does a bunch of database reads to create unique keys to bookmarks
and is only used to search for duplicates, that are even allowed in bookmarks.
From this looks like i cannot have 2 bookmarks to the same uri and titles in a folder, the API allows that.
if it's using uri and title because id could change, well separators are using position that can change the same way.
Acting on this to make it cheaper would be good, we are evaluating the possibility to have a result able to return a single node from bookmarks, it would query just once and get
all needed information (title, uri, position, parent, ...), if this can't be rewritten to do less querying it should depend on that.
i filed bug 555433 about this.
applyIncoming: function BStore_applyIncoming(record) {
// Preprocess the record before doing the normal apply
switch (record.type) {
case "query": {
// Convert the query uri if necessary
if (record.bmkUri == null || record.folderName == null)
break;
// Tag something so that the tag exists
let tag = record.folderName;
let dummyURI = Utils.makeURI("about:weave#BStore_preprocess");
this._ts.tagURI(dummyURI, [tag]);
// Look for the id of the tag (that might have just been added)
let tags = this._getNode(this._bms.tagsFolder);
when it needs to access root ids, the code should always use PlacesUtils shortcuts, this way can avoid crossing xpcom.
PlacesUtils.tagsFolderId should be available
That getNode method is a getFolderNode, btw, again we could make results API generic so that result can query a single node regardless the type.
if (!(tags instanceof Ci.nsINavHistoryQueryResultNode))
break;
tags.containerOpen = true;
for (let i = 0; i < tags.childCount; i++) {
let child = tags.getChild(i);
// Found the tag, so fix up the query to use the right id
if (child.title == tag) {
this._log.debug("query folder: " + tag + " = " + child.itemId);
record.bmkUri = record.bmkUri.replace(/([:&]folder=)\d+/, "$1" +
child.itemId);
break;
}
}
ARGH! Close containers! this is the root container of the result, not closing it means having a useless result auto updating in background for minutes, till it's collected.
/**
* Move an item and all of its followers to a new position until reaching an
* item that shouldn't be moved
*/
_moveItemChain: function BStore__moveItemChain(itemId, insertPos, stopId) {
let parentId = Svc.Bookmark.getFolderIdForItem(itemId);
if i read this correctly callers of this method already knew the parentId, would be nice to pass it in optionally and query for it only if it's not passed in
Thus we can save 1 access to the db most of the times.
remove: function BStore_remove(record) {
switch (type) {
case this._bms.TYPE_BOOKMARK:
this._log.debug(" -> removing bookmark " + record.id);
this._ts.untagURI(this._bms.getBookmarkURI(itemId), null);
this._bms.removeItem(itemId);
actually removeItem will also untagURI, su unless there is an internal reason to do so, looks like a useless call
break;
case this._bms.TYPE_FOLDER:
this._log.debug(" -> removing folder " + record.id);
Svc.Bookmark.removeItem(itemId);
break;
case this._bms.TYPE_SEPARATOR:
this._log.debug(" -> removing separator " + record.id);
this._bms.removeItem(itemId);
break;
default:
this._log.error("remove: Unknown item type: " + type);
break;
}
To me, looks like this method should just call removeItem for all the record types.
// Create a record starting from the weave id (places guid)
createRecord: function createRecord(guid) {
let placeId = idForGUID(guid);
ditto: this not a place id.
get _frecencyStm() {
this._log.trace("Creating SQL statement: _frecencyStm");
let stm = Svc.History.DBConnection.createStatement(
"SELECT frecency " +
"FROM moz_places_view " +
"WHERE url = :url");
views don't have indices, so this is most likely slow, never select from a view unless you need all records.
use UNION ALL and LIMIT 1 instead
_getParentGUIDForId: function BStore__getParentGUIDForId(itemId) {
this._log.debug("Found orphan bookmark, reparenting to unfiled");
parentid = this._bms.unfiledBookmarksFolder;
ditto, use PlacesUtils shortcuts
_getChildren: function BStore_getChildren(guid, items) {
let node = guid; // the recursion case
if (typeof(node) == "string") // callers will give us the guid as the first arg
node = this._getNode(idForGUID(guid));
if (node.type == node.RESULT_TYPE_FOLDER &&
!this._ls.isLivemark(node.itemId)) {
node.QueryInterface(Ci.nsINavHistoryQueryResultNode);
not sure why you QI to a queryresultnode, rather than to a generic containerresultnode
it is a folder moreover
node.containerOpen = true;
// Remember all the children GUIDs and recursively get more
for (var i = 0; i < node.childCount; i++) {
let child = node.getChild(i);
items[GUIDForId(child.itemId)] = true;
this._getChildren(child, items);
}
doh! close container please.
BookmarksTracker.prototype = {
/**
* Add a bookmark (places) id to be uploaded and bump up the sync score
*
* @param itemId
hey this is correct, unfortunately the "(places)" above is non sense :)
/**
* Add the successor id for the item that follows the given item
*/
_addSuccessor: function BMT__addSuccessor(itemId) {
let parentId = Svc.Bookmark.getFolderIdForItem(itemId);
let itemPos = Svc.Bookmark.getItemIndex(itemId);
the observer already had this information, why asking for it again instead of passing it along?
pass in all the info we have, make params optional and gather them if they are undefined
_ignore: function BMT__ignore(itemId, folder) {
// Ignore unconditionally if the engine tells us to
if (this.ignoreAll)
return true;
// Ensure that the mobile bookmarks query is correct in the UI
this._ensureMobileQuery();
// Make sure to remove items that have the exclude annotation
if (Svc.Annos.itemHasAnnotation(itemId, "places/excludeFromBackup")) {
put anno names in const above all the code, for easy check and changes
onItemChanged: function BMT_onItemChanged(itemId, property, isAnno, value) {
if (this._ignore(itemId))
return;
// ignore annotations except for the ones that we sync
let annos = ["bookmarkProperties/description",
"bookmarkProperties/loadInSidebar", "bookmarks/staticTitle",
"livemark/feedURI", "livemark/siteURI", "microsummary/generatorURI"];
if (isAnno && annos.indexOf(property) == -1)
return;
// Ignore favicon changes to avoid unnecessary churn
if (property == "favicon")
return;
would not be better to make these simple checks before this._ignore that to me looks like much more expensive?
onItemMoved: function BMT_onItemMoved(itemId, oldParent, oldIndex, newParent, newIndex) {
if (this._ignore(itemId))
return;
this._log.trace("onItemMoved: " + itemId);
this._addId(itemId);
this._addSuccessor(itemId);
// Get the thing that's now at the old place
let oldSucc = Svc.Bookmark.getIdForItemAt(oldParent, oldIndex);
if (oldSucc != -1)
this._addId(oldSucc);
// Remove any position annotations now that the user moved the item
Svc.Annos.removeItemAnnotation(itemId, PARENT_ANNO);
Svc.Annos.removeItemAnnotation(itemId, PREDECESSOR_ANNO);
better doing this before? btw, how does this handle if an item moves in the same folder? should it remove position annotations for all items involved?
Comment 3•15 years ago
|
||
http://mxr.mozilla.org/labs-central/source/weave/source/modules/engines/history.js
get _visitStm() {
this._log.trace("Creating SQL statement: _visitStm");
let stm = this._db.createStatement(
"SELECT visit_type type, visit_date date " +
"FROM moz_historyvisits_view " +
"WHERE place_id = (" +
"SELECT id " +
"FROM moz_places_view " +
"WHERE url = :url) " +
"ORDER BY date DESC LIMIT 10");
views don't have indices, thus the internal query is slow
use UNION ALL and LIMIT 1 instead
Also, this is selecting only type and date, while i don't care much about session (that atm is unused in UI) i care much about from_visit, since it's used to filter redirects.
from what i see this system does not track visit ids, that can be fine, but from_visit relies on those ids...
if you really want to skip from_visit, you should ignore all visits that will redirect to something other.
you could maybe replace your internal visit id reference to an "index" on uri and visit_date. we should ensure that visit_date is unique for each visit (it's impossible to visit 2 uris at the same microsecond, even if this somehow depends on OS timers resolution).
get _urlStm() {
this._log.trace("Creating SQL statement: _urlStm");
let stm = this._db.createStatement(
"SELECT url, title, frecency " +
"FROM moz_places_view " +
"WHERE id = (" +
"SELECT place_id " +
"FROM moz_annos " +
"WHERE content = :guid AND anno_attribute_id = (" +
"SELECT id " +
"FROM moz_anno_attributes " +
"WHERE name = '" + GUID_ANNO + "'))");
ditto, this will do a linear search per no indices
getAllIDs: function HistStore_getAllIDs() {
let root = this._hsvc.executeQuery(query, options).root;
root.QueryInterface(Ci.nsINavHistoryQueryResultNode);
root.containerOpen = true;
let items = {};
for (let i = 0; i < root.childCount; i++) {
let item = root.getChild(i);
let guid = GUIDForUri(item.uri, true);
items[guid] = item.uri;
}
Close containers!
so this saves visits for the last 1000 visited uris, right?
would probably be better to change this to async storage and increase the limit a bit, my history can hold 130 000 unique pages. depending on server capabilities obviously...
update: function HistStore_update(record) {
// Add visits if there's no local visit with the same date
for each (let {date, type} in record.visits)
if (curvisits.every(function(cur) cur.date != date))
Svc.History.addVisit(uri, date, null, type, type == 5 || type == 6, 0);
hm, yep this loses the referrer information
OT: fwiw, i think i've never seen any Mozilla coding style allowing loops without braces...
this._hsvc.setPageTitle(uri, record.title);
},
I hope all of these changes use runInBatchMode, since it's a bunch of changes... the same is valid for bookmarks, when doing multiple changes (more than a couple) they should be done in a batch
wipe: function HistStore_wipe() {
this._hsvc.removeAllPages();
}
note that removeAllPages atm is async, pages are removed immediately, but any other stuff will disappear later. there is a "places-expiration-finished" observer topic when done, if needed.
onBeforeDeleteURI: function onBeforeDeleteURI(uri) {
if (this.ignoreAll)
return;
this._log.trace("onBeforeDeleteURI: " + uri.spec);
if (this.addChangedID(GUIDForUri(uri, true)))
this._upScore();
},
onDeleteURI: ...
I have sadly to note that during batches we don't always fire "delete" notifications, there is bug 530782 about that :(
Comment 4•15 years ago
|
||
there is nothing to comment about the record file, it's not really Places related.
Assignee | ||
Comment 5•15 years ago
|
||
(In reply to comment #2)
> http://mxr.mozilla.org/labs-central/source/weave/source/modules/engines/bookmarks.js
Thanks for the detailed comments!
> i'm unsure what you need placesRoot for...
Pretty sure we only really use it to make sure we don't try processing it if we happen to get a notification for this id.
> never use placeid name unless it's a place id
Yeah, I believe most instances were named placeId just so that it's easier to differentiate between the bookmark itemId and weave guid.
> i'm not completely sure about the creation of this lazyMap object, looks like
> it does a bunch of database reads to create unique keys to bookmarks
> and is only used to search for duplicates, that are even allowed in bookmarks.
Yes, the lazyMap looks for duplicates on new items, but this doesn't mean the engine doesn't allow duplicates. If one machine already has duplicates, each item will have its own GUID. But yes according to the findDupe logic, duplicate bookmarks in the same folder will be pruned. This is mostly to get rid of the "new profile" bookmarks.
> when it needs to access root ids, the code should always use PlacesUtils
> shortcuts, this way can avoid crossing xpcom.
Weave also has its own local cache of ids, so some instances are just old. You might have noticed some things use _bookmark and others use Svc.Bookmark. ;)
> ARGH! Close containers! this is the root container of the result, not closing
> it means having a useless result auto updating in background for minutes, till
> it's collected.
Good to know!
> if i read this correctly callers of this method already knew the parentId,
> would be nice to pass it in optionally and query for it only if it's not
Seems like a reasonable optimization.
> actually removeItem will also untagURI, su unless there is an internal reason
> to do so, looks like a useless call
Okay. Seems like it was added in bug 486481#c23
> To me, looks like this method should just call removeItem for all the record
> types.
Could do that now that we don't have removeFolder. Probably will still need to check if it's a valid type before doing anything though.
> views don't have indices, so this is most likely slow, never select from a view
> unless you need all records.
> use UNION ALL and LIMIT 1 instead
The individual tables have the right indices, so that's why it'll be faster? I suppose ideally we shouldn't query directly anyway.
> not sure why you QI to a queryresultnode, rather than to a generic
> containerresultnode
Dunno.. This is some old code here from 2008.
> would not be better to make these simple checks before this._ignore that to me
> looks like much more expensive?
Eh.. eh.. maybe. ignoreAll is a fast boolean check, but if not it'll do the expensive annotation lookup stuff. I suppose in the common case ignoreAll isn't true, so doing the string comparisons for properties should be faster, yeah.
> how does this handle if an item moves in the
> same folder? should it remove position annotations for all items involved?
The removal annotation is if the user explicitly moves it somewhere. The annotations are normally removed after the bookmark is correctly placed (when the predecessor/parent are synced).
Comment 6•15 years ago
|
||
Thanks a lot for the super detailed audit. Way more intense than the others, but I'm not complaining!
Since the audit is done, I'm closing this. We'll still need to file bugs to followup with all the points.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment 7•15 years ago
|
||
(In reply to comment #5)
> > views don't have indices, so this is most likely slow, never select from a view
> > unless you need all records.
> > use UNION ALL and LIMIT 1 instead
> The individual tables have the right indices, so that's why it'll be faster? I
> suppose ideally we shouldn't query directly anyway.
yes, individual tables have indices, but views don't, sqlite just does not yet support them.
Reporter | ||
Comment 8•15 years ago
|
||
Please don't close this until the bugs are filed.
Assignee: mak77 → edilee
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reporter | ||
Updated•15 years ago
|
Flags: blocking-weave1.3+
Assignee | ||
Updated•15 years ago
|
Status: REOPENED → RESOLVED
Closed: 15 years ago → 15 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•15 years ago
|
Target Milestone: 1.3 → 1.3b1
Updated•6 years ago
|
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox
You need to log in
before you can comment on or make changes to this bug.
Description
•