Closed Bug 463513 Opened 16 years ago Closed 16 years ago

Tagging service could hold a fixed cache instead of an open node

Categories

(Toolkit :: Places, defect, P2)

defect

Tracking

()

VERIFIED FIXED
mozilla1.9.2a1

People

(Reporter: mak, Assigned: dietrich)

References

Details

(Keywords: fixed1.9.1)

Attachments

(1 file, 4 obsolete files)

For better separation and correct cleanup Tagging service could hold a local fixed hash of tags in the form id -> tag, and observe bookmarks service to update the list.
During a bookmarks batch we should constantly double check with bookmarks service for tag existance since a tag could have been removed, and rebuild the hash at the end of the batch.

see Bug 462545 for a leak caused by having an open node
see bug 432706 for batch double check reasoning
Flags: wanted1.9.1?
Assignee: nobody → dietrich
Blocks: 432706
canceling wanted? since it's blocking a 3.1 P1 blocker.
Flags: wanted1.9.1?
Attached patch v1 (obsolete) (deleted) β€” β€” Splinter Review
passes all the xpcshell and mochi tests.
Attachment #349873 - Flags: review?(mak77)
Comment on attachment 349873 [details] [diff] [review]
v1

>diff --git a/toolkit/components/places/src/nsTaggingService.js b/toolkit/components/places/src/nsTaggingService.js
>--- a/toolkit/components/places/src/nsTaggingService.js
>+++ b/toolkit/components/places/src/nsTaggingService.js
>@@ -55,6 +55,9 @@
>  * The Places Tagging Service
>  */
> function TaggingService() {
>+  this._bms.addObserver(this, false);
>+  var observerSvc = Cc[OBSS_CONTRACTID].getService(Ci.nsIObserverService);
>+  observerSvc.addObserver(this, "xpcom-shutdown", false);

This is used also in Observe, so it would be better to assign the service to a local var to use it later (or move to a getter, but since it's used on service creation i would instantiate both _bms and _obs here and would not use any getter for them).

>@@ -112,13 +116,29 @@
>     if (typeof(aTagNameOrId) == "string")
>       nameLower = aTagNameOrId.toLowerCase();
> 
>-    var root = this._tagsResult.root;
>-    var cc = root.childCount;
>-    for (var i=0; i < cc; i++) {
>-      var child = root.getChild(i);
>-      if ((nameLower && child.title.toLowerCase() == nameLower) ||
>-          child.itemId === aTagNameOrId)
>-        return child;
>+    var tagId = this._tagFolders.indexOf(aTagNameOrId);

At this point i already know if i got a tagId (so i don't need to do anything more) or a tagName, so i can use indexOf to get the id, i would move this into the initial if

>+    if (tagId == -1) {
>+      var options = this._history.getNewQueryOptions();
>+      var query = this._history.getNewQuery();
>+      query.setFolders([this._bms.tagsFolder], 1);
>+      var result = this._history.executeQuery(query, options);
>+      var root = result.root;
>+
>+      var root = this._tagsResult.root;

double root?

>+      var cc = root.childCount;
>+      for (var i=0; i < cc; i++) {
>+        var child = root.getChild(i);
>+        if ((nameLower && child.title.toLowerCase() == nameLower) ||
>+            child.itemId === aTagNameOrId)
>+          return child;
>+      }
>+    }
>+    else {
>+      var options = this._history.getNewQueryOptions();
>+      var query = this._history.getNewQuery();
>+      query.setFolders([tagId], 1);
>+      var result = this._history.executeQuery(query, options);
>+      return result.root;
>     }

However i think here we should simply check if the tag is defined in our local cache, if it is not and we are not batching the tag does not exists, if we are batching we double check with the root. Hwv my idea was to get the tags on service start (or first _tags request) with a RESULTS_AS_TAGS_QUERY with expandQueries false. Then open the tagNodes only when we need to get contained uris or know if it's empty.
Could you check if _getTagNode and _TagResult could be thrown away completely?
i'd like a new iteration on this.

> 
>     return null;
>@@ -160,30 +180,48 @@
>     return false;
>   },
> 
>+  /**
>+   * Returns the folder id for a tag, or -1 if not found.
>+   * @param [in] aTag
>+   *        string tag to search for
>+   * @returns integer id for the bookmark folder for the tag
>+   */
>+  _getIdForTag: function TS_getIdForTag(aTag) {
>+    return this._tagFolders.reduce(function(prev, current, index) {
>+      return (current && aTag.toLowerCase() == current.toLowerCase()) ? index : prev;
>+    }, -1);
>+  },

mh, why not doing this._tagFolders.indexOf(aTagName)?

>   // nsITaggingService
>   tagURI: function TS_tagURI(aURI, aTags) {
>     if (!aURI || !aTags)
>       throw Cr.NS_ERROR_INVALID_ARG;
> 
>     for (var i=0; i < aTags.length; i++) {
>-      var tagNode = this._getTagNode(aTags[i]);
>-      if (!tagNode) {
>-        if (typeof(aTags[i]) == "number")
>+      var tagId = null;
>+      if (typeof(aTags[i]) == "number") {
>+        // is it a tag folder id?
>+        if (this._tagFolders[aTags[i]])
>+          tagId = aTags[i];
>+        else
>           throw Cr.NS_ERROR_INVALID_ARG;
>-
>-        var tagId = this._createTag(aTags[i]);
>-        this._bms.insertBookmark(tagId, aURI, this._bms.DEFAULT_INDEX, null);
>       }
>       else {
>-        var tagId = tagNode.itemId;
>-        if (!this._isURITaggedInternal(aURI, tagNode.itemId, {}))
>-          this._bms.insertBookmark(tagId, aURI, this._bms.DEFAULT_INDEX, null);
>+        tagId = this._getIdForTag(aTags[i]);
>+        if (tagId == -1)
>+          tagId = this._createTag(aTags[i]);
>+      }
> 
>-        // _getTagNode ignores case sensitivity
>-        // rename the tag container so the places view would match the
>-        // user-typed values
>-        if (typeof(aTags[i]) == "string" && tagNode.title != aTags[i])
>-          this._bms.setItemTitle(tagNode.itemId, aTags[i]);
>+      if (!this._isURITaggedInternal(aURI, tagId, {}))

i don't see the point of a bool isURITaggedInternal with an optional out param, would not be cleaner a _getItemIdForTaggedURI(aURI, aTag) and check for -1 to see if item exists?

>+        this._bms.insertBookmark(tagId, aURI, this._bms.DEFAULT_INDEX, null);
>+
>+      // _getTagNode ignores case sensitivity
>+      // rename the tag container so the places view would match the
>+      // user-typed values
>+      var currentTagTitle = this._bms.getItemTitle(tagId);
>+      if (typeof(aTags[i]) == "string" && currentTagTitle != aTags[i]) {
>+        this._bms.setItemTitle(tagId, aTags[i]);
>+        this._tagFolders[tagId] = aTags[i];

can't we always be case insensitive? it would be fairly better, i don't see a point in distinguish "beluga" tag from "belUga" tag. would solve headache also for merging in future.

>+  __tagFolders: null, 
>+  get _tagFolders() {
>+    if (!this.__tagFolders) {
>+      this.__tagFolders = [];
>+      var options = this._history.getNewQueryOptions();
>+      var query = this._history.getNewQuery();
>+      query.setFolders([this._bms.tagsFolder], 1);
>+      var result = this._history.executeQuery(query, options);
>+      var root = result.root;
>+      root.containerOpen = true;
>+      var cc = root.childCount;
>+      for (var i=0; i < cc; i++) {
>+        var child = root.getChild(j);
>+        this.__tagFolders[child.itemId] = child.title;
>+      }
>+      root.containerOpen = false;
>+    }
>+    return this.__tagFolders;
>+  },

i guess if using RESULTS_AS_TAG_QUERY with expandQueries=0 would be faster, we only need tag ids and names. i think it would be faster, but maybe only visible with lot of tags and lof of tagged items.

>+  // boolean to indicate if we're in a batch
>+  _inBatch: false,
>+
>+  // nsINavBookmarkObserver
>+  onBeginUpdateBatch: function() {
>+    this._inBatch = true;
>+  },
>+  onEndUpdateBatch: function() {
>+    this._inBatch = false;
>+  },
>+  onItemAdded: function(aItemId, aFolderId, aIndex) {
>+    if (aFolderId == this._bms.tagsFolder)
>+      this._tagFolders[aItemId] = this._bms.getItemTitle(aItemId);

lowercase this? (see above)

>+  },
>+  onItemRemoved: function(aItemId, aFolderId, aIndex){
>+    if (aFolderId == this._bms.tagsFolder)
>+      delete this._tagFolders[aItemId];
>+  },
>+  onItemChanged: function(aItemId, aProperty, aIsAnnotationProperty, aValue){},

What if a tag folder title changes? should not we update our cache?

>+  onItemVisited: function(aItemId, aVisitID, time){},
>+  onItemMoved: function(aItemId, aOldParent, aOldIndex, aNewParent, aNewIndex){}
> };
> 
> // Implements nsIAutoCompleteResult
>diff --git a/toolkit/components/places/tests/unit/test_tagging.js b/toolkit/components/places/tests/unit/test_tagging.js
>--- a/toolkit/components/places/tests/unit/test_tagging.js
>+++ b/toolkit/components/places/tests/unit/test_tagging.js
>@@ -90,7 +90,7 @@
>   tag1node.containerOpen = true;
>   do_check_eq(tag1node.childCount, 2);
> 
>-  // Tagging the same url twice (or even trice!) with the same tag should be a
>+  // Tagging the same url twice (or even thrice!) with the same tag should be a
>   // no-op
>   tagssvc.tagURI(uri1, ["tag 1"]);
>   do_check_eq(tag1node.childCount, 2);

do we need a test for folder title change?
Attachment #349873 - Flags: review?(mak77) → review-
Attached patch more (obsolete) (deleted) β€” β€” Splinter Review
Attachment #349873 - Attachment is obsolete: true
Attached patch v2 (obsolete) (deleted) β€” β€” Splinter Review
simplified things, but passes all tests now, and added test for previously untested mixed id/tag params.

_getTagNode() is much simplified, but is not using the cached query result even when not batching, as i couldn't manage to open a tag node. i'm probably missing something silly. marco, is there something special i need to do in order to open RESULT_AS_TAG_CONTENTS nodes? if i get a child of the tag root node from a RESULT_AS_TAG_QUERY query, and attempt to open it normally (QI to container, and open), there are no children when there should be.
Attachment #350126 - Attachment is obsolete: true
Attachment #351507 - Flags: review?(mak77)
(In reply to comment #5)
> marco, is there something special i need to do in
> order to open RESULT_AS_TAG_CONTENTS nodes? if i get a child of the tag root
> node from a RESULT_AS_TAG_QUERY query, and attempt to open it normally (QI to
> container, and open), there are no children when there should be.

Have you tried QI to query rather than to generic container?
Attachment #351507 - Flags: review?(mak77) → review-
Comment on attachment 351507 [details] [diff] [review]
v2

while there please define Cu and use it for import

>diff --git a/toolkit/components/places/src/nsTaggingService.js b/toolkit/components/places/src/nsTaggingService.js
>--- a/toolkit/components/places/src/nsTaggingService.js
>+++ b/toolkit/components/places/src/nsTaggingService.js
>@@ -55,15 +55,14 @@
>  * The Places Tagging Service
>  */
> function TaggingService() {
>+  this._bms = Cc[BMS_CONTRACTID].getService(Ci.nsINavBookmarksService);
>+  this._bms.addObserver(this, false);
>+
>+  this._obss = Cc[OBSS_CONTRACTID].getService(Ci.nsIObserverService);
>+  this._obss.addObserver(this, "xpcom-shutdown", false);
> }
> 
> TaggingService.prototype = {
>-  get _bms() {
>-    if (!this.__bms)
>-      this.__bms = Cc[BMS_CONTRACTID].getService(Ci.nsINavBookmarksService);
>-    return this.__bms;
>-  },
>-
>   get _history() {
>     if (!this.__history)
>       this.__history = Cc[NH_CONTRACTID].getService(Ci.nsINavHistoryService);
>@@ -79,15 +78,10 @@
>   get _tagsResult() {
>     if (!this.__tagsResult) {
>       var options = this._history.getNewQueryOptions();
>+      options.resultType = Ci.nsINavHistoryQueryOptions.RESULTS_AS_TAG_QUERY;
>       var query = this._history.getNewQuery();
>-      query.setFolders([this._bms.tagsFolder], 1);
>       this.__tagsResult = this._history.executeQuery(query, options);
>       this.__tagsResult.root.containerOpen = true;

is not this what we are trying to avoid, having an open node around that could leak at exit?
it's already opened below in tagFolders ans since it is the only place where we are
using it, i would directly get rid of this internal method and move the query there.


>-   * If there's no tag with the given name, null is returned;
>+   * If there's no tag with the given name or id, null is returned;
>    */
>   _getTagNode: function TS__getTagIndex(aTagNameOrId) {

While there, please change the different function name

>-    var nameLower = null;
>-    if (typeof(aTagNameOrId) == "string")
>-      nameLower = aTagNameOrId.toLowerCase();
>+    var tagId = null;
>+    if (Number(aTagNameOrId) == aTagNameOrId)

is this faster then typeOf? What happens if i have a numeric tag like "1337"?
I think we should go back to the typeof comparison or use === for type comparison
Also please add a test to catch this case (numeric tag), or file a followup for it.

>+      tagId = aTagNameOrId;
>+    else
>+      tagId = this._getItemIdForTag(aTagNameOrId);
> 
>-    var root = this._tagsResult.root;
>-    var cc = root.childCount;
>-    for (var i=0; i < cc; i++) {
>-      var child = root.getChild(i);
>-      if ((nameLower && child.title.toLowerCase() == nameLower) ||
>-          child.itemId === aTagNameOrId)
>-        return child;
>-    }
>+    if (tagId == -1)
>+      return null;
> 
>-    return null;
>+    var options = this._history.getNewQueryOptions();
>+    var query = this._history.getNewQuery();
>+    query.setFolders([tagId], 1);
>+    var result = this._history.executeQuery(query, options);
>+    result.root.containerOpen = true;

same as above, i think we should try to avoid having open containers around for tests leaking.

>+    return result ? result.root : null;
>   },
> 
>   /**
>@@ -141,23 +137,36 @@
>    *
>    * @param [in] aURI
>    *        url to check for
>-   * @param [in] aTagId
>-   *        id of the folder representing the tag to check
>-   * @param [out] aItemId
>-   *        the id of the item found under the tag container
>-   * @returns true if the given uri is tagged with the given tag, false
>+   * @param [in] aTag
>+   *        the tag to check for
>+   * @returns the item id if the URI is tagged with the given tag, -1
>    *          otherwise.
>    */
>-  _isURITaggedInternal: function TS__uriTagged(aURI, aTagId, aItemId) {
>+  _getItemIdForTaggedURI: function TS__getItemIdForTaggedURI(aURI, aTag) {

i guess if aTagName would be better since we only support that

>+  /**
>+   * Returns the folder id for a tag, or -1 if not found.
>+   * @param [in] aTag
>+   *        string tag to search for
>+   * @returns integer id for the bookmark folder for the tag
>+   */
>+  _getItemIdForTag: function TS_getItemIdForTag(aTag) {
>+    for (var i in this._tagFolders) {
>+      if (aTag.toLowerCase() == this._tagFolders[i].toLowerCase())
>+        return i;
>+    }
>+    return -1;
>   },

even if it would be stupid passing an id to get an id i would use aTagName (as above)

>+
>+      // Rename the tag container so the Places view would match the
>+      // most-recent user-typed values.
>+      var currentTagTitle = this._bms.getItemTitle(tagId);
>+      if (currentTagTitle != tag) {
>+        this._bms.setItemTitle(tagId, tag);
>+        this._tagFolders[tagId] = tag;

This is still a bit obscure to me, what's the use case?

>       }
>     }
>   },
>@@ -195,7 +213,10 @@
>    *        the item-id of the tag element under the tags root
>    */
>   _removeTagIfEmpty: function TS__removeTagIfEmpty(aTagId) {
>-    var node = this._getTagNode(aTagId).QueryInterface(Ci.nsINavHistoryContainerResultNode);
>+    var node = this._getTagNode(aTagId);
>+    if (!node)
>+      return;
>+    node.QueryInterface(Ci.nsINavHistoryContainerResultNode);
>     var wasOpen = node.containerOpen;

if we don't pre-open the node this check should be useless

> 
>+  __tagFolders: null, 
>+  get _tagFolders() {
>+    if (!this.__tagFolders) {
>+      this.__tagFolders = [];
>+      var root = this._tagsResult.root;
>+      root.containerOpen = true;
>+      var cc = root.childCount;
>+      for (var i=0; i < cc; i++) {
>+        var child = root.getChild(j);
>+        this.__tagFolders[child.itemId] = child.title;

what about lowercasing at this point instead that redoing it later when comparing?
Bad for AllTags? Do we want to support FaNcY tags?

>+  onItemAdded: function(aItemId, aFolderId, aIndex) {
>+    if (aFolderId == this._bms.tagsFolder &&
>+        this._bms.getItemType(aItemId) == this._bms.TYPE_FOLDER)
>+      this._tagFolders[aItemId] = this._bms.getItemTitle(aItemId);
>+  },
>+  onItemChanged: function(aItemId, aProperty, aIsAnnotationProperty, aValue){
>+    if (this._tagFolders[aItemId])
>+      this._tagFolders[aItemId] = this._bms.getItemTitle(aItemId);
>+  },

If we do the lowercase in the cache, remeber to do it here too, otherwise nevermind.

>+  onItemVisited: function(aItemId, aVisitID, time){},
>+  onItemMoved: function(aItemId, aOldParent, aOldIndex, aNewParent, aNewIndex){}

I know that's something that should NEVER happen but i guess if, for sanity, we should serve
the case where a tag folder is moved out from tag root, or a bookmark is moved out of a tag folder.
> is this faster then typeOf? What happens if i have a numeric tag like "1337"?
> I think we should go back to the typeof comparison or use === for type
> comparison
> Also please add a test to catch this case (numeric tag), or file a followup for
> it.

for some reason, the keys were being converted to strings from getIdForTag. instead now forcing to int there.

> >+      // Rename the tag container so the Places view would match the
> >+      // most-recent user-typed values.
> >+      var currentTagTitle = this._bms.getItemTitle(tagId);
> >+      if (currentTagTitle != tag) {
> >+        this._bms.setItemTitle(tagId, tag);
> >+        this._tagFolders[tagId] = tag;
> 
> This is still a bit obscure to me, what's the use case?

to respect user behavior, we want to support the casing that the user has most recently used.

> >+  __tagFolders: null, 
> >+  get _tagFolders() {
> >+    if (!this.__tagFolders) {
> >+      this.__tagFolders = [];
> >+      var root = this._tagsResult.root;
> >+      root.containerOpen = true;
> >+      var cc = root.childCount;
> >+      for (var i=0; i < cc; i++) {
> >+        var child = root.getChild(j);
> >+        this.__tagFolders[child.itemId] = child.title;
> 
> what about lowercasing at this point instead that redoing it later when
> comparing?
> Bad for AllTags? Do we want to support FaNcY tags?

yes. we populate nsITaggingService.allTags from this, so should store the user values, not normalized ones.
Attached patch v3 (obsolete) (deleted) β€” β€” Splinter Review
Attachment #351507 - Attachment is obsolete: true
Attachment #352048 - Flags: review?(mak77)
Attachment #352048 - Flags: review?(mak77) → review+
Comment on attachment 352048 [details] [diff] [review]
v3

>+  _getTagResult: function TS__getTagResult(aTagNameOrId) {
>     if (!aTagNameOrId)
>       throw Cr.NS_ERROR_INVALID_ARG;
> 
>-    var nameLower = null;
>-    if (typeof(aTagNameOrId) == "string")
>-      nameLower = aTagNameOrId.toLowerCase();
>+    var tagId = null;
>+    if (typeof(aTagNameOrId) == "string") {
>+      dump(aTagNameOrId + " is a string!\n");

this was probably debug code (to be removed?)

>@@ -238,8 +258,9 @@
>       throw Cr.NS_ERROR_INVALID_ARG;
> 
>     var uris = [];
>-    var tagNode = this._getTagNode(aTag);
>-    if (tagNode) {
>+    var tagResult= this._getTagResult(aTag);

nit: space after tagResult

>+    if (tagResult) {
>+      var tagNode = tagResult.root;
>       tagNode.QueryInterface(Ci.nsINavHistoryContainerResultNode);
>       tagNode.containerOpen = true;
>       var cc = tagNode.childCount;
>@@ -263,15 +284,10 @@
> 
>     var tags = [];
>     var bookmarkIds = this._bms.getBookmarkIdsForURI(aURI, {});
>-    var root = this._tagsResult.root;
>-    var cc = root.childCount;
>     for (var i=0; i < bookmarkIds.length; i++) {
>-      var parent = this._bms.getFolderIdForItem(bookmarkIds[i]);
>-      for (var j=0; j < cc; j++) {
>-        var child = root.getChild(j);
>-        if (child.itemId == parent)
>-          tags.push(child.title);
>-      }
>+      var folderId = this._bms.getFolderIdForItem(bookmarkIds[i]);
>+      if (this._tagFolders[folderId])
>+        tags.push(this._tagFolders[folderId]);
>     }
> 
>     // sort the tag list
>@@ -280,48 +296,73 @@
>     return tags;
>   },
> 
>+  __tagFolders: null, 
>+  get _tagFolders() {
>+    if (!this.__tagFolders) {
>+      this.__tagFolders = [];
>+      var options = this._history.getNewQueryOptions();
>+      options.resultType = Ci.nsINavHistoryQueryOptions.RESULTS_AS_TAG_QUERY;

could you try if setting expandQueries=0 makes this faster?

please remember to add a test for numeric tags like "123", or file a followup for
it.

apart these minor things, r=me
blocks a blocker, marking blocking+.
Flags: blocking1.9.1+
Priority: -- → P2
Target Milestone: --- → mozilla1.9.1b3
Attached patch v3.1 (for check-in) (deleted) β€” β€” Splinter Review
comments fixed, test added, for check-in.
Attachment #352048 - Attachment is obsolete: true
> 
> could you try if setting expandQueries=0 makes this faster?
> 

profiled locally, no noticeable difference.

> please remember to add a test for numeric tags like "123", or file a followup
> for
> it.

added
Whiteboard: [has patch][can land]
notice we have reached the leaks threshold for tests on linux boxes, most of our leaks are due to this (practically every test that uses tagging service is leaking), so this should be pushed asap.
pushed to trunk:
http://hg.mozilla.org/mozilla-central/rev/89a175c70c25
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Depends on: 468836
Whiteboard: [has patch][can land]
V.Fixed, per bug 462545 and bug 464816.
Status: RESOLVED → VERIFIED
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/b45dae37ca01
Keywords: fixed1.9.1
Target Milestone: mozilla1.9.1b3 → mozilla1.9.2a1
Flags: in-testsuite+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: