Closed Bug 298471 Opened 19 years ago Closed 13 years ago

Searches should understand synonyms and antonyms of search terms

Categories

(Bugzilla :: Query/Bug List, enhancement, P4)

2.19.3
enhancement

RESOLVED DUPLICATE of bug 198820

People

(Reporter: LpSolit, Unassigned)

References

(Blocks 2 open bugs)

Details

Many duplicates are filed because of a poor choice of words when querying the summaries of existing bugs. We should have a synonyms table that converts these 'bad' words to something more meaningful: 'request tracker' would be converted to 'flag', 'owner' to 'assignee', 'hardware' to 'platform', etc. The converted query would then be sent to the DB. To make this search efficient, each new bug committed to Bugzilla should go through a 'conversion' process similar (identical?) to the one proposed above, and the corresponding 'meaningful' words should be stored in the (new?) keywords field. Words that have no entry in the synonyms table, or that are designated as the primary word (in the example above: flag, assignee, platform), would be stored as is. Words such as 'the', 'a', 'or', 'and', etc. should be omitted. Once a new bug has been submitted, its summary would no longer affect the keywords field, even if the summary is changed later. Only users with canconfirm and/or editbugs privs would be allowed to add/change/remove keywords, depending on the exact topic of the bug. As mentioned above, the converted query would no longer look at the summary field but at this keywords field, which should give better results because both the query and the stored keywords have gone through the same conversion process. Results should be sorted by relevance (bugs matching all keywords of the query first, then those missing one keyword, etc.). Another advantage is that it becomes very easy to relate similar or related bugs, see bugs depending on this one (we could say that two bugs are similar or related if 2-3 or more keywords match). This would also make triage much easier (including finding dupes). Finally, we could even imagine watching given keywords (bug 34787).
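To make the proposal above concrete, here is a rough Python sketch of the conversion and the relevance ordering. It is only an illustration: the names `SYNONYMS`, `STOP_WORDS`, `to_keywords` and `search` are made up for this example, the tables are plain in-memory dicts rather than DB tables, and nothing here reflects actual Bugzilla code.

```python
import re

# Hypothetical synonyms table: each 'bad' word or phrase maps to its primary word.
SYNONYMS = {
    "request tracker": "flag",
    "owner": "assignee",
    "hardware": "platform",
}

# Hypothetical list of words that are omitted from the keywords field.
STOP_WORDS = {"the", "a", "or", "and"}


def to_keywords(text):
    """Convert free text (a summary or a query) into a set of primary keywords."""
    text = text.lower()
    # Replace longer entries first so multi-word synonyms are handled
    # before any single word they contain.
    for phrase in sorted(SYNONYMS, key=len, reverse=True):
        text = re.sub(r"\b" + re.escape(phrase) + r"\b", SYNONYMS[phrase], text)
    # Words without a table entry are kept as is; stop words are dropped.
    return {w for w in re.findall(r"[a-z0-9]+", text) if w not in STOP_WORDS}


def search(query, bugs):
    """Return bug ids ordered by how many query keywords each bug matches.

    `bugs` maps a bug id to the keyword set stored when the bug was filed.
    """
    wanted = to_keywords(query)
    scored = []
    for bug_id, keywords in bugs.items():
        hits = len(wanted & keywords)
        if hits:
            scored.append((hits, bug_id))
    # Most matching keywords first; ties are broken by bug id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [bug_id for _, bug_id in scored]


# Keywords are computed once, at submission time, from the summary.
bugs = {
    1: to_keywords("Change the owner of a request tracker"),
    2: to_keywords("Hardware field is empty"),
    3: to_keywords("Assignee cannot set flag on platform"),
}
print(search("owner hardware", bugs))  # [3, 1, 2]
```

Because the query and the stored keywords go through the same `to_keywords` step, a search for 'owner hardware' finds bug 3 even though its summary contains neither word. The same set intersection could serve for the "2-3 or more keywords match" similarity check between two bugs.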
If you are going to do this, it probably pays to put the words to be ignored in the same table as the synonyms, and to have the new field still be treated as a phrase rather than as an unordered list of words. You may want to look for some articles on how the big search engines work.
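One possible reading of this suggestion, as a Python sketch: a single table holds both synonyms and ignored words, and the converted text keeps its word order so it can be matched as a phrase. The table layout (`None` meaning "ignore") and the names `WORD_TABLE`, `to_phrase` and `phrase_match` are assumptions made for this example only.

```python
# Hypothetical single table: a string value means "convert to this primary
# word", None means "ignore this word entirely".
WORD_TABLE = {
    "owner": "assignee",
    "hardware": "platform",
    "the": None,
    "a": None,
    "or": None,
    "and": None,
}


def to_phrase(text):
    """Convert text word by word, keeping the original word order."""
    converted = []
    for word in text.lower().split():
        if word in WORD_TABLE:
            primary = WORD_TABLE[word]
            if primary is None:
                continue  # ignored word, dropped from the phrase
            converted.append(primary)
        else:
            converted.append(word)  # no table entry: stored as is
    return " ".join(converted)


def phrase_match(query, stored_phrase):
    """True if the converted query occurs as a contiguous run of words."""
    q = to_phrase(query).split()
    s = stored_phrase.split()
    if not q:
        return False
    return any(s[i:i + len(q)] == q for i in range(len(s) - len(q) + 1))


stored = to_phrase("Change the owner of hardware")
print(stored)                                         # change assignee of platform
print(phrase_match("owner of the hardware", stored))  # True
print(phrase_match("hardware owner", stored))         # False
```

With an unordered keyword list, 'hardware owner' would match this bug as well; treating the field as a phrase keeps word order significant.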
Blocks: 154571
Depends on: 597833
No longer depends on: 597833
There are search engines that can already do this (look at Google). We don't need a special table. This is like P3.5, not really P4.
Priority: -- → P4
Summary: add a synonyms table and populate the (new) keywords field → Searches should understand synonyms and antonyms of search terms
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE