Closed Bug 579568 Opened 14 years ago Closed 14 years ago

Search.pm: Improve the implementation and performance of substring and "words" searches

Tracking

()

Status:

RESOLVED FIXED

Milestone:

Bugzilla 4.2

People

(Reporter: mkanat, Assigned: mkanat)

References

(Blocks 2 open bugs)

Details

Votes:

Bug Flags:

mkanat

approval

Attachments

(1 file, 1 obsolete file)

Work In Progress 14 years ago Max Kanat-Alexander (deleted), patch		Details \| Diff \| Splinter Review
v1 14 years ago Max Kanat-Alexander (deleted), patch	mkanat : review+	Details \| Diff \| Splinter Review

Max Kanat-Alexander

Assignee

Description

•

14 years ago

There are a lot of things that the words/substring searches do that are non-optimal, and their implementation is a little hard to read. I have some improvements that I know I can make to them to improve the performance of the "words" searches, and the architecture of both. This will probably mitigate bug 157952, even if it doesn't solve it totally.

Max Kanat-Alexander

Assignee

Comment 1

•

14 years ago

Attached patch Work In Progress (obsolete) (deleted) — Details — Splinter Review

This works, but it breaks the flagtypes.name/anywords search in MySQL.

Assignee: query-and-buglist → mkanat

Status: NEW → ASSIGNED

Max Kanat-Alexander

Assignee

Comment 2

•

14 years ago

Attached patch v1 (deleted) — Details — Splinter Review

This improves the performance and implementation of the substring and words searches. I changed the definition of a "word" to include numbers, so "blah" will not match "blah1" now, if used as an anywords/allwords/nowords search. Also, for anything using build_subselect, I just return the subselect SQL instead of doing another query, now. This should be a bit faster, particularly in situations where doing fewer queries significantly impacts performance (like a MySQL server connected over a slightly laggy link to the webserver). On some databases, the anywords/allwords/nowords searches will now work properly with Unicode--it depends on the Unicode support of the regular expression engine in the database in use, now. I also made Search.pm generate slightly more readable SQL.

Attachment #458033 - Attachment is obsolete: true

Attachment #458134 - Flags: review+

Max Kanat-Alexander

Assignee

Updated

•

14 years ago

Flags: approval+

Max Kanat-Alexander

Assignee

Comment 3

•

14 years ago

By the way, this causes some tests to pass on Pg that previously did not pass, but it *only* fixes them on Pg. Instead of implementing a whole new framework to deal with that slight testing difference on Pg, I'm just going to let the Pg xt tinderbox turn orange for now.

Max Kanat-Alexander

Assignee

Comment 4

•

14 years ago

The WORD_START and WORD_END constants may need to be different on Oracle and MS-SQL, depending on how their regex engines work.

Max Kanat-Alexander

Assignee

Comment 5

•

14 years ago

Note that MS-SQL and Oracle may still need fixes. Committing to: bzr+ssh://bzr.mozilla.org/bugzilla/trunk/ modified Bugzilla/DB.pm modified Bugzilla/Search.pm modified xt/lib/Bugzilla/Test/Search/Constants.pm Committed revision 7381.

Status: ASSIGNED → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Frédéric Buclin

Comment 6

•

14 years ago

Do you have some metrics which show the perf improvement? This is to see if we are talking about 2x faster, 50% faster or 5% faster; just to get an idea.

Max Kanat-Alexander

Assignee

Comment 7

•

14 years ago

(In reply to comment #6) > Do you have some metrics which show the perf improvement? This is to see if we > are talking about 2x faster, 50% faster or 5% faster; just to get an idea. I'd like to, but I don't. I don't have a test bed that can tell me that effectively enough. It's still too slow to reasonably use on bmo without any other criteria. However, it's reasonably fast now if you specify some additional criteria, like limiting it to a product.

Frédéric Buclin

Updated

•

12 years ago

Blocks: 814361

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Search.pm: Improve the implementation and performance of substring and "words" searches

Categories

(Bugzilla :: Query/Bug List, enhancement)

Tracking

()

People

(Reporter: mkanat, Assigned: mkanat)

References

(Blocks 2 open bugs)

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file, 1 obsolete file)

Description

Comment 1

Comment 2

Updated

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Updated

Attachment

General

Description

File Name

Content Type