Closed
Bug 114696
Opened 23 years ago
Closed 22 years ago
Permission checking in queries not optimal
Categories
(Bugzilla :: Query/Bug List, defect, P1)
Tracking
()
RESOLVED
FIXED
Bugzilla 2.18
People
(Reporter: dkl, Assigned: dkl)
References
Details
(Keywords: topperf)
Attachments
(2 files, 12 obsolete files)
(deleted),
patch
|
Details | Diff | Splinter Review | |
(deleted),
text/plain
|
myk
:
review+
|
Details |
I have found in practice that buglist.cgi will return results from a large query much
faster when they SelectVisible() wrapper is not used for permission checking. I
feel the performance hit is due to the large amount of outer joins that are being
used. To show the difference you can simple comment out the SelectVisible line in
buglist.cgi and re-run your query. The PostgreSQL branch was basically unusable
without this removal with a large amount of bugs (50,000+). So in the PostgreSQL
branch and now the Groups branch, I have altered to CanSeeBug to both take a
single bug and a list of bugs returning the list of bugs that can be seen by the
current user. This only adds one or two extra SQL queries (one for PostgreSQL,
two for Groups) to each buglist.cgi hit. For details you can check out the code from
either the Bugzilla_PgSql_Branch or the Bugzilla_Groups_Branch. I would be
interested in seeing how this performs with 100,000+ bugs such as
bugzilla.mozilla.org. Also if anyone is better at reading EXPLAIN output it would be
helpful to compare the two.
Comment 1•23 years ago
|
||
dkl: can you give the sql gnereated by a query for all bugs asa has commented
on, both before and after this change?
If theres a large speedup on bmo, and the patch is simple, then maybe we could
get this in for 2.16? (unless the group stuff is landing before then, of course)
Post the explain output, too, I guess.
Assignee | ||
Comment 2•23 years ago
|
||
I imagine this specific request would be difficult since I do not have a copy of the
current mozilla database to run against my new code. I can simulate something by
picking someone here at redhat and running a query against our database on bugs
they have commented on which I assume will serve the same purpose.
Updated•23 years ago
|
Priority: -- → P2
Target Milestone: --- → Bugzilla 2.16
Comment 3•23 years ago
|
||
Not a release blocker, no patch. -> 2.18
I'm assuming we'll be able to fix this a better way when we redo the groups
stuff anyway (which is targetted for 2.18 already). Setting dependency.
Depends on: 68022
Target Milestone: Bugzilla 2.16 → Bugzilla 2.18
Comment 4•22 years ago
|
||
Joel: comments on this bug now that you've rewritten groups?
Gerv
Comment 5•22 years ago
|
||
In light of the SQL discussions that have been going up and back now, I am going
to suggest that this bug be closed unless dkl jumps in and says otherwise. The
postgresql bug 98304 is home to the changes to the query structure that make
these queries efficient across multiple dbs without requiring many outer joins.
Comment 6•22 years ago
|
||
Changed name from:
Using SelectVisible() for permission checking not optimal.
To:
Permission checking in queries not optimal across DBs
From IRC discussion....
This bug is mutating to cover the changes to the Search.pm query to make it
closer to cross-db compatible or, at least, to unblock bug 172272
Assignee: endico → dkl
Blocks: bz-postgres
OS: Linux → All
Priority: P2 → P1
Hardware: Other → All
Summary: Using SelectVisible() for permission checking not optimal. → Permission checking in queries not optimal across DBs
Version: 2.15 → 2.17
Assignee | ||
Comment 7•22 years ago
|
||
Assignee | ||
Comment 8•22 years ago
|
||
Oops, fixed perl warning and also left out other non-group related HAVING
stuff. Better patch.
Assignee | ||
Updated•22 years ago
|
Attachment #102755 -
Attachment is obsolete: true
Comment 9•22 years ago
|
||
Comment on attachment 102758 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v2)
You need to check for isbless != 0. You also don't need the DISTINCT. For the
empty grouplist case, use NOT NULL instead of NOT IN (....). I think.
You could also only select buggroups, and hope that the db can make that more
optimal, esp for teh common case where the user isn't in any bug group.
Bring on Bugzilla::User! :)
I suspect that this is faster, anyway - is it?
Attachment #102758 -
Flags: review-
Assignee | ||
Comment 10•22 years ago
|
||
New patch with bbaetz's changes
Attachment #102758 -
Attachment is obsolete: true
Comment 11•22 years ago
|
||
Actually, when selecting the group list, you need to use isbless = 0, not
isbless != 0 or else only admins will be able to see bugs in queries.
After you do the ...
@grouplist = (0) if !@grouplist;
does the ...
+ if (@grouplist) {
+ $query .= " AND bug_group_map.group_id NOT IN (" . join(',',
@grouplist) . ") ";
+ } else {
+ $query .= " AND bug_group_map.group_id NOT NULL ";
+ }
+
still work?
Status: NEW → ASSIGNED
Comment 12•22 years ago
|
||
No, that won;t work, what you do is leave @grouplist as empty, ie lose the (0) line.
Actually, won't you want IS NULL, rather than IS NOT NULL, so that you select
unprotected bugs?
Comment 13•22 years ago
|
||
Both of bbaetz's comment 12 notes are correct.
Slightly off-topic, what is an unprotected bug with OR-groups?
Assignee | ||
Comment 14•22 years ago
|
||
@grouplist = (0) was due to lack of sleep. Fixed.
Attachment #102765 -
Attachment is obsolete: true
Assignee | ||
Comment 15•22 years ago
|
||
Reminder to myself per discussion with Joel on IRC
<dkl> joel_desk: the highlighting is only needed really when doing multibug editing
<dkl> so that you know that you are adding a private comment to a public bug or not
<dkl> we could make it where only multi bug suffers the slowdown somehow since
it is not used as often
<joel_desk> If that assumption is correct, the best option would be leave the
query alone, but have buglist do a query-per-bug only in the multi-bug-edit mode.
--- Kowalski|sleep is now known as Kowalski
<dkl> joel_desk: that is how I currently do it in the Oracle version except only
when in multi edit
<joel_desk> In which case, we may as well call an IsBugPublic function directly
from the template.
<joel_desk> Let's move that to the template in case someone has strong feelings
on the subject.
<dkl> joel_desk: or in buglist and just step one at a time through the resulting
bug array
<dkl> okay
<joel_desk> It becomes easy for a site to manage their own tradeoff.
<dkl> yeah that would be better
<dkl> I will generate a new patch
<dkl> i guess is doesnt really matter if we call it IsBugPublic() or IsBugPrivate()
<dkl> probably private better since the template can then check for a true value
<dkl> instead of !IsBugPublic()
Comment 16•22 years ago
|
||
Further notes on v4:
You REALLY need a DISTINCT when populating grouplist()
The queries generated when searching for a user on the CC list, having
commented, reporting, or assigned are way slower than they need to be. This
results from....
1) The reported and assigned EACH do their own get-the-user, then join with
profiles, then do the string compare. Really, the entire property should get
the user maching the property and then avoid repeatedly joining the profiles in
for each table.
In other words, rather than....
SELECT stuff FROM bugs LEFT JOIN cc cc_ ON cc_.bugid = bugs.id LEFT JOIN
profiles map_cc ON cc_.who = profiles.id ...... WHERE map_cc.login =
'exactname@exactdomain.com'
We should first look up the userid specified in the form and then use matching
the numeric userid as part of the ON clause when the cc_ table is joined.
This is absolutely worthwhile for exact matches and cases where there are a
small number of userids matching the form. It may even pay when there is a huge
number of matches to the form's userid, but there could be a limit to the number
of elements in an IN() clause. (is there?)
So, for matches on "is" dkl@redhat.com or "contains dkl", it should certainly
work this way. For cases like "doesn't match" dkl, is probably still has to do
the old join.
Even there, it may be possible to use...
SELECT stuff FROM bugs LEFT JOIN profiles AS profiles_leftbox ON
profiles_leftbox.login_name NOT LIKE "dkl" LEFT JOIN longdescs AS commented_ ON
commented_.who = profiles_leftbox.id AND commented_.bugid = bugs.id LEFT JOIN cc
AS cc_ ON cc_.who = profiles_leftbox.id AND cc_.bug_id = bugs.id ...morestuff
WHERE cc_.who IS NOT NULL OR commented_.who IS NOT NULL .. morestuff
Comment 17•22 years ago
|
||
Note that OR group checking is probably better, since we then wouldn't need the
COUNT stuff.
Comment 18•22 years ago
|
||
Actually, the issue in comment 16 is completely independent of AND vs OR groups
and is now being tracked under bug 175425.
The AND-vs-OR logic neutralizes out with the patch being worked here which does
eliminate the counting.
Assignee | ||
Updated•22 years ago
|
Attachment #102807 -
Attachment is obsolete: true
Attachment #102807 -
Flags: review-
Assignee | ||
Comment 19•22 years ago
|
||
Fixes:
1. Added DISTINCT back to SQL query that populates @grouplist per Joel's
request.
2. Added IsBugPrivate() to globals to be called from list template. Only
invoked
when multi-edit is chosen do to being slight slower.
3. Fixed report.cgi to work with new query in Bugzilla::Search.
4. Removed 3 dummy fields in buglist.cgi.
Comment 20•22 years ago
|
||
One nit before this goes in....
1) We should supress the call to IsPrivate() if the use is not logged in (or
else make IsPrivate() return false immediately when called by a non-logged in user)
Assignee | ||
Comment 21•22 years ago
|
||
WRT comment 20, this happens with the latest patch I attached.
<tr class="bz_[% bug.severity %] bz_[% bug.priority %] [%+ "bz_secure" IF
(dotweak && IsBugPrivate(bug.id)) %]">
dotweak in buglist.cgi requires login by default.
Updated•22 years ago
|
Attachment #103641 -
Flags: review-
Comment 22•22 years ago
|
||
Comment on attachment 103641 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v5)
I'm seeing some non-public bugs that are showing up when the user is logged
out.
This comes from the ...
+ if (@grouplist) {
+ $query .= " AND bug_group_map.group_id NOT IN (" . join(',',
@grouplist) . ") ";
+ } else {
+ $query .= " AND bug_group_map.group_id IS NULL ";
+ }
Which should be
+ if (@grouplist) {
+ $query .= " AND bug_group_map.group_id NOT IN (" . join(',',
@grouplist) . ") ";
+ }
To cause the join to grab ANY record that exists. The only time we appear to
have a NULL in that field is if the JOIN can find nothing.
Also, we forgot to permit the QA contact to see bugs. Better make sure that
the qa_contact is not zero when we do that or all bugs without QA contacts will
become visible to logged-out users.
Assignee | ||
Comment 23•22 years ago
|
||
I have made the suggested change and still get bugs listed in the search results
that shouldnt be there for a logged out user.
$query = "SELECT DISTINCT " . join(', ', @fields) .
" FROM $suppstring" .
" LEFT JOIN bug_group_map " .
" ON bug_group_map.bug_id = bugs.bug_id ";
if (@grouplist) {
$query .= " AND bug_group_map.group_id NOT IN (" . join(',', @grouplist)
. ") ";
}
$query .= " LEFT JOIN cc ON cc.bug_id = bugs.bug_id " .
" WHERE " . join(' AND ', (@wherepart, @andlist)) .
" AND ((bug_group_map.group_id IS NULL) " .
" OR (bugs.reporter_accessible = 1 AND bugs.reporter =
$::userid) " .
" OR (bugs.cclist_accessible = 1 AND cc.who = $::userid) " .
" OR (bugs.assigned_to = $::userid) OR (bugs.qa_contact =
$::userid)) " .
" GROUP BY bugs.bug_id";
Comment 24•22 years ago
|
||
I think you need to make the following conditional on the user being logged in....
" OR (bugs.reporter_accessible = 1 AND bugs.reporter =
$::userid) " .
" OR (bugs.cclist_accessible = 1 AND cc.who = $::userid) " .
" OR (bugs.assigned_to = $::userid) OR (bugs.qa_contact =
$::userid)) " .
Comment 25•22 years ago
|
||
To follow up.... I confirmed this....
quetly_check_login leaves $::userid = 0 if the user is not logged in.
qa_contact would be 0 if it is never set.
That should be the culprit.
Assignee | ||
Comment 26•22 years ago
|
||
okay, made suggested changes. New cut.
Attachment #103641 -
Attachment is obsolete: true
Assignee | ||
Comment 27•22 years ago
|
||
Fix the SQL that gets the @grouplist to be more DB independent. Had problem
with foo JOIN bar ON (some condition) with a certain other db. Changed it to be
SELECT something FROM foo, bar WHERE foo.id = bar.id AND some condition.
Which does the same thing and is more compatible.
Attachment #104061 -
Attachment is obsolete: true
Updated•22 years ago
|
Attachment #104149 -
Flags: review+
Comment 28•22 years ago
|
||
Comment on attachment 104149 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v7)
r=joel
Assignee | ||
Comment 29•22 years ago
|
||
More efficient checking for private bugs in buglist.cgi. Calls to
IsBugPrivate() no longer needed.
Attachment #104149 -
Attachment is obsolete: true
Assignee | ||
Comment 30•22 years ago
|
||
Oops. Check for @bugidlist before performing check so as not generate a SQL
error if @bugidlist empty.
Attachment #104242 -
Attachment is obsolete: true
Comment 31•22 years ago
|
||
Comment on attachment 104244 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v9)
Excellent - you did manage to get it back into the hash.
r=joel
Attachment #104244 -
Flags: review+
Comment 32•22 years ago
|
||
Comment on attachment 104244 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v9)
>+# Check for bug privacy and set $bug->{id}->{isprivate} = 1 if private
>+# to 1 or more groups
>+my %privatebugs;
The comment's wrong: that's not what you are setting.
>+if (@bugidlist) {
>+ SendSQL("SELECT DISTINCT bugs.bug_id FROM bugs, bug_group_map " .
>+ "WHERE bugs.bug_id = bug_group_map.bug_id " .
>+ "AND bugs.bug_id IN (" . join(',',@bugidlist) . ")");
>+ while (MoreSQLData()) {
>+ my ($id) = FetchSQLData();
>+ $privatebugs{$id} = 1;
>+ }
>+ foreach my $bug (@bugs) {
>+ if ($privatebugs{$bug->{id}}) {
>+ $bug->{isingroups} = 1;
>+ }
>+ }
>+}
This would be all so much easier if @bugs was actually a hash, %bugs. No need
for a separate list of IDs; no need for two loops. Is that too big a change?
The above section would look like this:
+if (%bugs) {
+ SendSQL("SELECT DISTINCT bugs.bug_id FROM bugs, bug_group_map " .
+ "WHERE bugs.bug_id = bug_group_map.bug_id " .
+ "AND bugs.bug_id IN (" . join(',', keys %bugs) . ")");
+ while (MoreSQLData()) {
+ my ($id) = FetchSQLData();
+ $bugs->{$id}{isingroups} = 1;
+ }
In fact, couldn't this be combined with the actual getting of the bug data
further up? I am reviewing this without access to the whole file, so I can't
see.
> ################################################################################
> # Template Variable Definition
>Index: report.cgi
>===================================================================
>RCS file: /cvsroot/mozilla/webtools/bugzilla/report.cgi,v
>retrieving revision 1.4
>diff -u -u -r1.4 report.cgi
Please don't mess with report.cgi, as I'm in the process of rewriting it (bug
173005). Are these changes necessary at the same time as everything else, or
can they be done in a later patch? (Also, I can't quite see what you are doing
- whay are you reading the id back from the database? If $id has always to be
selected, then you shouldn't force the caller to add it to the field list - you
should add it inside Search.pm.)
>Index: Bugzilla/Search.pm
>===================================================================
>+
>+ # Grab a list of group ids that user belongs to
>+ my @grouplist;
>+ my $query = "SELECT DISTINCT group_id FROM user_group_map, groups " .
>+ " WHERE user_group_map.group_id = groups.id " .
>+ " AND groups.isbuggroup = 1 AND user_group_map.isbless = 0 " .
>+ " AND user_group_map.user_id = $::userid";
>+ &::SendSQL($query);
>+ while (&::MoreSQLData()) {
>+ my ($id) = &::FetchSQLData();
>+ push (@grouplist, $id);
>+ }
The user's groups are, or should be, cached in $vars->{'user'}. No need to get
them again.
>+
>+ $query = "SELECT DISTINCT " . join(', ', @fields) .
>+ " FROM $suppstring" .
>+ " LEFT JOIN bug_group_map " .
>+ " ON bug_group_map.bug_id = bugs.bug_id ";
>+
>+ if (@grouplist) {
>+ $query .= " AND bug_group_map.group_id NOT IN (" . join(',', @grouplist) . ") ";
>+ }
>+
>+ $query .= " LEFT JOIN cc ON cc.bug_id = bugs.bug_id " .
>+ " WHERE " . join(' AND ', (@wherepart, @andlist)) .
>+ " AND ((bug_group_map.group_id IS NULL) ";
This will fail if @wherepart and @andlist are both empty, which can happen, I
believe.
This is great - I'm sure this will speed things up immensely.
Gerv
Attachment #104244 -
Flags: review-
Assignee | ||
Comment 33•22 years ago
|
||
Comments about patch to follow.
Attachment #104244 -
Attachment is obsolete: true
Assignee | ||
Comment 34•22 years ago
|
||
WRT to Gerv's comments:
>The comment's wrong: that's not what you are setting.
Fixed.
>+ foreach my $bug (@bugs) {
>+ if ($privatebugs{$bug->{id}}) {
>+ $bug->{isingroups} = 1;
>+ }
>+ }
>+}
>This would be all so much easier if @bugs was actually a hash, %bugs. No need
>for a separate list of IDs; no need for two loops. Is that too big a change?
This would require changes to the table.html.tmpl template to properly iterate
through each record so it might be fine just to leave it as is.
> ################################################################################
> # Template Variable Definition
>Index: report.cgi
>===================================================================
>RCS file: /cvsroot/mozilla/webtools/bugzilla/report.cgi,v
>retrieving revision 1.4
>diff -u -u -r1.4 report.cgi
Please don't mess with report.cgi, as I'm in the process of rewriting it (bug
173005). Are these changes necessary at the same time as everything else, or
can they be done in a later patch? (Also, I can't quite see what you are doing
- whay are you reading the id back from the database? If $id has always to be
selected, then you shouldn't force the caller to add it to the field list - you
should add it inside Search.pm.)
>+ # Grab a list of group ids that user belongs to
>+ my @grouplist;
>+ my $query = "SELECT DISTINCT group_id FROM user_group_map, groups " .
>+ " WHERE user_group_map.group_id = groups.id " .
>+ " AND groups.isbuggroup = 1 AND user_group_map.isbless = 0 " .
>+ " AND user_group_map.user_id = $::userid";
>+ &::SendSQL($query);
>+ while (&::MoreSQLData()) {
>+ my ($id) = &::FetchSQLData();
>+ push (@grouplist, $id);
>+ }
>The user's groups are, or should be, cached in $vars->{'user'}. No need to get
>them again.
This was actually a good idea. I had to alter GetUserInfo() in CGI.pl to pull in
group ids as well as names. Then just checked for the existence of
$::vars->{user}{groupids} in Search.pm and used those instead. Cut down one one
more unnecessary SQL query.
>+ $query .= " LEFT JOIN cc ON cc.bug_id = bugs.bug_id " .
>+ " WHERE " . join(' AND ', (@wherepart, @andlist)) .
>+ " AND ((bug_group_map.group_id IS NULL) ";
>This will fail if @wherepart and @andlist are both empty, which can happen, I
>believe.
This was already like this before so was not introduced with this patch. Should
this be filed as a separate bug?
I have removed my changes from report.cgi since Gerv stated it was being
rewritten anyway. But note that it is broken when using this patch. Either
changes need to be made to report.cgi to add bugs.bug_id as the leading field
when calling Bugzilla::Search or Bugzilla::Search needs to add it if not there
since it is using a DISTINCT in the beginning of the main query. Comments?
Comment 35•22 years ago
|
||
If we want to avoid changing report.cgi, we should append the id to the list of
fields requested by the caller instead of prepending it. However, that is
pretty hokey. It would be much better to just fix report.cgi
Assignee | ||
Comment 36•22 years ago
|
||
Fixed problem with $::vars->{user}{groupids} check. Also changed to look for
$::vars->{user}{userid} instead of $::userid since that may go away in the
future.
Attachment #104515 -
Attachment is obsolete: true
Comment 37•22 years ago
|
||
> I have removed my changes from report.cgi since Gerv stated it was being
> rewritten anyway. But note that it is broken when using this patch. Either
> changes need to be made to report.cgi to add bugs.bug_id as the leading field
> when calling Bugzilla::Search or Bugzilla::Search needs to add it if not there
> since it is using a DISTINCT in the beginning of the main query. Comments?
If the workings of Bugzilla::Search require bugs.bug_id to be the first field
(why? DISTINCT doesn't have a special relationship with the first field named)
then it should make sure that this is true itself.
There should be no onus on a user of the search API to request particular
columns; if Bugzilla::Search needs particular columns, it should request them,
and make the fact that it has done so transparent to the caller (so that if the
caller asks for fields x, y and z, the results which return have fields x, y and
z as the first three fields.)
Gerv
Assignee | ||
Comment 38•22 years ago
|
||
Fixed error problem with @{$::vars->{user}{groupids}} if not defined.
Assignee | ||
Updated•22 years ago
|
Attachment #104666 -
Attachment is obsolete: true
Comment 39•22 years ago
|
||
Assuming that any columns not retrieved are discarded anyway, we should APPEND
the bug_id to the list of fields requested if the caller doesn't ask for it.
This will unbreak reports without hacking it up.
Comment 40•22 years ago
|
||
Comment on attachment 104687 [details] [diff] [review]
Patch to Bugzilla/Search.pm for better cross DB portability (v11)
>+ $query .= " LEFT JOIN cc ON cc.bug_id = bugs.bug_id " .
>+ " WHERE " . join(' AND ', (@wherepart, @andlist)) .
>+ " AND ((bug_group_map.group_id IS NULL) ";
You don't want to be unconditionally creating a row for every person on the cc
list...
You lost the ' AND cc.who = $::userid' from the join condition, so that you get
exactly one row per bug. Plus, that LEFT JOIN is only needed if you're logged
in.
Also, make the cc AS ccMap, just to avoid possible name conflicts with joins
from the search code, and also as a form of documentation.
You also may need to change the corresponding WHERE thing for the ccs to add a
AND ccMap.who IS NOT NULL, in order to work arround
http://lists.mysql.com/cgi-ez/ezmlm-cgi?9:mss:11417 You could also use IS NOT
NULL in place of checking for equality with $::userid, rather tha in addition
to, I think; if you do both, add a comment pointing to that url, and mention
that the bug was fixed in 3.23.44.
Oh, for a SUBSELECT ;)
Attachment #104687 -
Flags: review-
Comment 41•22 years ago
|
||
You can also drop teh DISTINCT; its implied by the GROUP by
Comment 42•22 years ago
|
||
although myk noticed times increasing when the distinct was dropped. No clue why...
Comment 43•22 years ago
|
||
Anyway, even with teh JOIN change, the patch with this bug was still about 0.5
seconds slower when searching for all Bugzilla+Webtools bugs 10 times. That was
out of 1 min 20 second, though...
It may be better if the user isn't logged in, mind you.
Comment 44•22 years ago
|
||
OK, new fun stuff
GROUP BY is 10-15% slower than DISTINCT on mysql (Its also slower on pg, but a
query which takes 0.8 seconds before a vaccum takes 12 after, probably because I
have a bogus cost setting somewhere which I don't feel like chasing down)
I'm guessing, (based partly on pg's more detailed EXPLAIN results) that the
reason is that GROUP has to sort every single possible column into a temp data
structure (since they could be used for HAVING/aggregate), whilst DISTINCT can
just sort the returned columns. Anyway, let me hack this patch to handle that...
Comment 45•22 years ago
|
||
OK, I'm going to attach a patch. Since so much of buglist.cgi's time is spent
loading TT+friends, I hacked up a quick script to generate the sql, then call
SendSQL on that 1000 times. The query I used the query term 'product=foo'
w/o being logged in:
prepatch, this took 6.1 seconds
postpatch, this to 3.8
when logged in:
prepatch - 6.8 seconds
postpatch - 4.7
I only have 40 bugs in total, so those numbers aren't really meaningful.
However, on a randomly generated db with 50000 bugs (spread roughly evenly over
10 products), 20000 users, 10 groups, 100000 entries in the cc table, 100
entires in user_group_map and 50 entries in bug_group_map, running 100 times for
product = '1 existing product':
w/o being logged in:
prepatch - 48 seconds
post-patch - 18 seconds
Being logged in:
prepatch - 48 seconds
post-patch - 28 seconds
Searching on two products:
not logged in:
pre - 67 seconds
post - 33 seconds
logged in:
prepatch - 67 seconds
post patch - 55 seconds
The vast speed difference when not being logged in is mainly due to avoiding the
unneeded table joins. Re-adding the GROUP BY adds 1-2 seconds to the post patch
time; IOW its about 5%. This is close to the variance, though, so take that
number with a grain of salt.
Numbers are average of 2-3 runs, variance about +/- 0.3 seconds for the first
set, and +/- 2-3 sec for the large db, and I discarded the first run for each query.
Unless the optimiser barfs, this improvement is relative to the number of rows
found in the query; its not going to be noticable on one of the monster
comment/historical stuff. Still, it won't hurt :)
In short; I think we really want this for bmo's update; patch coming after I fix
the reporting stuff.
While we're at it, I'll point out that the current Bugzilla::Search stuff leaves
out allowing the qa contact to see the bug..... Those numbers were run with the
new stuff disabling it too, but its fixed in the patch.
myk, given the above numbers I'm not sure why you weren't seeing any difference
when we ran the queries the other day. If you don't want it immediately because
of perceived risk, we need to fix teh qa stuff regardless.
Keywords: topperf
Comment 46•22 years ago
|
||
Those numbers were all for selecting the bug_id only. If I additionally select
assigned_to (which involves an extra join), then I get:
not logged in:
pre - 79
post - 63
logged in:
pre - 70
post - 76
Removing distinct 'fixes' this. The numbers are really bad if you want reporter
and assigned_to, - they're worse by a *factor* of 3.
Consider the query to find all bugs not in a group:
SELECT bugs.bug_id, map_assigned_to.login_name FROM profiles AS map_assigned_to,
bugs LEFT JOIN bug_group_map ON bug_group_map.bug_id = bugs.bug_id WHERE
bugs.assigned_to = map_assigned_to.userid AND ((bug_group_map.group_id IS NULL));
without distinct, 2.8 seonds. With, 18.5 seconds. No, thats not a typo. Using
GROUP BY in this case is 5.8 seconds, so I'm going to switch to that. The
numbers in my previous coment roughly stay the same, and the last case where we
are worse off becomes 60 s.
The real fix for this is probably to split up the 'give me columns' stuff from
the search fu, but that would not only be a pain with the current architechture,
its probably something which would be better done using subselects. (In any
event, doing so is probably a blocker for custfields)
As (another) aside, if I change the product code to do the id lookup in
Bugzilla::Search rather than a join, stuff doesn't seem to change much; the
optimiser is obvioulsy managing to pull that out, at least
Comment 47•22 years ago
|
||
As previously discussed. The sumary for this bug probably needs to be changed;
its not about cross db support, and none of the other patches have done that
either.
myk, can I get you to run some comparison tests on bmo-based data, please?
Attachment #104687 -
Attachment is obsolete: true
Comment 48•22 years ago
|
||
A few initial comments....
In some of the earlier work dkl and I did, we found that the IN() clause will
fail if there are more than a few thousand items in it. I dont remember if this
effected Pg or MySql or both. Buglist.cgi needs a way of more incrementally
running the query to build the privacy hash. A good way could be to just let it
go through the buglist 1000 entries at a time.
For logged out users, we know in advance that none of the bugs will be
restricted, so we should skip that entire query.
isingroups will become a misnomer when we get to OR groups. How about reversing
the sense of it (only one template seems to use it) and making it "ispublic"?
This does look like it is on the right track, tough I am not certain what the
impact on Pg compatibility will be.
No longer blocks: 147275
Comment 49•22 years ago
|
||
bbaetz- got that test script handy?
Comment 50•22 years ago
|
||
[myk@buttmonkey html]$ time for i in `seq 1 1000`; do mysql -uroot bugs <
114696-anonymous-before.sql > /dev/null; done
real 0m51.317s
user 0m5.721s
sys 0m2.666s
[myk@buttmonkey html]$ time for i in `seq 1 1000`; do mysql -uroot bugs <
114696-anonymous-after.sql > /dev/null; done
real 0m50.216s
user 0m5.691s
sys 0m2.652s
[myk@buttmonkey html]$ time for i in `seq 1 1000`; do mysql -uroot bugs <
114696-myk-before.sql > /dev/null; done
real 0m51.709s
user 0m5.699s
sys 0m2.627s
[myk@buttmonkey html]$ time for i in `seq 1 1000`; do mysql -uroot bugs <
114696-myk-after.sql > /dev/null; done
real 0m50.987s
user 0m5.742s
sys 0m2.678s
Comment 51•22 years ago
|
||
Doh. myk - I'll attach it now. What sort of query were you running? As I said,
the improvement is per row retreived, so make sure that you return a few hundred
rows to notice th emost difference.
dkl - we only use IN for groups. Noone is going to have several thousand groups,
and if anyone ever does we could move to a subselect with NOT EXISTS, which
would be faster in that case anyway.
Comment 52•22 years ago
|
||
Change the user id and groups stuff as appropriate. Put the script whereever
(change the $::db_name setting, too), and run from the root bugzilla dir.
Try it after also removing the assigned_to search, too.
Comment 53•22 years ago
|
||
Ok, here's what I get with the test script:
version on b.m.o: ~28.16
tip before 114696: ~17.55
tip after 114696: ~13.73
tip after 114696 and removing duplicate condition: ~13.69
I don't fully understand these numbers, since the version on b.m.o should be
faster than the others, but there it is.
Comment 54•22 years ago
|
||
Comment on attachment 105644 [details]
test script
Ok, I'm satisfied. Testing shows significant performance improvements, and as
far as I can tell it doesn't break anything. r=myk
Attachment #105644 -
Flags: review+
Comment 55•22 years ago
|
||
a= justdave
Comment 56•22 years ago
|
||
Checked in. The cross-db support has a fair way to go, please file a separate
bug for those
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Summary: Permission checking in queries not optimal across DBs → Permission checking in queries not optimal
Comment 57•19 years ago
|
||
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR
1.0.3705)
Build Identifier: bugzilla/2.18.1
I use the bugzilla 2.18.1 and have a productA
The Group Access Controls groupA: Default/Default
groupB: Default/Default
groupC: Default/Default
groupD: Default/Default
The groupA->user(user1,user2)
groupB->user(user3,user4)
groupC->user(user5,user6)
groupD->user(user7,user8)
Reproducible: Always
Steps to Reproduce:
1. when user1 new a bug 1,select the groupA,groupB,groupC .
Actual Results:
1.the bug 1 only user1 can see.
Expected Results:
1.The users of groupA,groupB and groupC can see the bug 1,but the users of
groupD can not see the bug.
when user1 new a bug 2 ,select the groupA. the result only the groupA can see
the bug.It is right.
who can tall me ,How can i do?
Updated•12 years ago
|
QA Contact: matty_is_a_geek → default-qa
You need to log in
before you can comment on or make changes to this bug.
Description
•