Closed Bug 285695 Opened 20 years ago Closed 19 years ago

[PostgreSQL] Username checks for login, etc. need to be case insensitive

Categories

(Bugzilla :: Database, defect, P1)

2.19.2
defect

Tracking

()

RESOLVED FIXED
Bugzilla 2.20

People

(Reporter: Tomas.Kopal, Assigned: mkanat)

References

(Blocks 1 open bug, )

Details

Attachments

(1 file, 7 obsolete files)

String matching and comparison is case insensitive on MySQL, but is not on Postgres. We need to modify the queries to do all the comparisons case-insensitive explicitly. This patch deals with login_name and realname only, other string columns will need to be checked as well.
Attached patch V1 (obsolete) (deleted) — Splinter Review
Attachment #177091 - Flags: review?(mkanat)
Comment on attachment 177091 [details] [diff] [review] V1 Unfortunately, this will kill our index performance. We need to find some way to deal with that.
Attachment #177091 - Flags: review?(mkanat) → review-
This is actually going to be somewhat of a major issue. Either we need to find some way to reliably make the Pg indexes on these lowercase, or we need to find some solution to this problem that is reliably performant on all DBs.
Priority: -- → P1
Target Milestone: --- → Bugzilla 2.20
problably a silly idea, but why not force the data to lowercase before storing it in the db? and lowercase existing data in checksetup.
OK, so my search has led me to this: WHERE LOWER(login_name) = 'something' will do a TABLE SCAN on MySQL. On bmo at least, that would KILL performance. There is NO WAY to get around this. Believe me, I've asked pretty much one of the world's foremost MySQL experts, and it just can't be done. You can't create an index on LOWER(login_name). The only workarounds are: (1) Create a new sql_icompare function. (2) Create another column, and store the lowercase version in that column, and index that and do searches on it. (3) Store all the data as lowercase. 3 can't be done (for realname), and it's also a sort of regression. 2 is a big hassle. So that leaves us with #1. I've already talked to Tomas about how we'd do the icompare function. (In reply to comment #4) > problably a silly idea, but why not force the data to lowercase before storing > it in the db? and lowercase existing data in checksetup. Because we have to do this for realname, too.
Status: NEW → ASSIGNED
Things are never as simple as they appear. sql_icompare is a nice idea, but as I started to implement it, I realised it does not handle cases where we are looking the names up using "username LIKE pattern" or even "username IN (list)". For LIKE, we can create another function, sql_ilike. But that does not help with IN. The best solution I can see ATM is to provide a function sql_istring, which would "prepare" the string for case insensitive comparison (calling LOWER($param) on ANSI DBs, just returning $param on MySQL). Then we can use this function in all cases...
Attached patch V2 (obsolete) (deleted) — Splinter Review
New attempt :-). Still fixing just login name and full name.
Attachment #177091 - Attachment is obsolete: true
Attachment #177494 - Flags: review?(mkanat)
Arrgh. If anybody has any better ideas about how we could handle this, please offer your suggestions. Otherwise I'll review the patch within the next few days.
Comment on attachment 177494 [details] [diff] [review] V2 OK, it generally looks OK, except there's been a bit of bitrot because of emailprefs. Also, what are we doing about searching for usernames in Search.pm?
Attachment #177494 - Flags: review?(mkanat) → review-
Attached patch V2.1 (obsolete) (deleted) — Splinter Review
Un-bitrotted, double-checked, polished, improved :-). Search.pm is mostly searching by user_id, the functions converting from email to id (e.g. ListIDsForEmail) should be fixed. Let me know, if you find something I overlooked.
Attachment #177494 - Attachment is obsolete: true
Attachment #181480 - Flags: review?(mkanat)
(In reply to comment #5) > (2) Create another column, and store the lowercase version in that column, and > index that and do searches on it. > 2 is a big hassle. Err, why? The space increase isn't terrible (just those two columns are duplicated in the profiles table); upgrade is easy enough - you add the lclogin_name and lcrealname and run an UPDATE to set them.
(In reply to comment #11) > (In reply to comment #5) > > > (2) Create another column, and store the lowercase version in that column, and > > index that and do searches on it. > > > 2 is a big hassle. > > Err, why? > > The space increase isn't terrible (just those two columns are duplicated in the > profiles table); upgrade is easy enough - you add the lclogin_name and > lcrealname and run an UPDATE to set them. > Well, you are right it's not such a hassle to implement it. But it can be a hassle down the road. Having basically the same data on multiple places asks for them to get out of sync and cause big trouble. It's an option, but I would like to avoid it if at all possible. And it looks like it is possible...
Blocks: 292119
Random-Internet-review-that-doesn't-really-count-because-I-haven't-reviewed-in-a-long-time of attachment 181480 [details] [diff] [review] (V2.1): >+ will not be usually used unles it was created as LOWER(column). NIT: The word "unless" is misspelled. >+ $term = $dbh->sql_position(lc($q), "LOWER($ff)") . " > 0"; >[...] > $dbh->sql_position(lc(::SqlQuote($email)), "LOWER(login_name)") . >[...] >+ "LOWER($field)") . " > 0"); >[...] >+ $dbh->sql_position($sqlstr, 'LOWER(login_name)') . " > 0" . >+ " OR " . >+ $dbh->sql_position($sqlstr, 'LOWER(realname)') . " > 0"; >[...] >+ $query .= $dbh->sql_position('?', >+ 'LOWER(profiles.login_name)') . ' > 0'; Why is "LOWER()" still being used in these places instead of $dbh->sql_istring()? Or better yet, why not change $dbh->sql_position() to automatically call sql_istring() on its second argument (if this is the way it's always used)?
Also, is the creation of table indexes (indices) using LOWER() in PostgreSQL covered in this patch (per the "Note" in this POD)? >+=item C<sql_istring> >+ >+ Description: Returns SQL syntax "preparing" a string or text column for >+ case-insensitive comparison. >+ Params: $string = string to convert (scalar) >+ Returns: formatted SQL making the string case insensitive >+ Note: The default implementation simpy calls LOWER on the parameter. >+ If this is used to search on a text column with index, the index >+ will not be usually used unles it was created as LOWER(column).
(In reply to comment #13) > Why is "LOWER()" still being used in these places instead of > $dbh->sql_istring()? Unfortunately, I have no idea, but I think it might be because POSITION is differently case-sensitive or insensitive depending on which version of MySQL you use. Also unfortunately, the use of sql_istring is somewhat complicated, probably the most complicated that we've had to introduce. > Or better yet, why not change $dbh->sql_position() to > automatically call sql_istring() on its second argument (if this is the way > it's always used)? From the fact that we have both "casesubstring" and "substring" in Search.pm, I'm assuming that sometimes we want POSITION to be case-sensitive. I'm starting to think about this, and I think that *maybe* for usernames, at least, the solution is to store them in lowercase, and lc() the arguments as they come in. I think the times that we need to do searches on realname are much rarer, so it would be easier to special-case them, or to just call LOWER (because we need less speed).
Comment on attachment 181480 [details] [diff] [review] V2.1 I'm going to cut down this patch to address just this bug.
Attachment #181480 - Flags: review?(mkanat)
Assignee: Tomas.Kopal → mkanat
Status: ASSIGNED → NEW
Component: Bugzilla-General → Database
Summary: Username matching needs to be case insensitive → [PostgreSQL] Username matching needs to be case insensitive
Version: unspecified → 2.19.2
Status: NEW → ASSIGNED
Summary: [PostgreSQL] Username matching needs to be case insensitive → [PostgreSQL] Username checks for login, etc. need to be case insensitive
Attached patch v2.2 (obsolete) (deleted) — Splinter Review
I basically just cut ou all the parts of this patch that aren't actually relevant to this bug. This has my r+. However, since I technically "made" this patch, it will also need another review.
Attachment #181480 - Attachment is obsolete: true
Attachment #187782 - Flags: review+
Comment on attachment 187782 [details] [diff] [review] v2.2 OK, welcome to the wonderful world of sql_istring! Basically, this just does "LOWER" on PostgreSQL and nothing at all on MySQL (because MySQL already does case-insensitive comparisons). It doesn't yet deal with Search.pm, because that's pretty complex and I thought it wasn't critical functionality. However, you should be able to log in with a username in any sort of mixed case, and do a few other things, too. There's a testing installation up at: http://landfill.bugzilla.org/bz285695/
Attachment #187782 - Flags: review?(LpSolit)
Comment on attachment 187782 [details] [diff] [review] v2.2 > $sth = $dbh->prepare("SELECT login_name FROM profiles " . >- "WHERE login_name = ?"); >+ "WHERE " . $dbh->sql_istring('login_name') . >+ " = " . $dbh->sql_istring('?')); Except in one place where the comparison operator is "LIKE", we have everywhere else in the code something like String1 = String2. So instead of writing $dbh->sql_istring(String1) . " = " . $dbh->sql_istring(String2) all the time, which is really annoying, I would prefer something like $dbh->sql_istring(String1, String2). In the very rare cases where the comparison operator is something different, you can choose between: 1) $dbh->sql_istring(String1) . " LIKE " . $dbh->sql_istring(String2) where the second parameter would be undefined, meaning that it should not fall back to LOWER(String1) = LOWER(String2) but to LOWER(StringX) only; or 2) $dbh->sql_istring(String1, String2, "LIKE") where the third parameter would be the comparison operator.
Attachment #187782 - Flags: review?(LpSolit) → review-
Attached patch v3 (sql_istrcmp) (obsolete) (deleted) — Splinter Review
OK! Now we have sql_istrcmp, which does a case-insensitive comparison of two strings. I agree that this is really much better. You'll notice that in one place I actually modified a Bugzilla::Auth::Verify::DB function to just call a Bugzilla::User function that was doing essentially the same thing. I tested all this code.
Attachment #187782 - Attachment is obsolete: true
Attachment #188039 - Flags: review?(LpSolit)
Comment on attachment 188039 [details] [diff] [review] v3 (sql_istrcmp) >Index: checksetup.pl Question: There are several places in checksetup.pl where you did not convert "login_name = ?" to the new syntax. It's because only "internal" login comparisons are done, right? I mean, no input from the user. >+=item C<sql_istring> >+ Params: $string = string to convert (scalar) Nit: s/=/-/ >+ Note: The default implementation simpy calls LOWER on the parameter. Nit: s/simpy/simply/ >+ If this is used to search on a text column with index, the index >+ will not be usually used unles it was created as LOWER(column). Nit: s/unles/unless/ >Index: Bugzilla/Auth/Verify/DB.pm >+ return (login_to_id($username) || undef); It's better to write "return login_to_id($username);" and fix Bugzilla::Auth::Verify::DB::authenticate(), line 59, accordingly. The reason I deny review (except the point in authenticate()) is because you forgot to consider files in contrib/ as well.
Attachment #188039 - Flags: review?(LpSolit) → review-
(In reply to comment #21) > Question: There are several places in checksetup.pl where you did not convert > "login_name = ?" to the new syntax. It's because only "internal" login > comparisons are done, right? I mean, no input from the user. Right. Either that, or because it's on MySQL-only code. > >+ return (login_to_id($username) || undef); > > It's better to write "return login_to_id($username);" and fix > Bugzilla::Auth::Verify::DB::authenticate(), line 59, accordingly. OK, agreed. I was concerned that some subclass might depend on the behavior, but I realize now that that's not an issue. > The reason I deny review (except the point in authenticate()) is because you > forgot to consider files in contrib/ as well. Ahh, will fix.
Attached patch v4 (obsolete) (deleted) — Splinter Review
OK, I've fixed all your comments. I made sure that all the contrib/ stuff still compiles.
Attachment #188039 - Attachment is obsolete: true
Attachment #188073 - Flags: review?(LpSolit)
Attached patch v5 (obsolete) (deleted) — Splinter Review
Removed get_id_from_username entirely, as pointed out by LpSolit in IRC.
Attachment #188073 - Attachment is obsolete: true
Attachment #188151 - Flags: review?(LpSolit)
Attachment #188073 - Flags: review?(LpSolit)
>Index: contrib/syncLDAP.pl >[...] >@@ -237,7 +238,9 @@ > print "Performing DB update:\nPhase 1: disabling not-existing users... " unless $quiet; > if($nodisable == 0) { > while( my ($key, $value) = each(%disable_users) ) { >- SendSQL("UPDATE profiles SET disabledtext = 'auto-disabled by ldap sync' WHERE login_name='$key'" ); >+ SendSQL("UPDATE profiles SET disabledtext = 'auto-disabled by ldap " . >+ "sync' WHERE " . $dbh->sql_istrcmp('login_name', >+ $dbh->quote($key))); > } > print "done!\n" unless $quiet; > } Isn't it kind of weird to split the string, 'auto-disabled by ldap sync', into two pieces?!
> Isn't it kind of weird to split the string, 'auto-disabled by ldap sync', into > two pieces?! It's to make the lines conform to 80 characters where they didn't before. This makes it fit in the least number of lines. Also, it's contrib.
Comment on attachment 188151 [details] [diff] [review] v5 >Index: contrib/BugzillaEmail.pm >@@ -45,13 +47,15 @@ >+ my $stmt = "SELECT login_name FROM profiles WHERE " . >+ $dbh->istrcmp('login_name', $dbh->quote($address)); > SendSQL($stmt); > my $found_address = FetchOneColumn(); > return $found_address; > } elsif ($email_transform eq $EMAIL_TRANSFORM_BASE_DOMAIN) { > my ($username) = ($address =~ /(.+)@/); >- my $stmt = "SELECT login_name FROM profiles WHERE profiles.login_name RLIKE \'$username\';"; >+ my $stmt = "SELECT login_name FROM profiles WHERE " . $dbh->istrcmp( >+ 'login_name', $dbh->quote($username), $dbh->sql_regexp()); > SendSQL($stmt); For these two queries: - use $dbh->sql_istrcmp() instead of $dbh->istrcmp() :-D - replace SendSQL() by $dbh->selectrow_array() which avoid the use of $dbh->quote() and makes the code a little bit more readable. >@@ -68,7 +72,8 @@ >- my $stmt = "SELECT login_name FROM profiles WHERE profiles.login_name RLIKE \'$username\';"; >+ my $stmt = "SELECT login_name FROM profiles WHERE " .$dbh->istrcmp( >+ 'login_name', $dbh->quote($username), $dbh->sql_regexp()); > SendSQL($stmt); > my $found_address = FetchOneColumn(); > return $found_address; Same 2 remarks here. >Index: contrib/bug_email.pl >@@ -1149,7 +1151,8 @@ >+ SendSQL("SELECT userid FROM profiles WHERE " . >+ $dbh->istrcmp('login_name', $dbh->quote($reporter))); > my $userid = FetchOneColumn(); Same here. You definitely don't like this sql_-like stuff, do you? :-D >Index: contrib/syncLDAP.pl >@@ -237,7 +238,9 @@ >+ SendSQL("UPDATE profiles SET disabledtext = 'auto-disabled by ldap " . >+ "sync' WHERE " . $dbh->sql_istrcmp('login_name', >+ $dbh->quote($key))); $dbh->do() so that you avoid $dbh->quote() (mixing old a new stuff looks awful). >@@ -249,9 +252,12 @@ >+ SendSQL("UPDATE profiles SET login_name = '" . >+ @$value{'new_login_name'} . "' WHERE " . >+ $dbh->istrcmp('login_name', $dbh->quote($key))); > } else { >- SendSQL("UPDATE profiles SET realname = '" . @$value{'realname'} . "' WHERE login_name='$key'" ); >+ SendSQL("UPDATE profiles SET realname = '" . @$value{'realname'} . >+ "' WHERE " . $dbh->istrcmp('login_name', $dbh->quote($key))); > } $dbh->do() and $dbh->do() and... *SQL*_istrcmp(). Moreover, you forgot contrib/bugzilla_email_append.pl, line 104.
Attachment #188151 - Flags: review?(LpSolit) → review-
(In reply to comment #27) > For these two queries: > - use $dbh->sql_istrcmp() instead of $dbh->istrcmp() :-D OK. > - replace SendSQL() by $dbh->selectrow_array() which avoid the use of > $dbh->quote() and makes the code a little bit more readable. No. This is contrib/ code, and we're frozen. I'm not going to re-write the structure of contrib/ code, particularly not in a freeze. > Moreover, you forgot contrib/bugzilla_email_append.pl, line 104. OK, I'll get that.
Attached patch v6 (deleted) — Splinter Review
OK, I fixed the names of the functions in the contrib/ code. I'm not otherwise re-writing contrib code to use standard DBI functions; we can do that in another bug.
Attachment #188151 - Attachment is obsolete: true
Attachment #188384 - Flags: review?(LpSolit)
Comment on attachment 188384 [details] [diff] [review] v6 I did some basic tests on landfill-pg, bz285695 and on my local installation using MySQL, and this patch effectively fixes the problem for Pg. No regression found when using MySQL. r=LpSolit
Attachment #188384 - Flags: review?(LpSolit) → review+
Flags: approval?
Flags: approval? → approval+
Checking in checksetup.pl; /cvsroot/mozilla/webtools/bugzilla/checksetup.pl,v <-- checksetup.pl new revision: 1.410; previous revision: 1.409 done Checking in request.cgi; /cvsroot/mozilla/webtools/bugzilla/request.cgi,v <-- request.cgi new revision: 1.23; previous revision: 1.22 done Checking in token.cgi; /cvsroot/mozilla/webtools/bugzilla/token.cgi,v <-- token.cgi new revision: 1.31; previous revision: 1.30 done Checking in Bugzilla/DB.pm; /cvsroot/mozilla/webtools/bugzilla/Bugzilla/DB.pm,v <-- DB.pm new revision: 1.56; previous revision: 1.55 done Checking in Bugzilla/Token.pm; /cvsroot/mozilla/webtools/bugzilla/Bugzilla/Token.pm,v <-- Token.pm new revision: 1.31; previous revision: 1.30 done Checking in Bugzilla/User.pm; /cvsroot/mozilla/webtools/bugzilla/Bugzilla/User.pm,v <-- User.pm new revision: 1.59; previous revision: 1.58 done Checking in Bugzilla/Auth/Login/WWW/Env.pm; /cvsroot/mozilla/webtools/bugzilla/Bugzilla/Auth/Login/WWW/Env.pm,v <-- Env.pmnew revision: 1.4; previous revision: 1.3 done Checking in Bugzilla/Auth/Verify/DB.pm; /cvsroot/mozilla/webtools/bugzilla/Bugzilla/Auth/Verify/DB.pm,v <-- DB.pm new revision: 1.5; previous revision: 1.4 done Checking in Bugzilla/Auth/Verify/LDAP.pm; /cvsroot/mozilla/webtools/bugzilla/Bugzilla/Auth/Verify/LDAP.pm,v <-- LDAP.pm new revision: 1.6; previous revision: 1.5 done Checking in Bugzilla/DB/Mysql.pm; /cvsroot/mozilla/webtools/bugzilla/Bugzilla/DB/Mysql.pm,v <-- Mysql.pm new revision: 1.24; previous revision: 1.23 done Checking in contrib/BugzillaEmail.pm; /cvsroot/mozilla/webtools/bugzilla/contrib/BugzillaEmail.pm,v <-- BugzillaEmail.pm new revision: 1.3; previous revision: 1.2 done Checking in contrib/bug_email.pl; /cvsroot/mozilla/webtools/bugzilla/contrib/bug_email.pl,v <-- bug_email.pl new revision: 1.28; previous revision: 1.27 done Checking in contrib/bugzilla_email_append.pl; /cvsroot/mozilla/webtools/bugzilla/contrib/bugzilla_email_append.pl,v <-- bugzilla_email_append.pl new revision: 1.9; previous revision: 1.8 done Checking in contrib/syncLDAP.pl; /cvsroot/mozilla/webtools/bugzilla/contrib/syncLDAP.pl,v <-- syncLDAP.pl new revision: 1.3; previous revision: 1.2 done
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Blocks: 300219
Blocks: 285705
Blocks: 410115
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: