Closed Bug 1194221 Opened 9 years ago Closed 9 years ago

Change the reference data varchar column collations to utf8_bin instead of utf8_ci

Categories

(Tree Management :: Treeherder, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mdoglio, Assigned: mdoglio)

References

Details

Attachments

(1 file)

We don't need case sensitive filtering/sorting and using utf8_ci gives you better performance in query execution.
Blocks: 1193836
Also, the django orm doesn't support non-default collation type on varchar/text fields.
Given that the target tables are quite small we probably don't need to do anything fancy other than an alter table. :sheeri what do you think about it? The tables I'm talking about are those ones defined here https://github.com/mozilla/treeherder/blob/cc52bf7f35077ec529e35e0236948fc2a1f89c29/treeherder/model/sql/template_schema/treeherder_reference_1.sql.tmpl
Assignee: nobody → mdoglio
Flags: needinfo?(scabral)
Ed Morley specifically used utf8_bin in https://bugzilla.mozilla.org/show_bug.cgi?id=1185030#c2 If the tables are small, does the performance difference matter? I ask b/c I assume in the bug Ed had good reason to change to a case-sensitive collation. But yeah, we can change utf8_bin in the tables other than the datasource table (and anything else that uses tokens).
Flags: needinfo?(scabral)
(In reply to Sheeri Cabral [:sheeri] from comment #3) > Ed Morley specifically used utf8_bin in > https://bugzilla.mozilla.org/show_bug.cgi?id=1185030#c2 That was not changing the collation, but setting a unique key, using the same collation as was there previously - so don't let it affect the decision here :-)
ok I'll go ahead and change the sql template files we have in our repository. I guess I can produce a sql script to apply the change on staging and I'll coordinate with sheeri to do the same on production if that goes well.
OK, sounds like a plan.
Attached file treeherder_alter_collation.sql (deleted) —
I tried to run this on staging. I put some notes in the file indicating the failures I got.
I'm putting this work on hold until we have a way to clean the reference data tables.
Depends on: 1195782
Since the final goal is to have a sql schema consistent with what the django orm would produce, I'm going to do it the other way around: utf8_bin everywhere! It should be sufficient to: 1- change the default db collation to utf8_bin 2- change the collation of those few utf8_unicode_ci tables to utf8_bin
Is this something we will undo in the future?
(In reply to Mauro Doglio [:mdoglio] from comment #9) > Since the final goal is to have a sql schema consistent with what the django > orm would produce, I'm going to do it the other way around: utf8_bin > everywhere! It should be sufficient to: > 1- change the default db collation to utf8_bin > 2- change the collation of those few utf8_unicode_ci tables to utf8_bin Updating the summary to reflect that.
Summary: Change all the reference data varchar columns to use utf8_ci instead of utf8_bin → Change the reference data varchar column collations to utf8_bin instead of utf8_ci
Depends on: 1201087
So, the one note I see that specifically talk about an error is: /*altering the reference_data_signatures table raise errors for unique key constraints*/ alter table `reference_data_signatures` modify `name` varchar(255) character set utf8 collate utf8_general_ci; ... When moving from utf8_bin to utf8_general_ci, ci = case insensitive, so if one value is "foo" and another "FOO" that can cause this type of error. Comment 11 shows that you want to change the case insensitive stuff to binary (which is case sensitive) so this shouldn't be a problem. Let me know if there's anything else I can help with. default charset is set to utf8 but there's no collation set, which defaults to utf8_general_ci. Do you want me to change the default collation on treeherder_stage and treeherder? Default collation is what new tables created will get.
:sheeri yes, that would be great.
OK, changing server default in bug 1201087. If you want me to change db defaults, let me know.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Blocks: 1303767
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: