Open Bug 1032080 Opened 10 years ago Updated 6 years ago

Use sereal for database schema

Categories

(bugzilla.mozilla.org :: General, task)

Production
task
Not set
normal

People

(Reporter: glob, Assigned: dylanAtHome)

Details

(Keywords: perf)

Attachments

(1 file)

we should change bz_schema from using perl/safe to json.

currently bz_schema is a 325k chunk of perl which is evaluated with perl's Safe module on every page load. it would be much faster to use the JSON::XS module to serialise and deserialise the abstract schema. the json version of the same variable is only 53k, and is 10 times faster to deserialise with JSON::XS.

benchmarking (1000 iterations):

  safe:       10 wallclock secs ( 9.62 usr +  0.42 sys = 10.04 CPU) @   99.60/s
  safe_terse:  9 wallclock secs ( 8.65 usr +  0.00 sys =  8.65 CPU) @  115.61/s
  json_xs:     1 wallclock secs ( 0.85 usr +  0.00 sys =  0.85 CPU) @ 1176.47/s
  json_pp:    57 wallclock secs (55.03 usr +  0.00 sys = 55.03 CPU) @   18.17/s
  eval:        5 wallclock secs ( 5.69 usr +  0.00 sys =  5.69 CPU) @  175.75/s
  eval_terse:  5 wallclock secs ( 5.18 usr +  0.00 sys =  5.18 CPU) @  193.05/s

"safe" is the current code
"json_xs" is what i think we should be using
"eval" is a straight "eval" of bz_schema (this isn't used by bugzilla)
"eval_terse" and "safe_terse" are identical to their non-terse counterparts, however they use a minimised Data::Dumper string (61k, generated by setting $Data::Dumper::Indent = 0)
"json_pp" is the pure-perl json code (used if JSON::XS isn't installed)

i also propose moving JSON::XS from optional to mandatory. JSON::XS is an established module and is widely available. if requiring JSON::XS isn't agreeable, then we could use json only if JSON::XS is available.
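For readers who want to reproduce a comparison like this, here is a minimal sketch. It is not the script that produced the numbers above. The `$schema` hash is a tiny hypothetical stand-in for the real abstract schema, and `to_perl` and `from_perl_safe` are illustrative helper names. It also shows the proposed fallback: use JSON::XS when it is installed, otherwise the pure-perl JSON::PP.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);
use Data::Dumper;
use Safe;

# Prefer JSON::XS; fall back to pure-perl JSON::PP (core since perl 5.14).
my $json_class = eval { require JSON::XS; 'JSON::XS' }
    || do { require JSON::PP; 'JSON::PP' };
my $json = $json_class->new->canonical;

# Hypothetical stand-in for the real abstract schema (assumption).
my $schema = {
    bugs => {
        FIELDS => [
            bug_id            => { TYPE => 'MEDIUMSERIAL', NOTNULL => 1 },
            status_whiteboard => { TYPE => 'MEDIUMTEXT',   NOTNULL => 1 },
        ],
    },
};

# Serialise with Data::Dumper; an indent of 0 gives the "terse" form.
sub to_perl {
    my ($data, $indent) = @_;
    local $Data::Dumper::Indent = $indent;
    return Dumper($data);
}

# Evaluate a Data::Dumper string inside a fresh Safe compartment.
sub from_perl_safe {
    my ($string) = @_;
    my $result = Safe->new->reval($string);
    die "Unable to restore schema: $@" if $@;
    return $result;
}

my $perl       = to_perl($schema, 2);    # 2 is Data::Dumper's default indent
my $perl_terse = to_perl($schema, 0);
my $json_text  = $json->encode($schema);

timethese(1000, {
    safe       => sub { from_perl_safe($perl) },
    safe_terse => sub { from_perl_safe($perl_terse) },
    eval       => sub { my $VAR1; eval $perl },
    json       => sub { $json->decode($json_text) },
});
```

With a structure this small the absolute timings mean little; the real schema would have to be substituted to get figures comparable to the ones quoted above.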
I agree that JSON::XS is much better and faster than Data::Dumper + Safe (and much easier to write code and to maintain it). Note that the Sereal module is even faster and produces a smaller string than JSON::XS:

http://search.cpan.org/~yves/Sereal/lib/Sereal.pm

My testing shows the following timing to stringify + perlify ABSTRACT_SCHEMA:

Results for Data::Dumper:
  Length of stored stringified data: 38748
  Encoding + decoding time (ms): 11.6109848022461

Results for JSON::XS:
  Length of stored stringified data: 34129
  Encoding + decoding time (ms): 1.09410285949707

Results for Sereal with Snappy compression:
  Length of stored stringified data: 7050
  Encoding + decoding time (ms): 0.854015350341797

Results for Sereal without compression:
  Length of stored stringified data: 19863
  Encoding + decoding time (ms): 0.761032104492188

In all cases, I called is_deeply() from Test::More to make sure that the decoded string generates the same hashref as the original ABSTRACT_SCHEMA.
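A rough sketch of how such a Sereal measurement could be set up is shown below. It is not the script used for the results above. The `$schema` hash is a small hypothetical stand-in for ABSTRACT_SCHEMA, and the non-core Sereal::Encoder and Sereal::Decoder modules are assumed to be installed.

```perl
use strict;
use warnings;
use Sereal::Encoder;
use Sereal::Decoder;
use Test::More tests => 2;
use Time::HiRes qw(time);

# Hypothetical stand-in for Bugzilla's ABSTRACT_SCHEMA (assumption).
my $schema = {
    bugs => {
        FIELDS => [
            bug_id            => { TYPE => 'MEDIUMSERIAL', NOTNULL => 1 },
            status_whiteboard => { TYPE => 'MEDIUMTEXT',   NOTNULL => 1 },
        ],
    },
};

# One decoder handles both compressed and uncompressed documents.
my $decoder  = Sereal::Decoder->new;
my %encoders = (
    'Sereal without compression'     => Sereal::Encoder->new,
    'Sereal with Snappy compression' => Sereal::Encoder->new(
        { compress => Sereal::Encoder::SRL_SNAPPY() }
    ),
);

for my $name (sort keys %encoders) {
    my $start   = time;
    my $blob    = $encoders{$name}->encode($schema);
    my $decoded = $decoder->decode($blob);
    my $ms      = (time - $start) * 1000;

    printf "%s: %d bytes, %.3f ms\n", $name, length($blob), $ms;

    # Same check as described above: the decoded data must match the original.
    is_deeply($decoded, $schema, "$name round-trips");
}
```

Sereal skips compression for documents below its size threshold, so a stand-in this small will likely show the same length for both encoders; the difference only appears with data the size of the real schema.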
Severity: normal → enhancement
Assignee: glob → database
Assignee: database → dylan
Keywords: perf
Summary: change bz_schema from using perl/safe to json → change bz_schema from using perl/safe to Sereal
Attached patch 1032080_1.patch (deleted)
Work in progress. This currently causes a strange error:

  Updating column status_whiteboard in table bugs ...
  Old: mediumtext NOT NULL
  New: mediumtext DEFAULT '' NOT NULL
  DBD::mysql::db do failed: BLOB, TEXT, GEOMETRY or JSON column 'status_whiteboard' can't have a default value [for Statement "ALTER TABLE bugs ALTER COLUMN status_whiteboard SET DEFAULT ''"] at Bugzilla/DB.pm line 724.
What's the purpose of Bugzilla/Sereal.pm? Why not put the relevant code in Bugzilla/DB/Schema.pm directly?
Sharing the objects is useful for performance. Memcached will use this, and we can also implement dclone in terms of Sereal here. (Will file a bug for that later.)
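To illustrate the idea, here is a minimal sketch of a module that shares one encoder and one decoder and builds a deep copy on top of them. The attached patch is no longer available, so this is not its code: the package name `My::Sereal` and the `encoder`, `decoder` and `dclone` subs are hypothetical, and Sereal is assumed to be installed.

```perl
package My::Sereal;    # hypothetical name, not the patch's Bugzilla/Sereal.pm
use strict;
use warnings;
use Sereal::Encoder;
use Sereal::Decoder;

# Build each object once and reuse it, instead of constructing one per call.
my ($encoder, $decoder);
sub encoder { return $encoder ||= Sereal::Encoder->new }
sub decoder { return $decoder ||= Sereal::Decoder->new }

# Deep copy by encoding and immediately decoding the structure.
sub dclone {
    my ($data) = @_;
    return decoder()->decode(encoder()->encode($data));
}

package main;

my $orig = { columns => ['bug_id', 'status_whiteboard'] };
my $copy = My::Sereal::dclone($orig);

# Changing the copy must not affect the original.
$copy->{columns}[0] = 'changed';
print "original: $orig->{columns}[0], copy: $copy->{columns}[0]\n";
```

Any other caller, such as a Memcached layer, could then call `My::Sereal::encoder()` and `My::Sereal::decoder()` rather than creating its own objects.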
Assignee: dylan → dylan
Component: Database → General
Product: Bugzilla → bugzilla.mozilla.org
QA Contact: default-qa
Summary: change bz_schema from using perl/safe to Sereal → Use sereal for database schema
Version: unspecified → Production
Type: enhancement → task