Octillery
is a Go package for sharding databases.
It can use with every OR Mapping library ( xorm
, gorp
, gorm
, dbr
...) implementing database/sql
interface, or raw SQL.
Currently supports MySQL
(for product) and SQLite3
(for testing) .
We need database sharding library in Go. Of course, we know some libraries like ( https://github.com/evalphobia/wizard , https://github.com/go-pg/sharding ). But OR Mapping library they support are restricted and we want to write sharding configuration declaratively, also expect to pluggable for sharding algorithm or database adapter, and expect to configurable sharding key or whether use sequencer or not.
- Supports every OR Mapping library implementing
database/sql
interface (xorm
,gorp
,gorm
,dbr
, ... ) - Supports using
database/sql
( raw SQL ) directly - Pluggable sharding algorithm ( preinstalled algorithms are
modulo
andhashmap
) - Pluggable database adapter ( preinstalled adapters are
mysql
andsqlite3
) - Declarative describing for sharding configuration in
YAML
- Configurable sharding algorithm, database adapter, sharding key, whether use sequencer or not.
- Supports capture read/write queries just before passing to database driver
- Supports database migration by CLI ( powered by
schemalex
) - Supports import seeds from CSV
go get go.knocknote.io/octillery/cmd/octillery
go get go.knocknote.io/octillery
We explain by using posts
table.
posts
table schema is
CREATE TABLE `posts` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
`user_id` bigint unsigned NOT NULL,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `uq_posts_01` (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
And we want to shard this table to four databases for load distribution.
In this case, we can try to two approach according to requirements.
If you want id
to be unique in all databases, you should use this approach.
Architecture of this approach would be the following.
Application create SQL ( like insert into posts (id, user_id, ...) values (null, 1, ...)
), in this point, id value is null because still not decide. In accordance with the above graph, insert this query to one of the databases.
- Application requests id value to sequencer
- Sequencer generates next unique id in all shards
- Sequencer returns id value to application ( ex.
id = 1
) - Replace
id
value fromnull
to1
(ex.insert into posts (id, user_id, ...) values (1, 1, ...)
) - Decide target based of
id
value by sharding algorithm ( defaultmodulo
) and insert record to selected database.
By using sequencer approach, you can get unique id
value in all databases.
Therefore, if you insert multiple records, database records should looks like the following.
posts_shard_1 | posts_shard_2 | posts_shard_3 | posts_shard_4 | ||||
---|---|---|---|---|---|---|---|
id | user_id | id | user_id | id | user_id | id | user_id |
1 | 1 | 2 | 2 | 3 | 3 | 4 | 4 |
5 | 5 | 6 | 6 | 7 | 7 | 8 | 8 |
If you don't care about uniqueness of id
, you can use sharding key approach.
Architecture of this appraoch would be the following.
- Decide target based of
user_id
value by sharding algorithm ( defaultmodulo
) and insert record to selected database.
By using sharding key approach, same id value will appear in multiple databases. Therefore, if you insert multiple records, database record should looks like the following.
posts_shard_1 | posts_shard_2 | posts_shard_3 | posts_shard_4 | ||||
---|---|---|---|---|---|---|---|
id | user_id | id | user_id | id | user_id | id | user_id |
1 | 1 | 1 | 2 | 1 | 3 | 1 | 4 |
2 | 5 | 2 | 6 | 2 | 7 | 2 | 8 |
We explained how to sharding database at section 1. From this we define requirements of database sharding library.
- Know about database sharding configuration
- Capture query just before passing to database driver
- Parse query and find sharding key
- If use sequencer, requests
id
value to sequencer and replace value ofid
column by it - Select sharding target based of sharding key by sharding algorithm
Octillery
CLI tool supports transpose
command.
It replace import statement of database/sql
to go.knocknote.io/octillery/database/sql
.
go.knocknote.io/octillery/database/sql
package has compatible interface of database/sql
.
Therefore, OR Mapping library call Octillery
's interface. and it can capture all queries.
Octillery
use github.com/blastrain/vitess-sqlparser as SQL parser. It implements powered by vitess
and tidb
.
Octillery
supports mysql
and sqlite3
adapter by default.
If you want to use new database adapter, need to the following two steps.
- Write
DBAdapter
interface. ( see https://godoc.org/go.knocknote.io/octillery/connection/adapter ) - Put new adapter file to
go.knocknote.io/octillery/plugin
directory
Octillery
supports modulo
and hashmap
algorithm by default.
If you want to use new algorithm, need to the following two steps.
- Write
ShardingAlgorithm
interface. ( see https://godoc.org/go.knocknote.io/octillery/algorithm ) - Put new algorithm file to
go.knocknote.io/octillery/algorithm
directory
$ go get go.knocknote.io/octillery/cmd/octillery
$ go get go.knocknote.io/octillery
$ octillery transpose
※ --dry-run
option confirms without overwriting
$ octillery install --mysql
databases.yml
default: &default
adapter: mysql
encoding: utf8mb4
username: root
master:
- localhost:3306
tables:
posts:
shard: true
shard_key: user_id
shards:
- post_shard_1:
<<: *default
database: posts_shard_1
- post_shard_2:
<<: *default
database: posts_shard_2
$ octillery migrate --config databases.yml /path/to/schema
※ --dry-run
option confirms migration plan
package main
import (
"go.knocknote.io/octillery"
"go.knocknote.io/octillery/database/sql"
)
func main() {
if err := octillery.LoadConfig("databases.yml"); err != nil {
panic(err)
}
db, _ := sql.Open("mysql", "")
db.QueryRow("...")
}
See GoDoc
$ make deps
If update dependencies, the following
- Modify
glide.yaml
- Run
make update-deps
- Commit
glide.yaml
andglide.lock
$ make test
sqlparser
: https://github.com/blastrain/vitess-sqlparserschemalex
: https://github.com/schemalex/schemalex
Masaaki Goshima (@goccy)
MIT