TPC-C test results #332

Merged 4 commits on Nov 29, 2023
91 changes: 14 additions & 77 deletions benchmarks/README.md
@@ -1,86 +1,23 @@
# SPQR benchmarks

## Test #1
The TPC-C (Transaction Processing Performance Council benchmark C) is a standardized test used to measure the performance of database systems under heavy load with a large number of concurrent transactions. It simulates an order-entry environment with many simultaneous users, each executing a mix of operations such as entering new orders, recording payments, and checking stock levels.

The main goal of this test was to show that adding one more shard does not increase query latency. On the one hand this is a fairly obvious statement, but on the other hand we need to make sure of it.
There are many implementations of the TPC-C test; in our experiments we use the [Percona TPC-C Variant](https://github.com/Percona-Lab/sysbench-tpcc).
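As a concrete illustration, a TPC-C run against the router endpoint might be launched roughly like this. This is a minimal sketch, not the exact command we used: the port, user, thread count, and duration are assumptions (taken from the router config shown below), and the `tpcc.lua` flag names follow the sysbench-tpcc README and may vary between versions.

```python
import subprocess

# hypothetical invocation of Percona's sysbench-tpcc (tpcc.lua) against the
# spqr-router endpoint; host, port, user, threads, and duration are assumed
cmd = [
    "./tpcc.lua",
    "--db-driver=pgsql",
    "--pgsql-host=localhost",
    "--pgsql-port=6432",       # router port from the config below
    "--pgsql-user=denchick",
    "--pgsql-db=denchick",
    "--threads=64",            # assumed load level
    "--time=600",              # assumed run duration in seconds
    "--tables=1",
    "--scale=1000",            # warehouses, matching the results table below
    "run",                     # run "prepare" first to load the data set
]
subprocess.run(cmd, check=True)
```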

### Setup

In this test I benchmarked an SPQR installation with only one `spqr-router`:
- I used `sysbench` and its out-of-the-box OLTP test
- `sysbench` and `spqr-router` were running on the same host
- Each shard was PostgreSQL 14 with 8 vCPU, 100% vCPU rate, 32 GB RAM, and a 100 GB local SSD disk
- The host with the router had 16 vCPU, 100% vCPU rate, 32 GB RAM, and a 100 GB local SSD disk
- I ran this test with 2, 4, 8, 16, 32, 64, and 128 shards

I used the `config.py` script to generate the router config and `init.py` to generate `init.sql` (SQL-like code that creates key ranges; a sketch follows the config below).

The router config was like this:

```yaml
log_level: error
host: localhost
router_port: '6432'
admin_console_port: '7432'
grpc_api_port: '7000'
world_shard_fallback: false
show_notice_messages: false
init_sql: init.sql
router_mode: PROXY
frontend_rules:
- db: denchick
usr: denchick
auth_rule:
auth_method: ok
password: ''
pool_mode: SESSION
pool_discard: false
pool_rollback: false
pool_prepared_statement: true
pool_default: false
frontend_tls:
sslmode: disable
backend_rules:
- db: denchick
usr: denchick
auth_rule:
auth_method: md5
password: password
pool_default: false
shards:
shard01:
...

```
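For illustration, here is a minimal sketch of what a key-range generator like `init.py` could look like. The even split of a numeric key space and the `ADD KEY RANGE ... ROUTE TO ...` statement are assumptions made for the sketch; see the SPQR console documentation for the actual syntax.

```python
# sketch: split a numeric sharding key space evenly into one key range per shard
SHARD_COUNT = 8          # the tests used 2..128 shards
KEY_SPACE = 100_000      # assumed size of the sharding key space

def generate_init_sql(shard_count: int, key_space: int) -> str:
    step = key_space // shard_count
    statements = []
    for i in range(shard_count):
        lower = i * step
        upper = key_space if i == shard_count - 1 else (i + 1) * step
        # illustrative statement only; the real SPQR console syntax may differ
        statements.append(
            f"ADD KEY RANGE krid{i + 1} FROM {lower} TO {upper} ROUTE TO shard{i + 1:02d};"
        )
    return "\n".join(statements)

if __name__ == "__main__":
    with open("init.sql", "w") as f:
        f.write(generate_init_sql(SHARD_COUNT, KEY_SPACE) + "\n")
```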

For creating shards I used [Managed Service for PostgreSQL](https://cloud.yandex.com/en/services/managed-postgresql) and its [terraform provider](https://registry.terraform.io/providers/yandex-cloud/yandex/), see `main.tf`.

### Results

The query latency indeed does not increase. Test outputs are stored in the `results/` folder.

## Test #2

The main goal of the test was to compare the query latency with and without using `spqr-router`.

### Setup

- I created a Managed PostgreSQL 14 cluster with 8 vCPU, 100% vCPU rate, and 16 GB RAM
- I used `sysbench` and its out-of-the-box OLTP tests (see the `results/` folder)
- Test data was 100 tables with 10,000,000 rows in each table
- `sysbench` and `spqr-router` were running on the same host
- The host with the router had the same resources as Postgres

I made two runs:

1. connecting directly to the cluster
2. connecting via the router

### Results

Raw Postgres could process **402.42 transactions per second**, and via the router it was **373.76 tps**. The difference is about 7%. Test outputs are stored in the `results/` folder.
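That percentage follows directly from the two measurements:

```python
direct_tps = 402.42   # sysbench connected straight to PostgreSQL
router_tps = 373.76   # sysbench connected through spqr-router

overhead = (direct_tps - router_tps) / direct_tps
print(f"router overhead: {overhead:.1%}")   # router overhead: 7.1%
```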

## Test #3

We ran PostgreSQL on s3.medium (8 vCPU, 100% vCPU rate, 32 GB RAM) instances with 300 GB of storage and default Managed PostgreSQL Cluster settings. In each test run we increased only the shard count.
| Warehouses | Shards    | Total vCPUs | TPS  | TpmC  | TpmC per vCPU |
| ---------- | --------- | ----------- | ---- | ----- | ------------- |
| 1000       | no router | 8           | 433  | 26010 | 3251.25       |
| 1000       | 2         | 16          | 664  | 39840 | 2490          |
| 1000       | 4         | 32          | 875  | 52500 | 1640.625      |
| 1000       | 8         | 64          | 1303 | 78180 | 1221.5625     |
| 1000       | 16        | 128         | 1543 | 92580 | 723.28125     |
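One way to read the table is to compute the speedup over the no-router baseline and compare it with ideal linear scaling. A small sketch:

```python
# (shards, tps) pairs from the table above; the baseline is the no-router run
baseline_tps = 433
runs = [(2, 664), (4, 875), (8, 1303), (16, 1543)]

for shards, tps in runs:
    speedup = tps / baseline_tps
    efficiency = speedup / shards      # 100% would mean perfectly linear scaling
    print(f"{shards:>2} shards: {speedup:.2f}x speedup, {efficiency:.0%} of linear")
```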

As you can see, one router scales the workload decently up to 8 shards.

![TPC-C test results](tpcc.png)

However, at some point adding more routers becomes necessary. Still, there is a lot of room for improving router performance.

You can compare these results with [Vitess and Aurora](https://www.amazon.science/publications/amazon-aurora-on-avoiding-distributed-consensus-for-i-os-commits-and-membership-changes), Performance Results section.
103 changes: 0 additions & 103 deletions benchmarks/results/test1/shards02.md

This file was deleted.

103 changes: 0 additions & 103 deletions benchmarks/results/test1/shards04.md

This file was deleted.
