分布式实时分析数据库citus数据插入性能优化
前言
- 多租户
每个租户的数据按租户ID分片,互不干扰,避免跨库操作。 - 实时数据分析
通过分片将数据打散到各个worker上,查询时由master生成分布式执行计划驱动所有worker并行工作。支持过滤,投影,聚合,join等各类常见算子的下推。
环境
软硬件配置
- CentOS release 6.5 x64物理机(16C/128G/300GB SSD)
- CPU: 2*8core 16核32线程, Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
- PostgreSQL 9.6.2
- citus 6.1.0
- sysbench-1.0.3
机器列表
- 192.168.0.177
- 192.168.0.181~192.168.0.188
postgresql.conf配置
listen_addresses = '*' port = 5432 max_connections = 1100 shared_buffers = 32GB effective_cache_size = 96GB work_mem = 16MB maintenance_work_mem = 2GB min_wal_size = 4GB max_wal_size = 32GB checkpoint_completion_target = 0.9 wal_buffers = 16MB default_statistics_target = 100 shared_preload_libraries = 'citus' checkpoint_timeout = 60min wal_level = replica wal_compression = on wal_level = replica wal_log_hints = on synchronous_commit = off
测试场景
INSERT INTO sbtest1 (id, k, c, pad) VALUES (525449452, 5005, '28491622445-08162085385-16839726209-31171823540-28539137588-93842246002-13643098812-68836434394-95216556185-07917709646', '49165640733-86514010343-02300194630-37380434155-24438915047')
i = sysbench.rand.unique()
i = sysbench.rand.unique() - 2147483648
单机测试
建表
CREATE TABLE sbtest1 ( id integer NOT NULL, k integer NOT NULL DEFAULT 0, c character(120) NOT NULL DEFAULT ''::bpchar, pad character(60) NOT NULL DEFAULT ''::bpchar, PRIMARY KEY (id) ); CREATE INDEX k_1 ON sbtest1(k);
插入数据
src/sysbench --test=src/lua/oltp_insert.lua \ --db-driver=pgsql \ --pgsql-host=127.0.0.1 \ --pgsql-port=5432 \ --pgsql-user=postgres \ --pgsql-db=dbone \ --auto_inc=0 \ --time=10 \ --threads=128 \ --report-interval=1 \ run
测试结果
-bash-4.1$ src/sysbench --test=src/lua/oltp_insert.lua --db-driver=pgsql --pgsql-host=127.0.0.1 --pgsql-port=5432 --pgsql-user=postgres --pgsql-db=dbone --auto_inc=0 --time=20 --threads=128 --report-interval=5 run
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
sysbench 1.0.3 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 128
Report intermediate results every 5 second(s)
Initializing random number generator from current time
Initializing worker threads...
Threads started!
[ 5s ] thds: 128 tps: 138381.74 qps: 138381.74 (r/w/o: 0.00/138381.74/0.00) lat (ms,95%): 2.07 err/s: 0.00 reconn/s: 0.00
[ 10s ] thds: 128 tps: 134268.30 qps: 134268.30 (r/w/o: 0.00/134268.30/0.00) lat (ms,95%): 2.07 err/s: 0.00 reconn/s: 0.00
[ 15s ] thds: 128 tps: 132830.91 qps: 132831.11 (r/w/o: 0.00/132831.11/0.00) lat (ms,95%): 2.07 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 128 tps: 132073.81 qps: 132073.61 (r/w/o: 0.00/132073.61/0.00) lat (ms,95%): 2.03 err/s: 0.00 reconn/s: 0.00
SQL statistics:
queries performed:
read: 0
write: 2688192
other: 0
total: 2688192
transactions: 2688192 (134030.18 per sec.)
queries: 2688192 (134030.18 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 20.0547s
total number of events: 2688192
Latency (ms):
min: 0.10
avg: 0.95
max: 88.80
95th percentile: 2.07
sum: 2554006.85
Threads fairness:
events (avg/stddev): 21001.5000/178.10
execution time (avg/stddev): 19.9532/0.01 资源消耗
-bash-4.1$ iostat sdc -xk 5
...
avg-cpu: %user %nice %system %iowait %steal %idle
69.12 0.00 20.56 0.15 0.00 10.17
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sdc 0.00 25302.60 18.20 705.20 72.80 104019.20 287.79 5.96 8.21 0.81 58.48 citus集群测试
建表
CREATE TABLE sbtest1
(
id integer NOT NULL,
k integer NOT NULL DEFAULT 0,
c character(120) NOT NULL DEFAULT ''::bpchar,
pad character(60) NOT NULL DEFAULT ''::bpchar,
PRIMARY KEY (id)
);
CREATE INDEX k_1 ON sbtest1(k);
set citus.shard_count = 128;
set citus.shard_replication_factor = 1;
select create_distributed_table('sbtest1','id'); 插入数据
/bak/soft/sysbench-1.0.3/src/sysbench --test=/bak/soft/sysbench-1.0.3/src/lua/oltp_insert.lua \ --db-driver=pgsql \ --pgsql-host=127.0.0.1 \ --pgsql-port=5432 \ --pgsql-user=postgres \ --pgsql-db=dbcitus \ --auto_inc=0 \ --time=10 \ --threads=64 \ --report-interval=1 \ run
执行结果
-bash-4.1$ /bak/soft/sysbench-1.0.3/src/sysbench --test=/bak/soft/sysbench-1.0.3/src/lua/oltp_insert.lua --db-driver=pgsql --pgsql-host=127.0.0.1 --pgsql-port=5432 --pgsql-user=postgres --pgsql-db=dbcitus --auto_inc=0 --time=20 --threads=64 --report-interval=5 run
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
sysbench 1.0.3 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 64
Report intermediate results every 5 second(s)
Initializing random number generator from current time
Initializing worker threads...
Threads started!
[ 5s ] thds: 64 tps: 44628.01 qps: 44628.01 (r/w/o: 0.00/44628.01/0.00) lat (ms,95%): 2.48 err/s: 0.00 reconn/s: 0.00
[ 10s ] thds: 64 tps: 44780.80 qps: 44780.80 (r/w/o: 0.00/44780.80/0.00) lat (ms,95%): 2.48 err/s: 0.00 reconn/s: 0.00
[ 15s ] thds: 64 tps: 44701.32 qps: 44701.72 (r/w/o: 0.00/44701.72/0.00) lat (ms,95%): 2.48 err/s: 0.00 reconn/s: 0.00
[ 20s ] thds: 64 tps: 44801.41 qps: 44801.01 (r/w/o: 0.00/44801.01/0.00) lat (ms,95%): 2.48 err/s: 0.00 reconn/s: 0.00
SQL statistics:
queries performed:
read: 0
write: 894715
other: 0
total: 894715
transactions: 894715 (44637.47 per sec.)
queries: 894715 (44637.47 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 20.0421s
total number of events: 894715
Latency (ms):
min: 0.42
avg: 1.43
max: 203.28
95th percentile: 2.48
sum: 1277233.99
Threads fairness:
events (avg/stddev): 13979.9219/71.15
execution time (avg/stddev): 19.9568/0.01 资源消耗
master
[root@node1 ~]# iostat sdc -xk 5
Linux 2.6.32-431.el6.x86_64 (node1) 2017年03月13日 _x86_64_ (32 CPU)
...
avg-cpu: %user %nice %system %iowait %steal %idle
50.61 0.00 17.80 0.00 0.00 31.59
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 其中一个worker
[root@node5 ~]# iostat sdc -xk 5
Linux 2.6.32-431.el6.x86_64 (node5) 2017年03月13日 _x86_64_ (32 CPU)
...
avg-cpu: %user %nice %system %iowait %steal %idle
2.24 0.00 0.63 0.00 0.00 97.13
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sdc 0.00 774.00 0.00 265.80 0.00 4159.20 31.30 0.25 0.96 0.01 0.38 优化:masterless部署
拷贝元数据
CREATE TABLE sbtest1 ( id integer NOT NULL, k integer NOT NULL DEFAULT 0, c character(120) NOT NULL DEFAULT ''::bpchar, pad character(60) NOT NULL DEFAULT ''::bpchar, PRIMARY KEY (id) ); CREATE INDEX k_1 ON sbtest1(k); copy pg_dist_node from PROGRAM 'psql "host=192.168.0.177 port=5432 dbname=dbcitus user=postgres" -Atc "copy pg_dist_node to STDOUT"'; copy pg_dist_partition from PROGRAM 'psql "host=192.168.0.177 port=5432 dbname=dbcitus user=postgres" -Atc "copy pg_dist_partition to STDOUT"'; copy pg_dist_shard from PROGRAM 'psql "host=192.168.0.177 port=5432 dbname=dbcitus user=postgres" -Atc "copy pg_dist_shard to STDOUT"'; copy pg_dist_shard_placement from PROGRAM 'psql "host=192.168.0.177 port=5432 dbname=dbcitus user=postgres" -Atc "copy pg_dist_shard_placement to STDOUT"'; copy pg_dist_colocation from PROGRAM 'psql "host=192.168.0.177 port=5432 dbname=dbcitus user=postgres" -Atc "copy pg_dist_colocation to STDOUT"';
修改oltp_insert.lua
i = sysbench.rand.unique() - 2147483648
i = sysbench.rand.unique() - 2147483648 + 1
i = sysbench.rand.unique() - 2147483648 + 2
i = sysbench.rand.unique() - 2147483648 + 7
准备测试脚本
#!/bin/bash cd /bak/soft/sysbench-1.0.3 /bak/soft/sysbench-1.0.3/src/sysbench /bak/soft/sysbench-1.0.3/src/lua/oltp_insert.lua \ --db-driver=pgsql \ --pgsql-host=127.0.0.1 \ --pgsql-port=5432 \ --pgsql-user=postgres \ --pgsql-db=dbcitus \ --auto_inc=0 \ --time=60 \ --threads=64 \ --report-interval=5 \ run >/tmp/run_oltp_insert.log 2>&1
测试
[root@node1 ~]# for i in `seq 1 8` ; do ssh 192.168.0.18$i /tmp/run_oltp_insert.sh >/dev/null 2>&1 & done [10] 27332 [11] 27333 [12] 27334 [13] 27335 [14] 27336 [15] 27337 [16] 27338 [17] 27339
测试结果
-bash-4.1$ cat /tmp/run_oltp_insert.log
sysbench 1.0.3 (using bundled LuaJIT 2.1.0-beta2)
Running the test with following options:
Number of threads: 64
Report intermediate results every 5 second(s)
Initializing random number generator from current time
Initializing worker threads...
Threads started!
[ 5s ] thds: 64 tps: 25662.78 qps: 25662.78 (r/w/o: 0.00/25662.78/0.00) lat (ms,95%): 6.67 err/s: 2.60 reconn/s: 0.00
[ 10s ] thds: 64 tps: 26225.38 qps: 26225.38 (r/w/o: 0.00/26225.38/0.00) lat (ms,95%): 6.67 err/s: 7.00 reconn/s: 0.00
[ 15s ] thds: 64 tps: 25996.42 qps: 25996.42 (r/w/o: 0.00/25996.42/0.00) lat (ms,95%): 6.79 err/s: 11.40 reconn/s: 0.00
[ 20s ] thds: 64 tps: 25670.36 qps: 25670.36 (r/w/o: 0.00/25670.36/0.00) lat (ms,95%): 6.79 err/s: 18.60 reconn/s: 0.00
[ 25s ] thds: 64 tps: 25620.89 qps: 25620.89 (r/w/o: 0.00/25620.89/0.00) lat (ms,95%): 6.79 err/s: 22.60 reconn/s: 0.00
[ 30s ] thds: 64 tps: 25357.39 qps: 25357.39 (r/w/o: 0.00/25357.39/0.00) lat (ms,95%): 6.91 err/s: 33.40 reconn/s: 0.00
[ 35s ] thds: 64 tps: 25247.67 qps: 25247.67 (r/w/o: 0.00/25247.67/0.00) lat (ms,95%): 6.91 err/s: 34.60 reconn/s: 0.00
[ 40s ] thds: 64 tps: 25069.27 qps: 25069.27 (r/w/o: 0.00/25069.27/0.00) lat (ms,95%): 6.91 err/s: 41.00 reconn/s: 0.00
[ 45s ] thds: 64 tps: 24796.27 qps: 24796.27 (r/w/o: 0.00/24796.27/0.00) lat (ms,95%): 7.04 err/s: 49.40 reconn/s: 0.00
[ 50s ] thds: 64 tps: 24801.00 qps: 24801.00 (r/w/o: 0.00/24801.00/0.00) lat (ms,95%): 7.04 err/s: 47.40 reconn/s: 0.00
[ 55s ] thds: 64 tps: 24752.83 qps: 24752.83 (r/w/o: 0.00/24752.83/0.00) lat (ms,95%): 7.04 err/s: 57.20 reconn/s: 0.00
[ 60s ] thds: 64 tps: 24533.35 qps: 24533.35 (r/w/o: 0.00/24533.35/0.00) lat (ms,95%): 7.17 err/s: 63.60 reconn/s: 0.00
SQL statistics:
queries performed:
read: 0
write: 1518786
other: 0
total: 1518786
transactions: 1518786 (25277.24 per sec.)
queries: 1518786 (25277.24 per sec.)
ignored errors: 1944 (32.35 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 60.0829s
total number of events: 1518786
Latency (ms):
min: 0.47
avg: 2.53
max: 1015.04
95th percentile: 6.91
sum: 3835098.18
Threads fairness:
events (avg/stddev): 23731.0312/213.31
execution time (avg/stddev): 59.9234/0.02 系统负载
-bash-4.1$ iostat sdc -xk 5
Linux 2.6.32-431.el6.x86_64 (node5) 2017年03月13日 _x86_64_ (32 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
47.09 0.00 18.35 0.47 0.00 34.10
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sdc 0.00 4195.60 0.00 19787.60 0.00 95932.80 9.70 0.98 0.05 0.02 42.54 汇总结果
[root@node1 ~]# for i in `seq 1 8` ; do ssh 192.168.0.18$i grep queries: /tmp/run_oltp_insert.log ; done
queries: 1518786 (25277.24 per sec.)
queries: 1587323 (26412.68 per sec.)
queries: 1700562 (28305.06 per sec.)
queries: 1631516 (27151.82 per sec.)
queries: 1615778 (26885.48 per sec.)
queries: 1649236 (27449.03 per sec.)
queries: 1621940 (26993.20 per sec.)
queries: 1554917 (25890.71 per sec.) 数据查询
dbcitus=# select count(1) from sbtest1; count ---------- 12880058 (1 行记录) 时间:73.197 ms
总结
- citus的执行计划生成影响了数据插入的速度,通过Masterless部署可提升到20w/s以上。
- 进一步提升插入性能可以从citus源码入手,根据分片列值做快速SQL分发,避免在master上解析SQL,之前在另一个场景上做过原型,性能可提升10倍以上。
- 的做法是绕过master直接插入数据到worker上的分片表,还可以利用copy或批更新。
参考
- Real-time Inserts:0-50k/s
- Real-time Updates:0-50k/s
- Bulk Copy:100-200k/s
- Masterless Citus:50k/s-500k/s
相关文章