1. 前言
中启乘数科技是一家专业的PostgreSQL和Greenplum数据库服务提供商,专注于数据库极致的性能,当前PostgreSQL 14beta1已经发布,其中一个激动人心的特性就是当数据库有海量连接时,其它连接的性能基本不下降。下面我们通过实际的测试来说明。
2. 测试环境和方法
本着不详细介绍测试方法和测试环境的测试都是耍流氓的精神,我们把测试环境和方法详细介绍如下:
我们准备在一台物理机器上准备两套环境,一套是PostgreSQL 13.3,另一套是PostgreSQL 14beta1。
物理机器的硬件为2路服务器,CPU为:Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz,内存为256GB。
我们使用initdb
命令初始化数据库,然后修改最大连接数:
max_connections = 11000
其它参数都不修改,默认shared_buffer为128MB。
然后用pgbench造2千万行数据:
pgbench -i -s 200
PostgreSQL 14与PostgreSQL 13在造数据上花的时间差不多,稍微快一点:
PostgreSQL 14:
[pg14@pg01 ~]$ time pgbench -i -s 200
dropping old tables...
NOTICE: table "pgbench_accounts" does not exist, skipping
...
20000000 of 20000000 tuples (100%) done (elapsed 15.45 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 23.67 s (drop tables 0.00 s, create tables 0.06 s, client-side generate 15.54 s, vacuum 2.70 s, primary keys 5.37 s).
real 0m23.674s
user 0m5.027s
sys 0m0.138s
PostgreSQL 13:
[pg13@pg01 ~]$ time pgbench -i -s 200
dropping old tables...
NOTICE: table "pgbench_accounts" does not exist, skipping
...
...
...
20000000 of 20000000 tuples (100%) done (elapsed 17.23 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 26.41 s (drop tables 0.00 s, create tables 0.06 s, client-side generate 17.32 s, vacuum 2.69 s, primary keys 6.34 s).
real 0m26.409s
user 0m5.022s
sys 0m0.117s
测试方法的思路是我们给数据库建9000个执行pg_sleep(2000)的空闲连接,然后在用pgbench做只读压力测试。
建9000个pg_sleep(2000)连接的方法为:
我们建一个sleep.sql的文件,内容如下:
select pg_sleep(2000);
然后执行下面的命令
pgbench -c 9000 -f sleep.sql
而我们压力测试的命令为:
pgbench -S -c 64 -j 96 -M prepared -T30 -P 2
3. 实际测试
3.1 开启9000个空闲连接的测试
测试PostgreSQL 13.3
先开9000个执行pg_sleep()的连接:
[pg13@pg01 ~]$ pgbench -c 9000 -f sleep.sql
starting vacuum...end.
然后压力测试:
[pg13@pg01 ~]$ pgbench -S -c 128 -j 16 -M prepared -T30 -P 2
starting vacuum...end.
progress: 2.0 s, 259526.5 tps, lat 0.446 ms stddev 0.185
progress: 4.0 s, 268625.8 tps, lat 0.437 ms stddev 0.131
...
...
progress: 30.0 s, 272168.1 tps, lat 0.462 ms stddev 0.135
transaction type: <builtin: select only>
scaling factor: 200
query mode: prepared
number of clients: 128
number of threads: 16
duration: 30 s
number of transactions actually processed: 8120358
latency average = 0.454 ms
latency stddev = 0.151 ms
tps = 270446.442022 (including connections establishing)
tps = 271185.174447 (excluding connections establishing)
测试PostgreSQL 14beta1
先开9000个执行pg_sleep()的连接:
[pg14@pg01 ~]$ pgbench -c 9000 -f sleep.sql
starting vacuum...end.
然后压力测试:
[pg14@pg01 ~]$ pgbench -S -c 128 -j 16 -M prepared -T30 -P 2
starting vacuum...end.
progress: 2.0 s, 324460.7 tps, lat 0.335 ms stddev 0.475
progress: 4.0 s, 384877.4 tps, lat 0.316 ms stddev 0.454
progress: 6.0 s, 385626.7 tps, lat 0.316 ms stddev 0.467
...
...
progress: 28.0 s, 557425.8 tps, lat 0.213 ms stddev 0.152
progress: 30.0 s, 557706.0 tps, lat 0.212 ms stddev 0.145
pgbench (PostgreSQL) 14.0
transaction type: <builtin: select only>
scaling factor: 200
query mode: prepared
number of clients: 128
number of threads: 16
duration: 30 s
number of transactions actually processed: 15378932
latency average = 0.231 ms
latency stddev = 0.256 ms
initial connection time = 216.714 ms
tps = 515524.706846 (without initial connection time)
可以看到在PostgreSQL 13中只能到达27万,而在PostgreSQL 14中就可以达到51万了,有近翻倍的性能提升。
极端测试: 20000万个空闲连接
PostgreSQL 13.3
[pg13@pg01 ~]$ pgbench -S -c 128 -j 16 -M prepared -T30 -P 2
starting vacuum...end.
progress: 30.0 s, 146343.3 tps, lat 0.447 ms stddev 0.176
transaction type: <builtin: select only>
scaling factor: 200
query mode: prepared
number of clients: 128
number of threads: 16
duration: 30 s
number of transactions actually processed: 4392927
latency average = 0.447 ms
latency stddev = 0.176 ms
tps = 146325.374803 (including connections establishing)
tps = 155519.134507 (excluding connections establishing)
PostgreSQL 14beta1
[pg14@pg01 ~]$ pgbench -S -c 128 -j 16 -M prepared -T30 -P 2
starting vacuum...end.
progress: 2.0 s, 390343.4 tps, lat 0.234 ms stddev 0.196
progress: 4.0 s, 542178.6 tps, lat 0.219 ms stddev 0.152
progress: 6.0 s, 550231.9 tps, lat 0.216 ms stddev 0.145
progress: 8.0 s, 550509.8 tps, lat 0.215 ms stddev 0.153
progress: 10.0 s, 554279.3 tps, lat 0.214 ms stddev 0.144
progress: 12.0 s, 552410.2 tps, lat 0.215 ms stddev 0.148
progress: 14.0 s, 554254.9 tps, lat 0.214 ms stddev 0.147
progress: 16.0 s, 554077.1 tps, lat 0.214 ms stddev 0.149
progress: 18.0 s, 552176.4 tps, lat 0.215 ms stddev 0.148
progress: 20.0 s, 554539.0 tps, lat 0.214 ms stddev 0.144
progress: 22.0 s, 550789.9 tps, lat 0.215 ms stddev 0.154
progress: 24.0 s, 554287.8 tps, lat 0.214 ms stddev 0.144
progress: 26.0 s, 552759.7 tps, lat 0.215 ms stddev 0.149
progress: 28.0 s, 554362.7 tps, lat 0.214 ms stddev 0.148
pgbench (PostgreSQL) 14.0
transaction type: <builtin: select only>
scaling factor: 200
query mode: prepared
number of clients: 128
number of threads: 16
duration: 30 s
number of transactions actually processed: 16240811
latency average = 0.216 ms
latency stddev = 0.160 ms
initial connection time = 460.811 ms
tps = 548627.931145 (without initial connection time)
可以看到PostgreSQL 14还是可以达到50多万的TPS,而PostgreSQL 13.3就只能是14万了,差距巨大。
3.3 进一步的测试
我们不建空闲连接,直接不断的改变连接数,看看解结果是怎样的:
pgbench -S -c <连接数> -j 16 -M prepared -T30 -P 2
压测时只改变上面的-c
的连接数。测试的过程这里就不写出来,直接给出结果:
连接数 | PostgreSQL 13.3 | PostgreSQL 14beta1 |
---|---|---|
64 | 496841 | 513140 |
128 | 528556 | 552778 |
256 | 475030 | 497435 |
1000 | 347863 | 402369 |
5000 | 205857 | 310090 |
10000 | 124371 | 248701 |
4. 结论
PostgtreSQL 14可以支持大量的空闲连接,这些空闲连接对执行的SQL的性能影响很小,但在PostgreSQL 13,会有很大的影响。所以从PostgreSQL 14之后,数据库可以支持上万个空闲连接,以前为了支持很多连接需要上连接池,现在可以不用了。
5. 原理简介
PostgreSQL 14主要是优化了GetSnapshotData()
的性能,原先是随连接数增加,这个函数的性能是线性下降,从PostgreSQL 14之后就不是了。这个优化总共提交了7个PATCH,如果想进一步了解的同学可以见:
- snapshot scalability: Don’t compute global horizons while building snapshots.
- snapshot scalability: Move PGXACT->xmin back to PGPROC.
- snapshot scalability: Move PGXACT->vacuumFlags to ProcGlobal->vacuumFlags.
- snapshot scalability: Move subxact info to ProcGlobal, remove PGXACT.
- snapshot scalability: Introduce dense array of in-progress xids.
- snapshot scalability: cache snapshots using a xact completion counter.
- Fix race condition in snapshot caching when 2PC is used.