Deploying a Highly Available PostgreSQL Environment with Patroni
1. Introduction
PostgreSQL is an open-source database whose features, performance, and reliability rival those of high-end commercial databases. Its license and ecosystem are completely open and not controlled by any single company or country, so users have no lock-in worries. A growing number of enterprises in China are replacing expensive commercial databases with PostgreSQL.
When deploying PostgreSQL to production, choosing a suitable high-availability (HA) solution is essential. This article describes how to deploy a PostgreSQL HA environment based on Patroni, for your reference.
There are many open-source HA tools for PostgreSQL; the following are among the most commonly used:
PAF (PostgreSQL Automatic Failover)
repmgr
Patroni
A comparison of them is available at: https://scalegrid.io/blog/managing-high-availability-in-postgresql-part-1/
Among these, Patroni is not only simple to use but also very powerful. It supports:
Automatic failover and on-demand switchover
One or more standby nodes
Cascading replication
Synchronous and asynchronous replication
Automatic degradation from synchronous to asynchronous replication when the synchronous standby fails (similar in effect to MySQL's semi-synchronous replication, but smarter)
Control over whether a given node participates in leader election, participates in load balancing, and whether it can become a synchronous standby
Automatic repair of a former primary via pg_rewind
Multiple ways to initialize a cluster and rebuild a standby, including pg_basebackup and custom scripts supporting backup tools such as wal_e, pgBackRest, and barman
Custom external callback scripts
A REST API
Watchdog support to prevent split-brain
Deployment in containerized environments such as k8s and docker
Metadata storage in several common DCSes (Distributed Configuration Stores), including etcd, ZooKeeper, Consul, and Kubernetes
Therefore, unless you have only 2 machines and none to spare for a DCS, Patroni is a highly recommended PostgreSQL HA tool. The rest of this article describes in detail how to build a PostgreSQL HA environment with Patroni.
2. Experimental Environment
Main software
CentOS 7.8
PostgreSQL 12
Patroni 1.6.5
etcd 3.3.25
Machines and VIP resources
PostgreSQL
node1: 192.168.234.201
node2: 192.168.234.202
node3: 192.168.234.203
etcd
node4: 192.168.234.204
VIP
Read-write VIP: 192.168.234.210
Read-only VIP: 192.168.234.211
Environment preparation
Synchronize the clocks on all nodes:
yum install -y ntpdate
ntpdate time.windows.com && hwclock -w
If a firewall is in use, the ports for postgres, etcd, and patroni must be opened.
postgres: 5432
patroni: 8008
etcd: 2379/2380
A simpler approach is to disable the firewall entirely:
setenforce 0
sed -i.bak "s/SELINUX=enforcing/SELINUX=permissive/g" /etc/selinux/config
systemctl disable firewalld.service
systemctl stop firewalld.service
iptables -F
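If the firewall stays enabled, it is worth verifying that the ports listed above are actually reachable. The following is a minimal sketch (my own helper, not part of any tool mentioned here) that checks TCP connectivity with a timeout:

```python
import socket

def check_port(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Ports used in this deployment: postgres 5432, patroni 8008, etcd 2379/2380.
for port in (5432, 8008, 2379, 2380):
    print(port, check_port("127.0.0.1", port))
```

Run it on each node (substituting the peer addresses) before starting the cluster to catch firewall mistakes early.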
3. etcd Deployment
Since etcd high availability is not the topic of this article, only a single-node etcd is deployed on node4 for the experiment. In production, at least 3 etcd nodes are required; they can run on dedicated machines or share machines with the databases. The etcd deployment steps are as follows.
Install the required packages:
yum install -y gcc python-devel epel-release
Install etcd:
yum install -y etcd
Edit the etcd configuration file /etc/etcd/etcd.conf; a reference configuration follows:
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://192.168.234.204:2380"
ETCD_LISTEN_CLIENT_URLS="http://localhost:2379,http://192.168.234.204:2379"
ETCD_NAME="etcd0"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.234.204:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.234.204:2379"
ETCD_INITIAL_CLUSTER="etcd0=http://192.168.234.204:2380"
ETCD_INITIAL_CLUSTER_TOKEN="cluster1"
ETCD_INITIAL_CLUSTER_STATE="new"
Start etcd:
systemctl start etcd
Enable etcd to start at boot:
systemctl enable etcd
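A common mistake in this configuration is an ETCD_INITIAL_CLUSTER entry that does not match ETCD_NAME and the advertised peer URL. The following sketch (parse_env is my own illustrative helper, not part of etcd) checks that consistency before the service is started:

```python
# Minimal parser for etcd's env-style config file (illustrative, not etcd code).
def parse_env(text):
    """Parse KEY="value" lines into a dict, ignoring blanks and comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, val = line.partition("=")
        conf[key] = val.strip().strip('"')
    return conf

sample = '''
ETCD_NAME="etcd0"
ETCD_INITIAL_CLUSTER="etcd0=http://192.168.234.204:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.234.204:2380"
'''
conf = parse_env(sample)
# Each member in ETCD_INITIAL_CLUSTER maps a node name to its peer URL;
# this node's own name must map to its advertised peer URL.
members = dict(m.split("=", 1) for m in conf["ETCD_INITIAL_CLUSTER"].split(","))
assert members[conf["ETCD_NAME"]] == conf["ETCD_INITIAL_ADVERTISE_PEER_URLS"]
print("initial cluster members:", members)
```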
4. PostgreSQL + Patroni HA Deployment
Install the related software on every node that will run PostgreSQL.
Install PostgreSQL 12:
yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
yum install -y postgresql12-server postgresql12-contrib
Install Patroni:
yum install -y gcc epel-release
yum install -y python-pip python-psycopg2 python-devel
pip install --upgrade pip
pip install --upgrade setuptools
pip install patroni[etcd]
Create the PostgreSQL data directory:
mkdir -p /pgsql/data
chown postgres:postgres -R /pgsql
chmod -R 700 /pgsql/data
Create the Patroni service configuration file /etc/systemd/system/patroni.service:

[Unit]
Description=Runners to orchestrate a high-availability PostgreSQL
After=syslog.target network.target

[Service]
Type=simple
User=postgres
Group=postgres
#StandardOutput=syslog
ExecStart=/usr/bin/patroni /etc/patroni.yml
ExecReload=/bin/kill -s HUP $MAINPID
KillMode=process
TimeoutSec=30
Restart=no

[Install]
WantedBy=multi-user.target
Create the Patroni configuration file /etc/patroni.yml; the following is an example for node1:

scope: pgsql
namespace: /service/
name: pg1

restapi:
  listen: 0.0.0.0:8008
  connect_address: 192.168.234.201:8008

etcd:
  host: 192.168.234.204:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    master_start_timeout: 300
    synchronous_mode: false
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        listen_addresses: "0.0.0.0"
        port: 5432
        wal_level: logical
        hot_standby: "on"
        wal_keep_segments: 100
        max_wal_senders: 10
        max_replication_slots: 10
        wal_log_hints: "on"
  initdb:
  - encoding: UTF8
  - locale: C
  - lc-ctype: zh_CN.UTF-8
  - data-checksums
  pg_hba:
  - host replication repl 0.0.0.0/0 md5
  - host all all 0.0.0.0/0 md5

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 192.168.234.201:5432
  data_dir: /pgsql/data
  bin_dir: /usr/pgsql-12/bin
  authentication:
    replication:
      username: repl
      password: "123456"
    superuser:
      username: postgres
      password: "123456"
  basebackup:
    max-rate: 100M
    checkpoint: fast

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
For the full meaning of each parameter, refer to YAML Configuration Settings in the Patroni manual; PostgreSQL parameters can be supplemented as needed.
On the other PG nodes, patroni.yml must be adjusted in the following 3 places:
name
Set to pg1 through pg3 on node1 through node3 respectively
restapi.connect_address
Set according to each node's own IP
postgresql.connect_address
Set according to each node's own IP
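The ttl, loop_wait, and retry_timeout values in the bootstrap.dcs section interact. A commonly cited guideline (see the Patroni documentation for the authoritative rule; this checker is my own sketch) is that loop_wait + 2 * retry_timeout should not exceed ttl, so a leader that goes through a full retry cycle can still renew its leader key before it expires:

```python
# Sanity-check the DCS timing parameters (rule of thumb, not Patroni code).
def dcs_timings_ok(ttl, loop_wait, retry_timeout):
    """True when a leader can survive one full retry cycle within the ttl."""
    return loop_wait + 2 * retry_timeout <= ttl

print(dcs_timings_ok(ttl=30, loop_wait=10, retry_timeout=10))  # settings above -> True
print(dcs_timings_ok(ttl=30, loop_wait=10, retry_timeout=15))  # too aggressive -> False
```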
Start Patroni
First, start Patroni on node1:
systemctl start patroni
On first startup, Patroni initializes the PostgreSQL instance and creates the users.
[root@node1 ~]# systemctl status patroni
● patroni.service - Runners to orchestrate a high-availability PostgreSQL
   Loaded: loaded (/etc/systemd/system/patroni.service; disabled; vendor preset: disabled)
   Active: active (running) since Sat 2020-09-05 14:41:03 CST; 38min ago
 Main PID: 1673 (patroni)
   CGroup: /system.slice/patroni.service
           ├─1673 /usr/bin/python2 /usr/bin/patroni /etc/patroni.yml
           ├─1717 /usr/pgsql-12/bin/postgres -D /pgsql/data --config-file=/pgsql/data/postgresql.conf --listen_addresses=0.0.0.0 --max_worker_processe...
           ├─1719 postgres: pgsql: logger
           ├─1724 postgres: pgsql: checkpointer
           ├─1725 postgres: pgsql: background writer
           ├─1726 postgres: pgsql: walwriter
           ├─1727 postgres: pgsql: autovacuum launcher
           ├─1728 postgres: pgsql: stats collector
           ├─1729 postgres: pgsql: logical replication launcher
           └─1732 postgres: pgsql: postgres postgres 127.0.0.1(37154) idle
Then start Patroni on node2. node2 joins the cluster as a replica, automatically copying data from the leader and establishing replication.
[root@node2 ~]# systemctl status patroni
● patroni.service - Runners to orchestrate a high-availability PostgreSQL
   Loaded: loaded (/etc/systemd/system/patroni.service; disabled; vendor preset: disabled)
   Active: active (running) since Sat 2020-09-05 16:09:06 CST; 3min 41s ago
 Main PID: 1882 (patroni)
   CGroup: /system.slice/patroni.service
           ├─1882 /usr/bin/python2 /usr/bin/patroni /etc/patroni.yml
           ├─1898 /usr/pgsql-12/bin/postgres -D /pgsql/data --config-file=/pgsql/data/postgresql.conf --listen_addresses=0.0.0.0 --max_worker_processe...
           ├─1900 postgres: pgsql: logger
           ├─1901 postgres: pgsql: startup   recovering 000000010000000000000003
           ├─1902 postgres: pgsql: checkpointer
           ├─1903 postgres: pgsql: background writer
           ├─1904 postgres: pgsql: stats collector
           ├─1912 postgres: pgsql: postgres postgres 127.0.0.1(35924) idle
           └─1916 postgres: pgsql: walreceiver   streaming /3000060
Check the cluster status:

[root@node2 ~]# patronictl -c /etc/patroni.yml list
+ Cluster: pgsql (6868912301204081018) -------+----+-----------+
| Member | Host            | Role   | State   | TL | Lag in MB |
+--------+-----------------+--------+---------+----+-----------+
| pg1    | 192.168.234.201 | Leader | running |  1 |           |
| pg2    | 192.168.234.202 |        | running |  1 |         0 |
+--------+-----------------+--------+---------+----+-----------+
To simplify daily operations, set the global environment variable PATRONICTL_CONFIG_FILE:
echo 'export PATRONICTL_CONFIG_FILE=/etc/patroni.yml' >/etc/profile.d/patroni.sh
Add the following environment variables to ~postgres/.bash_profile:
export PGDATA=/pgsql/data
export PATH=/usr/pgsql-12/bin:$PATH
Grant postgres passwordless sudo privileges:
echo 'postgres ALL=(ALL) NOPASSWD: ALL'> /etc/sudoers.d/postgres
5. Automatic Failover and Split-Brain Protection
5.1 How Patroni Prevents Split-Brain
Patroni takes several measures to prevent split-brain:
When PostgreSQL on a non-leader node is found running in production (read-write) mode, Patroni restarts it and brings it back as a standby in recovery mode.
When the patroni process on the leader node cannot connect to etcd, it can no longer be sure it is still the leader, so it demotes the local PostgreSQL to a standby.
When patroni is stopped normally, it also stops the local PostgreSQL process.
To cover failures these measures cannot handle, Patroni supports a watchdog that reboots the host if Patroni stops updating it. Modify /etc/systemd/system/patroni.service so that the softdog kernel module is loaded and the watchdog device is owned by postgres before Patroni starts:

[Unit]
Description=Runners to orchestrate a high-availability PostgreSQL
After=syslog.target network.target

[Service]
Type=simple
User=postgres
Group=postgres
#StandardOutput=syslog
ExecStartPre=-/usr/bin/sudo /sbin/modprobe softdog
ExecStartPre=-/usr/bin/sudo /bin/chown postgres /dev/watchdog
ExecStart=/usr/bin/patroni /etc/patroni.yml
ExecReload=/bin/kill -s HUP $MAINPID
KillMode=process
TimeoutSec=30
Restart=no

[Install]
WantedBy=multi-user.target

Enable the service:

systemctl enable patroni

Then add the following to /etc/patroni.yml:

watchdog:
  mode: automatic # Allowed values: off, automatic, required
  device: /dev/watchdog
  safety_margin: 5

safety_margin specifies how long before the leader key expires the watchdog triggers a reboot if Patroni has not updated it in time. With this example's configuration (ttl=30, loop_wait=10, safety_margin=5), the patroni process updates the leader key and the watchdog every 10 seconds (loop_wait). If a fault on the leader node prevents the patroni process from updating the watchdog in time, the watchdog reboots the machine 5 seconds before the leader key expires. If the reboot completes within those 5 seconds, the leader node still has a chance to reacquire the leader lock; otherwise, once the leader key expires, the standbys elect a new leader.

5.2 Synchronous Replication Mode

Synchronous replication further reduces the risk of data loss. Set synchronous mode in /etc/patroni.yml:

synchronous_mode: true

or change it dynamically:

patronictl edit-config -s 'synchronous_mode=true'

In synchronous mode, Patroni adjusts synchronous_standby_names on the primary to switch between synchronous and asynchronous replication, and records the synchronization state in etcd to keep it consistent across the Patroni cluster:

[root@node4 ~]# etcdctl get /service/cn/sync
{"leader":"pg1","sync_standby":"pg2"}
[root@node4 ~]# etcdctl get /service/cn/sync
{"leader":"pg1","sync_standby":null}

To forbid automatic degradation to asynchronous replication when the synchronous standby fails, additionally set strict synchronous mode:

synchronous_mode: true
synchronous_mode_strict: true

To keep a particular node out of leader election and out of the synchronous standby set, configure its tags:

tags:
  nofailover: true
  noloadbalance: true
  clonefrom: false
  nosync: true
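The watchdog timing described above can be made concrete. This sketch is my own illustration of the arithmetic, not Patroni's implementation:

```python
# With the watchdog armed, the device fires ttl - safety_margin seconds after
# the last successful keepalive (an illustration of the timing explained above).
def watchdog_fire_time(ttl, safety_margin):
    """Seconds after the last update at which the watchdog reboots the host."""
    return ttl - safety_margin

ttl, loop_wait, safety_margin = 30, 10, 5
fire = watchdog_fire_time(ttl, safety_margin)
print(f"watchdog reboots {fire}s after the last update, "
      f"{ttl - fire}s before the leader key expires")
```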
5.3 Impact of etcd Being Unreachable

When Patroni cannot access etcd, the leader cannot confirm it still holds the leader lock, so it demotes the local PostgreSQL and the cluster becomes read-only. If availability matters more, one workaround is to set an extremely large retry_timeout, for example 10,000 days, while relying on synchronous replication mode to prevent split-brain:

retry_timeout: 864000000
synchronous_mode: true

retry_timeout controls the retry timeout for operations against both the DCS and PostgreSQL. For operations that can be retried, Patroni limits not only the elapsed time but also the number of retries. For PostgreSQL operations, currently only calls to the GET /patroni REST API appear to be retried, and at most once, so raising retry_timeout should not bring other side effects.
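The "10,000 days" figure above corresponds directly to the retry_timeout value in seconds:

```python
# retry_timeout is given in seconds; 10,000 days works out to the value above.
retry_timeout = 10000 * 24 * 3600
print(retry_timeout)  # 864000000
```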
6. Daily Operations

Routine maintenance of Patroni and PostgreSQL, such as changing PostgreSQL parameters, is performed with the patronictl command.

[postgres@node2 ~]$ patronictl --help
Usage: patronictl [OPTIONS] COMMAND [ARGS]...

Options:
  -c, --config-file TEXT  Configuration file
  -d, --dcs TEXT          Use this DCS
  -k, --insecure          Allow connections to SSL sites without certs
  --help                  Show this message and exit.

Commands:
  configure    Create configuration file
  dsn          Generate a dsn for the provided member, defaults to a dsn of...
  edit-config  Edit cluster configuration
  failover     Failover to a replica
  flush        Discard scheduled events (restarts only currently)
  history      Show the history of failovers/switchovers
  list         List the Patroni members for a given Patroni
  pause        Disable auto failover
  query        Query a Patroni PostgreSQL member
  reinit       Reinitialize cluster member
  reload       Reload cluster member configuration
  remove       Remove cluster from DCS
  restart      Restart cluster member
  resume       Resume auto failover
  scaffold     Create a structure for the cluster in DCS
  show-config  Show cluster configuration
  switchover   Switchover to a replica
  version      Output version of patronictl command or a running Patroni...
6.1 Modifying PostgreSQL Parameters

To change a parameter on a single node only, use the ALTER SYSTEM SET ... SQL command, for example to temporarily enable debug logging on one node. Parameters that should be uniform across the cluster should be changed with patronictl edit-config to keep them globally consistent, for example max_connections:

patronictl edit-config -p 'max_connections=300'

After changing a parameter that requires a restart, the affected members show a Pending restart flag:

[postgres@node2 ~]$ patronictl list
+ Cluster: pgsql (6868912301204081018) -------+----+-----------+-----------------+
| Member | Host            | Role   | State   | TL | Lag in MB | Pending restart |
+--------+-----------------+--------+---------+----+-----------+-----------------+
| pg1    | 192.168.234.201 | Leader | running | 25 |           | *               |
| pg2    | 192.168.234.202 |        | running | 25 |         0 | *               |
+--------+-----------------+--------+---------+----+-----------+-----------------+

Restart the cluster members for the change to take effect:

patronictl restart pgsql
6.2 Checking Patroni Node Status

Normally patronictl list shows each node's state. For more detailed status information, call the REST API; for example, when a surviving node fails to become leader after the leader lock expires, the detailed status helps investigate the cause.

curl -s http://127.0.0.1:8008/patroni | jq

[root@node2 ~]# curl -s http://127.0.0.1:8008/patroni | jq
{
  "database_system_identifier": "6870146304839171063",
  "postmaster_start_time": "2020-09-13 09:56:06.359 CST",
  "timeline": 23,
  "cluster_unlocked": true,
  "watchdog_failed": true,
  "patroni": {
    "scope": "cn",
    "version": "1.6.5"
  },
  "state": "running",
  "role": "replica",
  "xlog": {
    "received_location": 201326752,
    "replayed_timestamp": null,
    "paused": false,
    "replayed_location": 201326752
  },
  "server_version": 120004
}

Here "watchdog_failed": true means a watchdog is configured but the watchdog device cannot be accessed, so this node cannot be promoted to leader.
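The REST API output can also be consumed programmatically, for example by a monitoring script that flags nodes unable to take the leader lock. A sketch using an abridged version of the sample output (promotable is my own helper, not part of Patroni):

```python
import json

# Abridged /patroni output from the example above (only the fields we use).
status_json = '''
{
  "state": "running",
  "role": "replica",
  "cluster_unlocked": true,
  "watchdog_failed": true
}
'''

def promotable(status):
    """Rough screen: a running node with a failed watchdog cannot take the lock."""
    return status["state"] == "running" and not status.get("watchdog_failed", False)

status = json.loads(status_json)
print(promotable(status))  # False: watchdog_failed blocks promotion
```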
7. Client Access Configuration

Since the primary can move after a failover or switchover, clients need a way to always reach the current primary (or a readable standby). Common approaches include:

- Multi-host URLs
- VIP
- haproxy

7.1 Multi-Host URLs

The PostgreSQL JDBC driver supports multi-host URLs with targetServerType:

- jdbc:postgresql://192.168.234.201:5432,192.168.234.202:5432,192.168.234.203:5432/postgres?targetServerType=primary connects to the primary (strictly, the writable node). If a "dual primary" or even "multi-primary" situation arises, the driver connects to the first available primary it finds.
- jdbc:postgresql://192.168.234.201:5432,192.168.234.202:5432,192.168.234.203:5432/postgres?targetServerType=preferSecondary&loadBalanceHosts=true prefers standby nodes, connects to the primary when no standby is available, and picks a standby at random when several are available.
- jdbc:postgresql://192.168.234.201:5432,192.168.234.202:5432,192.168.234.203:5432/postgres?targetServerType=any&loadBalanceHosts=true connects to any available node at random.

libpq (and drivers built on it, such as psycopg2) supports the equivalent target_session_attrs:

- postgres://192.168.234.201:5432,192.168.234.202:5432,192.168.234.203:5432/postgres?target_session_attrs=read-write connects to the primary (strictly, the writable node)
- postgres://192.168.234.201:5432,192.168.234.202:5432,192.168.234.203:5432/postgres?target_session_attrs=any connects to any available node
For example, in Python with psycopg2:

import psycopg2
conn = psycopg2.connect("postgres://192.168.234.201:5432,192.168.234.202:5432/postgres?target_session_attrs=read-write&password=123456")
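The multi-host URL mechanics can be illustrated without a live cluster. This sketch mimics (but does not use) libpq's host-list handling; parse_hosts is my own illustrative helper:

```python
# Illustrative sketch of how a libpq-style multi-host URL is tried in order.
from urllib.parse import urlsplit

def parse_hosts(url):
    """Return the ordered (host, port) candidates from a multi-host URL."""
    netloc = urlsplit(url).netloc
    pairs = []
    for hp in netloc.split(","):
        host, _, port = hp.partition(":")
        pairs.append((host, int(port or "5432")))  # libpq's default port is 5432
    return pairs

url = "postgres://192.168.234.201:5432,192.168.234.202:5432/postgres"
print(parse_hosts(url))  # [('192.168.234.201', 5432), ('192.168.234.202', 5432)]
```

The real driver attempts each candidate in this order (or shuffled, with load balancing enabled) until one satisfies target_session_attrs.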
7.2 VIP (Floating a VIP via Patroni Callback Scripts)

With Patroni callback scripts, the read-write VIP can be moved to the new primary on role change. Create the callback script /pgsql/loadvip.sh:

#!/bin/bash

VIP=192.168.234.210
GATEWAY=192.168.234.2
DEV=ens33

action=$1
role=$2
cluster=$3

log()
{
  echo "loadvip: $*" | logger
}

load_vip()
{
  ip a | grep -w ${DEV} | grep -w ${VIP} >/dev/null
  if [ $? -eq 0 ]; then
    log "vip exists, skip load vip"
  else
    sudo ip addr add ${VIP}/32 dev ${DEV} >/dev/null
    rc=$?
    if [ $rc -ne 0 ]; then
      log "fail to add vip ${VIP} at dev ${DEV} rc=$rc"
      exit 1
    fi

    log "added vip ${VIP} at dev ${DEV}"

    arping -U -I ${DEV} -s ${VIP} ${GATEWAY} -c 5 >/dev/null
    rc=$?
    if [ $rc -ne 0 ]; then
      log "fail to call arping to gateway ${GATEWAY} rc=$rc"
      exit 1
    fi

    log "called arping to gateway ${GATEWAY}"
  fi
}

unload_vip()
{
  ip a | grep -w ${DEV} | grep -w ${VIP} >/dev/null
  if [ $? -eq 0 ]; then
    sudo ip addr del ${VIP}/32 dev ${DEV} >/dev/null
    rc=$?
    if [ $rc -ne 0 ]; then
      log "fail to delete vip ${VIP} at dev ${DEV} rc=$rc"
      exit 1
    fi

    log "deleted vip ${VIP} at dev ${DEV}"
  else
    log "vip not exists, skip delete vip"
  fi
}

log "loadvip start args:'$*'"

case $action in
  on_start|on_restart|on_role_change)
    case $role in
      master)
        load_vip
        ;;
      replica)
        unload_vip
        ;;
      *)
        log "wrong role '$role'"
        exit 1
        ;;
    esac
    ;;
  *)
    log "wrong action '$action'"
    exit 1
    ;;
esac

Edit /etc/patroni.yml to configure the callbacks:

postgresql:
  ...
  callbacks:
    on_start: /bin/bash /pgsql/loadvip.sh
    on_restart: /bin/bash /pgsql/loadvip.sh
    on_role_change: /bin/bash /pgsql/loadvip.sh

Reload the configuration:

patronictl reload pgsql

After a switchover, messages like the following appear in the system log:

Sep  5 21:32:24 localvm postgres: loadvip: loadvip start args:'on_role_change master pgsql'
Sep  5 21:32:24 localvm systemd: Started Session c7 of user root.
Sep  5 21:32:24 localvm postgres: loadvip: added vip 192.168.234.210 at dev ens33
Sep  5 21:32:25 localvm patroni: 2020-09-05 21:32:25,415 INFO: Lock owner: pg1; I am pg1
Sep  5 21:32:25 localvm patroni: 2020-09-05 21:32:25,431 INFO: no action.  i am the leader with the lock
Sep  5 21:32:28 localvm postgres: loadvip: called arping to gateway 192.168.234.2
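The dispatch logic of loadvip.sh is easy to get subtly wrong in shell, so here is the same decision table rewritten as a Python sketch for clarity (load_vip/unload_vip are stand-ins for the `ip addr add/del` calls in the script; this is not Patroni code):

```python
# Sketch of the callback dispatch implemented by loadvip.sh above.
def dispatch(action, role):
    """Return which VIP operation the callback performs, or raise on bad input."""
    if action not in ("on_start", "on_restart", "on_role_change"):
        raise ValueError(f"wrong action '{action}'")
    if role == "master":
        return "load_vip"      # bind the read-write VIP on the (new) leader
    if role == "replica":
        return "unload_vip"    # release the VIP when running as a standby
    raise ValueError(f"wrong role '{role}'")

print(dispatch("on_role_change", "master"))   # load_vip
print(dispatch("on_role_change", "replica"))  # unload_vip
```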
7.3 Read-Only VIP (via keepalived)

A read-only VIP can be floated among readable nodes with keepalived, using Patroni's REST API health-check endpoints:

- GET / or GET /leader: returns 200 only when the node is running and is the leader
- GET /replica: returns 200 only when the node is running as a replica and the noloadbalance tag is not set
- GET /read-only: like GET /replica, but also includes the leader
Install keepalived:

yum install -y keepalived

Prepare the configuration file /etc/keepalived/keepalived.conf:

global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_leader {
    script "/usr/bin/curl -s http://127.0.0.1:8008/leader -v 2>&1|grep '200 OK' >/dev/null"
    interval 2
    weight 10
}
vrrp_script check_replica {
    script "/usr/bin/curl -s http://127.0.0.1:8008/replica -v 2>&1|grep '200 OK' >/dev/null"
    interval 2
    weight 5
}
vrrp_script check_can_read {
    script "/usr/bin/curl -s http://127.0.0.1:8008/read-only -v 2>&1|grep '200 OK' >/dev/null"
    interval 2
    weight 10
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 211
    priority 100
    advert_int 1
    track_script {
        check_can_read
        check_replica
    }
    virtual_ipaddress {
        192.168.234.211
    }
}

Start keepalived:

systemctl start keepalived

The configuration above binds the read-only VIP 192.168.234.211 to a readable node, preferring standbys. Keepalived can also manage the read-write VIP: just replace the scripts in track_script with check_leader. However, a keepalived-managed VIP tends to flap during network jitter or other transient failures, so I personally recommend binding the read-write VIP with the Patroni callback script instead. If there are multiple standbys, you can also configure LVS in keepalived to load-balance across all of them; that procedure is not expanded here.

7.4 haproxy

Install haproxy:

yum install -y haproxy

Prepare the configuration file /etc/haproxy/haproxy.cfg:

global
    maxconn 100
    log 127.0.0.1 local2

defaults
    log global
    mode tcp
    retries 2
    timeout client 30m
    timeout connect 4s
    timeout server 30m
    timeout check 5s

listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /

listen pgsql
    bind *:5000
    option httpchk
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server postgresql_192.168.234.201_5432 192.168.234.201:5432 maxconn 100 check port 8008
    server postgresql_192.168.234.202_5432 192.168.234.202:5432 maxconn 100 check port 8008
    server postgresql_192.168.234.203_5432 192.168.234.203:5432 maxconn 100 check port 8008

listen pgsql_read
    bind *:6000
    option httpchk GET /replica
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server postgresql_192.168.234.201_5432 192.168.234.201:5432 maxconn 100 check port 8008
    server postgresql_192.168.234.202_5432 192.168.234.202:5432 maxconn 100 check port 8008
    server postgresql_192.168.234.203_5432 192.168.234.203:5432 maxconn 100 check port 8008

If reads must keep working when all standbys fail, change GET /replica to GET /read-only; but then the primary also serves reads, so its read load is not fully offloaded.

Start haproxy:

systemctl start haproxy

haproxy itself also needs to be highly available; run keepalived on the haproxy machines to float the read-write VIP 192.168.234.210 to a live haproxy. Prepare /etc/keepalived/keepalived.conf:

global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_haproxy {
    script "pgrep -x haproxy"
    interval 2
    weight 10
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 210
    priority 100
    advert_int 1
    track_script {
        check_haproxy
    }
    virtual_ipaddress {
        192.168.234.210
    }
}

Start keepalived:

systemctl start keepalived

Verify access through the VIP: port 5000 should reach the primary and port 6000 a standby:

[postgres@node4 ~]$ psql "host=192.168.234.210 port=5000 password=123456" -c 'select inet_server_addr(),pg_is_in_recovery()'
 inet_server_addr | pg_is_in_recovery
------------------+-------------------
 192.168.234.201  | f
(1 row)

[postgres@node4 ~]$ psql "host=192.168.234.210 port=6000 password=123456" -c 'select inet_server_addr(),pg_is_in_recovery()'
 inet_server_addr | pg_is_in_recovery
------------------+-------------------
 192.168.234.202  | t
(1 row)
[postgres@node4 ~]$ psql "host=192.168.234.210 port=6000 password=123456" -c 'select inet_server_addr(),pg_is_in_recovery()'
 inet_server_addr | pg_is_in_recovery
------------------+-------------------
 192.168.234.203  | t
(1 row)

8. Cascading Replication and Standby Clusters

By default, standbys replicate from the leader. To make a standby replicate from another specified node, set the replicatefrom tag on it:

tags:
    replicatefrom: pg2

replicatefrom only takes effect while the node is in the Replica role; it does not prevent the node from taking part in leader election and becoming the Leader. If the replication source specified by replicatefrom fails, Patroni automatically reconfigures PostgreSQL to replicate from the Leader.

To build a standby cluster that cascades from another PostgreSQL cluster, add the following to /etc/patroni.yml:

bootstrap:
  dcs:
    standby_cluster:
      host: 192.168.234.210
      port: 5432
      primary_slot_name: slot1
      create_replica_methods:
      - basebackup

host and port are the host and port of the upstream replication source. If the upstream database is a PG cluster configured with a read-write VIP, use that VIP as host so that a switchover in the primary cluster does not affect the standby cluster. primary_slot_name is optional; if a replication slot is used, also configure a permanent slot on the primary cluster so the slot is always kept on the new primary:

slots:
  slot1:
    type: physical

You can also use the patronictl edit-config command to dynamically add a standby_cluster section, turning a primary cluster into a standby cluster, or to delete the standby_cluster section, turning a standby cluster into a primary cluster.

standby_cluster:
  host: 192.168.234.210
  port: 5432
  primary_slot_name: slot1
  create_replica_methods:
  - basebackup

9. References
https://patroni.readthedocs.io/en/latest/
http://blogs.sungeek.net/unixwiz/2018/09/02/centos-7-postgresql-10-patroni/
https://scalegrid.io/blog/managing-high-availability-in-postgresql-part-1/
https://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters
https://www.percona.com/blog/2019/10/23/seamless-application-failover-using-libpq-features-in-postgresql/