Postgres 9.x 升级至 10.x

Postgres 9.x 升级至 10.x

December 1, 2019

目前主要有两种升级方式,它们分别使用于不同的场景。

升级方式 优点 缺点
pglogical逻辑复制升级 停机时间短 需提前准备从库,并搭建replication,在replication期间,需考虑DDL语句带来的影响。
pg_upgrade升级 不需要提前准备replication 升级完成后需检查index是否正常,并执行vacuum analyze,对于数据量较大的库,这可能耗时较长

逻辑复制升级

前提条件

所有表都必须有主键

可以用下面的语句来检查

SELECT table_name
  FROM information_schema.tables
 WHERE (table_catalog, table_schema, table_name) NOT IN
          (SELECT table_catalog, table_schema, table_name
             FROM information_schema.table_constraints
            WHERE constraint_type = 'PRIMARY KEY')
   AND table_schema IN ('myschema')

super user无密码访问provider和subscriber

修改配置文件

# prodvider 上配置允许 super user 从 subscriber 上访问 replication。
# 假如 subscriber IP 是 10.1.10.15,则 provider 的 pg_hba.conf 中应包含以下配置
host    all             postgres       10.1.10.15/32               trust
host    replication     postgres       10.1.10.15/32                 trust
# subscriber 上允许 super user 访问本地的 postgresql,
# 假如 subscriber IP 是 10.1.10.6,pg_hba.conf 中应包含以下配置
host    all             postgres             10.1.10.6/32            trust

subscriber必须拥有和provider一样的表结构

在 provider 上导出结构数据

sudo -iu postgres pg_dumpall --schema-only -f dump.sql

在 subscriber 上导入结构数据

sudo -iu postgres psql -f dump.sql

搭建逻辑复制

主库上的操作

  1. provider安装pglogical
apt install postgresql-9.5-pglogical/xenial-pgdg
  1. provider上,连到对应的数据库,创建pglogical扩展 mydb=# CREATE EXTENSION pglogical;
  2. 创建逻辑复制节点
-- host 填的是provider的IP
mydb=# SELECT pglogical.create_node(node_name := 'provider', dsn := 'host=10.1.10.6 port=5432 dbname=mydb');
  1. 添加所有表到需要逻辑复制的集合中
mydb=# SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public', 'mydb', 'monitoring', 'authenticator']);

如果replication_set_add_all_tables 出现 deadlock, 可尝试逐个表添加

postgres@provider:~$ for i in `psql mydb -c '\d' | grep table | cut -d '|' -f 2`; do psql mydb -c "select pglogical.replication_set_add_table('default', '$i')"; done

从库上的操作

安装的 pg 10.x 作为从库,和逻辑复制的subscriber

  1. 从库上安装postgresql-10
  2. subscriber安装pglogical
apt install postgresql-10-pglogical/xenial-pgdg
  1. 修改subscriber配置, 重启PostgreSQL
shared_preload_libraries = 'pglogical'
  1. subscriber上,连到对应的数据库,创建pglogical扩展
mydb=# CREATE EXTENSION pglogical;
  1. 创建逻辑复制节点
-- host 填的是subscriber的IP
mydb=# SELECT pglogical.create_node(node_name := 'subscriber', dsn := 'host=10.1.10.15 port=5432 dbname=mydb');
  1. 创建逻辑复制订阅端
-- host 填的是provider的IP
mydb=# SELECT pglogical.create_subscription( subscription_name := 'subscriber', provider_dsn := 'host=10.1.10.6 port=5432 dbname=mydb');

升级

一次性同步所有sequence数据,provider 上执行

mydb=# SELECT pglogical.replication_set_add_all_sequences('default', ARRAY['public', 'mydb', 'monitoring', 'authenticator'], true);

升级完成后需drop replication

subscribe端的操作

subscriber端删除订阅

mydb=# select * from pglogical.drop_subscription('subscriber');

subscriber端删除逻辑复制节点

mydb=# select * from pglogical.drop_node('subscriber');

provider端的操作

provider端删除逻辑复制节点

mydb=# select * from pglogical.drop_node('provider');

pg_upgrade升级

事前准备

export LC_CTYPE=en_US.UTF-8
export LC_ALL=en_US.UTF-8

apt update

# 安装pg 10
apt install postgresql-10

# 安装索引检查插件
apt install postgresql-10-amcheck

停止服务

service [email protected] stop
service postgresql@10-main stop

检查当前的postgres集群

user@ubuntu:~# pg_lsclusters

Ver Cluster Port Status Owner    Data directory               Log file
9.5 main    5432 down   postgres /var/lib/postgresql/9.5/main /var/log/postgresql/postgresql-%d.csv
10  main    5433 down   postgres /var/lib/postgresql/10/main  /var/log/postgresql/postgresql-10-main.log

根据上面输出停掉pg 10的集群

user@ubuntu:~# pg_dropcluster --stop 10 main

升级9.x的集群

user@ubuntu:~# pg_upgradecluster --method upgrade --link 9.5 main

Finding the real data directory for the source cluster      ok
Finding the real data directory for the target cluster      ok
Performing Consistency Checks
-----------------------------
Checking cluster versions                                   ok
Checking database user is the install user                  ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for reg* data types in user tables                 ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for invalid "unknown" user columns                 ok
Checking for roles starting with "pg_"                      ok
Creating dump of global objects                             ok
Creating dump of database schemas
                                                            ok
Checking for presence of required libraries                 ok
Checking database user is the install user                  ok
Checking for prepared transactions                          ok
 
If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.
 
Performing Upgrade
------------------
Analyzing all rows in the new cluster                       ok
Freezing all rows in the new cluster                        ok
Deleting files from new pg_xact                             ok
Copying old pg_clog to new server                           ok
Setting next transaction ID and epoch for new cluster       ok
Deleting files from new pg_multixact/offsets                ok
Copying old pg_multixact/offsets to new server              ok
Deleting files from new pg_multixact/members                ok
Copying old pg_multixact/members to new server              ok
Setting next multixact ID and offset for new cluster        ok
Resetting WAL archives                                      ok
Setting frozenxid and minmxid counters in new cluster       ok
Restoring global objects in the new cluster                 ok
Restoring database schemas in the new cluster
                                                            ok
Adding ".old" suffix to old global/pg_control               ok
 
If you want to start the old cluster, you will need to remove
the ".old" suffix from /var/lib/postgresql/9.5/main/global/pg_control.old.
Because "link" mode was used, the old cluster cannot be safely
started once the new cluster has been started.
 
Linking user relation files
                                                            ok
Setting next OID for new cluster                            ok
Sync data directory to disk                                 ok
Creating script to analyze new cluster                      ok
Creating script to delete old cluster                       ok
Checking for hash indexes                                   ok
 
Upgrade Complete
----------------
Optimizer statistics are not transferred by pg_upgrade so,
once you start the new server, consider running:
    ./analyze_new_cluster.sh
 
Running this script will delete the old cluster's data files:
    ./delete_old_cluster.sh
pg_upgrade output scripts are in /var/log/postgresql/pg_upgradecluster-9.5-10-main.yo8D
Re-enabling connections to the old cluster...
Copying old configuration files...
Copying old start.conf...
Copying old pg_ctl.conf...
Disabling automatic startup of old cluster...
Configuring old cluster to use a different port (5433)...
Success. Please check that the upgraded cluster works. If it does,
you can remove the old cluster with
    pg_dropcluster 9.5 main
 
Ver Cluster Port Status Owner    Data directory               Log file
9.5 main    5433 down   postgres /var/lib/postgresql/9.5/main /var/log/postgresql/postgresql-%d.csv
Ver Cluster Port Status Owner    Data directory              Log file
10  main    5432 down   postgres /var/lib/postgresql/10/main /var/log/postgresql/postgresql-%d.csv

复制pg的配置到pg 10上

cp postgresql.conf /etc/postgresql/10/main/postgresql.conf

重启pg 10

service postgresql@10-main start

移除pg 9 集群

pg_dropcluster --stop 9.5 main

查看当前的集群

pg_lsclusters

Ver Cluster Port Status Owner    Data directory              Log file
10  main    5432 online postgres /var/lib/postgresql/10/main /var/log/postgresql/postgresql-%d.csv

执行vacuum analyze

sudo -iu postgres psql -c 'vacuum analyze'
最后更新于