Thursday, September 12, 2024

PostgreSQL - Streaming Replication (Hot Standby)

1. General information

Master Node
IP: 192.168.56.111
Hostname: postgres-master
Version: postgres (PostgreSQL) 16.4
$PGDATA=/var/lib/pgsql/16/data

postgres=# \db
                    List of tablespaces
    Name    |  Owner   |              Location              
------------+----------+------------------------------------
 dvd_data   | tucdt    | /var/lib/pgsql/dvdrental_tbs/data
 dvd_index  | tucdt    | /var/lib/pgsql/dvdrental_tbs/index
 pg_default | postgres | 
 pg_global  | postgres | 
(4 rows)

Slave Node
IP: 192.168.56.222
Hostname: postgres-slave
Version: postgres (PostgreSQL) 16.4
$PGDATA=/var/lib/pgsql/16/data    // empty

2. Preparation on Master/Slave
  • Disable Selinux : SELINUX=disabled
  • Stop Firewall
  • Config /etc/hosts
    • 192.168.56.111  postgres-master.taolaoxibup.com postgres-master
    • 192.168.56.222  postgres-slave.taolaoxibup.com postgres-slave
  • Setup SSH KEY-PAIR
    • ssh-keygen -t rsa
    • ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@192.168.56.[111/222]
3. Master - Configuration
3.1 Create user for replication
postgres=# CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'P@ssw0rd';

3.2 Create replication slot
  • Replication slots provide an automated way to ensure that the primary does not remove WAL segments until they have been received by all standbys, and that the primary does not remove rows which could cause a recovery conflict even when the standby is disconnected.
  • In lieu of using replication slots, it is possible to prevent the removal of old WAL segments using wal_keep_size, or by storing the segments in an archive using archive_command or archive_library
postgres=# select * from pg_replication_slots;
postgres=# select * from pg_create_physical_replication_slot('cherry');
postgres=# select * from pg_replication_slots;
-[ RECORD 1 ]-------+---------
slot_name           | cherry
plugin              | 
slot_type           | physical
datoid              | 
database            | 
temporary           | f
active              | f
active_pid          | 
xmin                | 
catalog_xmin        | 
restart_lsn         | 
confirmed_flush_lsn | 
wal_status          | 
safe_wal_size       | 
two_phase           | f
conflicting         | 

3.3 Configure parameters
alter system set wal_level TO 'hot_standby';
alter system set archive_mode to 'on';
alter system set archive_command = 'test ! -f /var/lib/pgsql/backup/%f && cp %p /var/lib/pgsql/backup/%f';
alter system set max_wal_senders to '10';
alter system set wal_keep_size = '1024 MB';

Restart cluster database to apply changes
[root@postgres-master ~]# systemctl restart postgresql-16.service 
[root@postgres-master ~]# systemctl status postgresql-16.service 

4. Slave - Replication from Master
  • $PGDATA must be empty
  • Do not create any tablespace locations. These locations will be created during replication processing
[postgres@postgres-slave ~]$ pg_basebackup -h postgres-master -U replicator -D $PGDATA -P -Xs -R -v
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/5000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_7922"
39497/39497 kB (100%), 3/3 tablespaces                                         
pg_basebackup: write-ahead log end point: 0/5000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed

💩 Verify the $PGDATA
[postgres@postgres-slave ~]$ cd $PGDATA
[postgres@postgres-slave data]$ ls -la
total 264
drwx------. 20 postgres postgres   4096 Sep 13 09:30 .
drwx------.  3 postgres postgres     18 Sep 13 09:27 ..
-rw-------   1 postgres postgres    225 Sep 13 09:30 backup_label
-rw-------   1 postgres postgres 198549 Sep 13 09:30 backup_manifest
drwx------   6 postgres postgres     46 Sep 13 09:30 base
-rw-------   1 postgres postgres     30 Sep 13 09:30 current_logfiles
drwx------   2 postgres postgres   4096 Sep 13 09:30 global
drwx------   2 postgres postgres    136 Sep 13 09:30 log
drwx------   2 postgres postgres     18 Sep 13 09:30 pg_commit_ts
drwx------   2 postgres postgres      6 Sep 13 09:30 pg_dynshmem
-rw-------   1 postgres postgres   5458 Sep 13 09:30 pg_hba.conf
-rw-------   1 postgres postgres   2640 Sep 13 09:30 pg_ident.conf
drwx------   4 postgres postgres     68 Sep 13 09:30 pg_logical
drwx------   4 postgres postgres     36 Sep 13 09:30 pg_multixact
drwx------   2 postgres postgres      6 Sep 13 09:30 pg_notify
drwx------   2 postgres postgres      6 Sep 13 09:30 pg_replslot
drwx------   2 postgres postgres      6 Sep 13 09:30 pg_serial
drwx------   2 postgres postgres      6 Sep 13 09:30 pg_snapshots
drwx------   2 postgres postgres      6 Sep 13 09:30 pg_stat
drwx------   2 postgres postgres      6 Sep 13 09:30 pg_stat_tmp
drwx------   2 postgres postgres      6 Sep 13 09:30 pg_subtrans
drwx------   2 postgres postgres     32 Sep 13 09:30 pg_tblspc
drwx------   2 postgres postgres      6 Sep 13 09:30 pg_twophase
-rw-------   1 postgres postgres      3 Sep 13 09:30 PG_VERSION
drwx------   3 postgres postgres     60 Sep 13 09:30 pg_wal
drwx------   2 postgres postgres     18 Sep 13 09:30 pg_xact
-rw-------   1 postgres postgres    727 Sep 13 09:30 postgresql.auto.conf
-rw-------   1 postgres postgres  29684 Sep 13 09:30 postgresql.conf
-rw-------   1 postgres postgres      0 Sep 13 09:30 standby.signal

👀standby.signal file:
  • The presence of this file triggers standby mode. PostgreSQL knows it should operate as a standby server and connect to the primary for replication.
  • The server will remain in standby mode as long as this file exists in the data directory.
[postgres@postgres-slave data]$ cat postgresql.auto.conf 
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
max_worker_processes = '10'
max_replication_slots = '10'
shared_preload_libraries = 'pglogical'
track_commit_timestamp = 'on'
wal_level = 'hot_standby'
archive_mode = 'on'
archive_command = 'test ! -f /var/lib/pgsql/backup/%f && cp %p /var/lib/pgsql/backup/%f'
max_wal_senders = '10'
wal_keep_size = '1024 MB'
primary_conninfo = 'user=replicator passfile=''/var/lib/pgsql/.pgpass'' channel_binding=prefer host=''postgres-master'' port=5432 sslmode=prefer sslcompression=0 sslcertmode=allow sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres gssdelegation=0 target_session_attrs=any load_balance_hosts=disable'

💩 Config postgresql.conf
primary_conninfo = 'host=postgres-master port=5432 user=replicator password=******'
primary_slot_name = 'cherry'
archive_cleanup_command = '/usr/pgsql-16/bin/pg_archivecleanup /var/lib/pgsql/backup %r'
restore_command = 'rsync -a  postgres@192.168.56.111:/var/lib/pgsql/backup/%f %p'    // using IP instead of hostname
recovery_target_timeline = 'latest'

💩 Start cluster database to begin replication
[root@postgres-slave ~]# systemctl stop postgresql-16.service 
[root@postgres-slave ~]# systemctl start postgresql-16.service 
[root@postgres-slave ~]# systemctl status postgresql-16.service 

5. Verify replication status
💀 On Master Node
postgres=# select * from pg_replication_slots;
-[ RECORD 1 ]-------+----------
slot_name           | cherry
plugin              | 
slot_type           | physical
datoid              | 
database            | 
temporary           | f
active              | t
active_pid          | 10897
xmin                | 
catalog_xmin        | 
restart_lsn         | 0/70001C0
confirmed_flush_lsn | 
wal_status          | reserved
safe_wal_size       | 
two_phase           | f
conflicting         | 

postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 10897
usesysid         | 17430
usename          | replicator
application_name | walreceiver
client_addr      | 192.168.56.222
client_hostname  | 
client_port      | 33438
backend_start    | 2024-09-13 10:40:03.323266+07
backend_xmin     | 
state            | streaming
sent_lsn         | 0/70001C0
write_lsn        | 0/70001C0
flush_lsn        | 0/70001C0
replay_lsn       | 0/70001C0
write_lag        | 
flush_lag        | 
replay_lag       | 
sync_priority    | 0
sync_state       | async
reply_time       | 2024-09-13 10:44:42.739911+07

postgres=# select pg_current_wal_lsn();
postgres=# select pg_walfile_name(pg_current_wal_lsn());

Lag in Bytes
postgres=# select client_addr, pg_wal_lsn_diff(pg_stat_replication.sent_lsn, pg_stat_replication.replay_lsn) AS byte_lag from pg_stat_replication;
-[ RECORD 1 ]---------------
client_addr | 192.168.56.222
byte_lag    | 0

Lag in Seconds
SELECT CASE 
WHEN pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn() THEN 0 
    ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) 
    END AS log_delay;

💀 On Slave Node
postgres=# select pg_is_wal_replay_paused();
 pg_is_wal_replay_paused 
-------------------------
 f                                                // f = recovery is running , t = stopped
(1 row)

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery 
-------------------
 t
(1 row)

postgres=# select pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn(), pg_last_xact_replay_timestamp();
-[ RECORD 1 ]-----------------+----------
pg_last_wal_receive_lsn       | 0/70001C0
pg_last_wal_replay_lsn        | 0/70001C0
pg_last_xact_replay_timestamp | 

postgres=# select * from pg_stat_wal_receiver;
-[ RECORD 1 ]---------+--------
pid                   | 10407
status                | streaming
receive_start_lsn     | 0/6000000
receive_start_tli     | 1
written_lsn           | 0/70001C0
flushed_lsn           | 0/70001C0
received_tli          | 1
last_msg_send_time    | 2024-09-13 10:44:03.598586+07
last_msg_receipt_time | 2024-09-13 10:44:02.71503+07
latest_end_lsn        | 0/70001C0
latest_end_time       | 2024-09-13 10:40:03.374545+07
slot_name             | cherry
sender_host           | postgres-master
sender_port           | 5432
conninfo              | user=replicator passfile=/var/lib/pgsql/.pgpass channel_binding=prefer dbname=replication host=postgres-master port=5432 fallback_application_name=walreceiver sslmode=prefer sslcompression=0 sslcertmode=allow sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres gssdelegation=0 target_session_attrs=any load_balance_hosts=disable

postgres=# select pg_last_wal_receive_lsn();
postgres=# select pg_last_wal_replay_lsn();
postgres=# select pg_last_xact_replay_timestamp();

Note: We can not execute DML, DDL on Slave Node
dvdrental=# create table temp01 as select * from actor ;
ERROR:  cannot execute CREATE TABLE AS in a read-only transaction

👽 Make data change and check:
[root@postgres-master ~]# ps -eaf | grep sender
postgres   10897    9704  0 10:40 ?        00:00:00 postgres: walsender replicator 192.168.56.222(33438) streaming 0/7020300

[postgres@postgres-slave ~]$ ps -eaf | egrep "receiver|startup"
root         872       1  0 08:31 ?        00:00:00 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
postgres   10406   10401  0 10:40 ?        00:00:00 postgres: startup recovering 000000010000000000000007
postgres   10407   10401  0 10:40 ?        00:00:00 postgres: walreceiver streaming 0/7020300

6. Change to SYNC mode
👉 On Master Node
postgres=# show synchronous_standby_names ;
postgres=# alter system set synchronous_standby_names to 'standby';
[root@postgres-master ~]# systemctl restart postgresql-16.service 

Note: Convert to ASYNC mode with command
postgres=# alter system reset synchronous_standby_names ;
[root@postgres-master ~]# systemctl restart postgresql-16.service 

👉 On Slave Node
postgres=# show cluster_name ;
postgres=# alter system set cluster_name = 'standby';
[root@postgres-slave ~]# systemctl restart postgresql-16.service 

👉 Verify replication status on Master Node
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 23177
usesysid         | 17430
usename          | replicator
application_name | standby
client_addr      | 192.168.56.222
client_hostname  | 
client_port      | 37916
backend_start    | 2024-09-13 15:02:18.940161+07
backend_xmin     | 
state            | streaming
sent_lsn         | 0/80000D8
write_lsn        | 0/80000D8
flush_lsn        | 0/80000D8
replay_lsn       | 0/80000D8
write_lag        | 
flush_lag        | 
replay_lag       | 
sync_priority    | 1
sync_state       | sync
reply_time       | 2024-09-13 15:02:58.081962+07

7. Failover
  • The Master node goes down and we can not fix the problem.  
  • The Slave node will be promoted to be Master node
😈On Master Node
[root@postgres-master ~]# systemctl stop postgresql-16.service 

💩On Slave Node
[postgres@postgres-slave data]$ pg_ctl promote -W    //  -W, --no-wait          do not wait until operation completes
server promoting

Note: The standby.signal will be removed after promoting the Slave to Master

8. Switchover
Note: Master node must be in ASYNC mode

Step 1: On Master - Stop service
postgres=# select pg_switch_wal();

[root@postgres-master ~]# systemctl stop postgresql-16.service 

Step 2: On Master - Backup postgresql.conf file
[postgres@postgres-master ~]$ cd $PGDATA
[postgres@postgres-master data]$ cp postgresql.conf postgresql.conf.bak

Step 3: On Slave - Check data synchronization
postgres=# select pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn(), pg_last_xact_replay_timestamp();
 pg_last_wal_receive_lsn | pg_last_wal_replay_lsn | pg_last_xact_replay_timestamp 
-------------------------+------------------------+-------------------------------
 0/F0000A0               | 0/F0000A0              | 2024-09-16 14:42:46.980097+07
(1 row)

Step 4: On Slave - Promote Slave to Master
[postgres@postgres-slave ~]$ pg_ctl promote 
waiting for server to promote.... done
server promoted
OR
postgres=# select pg_promote();

Step 5: On New Master - Create replication slot 
Because of replication slot is not replicated to Slave. So we must create a replication slot into New Master node

postgres=# select * from pg_create_physical_replication_slot('cherry');
postgres=# select * from pg_replication_slots;

Step 6: On OLD Master - Convert OLD Master to Slave
💩 Config postgresql.conf
primary_conninfo = 'host=postgres-slave port=5432 user=replicator password=******'
primary_slot_name = 'cherry'
archive_cleanup_command = '/usr/pgsql-16/bin/pg_archivecleanup /var/lib/pgsql/backup %r'
recovery_target_timeline = 'latest'

💓 Create standby.signal file
[postgres@postgres-master ~]$ cd $PGDATA
[postgres@postgres-master data]$ touch standby.signal
[postgres@postgres-master data]$ chmod 600 standby.signal 

💓 Restart service
[root@postgres-master ~]# systemctl stop postgresql-16.service 
[root@postgres-master ~]# systemctl start postgresql-16.service 


Ref: