Monday, September 09, 2024

PostgreSQL - pglogical 2 extention

1. Overview 
  • The pglogical 2 extension provides logical streaming replication for PostgreSQL, using a publish/subscribe model.
  • Using pglogical the provider and subscriber must be running PostgreSQL 9.4 or newer.
  • Use cases supported are:
    • Upgrades between major versions (given the above restrictions)
    • Full database replication
    • Selective replication of sets of tables using replication sets
    • Selective replication of table rows at either publisher or subscriber side (row_filter)
    • Selective replication of table columns at publisher side
    • Data gather/merge from multiple upstream servers
  • Architectural details:
    • pglogical works on a per-database level, not whole server level like physical streaming replication
    • One Provider may feed multiple Subscribers without incurring additional disk write overhead
    • One Subscriber can merge changes from several origins and detect conflict between changes with automatic and configurable conflict resolution (some, but not all aspects required for multi-master).
    • Cascading replication is implemented in the form of changeset forwarding.
  • Limitations and restrictions
    • To replicate multiple databases you must set up individual provider/subscriber relationships for each. There is no way to configure replication for all databases in a PostgreSQL install at once
    • UPDATEs and DELETEs cannot be replicated for tables that lack a PRIMARY KEY or other valid replica identity such as a UNIQUE constraint. Replication has no way to find the tuple that should be updated/deleted since there is no unique identifier.
    • Automatic DDL replication is not supported. Managing DDL so that the provider and subscriber database(s) remain compatible is the responsibility of the user. 
    • The state of sequences added to replication sets is replicated periodically and not in real-time.
    • PGLogical can replicate across PostgreSQL major versions. Despite that, long term cross-version replication is not considered a design target, though it may often work. Issues where changes are valid on the provider but not on the subscriber are more likely to arise when replicating across versions. 
    • It is safer to replicate from an old version to a newer version since PostgreSQL maintains solid backward compatibility but only limited forward compatibility. Initial schema synchronization is only supported when replicating between same version of PostgreSQL or from lower version to higher version
    • ...
2. General information

Master Node
IP: 192.168.56.111
Hostname: postgres-master.taolaoxibup.com
OS: Red Hat Enterprise Linux release 9.4 (Plow)

postgres=# select datname from pg_database ;
  datname  
-----------
 postgres
 dvdrental        // we will replica this database
 template1
 template0
(4 rows)

Slave Node
IP: 192.168.56.222
Hostname: postgres-slave.taolaoxibup.com
OS: Red Hat Enterprise Linux release 9.4 (Plow)

3. Installation on Master/Slave nodes
# dnf install -y postgresql16-server postgresql16-contrib
# dnf install pglogical_16

4. Configuration on Master/Slave nodes
4.1 Set parameters for replication
alter system set wal_level = 'logical';
alter system set max_worker_processes = 10;
alter system set max_replication_slots = 10;
alter system set max_wal_senders = 10;
alter system set shared_preload_libraries = 'pglogical';
alter system set track_commit_timestamp = 'on';

4.2 Allow replication connections from Master/Slave
$ vi pg_hba.conf
# IPv4 local connections:
host    all             all             192.168.56.0/24            trust
host    replication     all             192.168.56.0/24         trust

$ vi postgresql.conf
listen_addresses = '*'
port = 5432 

4.3 Restart service
# systemctl restart postgresql-16.service 

Verify parameters
postgres=# select name,setting,unit from pg_settings where name in ('wal_level', 'max_worker_processes', 'max_replication_slots', 'max_wal_senders', 'shared_preload_libraries', 'track_commit_timestamp');

5. Create pglogical extention on Master/Slave nodes

postgres=# \c dvdrental             // Login into specific database before creating extention
You are now connected to database "dvdrental" as user "postgres".

List all extentions
dvdrental=# select * from pg_available_extensions;

List all installed extentions
dvdrental=# select * from pg_available_extensions where installed_version is not null;
dvdrental=# \dx

Create pglogical extention
dvdrental=# create extension pglogical;
CREATE EXTENSION

dvdrental=# \dx
                   List of installed extensions
   Name    | Version |   Schema   |          Description           
-----------+---------+------------+--------------------------------
 pglogical | 2.4.4   | pglogical  | PostgreSQL Logical Replication
 plpgsql   | 1.0     | pg_catalog | PL/pgSQL procedural language
(2 rows)

6. Create empty dvdrental database on Slave

On Master: export metadata of dvdrental database and copy to Slave
[postgres@postgres-master ~]$ pg_dump -d dvdrental -s -f dvdrental.sql
[postgres@postgres-master ~]$ scp dvdrental.sql postgres@192.168.56.222:/var/lib/pgsql

On Slave: import metadata of dvdrental database
postgres=# create database dvdrental;
[postgres@postgres-slave ~]$ psql -d dvdrental -f dvdrental.sql 

7. On Master - Create the provider node & replication set

dvdrental=# select pglogical.create_node(node_name:='provider', dsn:='host=postgres-master.taolaoxibup.com port=5432 dbname=dvdrental');
 create_node 
-------------
  3171898924
(1 row)

dvdrental=# select pglogical.create_replication_set('dvdrental_repset');
 create_replication_set 
------------------------
             1358967304
(1 row)

Add all tables of public schema into dvdrental_repset replication set
dvdrental=# select pglogical.replication_set_add_all_tables('dvdrental_repset', ARRAY['public']);
 replication_set_add_all_tables 
--------------------------------
 t
(1 row)

Add all sequences of public schema into dvdrental_repset replication set
dvdrental=# select pglogical.replication_set_add_all_sequences('dvdrental_repset', ARRAY['public']);
 replication_set_add_all_sequences 
-----------------------------------
 t
(1 row)

dvdrental=# select * from pglogical.tables;
 relid | nspname |    relname    |     set_name     
-------+---------+---------------+------------------
 16654 | public  | language      | dvdrental_repset
 16620 | public  | address       | dvdrental_repset
 16648 | public  | inventory     | dvdrental_repset
 16595 | public  | film          | dvdrental_repset
 16606 | public  | film_actor    | dvdrental_repset
 16670 | public  | rental        | dvdrental_repset
 16632 | public  | country       | dvdrental_repset
 16610 | public  | film_category | dvdrental_repset
 16583 | public  | actor         | dvdrental_repset
 16589 | public  | category      | dvdrental_repset
 16665 | public  | payment       | dvdrental_repset
 16573 | public  | customer      | dvdrental_repset
 16690 | public  | store         | dvdrental_repset
 16681 | public  | staff         | dvdrental_repset
 16626 | public  | city          | dvdrental_repset
(15 rows)

  • The event trigger facility can be used for describing rules which define replication sets for newly created tables.
  • The above example will put all new tables created in schema public into replication set dvdrental_repset and all other new tables which are not created by extensions will go to default replication set.
  • This function will add new table into replication set automatically only on Provider. We must create table manually on Subscriber before executing replication.
CREATE OR REPLACE FUNCTION pglogical_assign_repset()
RETURNS event_trigger AS $$
DECLARE obj record;
BEGIN
    FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
    LOOP
        IF obj.object_type = 'table' THEN
            IF obj.schema_name = 'public' THEN
                PERFORM pglogical.replication_set_add_table('dvdrental_repset', obj.objid);
            ELSIF NOT obj.in_extension THEN
                PERFORM pglogical.replication_set_add_table('default', obj.objid);
            END IF;
        END IF;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER pglogical_assign_repset_trg
    ON ddl_command_end
    WHEN TAG IN ('CREATE TABLE', 'CREATE TABLE AS')
    EXECUTE PROCEDURE pglogical_assign_repset();

dvdrental=# \df
dvdrental=# \dy

8. On Slave - Create the subcriber node & subscription

8.1 Create .pgpass for automating connection
[postgres@postgres-slave ~]$ vi .pgpass
postgres-master:dvdrental:postgres:{password}
postgres-master:replication:postgres:{password}
postgres-slave:dvdrental:postgres:{password}
postgres-slave:replication:postgres:{password}

[postgres@postgres-slave ~]$ chmod 0600 .pgpass
[root@postgres-slave ~]# systemctl restart postgresql-16.service 

8.2 Create the subcriber node

dvdrental=# select pglogical.create_node(node_name:='subscriber',dsn:='host=postgres-slave.taolaoxibup.com port=5432 dbname=dvdrental user=postgres');
 create_node 
-------------
  2941155235
(1 row)

8.2 Create the subcription start synchronization and replication process in the background

dvdrental=# select pglogical.create_subscription(subscription_name:='dvdrental_subs', replication_sets:=array['dvdrental_repset'],provider_dsn:='host=postgres-master.taolaoxibup.com port=5432 dbname=dvdrental user=postgres');
 create_subscription 
---------------------
          3036885041
(1 row)

dvdrental=# \d+

9. Verify replication status

On Master
dvdrental=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 7382
usesysid         | 10
usename          | postgres
application_name | dvdrental_subs
client_addr      | 192.168.56.222
client_hostname  | 
client_port      | 49164
backend_start    | 2024-09-10 11:26:54.34647+07
backend_xmin     | 
state            | streaming
sent_lsn         | 0/2137228
write_lsn        | 0/2137228
flush_lsn        | 0/2137228
replay_lsn       | 0/2137228
write_lag        | 
flush_lag        | 
replay_lag       | 
sync_priority    | 0
sync_state       | async
reply_time       | 2024-09-10 11:52:29.395522+07

On Slave

dvdrental=# select * from pglogical.show_subscription_status();
-[ RECORD 1 ]-----+------------------------------------------------------------------------------
subscription_name | dvdrental_subs
status            | replicating
provider_node     | provider
provider_dsn      | host=postgres-master.taolaoxibup.com port=5432 dbname=dvdrental user=postgres
slot_name         | pgl_dvdrental_provider_dvdrental_subs
replication_sets  | {dvdrental_repset}
forward_origins   | {all}

dvdrental=# select pglogical.wait_for_subscription_sync_complete('dvdrental_subs');
-[ RECORD 1 ]-----------------------+-
wait_for_subscription_sync_complete | 

Ref: