Thursday, September 19, 2024

PostgreSQL - Setting Parameters for Production

There are several key parameters that should be adjusted for optimal performance, security, and stability. The defaults are conservative, sized for small systems, and should be tuned to your hardware, workload, and specific use cases. Here's a breakdown of the important parameters:

1. Memory Settings
  • shared_buffers
    • Memory the database server uses for shared memory buffers
    • A common starting point is 25% of physical RAM when the system has more than 1GB of RAM
    • Larger settings for shared_buffers usually require a corresponding increase in max_wal_size, and may call for enabling huge_pages
  • wal_buffers
    • The amount of shared memory used for WAL data that has not yet been written to disk
    • Default value: -1, which auto-tunes to 1/32 (about 3%) of shared_buffers, but not less than 64kB nor more than the size of one WAL segment (typically 16MB)
    • The contents of the WAL buffers are written out to disk at every transaction commit, so extremely large values are unlikely to provide a significant benefit. However, setting this value to at least a few megabytes can improve write performance on a busy server where many clients are committing at once. The auto-tuning selected by the default setting of -1 should give reasonable results in most cases
  • work_mem
    • Used for sorting tuples in ORDER BY and DISTINCT operations, and for joining tables with merge-join and hash-join operations.
    • Be careful with high values: work_mem is allocated per sort or hash operation, not per query, so a single complex query running several such operations can consume multiples of this amount during heavy workloads
    • A common sizing formula: Total RAM * 0.25 / max_connections
  • temp_buffers
    • Used for temporary buffers within each database session and for storing temporary tables.
    • This setting can be changed within individual sessions, but only before the first use of temporary tables within the session; subsequent attempts to change the value will have no effect on that session
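
Putting these together, a sketch of postgresql.conf values for a hypothetical dedicated server with 16GB of RAM and max_connections = 100 (starting points only, not tuned recommendations):

shared_buffers = 4GB      # 25% of 16GB physical RAM
wal_buffers = -1          # auto-tune: 1/32 of shared_buffers, capped at one WAL segment
work_mem = 40MB           # 16GB * 0.25 / 100 connections; allocated per sort/hash operation
temp_buffers = 32MB       # per-session temporary buffers (default is 8MB)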

Thursday, September 12, 2024

PostgreSQL - Streaming Replication (Hot Standby)

1. General information

Master Node
IP: 192.168.56.111
Hostname: postgres-master
Version: postgres (PostgreSQL) 16.4
$PGDATA=/var/lib/pgsql/16/data

postgres=# \db
                    List of tablespaces
    Name    |  Owner   |              Location              
------------+----------+------------------------------------
 dvd_data   | tucdt    | /var/lib/pgsql/dvdrental_tbs/data
 dvd_index  | tucdt    | /var/lib/pgsql/dvdrental_tbs/index
 pg_default | postgres | 
 pg_global  | postgres | 
(4 rows)

Slave Node
IP: 192.168.56.222
Hostname: postgres-slave
Version: postgres (PostgreSQL) 16.4
$PGDATA=/var/lib/pgsql/16/data    // empty

2. Preparation on Master/Slave
  • Disable SELinux: set SELINUX=disabled in /etc/selinux/config
  • Stop the firewall (see the commands after this list)
  • Config /etc/hosts
    • 192.168.56.111  postgres-master.taolaoxibup.com postgres-master
    • 192.168.56.222  postgres-slave.taolaoxibup.com postgres-slave
  • Setup SSH KEY-PAIR
    • ssh-keygen -t rsa
    • ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@192.168.56.[111/222]
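
A sketch of the SELinux/firewall commands, run as root on both nodes (assumes a RHEL-family system with firewalld):

# setenforce 0                          # permissive for the current boot; SELINUX=disabled in /etc/selinux/config persists it
# systemctl disable --now firewalld     # stop the firewall and keep it off across reboots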
3. Master - Configuration
3.1 Create user for replication
postgres=# CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'P@ssw0rd';
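
The replicator user also needs a pg_hba.conf entry on the master so the standby can connect; a sketch (the auth method and address are assumptions, adjust to your environment):

host    replication    replicator    192.168.56.222/32    scram-sha-256

Reload the configuration afterwards, e.g. with: SELECT pg_reload_conf();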

3.2 Create replication slot
  • Replication slots provide an automated way to ensure that the primary does not remove WAL segments until they have been received by all standbys, and that the primary does not remove rows which could cause a recovery conflict even when the standby is disconnected.
  • In lieu of using replication slots, it is possible to prevent the removal of old WAL segments using wal_keep_size, or by storing the segments in an archive using archive_command or archive_library
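
A minimal sketch of creating a physical replication slot on the master (the slot name standby_slot is an assumption; the standby would reference it via primary_slot_name):

postgres=# SELECT pg_create_physical_replication_slot('standby_slot');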

Wednesday, September 11, 2024

PostgreSQL - Backup and Restore

  • pg_dump is a utility for backing up a PostgreSQL database. It makes consistent backups even if the database is being used concurrently. pg_dump does not block other users accessing the database (readers or writers)
  • pg_dump only dumps a single database.
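
Because pg_dump covers only one database, cluster-wide objects (roles, tablespaces) are not included; pg_dumpall dumps the whole cluster, though it emits plain-text SQL only. For example (the output path is an assumption):

[postgres@postgres-master ~]$ pg_dumpall > /source/cluster.sql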
[postgres@postgres-master ~]$ pg_dump --help
pg_dump dumps a database as a text file or to other formats.

Usage:
  pg_dump [OPTION]... [DBNAME]

General options:
  -f, --file=FILENAME          output file or directory name
  -F, --format=c|d|t|p         output file format (custom, directory, tar, plain text (default))
  • p - plain: Output a plain-text SQL script file (the default). Use psql to restore from this script.
  • c - custom: Output a custom-format archive suitable for input into pg_restore. Together with the directory output format, this is the most flexible output format in that it allows manual selection and reordering of archived items during restore. This format is also compressed by default.
  • d - directory: Output a directory-format archive suitable for input into pg_restore. This will create a directory with one file for each table and large object being dumped, plus a so-called Table of Contents file describing the dumped objects in a machine-readable format that pg_restore can read. A directory format archive can be manipulated with standard Unix tools; for example, files in an uncompressed archive can be compressed with the gzip, lz4, or zstd tools. This format is compressed by default using gzip and also supports parallel dumps.
  • t - tar : Output a tar-format archive suitable for input into pg_restore. The tar format is compatible with the directory format: extracting a tar-format archive produces a valid directory-format archive. However, the tar format does not support compression. Also, when using tar format the relative order of table data items cannot be changed during restore
  -j, --jobs=NUM               use this many parallel jobs to dump    // only supported by the directory format -Fd

Options controlling the output content:
  -a, --data-only              dump only the data, not the schema    // table contents only, no DDL
  -c, --clean                  clean (drop) database objects before recreating
  -n, --schema=PATTERN         dump the specified schema(s) only
  -N, --exclude-schema=PATTERN do NOT dump the specified schema(s)
  -s, --schema-only            dump only the schema, no data    // only metadata (DDL) of the database
  -t, --table=PATTERN          dump only the specified table(s)
  -T, --exclude-table=PATTERN  do NOT dump the specified table(s)

Connection options:
  -d, --dbname=DBNAME      database to dump
  -h, --host=HOSTNAME      database server host or socket directory
  -p, --port=PORT          database server port number
  -U, --username=NAME      connect as specified database user
  -w, --no-password        never prompt for password
  -W, --password           force password prompt (should happen automatically)

Example:
[postgres@postgres-master ~]$ pg_dump dvdrental -j2 -Fd -f /source/dumpfile
[postgres@postgres-master ~]$ pg_dump dvdrental -f /source/dvdrental.sql
[postgres@postgres-master ~]$ pg_dump dvdrental -t cherry -t actor > cherry_actor.sql
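
For reference, matching restore commands might look like this (a sketch; assumes the target database already exists):

[postgres@postgres-master ~]$ pg_restore -d dvdrental -j 2 /source/dumpfile     # directory-format dump, 2 parallel jobs
[postgres@postgres-master ~]$ psql -d dvdrental -f /source/dvdrental.sql        # plain-text dump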

Monday, September 09, 2024

PostgreSQL - pglogical 2 extension

1. Overview 
  • The pglogical 2 extension provides logical streaming replication for PostgreSQL, using a publish/subscribe model.
  • To use pglogical, both the provider and the subscriber must be running PostgreSQL 9.4 or newer.
  • Use cases supported are:
    • Upgrades between major versions (subject to the version restrictions listed under Limitations below)
    • Full database replication
    • Selective replication of sets of tables using replication sets
    • Selective replication of table rows at either publisher or subscriber side (row_filter)
    • Selective replication of table columns at publisher side
    • Data gather/merge from multiple upstream servers
  • Architectural details:
    • pglogical works on a per-database level, not whole server level like physical streaming replication
    • One Provider may feed multiple Subscribers without incurring additional disk write overhead
    • One Subscriber can merge changes from several origins and detect conflict between changes with automatic and configurable conflict resolution (some, but not all aspects required for multi-master).
    • Cascading replication is implemented in the form of changeset forwarding.
  • Limitations and restrictions
    • To replicate multiple databases you must set up individual provider/subscriber relationships for each. There is no way to configure replication for all databases in a PostgreSQL install at once
    • UPDATEs and DELETEs cannot be replicated for tables that lack a PRIMARY KEY or other valid replica identity such as a UNIQUE constraint. Replication has no way to find the tuple that should be updated/deleted since there is no unique identifier (a workaround sketch follows this list).
    • Automatic DDL replication is not supported. Managing DDL so that the provider and subscriber database(s) remain compatible is the responsibility of the user. 
    • The state of sequences added to replication sets is replicated periodically and not in real-time.
    • PGLogical can replicate across PostgreSQL major versions. Despite that, long term cross-version replication is not considered a design target, though it may often work. Issues where changes are valid on the provider but not on the subscriber are more likely to arise when replicating across versions. 
    • It is safer to replicate from an old version to a newer version since PostgreSQL maintains solid backward compatibility but only limited forward compatibility. Initial schema synchronization is only supported when replicating between same version of PostgreSQL or from lower version to higher version
    • ...
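
As a workaround sketch for the replica-identity limitation above, a table without a primary key can be given a unique index and told to use it as its replica identity (the cherry table and its id column are assumed here; the indexed column must be NOT NULL):

postgres=# ALTER TABLE cherry ADD CONSTRAINT cherry_id_uq UNIQUE (id);
postgres=# ALTER TABLE cherry REPLICA IDENTITY USING INDEX cherry_id_uq;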
2. General information

Master Node
IP: 192.168.56.111
Hostname: postgres-master.taolaoxibup.com
OS: Red Hat Enterprise Linux release 9.4 (Plow)

postgres=# select datname from pg_database ;
  datname  
-----------
 postgres
 dvdrental        // we will replicate this database
 template1
 template0
(4 rows)

Slave Node
IP: 192.168.56.222
Hostname: postgres-slave.taolaoxibup.com
OS: Red Hat Enterprise Linux release 9.4 (Plow)

3. Installation on Master/Slave nodes
# dnf install -y postgresql16-server postgresql16-contrib
# dnf install pglogical_16
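
After installation, pglogical needs a few postgresql.conf settings on both nodes before the extension can be used; a minimal sketch based on the pglogical defaults (the worker/slot/sender counts are starting-point assumptions):

wal_level = 'logical'
shared_preload_libraries = 'pglogical'
max_worker_processes = 10
max_replication_slots = 10
max_wal_senders = 10

Restart PostgreSQL, then create the extension in the database to be replicated:

dvdrental=# CREATE EXTENSION pglogical;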