blog/drafts/postgres_cdc.md
Simon Petit d49b8be4d9
2025-11-10 15:06:15 +01:00

Postgres CDC

What is CDC?

CDC stands for Change Data Capture. It is a mechanism for replicating a database: we listen to the changes made to its tables so that we can reproduce them in another database.
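To make the idea concrete, here is a minimal Python sketch of what the consuming side of a CDC pipeline does: it replays a stream of change events against a copy of a table. The event shape (operation, key, row) and the `apply_change` helper are assumptions made for illustration; they are not the format Postgres emits.

```python
from typing import Any, Dict, Optional

# A replica of one table, keyed by primary key (hypothetical in-memory
# stand-in for the target database).
Replica = Dict[int, Dict[str, Any]]


def apply_change(replica: Replica, op: str, key: int,
                 row: Optional[Dict[str, Any]] = None) -> None:
    """Apply one change event to the replica in place."""
    if op in ("insert", "update"):
        if row is None:
            raise ValueError(f"{op} event needs a row")
        replica[key] = dict(row)
    elif op == "delete":
        # Deleting a missing key is a no-op, so replaying an event twice is safe.
        replica.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")


# Changes captured on the source table, in commit order.
events = [
    ("insert", 1, {"name": "alice", "city": "Paris"}),
    ("insert", 2, {"name": "bob", "city": "Lyon"}),
    ("update", 1, {"name": "alice", "city": "Nantes"}),
    ("delete", 2, None),
]

replica: Replica = {}
for op, key, row in events:
    apply_change(replica, op, key, row)
```

After the four events are replayed, the replica holds a single row: the updated version of row 1.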

This is used in data engineering pipelines to extract data from sources and replicate it into a data warehouse, data lake or lakehouse, for example. This makes it possible to analyse the data without impacting the transactional database, which another application uses as its primary storage.

The other advantage of replicating the database is that the copy can be stored in a different format. For example, the mirrored database can be stored as Parquet files, or in any other columnar storage format, to speed up analytical queries.

Replication in Postgres

The database needs some configuration to enable replication suitable for a CDC data pipeline.

First, the following three lines must be added to the postgresql.conf file:

  • wal_level=logical
  • max_replication_slots=10
  • max_wal_senders=10
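As a quick sanity check, the following Python sketch compares the contents of a configuration file against the three settings listed above. It is a simplified illustration: the parser only understands `name = value` lines, and the helper names (`parse_conf`, `diff_from_expected`) and the sample text are hypothetical. It only tells you whether the file matches the list in this article, not what Postgres strictly requires.

```python
from typing import Dict, Optional

# The three settings listed above, as strings.
EXPECTED = {
    "wal_level": "logical",
    "max_replication_slots": "10",
    "max_wal_senders": "10",
}


def parse_conf(text: str) -> Dict[str, str]:
    """Parse simple `name = value` lines, ignoring comments and blank lines."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue  # skip lines this simple parser does not understand
        settings[name.strip().lower()] = value.strip().strip("'")
    return settings


def diff_from_expected(settings: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Return settings whose value differs from EXPECTED (None if absent)."""
    return {
        name: settings.get(name)
        for name, wanted in EXPECTED.items()
        if settings.get(name) != wanted
    }


# Sample file content; the last setting is commented out on purpose.
sample = """
# replication settings
wal_level = logical
max_replication_slots=10
#max_wal_senders = 10
"""

mismatches = diff_from_expected(parse_conf(sample))
```

Here `mismatches` contains only `max_wal_senders`, because that line is commented out in the sample.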

Here is a quick explanation of what each of these parameters means:

wal_level

WAL stands for Write-Ahead Log. This is the log that Postgres writes to record all operations on the database. By default the level is replica, which is ... [TODO], but for CDC we need the highest level, logical. This level records every transaction happening in the database in enough detail that we can literally reconstruct the database from the log, which is exactly what CDC is trying to achieve.

max_replication_slots

Here comes another concept: replication slots. These are ... [TODO]. Naturally, the WAL is not kept forever, so we need to configure replication slots to ensure that unread WAL is not destroyed before our CDC pipeline has had the chance to read it.
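The retention behaviour described above can be illustrated with a toy model in Python. This is not how Postgres implements slots internally; `ToyWal`, its methods and the integer positions are hypothetical names chosen for this sketch. The point it shows is that log entries can only be discarded once every slot has confirmed reading past them.

```python
from typing import Dict, List, Tuple


class ToyWal:
    """Append-only log whose entries are kept until all slots have read them."""

    def __init__(self) -> None:
        self.entries: List[Tuple[int, str]] = []  # (position, change)
        self.next_pos = 0
        self.slots: Dict[str, int] = {}  # slot name -> next position to read

    def create_slot(self, name: str) -> None:
        # A new slot starts at the current end of the log.
        self.slots[name] = self.next_pos

    def append(self, change: str) -> None:
        self.entries.append((self.next_pos, change))
        self.next_pos += 1

    def read(self, slot: str) -> List[str]:
        """Return the unread changes for a slot without advancing it."""
        start = self.slots[slot]
        return [change for pos, change in self.entries if pos >= start]

    def confirm(self, slot: str, upto: int) -> None:
        """Mark every position below `upto` as consumed by this slot."""
        self.slots[slot] = max(self.slots[slot], upto)

    def recycle(self) -> int:
        """Drop entries that no slot still needs; return how many were dropped."""
        keep_from = min(self.slots.values(), default=self.next_pos)
        before = len(self.entries)
        self.entries = [(p, c) for p, c in self.entries if p >= keep_from]
        return before - len(self.entries)


wal = ToyWal()
wal.create_slot("cdc_pipeline")
wal.append("insert row 1")
wal.append("update row 1")

dropped_before = wal.recycle()      # nothing confirmed yet, so nothing is dropped
pending = wal.read("cdc_pipeline")  # both changes are still available
wal.confirm("cdc_pipeline", upto=2)
dropped_after = wal.recycle()       # both entries can now be discarded
```

In this model a slot that never confirms anything keeps the log growing indefinitely, so a slot should only exist while a consumer is actually reading from it.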

max_wal_senders

[TODO]

Publications