guilhermehuther/cdc

CDC (Change Data Capture) with Debezium, PostgreSQL, and dlt

  • This project demonstrates a log-based Change Data Capture (CDC) workflow in a homogeneous PostgreSQL environment. Debezium captures the changes, dlt consumes the resulting change events, and Docker runs the services.

  • The goal of this experiment is to configure CDC via Debezium, simulate database changes, and observe how those changes are captured and then consumed by dlt in Python.

  • The test_mock_data.py script is used to simulate inserts, updates, and deletes on the source PostgreSQL database defined in docker-compose.yaml.

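Debezium reads changes from PostgreSQL's transaction log (the write-ahead log), so logical replication must be enabled on the source database. The setting that enables it is part of the PostgreSQL service definition in docker-compose.yaml.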
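Once the containers are running, a quick way to confirm that logical replication is active is to ask the server for its wal_level, which must be logical for Debezium's log-based capture. The following is a minimal sketch, not part of the repository: the connection string is a hypothetical placeholder to adjust to the values in docker-compose.yaml, and the function names are made up for illustration. It assumes psycopg 3, which is what pip install psycopg provides.

```python
def logical_replication_enabled(wal_level: str) -> bool:
    """Return True when the reported wal_level allows logical decoding."""
    return wal_level.strip().lower() == "logical"


def fetch_wal_level(conninfo: str) -> str:
    """Ask the server for its current wal_level setting."""
    # Imported lazily so the pure helper above works without the driver installed.
    import psycopg

    with psycopg.connect(conninfo) as conn:
        row = conn.execute("SHOW wal_level").fetchone()
        return row[0]


if __name__ == "__main__":
    # Hypothetical connection string; replace with the values from docker-compose.yaml.
    conninfo = "host=localhost port=5432 dbname=postgres user=postgres password=postgres"
    level = fetch_wal_level(conninfo)
    enabled = logical_replication_enabled(level)
    print(f"wal_level={level}, logical replication enabled: {enabled}")
```

If the check reports replica or minimal, the server was started without logical decoding and Debezium will not be able to stream changes from it.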

Getting Started

Start Docker Services

Make sure Docker is installed and then run:

docker compose up -d

This will start the necessary containers: PostgreSQL, Kafka + Zookeeper (used by Debezium), and Debezium connectors.
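If the compose setup does not register the Debezium connector on its own, it can be registered through the Kafka Connect REST API. The sketch below shows one way to do that from Python. Every concrete value in it is an assumption made for illustration and may differ from this repository: the connector name, host names, credentials, topic prefix, table name, and the Kafka Connect address http://localhost:8083.

```python
import json
import urllib.request


def build_connector_payload(name: str = "source-postgres-connector") -> dict:
    """Build a Debezium PostgreSQL connector definition (all values are placeholders)."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            # pgoutput is the logical decoding plugin built into PostgreSQL 10+.
            "plugin.name": "pgoutput",
            "database.hostname": "postgres",  # assumed compose service name
            "database.port": "5432",
            "database.user": "postgres",      # assumed credentials
            "database.password": "postgres",
            "database.dbname": "postgres",
            # Debezium 2.x setting; 1.x releases use "database.server.name" instead.
            "topic.prefix": "cdc",
            "table.include.list": "public.customers",  # hypothetical table
        },
    }


def register_connector(payload: dict, connect_url: str = "http://localhost:8083") -> int:
    """POST the connector definition to Kafka Connect and return the HTTP status."""
    request = urllib.request.Request(
        f"{connect_url}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    status = register_connector(build_connector_payload())
    print(f"Kafka Connect responded with HTTP {status}")
```

Keeping the payload in its own function makes it easy to inspect or version the connector definition before sending it.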

Install Python Dependencies

You’ll need psycopg to run the mock data script:

pip install psycopg

Simulate Database Changes

Run the script to insert, update, and delete rows in the source PostgreSQL database:

python3 ./test_mock_data.py

Each of these changes produces a change event that Debezium captures and that dlt then consumes.
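To make the shape of those events concrete, the sketch below decodes a Debezium change event into a flat record. It relies on the standard Debezium envelope fields (op, before, after, ts_ms). The function name, the output layout, and the sample rows are hypothetical, and this is not how the repository itself consumes events.

```python
import json

# Debezium operation codes: create, update, delete, and snapshot read.
OPERATIONS = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot"}


def decode_change_event(raw: str) -> dict:
    """Turn one Debezium JSON change event into a flat record."""
    event = json.loads(raw)
    # When the JSON converter includes schemas, the envelope sits under "payload".
    envelope = event.get("payload", event)
    op = envelope["op"]
    # Deletes carry the row image in "before"; all other operations use "after".
    row = envelope["before"] if op == "d" else envelope["after"]
    return {
        "operation": OPERATIONS.get(op, op),
        "row": row,
        "ts_ms": envelope.get("ts_ms"),
    }


if __name__ == "__main__":
    # Made-up sample events for an insert and a delete on a hypothetical table.
    samples = [
        {"payload": {"before": None, "after": {"id": 1, "name": "Ada"}, "op": "c", "ts_ms": 1700000000000}},
        {"payload": {"before": {"id": 1, "name": "Ada"}, "after": None, "op": "d", "ts_ms": 1700000001000}},
    ]
    for sample in samples:
        print(decode_change_event(json.dumps(sample)))
```

Records in this shape are plain dictionaries, so they could be yielded from a dlt resource. The sketch does not handle the tombstone message (a null value) that Debezium emits after a delete by default, and for deletes the before image contains only the primary key unless the table uses REPLICA IDENTITY FULL.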
