## 🪝 Teaser
Did you ever find yourself in this situation?
- Team 🇦 pushes data into a given database (let's say MySQL) with its very own custom software
- Team 🇧 needs to receive these data changes (`INSERT`, `UPDATE`, `DELETE`) as events, so it can push them into, let's say... another database (MariaDB, PostgreSQL,...)
- The base software cannot be changed: you have to "deal with it"

For example, team 🇧's motivation may be data science, real-time analytics, feeding a data lake,...
👉 This blog post is dedicated to this case... and surprisingly, open source solutions do exist to achieve this magic!
## 🤔 About the "why"
The Debezium project's "why" is pretty straightforward:
"Turn your databases into change event streams"
...even for "legacy"-like systems.
## 👂 How it does NOT work (why it's awesome)
The key thing to keep in mind is that Debezium does NOT act as a proxy in front of the database, and that's the most elegant part.
Instead, Debezium literally listens to the database's native change log, whatever you call it:
- MySQL / MariaDB: the binlog
- PostgreSQL: the WAL (write-ahead log)
- Oracle: the archive logs
- ...

It then sends these events, in a common standard format, as Kafka messages... waiting to be consumed later by one or many consumers.
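To make that "common standard format" concrete, here is a minimal Python sketch of what a consumer might do with one such message. The event below is a hand-written, simplified illustration of the Debezium envelope (`before`, `after`, `op`, `source`), not a real capture, and the database/table names are made up:

```python
import json

# A simplified Debezium change event "envelope", as it would arrive as a
# Kafka message value (trimmed for readability; field names follow the
# Debezium envelope: "before", "after", "op", "source").
event_json = """
{
  "payload": {
    "before": null,
    "after": {"id": 1, "name": "alice"},
    "source": {"connector": "mysql", "db": "inventory", "table": "customers"},
    "op": "c",
    "ts_ms": 1700000000000
  }
}
"""

# Debezium encodes the operation as a single letter.
OPS = {"c": "INSERT", "u": "UPDATE", "d": "DELETE", "r": "SNAPSHOT READ"}

def describe(event: str) -> str:
    """Turn a raw change event into a human-readable one-liner."""
    payload = json.loads(event)["payload"]
    action = OPS.get(payload["op"], "UNKNOWN")
    table = "{db}.{table}".format(**payload["source"])
    # For an INSERT, "before" is null and "after" holds the new row;
    # for a DELETE it is the other way around.
    return f"{action} on {table}: {payload['after'] or payload['before']}"

print(describe(event_json))
# → INSERT on inventory.customers: {'id': 1, 'name': 'alice'}
```

Because every connector emits this same envelope, a consumer written once works no matter which source database produced the change.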
## 🪄 How it works
The magic resides in the following workflow:
1. Capture data changes at the database level (the WAL for PostgreSQL, archive logs for Oracle, whatever you call them...)
2. Send/stream the change events to Kafka
3. Consume the Kafka events so they can be pushed to any third-party data service. With JDBC, for example: "consume events from multiple source topics, and then write those events to a relational database by using a JDBC driver."
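To illustrate step 3, here is a hedged sketch of what a JDBC sink connector registration could look like. The connector class and property names follow the Debezium JDBC sink connector documentation, but the topic name, connection URL, and credentials are made-up placeholders:

```python
import json

# Hypothetical Debezium JDBC sink connector configuration: read change
# events from a Kafka topic and write them to a PostgreSQL table.
sink_config = {
    "name": "jdbc-sink",
    "config": {
        "connector.class": "io.debezium.connector.jdbc.JdbcSinkConnector",
        # Topic produced by a Debezium source connector (placeholder name)
        "topics": "inventory.inventory.customers",
        "connection.url": "jdbc:postgresql://postgres:5432/inventory",
        "connection.username": "postgres",
        "connection.password": "postgres",
        # Upsert keeps the target table idempotent if events are replayed
        "insert.mode": "upsert",
        "primary.key.mode": "record_key",
    },
}

# This JSON body would be POSTed to the Kafka Connect REST API,
# e.g. POST http://connect:8083/connectors
print(json.dumps(sink_config, indent=2))
```

Registering a source connector (MySQL, PostgreSQL,...) works the same way: a JSON document POSTed to Kafka Connect, no code deployed inside the database.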
## 🍿 Demo from scratch
Below is the live demo I was able to do from scratch, by following the default instructions for a MySQL instance:
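For reference, the kind of stack such a demo runs on can be sketched as a Docker Compose file. This is an assumption-laden sketch modeled on the Debezium tutorial images: the image tags, ports, and credentials below are placeholders, not necessarily the exact ones used in the demo:

```yaml
version: "3"
services:
  zookeeper:
    image: quay.io/debezium/zookeeper:2.7
  kafka:
    image: quay.io/debezium/kafka:2.7
    depends_on: [zookeeper]
    environment:
      ZOOKEEPER_CONNECT: zookeeper:2181
  mysql:
    # Pre-seeded MySQL image with the binlog enabled and a sample DB
    image: quay.io/debezium/example-mysql:2.7
    environment:
      MYSQL_ROOT_PASSWORD: debezium
      MYSQL_USER: mysqluser
      MYSQL_PASSWORD: mysqlpw
  connect:
    # Kafka Connect with the Debezium connectors pre-installed
    image: quay.io/debezium/connect:2.7
    ports:
      - "8083:8083"   # REST API used to register connectors
    depends_on: [kafka, mysql]
    environment:
      BOOTSTRAP_SERVERS: kafka:9092
      GROUP_ID: 1
      CONFIG_STORAGE_TOPIC: connect_configs
      OFFSET_STORAGE_TOPIC: connect_offsets
      STATUS_STORAGE_TOPIC: connect_statuses
```

Once the stack is up, the source and sink connectors are registered through the Connect REST API on port 8083, and changes made in MySQL start flowing into Kafka.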