CockroachDB and Kafka Connectors
In this article I want to go over the steps to integrate CockroachDB with Kafka Connectors to import data into Kafka topics. CockroachDB has a built-in CDC (changefeeds) feature that can push data natively to any supported sink, without having to integrate a third-party connector. CDC is a robust and mature feature of CockroachDB that lets you stream data changes efficiently in a distributed architecture. For instance, you can stream row-level changes to Kafka simply by enabling rangefeeds and have the batched changes streamed to the Kafka broker without another hop. For more information on how to do this, please read our official docs: CDC
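To illustrate that native path, here is a minimal sketch (assuming a user with privileges to change cluster settings) that enables rangefeeds and creates a changefeed into a Kafka sink. The connection string, table name, and broker address are placeholders, not real values.

```python
# Minimal sketch: create a native CockroachDB changefeed into Kafka.
# Connection string, table name, and broker address are placeholders.
import psycopg2

conn = psycopg2.connect(
    "postgresql://user:password@crdb-host:26257/defaultdb?sslmode=verify-full"
)
conn.autocommit = True  # cluster settings and changefeeds shouldn't run inside a transaction
cur = conn.cursor()

# Rangefeeds must be enabled before changefeeds can emit row-level changes.
cur.execute("SET CLUSTER SETTING kv.rangefeed.enabled = true;")

# Stream batched row changes for one table straight to a Kafka broker.
cur.execute(
    "CREATE CHANGEFEED FOR TABLE orders "
    "INTO 'kafka://kafka-broker:9092' "
    "WITH updated, resolved;"
)
print("changefeed job:", cur.fetchone())  # statement returns the changefeed job ID

cur.close()
conn.close()
```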
Disclaimer
Although the official docs say that CRDB is not a supported source, I was able to connect and stream changes. So, please test thoroughly when using this feature. See: Limitations
Kafka Connectors
If you don't already know what a Kafka connector is, here is the official explanation from their docs:
Kafka connectors are ready-to-use components, which can help us to import data from external systems into Kafka topics and export data from Kafka topics into external systems.
As we know, Apache Kafka is a framework for event streaming. Events are contained in streams that can be produced, consumed, or processed. Connectors act as a bridge and ensure compatibility between event-streaming and non-event-streaming pieces of technology. Thanks to connectors, data can be easily streamed into Kafka from a variety of sources and out of Kafka to a variety of targets.
So, today let's look at how to integrate CRDB with the Kafka Connect PostgreSQL source connector.
PostgreSQL source connector
The Kafka Connect PostgreSQL Source connector for Confluent Cloud can obtain a snapshot of the existing data in a compatible PostgreSQL database and then monitor and record all subsequent row-level changes to that data. The connector supports Avro, JSON Schema, Protobuf, or JSON (schemaless) output data formats. All of the events for each table are recorded in a separate Apache Kafka® topic. The events can then be easily consumed by applications and services.
Note that deleted records are not captured, since the changes are pulled from the DB by querying rather than read from a change log.
Please refer to their official docs for more information.
Integrating the PostgreSQL Source Connector with CRDB Dedicated
Prerequisites:
- Working CockroachDB connection with DB console access
- Working Confluent Cloud console access
> Note: you can always sign up for trial clusters for both of the above if you'd like to test any feature.
As a first step, get all the connection information from your CockroachDB console.
Go to your cluster, click on Connect, and get the connection info similar to the screenshot below; save it. You'll also need the SSL cert to upload for the Confluent connector authentication. On the same pop-up page, under the Command Line section, you should see an option to download the CA cert. This is your SSL cert.
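As a sanity check before filling in the Confluent form, a sketch like the one below can split the console's connection URL into the discrete host, port, database, and user fields the connector asks for, and confirm the downloaded CA cert works. The URL and cert path are made-up placeholders, not real cluster values.

```python
# Sketch: split the console's connection URL into the fields the connector
# form asks for, and verify the CA cert with a direct connection.
# The URL and cert path below are placeholders.
from urllib.parse import urlparse
import psycopg2

url = "postgresql://kafka_user:secret@my-cluster.cockroachlabs.cloud:26257/defaultdb"
parts = urlparse(url)
print("host:", parts.hostname, "port:", parts.port,
      "database:", parts.path.lstrip("/"), "user:", parts.username)

conn = psycopg2.connect(
    host=parts.hostname,
    port=parts.port,
    dbname=parts.path.lstrip("/"),
    user=parts.username,
    password=parts.password,
    sslmode="verify-full",
    sslrootcert="/path/to/root.crt",  # the CA cert downloaded from the console
)
with conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone()[0])
conn.close()
```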
Now, go to Confluent Cloud and create a cluster by clicking on the + Add a cluster option.
Once provisioning finishes, you should have a cluster similar to the one shown above.
- Click on Connectors from the cluster overview page to create a connector.
In the connector marketplace, search for PostgreSQL and you should see the option to integrate the cluster with it. Continue by selecting the topic prefix and providing Kafka credentials; you can either create new credentials or provide an existing Kafka API key & secret.
Now you should see the Authentication pane, which asks for the Postgres DB information; in our case, the CRDB connection info that we captured in the earlier step. See the information below for reference. Note: for the SSL cert, you can copy the cert to any folder on your machine and upload it here.
Pick the configuration that you'd like for the messages and the table information for this connector, as in the sketch below.
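If you just want something simple to point the connector at, here is a hypothetical table with an auto-generated key and an updated-at timestamp that a query-based source connector can use to detect changes. The table name, columns, and values are illustrative only.

```python
# Sketch: a hypothetical table for the connector to snapshot and stream.
# Table name, columns, and sample rows are illustrative placeholders.
import psycopg2

conn = psycopg2.connect(
    "postgresql://user:password@crdb-host:26257/defaultdb?sslmode=verify-full"
)
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS orders (
            id INT PRIMARY KEY DEFAULT unique_rowid(),
            item STRING NOT NULL,
            quantity INT NOT NULL,
            updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
        );
    """)
    cur.execute(
        "INSERT INTO orders (item, quantity) VALUES (%s, %s), (%s, %s);",
        ("widget", 3, "gadget", 5),
    )
conn.close()
```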
Additionally, you can make advanced changes to the connector and apply any transformations to the messages. Once you click on Finish, your connector should be created and start streaming the changes.
Please refer to the image above for the final overview of a created connector. For a quick review of the messages being streamed from CRDB to Kafka, you can go to the topic that was created and select the Messages tab. You should be able to see all the current messages being streamed, as in the screenshot below.
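If you'd rather verify from code instead of the UI, a small consumer like the sketch below can tail the topic the connector created. The bootstrap server, API key/secret, and topic name are placeholders; the topic is typically named with your chosen topic prefix followed by the table name.

```python
# Sketch: tail the connector's topic to verify changes are arriving.
# Bootstrap server, API key/secret, and topic name are placeholders.
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "pkc-xxxxx.us-east-1.aws.confluent.cloud:9092",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<KAFKA_API_KEY>",
    "sasl.password": "<KAFKA_API_SECRET>",
    "group.id": "crdb-verify",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["<topic.prefix>orders"])  # placeholder: topic prefix + table name

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            print("consumer error:", msg.error())
            continue
        print(msg.topic(), msg.value().decode("utf-8"))
finally:
    consumer.close()
```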
Thanks for going through the article.