A Simple Guide to Apache Kafka: Setting It Up and Testing in KRaft Mode
What is Apache Kafka?
Let’s start simple. Imagine Kafka as a mailing system for your data. Different parts of your system, or even different systems, can communicate with each other by sending and receiving data messages. If you have tons of data being created all the time, like user actions on a website, machine logs, or even weather sensor data, Kafka ensures that this data flows smoothly and quickly to wherever it needs to go.
In essence, Kafka makes sure data gets from one place to another in a way that’s fast, reliable, and scalable. So, whether you’re running a small application or a massive enterprise, Kafka’s got you covered.
Why Should You Use Kafka?
Here are a few good reasons why people use Kafka:
- Real-Time Data Processing: Data flows through Kafka almost instantly, making it perfect for live data streams.
- Reliable and Scalable: Kafka can handle millions of data messages every second, which makes it great for both small and large systems.
- Fault-Tolerant: If something goes wrong, Kafka won’t lose your data – it’s designed to recover and keep things moving.
- Decouples Systems: Kafka allows systems to send and receive data without needing to know too much about each other, which keeps everything flexible.
Key Kafka Concepts
Before we jump in, let’s break down some of the basic terms:
- Producer: This is the system or service that sends data to Kafka. Think of it like the sender in the mailing system.
- Consumer: This is the system that reads or takes the data from Kafka. It’s the recipient.
- Topic: A channel or folder where Kafka stores messages. Producers send data to topics, and consumers read data from topics.
- Broker: A Kafka server that holds and manages the messages within topics.
- Partition: Topics can be split into smaller pieces called partitions, which helps Kafka handle a large amount of data more efficiently.
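To make the partition idea concrete, here is a minimal sketch in plain Python (no Kafka client needed) of how a producer-style partitioner maps message keys to partitions. Kafka's default partitioner actually uses murmur2 hashing; CRC32 is used here purely for illustration. The property that matters is that the same key always lands on the same partition, which is how Kafka preserves per-key ordering.

```python
import zlib

def choose_partition(key: str, num_partitions: int) -> int:
    """Map a message key to a partition deterministically.

    Kafka's default partitioner uses murmur2; CRC32 is used here
    purely for illustration. The important property is that the
    same key always maps to the same partition.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# Messages with the same key always go to the same partition,
# so all events for "user-1" stay in order relative to each other.
for key in ["user-1", "user-2", "user-1"]:
    print(key, "-> partition", choose_partition(key, 3))
```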
Kafka in Zookeeper Mode vs. KRaft Mode
Originally, Kafka used Zookeeper to manage its data and operations. Zookeeper was like the manager that made sure everything stayed organized. However, managing Zookeeper and Kafka together could be a bit complex, especially as your system grows.
That’s where KRaft Mode comes in. KRaft Mode lets Kafka manage its own data without needing Zookeeper. This makes the entire setup simpler and more efficient because you’re only managing Kafka now, not both Kafka and Zookeeper.
How Do They Compare?
| Feature | Zookeeper Mode | KRaft Mode |
| --- | --- | --- |
| Metadata Management | Managed by Zookeeper | Managed directly by Kafka |
| Consensus Protocol | Zookeeper's own coordination protocol (ZAB) | Kafka's built-in Raft-based protocol (KRaft) |
| Architecture | Requires a separate Zookeeper cluster | No Zookeeper needed |
| Complexity | More components to manage | Simpler: one less system to worry about |
Setting Up Kafka in KRaft Mode Using Docker
Now that we have a basic understanding, let’s set up Kafka in KRaft mode using Docker. If you’re not familiar with Docker, don’t worry! It’s a tool that lets us run programs in containers, which are like isolated environments, without needing to install everything directly on our computer.
Prerequisites:
- Docker and Docker Compose installed on your machine.
Step 1: Create the Docker Compose File
- Create a file called `docker-compose.yml` in your project directory.
- Copy and paste the following into that file:
```yaml
name: 'stream'
version: '3.8'

services:
  kafka:
    image: confluentinc/cp-kafka:latest
    hostname: kafka
    container_name: kafka
    ports:
      - "9092:9092"
      - "9093:9093"
    environment:
      KAFKA_KRAFT_MODE: "true"  # This enables KRaft mode in Kafka.
      KAFKA_PROCESS_ROLES: controller,broker  # Kafka acts as both broker and controller.
      KAFKA_NODE_ID: 1  # A unique ID for this Kafka instance.
      KAFKA_CONTROLLER_QUORUM_VOTERS: "1@localhost:9093"  # Defines the controller voters.
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_LOG_DIRS: /var/lib/kafka/data  # Where Kafka stores its logs.
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"  # Kafka will automatically create topics if needed.
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1  # Since we're running one broker, one replica is enough.
      KAFKA_LOG_RETENTION_HOURS: 168  # Keep logs for 7 days.
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0  # No delay for consumer rebalancing.
      CLUSTER_ID: "Mk3OEYBSD34fcwNTJENDM2Qk"  # A unique ID for the Kafka cluster.
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./data:/var/lib/kafka/data  # Store Kafka logs on your local machine.
```
Step 2: Start Kafka
Once the file is ready, open your terminal (or command line), go to the directory where you saved the `docker-compose.yml` file, and run this command:

```shell
docker-compose up
```
This command will pull the Kafka image, set everything up, and run Kafka in KRaft mode. Now you have Kafka running locally!
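Once the container is up, a quick sanity check from the host is to confirm the broker's listener accepts TCP connections. This is a rough sketch assuming the compose file above, which advertises the broker on `localhost:9092`; a successful connect doesn't prove Kafka is fully healthy, but a refused one tells you something is wrong.

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# With the compose file above running, the broker listener
# should accept TCP connections on localhost:9092.
print("Kafka listener reachable:", is_port_open("localhost", 9092))
```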
Testing Kafka: Let’s See If It Works!
Great! Now let’s test our Kafka setup to make sure everything is working as expected. We’ll create a topic, send some messages, and then read them back.
Step 1: Access the Kafka Container
To interact with Kafka, we need to get inside the Docker container where Kafka is running. Use this command to access the Kafka container:

```shell
docker exec -it kafka bash
```
Step 2: Create a Topic
Now, let’s create a topic in Kafka. A topic is where messages are stored. We’ll create one called `test-topic`:

```shell
/usr/bin/kafka-topics --create --topic test-topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
```
Step 3: Send Messages to Kafka (Producing)
Next, we’ll send some messages to `test-topic`. We do this by starting a console producer that sends messages to Kafka:

```shell
/usr/bin/kafka-console-producer --bootstrap-server localhost:9092 --topic test-topic
```

(Newer Kafka versions use `--bootstrap-server`; the older `--broker-list` flag still works but is deprecated.)

Type a few messages and press Enter after each one:

```
> Hello Kafka!
> This is a test message.
```
Step 4: Read Messages from Kafka (Consuming)
Now, let’s see if we can read the messages we just sent. To do this, we’ll start a consumer that reads messages from `test-topic`:

```shell
/usr/bin/kafka-console-consumer --bootstrap-server localhost:9092 --topic test-topic --from-beginning
```
You should see the messages you sent earlier:

```
Hello Kafka!
This is a test message.
```
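Under the hood, `--from-beginning` works because each topic partition is an append-only log where every message gets a sequential offset; a consumer can start reading at offset 0 or resume from wherever it left off. Here is a toy in-memory sketch of that model — an illustration of the idea, not the actual Kafka implementation:

```python
class PartitionLog:
    """A toy append-only log: each message gets a sequential offset."""

    def __init__(self):
        self._messages = []

    def append(self, message: str) -> int:
        """Store the message and return the offset it was assigned."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read_from(self, offset: int):
        """Yield (offset, message) pairs starting at the given offset.

        Reading from offset 0 is what --from-beginning does; a consumer
        resuming from a saved offset would pass that offset instead.
        """
        for i in range(offset, len(self._messages)):
            yield i, self._messages[i]

log = PartitionLog()
log.append("Hello Kafka!")
log.append("This is a test message.")

# Reading from offset 0 replays every message, oldest first.
for offset, msg in log.read_from(0):
    print(offset, msg)
```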
Step 5: Check All Topics
If you want to see all the topics that exist in your Kafka setup, you can list them with this command:

```shell
/usr/bin/kafka-topics --list --bootstrap-server localhost:9092
```
Step 6: Check Logs (Optional)
If you want to check what’s happening behind the scenes, you can always check the Kafka container logs:

```shell
docker logs kafka
```
Wrapping It Up
Well done! 🎉 You’ve just set up Kafka in KRaft mode, created a topic, sent messages, and consumed those messages. With KRaft mode, Kafka becomes simpler to manage, since you no longer need Zookeeper.
Kafka is an amazing tool for handling real-time data, and now that you’ve got it running, you can start exploring its potential. Try experimenting with different topics, producing more data, or even connecting Kafka to your own applications to see how it handles real-world use cases.
Good luck, and happy streaming!