In this post we're going to cover how to run Kafka locally for developing our applications. This will be useful for running future examples and can serve as a basic reference; I will try to stay consistent with this approach.
There are many possible ways to run Kafka locally for development; the most common are probably:
- downloading Kafka and Zookeeper manually and running them locally,
- directly from the command line using Docker,
- using a docker-compose file,
- using the Confluent CLI commands,
- running them on the cloud.
Of all the possibilities, the one I find most intuitive and easiest to maintain during projects is docker-compose. I have used different approaches in the past, including VM-based solutions, but docker-compose brings a better developer experience, so I will stick to it in my posts here.
If you don't have Docker and docker-compose installed, please check out my previous post, where I point to the documentation for installing them in your own environment.
Docker Compose uses a simple YAML file and can also build from Dockerfiles for more advanced setups, if you need more control over the images you're building in your projects. Compose also lets you set configuration parameters, share those settings with other team members through the repository, and even reuse the same file to run a Swarm cluster if you so desire. We're going to focus mostly on docker-compose here, as it's my preference, and I will provide some hints and links to documentation for the other approaches.
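For illustration, here is a minimal sketch of a compose service built from a local Dockerfile; the service name, paths and port are hypothetical and not part of this post's project:

version: '3.2'
services:
  my-app:
    # built from a local Dockerfile instead of pulling a published image
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8080:8080"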
That said, I think it's important to clarify that I am not advocating running your whole application in Docker during development; it should only be the dependencies and mocks you might need. During development it's very important that you can run your unit tests and application directly from the IDE of your choice, for a better developer experience with faster feedback cycles.
Running Kafka from a docker-compose file
There are many Docker images available for Kafka and Zookeeper, and you can also use the binaries to build your own images. Here I will share configurations for a couple of existing options from Docker Hub.
The source code with the compose files used in this post is available, you can clone the repository using:
git clone git@github.com:stockgeeks/docker-compose.git
Open the cloned project in your favorite IDE. The source code for this post is under the folder one-to-run-them-all. Navigate to this folder in a command prompt to run the docker-compose commands presented next. If you have problems running the commands make sure to have docker and docker-compose installed as explained in the link shared above and check this compatibility matrix.
When running Kafka in these examples, notice the configuration in the project, specifically the listeners, advertised listeners and advertised host name settings, as they're key to understanding later how we can connect an application running from the IDE to the local Kafka development broker we run in this post.
Kafka wurstmeister image
For this initial example, we're going to use the latest Kafka Docker image from wurstmeister, which is available on Docker Hub here.
We will not cover components like the Schema Registry in this post; the idea is to keep it simple and focus on running Kafka while learning some ways to interact with it using docker and docker-compose.
Let's first run it in the background; from a shell inside the one-to-run-them-all folder in the project:
docker-compose up -d
You can then use docker ps to check the running containers. If you're comfortable with multiple command-line shells, you can continuously watch the Docker containers in one of them while developing; I usually do that with:
watch docker ps
and hit CTRL + C when I want to stop watching.
If you're using Mac or Windows, it's also possible to install the watch command and use it. Alternatively, if you have other Docker containers running, you can watch only the ones defined in the compose file, from the same directory where the compose file is:
watch docker-compose ps
To check and follow the container logs, in another terminal window, again at the same level as the docker-compose file:
docker-compose logs -f
Now that we have Kafka running locally, let's issue some commands to test our setup. First, enter the running Kafka Docker container:
docker exec -it kafka /bin/bash
Now you're inside the running Kafka container and can access the Kafka helper scripts available on the command line under /opt/kafka/bin. Let's start by creating a topic called client with 1 partition and a replication factor of 1:
./kafka-topics.sh --bootstrap-server kafka:9092 --create --topic client --partitions 1 --replication-factor 1
Let's then list the topics:
./kafka-topics.sh --bootstrap-server kafka:9092 --list
You should see client, as this is the topic we just created. Let's get some more details with describe:
./kafka-topics.sh --bootstrap-server kafka:9092 --topic client --describe
Now you should get some more details like the partition, leader, replicas and in-sync replicas, which in this case all reflect the single partition and replication factor of 1 we used when creating the topic.
With a small variation of the docker command you can execute the same Kafka commands directly from the host; all you need to do is prefix them with docker exec -it kafka. So from the host machine, and NOT inside the running Kafka container, you can run:
docker exec -it kafka /opt/kafka/bin/kafka-topics.sh --bootstrap-server kafka:9092 --topic client --describe
and you should get exactly the same output as before.
OK, let's now publish a message to the Kafka topic using the console producer that ships with Kafka. It would be nice to confirm immediately that the message was published, so let's first start a consumer listening on that topic, which will print each new message as it arrives:
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic client --from-beginning
And now, in another terminal window, let's produce messages to the client topic:
./kafka-console-producer.sh --broker-list localhost:9092 --topic client
Your terminal will be waiting for input with a > prompt. Type a line and press Enter; the message will be produced to Kafka and received by the consumer in the other terminal window.
Now that we have Kafka running and have produced and consumed a few messages, let's check the docker-compose file:
version: '3.2'
services:
  # https://github.com/wurstmeister/zookeeper-docker
  zookeeper:
    container_name: zookeeper
    image: wurstmeister/zookeeper:latest
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
    ports:
      - "2181:2181"
  # https://hub.docker.com/r/wurstmeister/kafka/
  kafka:
    container_name: kafka
    image: wurstmeister/kafka:2.12-2.2.1
    environment:
      ## the >- used below infers a string value and properly
      ## folds the multiple lines, resulting in one long string:
      ## https://yaml.org/spec/1.2/spec.html
      KAFKA_ADVERTISED_LISTENERS: >-
        LISTENER_DOCKER_INTERNAL://kafka:19092,
        LISTENER_DOCKER_EXTERNAL://${DOCKER_HOST_IP:-kafka}:9092
      KAFKA_LISTENERS: >-
        LISTENER_DOCKER_INTERNAL://:19092,
        LISTENER_DOCKER_EXTERNAL://:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: >-
        LISTENER_DOCKER_INTERNAL:PLAINTEXT,
        LISTENER_DOCKER_EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: LISTENER_DOCKER_INTERNAL
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_LOG4J_LOGGERS: >-
        kafka.controller=INFO,
        kafka.producer.async.DefaultEventHandler=INFO,
        state.change.logger=INFO
    ports:
      - 9092:9092
    depends_on:
      - zookeeper
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
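Notice the two listeners defined above: LISTENER_DOCKER_INTERNAL (kafka:19092) is what other containers on the compose network and the broker itself use, while LISTENER_DOCKER_EXTERNAL (port 9092, published to the host) is what clients on your machine use. As a minimal sketch for connecting from the host, you can have the broker advertise an address your machine can resolve; 127.0.0.1 here is just an example value for the DOCKER_HOST_IP variable that the compose file reads:

# run this before docker-compose up so the external listener advertises 127.0.0.1
export DOCKER_HOST_IP=127.0.0.1
docker-compose up -d

# an application started from your IDE on the host can then connect with
#   bootstrap.servers=localhost:9092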
To stop your Kafka and Zookeeper, from the same folder where the compose file is, run:
docker-compose down -v
which will also clean up the mounted volumes, resetting the Kafka topics. If you want to keep existing messages, remove the -v from the command.
Confluent Kafka and Zookeeper images
Let's now run the Confluent Kafka Docker images.
In the project, under the folder one-to-run-them-all, there's a second docker-compose file called docker-compose-confluent.yml. To run it we can issue the same command as before, specifying the file name with the -f option. But before we do that, please make sure the previous containers are stopped; if not, run from the command line:
docker-compose down -v
which will stop the containers we started previously. Then run:
docker-compose -f docker-compose-confluent.yml up -d
Check that the containers are running, and put a watch docker ps as before to be sure the Confluent Zookeeper and Kafka containers are up. If so, let's execute the same commands as before. The main difference is that Confluent Kafka has its own installation standards, so the Kafka scripts are under /usr/bin, which basically makes them accessible from anywhere in the shell inside the container. Go ahead and repeat the same steps from the previous example; the commands to enter the container, list topics, publish and consume are almost the same:
Enter the kafka container:
docker exec -it kafka /bin/bash
Start the consumer
kafka-console-consumer --bootstrap-server localhost:9092 --topic client
Start the producer
kafka-console-producer --broker-list kafka:9092 --topic client
By default, for development (which these images are meant for), topic creation happens automatically when a client tries to consume from or produce to a topic that doesn't exist; this should be disabled in a production environment.
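If you want to mimic that production behaviour locally, a minimal sketch is adding the following to the kafka service environment in the compose file; this assumes the image maps KAFKA_-prefixed environment variables to broker settings (here auto.create.topics.enable), as both images used in this post do:

services:
  kafka:
    environment:
      # disables automatic topic creation on first produce/consume
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "false"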
Command line using docker images
This option only requires that you have Docker installed, not docker-compose: you run the Docker images for Kafka and Zookeeper straight from the command line and that's it. The tricky part is picking the Docker image, as there are many available (a minimal docker run sketch follows the list below). The most popular, in my perception (this would need further research to confirm), are:
- wurstmeister
- Confluent Kafka Images
Please leave a comment if you prefer any other kafka image for development.
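For reference, here is a minimal docker run sketch using the wurstmeister images from this post; the network name and the single PLAINTEXT listener advertising localhost are assumptions for a simple, host-only setup:

docker network create kafka-net
docker run -d --name zookeeper --network kafka-net -p 2181:2181 \
  wurstmeister/zookeeper:latest
docker run -d --name kafka --network kafka-net -p 9092:9092 \
  -e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
  -e KAFKA_LISTENERS=PLAINTEXT://:9092 \
  -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092 \
  -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
  wurstmeister/kafka:2.12-2.2.1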
Command line using binaries
This approach is well documented with all required links to download the binaries in the official Apache Kafka Documentation Quick Start section.
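As a rough sketch, once you've downloaded the binaries as described there (the version in the file name below is only an example), extracting them and starting Zookeeper and then Kafka, each in its own terminal, looks like this:

tar -xzf kafka_2.12-2.2.1.tgz
cd kafka_2.12-2.2.1

# first terminal: start Zookeeper
bin/zookeeper-server-start.sh config/zookeeper.properties

# second terminal: start the Kafka broker
bin/kafka-server-start.sh config/server.properties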
Confluent CLI tools
Confluent provides a very user-friendly way of running a local Kafka for development together with the rest of its platform, which includes many extra features such as management and monitoring tools and much more.
If you're using the Confluent stack this can be a great option; they have very detailed documentation about it on their pages.
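As a sketch, with the Confluent Platform installed and its CLI on the PATH, spinning the local stack up and down looked roughly like this at the time of writing (check the current CLI documentation, as the exact subcommands have changed between versions):

# starts Zookeeper, Kafka and the other local platform services
confluent local start

# stops them again
confluent local stop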
Managed Kafka Offers In the Cloud
If you plan to run your Kafka cluster fully managed in the cloud, there are a few options. The most natural is Confluent Cloud, as it's managed by a team that includes many Kafka core contributors and its creators; currently you can deploy Confluent-managed Kafka clusters to Google Cloud or AWS.
Other possible options include:
- IBM Event Streams for IBM Cloud is a fully automated, managed Kafka service provided by IBM.
- For Azure, check their Quickstart article, and more details in this post with extra links and detailed information.
- For AWS there's the official offering, Amazon MSK, but you will also find other options and companies offering managed Kafka cluster services on top of AWS.
- For Google Cloud it's recommended to use the Confluent offering mentioned above.
Kubernetes or Swarm versions
I don't have any experience here, but there are some talks available, especially about running Kafka in Kubernetes.
What's next
So in this post I've tried to give an introduction on how to spin up local Kafka infrastructure for development. This is the basis for the upcoming posts, where we will see a bit more about developing Kafka applications with Spring Boot.