Kafkacat is an awesome tool, and today I want to show you how easy it is to use and some of the cool things you can do with it.
All the features explained below are available in version 1.5.0.
Looking for a quick Kafkacat reference? Download the Kafkacat 1-page cheatsheet
Installing Kafkacat
Kafkacat is available from Homebrew (latest version) and some Linux repositories, though Linux repos may not contain the latest version. If that’s the case, you can always run the latest kafkacat from Docker.
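For example, assuming Docker is installed, a minimal sketch of running kafkacat 1.5.0 from the edenhill/kafkacat image looks like this (the broker address is an assumption; adjust it for your setup):

```shell
# Arguments after the image name are passed straight to kafkacat.
# --network=host lets the container reach a broker listening on localhost.
docker run --rm -it --network=host edenhill/kafkacat:1.5.0 \
  -b localhost:9092 -L
```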
The basics
Kafkacat is a command-line tool for producing and consuming Kafka messages. In addition, you can view metadata about the cluster or topics.
Kafkacat has quite a few parameters, and learning them all might look scary, yet most of them make sense and are easy to remember. Let’s start with the most important: modes. When making a call to Kafkacat, you’ll always use it in one of its four modes. All the modes use a capital letter:
- -P = Produce data
- -C = Consume data
- -L = List metadata
- -Q = Query
The next most important option is the broker list (-b), and after that, it’s usually the topic (-t).
So you can almost write your command like a story. The following command:
kafkacat -C -b localhost:9092 -t topic1 -o beginning
could be read as: I want to Consume from broker localhost:9092 and topic topic1 with offset set to the beginning.
Ok, now that I have hopefully convinced you that all those cryptic parameters make sense, let’s look at how to use Kafkacat to accomplish some common tasks.
Producing data (-P)
What do we need to produce data? At a minimum, a broker and a topic you want to write to.
Produce values
kafkacat -P -b localhost:9092 -t topic1
The default message separator is a newline: type your messages and separate them by pressing Enter.
Producing keys and values
If you want to produce messages with a key, you need to specify the key delimiter (-K). Let’s use a colon to separate the key and the value in the input:
kafkacat -P -b localhost:9092 -t topic1 -K :
key3:message3
key4:message4
Notice that this parameter uses a capital K.
Produce messages with headers
If you want to add headers to the messages, add them using the -H parameter, in a key=value format:
kafkacat -P -b localhost:9092 \
-t topic1 \
-H appName=kafkacat -H appId=1
As you can see, additional headers are added by repeating the -H flag. Note that all the messages produced will carry the two headers specified with the -H flags.
Produce data from a file
If you want to produce data from a file, use the -l option (as in: fi*l*e)… I did say that most of the parameters are easy to remember :). Let’s say we have a file called data.txt containing key-value pairs separated by a colon:
key1:message1
key2:message2
key3:message3
So the command would be:
kafkacat -P -b localhost:9092 -t topic1 -K: -l data.txt
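If you want to try this end to end, here’s a minimal sketch that generates data.txt from the shell; the kafkacat call itself needs a running broker, so it is shown as a comment (the broker address and topic name are assumptions):

```shell
# Generate a small key:value data file like the one above.
printf 'key1:message1\nkey2:message2\nkey3:message3\n' > data.txt

# With a broker running, this would produce each line as a keyed message:
# kafkacat -P -b localhost:9092 -t topic1 -K: -l data.txt
```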
Produce message with compression
Using the -z parameter, you can specify message compression:
kafkacat -P -b localhost:9092 -t topic1 -z snappy
Supported values are: snappy, gzip and lz4.
Consuming data (-C)
Simple consumer
Consume all the messages from a topic
kafkacat -C -b localhost:9092 -t topic1
Note that, unlike kafka-console-consumer, kafkacat will consume the messages from the beginning of the topic by default. This approach makes more sense to me, but YMMV.
Consume X messages
You can control how many messages will be consumed using the count parameter (-c, lowercase).
kafkacat -C -b localhost:9092 -t topic1 -c 5
Consuming from an offset
If you want to read data from a particular offset, you can use the -o parameter. The offset parameter is very versatile. You can:
Consume messages from the beginning or end
kafkacat -C -b localhost:9092 -t topic1 -o beginning
Use the constants beginning or end to tell kafkacat where to begin consumption.
Consume from a given offset
kafkacat -C -b localhost:9092 -t topic1 -o 123
Use an absolute value for the offset and Kafkacat will start consuming from the given offset. If you don’t specify the partition to consume, Kafkacat will consume all the partitions from the given offset.
Consume the last X messages from a partition
kafkacat -C -b localhost:9092 -t topic1 -o -10
We do this by using a negative offset value.
Consume based on a timestamp
It is possible to start consuming from a given timestamp in milliseconds using the format -o s@start_timestamp. Technically, this is still consuming based on an offset; the difference is that kafkacat figures out the offset for you based on the provided timestamp(s).
kafkacat -C -b localhost:9092 -t topic1 -o s@start_timestamp
You can also stop consuming when a given timestamp is reached using:
kafkacat -C -b localhost:9092 -t topic1 -o e@end_timestamp
This is very useful when you are debugging an error: you have the timestamp of the error and want to inspect the messages around it. By combining the start and end offsets, you can narrow down your search:
kafkacat -C -b localhost:9092 -t topic1 -o s@start_timestamp -o e@end_timestamp
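The timestamps are Unix epoch milliseconds. Assuming GNU date is available, you can compute them from a human-readable instant like this (the date and one-hour window are made up for illustration; the kafkacat call needs a broker, so it’s commented out):

```shell
# Convert UTC instants to epoch milliseconds (requires GNU date).
start_ts=$(( $(date -u -d '2020-05-03T00:00:00Z' +%s) * 1000 ))
end_ts=$(( $(date -u -d '2020-05-03T01:00:00Z' +%s) * 1000 ))
echo "$start_ts $end_ts"   # -> 1588464000000 1588467600000

# With a broker running:
# kafkacat -C -b localhost:9092 -t topic1 -o s@$start_ts -o e@$end_ts
```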
Formatting the output
By default, Kafkacat prints only the message payload (the value of the Kafka record), but you can print anything you’re interested in. To define a custom output, specify the -f flag (as in format) followed by a format string. Here’s an example that prints the key and value of each message:
kafkacat -C -b localhost:9092 -t topic1 \
-f 'Key is %k, and message payload is: %s \n'
%k and %s are format string tokens. The output might be something like this:
Key is key3, and message payload is: message3
Key is key4, and message payload is: message4
So what can you print out using format string?
- topic (%t)
- partition (%p)
- offset (%o)
- timestamp (%T)
- message key (%k)
- message value (%s)
- message headers (%h)
- key length (%K)
- value length (%S)
As you’ve seen above, you can also use newline (\n), carriage return (\r), or tab (\t) characters in the format string.
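Besides -f, kafkacat can print each record as a JSON envelope with the -J flag, which is handy for piping into jq. Since producing a real record needs a broker, the envelope below is a hand-written sample of that output (field names like payload are to the best of my knowledge; verify against your kafkacat version):

```shell
# A sample record as kafkacat -C -J would print it (values are illustrative).
record='{"topic":"topic1","partition":0,"offset":42,"tstype":"create","ts":1588534509794,"key":"key3","payload":"message3"}'

# Reproduce the -f 'Key is %k ...' output using jq string interpolation:
echo "$record" | jq -r '"Key is \(.key), and message payload is: \(.payload)"'
# -> Key is key3, and message payload is: message3
```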
Serdes
If the messages are not written as strings, you need to configure the proper serde for keys and values using the -s parameter.
For example, if both the key and value are 32-bit integers, you would read them using:
kafkacat -C -b localhost:9092 -t topic1 -s i
You can also specify the serde separately for the key and value:
kafkacat -C -b localhost:9092 -t topic1 -s key=i -s value=s
You will find the list of all the serdes in the kafkacat help (kafkacat -h).
Avro serde
Avro messages are a bit special, since they require a schema registry. But Kafkacat has you covered there as well: use the -r parameter to specify the schema registry URL:
kafkacat -C -b localhost:9092 \
-t avro-topic \
-s key=s -s value=avro \
-r http://localhost:8081
In the example above, we’re reading messages from a topic where the keys are strings but the values are Avro.
List metadata (-L)
Listing metadata gives you info about topics: how many partitions each has, which broker is the leader for each partition, and the list of in-sync replicas (ISRs).
Metadata for all topics
kafkacat -L -b localhost:9092
Simply calling -L with no other parameters will display the metadata for all the topics in the cluster.
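If you only need the topic names, one approach is to combine -L with the -J flag, which emits the metadata as JSON, and filter it with jq. This needs a broker, so the sketch below runs jq on a hand-written sample of the metadata (the exact field names, such as topics[].topic, are assumptions; check the output of your kafkacat version):

```shell
# A trimmed sample of kafkacat -L -J metadata output (illustrative).
metadata='{"brokers":[{"id":1,"name":"localhost:9092"}],"topics":[{"topic":"topic1","partitions":[]},{"topic":"topic2","partitions":[]}]}'

# Extract just the topic names:
echo "$metadata" | jq -r '.topics[].topic'
# -> topic1
#    topic2

# With a broker running:
# kafkacat -L -b localhost:9092 -J | jq -r '.topics[].topic'
```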
Metadata for a given topic
If you want to see the metadata for just one topic, specify it using the -t parameter:
kafkacat -L -b localhost:9092 -t topic1
Query mode (-Q)
If you want to find the offset of a Kafka record based on a timestamp, Query mode can help with that. Just specify the topic, partition, and timestamp (in milliseconds), in the format topic:partition:timestamp:
kafkacat -b localhost:9092 -Q -t topic1:1:1588534509794
Is this all?
I'm glad you asked, because it's not :) I have created a 1-page Kafkacat cheatsheet for you to download. Grab it here.
Top comments (7)
Thanks for the post. I am looking for a solution to export/import so that I am able to test my Kafka Streams application locally.
With this, I am able to export cloud data to a file and import it multiple times into my local Kafka to finish my stream test.
Hi, I am trying to copy data from one topic to another topic, but it's not working.
This is the command I am using:
sudo docker run --rm --network=host edenhill/kafkacat:1.5.0 \
kafkacat -C -b broker -t 302008 -e | \
kafkacat -b broker -t 301039load
It gave the error below:
-bash: kafkacat: command not found
% Reached end of topic 302008 [0] at offset 18: exiting
How do I display all the topics in Kafka?
Hi Rakesh,
You could use the -L option without the topic parameter. That will list all the topics. Do note that it will also list topic partitions. For this job, I would rather recommend the kafka-topics command. You can also do it using the JSON output and filter it with jq.

How do I read messages between two different offsets? For example, from the 1st offset to the 15th offset, considering a topic has a single partition only?
Simple. Really liked it.
Please make more Kafka-related posts.