Kafkacat is an awesome tool, and today I want to show you how easy it is to use and some of the cool things you can do with it.
All the features explained below are available in version 1.5.0.
Looking for a quick Kafkacat reference? Download the Kafkacat 1-page cheatsheet
Installing Kafkacat
Kafkacat is available from Homebrew (latest version) and some Linux repositories, though Linux repos may not contain the latest version. If that’s the case, you can always run the latest kafkacat from Docker.
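For example, assuming Docker is installed, a minimal sketch of running kafkacat 1.5.0 from the edenhill/kafkacat image looks like this (the broker address is an assumption; adjust it for your setup):

```shell
# Arguments after the image name are passed straight to kafkacat.
# --network=host lets the container reach a broker listening on localhost.
docker run --rm -it --network=host edenhill/kafkacat:1.5.0 \
  -b localhost:9092 -L
```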
The basics
Kafkacat is a command-line tool for producing and consuming Kafka messages. In addition, you can view metadata about the cluster or topics.
Kafkacat has quite a few parameters, and learning them all might look scary, yet most of them make sense and are easy to remember. Let’s start with the most important: modes. When making a call to Kafkacat, you’ll always use it in one of its four modes. All the modes use a capital letter:
- -P = Produce data
- -C = Consume data
- -L = List metadata
- -Q = Query
The next most important option is the broker list (-b), and after that, it’s usually the topic (-t).
So you can almost write your command like a story. The following command:
kafkacat -C -b localhost:9092 -t topic1 -o beginning
could be read as: I want to Consume from broker localhost:9092 and topic topic1 with offset set to the beginning.
Ok, now that I have hopefully convinced you that all those cryptic parameters make sense, let’s look at how to use Kafkacat to accomplish some common tasks.
Producing data (-P)
What do we need to produce data? At a minimum, a broker and a topic you want to write to.
Produce values
kafkacat -P -b localhost:9092 -t topic1
The default message separator is a newline: type your messages and separate them by pressing Enter.
Producing keys and values
If you want to produce messages with a key, you need to specify the key delimiter (-K). Let’s use a colon to separate the key and the value in the input:
kafkacat -P -b localhost:9092 -t topic1 -K :
key3:message3
key4:message4
Notice that this parameter uses a capital K.
Produce messages with headers
If you want to add headers to the messages, add them using the -H parameter, in a key=value format:
kafkacat -P -b localhost:9092 \
-t topic1 \
-H appName=kafkacat -H appId=1
As you can see, additional headers are added by repeating the -H flag. Note that all the messages produced will carry the two headers specified with the -H flags.
Produce data from a file
If you want to produce data from a file, use the -l option (as in: fi*l*e)… I did say that most of the parameters are easy to remember :). Let’s say we have a file called data.txt containing key-value pairs separated by a colon:
key1:message1
key2:message2
key3:message3
So the command would be:
kafkacat -P -b localhost:9092 -t topic1 -K: -l data.txt
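If you want to try this end to end, here’s a minimal sketch that generates data.txt from the shell; the kafkacat call itself needs a running broker, so it is shown as a comment (the broker address and topic name are assumptions):

```shell
# Generate a small key:value data file like the one above.
printf 'key1:message1\nkey2:message2\nkey3:message3\n' > data.txt

# With a broker running, this would produce each line as a keyed message:
# kafkacat -P -b localhost:9092 -t topic1 -K: -l data.txt
```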
Produce message with compression
Using the -z parameter, you can specify message compression:
kafkacat -P -b localhost:9092 -t topic1 -z snappy
Supported values are: snappy, gzip and lz4.
Consuming data (-C)
Simple consumer
Consume all the messages from a topic
kafkacat -C -b localhost:9092 -t topic1
Note that, unlike kafka-console-consumer, kafkacat will consume the messages from the beginning of the topic by default. This approach makes more sense to me, but YMMV.
Consume X messages
You can control how many messages will be consumed using the count parameter (-c, lowercase).
kafkacat -C -b localhost:9092 -t topic1 -c 5
Consuming from an offset
If you want to read data from a particular offset, you can use the -o parameter. The offset parameter is very versatile. You can:
Consume messages from the beginning or end
kafkacat -C -b localhost:9092 -t topic1 -o beginning
Use the constants beginning or end to tell kafkacat where to begin consumption.
Consume from a given offset
kafkacat -C -b localhost:9092 -t topic1 -o 123
Use an absolute value for the offset and Kafkacat will start consuming from the given offset. If you don’t specify the partition to consume, Kafkacat will consume all the partitions from the given offset.
Consume the last X messages from a partition
kafkacat -C -b localhost:9092 -t topic1 -o -10
We do this by using a negative offset value.
Consume based on a timestamp
It is possible to start consuming from a given timestamp in milliseconds using the format -o s@start_timestamp. Technically, this is still consuming based on an offset; the difference is that kafkacat figures out the offset for you based on the provided timestamp(s).
kafkacat -C -b localhost:9092 -t topic1 -o s@start_timestamp
You can also stop consuming when a given timestamp is reached using:
kafkacat -C -b localhost:9092 -t topic1 -o e@end_timestamp
This is very useful when you are debugging an error: you have the timestamp of the error and want to inspect the messages around it. By combining the start and end offsets, you can narrow down your search:
kafkacat -C -b localhost:9092 -t topic1 -o s@start_timestamp -o e@end_timestamp
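The timestamps are Unix epoch milliseconds. Assuming GNU date is available, you can compute them from a human-readable instant like this (the date and one-hour window are made up for illustration; the kafkacat call needs a broker, so it’s commented out):

```shell
# Convert UTC instants to epoch milliseconds (requires GNU date).
start_ts=$(( $(date -u -d '2020-05-03T00:00:00Z' +%s) * 1000 ))
end_ts=$(( $(date -u -d '2020-05-03T01:00:00Z' +%s) * 1000 ))
echo "$start_ts $end_ts"   # -> 1588464000000 1588467600000

# With a broker running:
# kafkacat -C -b localhost:9092 -t topic1 -o s@$start_ts -o e@$end_ts
```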
Formatting the output
By default, Kafkacat prints only the message payload (the value of the Kafka record), but you can print anything you’re interested in. To define a custom output, specify the -f flag (as in format) followed by a format string. Here’s an example that prints the key and value of each message:
kafkacat -C -b localhost:9092 -t topic1 \
-f 'Key is %k, and message payload is: %s \n'
%k and %s are format string tokens. The output might be something like this:
Key is key3, and message payload is: message3
Key is key4, and message payload is: message4
So what can you print out using format string?
- topic (%t)
- partition (%p)
- offset (%o)
- timestamp (%T)
- message key (%k)
- message value (%s)
- message headers (%h)
- key length (%K)
- value length (%S)
As you’ve seen above, you can also use newline (\n), carriage return (\r), or tab (\t) characters in the format string.
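Besides -f, kafkacat can print each record as a JSON envelope with the -J flag, which is handy for piping into jq. Since producing a real record needs a broker, the envelope below is a hand-written sample of that output (field names like payload are to the best of my knowledge; verify against your kafkacat version):

```shell
# A sample record as kafkacat -C -J would print it (values are illustrative).
record='{"topic":"topic1","partition":0,"offset":42,"tstype":"create","ts":1588534509794,"key":"key3","payload":"message3"}'

# Reproduce the -f 'Key is %k ...' output using jq string interpolation:
echo "$record" | jq -r '"Key is \(.key), and message payload is: \(.payload)"'
# -> Key is key3, and message payload is: message3
```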
Serdes
If the messages are not written as strings, you need to configure the proper serde for keys and values using the -s parameter.
For example, if both the key and value are 32-bit integers, you would read them using:
kafkacat -C -b localhost:9092 -t topic1 -s i
You can also specify the serde separately for the key and value:
kafkacat -C -b localhost:9092 -t topic1 -s key=i -s value=s
You will find the list of all the serdes in the kafkacat help (kafkacat -h).
Avro serde
Avro messages are a bit special, since they require a schema registry. But Kafkacat has you covered there as well: use the -r parameter to specify the schema registry URL:
kafkacat -C -b localhost:9092 \
-t avro-topic \
-s key=s -s value=avro \
-r http://localhost:8081
In the example above, we’re reading messages from a topic where the keys are strings but the values are Avro.
List metadata (-L)
Listing metadata gives you info about topics: how many partitions each has, which broker is the leader for each partition, and the list of in-sync replicas (ISRs).
Metadata for all topics
kafkacat -L -b localhost:9092
Simply calling -L with no other parameters will display the metadata for all the topics in the cluster.
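If you only need the topic names, one approach is to combine -L with the -J flag, which emits the metadata as JSON, and filter it with jq. This needs a broker, so the sketch below runs jq on a hand-written sample of the metadata (the exact field names, such as topics[].topic, are assumptions; check the output of your kafkacat version):

```shell
# A trimmed sample of kafkacat -L -J metadata output (illustrative).
metadata='{"brokers":[{"id":1,"name":"localhost:9092"}],"topics":[{"topic":"topic1","partitions":[]},{"topic":"topic2","partitions":[]}]}'

# Extract just the topic names:
echo "$metadata" | jq -r '.topics[].topic'
# -> topic1
#    topic2

# With a broker running:
# kafkacat -L -b localhost:9092 -J | jq -r '.topics[].topic'
```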
Metadata for a given topic
If you want to see the metadata for just one topic, specify it using the -t parameter:
kafkacat -L -b localhost:9092 -t topic1
Query mode (-Q)
If you want to find the offset of a Kafka record based on a timestamp, Query mode can help with that. Just specify the topic, partition, and timestamp (in milliseconds), in the format topic:partition:timestamp:
kafkacat -b localhost:9092 -Q -t topic1:1:1588534509794
Is this all?
I'm glad you asked, because it's not :) I have created a 1-page Kafkacat cheatsheet for you to download. Grab it here.
Top comments (7)
Thanks for the post. I am looking for a solution to export/import so that I am able to test my Kafka Streams application locally.
With this, I am able to export cloud data to a file and import it multiple times into my local Kafka to finish my stream test.
Hi, I am trying to copy data from one topic to another topic, but it's not working.
This is the command I am using:
sudo docker run --rm --network=host edenhill/kafkacat:1.5.0 \
kafkacat -C -b broker -t 302008 -e | \
kafkacat -b broker -t 301039load
It gave the error below:
-bash: kafkacat: command not found
% Reached end of topic 302008 [0] at offset 18: exiting
How do I display all the topics in Kafka?
Hi Rakesh,
You could use the -L option without the topic parameter. That will list all the topics. Do note that it will also list topic partitions. For this job, I would rather recommend the kafka-topics command. You can also do it using the JSON output and filter it with jq.

How do I read messages between two different offsets? For example, from the 1st offset to the 15th offset, considering a topic has a single partition only?
Simple. Really liked it.
Please make more Kafka-related posts.