Yasmine Cherif

Introduction to Messaging Systems with Kafka

Table of Contents

  1. Introduction: Messaging Systems with Kafka

  2. Core Concepts of Messaging Systems and Kafka

  3. Hands-On: Setting Up Kafka

  4. Building Your First Kafka Application

  5. Conclusion

I. Introduction: Messaging Systems with Kafka

1. What Are Messaging Systems?

Messaging systems are the backbone of modern distributed architectures, enabling communication between independent services or applications. They work by transmitting messages (small packets of information) from one system to another, ensuring these systems can work asynchronously and independently.

Imagine an e-commerce platform where a customer places an order. Here’s what happens:

  1. Order Service: Accepts and stores the order details.
  2. Payment Service: Processes the payment.
  3. Inventory Service: Updates stock levels.
  4. Notification Service: Sends an email or SMS confirmation.

Rather than directly linking these services, a messaging system ensures each service receives and processes the required information through messages in a decoupled manner.

2. Why Use Messaging Systems?

  • Asynchronous Communication: Producers (senders) and consumers (receivers) don’t need to interact directly or operate simultaneously.
  • Scalability: Systems can handle high loads by scaling producers, consumers, or both.
  • Resilience: Messages persist even if a consumer is temporarily unavailable.
  • Flexibility: Multiple services can act on the same data, enabling publish-subscribe patterns.

3. Real-World Use Cases

  1. Real-Time Analytics:
    • Streaming live data to dashboards for insights (e.g., stock prices, website metrics).
  2. Log Aggregation:
    • Collecting logs from various servers for centralized monitoring and analysis.
  3. Event-Driven Applications:
    • Handling user actions (e.g., clicking a button) and triggering downstream processes.
  4. Microservices Communication:
    • Decoupling independent services in a microservices architecture.

4. Why Kafka?

Apache Kafka is a distributed messaging system originally built by LinkedIn and now maintained by the Apache Software Foundation. Kafka goes beyond simple message queues, offering features that make it a powerful choice for handling real-time data streams and building event-driven systems.

Key Benefits of Kafka

  1. High Throughput and Low Latency

    Kafka can handle millions of messages per second with minimal delay, making it ideal for applications requiring real-time processing, such as financial transactions, log aggregation, and IoT data streams.

    • Why It Matters: Brokers like RabbitMQ and ActiveMQ track per-message delivery state inside the broker, which becomes a bottleneck at very high volumes. Kafka’s partitioned, log-based design spreads that load across the cluster.
  2. Scalability

    Kafka scales horizontally by adding brokers (servers). Data is distributed across partitions, which can be processed independently by consumers.

    • Comparison: RabbitMQ relies on adding more queues for scale, but this approach can become complex to manage in larger systems. Kafka’s native partitioning simplifies scaling.
  3. Fault Tolerance and Durability

    Kafka replicates data across brokers, ensuring no messages are lost even if a broker fails. This is achieved through configurable replication factors.

    • Compared to Others: Systems like RabbitMQ require specific configurations or external tools for similar fault tolerance, while Kafka includes it natively as part of its design.
  4. Event Streaming at Scale

    Kafka isn’t just a message queue; it’s an event-streaming platform. Kafka supports replaying messages by retaining data for a configurable time, enabling applications to process historical data.

    • Advantage Over Queues: Traditional messaging systems delete messages once consumed. Kafka stores them for the duration of the retention policy, allowing multiple consumers to process the same data independently (see the retention-config sketch after this list).
  5. Flexible Communication Patterns

    Kafka supports both publish-subscribe and point-to-point messaging models, making it versatile for various use cases.

    • Comparison: While RabbitMQ excels in transactional, one-to-one communications, Kafka’s multi-consumer design is better for analytics, streaming, and large-scale applications.
  6. Integration Ecosystem

    Kafka integrates seamlessly with a wide range of tools and frameworks:

    • Kafka Connect: For syncing with databases, Elasticsearch, Hadoop, etc.
    • Kafka Streams: For processing data in real time.
    • KSQL: For querying and transforming data using SQL-like syntax.
    • Compared to Others: While RabbitMQ or ActiveMQ may need third-party plugins for complex integrations, Kafka provides these capabilities natively.
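
As a concrete illustration of the retention point in benefit 4, here is a minimal retention-config sketch using Kafka’s stock CLI (it assumes a topic named user-activity and a broker on localhost:9092, and requires a running setup like the one in section III):

bin/kafka-configs.sh --alter --entity-type topics --entity-name user-activity --add-config retention.ms=604800000 --bootstrap-server localhost:9092

With retention.ms set to 604800000 (seven days), any consumer can replay the past week of messages, whether or not another consumer has already read them.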

When to Choose Kafka?

  • Use Kafka if:
    • Your system needs to process large volumes of data in real time.
    • You require scalable, fault-tolerant messaging.
    • You’re building a data pipeline or event-streaming system (e.g., log aggregation, metrics collection, user activity tracking).
    • Multiple consumers need to process the same data independently (e.g., notifications and analytics).
  • Consider Other Solutions if:
    • You need lightweight, transactional messaging with lower setup complexity (RabbitMQ or ActiveMQ might be better for simple use cases).
    • Your system is event-driven but does not process massive amounts of data.

Kafka’s Versatility in Real-World Applications

  1. Log Aggregation: Collect logs from multiple servers into a central repository for real-time monitoring.
  2. Event Sourcing: Track all user interactions (clicks, purchases, etc.) for downstream analysis.
  3. Real-Time Analytics: Stream data to dashboards or analytics systems for instant insights.
  4. Decoupling Microservices: Enable services to communicate asynchronously without direct dependencies.

II. Core Concepts of Messaging Systems and Kafka

To effectively work with Kafka and messaging systems, it’s crucial to understand their core building blocks. This section introduces the foundational concepts that underpin messaging systems in general and Kafka in particular.

1. Core Concepts of Messaging Systems

  • Producers and Consumers
    • Producers:
      • Applications that send data (messages) to a messaging system.
      • In Kafka, producers send messages to topics.
    • Consumers:
      • Applications that read messages from the messaging system.
      • Kafka consumers subscribe to topics and process messages either individually or as part of a group.
  • Topics
    • A topic is a category or feed to which messages are sent by producers and from which consumers retrieve messages.
    • Kafka topics are partitioned for scalability and performance.
    • Example:
      • Topic: user-signups
        • Messages: {"user_id": 101, "action": "signup"}, {"user_id": 102, "action": "signup"}
  • Queues and Publish-Subscribe Models
    • Queue (Point-to-Point):
      • Messages are delivered to a single consumer.
      • Example: A task queue where only one worker processes each task.
    • Publish-Subscribe:
      • Messages are broadcast to multiple consumers.
      • Example: A topic that multiple services subscribe to, such as notifications and analytics.
  • Message Delivery Semantics
    • Defines the guarantees with which messages are delivered (illustrated in the sketch after this list):
      • At-least-once: Messages might be delivered multiple times but never lost.
      • At-most-once: Messages might be lost but never duplicated.
      • Exactly-once: Messages are delivered precisely once (Kafka supports this with transactional guarantees).
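
A minimal kafka-python sketch of how the first two guarantees map onto producer settings (the broker address is an assumption; exactly-once additionally relies on Kafka’s transactional features, which are not shown here):

from kafka import KafkaProducer

# At-least-once: wait for all in-sync replicas to acknowledge, and retry on
# transient errors. A retry after a lost acknowledgement may duplicate a message.
at_least_once = KafkaProducer(
    bootstrap_servers='localhost:9092',
    acks='all',
    retries=5
)

# At-most-once: fire and forget. Nothing is retried, so nothing is duplicated,
# but a message is lost if the broker never receives it.
at_most_once = KafkaProducer(
    bootstrap_servers='localhost:9092',
    acks=0,
    retries=0
)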

2. Core Concepts in Kafka

Kafka builds on these general concepts, adding its own unique architecture and features.

Apache Kafka Architecture

  1. Brokers
    • Kafka is a distributed system consisting of brokers.
    • A broker is a server that stores and delivers messages to consumers.
    • In a Kafka cluster:
      • Brokers work together to distribute partitions of topics.
      • If one broker fails, others can continue serving data.
  2. Topics and Partitions

    • Kafka topics are divided into partitions.
    • Each partition is an ordered, immutable sequence of messages.
    • Partitioning enables:
      • Scalability: Multiple consumers can read from different partitions in parallel.
      • Fault Tolerance: Data is replicated across brokers.

    Example:

    • Topic: user-activity
    • Partitions:
      • Partition 0: Messages for users with ID ending in 0-3.
      • Partition 1: Messages for users with ID ending in 4-6.
      • Partition 2: Messages for users with ID ending in 7-9.
  3. Producers and Partitions

    • Producers decide which partition to send a message to, either:
      • Round-robin: Distribute messages evenly.
      • Custom logic: Use a key (e.g., user ID) to determine the partition.
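
    Example (a minimal kafka-python sketch; the topic name and broker address are assumptions):

    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers='localhost:9092')

    # With a key, Kafka hashes the key to pick the partition, so every message
    # for user 101 lands in the same partition and stays ordered.
    producer.send('user-activity', key=b'101', value=b'{"action": "click"}')

    # Without a key, messages are spread across partitions for throughput,
    # with no per-key ordering guarantee.
    producer.send('user-activity', value=b'{"action": "pageview"}')
    producer.flush()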
  4. Consumer Groups

    • Kafka uses consumer groups to ensure load balancing:
      • Each consumer in a group reads from a unique partition of a topic.
      • Multiple groups can read the same topic independently.

    Example:

    • Topic: order-processing (3 partitions)
    • Consumer Group A (3 consumers): Each consumer reads from one partition.
    • Consumer Group B (2 consumers): Consumers share partitions.
  5. Offsets

    • Kafka tracks the position of a consumer in a partition using offsets.
    • Offsets ensure that consumers can:
      • Re-read messages: Start at an earlier offset.
      • Resume processing: Continue from the last read offset after a failure.
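
    Example (a sketch using kafka-python’s seek APIs; the topic, partition, and offset values are illustrative):

    from kafka import KafkaConsumer, TopicPartition

    consumer = KafkaConsumer(bootstrap_servers='localhost:9092')
    partition = TopicPartition('user-activity', 0)
    consumer.assign([partition])

    # Re-read messages: jump back to the start of the partition.
    consumer.seek_to_beginning(partition)

    # Resume processing: continue from a previously recorded offset.
    consumer.seek(partition, 42)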
  6. Replication

    • Kafka ensures fault tolerance by replicating partitions across brokers.
    • Each partition has:
      • Leader: Handles all read and write requests.
      • Followers: Replicate data from the leader.
    • If a leader fails, a follower takes over.
    • Leader Election: Zookeeper (or KRaft in newer versions) coordinates the election of a new leader if the current leader becomes unavailable.
  7. Zookeeper (or KRaft) and Its Role in Kafka

    • Zookeeper:
      • Manages metadata and configuration for Kafka clusters.
      • Tracks broker health and coordinates leader elections for partitions.
      • Ensures availability by assigning partition leaders and monitoring replicas.
    • KRaft (Kafka Raft):
      • A new metadata management system replacing Zookeeper in newer Kafka versions.
      • Simplifies Kafka’s architecture by handling metadata internally within Kafka.
      • Improves scalability and reduces operational complexity.

3. Kafka in Action: Example of a Ride-Hailing App like Uber or Lyft

In a ride-hailing app, seamless communication between various services is crucial to handle millions of ride requests and driver assignments every day. Kafka plays a vital role in enabling real-time, scalable, and fault-tolerant messaging for such systems.

Producers: Generating Ride Requests

  • The ride-request service acts as the producer in this system.
  • Whenever a user requests a ride via the app, a message is sent to a Kafka topic named ride-requests.
  • Each message includes important data:
    • User ID
    • Pickup and drop-off locations
    • Timestamp
    • Payment preferences

Example message in JSON format:

{
  "user_id": 12345,
  "pickup": "Location A",
  "dropoff": "Location B",
  "timestamp": "2024-11-28T12:34:56Z"
}

Topic: Organizing Messages

The topic ride-requests is used to organize all incoming ride requests. To improve scalability and parallel processing, the topic is divided into 3 partitions:

  • Partition 0: Requests from Region A (e.g., North).
  • Partition 1: Requests from Region B (e.g., South).
  • Partition 2: Requests from Region C (e.g., Central).

Kafka ensures that:

  1. Each message is assigned to a specific partition based on its region key.
  2. Messages within a partition maintain strict order, which is essential for processing rides sequentially within a region (see the sketch below).
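
A minimal kafka-python sketch of this region-keyed produce (the topic and message fields mirror the example above; the serializers and region key are assumptions):

from kafka import KafkaProducer
import json

producer = KafkaProducer(
    bootstrap_servers='localhost:9092',
    key_serializer=lambda k: k.encode('utf-8'),
    value_serializer=lambda v: json.dumps(v).encode('utf-8')
)

ride_request = {
    "user_id": 12345,
    "pickup": "Location A",
    "dropoff": "Location B",
    "timestamp": "2024-11-28T12:34:56Z"
}

# Keying by region means every Region A request hashes to the same partition,
# preserving per-region ordering.
producer.send('ride-requests', key='region-a', value=ride_request)
producer.flush()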

Consumers: Processing Ride Requests

  1. Driver-Matching Service:

    • Reads ride requests from the ride-requests topic.
    • Matches riders with the nearest available drivers.
    • Updates the status of the ride in the system.
    • Example output:

      {
        "ride_id": 67890,
        "driver_id": 54321,
        "status": "matched"
      }
      
  2. Analytics Service:
    • Reads the same messages from the ride-requests topic.
    • Aggregates data for real-time dashboards, such as:
      • Number of rides requested per region.
      • Average wait times.
      • Driver availability trends.

Consumer Groups: Ensuring Independent Processing

Kafka enables independent processing of the same messages by using consumer groups:

  1. The Driver-Matching Service is part of Consumer Group A.
    • Each consumer in this group processes messages from a specific partition.
    • For example:
      • Consumer 1 processes messages from Partition 0 (Region A).
      • Consumer 2 processes messages from Partition 1 (Region B).
  2. The Analytics Service is part of Consumer Group B.
    • Each consumer in this group reads messages independently from the ride-requests topic.

Key Kafka Feature: Each consumer group processes messages independently. This ensures the driver-matching logic doesn’t interfere with analytics processing.

Kafka Workflow in Action

Here’s a high-level flow of how Kafka facilitates the ride-hailing process:

  1. A User Requests a Ride:
    • The ride-request service sends a message to the ride-requests topic.
  2. Messages Are Partitioned:
    • Kafka assigns the message to a partition based on the user’s region.
  3. Consumers Process Messages:
    • The driver-matching service fetches messages and assigns drivers.
    • The analytics service reads the same messages to update dashboards.

Here’s a simple text-based diagram to illustrate the Kafka-based ride-hailing process:

+-------------------+              +-------------------------+
| Ride Request      |   Produce    | Kafka Topic:            |
| Service (Producer)| -----------> | ride-requests           |
+-------------------+              +-------------------------+
                                     /         |         \
                                    v          v          v
                           +-----------+ +-----------+ +-----------+
                           | Partition | | Partition | | Partition |
                           | 0         | | 1         | | 2         |
                           | (Region A)| | (Region B)| | (Region C)|
                           +-----------+ +-----------+ +-----------+
                                    \          |          /
                 (each consumer group reads all partitions independently)
                          /                               \
                         v                                 v
         +-------------------------+           +---------------------+
         | Driver-Matching Service |           | Analytics Service   |
         | (Consumer Group A:      |           | (Consumer Group B:  |
         |  assigns drivers)       |           |  feeds dashboards)  |
         +-------------------------+           +---------------------+

Advantages of Kafka in This Scenario

  1. Real-Time Processing:
    • Kafka ensures ride requests are processed in near real-time, improving user experience.
  2. Scalability:
    • Kafka partitions and consumer groups allow the system to handle thousands of ride requests per second.
  3. Decoupling:
    • Driver-matching and analytics are decoupled; each service processes the same messages without interfering with the other.
  4. Reliability:
    • Message durability and replication prevent data loss, even in case of server failures.

III. Hands-On: Setting Up Kafka

Now that we’ve covered the core concepts of messaging systems and Kafka, it’s time to get hands-on. This section will guide you through setting up Kafka on your local machine, verifying the installation, and performing basic operations like creating a topic, producing, and consuming messages.

1. What Is Required to Run Kafka?

Before starting, Kafka requires two components:

  1. Kafka Broker: The server that stores and handles messages.
  2. Zookeeper (or KRaft): Manages Kafka metadata and helps with coordination (e.g., leader election).

2. Installation Options

You can set up Kafka using Docker (recommended for simplicity) or install it natively.

Option 1: Using Docker (Recommended)

Docker makes it easy to run Kafka without worrying about complex configurations.

  1. Install Docker:

    • Download and install Docker from the official website.
    • Verify Docker is installed:
     docker --version
    
  2. Create a Docker-Compose File:

    • Save the following YAML content in a file named docker-compose.yml:
     version: '3.7'
     services:
       zookeeper:
         image: confluentinc/cp-zookeeper:latest
         environment:
           ZOOKEEPER_CLIENT_PORT: 2181
       kafka:
         image: confluentinc/cp-kafka:latest
         depends_on:
           - zookeeper
         ports:
           - "9092:9092"
         environment:
           KAFKA_BROKER_ID: 1
           KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
           KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
           KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
           # Needed on a single-broker cluster: the internal offsets topic
           # defaults to replication factor 3, which one broker cannot satisfy.
           KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    
  • Zookeeper Service: Runs on port 2181, coordinating Kafka brokers.
  • Kafka Service: Runs on port 9092, storing and processing messages.
  • Environment Variables: Configure the Zookeeper connection, broker ID, and listeners. KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR is set to 1 so Kafka can create its internal offsets topic on a single broker.
  3. Start Kafka and Zookeeper:

    • Run the following command in the same directory as your docker-compose.yml file:
     docker-compose up -d
    

Expected Output:

   Creating network "default" with the default driver
   Creating zookeeper ... done
   Creating kafka     ... done
  • This starts Kafka and Zookeeper in the background. You can check the running containers with:

     docker ps
    
  4. Test the Kafka Setup:
    • Use Kafka CLI tools to ensure it is running. See the Verifying the Installation section for instructions.

Option 2: Native Installation

If you prefer to install Kafka without Docker, follow these steps:

  1. Download Kafka:

    • Visit the Kafka Downloads page and download the latest binary release.
    • Extract the downloaded file:
     tar -xzf kafka_2.13-<version>.tgz
     cd kafka_2.13-<version>
    
  2. Start Zookeeper:

    • Use the built-in Zookeeper script to start Zookeeper:
     bin/zookeeper-server-start.sh config/zookeeper.properties
    

Expected Output:

   [INFO] binding to port 0.0.0.0/0.0.0.0:2181
   [INFO] Snapshot taken
   [INFO] Started admin server on port [8080]

If you see "binding to port" and "Started admin server," Zookeeper is running successfully.

  3. Start the Kafka Broker:
    • Open a new terminal and start Kafka:
bin/kafka-server-start.sh config/server.properties

Expected Output:

Kafka server started
INFO Registered broker 0 at path /brokers/ids/0
  4. Verify Kafka is Running:
    • Test your installation as described in 3. Verifying the Installation.

3. Verifying the Installation

Step 1: List Existing Topics

bin/kafka-topics.sh --list --bootstrap-server localhost:9092

What This Does:

  • Lists all topics currently managed by the Kafka broker.
  • Confirms Kafka is running properly.

Expected Output:

(no output, as no topics exist yet)

Step 2: Create a Topic

bin/kafka-topics.sh --create --topic test-topic --partitions 3 --replication-factor 1 --bootstrap-server localhost:9092


What This Does:

  • Creates a new topic named test-topic.
  • Splits the topic into 3 partitions (for parallel processing).
  • Sets the replication factor to 1 (no redundancy).

Expected Output:

Created topic test-topic.

Step 3: Describe the Topic

bin/kafka-topics.sh --describe --topic test-topic --bootstrap-server localhost:9092

What This Does:

  • Displays details about the topic, including partition count, replication factor, and leader assignments.

Expected Output:
Topic: test-topic   PartitionCount: 3    ReplicationFactor: 1    Configs:
    Partition: 0    Leader: 0    Replicas: 0    Isr: 0
    Partition: 1    Leader: 0    Replicas: 0    Isr: 0
    Partition: 2    Leader: 0    Replicas: 0    Isr: 0

4. Exploring Kafka Topics

Create Multiple Topics
Let’s create two topics, user-signups and order-events, each tailored for specific use cases.

  1. Create user-signups:
bin/kafka-topics.sh --create --topic user-signups --partitions 2 --replication-factor 1 --bootstrap-server localhost:9092
  • user-signups captures new user registrations.
  • Two partitions enable parallel processing.

Expected Output:

Created topic user-signups.

  2. Create order-events:
bin/kafka-topics.sh --create --topic order-events --partitions 3 --replication-factor 1 --bootstrap-server localhost:9092
  • order-events records e-commerce transactions.
  • Three partitions ensure scalability for high-volume data.

Expected Output:

Created topic order-events.

List All Topics
To confirm the topics were created:

bin/kafka-topics.sh --list --bootstrap-server localhost:9092

Expected Output:

order-events
test-topic
user-signups

(test-topic from the verification section is still present.)

5. Producing Messages

Send Data to user-signups:

  1. Start a producer:
bin/kafka-console-producer.sh --topic user-signups --bootstrap-server localhost:9092

  2. Enter user signup events:

{"user_id": 1, "name": "Alice", "email": "alice@example.com"}
{"user_id": 2, "name": "Bob", "email": "bob@example.com"}

  • Producing user signups mimics a real-world registration system.

Send Data to order-events:

  1. Start another producer:

bin/kafka-console-producer.sh --topic order-events --bootstrap-server localhost:9092

  2. Enter order events:

{"order_id": 101, "user_id": 1, "total": 99.99}
{"order_id": 102, "user_id": 2, "total": 49.50}

  • Simulates e-commerce transactions.

6. Consuming Messages

Consume Data from user-signups:

  1. Start a consumer:
bin/kafka-console-consumer.sh --topic user-signups --from-beginning --bootstrap-server localhost:9092

Expected Output:

{"user_id": 1, "name": "Alice", "email": "alice@example.com"}
{"user_id": 2, "name": "Bob", "email": "bob@example.com"}

Consume Data from order-events:

  1. Start another consumer:
bin/kafka-console-consumer.sh --topic order-events --from-beginning --bootstrap-server localhost:9092

Expected Output:

{"order_id": 101, "user_id": 1, "total": 99.99}
{"order_id": 102, "user_id": 2, "total": 49.50}

7. Simulating Real-World Scenarios

Offset Management

  1. Start a consumer without --from-beginning, so it receives only messages produced after it starts:

bin/kafka-console-consumer.sh --topic order-events --bootstrap-server localhost:9092

  2. Produce an additional message to order-events (from the producer started earlier):

{"order_id": 103, "user_id": 1, "total": 75.00}

Expected Output (in the new consumer):

{"order_id": 103, "user_id": 1, "total": 75.00}
  3. Rewind to read all messages:

bin/kafka-console-consumer.sh --topic order-events --from-beginning --bootstrap-server localhost:9092

Expected Output:

{"order_id": 101, "user_id": 1, "total": 99.99}
{"order_id": 102, "user_id": 2, "total": 49.50}
{"order_id": 103, "user_id": 1, "total": 75.00}

Using Consumer Groups

  1. Start Consumer Group A for order-events:
bin/kafka-console-consumer.sh --topic order-events --group group-a --bootstrap-server localhost:9092
  • Consumer groups allow multiple consumers to process messages in parallel.

  2. Produce more messages to order-events:

{"order_id": 104, "user_id": 3, "total": 120.00}

  3. Start Consumer Group B for order-events (adding --from-beginning so the new group starts at the earliest offset):

bin/kafka-console-consumer.sh --topic order-events --group group-b --from-beginning --bootstrap-server localhost:9092

Expected Outputs:

  • Consumer Group A: Continues consuming from its last offset.
{"order_id": 104, "user_id": 3, "total": 120.00}

  • Consumer Group B: Has no committed offsets yet, so --from-beginning makes it read everything:
{"order_id": 101, "user_id": 1, "total": 99.99}
{"order_id": 102, "user_id": 2, "total": 49.50}
{"order_id": 103, "user_id": 1, "total": 75.00}
{"order_id": 104, "user_id": 3, "total": 120.00}
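
To see how each group’s position is tracked, you can inspect committed offsets with Kafka’s stock consumer-groups tool (assuming the setup above):

bin/kafka-consumer-groups.sh --describe --group group-a --bootstrap-server localhost:9092

The output lists, per partition, the group’s current offset, the log-end offset, and the lag between the two.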

Testing Replication

Note: a replication factor of 2 requires at least two brokers; on the single-broker setup above, Kafka rejects this command with an error like "Replication factor: 2 larger than available brokers: 1". Run this step against a cluster with two or more brokers.

  1. Create a topic with replication:
bin/kafka-topics.sh --create --topic replicated-topic --partitions 2 --replication-factor 2 --bootstrap-server localhost:9092
  2. Check the topic details:
bin/kafka-topics.sh --describe --topic replicated-topic --bootstrap-server localhost:9092

Expected Output:

    Topic: replicated-topic   PartitionCount: 2    ReplicationFactor: 2    Configs:
        Partition: 0    Leader: 0    Replicas: 0,1    Isr: 0,1
        Partition: 1    Leader: 1    Replicas: 1,0    Isr: 1,0

IV. Building Your First Kafka Application

Now that Kafka is set up and you’ve explored its features through the CLI, let’s build a simple producer and consumer application using a programming language. For this section, we’ll use Python with the kafka-python library, a popular choice for interacting with Kafka.

1. Setting Up the Environment

Step 1: Install Python
Ensure Python is installed on your system:

python --version

Expected Output:

Python 3.x.x

Step 2: Install kafka-python
Install the Kafka client library for Python:

pip install kafka-python

What This Does:

  • Installs kafka-python, which provides tools for producing and consuming messages in Kafka.

2. Writing a Kafka Producer

The producer sends messages to a Kafka topic.

Step 1: Create a File for the Producer

Create a new file named producer.py and paste the following code:

from kafka import KafkaProducer
import json

# Initialize Kafka Producer
producer = KafkaProducer(
    bootstrap_servers='localhost:9092',  # Kafka broker address
    value_serializer=lambda v: json.dumps(v).encode('utf-8')  # Serialize messages to JSON
)

# Topic Name
topic = 'user-signups'

# Send Messages
messages = [
    {"user_id": 1, "name": "Alice", "email": "alice@example.com"},
    {"user_id": 2, "name": "Bob", "email": "bob@example.com"}
]

for message in messages:
    producer.send(topic, value=message)
    print(f"Sent: {message}")

# Close Producer
producer.close()

Explanation:

  • KafkaProducer:
    • Connects to the Kafka broker (localhost:9092).
    • Sends JSON-encoded messages.
  • Topic:
    • Sends messages to the user-signups topic.

Step 2: Run the Producer

Run the script:
python producer.py

Expected Output:

Sent: {'user_id': 1, 'name': 'Alice', 'email': 'alice@example.com'}
Sent: {'user_id': 2, 'name': 'Bob', 'email': 'bob@example.com'}

Verify in Kafka:
Start a Kafka consumer to confirm the messages:

bin/kafka-console-consumer.sh --topic user-signups --from-beginning --bootstrap-server localhost:9092


Expected Output:

user_id": 1, "name": "Alice", "email": "alice@example.com"}
{"user_id": 2, "name": "Bob", "email": "bob@example.com"}


3. Writing a Kafka Consumer

The consumer retrieves messages from a Kafka topic.
Step 1: Create a File for the Consumer
Create a new file named consumer.py and paste the following code:

from kafka import KafkaConsumer
import json

# Initialize Kafka Consumer
consumer = KafkaConsumer(
    'user-signups',  # Topic to subscribe to
    bootstrap_servers='localhost:9092',
    auto_offset_reset='earliest',  # Start reading from the beginning
    group_id='user-signups-group',  # Consumer group
    value_deserializer=lambda x: json.loads(x.decode('utf-8'))  # Deserialize messages from JSON
)

# Consume Messages
print("Listening for messages...")
for message in consumer:
    print(f"Received: {message.value}")

Explanation:

  • KafkaConsumer:
    • Subscribes to the user-signups topic.
    • Starts reading from the earliest offset.
  • Group ID:
    • Ensures that multiple consumers can work together to process the same topic.

Step 2: Run the Consumer

Run the script:
python consumer.py

Expected Output:

Listening for messages...
Received: {'user_id': 1, 'name': 'Alice', 'email': 'alice@example.com'}
Received: {'user_id': 2, 'name': 'Bob', 'email': 'bob@example.com'}
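
By default, kafka-python commits offsets automatically in the background. For at-least-once processing, a common pattern is to disable auto-commit and commit only after your own logic has succeeded. A minimal sketch (the topic and group match the consumer above; handle_signup is a hypothetical handler):

from kafka import KafkaConsumer
import json

consumer = KafkaConsumer(
    'user-signups',
    bootstrap_servers='localhost:9092',
    group_id='user-signups-group',
    enable_auto_commit=False,  # we decide when progress is recorded
    value_deserializer=lambda x: json.loads(x.decode('utf-8'))
)

for message in consumer:
    handle_signup(message.value)  # hypothetical processing step
    consumer.commit()             # commit the offset only after success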

4. Enhancing the Application

Add a Producer to Another Topic
Expand the producer to send data to a second topic (order-events):

# Additional Topic
order_topic = 'order-events'

orders = [
    {"order_id": 101, "user_id": 1, "total": 99.99},
    {"order_id": 102, "user_id": 2, "total": 49.50}
]

for order in orders:
    producer.send(order_topic, value=order)
    print(f"Sent to {order_topic}: {order}")

Add a Second Consumer
Create a new consumer for the order-events topic:

# Initialize Kafka Consumer for another topic
order_consumer = KafkaConsumer(
    'order-events',
    bootstrap_servers='localhost:9092',
    auto_offset_reset='earliest',
    group_id='order-events-group',
    value_deserializer=lambda x: json.loads(x.decode('utf-8'))
)

# Consume Messages
print("Listening for order events...")
for order in order_consumer:
    print(f"Order Received: {order.value}")

5. Testing the Application

Step 1: Start Both Consumers

  1. Run consumer.py to listen to user-signups.
  2. Run the new consumer for order-events.

Step 2: Run the Producer

Run producer.py to send messages to both topics.

Expected Outputs:

  1. User Signups Consumer:

    Received: {'user_id': 1, 'name': 'Alice', 'email': 'alice@example.com'}
    Received: {'user_id': 2, 'name': 'Bob', 'email': 'bob@example.com'}
    
  2. Order Events Consumer:

    Order Received: {'order_id': 101, 'user_id': 1, 'total': 99.99}
    Order Received: {'order_id': 102, 'user_id': 2, 'total': 49.50}
    

V. Conclusion

Kafka has revolutionized how modern systems handle messaging and real-time data processing. By decoupling producers and consumers, it allows for scalable, fault-tolerant, and flexible architectures that are crucial for today’s distributed systems. Through this article, you’ve gained a foundational understanding of Kafka’s core components, such as topics, partitions, brokers, and consumer groups, as well as hands-on experience with producing and consuming messages.

As a highly versatile platform, Kafka can serve as the backbone for a wide variety of applications, including real-time analytics, event-driven systems, and log aggregation pipelines. While this article focused on the basics, Kafka’s ecosystem extends far beyond messaging. Tools like Kafka Streams, KSQL, and Kafka Connect empower developers to build powerful stream processing applications and integrate seamlessly with other systems.

Whether you’re building microservices, processing IoT data, or scaling enterprise systems, Kafka provides the tools and reliability needed for success. With this introduction as your foundation, you’re now ready to explore Kafka’s advanced features and unlock its full potential for solving complex, real-world challenges.
