Introduction to Amazon Managed Streaming for Apache Kafka (Amazon MSK)
Amazon MSK makes it easy to ingest and process streaming data in real time with fully managed Apache Kafka.
APACHE KAFKA AT A HIGH LEVEL
What fully managed Apache Kafka on AWS can do:
- Allow you to create, update, and delete clusters
- MSK creates & manages Kafka broker nodes & ZooKeeper nodes for you
- Deploy the MSK cluster in your VPC, multi-AZ (up to 3 for HA)
- Automatic recovery from common Apache Kafka failures
- Data is stored on EBS volumes
- You can build producers and consumers of data
- You can create custom configurations for your clusters
- Default maximum message size of 1 MB
- Larger messages (ex: 10 MB) can be sent to Kafka after custom configuration
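The large-message case above can be sketched as a small helper that renders the broker properties for a custom configuration. This is a minimal sketch, not a definitive setup: `message.max.bytes` and `replica.fetch.max.bytes` are standard Kafka broker properties, the helper name `large_message_properties` is hypothetical, and the 10 MB value is only illustrative. Check which properties MSK accepts in a custom configuration before relying on it.

```python
def large_message_properties(max_message_bytes: int) -> str:
    """Render broker properties that raise the maximum message size.

    Hypothetical helper: the output is meant to be used as the contents
    of a custom cluster configuration.
    """
    if max_message_bytes <= 0:
        raise ValueError("max_message_bytes must be positive")
    lines = [
        # Largest record batch the broker will accept from producers.
        f"message.max.bytes={max_message_bytes}",
        # Followers must be able to fetch the largest message to replicate it.
        f"replica.fetch.max.bytes={max_message_bytes}",
    ]
    return "\n".join(lines) + "\n"


TEN_MB = 10 * 1024 * 1024  # illustrative size taken from the example above
server_properties = large_message_properties(TEN_MB)
print(server_properties)
```

Producers and consumers usually need matching client-side limits as well (for example `max.request.size` on the producer and `max.partition.fetch.bytes` on the consumer), otherwise large messages can still be rejected or skipped.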
MSK – Configurations
- Choose the number of AZs (3 – recommended, or 2)
- Choose the VPC & Subnets
- The broker instance type (ex: kafka.m5.large)
- The number of brokers per AZ (can add brokers later)
- Size of your EBS volumes (1GB – 16TB)
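The choices above map fairly directly onto a cluster creation request. The sketch below builds the request as a plain dictionary shaped like the parameters of the boto3 `kafka` client's `create_cluster` call; the helper name, the subnet and security group IDs, the default volume size, and the Kafka version string are assumptions for illustration only.

```python
def build_cluster_request(name, subnets, security_groups, brokers_per_az,
                          instance_type="kafka.m5.large", volume_gb=100,
                          kafka_version="3.6.0"):
    """Build a create_cluster-style request dict (hypothetical helper).

    One subnet per AZ is assumed, so len(subnets) is the number of AZs.
    The kafka_version default is an assumption, not a recommendation.
    """
    if len(subnets) not in (2, 3):
        raise ValueError("choose 2 or 3 AZs (3 is recommended)")
    if brokers_per_az < 1:
        raise ValueError("need at least one broker per AZ")
    if not 1 <= volume_gb <= 16 * 1024:
        raise ValueError("EBS volume size must be between 1 GB and 16 TB")
    return {
        "ClusterName": name,
        "KafkaVersion": kafka_version,
        # Brokers are spread evenly, so the total is a multiple of the AZ count.
        "NumberOfBrokerNodes": brokers_per_az * len(subnets),
        "BrokerNodeGroupInfo": {
            "InstanceType": instance_type,
            "ClientSubnets": list(subnets),
            "SecurityGroups": list(security_groups),
            "StorageInfo": {"EbsStorageInfo": {"VolumeSize": volume_gb}},
        },
    }


request = build_cluster_request(
    "demo-cluster",
    subnets=["subnet-aaa", "subnet-bbb", "subnet-ccc"],  # placeholder IDs
    security_groups=["sg-123"],                          # placeholder ID
    brokers_per_az=2,
)
print(request["NumberOfBrokerNodes"])
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("kafka").create_cluster(**request)
```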
MSK – Security
Encryption:
- Optional in-flight using TLS between the brokers
- Optional in-flight with TLS between the clients and brokers
- At rest for your EBS volumes using KMS
Network Security:
- Authorize specific security groups for your Apache Kafka clients
Authentication & Authorization:
- Define who can read/write to which topics
- Mutual TLS (AuthN) + Kafka ACLs (AuthZ)
- SASL/SCRAM (AuthN) + Kafka ACLs (AuthZ)
- IAM Access Control (AuthN + AuthZ)
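With IAM Access Control, "who can read/write to which topics" is expressed as an IAM policy. The sketch below generates such a policy document for a single topic. The `kafka-cluster:*` action names and the ARN layout follow the MSK IAM access control model; the helper name, account ID, cluster name, cluster UUID, and topic name are placeholders.

```python
import json


def topic_policy(region, account_id, cluster_name, cluster_uuid, topic,
                 can_read=False, can_write=False):
    """Build an IAM policy document for one topic (hypothetical helper)."""
    cluster_arn = (f"arn:aws:kafka:{region}:{account_id}:"
                   f"cluster/{cluster_name}/{cluster_uuid}")
    topic_arn = (f"arn:aws:kafka:{region}:{account_id}:"
                 f"topic/{cluster_name}/{cluster_uuid}/{topic}")
    # Describing the topic is needed for both reading and writing.
    topic_actions = ["kafka-cluster:DescribeTopic"]
    if can_read:
        topic_actions.append("kafka-cluster:ReadData")
    if can_write:
        topic_actions.append("kafka-cluster:WriteData")
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Every client first needs permission to connect to the cluster.
            {"Effect": "Allow",
             "Action": ["kafka-cluster:Connect"],
             "Resource": cluster_arn},
            {"Effect": "Allow",
             "Action": topic_actions,
             "Resource": topic_arn},
        ],
    }


policy = topic_policy("us-east-1", "111122223333", "demo-cluster",
                      "abcd1234", "orders", can_write=True)
print(json.dumps(policy, indent=2))
```

Consumers that use consumer groups typically also need group-level permissions (such as `kafka-cluster:DescribeGroup` and `kafka-cluster:AlterGroup`) on the group resource, which this sketch leaves out.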
MSK – Monitoring
CloudWatch Metrics:
- Basic monitoring (cluster and broker metrics)
- Enhanced monitoring (++enhanced broker metrics)
- Topic-level monitoring (++enhanced topic-level metrics)
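The three CloudWatch levels above correspond to values of the `EnhancedMonitoring` setting in the MSK API. A minimal sketch of that mapping follows; the short level names used as dictionary keys and the helper name are made up for this example.

```python
# Keys are illustrative names for the levels listed above;
# values are the EnhancedMonitoring settings used by the MSK API.
MONITORING_LEVELS = {
    "basic": "DEFAULT",
    "enhanced_broker": "PER_BROKER",
    "topic_level": "PER_TOPIC_PER_BROKER",
}


def enhanced_monitoring_value(level: str) -> str:
    """Translate a level name into the API value (hypothetical helper)."""
    try:
        return MONITORING_LEVELS[level]
    except KeyError:
        raise ValueError(f"unknown monitoring level: {level!r}") from None


print(enhanced_monitoring_value("topic_level"))
```

The API also offers a finer `PER_TOPIC_PER_PARTITION` level, which is not covered by the three bullets above.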
Prometheus (Open-Source Monitoring):
- Opens a port on the broker to export cluster, broker, and topic-level metrics
- Set up the JMX Exporter (metrics) or Node Exporter (CPU and disk metrics)
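Once open monitoring is enabled, a Prometheus server needs a list of broker endpoints to scrape. The sketch below writes a file-based service discovery target list, assuming the JMX Exporter listens on port 11001 and the Node Exporter on port 11002 (the ports MSK documents for open monitoring). The broker hostnames and helper name are placeholders.

```python
import json

JMX_EXPORTER_PORT = 11001   # Kafka metrics
NODE_EXPORTER_PORT = 11002  # CPU and disk metrics


def prometheus_targets(broker_hosts):
    """Build a Prometheus file_sd target list (hypothetical helper)."""
    return [
        {"labels": {"job": "jmx"},
         "targets": [f"{host}:{JMX_EXPORTER_PORT}" for host in broker_hosts]},
        {"labels": {"job": "node"},
         "targets": [f"{host}:{NODE_EXPORTER_PORT}" for host in broker_hosts]},
    ]


# Placeholder hostnames; real ones come from the cluster's broker list.
brokers = ["b-1.example.internal", "b-2.example.internal"]
targets = prometheus_targets(brokers)
print(json.dumps(targets, indent=2))
```

The printed JSON can be saved as a targets file and referenced from a `file_sd_configs` entry in the Prometheus configuration.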
Broker Log Delivery:
- Delivery to CloudWatch Logs
- Delivery to Amazon S3
- Delivery to Kinesis Data Firehose
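Broker log delivery is configured through a `LoggingInfo` structure in the MSK API. The sketch below builds that structure for two of the destinations listed above, CloudWatch Logs and S3. The log group name, bucket name, prefix, and helper name are placeholders.

```python
def broker_log_delivery(log_group=None, bucket=None, prefix="msk-logs/"):
    """Build a LoggingInfo-style dict (hypothetical helper).

    Only destinations that are passed in are enabled.
    """
    broker_logs = {}
    if log_group:
        broker_logs["CloudWatchLogs"] = {"Enabled": True, "LogGroup": log_group}
    if bucket:
        broker_logs["S3"] = {"Enabled": True, "Bucket": bucket, "Prefix": prefix}
    return {"BrokerLogs": broker_logs}


logging_info = broker_log_delivery(log_group="/msk/demo-cluster",
                                   bucket="my-msk-logs")
print(logging_info)
# The dict could be passed as LoggingInfo=... when creating the cluster.
```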
MSK Serverless
- Run Apache Kafka on MSK without managing the capacity
- MSK automatically provisions resources and scales compute & storage
- You just define your topics and your partitions and you're good to go!
- Security: IAM Access Control for all clusters
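Because capacity is managed for you, a serverless cluster request is much smaller than a provisioned one: there are no broker counts, instance types, or volume sizes. The sketch below builds a request shaped like the boto3 `kafka` client's `create_cluster_v2` parameters, with IAM authentication enabled. The helper name, subnet IDs, and security group ID are placeholders.

```python
def build_serverless_request(name, subnets, security_groups):
    """Build a create_cluster_v2-style request dict (hypothetical helper)."""
    if not subnets:
        raise ValueError("at least one subnet is required")
    return {
        "ClusterName": name,
        "Serverless": {
            "VpcConfigs": [
                {"SubnetIds": list(subnets),
                 "SecurityGroupIds": list(security_groups)},
            ],
            # Serverless clusters authenticate clients with IAM Access Control.
            "ClientAuthentication": {"Sasl": {"Iam": {"Enabled": True}}},
        },
    }


serverless_request = build_serverless_request(
    "demo-serverless",
    subnets=["subnet-aaa", "subnet-bbb"],  # placeholder IDs
    security_groups=["sg-123"],            # placeholder ID
)
print(serverless_request)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("kafka").create_cluster_v2(**serverless_request)
```

Topics and partitions are then created with the usual Kafka admin tooling once the cluster is active.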