An easy way to deploy Apache Druid on EC2 and load data from any source
Introduction
Real-time analytics now plays a large role in the technology sector and has become a mark of competitiveness, because the volume of data keeps growing and so does the variety of tools for handling it. In this article, I want to show you one of those tools, Apache Druid, and how you can deploy it on an EC2 instance quickly and easily.
Apache Druid
Druid is a high-performance real-time analytics database. Druid’s main value add is to reduce time to insight and action.
Druid is designed for workflows where fast queries and ingest really matter. Druid excels at powering UIs, running operational (ad-hoc) queries, or handling high concurrency. Consider Druid as an open-source alternative to data warehouses for a variety of use cases. The design documentation explains the key concepts.
Deployment steps:
- Go to the AWS EC2 console
- Create a new EC2 instance
- Install Apache Druid
- Run Druid and open it in your browser
Here we go!
Before launching an EC2 instance, keep in mind the Quickstart documentation, which asks for a virtual server with 16 GiB of RAM. For this reason, we are going to choose a t2.xlarge with 4 vCPUs and 16 GiB of RAM.
Create a new EC2 instance
We are ready to create an EC2 instance with the following settings:
- OS 👉 Ubuntu 22.04
- Instance Type 👉 t2.xlarge
- Create a Security Group with the Inbound rules indicated in the image
- Launch instance
Install Apache Druid
Now, connect over SSH to the instance you just created and configure it with these steps:
1) sudo apt update -y
2) sudo apt install openjdk-8-jdk -y
3) wget https://dlcdn.apache.org/druid/29.0.1/apache-druid-29.0.1-bin.tar.gz (latest version at the time of writing)
4) tar -xzf apache-druid-29.0.1-bin.tar.gz
5) cd apache-druid-29.0.1
6) export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
7) export DRUID_HOME=/home/ubuntu/apache-druid-29.0.1
8) PATH=$JAVA_HOME/bin:$DRUID_HOME/bin:$PATH
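The variables set in steps 6 to 8 only last for the current SSH session. As a minimal sketch, assuming Bash is the login shell and the paths are the same as above, the lines can be appended to a profile file so that new sessions pick them up. The helper names `persist_line` and `persist_druid_env` are hypothetical, chosen only for this example.

```shell
# Hypothetical helper: append a line to a file only if it is not already there.
persist_line() {
  touch "$2"
  # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
  grep -qxF -e "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Hypothetical helper: store the three settings from steps 6 to 8 in the given file.
# Single quotes keep $JAVA_HOME, $DRUID_HOME and $PATH unexpanded until the file is read.
persist_druid_env() {
  persist_line 'export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64' "$1"
  persist_line 'export DRUID_HOME=/home/ubuntu/apache-druid-29.0.1' "$1"
  persist_line 'export PATH=$JAVA_HOME/bin:$DRUID_HOME/bin:$PATH' "$1"
}
```

Calling `persist_druid_env "$HOME/.bashrc"` once is enough. Running it again does not duplicate the lines, because each one is written only when it is missing.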
Run Apache Druid
Finally, we can run Apache Druid from the EC2 instance with the command:
./bin/start-micro-quickstart
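The start script keeps running in the foreground, and the services need a little time to come up. As a rough sketch, the check below can be run from a second SSH session. It polls the `/status/health` endpoint on port 8888 until it answers. The helper name `wait_for_druid`, the default of 30 tries and the 2-second pause are assumptions made for this example.

```shell
# Hypothetical helper: poll a URL until it answers, or give up after N tries.
# $1: URL to poll (default: health endpoint on the local port 8888)
# $2: maximum number of tries (default: 30)
wait_for_druid() {
  url="${1:-http://localhost:8888/status/health}"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl fail on HTTP errors, -sS hides progress but keeps errors
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "Druid is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "Druid did not answer at $url after $tries tries" >&2
  return 1
}
```

Running `wait_for_druid` with no arguments checks the local instance. It returns a non-zero status if nothing answers in time, which makes it easy to reuse in a script.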
Apache Druid in action 🚀
Now you can open your browser and see the web console at the URL 👉 http://<AWS Public IPv4 address>:8888
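If you do not have the public address at hand, it can also be read from the EC2 instance metadata service. The sketch below assumes that it runs on the instance itself, that the instance has a public IPv4 address, and that IMDSv2 session tokens are in use. The helper name `druid_console_url` is hypothetical.

```shell
# Hypothetical helper: print the web console URL of the instance it runs on.
druid_console_url() {
  imds="http://169.254.169.254/latest"
  # IMDSv2: request a short-lived session token first
  token=$(curl -fsS -X PUT "$imds/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 1
  # Use the token to read the public IPv4 address of this instance
  ip=$(curl -fsS -H "X-aws-ec2-metadata-token: $token" \
    "$imds/meta-data/public-ipv4") || return 1
  echo "http://$ip:8888"
}
```

Calling `druid_console_url` on the instance prints the address to paste into the browser. Port 8888 must still be allowed by the Security Group inbound rules.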
Summary
As you can see, deploying Apache Druid on an EC2 instance is easy. It is also one of the best ways to analyze data in real time from Kafka topics with simple SQL queries, and it is free because it is open source.
Thank you for reading this far. If you found this article useful, like and share it. Someone else could find it useful too, and why not invite me for a coffee?