Alexander Bolaño for AWS Community Builders


How to deploy Apache Druid on AWS EC2 Instance

An easy way to deploy Apache Druid on EC2 so you can load data from any source.

Introduction

Real-time analytics now plays a large role in the technology sector and has become a mark of competitiveness, because data volumes keep growing rapidly and so does the variety of tools for handling them. For this reason, I want to show you one of those tools, Apache Druid, and how you can deploy it on an EC2 instance quickly and easily.


Apache Druid

Druid is a high-performance real-time analytics database. Druid’s main value add is to reduce time to insight and action.

Druid is designed for workflows where fast queries and ingest really matter. Druid excels at powering UIs, running operational (ad-hoc) queries, or handling high concurrency. Consider Druid as an open-source alternative to data warehouses for a variety of use cases. The design documentation explains the key concepts.


Step by step for deploying:

  1. Go to the AWS EC2 console
  2. Create a new EC2 instance
  3. Install Apache Druid
  4. Run and open Druid in your browser

Here we go!

Before launching an EC2 instance, keep in mind the Quickstart documentation, which tells us to plan for a virtual server with 16 GiB of RAM. For this reason, we are going to choose a t2.xlarge with 4 vCPUs and 16 GiB of RAM.

Create a new EC2 instance

We are ready to create an EC2 instance with the following settings:

  • OS 👉 Ubuntu 22.04
  • Instance Type 👉 t2.xlarge
  • Create a Security Group with the Inbound rules indicated in the image
  • Launch instance


Choose OS & Instance Type

Inbound Rules
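If you prefer the command line to the console, the settings above can be sketched with the AWS CLI. This is only a sketch under assumptions: the AMI ID, key pair name, security group name, and source CIDR are placeholders that are not part of this article, the account is assumed to have a default VPC, and the two ports (22 for SSH, 8888 for the Druid web console) are inferred from the later steps, so your inbound rules may need more entries. By default the script only prints the commands.

```shell
# Sketch only: prints the AWS CLI commands unless DRY_RUN=0 is set.
# AMI_ID, KEY_NAME, SG_NAME and MY_IP_CIDR are placeholders (assumptions).
AMI_ID="${AMI_ID:-ami-xxxxxxxxxxxxxxxxx}"    # an Ubuntu 22.04 AMI for your region
KEY_NAME="${KEY_NAME:-my-key-pair}"          # an existing EC2 key pair
SG_NAME="${SG_NAME:-druid-sg}"               # name for the new security group
MY_IP_CIDR="${MY_IP_CIDR:-203.0.113.10/32}"  # address range allowed to connect
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode (quoting is flattened), otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

launch_druid_instance() {
  # Security group in the default VPC
  run aws ec2 create-security-group --group-name "$SG_NAME" --description "Apache Druid quickstart"
  # Inbound rules: SSH (22) and the Druid web console (8888)
  for port in 22 8888; do
    run aws ec2 authorize-security-group-ingress --group-name "$SG_NAME" --protocol tcp --port "$port" --cidr "$MY_IP_CIDR"
  done
  # The instance type chosen in this article
  run aws ec2 run-instances --image-id "$AMI_ID" --instance-type t2.xlarge --key-name "$KEY_NAME" --security-groups "$SG_NAME"
}

launch_druid_instance
```

Review the printed commands first, then run the script again with DRY_RUN=0 and real values once they look right.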

Install Apache Druid

Now connect over SSH to the instance you just created and configure it with these steps:

# Refresh the package lists and install Java 8
sudo apt update -y
sudo apt install openjdk-8-jdk -y
# Download and unpack Druid 29.0.1 (the latest version at the time of writing)
wget https://dlcdn.apache.org/druid/29.0.1/apache-druid-29.0.1-bin.tar.gz
tar -xzf apache-druid-29.0.1-bin.tar.gz
cd apache-druid-29.0.1
# Point the current shell session at Java and Druid
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export DRUID_HOME=/home/ubuntu/apache-druid-29.0.1
export PATH=$JAVA_HOME/bin:$DRUID_HOME/bin:$PATH
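Keep in mind that variables set in the shell only last for the current SSH session. One possible way to keep them across logins is to append them to your shell profile. The helper names below (append_once, persist_druid_env) are hypothetical, and the paths are the ones used in this article, so adjust them if your layout differs.

```shell
# Append a line to a file only if it is not already there,
# so running the helper twice does not duplicate entries.
append_once() {
  line="$1"
  file="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Write the three settings from the steps above into the given profile file.
persist_druid_env() {
  profile="$1"
  append_once 'export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64' "$profile"
  append_once 'export DRUID_HOME=/home/ubuntu/apache-druid-29.0.1' "$profile"
  # Single quotes keep the variables unexpanded until the profile is loaded
  append_once 'export PATH=$JAVA_HOME/bin:$DRUID_HOME/bin:$PATH' "$profile"
}
```

For example, persist_druid_env "$HOME/.bashrc" followed by a new login would load the settings automatically.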


Run Apache Druid

Finally, from the apache-druid-29.0.1 directory on the EC2 instance, we can start Apache Druid with the command:

./bin/start-micro-quickstart

Run Druid on EC2 Instances
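Druid needs a little time to start all of its services. As a rough sketch, you could poll the health endpoint from a second SSH session until it answers. The function name wait_for_druid and its default values are my own assumptions; the sketch relies on curl being installed and on the router listening on port 8888, the same port used for the web console.

```shell
# Poll a Druid health endpoint until it answers or the attempts run out.
# Arguments: base URL, number of attempts (default 30), seconds between attempts (default 5).
wait_for_druid() {
  base_url="$1"
  attempts="${2:-30}"
  delay="${3:-5}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; --max-time avoids hanging on a dead port
    if curl -fsS --max-time 2 "$base_url/status/health" >/dev/null 2>&1; then
      echo "Druid is up at $base_url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "Druid did not answer at $base_url" >&2
  return 1
}
```

On the instance itself, wait_for_druid http://localhost:8888 would be the typical call.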

Apache Druid in action 🚀

Now you can open your browser and see the web console at the URL 👉 http://<AWS Public IPv4 address>:8888
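If you do not have the public address at hand, one option is to ask the EC2 instance metadata service from inside the instance. The sketch below assumes IMDSv2 is enabled and that the instance has a public IPv4 address; the function names are hypothetical.

```shell
# Build the web console URL from an IPv4 address.
console_url() {
  printf 'http://%s:8888\n' "$1"
}

# Ask the instance metadata service (IMDSv2) for the public IPv4 address.
# This only works when run on the EC2 instance itself.
public_ipv4() {
  token="$(curl -fsS --max-time 2 -X PUT "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60")" || return 1
  curl -fsS --max-time 2 -H "X-aws-ec2-metadata-token: $token" \
    "http://169.254.169.254/latest/meta-data/public-ipv4"
}
```

On the instance, console_url "$(public_ipv4)" prints the address to paste into your browser.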



Summary

As you can see, deploying Apache Druid on an EC2 instance is easy. Druid is also a good option for analyzing data in real time, for example from Kafka topics, with simple SQL queries, and because it is open source there is no license cost.

Thank you for reading this far. If you found this article useful, like and share it. Someone else could find it useful too, and why not invite me for a coffee.
