Introduction
Among data analysis tools, Apache Superset is an open-source project and one of the strongest options for deploying reports at scale, efficiently and free of charge. In this article, I will walk you through installing and configuring Superset and connecting it to a data source.
Superset was started by Maxime Beauchemin (the creator of Apache Airflow) as a hackathon project while he was working at Airbnb, and it joined the Apache Incubator program in 2017.
Superset's core features are similar to those of other data analysis tools, including:
- Creating and managing dashboards
- Supporting multiple database types: SQLite, PostgreSQL, MySQL, etc.
- Querying data directly with SQL
Installation and Configuration
Here, I will install Superset with Docker. The general form of the command is:
docker run -d -p {host port}:{container port} --name {container name} apache/superset
Example (Superset listens on port 8088 inside the container; here it is published on port 8080 of the host):
docker run -d -p 8080:8088 --name superset apache/superset
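The command above can be wrapped in a small helper so the port and container name are not hard-coded. This is only a sketch: `start_superset` and `generate_secret_key` are hypothetical names, not part of Superset or Docker. It also passes a `SUPERSET_SECRET_KEY` environment variable, on the assumption that you are using a recent image; to my knowledge, recent images refuse to start without one, while older images may not need it.

```shell
#!/bin/sh
# Sketch: start the Superset container with an explicit secret key.
# start_superset and generate_secret_key are hypothetical helper names.

CONTAINER_PORT=8088   # port Superset listens on inside the container

generate_secret_key() {
  # 32 random bytes rendered as 64 hexadecimal characters.
  od -An -tx1 -N32 /dev/urandom | tr -d ' \n'
}

start_superset() {
  host_port=${1:-8080}
  name=${2:-superset}
  # Reuse SUPERSET_SECRET_KEY if it is already set, otherwise generate one.
  key=${SUPERSET_SECRET_KEY:-$(generate_secret_key)}
  docker run -d \
    -p "$host_port:$CONTAINER_PORT" \
    -e "SUPERSET_SECRET_KEY=$key" \
    --name "$name" \
    apache/superset
}

# Usage: start_superset 8080 superset
```

Keep the generated key somewhere safe if you plan to recreate the container, since starting with a different key may make previously encrypted connection passwords unreadable.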
Once the container is running, create an admin account by running the following command inside it:
docker exec -it superset superset fab create-admin --username {username} --firstname {firstname} --lastname {lastname} --email {email} --password {password}
Example:
docker exec -it superset superset fab create-admin --username admin --firstname Superset --lastname Admin --email admin@superset.com --password admin
Next, run the following command to load the bundled example dashboards and datasets:
docker exec -it superset superset load_examples
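Finally, initialize Superset. This command does not start the server (it is already running in the container); it sets up the default roles and permissions: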
docker exec -it superset superset init
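The post-install steps can also be scripted so that a failure stops the sequence early. The sketch below is an assumption-laden example rather than the official procedure: `bootstrap_superset` is a hypothetical helper name, and it adds a `superset db upgrade` step (applying metadata database migrations) that the walkthrough above does not include but that, as far as I know, the image's own setup instructions run before loading examples. The `-it` flags are dropped because a script usually has no terminal attached.

```shell
#!/bin/sh
# Sketch: run the Superset setup steps in order, stopping at the first failure.
# bootstrap_superset is a hypothetical helper name.
# Run it after the admin account has been created.

bootstrap_superset() {
  name=${1:-superset}

  # Apply metadata database migrations (assumed extra step, see the note above).
  docker exec "$name" superset db upgrade || return 1

  # Load the bundled example dashboards and datasets (optional).
  docker exec "$name" superset load_examples || return 1

  # Create the default roles and permissions.
  docker exec "$name" superset init || return 1
}

# Usage: bootstrap_superset superset
```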
After that, open http://localhost:8080 and log in with the admin account you created. The example dashboards and datasets loaded earlier will already be available.
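If the page does not load right away, the server may still be starting. A minimal sketch of a readiness check follows, assuming `curl` is installed and that the host port is 8080 as in the example; `wait_for_superset` is a hypothetical helper name, while `/health` is the endpoint Superset exposes for health checks.

```shell
#!/bin/sh
# Sketch: poll Superset's /health endpoint until it answers.
# wait_for_superset is a hypothetical helper name.

wait_for_superset() {
  url=${1:-http://localhost:8080/health}
  attempts=${2:-30}   # how many times to try
  delay=${3:-2}       # seconds to wait between tries
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; the response body is discarded.
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      echo "Superset is up at $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Superset did not respond after $attempts attempts" >&2
  return 1
}

# Usage: wait_for_superset http://localhost:8080/health 30 2
```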
Connecting Data Sources
To analyze data, you first need to create a connection to a database (such as PostgreSQL or MySQL). The process is simple and works much like other database client tools. Here, I will show how to connect to PostgreSQL. If you are not familiar with Postgres, you can refer to this article to learn how to install PostgreSQL and use its basic features.
First, access the page to create a new database connection.
Next, enter the SQLAlchemy URI with the following structure:
postgresql://{username}:{password}@{host}:{port}/{database}
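Two details are easy to miss when filling in this template. First, special characters in the username or password (such as `@`, `:` or `/`) need to be percent-encoded, otherwise the URI is parsed incorrectly. Second, because Superset runs inside a container, `localhost` in the URI refers to the container itself, so the host must be an address reachable from inside it. The sketch below builds the URI from variables; `urlencode` is a hypothetical helper written for ASCII input, and all connection values are made up for illustration.

```shell
#!/bin/sh
# Sketch: build a PostgreSQL SQLAlchemy URI with percent-encoded credentials.
# urlencode is a hypothetical helper; the connection values are made up.

urlencode() (
  # The ( ) body runs in a subshell, so LC_ALL and the loop variables stay local.
  LC_ALL=C
  s=$1
  out=''
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    case $c in
      [A-Za-z0-9._~-]) out="$out$c" ;;             # unreserved: keep as-is
      *) out="$out$(printf '%%%02X' "'$c")" ;;     # anything else: %XX
    esac
    s=$rest
  done
  printf '%s' "$out"
)

PG_USER='analyst'
PG_PASSWORD='p@ss:w/rd'
PG_HOST='db.example.com'   # must be reachable from inside the container
PG_PORT=5432
PG_DB='sales'

uri="postgresql://$(urlencode "$PG_USER"):$(urlencode "$PG_PASSWORD")@$PG_HOST:$PG_PORT/$PG_DB"
printf '%s\n' "$uri"
```

The printed value is what goes into the SQLAlchemy URI field.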
Once the connection succeeds, you can use the features Apache Superset offers: building dashboards and charts (many chart types with extensive customization options), querying data, saving queries, and viewing query history.
Conclusion
Apache Superset provides relatively comprehensive tools to support data analysis and visualization. It can embed query results into other applications, connect to various data sources, and, importantly, it is open-source and completely free.
Although it may not match powerful paid tools like Tableau or Power BI in some respects, Superset is a very worthwhile tool overall because it meets most data analysis and reporting needs.
What do you think? Leave a comment below!