In this comprehensive tutorial, we will dive deep into Amazon OpenSearch Service, a managed service that makes it easy to deploy, operate, and scale OpenSearch clusters for search and analytics in the AWS Cloud. Whether you're new to OpenSearch (or coming from Elasticsearch) or want to harness its power on AWS, this guide provides a blend of theoretical knowledge and hands-on exercises to get you started.
Table of Contents
- Introduction to Amazon OpenSearch Service
- Main Features of Amazon OpenSearch
- Creating an Amazon OpenSearch Service Domain
- Visualizing and Analyzing Data with OpenSearch Dashboards
- Conclusion
1. Introduction to Amazon OpenSearch Service
Amazon OpenSearch Service is a fully managed service that simplifies the deployment, operation, and scaling of OpenSearch clusters. OpenSearch is a community-driven, open-source search and analytics suite derived from open-source Elasticsearch 7.10.2 and Kibana 7.10.2. It lets you search, analyze, and visualize data in real time, making it ideal for log and event data analysis, full-text search, and more.
2. Main Features of Amazon OpenSearch
OpenSearch Service includes the following features:
Scale
- Numerous configurations of CPU, memory, and storage capacity known as instance types, including cost-effective Graviton instances
- Up to 3 PB of attached storage
- Cost-effective UltraWarm and cold storage for read-only data
Security
- AWS Identity and Access Management (IAM) access control
- Easy integration with Amazon VPC and VPC security groups
- Encryption of data at rest and node-to-node encryption
- Amazon Cognito, HTTP basic, or SAML authentication for OpenSearch Dashboards
- Index-level, document-level, and field-level security
- Audit logs
- Dashboards multi-tenancy
Stability
- Numerous geographical locations for your resources, known as Regions and Availability Zones
- Node allocation across two or three Availability Zones in the same AWS Region, known as Multi-AZ
- Dedicated master nodes to offload cluster management tasks
- Automated snapshots to back up and restore OpenSearch Service domains
Flexibility
- SQL support for integration with business intelligence (BI) applications
- Custom packages to improve search results
Integration with popular services
- Data visualization using OpenSearch Dashboards
- Integration with Amazon CloudWatch for monitoring OpenSearch Service domain metrics and setting alarms
- Integration with AWS CloudTrail for auditing configuration API calls to OpenSearch Service domains
- Integration with Amazon S3, Amazon Kinesis, and Amazon DynamoDB for loading streaming data into OpenSearch Service
- Alerts from Amazon SNS when your data exceeds certain thresholds
3. Creating an Amazon OpenSearch Service domain
An OpenSearch Service domain, which is the same as an OpenSearch cluster, can be created using the OpenSearch Service console or by using the AWS CLI with the create-domain command. Domains are clusters with the settings, instance types, instance counts, and storage resources you specify.
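The settings that make up a domain can also be written down programmatically. The sketch below is not part of the original exercise: it assumes the boto3 SDK is installed and AWS credentials are configured, and the function names, engine version, instance type, and volume size are illustrative choices only. It builds the arguments for the create_domain call, including a master user like the one set up in the exercise below, and leaves out an access policy for brevity.

```python
import re

# Domain names are assumed to be 3-28 characters, starting with a lowercase
# letter and containing only lowercase letters, digits, and hyphens.
DOMAIN_NAME_RE = re.compile(r"[a-z][a-z0-9-]{2,27}")


def build_create_domain_params(name, master_user, master_password):
    """Return keyword arguments for the boto3 OpenSearch create_domain call."""
    if not DOMAIN_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid domain name: {name!r}")
    return {
        "DomainName": name,
        "EngineVersion": "OpenSearch_2.11",  # illustrative version
        "ClusterConfig": {
            "InstanceType": "t3.small.search",  # illustrative instance type
            "InstanceCount": 1,
        },
        "EBSOptions": {"EBSEnabled": True, "VolumeType": "gp3", "VolumeSize": 10},
        # A master user in the internal user database requires encryption
        # at rest, node-to-node encryption, and HTTPS to be enabled.
        "NodeToNodeEncryptionOptions": {"Enabled": True},
        "EncryptionAtRestOptions": {"Enabled": True},
        "DomainEndpointOptions": {"EnforceHTTPS": True},
        "AdvancedSecurityOptions": {
            "Enabled": True,
            "InternalUserDatabaseEnabled": True,
            "MasterUserOptions": {
                "MasterUserName": master_user,
                "MasterUserPassword": master_password,
            },
        },
    }


def create_domain(params):
    """Send the request to AWS. Needs boto3 and credentials; not run here."""
    import boto3  # third-party dependency, imported lazily

    client = boto3.client("opensearch")
    return client.create_domain(**params)


# Hypothetical name and credentials, used only to show the shape of the call.
params = build_create_domain_params("movies-demo", "master_username", "Example-Passw0rd!")
print(params["DomainName"])
```

Keeping the parameter building separate from the network call makes it easy to review the settings before anything is created in your account.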
Practical Exercise: Creating an Amazon OpenSearch Domain
- Log in to the Amazon OpenSearch Console, and click "Create Domain".
- Configure your domain with settings like version, instance types, and storage.
In this step, enter a name for your domain, following the suggested naming convention, and set a master username and master password for the domain. Because the data we will store in the domain is not sensitive and this is only a test exercise, choosing Public access under Network is enough.
The domain takes roughly 10 to 15 minutes to become available. Once it is ready, the next step is to upload some sample data to your OpenSearch domain.
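If you would rather not watch the console while the domain is being created, the wait can be scripted. This is only a sketch under assumptions: the function name and its parameters are made up for this example, and it assumes that the domain status reports a Processing flag and an Endpoint value once a public domain is ready, which is how the boto3 describe_domain response is shaped.

```python
import time


def wait_until_active(describe, name, poll_seconds=30, timeout_seconds=1800,
                      sleep=time.sleep):
    """Poll until the domain stops processing and exposes an endpoint.

    `describe` is any callable that takes the domain name and returns the
    domain status dictionary. Passing it in keeps this function independent
    of the AWS SDK.
    """
    waited = 0
    while True:
        status = describe(name)
        if not status.get("Processing", True) and status.get("Endpoint"):
            return status["Endpoint"]
        if waited >= timeout_seconds:
            raise TimeoutError(f"domain {name!r} not ready after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Possible use with boto3 (assumed to be installed, credentials configured):
#
#   import boto3
#   client = boto3.client("opensearch")
#   endpoint = wait_until_active(
#       lambda n: client.describe_domain(DomainName=n)["DomainStatus"],
#       "movies-demo",  # hypothetical domain name
#   )
```

The default timeout of 30 minutes is deliberately longer than the 10 to 15 minutes mentioned above, to leave some margin.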
- Uploading data to your OpenSearch Domain.
Data is loaded into OpenSearch domains as JSON documents, and you can do it from the command line using cURL or through the OpenSearch UI. As a data engineer I am very used to working with CLIs, but I ran into some issues sending requests to my newly created OpenSearch endpoint, so this time I preferred to use the UI.
To load data using the UI, log in to your domain by clicking the OpenSearch Dashboards URL shown in the Amazon OpenSearch Service Domains UI, which is simply your domain endpoint + /_dashboards. You will then be prompted to enter the master_username and master_password that you set when creating the domain.
Once there, click Dev tools.
In the Dev tools console, send a PUT request to the API to upload some data to your OpenSearch domain:
PUT movies/_doc/1
{
"director": "Burton, Tim",
"genre": ["Comedy","Sci-Fi"],
"year": 1996,
"actor": ["Jack Nicholson","Pierce Brosnan","Sarah Jessica Parker"],
"title": "Mars Attacks!"
}
Then click the Click to send request button, and your first index will be created with the name movies.
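For readers who prefer a script over the Dev tools console, the same PUT request can be sketched with the Python standard library. This is a hedged example rather than the method used in this tutorial: the endpoint URL and credentials are placeholders, the function name is made up, and it assumes the domain accepts HTTP basic authentication with the master user. The request is only built here; the line that would send it is left as a comment.

```python
import base64
import json
import urllib.request


def build_index_request(endpoint, index, doc_id, document, username, password):
    """Build (but do not send) a PUT <index>/_doc/<id> request."""
    url = f"{endpoint.rstrip('/')}/{index}/_doc/{doc_id}"
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


movie = {
    "director": "Burton, Tim",
    "genre": ["Comedy", "Sci-Fi"],
    "year": 1996,
    "actor": ["Jack Nicholson", "Pierce Brosnan", "Sarah Jessica Parker"],
    "title": "Mars Attacks!",
}

# Placeholder endpoint and credentials; replace them with your own values.
request = build_index_request(
    "https://search-movies-demo.example.com", "movies", 1, movie,
    "master_username", "master_password",
)
print(request.get_method(), request.full_url)

# To actually send it:
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```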
- Querying data from OpenSearch UI.
Once documents have been successfully indexed in your OpenSearch domain, you can check that the data is there by querying it. You can find more sample data to load into your domain in the repo that I created as a companion to this tutorial.
I also loaded more data into the endpoint by sending the following document through the API:
POST movies/_doc/3
{
"director": "Baird, Stuart",
"genre": ["Action", "Crime", "Thriller"],
"year": 1998,
"actor": ["Downey Jr., Robert", "Jones, Tommy Lee", "Snipes, Wesley", "Pantoliano, Joe", "Jacob, Ir\u00e8ne", "Nelligan, Kate", "Roebuck, Daniel", "Malahide, Patrick", "Richardson, LaTanya", "Wood, Tom", "Kosik, Thomas", "Stellate, Nick", "Minkoff, Robert", "Brown, Spitfire", "Foster, Reese", "Spielbauer, Bruce", "Mukherji, Kevin", "Cray, Ed", "Fordham, David", "Jett, Charlie"],
"title": "U.S. Marshals"
}
You can find out how to index data into OpenSearch in the official documentation.
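When there are many documents to load, sending them one by one becomes tedious. OpenSearch also offers a _bulk API that takes newline-delimited JSON: one action line followed by one document line, with a trailing newline at the end. The sketch below only builds such a body; the function name is an assumption of this example, and the two documents are shortened versions of the movies used above.

```python
import json


def build_bulk_body(index, documents):
    """Build a newline-delimited body for the _bulk API.

    `documents` maps a document ID to its source dictionary.
    """
    lines = []
    for doc_id, source in documents.items():
        # Action line: index this document under the given ID.
        lines.append(json.dumps({"index": {"_index": index, "_id": str(doc_id)}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(source))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body(
    "movies",
    {
        1: {"title": "Mars Attacks!", "year": 1996, "director": "Burton, Tim"},
        3: {"title": "U.S. Marshals", "year": 1998, "director": "Baird, Stuart"},
    },
)
print(body)
```

The resulting text would be sent with POST to the _bulk path of the domain endpoint, using the Content-Type application/x-ndjson.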
Now you can query your data by passing the API a query string that contains a term from the recently uploaded data, loosely similar to a LIKE filter in SQL:
GET movies/_search?q=U.S.&pretty=true
The response contains the documents that match.
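The same kind of search can be prepared from a script. The following sketch is an illustration only: the endpoint is a placeholder and both function names are made up. It builds the search URL and pulls the stored documents out of a response, assuming the usual search response shape in which matches are listed under hits.hits and each match carries its document in _source.

```python
from urllib.parse import urlencode


def build_search_url(endpoint, index, query):
    """Build the URL for a query-string search against one index."""
    params = urlencode({"q": query, "pretty": "true"})
    return f"{endpoint.rstrip('/')}/{index}/_search?{params}"


def extract_sources(response):
    """Return the _source of every hit in a parsed search response."""
    return [hit["_source"] for hit in response.get("hits", {}).get("hits", [])]


# Placeholder endpoint; replace it with your own domain endpoint.
url = build_search_url("https://search-movies-demo.example.com", "movies", "U.S.")
print(url)
```

Using urlencode rather than string concatenation keeps the URL valid when the search term contains spaces or other special characters.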
4. Visualizing and Analyzing Data with OpenSearch Dashboards
OpenSearch Dashboards is an open-source data visualization tool designed to work with OpenSearch Service Domains. OpenSearch Dashboards gives you data visualization tools to improve and automate business intelligence and support data-driven decision-making and strategic planning.
You can access OpenSearch Dashboards from the OpenSearch Domains UI in the AWS Console, or simply by going to {your_domain_endpoint}/_dashboards/app/home#/
Once there, you can add some sample data; there are a couple of sample data sources to choose from. After adding one, you can open its dashboard.
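The data that you are visualizing through the dashboard can also be queried in the Dev Tools console. Suppose you need to search the opensearch_dashboards_sample_data_flights index for all flights related to Warsaw. You can run:
GET opensearch_dashboards_sample_data_flights/_search?q=Warsaw&pretty=true
Because no field is named, this matches Warsaw in any field, so it can return flights that arrive in Warsaw as well as flights that depart from it. To get only flights with OriginCityName = Warsaw, qualify the term with the field name:
GET opensearch_dashboards_sample_data_flights/_search?q=OriginCityName:Warsaw&pretty=true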
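A search limited to one field can also be expressed as a request body instead of a URL parameter. The sketch below builds a match query on OriginCityName; the function name and the default size are assumptions of this example, not something the sample data requires.

```python
import json


def build_match_query(field, value, size=10):
    """Build a search body that matches `value` in a single field."""
    return {"size": size, "query": {"match": {field: value}}}


body = build_match_query("OriginCityName", "Warsaw")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into the Dev Tools console under a GET opensearch_dashboards_sample_data_flights/_search line.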
5. Conclusion
In this Amazon OpenSearch Service article, we've covered everything from fundamental concepts to practical exercises for creating a domain, uploading and querying data in it, and visualizing that data through OpenSearch Dashboards. Amazon OpenSearch Service is a powerful tool for various use cases, from log analysis to real-time data exploration.
With this knowledge, you can confidently leverage Amazon OpenSearch Service to search, analyze, and visualize your data, unlocking valuable insights for your organization. Happy searching!