Wendy Wong for AWS Heroes

Explore Sydney crime data with Amazon CodeWhisperer - Get started with Generative AI on AWS - Part 3

Exploratory data analysis is essential for data science or analytics

Let's take the heavy lifting out of data preprocessing and speed up the 'data understanding' and 'data preparation' steps, so that you can fix data anomalies and clean your data.

Now that you know a little bit more about generative AI from the last two instalments, let's get hands-on with Amazon CodeWhisperer in VS Code with Python to explore a crime dataset.

Learning Objectives

In this lesson you will use a real-world dataset to:

  • Import Python Libraries
  • Load a dataset
  • Explore the dataset
  • Use descriptive statistics to summarize the dataset
  • Check for missing data
  • Get comfortable using Amazon CodeWhisperer
  • Learn about the new generative features that integrate with Jupyter notebooks announced by AWS on 10 May 2023

Dataset

This open-source Quarterly crime recorded dataset was provided by the NSW Bureau of Crime Statistics and Research via data.gov.nsw.au.

The metadata lists 6 columns:

  • Offences - Text
  • Year 2018 - Integer
  • Year 2019 - Integer
  • Year 2020 - Integer
  • Year 2021 - Integer
  • Year 2022 - Integer

How to use Amazon CodeWhisperer as your AI coding companion?

Here are some tips:

  • In VS Code, open a new Python file
  • Start typing a few words, e.g. Import...
  • Write a comment, e.g. # Import python libraries...
  • Accept the code suggestion by pressing the Tab key, then press Enter on your keyboard

Amazon CodeWhisperer generates code from the words that you type, and it can also predict what you want to achieve by reading your comments.

Tutorial 1: Exploratory Data Analysis in Python with Amazon CodeWhisperer

As a data scientist or data analyst, you can use Amazon CodeWhisperer to lay out a logical flow for your analysis, debug your code and get code suggestions. The suggestions come from a large language model that reads the natural language in your comments to infer what you are trying to do.

Step 1: Ensure you are connected to your Amazon Builder ID in VS Code.

Step 2: Start typing a few words or a comment in Python

Step 3: Import popular Python libraries and load the dataset
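
The screenshot for this step is not reproduced here, so below is a minimal sketch of the kind of code you might end up with. The offence names, the counts and the `SAMPLE_CSV` text are hypothetical stand-ins so that the example runs without the real file; only the column names come from the metadata above.

```python
# Import Python libraries
import io

import pandas as pd

# Hypothetical stand-in for the downloaded CSV file.
# The offence names and counts are made up for illustration only.
SAMPLE_CSV = """Offences,Year 2018,Year 2019,Year 2020,Year 2021,Year 2022
Offence A,120,135,110,98,105
Offence B,45,50,48,52,47
Offence C,300,280,260,240,255
"""

# Load the dataset into a dataframe.
# With the real file you would pass its path to pd.read_csv instead.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))
print(df.columns.tolist())
```

With the real dataset saved in your working directory, you would replace `io.StringIO(SAMPLE_CSV)` with the path to your CSV file.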

Step 4: Inspect the first 5 rows of the dataframe. As you type in comments, Amazon CodeWhisperer will try to predict what you would like to achieve next and may suggest code to review the first 5 rows of your data.
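
As a rough sketch of this step, the comment and the accepted suggestion might look like the code below. The small dataframe is hypothetical and only stands in for the crime dataset.

```python
import pandas as pd

# Hypothetical data standing in for the crime dataset
df = pd.DataFrame({
    "Offences": [f"Offence {i}" for i in range(1, 8)],
    "Year 2018": [10, 20, 30, 40, 50, 60, 70],
    "Year 2022": [12, 18, 33, 41, 49, 66, 71],
})

# Inspect the first 5 rows of the dataframe
first_rows = df.head()
print(first_rows)
```

`head()` returns 5 rows by default; you may pass a number, such as `df.head(10)`, to see more.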

Step 5: You might want to understand the shape of your dataframe such as how many columns and rows are in your dataset.
In the comments, start typing the word 'shape' and Amazon CodeWhisperer will try to complete and predict what you are trying to achieve.
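
A minimal sketch of the shape check, again using a small hypothetical dataframe rather than the real dataset:

```python
import pandas as pd

# Hypothetical data standing in for the crime dataset
df = pd.DataFrame({
    "Offences": ["Offence A", "Offence B", "Offence C"],
    "Year 2021": [98, 52, 240],
    "Year 2022": [105, 47, 255],
})

# Check the shape of the dataframe: (number of rows, number of columns)
n_rows, n_cols = df.shape
print(f"{n_rows} rows, {n_cols} columns")
```

Note that `shape` is an attribute, not a method, so it is written without brackets.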

Step 6: Summarize the dataset with descriptive statistics. Just start typing the word 'describe'. Amazon CodeWhisperer uses the previous lines of code as context and will suggest code to summarize the data.
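
The summary step could look something like this sketch. The values are hypothetical, chosen so the statistics are easy to check by hand.

```python
import pandas as pd

# Hypothetical data standing in for the crime dataset
df = pd.DataFrame({
    "Offences": ["Offence A", "Offence B", "Offence C"],
    "Year 2018": [10, 20, 30],
    "Year 2022": [15, 25, 35],
})

# Summarize the numeric columns: count, mean, std, min, quartiles and max
summary = df.describe()
print(summary)
```

By default `describe()` only summarizes the numeric columns of a mixed dataframe, so the text column 'Offences' is left out of the table.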

Step 7: To view overall information about the data, start typing the word 'info'.
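
Here is a small sketch of the 'info' step on hypothetical data. Calling `df.info()` on its own prints straight to the screen; the `buf` argument is used here only so the text can be stored in a variable.

```python
import io

import pandas as pd

# Hypothetical data standing in for the crime dataset
df = pd.DataFrame({
    "Offences": ["Offence A", "Offence B", "Offence C"],
    "Year 2021": [98, 52, 240],
    "Year 2022": [105, 47, 255],
})

# View overall information: index range, column names, non-null counts, dtypes
buffer = io.StringIO()
df.info(buf=buffer)
info_text = buffer.getvalue()
print(info_text)
```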

Step 8: You might want to check for any missing values. You may start typing the word 'missing' in the comments and Amazon CodeWhisperer will try to generate the code you need.

On the next line of code, Amazon CodeWhisperer will try to guess your next step and may suggest checking for duplicate values. You can press Tab and then Enter to accept the code suggestion, or you can reject the recommendation.
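
The two checks might be sketched as below. The dataframe is hypothetical and deliberately contains one missing value and one repeated row so that both checks have something to find.

```python
import pandas as pd

# Hypothetical data with one missing value and one repeated row
df = pd.DataFrame({
    "Offences": ["Offence A", "Offence B", "Offence B", "Offence C"],
    "Year 2021": [98, 52, 52, None],
    "Year 2022": [105, 47, 47, 255],
})

# Check for missing values in each column
missing_per_column = df.isnull().sum()
print(missing_per_column)

# Check for duplicate rows (a row counts once for each repeat after the first)
duplicate_count = df.duplicated().sum()
print(duplicate_count)
```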

Step 9: You may also want to understand and visualize the data with histograms for different variables. If you start typing the word 'hist' in a comment, Amazon CodeWhisperer will quickly predict what you might be thinking and provide a code suggestion. It's very smart! The underlying large language model draws on your code comments and what you are typing.
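
A histogram step could be sketched as follows, assuming matplotlib is installed alongside pandas. The data is hypothetical, and the `Agg` backend line is only there so the sketch runs without a display; in a notebook you would normally leave it out.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for the crime dataset
df = pd.DataFrame({
    "Offences": ["Offence A", "Offence B", "Offence C", "Offence D"],
    "Year 2021": [98, 52, 240, 130],
    "Year 2022": [105, 47, 255, 120],
})

# Plot a histogram for each numeric column
axes = df.hist(bins=5, figsize=(8, 4))
plt.tight_layout()
```

In a script you may add `plt.show()` at the end to open the figure window.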

Step 10: As you tab to the next line of code, Amazon CodeWhisperer is already trying to guess that you want to write the code for building a box plot.

Step 11: On a new line, you may type in the Python comment 'draw a bar chart'. Amazon CodeWhisperer will quickly suggest to you the correct code that you need.
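
Steps 10 and 11 could be sketched together as below, assuming matplotlib is installed. The data and the axis label are hypothetical, and the two charts are placed side by side only to keep the example short.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for the crime dataset
df = pd.DataFrame({
    "Offences": ["Offence A", "Offence B", "Offence C"],
    "Year 2021": [98, 52, 240],
    "Year 2022": [105, 47, 255],
})

fig, (ax_box, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# Draw a box plot of the yearly counts
df.boxplot(column=["Year 2021", "Year 2022"], ax=ax_box)
ax_box.set_title("Box plot of yearly counts")

# Draw a bar chart of one year by offence
df.plot.bar(x="Offences", y="Year 2022", ax=ax_bar, legend=False)
ax_bar.set_ylabel("Recorded offences")

fig.tight_layout()
```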

Writing out the workflow and thought process for this quick exploratory analysis took 10 minutes.

Tutorial 2: Exploratory data analysis of NSW Crime for 2018 to 2022 in Jupyter Notebook using Amazon CodeWhisperer

(Note: Please see the end of this blog for the new generative AI features for Jupyter that AWS announced on the AWS Machine Learning Blog on 10 May 2023.)

Step 1: Open an instance of Jupyter Notebook in Python 3 and ensure you have saved the dataset in your directory.
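
If you would like to confirm that the dataset is in place before loading it, a small check like this sketch may help. The file name `nsw_crime_2018_2022.csv` and the helper `load_dataset` are hypothetical; use whatever name you saved the file under.

```python
from pathlib import Path

import pandas as pd

# Hypothetical file name; replace it with the name of your saved dataset
DATA_FILE = Path("nsw_crime_2018_2022.csv")


def load_dataset(path):
    """Return the CSV as a dataframe, or None if the file is not in place."""
    if not path.exists():
        print(f"{path} not found in {Path.cwd()}")
        return None
    return pd.read_csv(path)


df = load_dataset(DATA_FILE)
```

If the message prints, move the CSV into the folder shown or pass the full path instead.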

Step 2: You may quickly apply the Amazon CodeWhisperer code suggestions into the Jupyter Notebook so that you can see the input code and the output.

Step 3: Run the Jupyter Notebook for each line of code so you may inspect the output.

Step 4: Check the descriptive statistics of your dataframe

Step 5: Check the dimensions of your dataframe with shape. There are 62 records and 6 columns or features.

Step 6: Check for any missing values in the dataframe. There are no missing values in the dataframe.

Step 7: Check details of the dataframe, including the data types and whether there are any empty or null values. All the data types are objects within the dataframe.
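
When year columns show up as objects, one possible cause is that the counts were read in as text, for example because of thousands separators. That is an assumption for illustration, not something confirmed for this dataset. Under that assumption, a conversion to numbers might be sketched like this on hypothetical data:

```python
import pandas as pd

# Hypothetical data: counts stored as text, some with thousands separators
df = pd.DataFrame({
    "Offences": ["Offence A", "Offence B"],
    "Year 2021": ["1,098", "52"],
    "Year 2022": ["1,105", "47"],
})
print(df.dtypes)  # the year columns are text at this point

# Convert each year column to numbers; values that cannot be parsed become NaN
year_columns = [col for col in df.columns if col.startswith("Year")]
for col in year_columns:
    cleaned = df[col].str.replace(",", "", regex=False)
    df[col] = pd.to_numeric(cleaned, errors="coerce")

print(df.dtypes)  # the year columns are now numeric
```

Using `errors="coerce"` means unexpected text turns into missing values, so it is worth re-running the missing-value check afterwards.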

Step 8: Check for any duplicate values. There were no duplicate records in the dataframe.

Conclusion

In 15 minutes, you have learnt how to use Amazon CodeWhisperer as your ML-powered coding companion by accepting or rejecting the code it predicts in real time from your words and comments. You have also completed an exploratory analysis in a Jupyter notebook in 10 minutes, working through the different stages of data exploration for either data analytics or data science.

Resources

News Flash - Hot off the press on 10 May 2023! 🌍

I am pleased to share a news flash from AWS, hot off the press. Please read the latest announcement of new Jupyter contributions by AWS to democratize generative AI and scale ML workloads.

New features for generative AI include:

  • Introducing two generative AI extensions for Jupyter
  • Jupyter AI, an open-source project to bring generative AI to Jupyter notebooks
  • Amazon CodeWhisperer Jupyter extension to build, train and deploy ML models
  • Notebooks scheduling
  • SageMaker open-source distribution

Please read the latest announcements authored by Brian Granger on the AWS Machine Learning Blog announced today 10 May 2023.

This Month - AWS She Builds Global Mentorship application closing 31 May

AWS She Builds Application

Apply to participate in the free AWS She Builds Mentorship Program - APJ, EMEA or US. Learn cloud computing ☁️

If you aspire to be a woman in tech or are interested in learning more about cloud computing, you have until 31 May 2023 to submit your application. Early bird gets the worm! You may apply at this link

Next Month - AWS re:Inforce 2023 on June 13-14

Are you ready for AWS re:Inforce this year? Join the AWS Community in person for two days of learning and networking with your peers as you further your knowledge of security, innovation and the latest announcements from the keynotes.

I hope you will be able to join AWS CISO CJ Moses for his keynote and the AWS Security community next month. You may register at this link. There are leadership sessions, partner sessions and expert sessions at the 300 and 400 levels, which are focused-learning and interactive sessions.

You may watch the keynote from AWS re:Inforce 2022 and also follow the hashtag #AWSreInforce on Twitter

Until the next lesson, happy learning! 😀

Reinventing your Customer's Business with Generative AI on AWS

Adam Selipsky on LinkedIn: Reinventing Your Customers’ Business with Generative AI on AWS | Amazon…

As #AWS continues to drive innovation in generative AI, we’re working with our partners to jointly help customers unlock the potential of these exciting new…

Hello World |AI Coding Companion

Dr. Werner Vogels, Amazon VP & CTO, sits down with CodeWhisperer GM Doug Seven and Sr. Principal Engineer Sandeep Pokkunuri to discuss large language models.
