Ervin Szilagyi for AWS Community Builders

Posted on Sep 29 • Originally published at ervinszilagyi.dev

About AWS AI Practitioner (Beta) Exam

#aws #certification #genai #career

In June this year, AWS announced two new certification exams: the AI Practitioner exam and the Machine Learning Engineer Associate exams. At the same time, AWS has the Machine Learning Specialty (MLS-C01) certification. You can still take this exam, and at the point of writing this article, there is no news of being retired. However, we can assume that the Machine Learning Engineer Associate will take its place.

On the flip side, the AI Practitioner certification is an entirely new offering, marking the second foundational level certification alongside the AWS Cloud Practitioner exam. Foundational level certifications are considered to be entry level certifications. They can be attempted by people who do not necessarily have an in-depth technical knowledge of cloud concepts.

From pure curiosity, I decided to attempt this certification. In my day-to-day job, I had the chance to work with AWS technologies and with LLM models lately, so I thought it would be an interesting challenge.

Exam Overview

The exam is intended not only for IT professionals, but also for people working as business analysts, sales and marketing professionals, and IT managers. It relies mainly on theoretical knowledge, it does not require as much hands-on experience as an associate or professional-level exam. This does not mean that it does not have its challenges.

The exam consists of 85 questions which should be answered in 120 minutes (though I had 130 minutes through Pearson's online testing). You can opt for accommodation of an extra 30 minutes, in case English is not your mother tongue. I did not opt for that, but I recommend considering it if you are in a similar situation as myself.

Aside from the multiple choice, multiple selection types of questions, AWS recently introduced new question types: ordering, matching, and case studies. At this point, I believe, AWS is A/B testing these new types of questions, since I did not encounter any of them during my session.

What You Need to Know for the Exam?

Considering that this is my 8th AWS certification, I can affirm that the exam was more challenging than expected. Regardless of your level of AWS cloud and AI/machine learning knowledge, I strongly suggest taking the time to do some meaningful preparation. To aid with that, I will present my learning plan and what I think you should know to take the exam successfully. However, I suggest considering a fully-fledged course.

Warning

Bear in mind that this is not a comprehensive guide. If you are seriously considering attempting this certification, you may want to enroll in training provided by AWS or by a third-party trainer. See the next section for my recommendations for courses and learning material.

My training guide does not specifically respect the order of the domains enumerated in the official exam guide, which I strongly suggest that you read. That being noted, these are the topics I learned about during my preparation:

1. Cloud Computing Basics

For the exam, you need to have a clear understanding of some basic concepts related to cloud computing, such as:

Cloud vs on-prem: what are the advantages, what are the drawbacks of both
Public vs private cloud
Pricing of cloud services: one important keyword that you will encounter is on-demand. Everything in the cloud is pay-per-use, you only pay for what you use. Whenever you see a question talking about pricing, as a rule of thumb, you can default to the answer that mentions the on-demand keyword. Of course, there might be exceptions, so use your best judgment.

2. Basic Machine Learning Concepts

For the exam, you will need to have surface-level knowledge of machine learning concepts and algorithms. In-depth understanding of these topics is not required, but you should be able to recognize them and understand their applications. The key concepts are as follows:

AI/Machine learning: What is AI and what is machine learning? What use cases are solved by AI systems?
AI application components: Data layer, ML algorithms, model layer, application layer.
Neural Networks: What are neural networks? Also, you should know about their components: neurons, layers, activation functions, and loss functions.
You should understand what backpropagation is: the process of updating the weights in the network to minimize the loss function
Deep Learning: Know what deep learning is and what it is used for. Remember the keyword convolution, as it refers to a type of deep learning network (convolutional networks), primarily used for computer vision and image recognition
GenAI: What is generative AI and what is it used for?
Transformer Models/Diffusion Models/Multi-Modal Models: These are types of GenAI models. There is a high likelihood that you will encounter questions regarding the properties of these models, so it's recommended to have a basic understanding of them. You likely won't need to go into detail about their inner workings.
Supervised vs. Unsupervised learning: Understand the difference between these two approaches and know when to use one over the other.
Reinforcement Learning (RL): How does it work, and what is it used for? Be familiar with some key use cases.
Machine learning model training and evaluation:
- Model fit: The exam covers some more technical topics, such as underfitting and overfitting. These concepts are essential when training a model. A model is considered to underfit if it does not perform well on the training data. Conversely, it is overfitting if it performs well on the training data but poorly on evaluation or real-world data. If neither of these applies to your model, you can assume you have a balanced model.
- Bias and variance: Both of these refer to errors introduced by your model. A model is biased when it consistently makes the same error in its predictions, often due to erroneous assumptions during training. Variance, on the other hand, reflects a model's sensitivity to small fluctuations in the input data. For the exam, you should be able to identify whether a model is highly biased or has high variance. Additionally, you should understand how bias and variance relate to model fit. For example, a model with high variance will likely overfit, while a model with high bias will probably underfit.
- Model evaluation metrics: it is expected that you are familiar with concepts of how to evaluate a model. It is not required to know the math behind those concepts, but it expects you to know when you should use one metric versus another. These metrics are the following:
  - Metrics used for classification models:
    - Confusion Matrix: used to evaluate the performance of classification models. It is usually structured as a square matrix, where rows represent the actual classes, and columns represent the predicted classes.
    - Accuracy: measures the fraction of correct prediction.
    - Recall (or Sensitivity): measures the true positive rates of the predictions.
    - Precision: measures the correct positive rate.
    - Specificity: measures the true negative rate.
    - F1 score: combines precision and recall into a final score. Use it when both precision and recall are considered important for evaluating your model.
  - Metrics used for regression models: similar to the metrics from the classification models, you don't need to know the formulas and mathematics behind these metrics. What you should know for the exam is to recognize them and know if they can be used with a presented model or not:
    - MAE (Mean Absolute Error): measures the average magnitude of errors between the predicted values and the actual values.
    - MAPE (Mean Absolute Percentage Error): used to assess the accuracy of a predictive model by calculating the average percentage error between the predicted values and the actual values. MAPE expresses the error as a percentage.
    - RMSE (Root Mean Squared Error): measures the average magnitude of the error between the predicted values and the actual values, with a higher emphasis on larger errors.
    - R Squared: explains variance in our model. R2 close to 1 means predictions are good.
  - Metrics used to evaluate LLMs - the exam might ask you about evaluating the performance of a large language model. In this case, you would want to know about these:
    - Perplexity loss: measures how well the model can predict the next word in a sequence of text
    - Recall-Oriented Understudy for Gisting Evaluation (ROUGE): a set of metrics used in the field of natural language processing to evaluate the quality of machine-generated text.

3. Generative AI and AWS Services for GenAI

The exam expects you to be familiar with GenAI models. While it does not require an understanding of their inner workings, you should have experience using them. For example, you might be asked whether, for a certain problem, you would prefer to use a GenAI-based model or a more "legacy" ML model.

Probably the most important AWS service for this exam is AWS Bedrock. You can expect around 15 to 20 questions involving Bedrock in one way or another. Bedrock is essentially a one-stop shop for GenAI models. It provides access to a variety of Foundation Models, such as Amazon Titan, several Anthropic models (Claude), and models from AI21 Labs, Cohere, Stability AI, Meta (LLaMA), Mistral, and others. In addition to allowing you to build on these models, Bedrock offers several other features. The following are relevant for the exam:

Knowledge Bases: A solution for provisioning and using vector databases. You may want to use vector databases if you are building a system based on RAG (Retrieval-Augmented Generation)
Agents: BBedrock's method for function calling. You can integrate your model with an "action," allowing the model to perform tasks in addition to responding to messages, such as executing a Lambda function.
Guardrails: You can use Guardrails to limit the model's responses in terms of what it should be able to answer. Additionally, Guardrails offer features such as detecting personally identifiable information (PII) and removing it from the response.
Model Evaluation: You can evaluate a model using AWS-provided metrics or by employing human reviewers.

Aside from these, there are other features of Bedrock the exam might ask about (Bedrock Studio, Watermark Detection). I strongly recommend doing some hands-on practice with Bedrock and experiencing what it has to offer.

Another GenAI-based AWS service is Amazon Q, which is a fully managed GenAI assistant for enterprise usage (whatever that means). It combines a lot of stuff into one service. It has a few flavors:

Amazon Q Business: it is a question-answer-based chatbot built on Amazon Bedrock. It can ingest your company-owned internal data, and it will be able to answer your questions based on that data.
Amazon Q Developer: it servers two completely different use cases:
- It is a chatbot built on AWS documentation, so it can answer your questions related to AWS services.
- It is a code companion (previously known as CodeWhisperer) similar to GitHub Copilot. It can generate source code. It can integrate with a bunch of IDEs and it can help you write code.

I personally would not worry too much about Amazon Q for the exam. In my experience, there aren’t many questions about this service. On the other hand, it’s important to ensure you don’t confuse Amazon Q with Kendra, another service built for similar purposes. The exam may present them side by side, but you should be able to determine which one is more appropriate for your scenario.

4. Prompt Engineering

Before embarking on my learning journey for the AI Practitioner certification, I considered prompt engineering to be a pseudo-science. My rule of thumb was (and still is) that if you want better answers from a model, provide it with as much information as possible. During my preparation and while building AI chatbots at my workplace, I learned that there are useful prompting techniques that can yield significantly better answers compared to what I was accustomed to before.

For the AI practitioner certification you should be aware of the following prompting techniques:

Zero-Shot Prompting: This is a more "scientific" definition of what I was doing before adopting any prompt engineering techniques. You simply present a query to the model without specific wording, formatting, or examples of expected output. The model's output may vary. It could be useful or it might be complete garbage.
Few-Shots Prompting: In this approach, you provide a prompt with examples of what you would expect as output from the model. Surprisingly, this technique works better than I had imagined. In terms of the exam, you should choose this technique when asked for a low-cost solution with the highest precision.
Chain of Thought Prompting: This involves dividing your queries into a sequence of reasoning steps, using phrases like "Think step by step." Interestingly, the GPT-3 model utilizes chain of thought prompting under the hood, which has contributed to its recent popularity. Therefore, expect some questions about it.
Retrieval-Augmented Generation (RAG): RAG is also considered a prompting technique. It relies on using a vector database to fetch content related to the user query. This content is then injected into the user prompt as additional context.

Related to the prompt engineering, the exam might ask you about hyperparameters you can set for a model to optimize its responses. Parameters you should be aware of are the following:

Temperature: value between 0 and 1.0, defines the creativity of a model. The higher the value, the more creative responses you will get.
Top P: value between 0 and 1.0, defines the size of the set of available words when generating the parts of an answer. For example, for a value of 0.25, the model will use the 25% most likely next word.
Top K: similar to Top P, but it is an integer value. By setting a Top K value, we tell our model that it only should use words from the next Top K available options.

The general rule for the hyperparameters is that setting lower values will make your model more conservative and give more coherent responses while setting a parameter to a higher value will result in more creative and less coherent responses.

4. Amazon SageMaker

Another important AWS service for the exam is Amazon SageMaker. SageMaker is a managed service used by data and machine learning engineers to build, train, and deploy machine learning models. Like other AWS services, it offers a variety of features. I will focus on those that may appear in the exam, although I encountered some unexpectedly challenging questions during my session. These questions felt as if they were taken from the Machine Learning Specialty exam question set.

One of the most important offerings of Sagemaker is SageMaker Studio. At first glance, this looks like a managed Jupyter Notebook, where a machine learning engineer can write Python code. It is way more than that. Part of SageMaker Studio is Data Wrangler, used for feature engineering and data preparation before training. From Data Wrangler we can publish data into SageMaker Feature Store.

Part of SageMaker Studio is SageMaker Clarify. It is used to evaluate foundation models against AWS-provided metrics or metrics provided by you. It even lets you leverage human intervention (have your employee evaluate the model, or use Ground Truth). SageMaker Clarify has a specific feature you should be aware of for the exam, this is Model Explainability. This is used to explain why you get certain outputs from a model and what kind of feature influenced the output.

SageMaker Ground Truth is another sub-service the exam expects you to know about. It is based on Reinforcement Learning from Human Feedback (RLHF), whenever you see this keyword, think of Ground Truth and vice-versa. Ground Truth is used to review models, do customizations, and do evaluations based on human feedback.

In terms of ML Governance, you should be aware of SageMaker Model Cards, SageMaker Model Dashboards, and SageMaker Role Manager. Model Cards lets you create cards with essential information about a model. Model Dashboard is a centralized repository for ML models. It displays insights for each model such as risk ratings, model quality, and data quality. Role Manager lets you create and define roles for AWS users.

SageMaker Model Monitor lets you monitor the quality of your models deployed in production. You can also create alerts for your models.

SageMaker Pipelines allows you to create pipelines for training and deploying models. Whenever the exam asks about MLOps-related services, most likely SageMaker Pipeline would be the correct answer.

Model Fine-Tuning: in the exam, you might face questions about model fine-tuning. You may want to use fine-tuning when you want to take an existing model and do some additional training on it with your data. SageMaker JumpStart is one of the places where you want to start fine-tuning a model.

The exam also likes to compare fine-tuning with other preparation techniques of an LLM model. In case you are faced with a comparison based on price, you would want to keep in mind the following order:

Prompt engineering (excluding RAG): least expensive
RAGs: they are more expensive than other prompt engineering techniques because they usually require the presence of a vector database
Instruction-based fine-tuning: it is a fine-tuning approach that uses labeled data to modify the weights of a model. It requires model training, which demands specific hardware, so it is considered more expensive than RAGs
Domain adaptation fine-tuning: uses unlabeled data for fine-tuning, it is the most expensive approach

Other SageMaker sub-services you would want to look up are the following: SageMaker Canvas, MLFlow for SageMaker, Automatic Model Tuning. Moreover, SageMaker provides a bunch of built-in machine-learning algorithms (for example: XGBoost, DeepAR, Seq2Seq, etc.). You may want to check them out, to at least recognize them if they pop up during the certification.

SageMaker is an advanced topic. It’s somewhat surprising that AWS expects such a deep level of knowledge about it, especially considering the exam is recommended for individuals who may never use this product. If you're comfortable writing a few lines of code and are familiar with Jupyter Notebooks, I recommend doing some hands-on practice with SageMaker.

5. AWS Managed AI Services

AWS offers a comprehensive list of AI services that are managed and trained by them. The list of the services you should be aware of are the following:

Amazon Comprehend and Amazon Comprehend Medical: extract relevant information from documents of all kinds.
Amazon Translate: an on-demand translation service, think of it as an on-demand version of Google Translate.
Amazon Transcribe and Amazon Transcribe Medical: speech-to-text service.
Amazon Polly: text-to-speech service.
Amazon Rekognition: used for image recognition and classification.
Amazon Forecast: used with time series data to forecast stuff. Discontinued by AWS, but is still part of the exam.
Amazon Lex: it is similar to Amazon Q, or an Amazon Bedrock agent. It is technically Alexa as a service.
Amazon Personalize: recommendation service.
Amazon Textract: used to extract text from images (OCR).
Amazon Kendra: it is a document search service. It is somewhat similar to Amazon Q, but it is way more restricted and it cannot do summarization (good idea to keep this in mind!)
Amazon Mechanical Turk: it is not necessarily an AI service. With Mechanical Turk you rely on a human workforce to carry out certain tasks for machine learning, such as labeling, classification, and data collection.
Amazon Augmented AI (A2I): likewise Mechanical Turk, is not necessarily a managed AI service. It is a service that lets you conduct a human review of machine learning models. It can use Mechanical Turk under the hood.
AWS DeepRacer: this is also an interesting thing to mention. It is a game, where you use reinforcement learning to drive a race car. While DeepRaces is still part of the exam, the service is discontinued by AWS.

The exam might present a task and it might ask which service would be able to solve that task. It also might put one of these services head-to-head with Bedrock or Amazon Q.

6. AI Challenges and Responsibilities

The exam will ask you about generative AI challenges and how to overcome them. A few challenges you should keep in mind are the following:

Regulatory violation
Social risks
Data security and privacy concerns
Toxicity
Hallucinations
Nondeterminism

You should be able to identify the challenges of a given AI application. You might be asked to find solutions to overcome some of these challenges. For example, to address hallucinations, you can use Knowledge Bases and Retrieval-Augmented Generation (RAG) techniques. To mitigate toxicity, you can implement Guardrails. To reduce nondeterminism, you can adjust the model's hyperparameters (such as temperature, Top P, or Top K).

Another important topic that may arise is governance. Governance refers to a set of practices that must be followed when developing AI products. For example, you should be able to address the ethical concerns of an AI-based solution, consider bias and fairness, adhere to regulatory and compliance requirements, and ensure data lineage and data quality. There are several AWS services you should be familiar with when discussing governance, including AWS Config, Amazon Inspector, AWS Audit Manager, AWS Artifact, AWS CloudTrail, and AWS Trusted Advisor. You should understand the purpose of each of these services and be able to recognize them in the context of governance-related questions.

Generative AI Security Scoping Matrix: it is a framework designed to identify and manage security risks associated with deploying GenAI applications.
It is used to classify your app in 5 defined GenAI scopes, from low to high ownership:

Scope 1: your app is using public GenAI services
Scope 2: your app is using a SaaS with GenAI features
Scope 3: your app is using a pre-trained model
Scope 4: your app is using a fine-tuned model
Scope 5: your app is using a self-trained model

7. AWS Security Services and Other Services

For the exam, you should be aware of a list of AWS services. As the themes go with other topics, you should have a surface-level knowledge about them. Most importantly, you should know when to use which.

The most important service you will be asked about is AWS Identity and Access Management (IAM). It is used to define roles and access policies. Whenever you, as a user, want to perform any action in your AWS account, you need the necessary permissions. These permissions are granted through roles and policies. Similarly, when one AWS service needs to interact with another, it must have an assigned role that grants the required access. In some cases, this interaction can also be facilitated through service policies. The exam does not go into detail about when to use roles versus policies. The key point to remember is that whenever you are asked about security, think of IAM.

Another important service that will pop-up is S3. S3 is an object storage service, think of it as Google Drive on steroids. Whenever the exam asks about storage for model input/out, you would want to default to S3.

EC2 is also an important service. EC2 provides virtual machines. In the context of machine learning, you need EC2 instances for training and inference. There are several types of EC2 instances. For the exam, you may want to remember the following ones:

P3, P4, P5, G3, G6 instances: These are instances with a GPU assigned to them, they can be used for training and for inference as well;
AWS Trainium and AWS Inferentia: these are instances specifically built for training and inference. They provide a lower cost for either training or inference.

The exam might mention Spot Instances. Spot Instances are EC2 instances available at a lower cost. You can obtain a Spot Instance by placing a bid. They are cheaper than On-Demand Instances, but the trade-off is that they can be interrupted if someone bids a higher price or if AWS needs the capacity for other purposes.

Networking:

You should know what a VPC is. A VPC (Virtual Private Cloud) is a virtual private network, isolated by default from internet traffic or other AWS services. Certain resources, such as EC2 instances or most databases, need to be placed in a VPC to function properly. You can provide internet access to the VPC if desired.
The exam often asks about VPC Endpoints in terms of security. VPC Endpoints allow communication with AWS services from a private VPC, one that does not have internet access. When using a VPC Endpoint, the traffic does not pass through the public internet.

Other services:

CloudWatch: used for monitoring and logging
CloudTrail: used for having a trail about any action in an AWS account
AWS Config: used to enforce compliance in an account
AWS Lambda Function: serverless functions that run only when needed. In the context of this exam, usually they are used for integration between 2 services

Courses and Practice Exams that I Recommend

The previous section aimed to present what you need to know to pass the exam. It is not comprehensive material, it might have missing topics or inaccuracies. If you want a more robust preparation plan, I recommend enrolling in a paid course.

AWS Skill Builder is the official learning portal run by AWS Training and Certification. They have a learning path for the AI practitioner exam. I personally did not use this because the video courses in the learning path do not strictly focus on exam topics.

I recommend taking one of the Udemy courses from Stephane Maarek or Frank Kane. For my preparation, I used Stephane Maarek's course. I'm also familiar with Frank's teaching style, making his course a solid recommendation as well.

Stephane's course was the first available on the market. I purchased it on the day of its release. Having taken his other AWS certification courses, I appreciate his organized content and clear presentation, which makes note-taking easy. Initially, the course had some gaps in the required material, but it was updated based on student feedback. I confidently recommend it.

In addition to courses, taking a practice exam before the live exam is beneficial. For this, Skill Builder is an excellent option. It offers a free practice exam with 20 questions, which is essential for anyone preparing for this certification. If you're willing to pay, Skill Builder also provides a comprehensive practice exam with 85 questions and explanations.

On Udemy, Stephane also offers a set of practice exams. I did not purchase these after completing everything on Skill Builder, so I cannot review them.

My Experience Taking the Exam

I took the exam on Thursday morning from home through Pearson. I had no issues this time, everything went smoothly.

I went through all 85 questions within an hour, after which I spent the next 20 minutes reviewing the questions I flagged. As I hinted before, the exam was not easy and had its fair share of challenges. Most of the questions had the difficulty I expected, but many seemed to be taken directly from the Machine Learning Specialty question set, or at least that’s how it felt. Moreover, the spelling of some questions felt really awkward, making me guess what the author was looking for. I suppose this is a downside to taking a beta exam.

I did not encounter any new types of questions, all of them were multiple choice and multiple selection. These types of questions are not a new innovation from AWS. Anyone who has taken a Microsoft Azure exam should be familiar with them. In my opinion, they do not affect the difficulty of the exam. Personally, I detest the ordering type of questions, while I prefer the case studies.

That being said, I passed without issues. My score was lower than I expected, but at the end of the day, it’s not something I care that much about.

My List of Recommendations for You for the Exam

Keep an eye on the clock. There are 85 questions, and you have less than 1.5 minutes per question. Don't spend too much time on any single question. Many of them can be answered quickly as long as you're well-prepared. Try to select the correct answer and move on.
If you're unsure about a question, flag it and move on. You'll have time to return to it later and figure it out. It's important not to panic if you don't know something right away.
You will encounter questions you don't know the answer to, and that's okay. The official exam guide is intentionally vague. Any course you take will have some gaps. Don't be discouraged if you come across something completely new during the exam. The important thing is to pass. Your final score doesn't matter much.
The exam is challenging, but with adequate preparation, anyone should be able to pass it. You'll be fine.

Closing Notes

Ultimately, I enjoyed the process of preparation and taking the exam itself. You should enjoy yours too! Even though it may not have a significant impact on my career, I’m proud that I was able to pass it.

I wish happy learning and good luck to anyone preparing to become an AI Practitioner!

DEV Community