View AIP-C01 Exam Questions

Q: 1

Scenario: An AI developer needs a scalable, secure way to collect telemetry data (temperature, pressure) from devices in remote locations with unstable connectivity, store it in Amazon S3, and minimize infrastructure management. Question- Which solution meets the given requirements?. Options:

Options

Discussion

Chris Feb 12, 2026 12:57 pm

A . Device sends MQTT to IoT Core, and the direct Firehose rule means no Lambda or extra compute to manage. Keeps it super minimal on infra, just what they’re asking for. Kinesis/Lambda would be overkill unless there’s a need for extra processing steps. Pretty sure A is what AWS recommends for this scenario, but happy to hear other takes if I’m missing something.

Zoe Feb 17, 2026 1:46 am

Yeah, A here. Firehose with IoT Core needs basically zero management and is built for this type of ingestion.

Nathan Y. Feb 26, 2026 7:25 pm

Seen similar scenarios in the official AWS guide and practice questions, and A is what they push for scalable IoT ingestion with minimal overhead. Firehose + IoT Core handles batching to S3 with almost no ops. If anyone saw AWS doc or lab pushing D, let me know, but I think A is solid.

Sam C. Feb 17, 2026 4:40 pm

D looks possible too since Lambda gives flexible processing before hitting S3, but pretty sure Firehose is simpler for this case.

RowanK Feb 20, 2026 12:47 pm

I don’t think it’s D. A is right since Firehose is less overhead and no extra Lambda needed.

Noah E. Mar 1, 2026 8:10 pm

A or D but I think A is the better fit since Firehose is fully managed and cuts out extra moving parts. D only makes sense if you really need custom Lambda processing in between, which the scenario doesn't mention. If they're just warehousing telemetry, A works best. Agree?

Daniel Feb 17, 2026 1:38 pm

A is correct for this one, official guide and AWS practice test both back that up.

Meera I. Feb 24, 2026 10:47 am

Its D. Kinesis plus Lambda for stream processing feels like it covers scale and flexibility here.

DanielR Feb 21, 2026 11:42 am

Saw this type of scenario in some practice labs and official study material, pretty sure it's A.

FriendlyMentor7040 Feb 18, 2026 10:25 am

Not A, D seems better if you expect any processing or logic before storage. Similar questions in exam guides suggest Lambda lets you customize data flow, even if it’s a bit more to manage. Anyone else see that in the official practice tests?

Be respectful. No spam.

Correct Answer:

Explanation

This solution provides a fully managed, serverless, and highly scalable pipeline for IoT data ingestion. Message Queuing Telemetry Transport (MQTT) is a lightweight protocol ideal for devices with constrained resources and unreliable network connectivity, as it supports different Quality of Service (QoS) levels to ensure message delivery. AWS IoT Core securely handles device communication at scale. The IoT Core rule engine can directly forward messages to an Amazon Data Firehose stream without any custom code. Firehose is a fully managed service that automatically batches, compresses, and encrypts the data before reliably delivering it to Amazon S3, perfectly aligning with the requirement to minimize infrastructure management.

Why Incorrect

B. While AWS IoT Greengrass is excellent for edge processing and managing intermittent connectivity, this option requires writing and maintaining custom AWS SDK code on each device for S3 uploads, increasing management overhead compared to a managed cloud pipeline.

C. Using Amazon API Gateway with HTTP is less suitable for this scenario. HTTP is less efficient than MQTT for telemetry and does not handle intermittent network connectivity as gracefully, requiring more complex client-side retry logic.

D. This option is overly complex for the stated requirements. While functional, it uses an Amazon Kinesis Data Stream and an AWS Lambda function to perform a task that Amazon Data Firehose is specifically designed to handle out-of-the-box, thus failing the "minimize infrastructure management" principle.

References

1. AWS IoT Core Developer Guide, "Choosing a protocol": This guide explains the benefits of using MQTT for IoT applications, stating, "MQTT is a lightweight binary protocol... Because it's lightweight, it can be used on devices with limited CPU and memory resources and in networks with limited bandwidth." This supports the choice of MQTT for remote devices.

2. AWS IoT Core Developer Guide, "Amazon Data Firehose action": The documentation details the direct, configuration-based integration between IoT Core and Firehose. It states, "The Amazon Data Firehose action sends data from an MQTT message to an Amazon Data Firehose stream," demonstrating the streamlined pipeline in option A.

3. Amazon Kinesis Data Firehose Developer Guide, "What Is Amazon Kinesis Data Firehose?": This document highlights the service's purpose: "Amazon Kinesis Data Firehose is the easiest way to reliably load streaming data into data lakes, data stores... You don't need to write applications or manage resources." This directly supports option A's adherence to minimizing infrastructure management.

4. AWS Documentation, "FAQ: Amazon Kinesis": In comparing Kinesis Data Streams and Kinesis Data Firehose, AWS clarifies: "Use Amazon Kinesis Data Firehose if you want to deliver data to Amazon S3... Kinesis Data Firehose handles all of the underlying stream management." This confirms that Firehose (Option A) is simpler than Kinesis Data Streams + Lambda (Option D) for this use case.

Q: 2

Scenario: During SageMaker AMT tuning, many jobs continue running despite poor early performance, wasting GPU usage. The company needs a tuning strategy that automatically stops underperforming trials and reallocates resources. Question- Which tuning strategy should be employed to enhance optimization efficiency and expedite hyperparameter search?. Options:

Options

Discussion

Riley S. Feb 23, 2026 6:48 am

B . Hyperband is what actually provides the automatic early stopping, C is tempting but doesn't do the auto-stopping part.

Skyler Feb 20, 2026 6:00 pm

B tbh, Hyperband is what I've seen recommended in the official guide and practice exams for auto-stopping poor jobs. Not 100 percent sure, but pretty confident compared to the other options.

Arjun Feb 28, 2026 6:30 am

C or D for this one. Seen similar on practice exams and random search or Bayesian gets mentioned a lot. Official guide and exam reports help here.

NathanZ Feb 22, 2026 3:35 pm

Why not C for this if the focus wasn’t on early stopping?

Adam Q. Feb 17, 2026 11:29 pm

B here. Hyperband is basically built for this exact early stopping and resource allocation scenario, unlike grid or random search. Pretty sure that's the one they want, but open to pushback if I'm missing something.

Adam R. Feb 16, 2026 12:50 am

C is good for refining search, but it won't cut off bad trials early. B matches the need for auto early stopping and resource shifting since Hyperband does that by design. Unless they ask strictly about optimization, I'd stick with B here. Open to other reads but pretty sure.

Aaron U. Feb 27, 2026 3:06 am

B not C. Hyperband is built for early stopping and reallocating resources, which is the key point in this scenario.

Anita Feb 11, 2026 12:30 pm

B imo, Hyperband fits since it auto-stops weak trials and reallocates GPU. Saw this highlighted in the AWS official guide and exam practice sets. If the scenario didn't need early stopping, C might work, but B is right here.

Ethan B. Feb 20, 2026 3:21 pm

I'd say C on this. Bayesian optimization is known for smartly narrowing down the hyperparameter space, and in some tuning cases, it can outperform grid or random search. Not totally sure here though, since Hyperband is also popular.

Nina I. Feb 20, 2026 5:57 am

Option C. had something like this in a mock exam.

Be respectful. No spam.

Correct Answer:

Explanation

The Hyperband tuning strategy, available in Amazon SageMaker, is specifically designed to address the problem of inefficient resource allocation during hyperparameter tuning. It operates as a multi-fidelity method, using a successive halving algorithm. This approach starts numerous trials with a small amount of resources (e.g., epochs) and then progressively eliminates the lower-performing half. The remaining, more promising trials are allocated more resources in subsequent rounds. This mechanism inherently performs early stopping of underperforming jobs, optimizing computational resource usage and accelerating the discovery of optimal hyperparameters, which directly solves the scenario's core problem.

Why Incorrect

A. Grid search is computationally expensive as it evaluates every combination, and the option explicitly excludes early stopping, which is the primary requirement.

C. Bayesian optimization intelligently selects the next hyperparameters to test but does not inherently use the aggressive, resource-based early stopping mechanism that defines Hyperband.

D. Random search samples hyperparameters without learning from prior results and, like other strategies, requires a separate early stopping configuration; it is not the strategy itself.

References

1. Amazon SageMaker Developer Guide: "The Hyperband tuning strategy treats the automatic model tuning process as a search for the best configuration in an infinite number of configurations... Hyperband uses a successive halving algorithm to choose the configurations to advance to the next rung. It stops underperforming configurations after one rung."

Source: AWS Documentation, Amazon SageMaker Developer Guide, "How Hyperband Works".

2. Academic Publication (Original Hyperband Paper): "We introduce HYPERBAND, a novel algorithm for hyperparameter optimization... We show that HYPERBAND can provide a principled approach to early-stopping which is simple, flexible, and effective."

Source: Li, L., Jamieson, K., DeSalvo, G., Rostamizadeh, A., & Talwalkar, A. (2018). Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization. Journal of Machine Learning Research, 18(185), 1-52. (Section 1: Introduction).

3. AWS Machine Learning Blog: "Hyperband is a simple yet powerful algorithm for hyperparameter tuning that maximizes the number of configurations that can be evaluated by using a principled early stopping mechanism."

Source: AWS Machine Learning Blog, "Amazon SageMaker Automatic Model Tuning now supports Hyperband" (Nov 26, 2019).

4. University Courseware: Lecture materials on AutoML often describe Hyperband as a key algorithm for efficient hyperparameter optimization through early stopping.

Source: Carnegie Mellon University, 10-708 Probabilistic Graphical Models, Lecture on "Bayesian Optimization and Hyperband". (Discusses Hyperband as a state-of-the-art method for early stopping).

Q: 3

Scenario: A document classification model detects fraud. It performs well on the majority ("legitimate claim") documents but frequently misclassifies the minority ("fraudulent claim") samples. SageMaker Clarify pretraining bias analysis reveals a significant skew in the dataset. Question- What issue is most likely causing the model's poor performance on fraudulent claim detection? Options:

Options

Discussion

Sam E. Feb 17, 2026 6:29 am

Option D, encountered exactly similar question in my exam. Kendra is built for semantic search and processes unstructured data from S3 directly. The other services don't handle semantic queries natively.

AveryZ Feb 22, 2026 3:54 pm

Option B If the dataset skew was actually about features and not the targets, then C could be a possible trap answer.

Mia O. Feb 24, 2026 4:30 pm

My vote is B, class imbalance is the classic issue here since Clarify flagged dataset skew. That would make the model much worse at spotting the minority (fraud) class. Not totally certain if other factors could play in, but skew points to B.

PreciseAnalyst8146 Feb 13, 2026 1:26 pm

B imo

Nora I. Feb 26, 2026 1:24 pm

I don’t think it’s A. B fits here since class imbalance makes the model weak at detecting the fraud cases. Clarify flagging label skew is a big clue. A and D are common traps but not what's described, agree?

Ajay Feb 24, 2026 4:25 pm

AWS loves to sneak Kendra into these semantic search questions, D imo.

Aaron G. Feb 16, 2026 5:20 pm

So with this one, wouldn't Kendra (D) be the best fit? SageMaker or OpenSearch can handle embeddings and search, but they're a lot more manual for RAG semantic queries. Kendra's pretty much designed for semantic/contextual search straight from S3 using its connector. Option C is tempting but Textract/OpenSearch won't do deep semantic out of the box. Am I missing a use case where C would be better?

Ryan Feb 23, 2026 4:52 am

Not D, I'm thinking C because Textract plus Redshift lets you analyze text, then OpenSearch can handle queries.

QuickNeteng2425 Feb 18, 2026 1:21 pm

Its B here, since SageMaker Clarify calling out dataset skew usually points at class imbalance. Fraud is classic for this, minority class trips the model up. Open to being wrong but that's what matches the scenario.

Robin U. Feb 16, 2026 3:44 pm

B is the right pick since SageMaker Clarify flagged dataset skew, which really just means class imbalance. That’s classic in fraud scenarios where one label’s way rarer than the other, so model ignores the minority by default. Haven’t seen a question like this catch people with C, but lmk if you think otherwise.

Be respectful. No spam.

Correct Answer:

Explanation

The scenario explicitly states that a "SageMaker Clarify pretraining bias analysis reveals a significant skew in the dataset." This directly points to Class Imbalance (CI), a pre-training bias metric that SageMaker Clarify is designed to detect. Class Imbalance occurs when the distribution of target labels is unequal. In this fraud detection case, "fraudulent claim" samples are the minority class. A model trained on such a dataset can achieve high accuracy by simply predicting the majority class ("legitimate claim"), failing to learn the distinguishing features of the minority class, which explains the poor performance on fraudulent claims.

Why Incorrect

A. A high learning rate is a hyperparameter issue that typically causes training instability or failure to converge, not a specific bias against a minority class.

C. Insufficient text extraction is a feature engineering problem. While it could degrade overall performance, it does not inherently explain the specific failure on the minority class versus the majority class.

D. Overfitting describes poor generalization to any unseen data. The problem here is a specific, predictable failure on the minority class due to dataset distribution, which is more precisely defined as bias from class imbalance.

References

1. Amazon SageMaker Developer Guide: Under the section "Pretraining Bias Metrics," the guide lists "Class Imbalance (CI)" as a key metric. It states, "A dataset is imbalanced if the class labels are not equally distributed... The CI is the ratio of the number of samples in the majority class to the number of samples in the minority class." This directly links the "significant skew" found by Clarify to the concept of Class Imbalance.

Source: Amazon SageMaker Developer Guide, Section: "Bias detection with SageMaker Clarify" -> "Pretraining Bias Metrics".

2. AWS Machine Learning Blog: In the article "Handle class imbalance with Amazon SageMaker," it is explained that "if the training dataset has a disproportionate ratio of samples in each class, a machine learning (ML) model may be biased towards the majority class. This is a common problem in use cases like fraud detection... where the number of fraudulent transactions is much smaller than the number of legitimate ones." This aligns perfectly with the scenario.

Source: AWS Machine Learning Blog, "Handle class imbalance with Amazon SageMaker," May 18, 2022.

3. Academic Publication: He, H., & Garcia, E. A. (2009). Learning from Imbalanced Data. IEEE Transactions on Knowledge and Data Engineering, 21(9), 1263–1284. This foundational paper explains that standard algorithms are often biased towards the majority class and that their performance on the minority class is compromised.

DOI: https://doi.org/10.1109/TKDE.2008.239, Section II-A, "The Nature of the Imbalance Problem".

Q: 4

Scenario: A data scientist needs to develop a fraud detection model on SageMaker with a severely imbalanced dataset (fraudulent transactions are rare). They must minimize operational overhead and ensure the model is fair and unbiased. Question- Which approach will fulfill the given requirements?. Options:

Options

Discussion

Ryan J. Feb 20, 2026 8:37 pm

Option D I don't think A is right since Lex is more for voice bots, not ASR tuning. D matches the AWS Transcribe workflow from what I remember on similar practice questions.

Nathan Feb 22, 2026 5:32 am

Option D. because Clarify is the AWS tool for bias checks and Pipelines automate the workflow (helps with keeping ops overhead down). A looks tempting but Reinforcement Learning isn't really typical for fraud detection classification. Pretty sure D is right here, but happy to hear counterpoints if folks disagree.

Olivia X. Feb 18, 2026 3:21 am

D since Pipelines automates the workflow and Clarify is actually built for bias checks-fits best unless they wanted a human-in-the-loop, which the scenario doesn't say. Pretty sure that's the key nitpick people miss.

Zoe X. Feb 19, 2026 11:47 pm

B tbh, fine-tuning a Bedrock model for better embeddings could help the ASR context. It sounds like a modern approach if you want better accuracy over time with less manual updates. Maybe it's not the most direct, but worth considering.

Ben X. Feb 28, 2026 3:33 am

D imo. Had something exactly like this show up on my exam and Transcribe custom vocabulary was the answer there too.

Luke I. Feb 26, 2026 11:08 pm

Pretty sure D is right here since Amazon Transcribe custom vocab lets you add/update product names fast, you don't have to retrain the whole model. The others seem more for general AI or search stuff? Not fully confident, let me know if I missed something.

Sanjay G. Feb 17, 2026 12:44 pm

D fits-Pipelines cut down ops overhead, and Clarify is made for checking bias automatically. A2I’s for human reviews, but the question didn’t say that was needed. I think D’s the most AWS-native and direct route here, unless I’m missing something.

Jack Feb 17, 2026 7:20 pm

I don’t think A is right since RL isn't a fit for fraud classification. It's D, Pipelines and Clarify are built for this.

Sara L. Feb 19, 2026 9:17 am

C or B for me. Both mention using A2I for bias, which seems like it could help with detection, and they include SMOTE for balancing data. I know Clarify is more standard for bias checks but thought A2I handled some of that too? Not 100 percent convinced though, maybe missing something with Pipelines. Anyone else prefer A2I here or am I way off?

Parker J. Feb 19, 2026 8:26 am

Pretty sure D's correct, but if human review was a compliance requirement or if they needed a manual approval step, then A2I (so B or C) would come into play instead. Here, nothing in the question says that, so D fits best for minimizing ops and automating fairness. Disagree?

Be respectful. No spam.

Correct Answer:

Explanation

This approach correctly addresses all requirements of the scenario. Using the Synthetic Minority Oversampling Technique (SMOTE) is a standard and effective method for handling severely imbalanced datasets in classification tasks like fraud detection. Amazon SageMaker Pipelines are designed to automate and orchestrate machine learning workflows, which directly fulfills the requirement to minimize operational overhead. Finally, Amazon SageMaker Clarify is the specific AWS service designed to detect potential bias in data and models before deployment, ensuring the model is fair and unbiased. This combination provides a complete, automated, and responsible AI solution within the SageMaker ecosystem.

Why Incorrect

A. SageMaker Reinforcement Learning is not suitable for a supervised classification problem like fraud detection. It is designed for learning optimal actions in an environment, not for classifying transactions.

B. Amazon Augmented AI (A2I) is used for implementing human review of model predictions, not for pre-deployment bias detection. SageMaker Clarify is the correct tool for analyzing bias.

C. This option is identical to option B and is incorrect for the same reason. Amazon A2I is a human-in-the-loop service, not a tool for pre-deployment bias analysis.

References

1. SageMaker Clarify for Bias Detection: The AWS documentation states, "Amazon SageMaker Clarify helps improve your machine learning models by detecting potential bias and helping explain how these models make predictions." This confirms its use for pre-deployment bias analysis.

Source: AWS Developer Guide, "Detecting Bias in ML Models with Amazon SageMaker Clarify". Section: "Fairness and model evaluation with SageMaker Clarify".

2. SageMaker Pipelines for Automation: The documentation highlights that "Amazon SageMaker Pipelines helps you to automate and manage your machine learning (ML) workflows," which directly addresses minimizing operational overhead.

Source: AWS Developer Guide, "Amazon SageMaker Pipelines". Section: "Create and Manage Workflows with Amazon SageMaker Pipelines".

3. Handling Imbalanced Data (SMOTE): While SMOTE is a general technique, its application within SageMaker is a common pattern. AWS provides examples of using techniques like SMOTE in SageMaker Processing jobs to rebalance datasets before training.

Source: AWS Machine Learning Blog, "Handling class imbalance in a fraudulent transaction dataset". This blog details using SMOTE within a SageMaker workflow.

4. Amazon Augmented AI (A2I) Purpose: The official documentation specifies, "Amazon Augmented AI (Amazon A2I) makes it easy to build the workflows required for human review of ML predictions." This confirms it is not a bias detection tool.

Source: AWS Developer Guide, "What Is Amazon Augmented AI?". Section: "Introduction".

Q: 5

Scenario: A forecasting pipeline needs retraining on a larger dataset with a different distribution. Budget is limited, so the new tuning job must leverage previously saved high-performing hyperparameters, and must automatically stop if validation loss does not improve. Question- Which hyperparameter tuning job configuration should be used?. Options:

Options

Discussion

NehaE Feb 28, 2026 12:05 am

Makes sense to pick A here.

SharpSec6882 Feb 24, 2026 11:16 am

A. not D. TRANSFER_LEARNING is specifically for cases where the data distribution changes, which is what the scenario says. D's IDENTICAL_DATA_AND_ALGORITHM only works if the dataset hasn't changed, so that's a trap here.

Avery M. Feb 13, 2026 8:35 am

Option A

Amelia Feb 15, 2026 3:22 am

Alex I. Feb 12, 2026 9:33 pm

I don’t think it’s D. A matches because the data distribution changed, so only TRANSFER_LEARNING warm start (option A) lets you reuse old hyperparams from a different dataset. Early stopping helps with the budget too. Pretty sure this is what AWS expects for scenarios like this, but chime in if you see a catch.

Logan Feb 24, 2026 11:53 pm

Ishaan K. Feb 27, 2026 9:22 am

QuickCandidate9514 Feb 18, 2026 1:40 am

I think A. IDENTICAL_DATA_AND_ALGORITHM (D) is only for same dataset, but here the distribution has changed so transfer learning warm start makes sense.

Liam C. Feb 22, 2026 8:13 pm

A tbh, since D is tempting but that one only works when the data distribution doesn't change.

Parker P. Feb 20, 2026 7:27 pm

C/D? AMT Hyperband in C would be better for pure cost reduction, but the question wants to reuse prior hyperparams even with distribution change, so A makes sense. TRANSFER_LEARNING warm start fits that need, plus Early Stopping covers budget. Pretty sure it's A but open to other logic.

Be respectful. No spam.

Correct Answer:

Explanation

The scenario requires leveraging knowledge from a previous hyperparameter tuning job for a new task involving a different dataset, while also managing a limited budget. A warm start tuning job is the correct approach to reuse prior knowledge. Specifically, the TRANSFERLEARNING warm start type is designed for scenarios where you are tuning on a new dataset but want to use a previous job as a starting point. This accelerates the tuning process. Additionally, enabling Automatic Model Tuning (AMT) Early Stopping directly addresses the budget constraint by automatically terminating training jobs that no longer show improvement in the validation loss, thus saving compute resources.

Why Incorrect

B. This option is incorrect because it explicitly states it will not import prior tuning job results, which contradicts the core requirement to leverage previously saved hyperparameters.

C. This is incorrect for the same reason as option B; it relies on Hyperband's efficiency alone and does not import prior warm start knowledge as required by the scenario.

D. This option is incorrect because the IDENTICALDATAANDALGORITHM warm start mode is only appropriate when resuming a tuning job with the exact same dataset and algorithm, which is not the case here as the dataset is new and has a different distribution.

References

1. AWS SageMaker Developer Guide, "Run a Warm Start Tuning Job": This document details the two types of warm start jobs. It specifies, "Use the TRANSFERLEARNING warm start type when you want to use a previous tuning job as a starting point to tune on a new dataset with a similar schema." This directly supports option A. It also clarifies that IDENTICALDATAANDALGORITHM is for resuming a job on the same data, making option D incorrect.

Source: docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-warm-start.html

2. AWS SageMaker Developer Guide, "Stop Training Jobs Early": This page explains the early stopping feature. It states, "To save time and resources, you can configure your hyperparameter tuning job to stop training jobs that are not improving as measured by the objective metric." This supports the use of early stopping for budget efficiency as mentioned in option A.

Source: docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-early-stopping.html

3. AWS SageMaker API Reference, HyperParameterTuningJobWarmStartConfig data type: The official API documentation defines the WarmStartType parameter, listing IDENTICALDATAANDALGORITHM and TRANSFERLEARNING as the valid enumerations, confirming the technical options available and their intended use cases.

Source: docs.aws.amazon.com/sagemaker/latest/APIReference/APIHyperParameterTuningJobWarmStartConfig.html

Q: 6

Scenario: A multinational company needs an efficient solution to process audio/video content, translate it from Spanish (and other languages) into English, and summarize it quickly using an LLM, minimizing deployment time and maximizing scalability. Question- Which option will best fulfill these requirements in the shortest time possible? Options:

Options

Discussion

Drew Feb 25, 2026 12:51 pm

A is quickest since it’s all managed services for each step, so no need to build or train anything custom. B would take longer because of model training time. Anyone see a scenario where D would actually be faster?

Jack Feb 25, 2026 1:41 pm

A , had a similar question on a practice exam and it was definitely A for fastest setup.

Karan X. Mar 1, 2026 1:00 pm

D imo if we needed deep analytics, but the question emphasizes speed and deployment simplicity. A seems to hit all the asks with less setup work. Not totally sure though since sometimes they want SageMaker involved, agree?

FocusedLead9794 Feb 22, 2026 10:00 am

A since Transcribe plus Translate plus Bedrock gets you conversion, translation, and LLM summarization with almost zero setup. B and D take more time setting up SageMaker models. Pretty sure A matches the requirements tightest-let me know if I missed something.

PracticalCandidate9819 Feb 25, 2026 6:04 am

D hits all the requirements: SSE-KMS for encryption at rest, IAM for strict access, and CloudWatch monitoring covers continuous model checks. The others skip something critical like strong encryption or real model metrics. Pretty sure D is it, but let me know if I missed a flaw.

Ravi T. Feb 25, 2026 5:43 am

Option D

Aisha Z. Feb 21, 2026 8:51 pm

A seen similar question in my last practice exam. Managed workflow so quickest option here tbh.

Anita C. Feb 14, 2026 6:48 pm

Probably A, . Managed services like Transcribe, Translate, Bedrock are ready out of the box so shortest ramp-up time here.

Vikram S. Mar 2, 2026 3:17 pm

A Managed services like Transcribe and Bedrock mean you skip custom model training and big setup time. Pretty sure this is what AWS wants for fast scalable translation + summary, but if they wanted analytics instead maybe D? Disagree?

Reese Q. Feb 26, 2026 1:47 pm

A tbh. SageMaker's a trap here, much slower to deploy than just chaining Transcribe and Bedrock.

Be respectful. No spam.

Correct Answer:

Explanation

This option presents a complete, serverless, and efficient end-to-end workflow using the most appropriate managed AWS services for each required task. Amazon Transcribe is specifically designed for converting speech from audio/video into text. Amazon Translate is the dedicated service for neural machine translation. Amazon Bedrock provides a serverless API to access powerful foundation models like AI21 Labs' Jamba for summarization. This combination directly addresses all requirements, especially minimizing deployment time and maximizing scalability, by leveraging fully managed, purpose-built services.

Why Incorrect

B. Training custom models in Amazon SageMaker is significantly more time-consuming and complex than using managed services, directly contradicting the requirement for minimal deployment time.

C. This option misuses services. AWS Glue is for ETL on structured data, not audio/video transcription, and Amazon Lex is for building conversational bots, not for text summarization.

D. This option is critically flawed as it omits the initial, essential step of transcribing the audio/video content into text. Amazon Translate cannot process audio/video files directly.

References:

1. Amazon Transcribe: Official documentation states, "Amazon Transcribe uses advanced deep learning technologies to recognize speech in audio and video files and transcribe it into text." This confirms its role as the correct first step for processing the source media.

Source: AWS Documentation, "What Is Amazon Transcribe?", Introduction.

2. Amazon Translate: The service is designed for text-to-text translation. "Amazon Translate is a neural machine translation service that delivers fast, high-quality, affordable, and customizable language translation." This confirms it is the correct service for the second step, acting on the text output from Transcribe.

Source: AWS Documentation, "What is Amazon Translate?", Introduction.

3. Amazon Bedrock: Bedrock is designed for rapid integration of generative AI capabilities. "With Amazon Bedrock's serverless experience, you can quickly get started... and easily integrate and deploy them into your applications... Common use cases include text generation, chatbots, search, text summarization, image generation, and personalization." This validates its use for quick and scalable summarization.

Source: AWS Documentation, "What is Amazon Bedrock?", Introduction.

4. Solution Architecture: AWS often presents architectures combining these services for similar use cases. For example, a common pattern for content analysis involves a pipeline of Amazon S3 -> Amazon Transcribe -> Amazon Translate -> Amazon Comprehend or an LLM in Amazon Bedrock for summarization. This pattern aligns perfectly with option A.

Source: AWS Architecture Blog, "Summarize and query content using generative AI on AWS", Architecture Overview section. This blog post details a solution using Amazon Transcribe and Amazon Bedrock for summarization.

Q: 7

Scenario: A CNN model training job (using an EC2 On-Demand Instance) experiences significantly long training times due to slow data reads from S3, as it currently uses File mode (sequential download). The engineer must improve I/O performance without modifying the model architecture or scripts. Question- Which action should the engineer take to optimize training performance most efficiently? Options:

Options

Discussion

Kevin Z. Feb 12, 2026 11:28 pm

Option D

Aisha Feb 19, 2026 6:17 am

Yep, I’d say D too. Pipe mode streams data straight from S3 as you train, so it’s way better for I/O than File or FastFile modes. Saw a similar question in practice dumps. Not 100% since AWS docs change sometimes, but from what I’ve seen D is what the exam wants.

Ryan U. Feb 22, 2026 8:42 am

D pretty sure since Pipe mode streams S3 data for top throughput. C tempts people but isn't as fast.

Cameron V. Feb 20, 2026 6:45 pm

It’s D, not C. C can help, but Pipe mode (D) is more efficient for streaming large datasets from S3 and avoids the local download bottleneck. A lot of folks mix those up since FastFile is newer, but exam feedback usually points to D.

Logan Feb 11, 2026 10:43 am

Pipe mode is the AWS way for faster S3 reads in this exact scenario, so D makes sense. Saw similar in a practice exam, and D was correct for highest I/O without changing code. Pretty sure that's what they're after here.

PreciseAuditor8699 Feb 17, 2026 12:53 am

C or D? I remember official training docs talk about Pipe mode (D) as best for high-throughput S3 streaming, but FastFile (C) also improves read speed for some loads. Would check the AWS exam guide and experiment a bit before picking. Not 100% sure.

Meera N. Feb 25, 2026 2:20 am

Option C

Nora N. Feb 28, 2026 4:10 am

D imo. Only D has all the controls: KMS encryption on both S3 and Bedrock, CloudTrail for full API auditing, and CloudWatch for regional monitoring (latency/throughput). The others skip key stuff like observability or proper encryption. Pretty sure that's what the question's after but open to pushback if I missed something!

Nathan T. Feb 14, 2026 1:43 pm

Practice questions and the AWS training guide both talk about FastFile mode as a streaming option, so isn't C valid here?

Drew U. Mar 1, 2026 4:16 pm

Maybe C. FastFile mode streams from S3 too, and doesn't need code changes. I thought it was the direct alternative to File mode. Not totally confident though, someone correct me if I'm off.

Be respectful. No spam.

Correct Answer:

Explanation

The core issue is the I/O bottleneck caused by SageMaker's default File mode, which downloads the entire dataset from S3 to the training instance's local storage before training begins. This leads to significant startup delays, especially with large datasets. Pipe mode resolves this by streaming data directly from S3 to the training algorithm as it runs. This eliminates the initial download wait time, allowing training to start almost immediately and significantly improving data throughput. Since the constraint is to not modify the model scripts, changing the data input configuration from File to Pipe mode is the most efficient and appropriate action.

Why Incorrect

A. Increasing the instance size might improve network bandwidth but doesn't change the inefficient "download-then-train" mechanism of File mode. The primary bottleneck is the data access pattern, not the compute or network hardware itself.

B. Using Amazon Rekognition Custom Labels is a completely different, higher-level AI service. This would require abandoning the existing CNN model and scripts, directly violating the problem's constraints.

C. FastFile mode is an improvement over standard File mode as it uses multipart downloads to speed up the initial data transfer. However, it still downloads the entire dataset before training, making it less efficient than Pipe mode's streaming approach for I/O-bound jobs.

References

1. AWS SageMaker Developer Guide - Access Training Data: This document explicitly compares the different input modes. It states, "In Pipe mode, the data is streamed directly from Amazon S3 to your training algorithm without saving the data to the EBS volume of the training instance. This shortens the startup time for training jobs and offers better read throughput than File mode."

Source: AWS SageMaker Developer Guide, Section: "Access Training Data" -> "Select a data source" -> "Amazon S3".

2. AWS SageMaker API Reference - CreateTrainingJob: The documentation for the CreateTrainingJob operation shows that InputMode is a configurable parameter within the Channel specification. This confirms that switching from 'File' to 'Pipe' is a configuration change, not a code modification.

Source: AWS SageMaker API Reference, CreateTrainingJob -> InputDataConfig -> Channel -> InputMode.

3. AWS Machine Learning Blog - "Using Pipe input mode for Amazon SageMaker algorithms": This blog post details the performance benefits of Pipe mode. It highlights, "With pipe mode, your training job streams data directly from Amazon S3. Because the data is streamed, your training job starts sooner, finishes faster, and requires less disk space."

Source: AWS Machine Learning Blog, "Using Pipe input mode for Amazon SageMaker algorithms," published November 20, 2017.

Q: 8

Scenario: A retail team needs an automated way (minimal manual effort) to build a model to predict customer churn and identify the most relevant features contributing to the prediction (explainability). Question- Which of the following solutions will best fulfill these requirements while minimizing manual effort?. Options:

Options

Discussion

Riley Feb 18, 2026 8:39 pm

Option A

Nora Feb 18, 2026 11:17 am

Reese Feb 12, 2026 6:37 pm

A . Autopilot does the training and Clarify handles feature importance, so barely any manual steps needed. The others mean more setup or don't fully automate both parts. Anybody think B could work better for fast explainability?

Skyler U. Mar 1, 2026 5:45 am

I don’t think C’s right. A. Saw a similar question on a practice test, Autopilot plus Clarify checks both automation and explainability.

Jason Feb 22, 2026 3:23 am

A . Autopilot plus Clarify is way less manual and checks both boxes here.

Drew X. Mar 2, 2026 2:20 am

Pretty sure A is right for minimal manual work. Autopilot does the heavy lifting and Clarify handles feature explainability with almost zero setup. Saw a similar question in recent exam reports, but let me know if anyone used B successfully?

Owen Feb 10, 2026 10:25 pm

D imo. Ground Truth plus custom TensorFlow adds a lot more manual steps, so it's a trap if we're looking for minimal effort-pretty sure A's the one that automates both training and explainability best.

Priya G. Feb 11, 2026 7:22 pm

Makes sense to pick A here. Autopilot automates the whole ML pipeline for churn prediction and Clarify gives built-in feature importance, so minimal manual work and you get explainability. Pretty sure this fits the scenario best, unless I missed something.

Grace Feb 21, 2026 3:24 am

Not sure B is right, as Data Wrangler isn't for model training, and C just does clustering not prediction. A?

Jamie Mar 1, 2026 3:45 pm

B is off, D is what you want for XGBoost here.

Be respectful. No spam.

Correct Answer:

Explanation

Amazon SageMaker Autopilot is designed to automate the end-to-end process of building a machine learning model with minimal manual effort. It automatically explores different data preprocessing steps, algorithms, and hyperparameters to find the best model for a given dataset. For a customer churn problem, which is a classification task, Autopilot is an ideal solution. Furthermore, SageMaker Autopilot integrates with SageMaker Clarify to automatically generate explainability reports. These reports use techniques like SHAP (SHapley Additive exPlanations) to quantify the contribution of each feature to the model's predictions, directly fulfilling the requirement to identify the most relevant features.

Why Incorrect

B. SageMaker Data Wrangler is primarily for data preparation and feature engineering. Its "Quick Model" feature is for exploratory analysis, not for building and tuning an optimized production model automatically.

C. The k-means algorithm is an unsupervised clustering method. It groups similar customers but does not perform supervised classification to predict a specific outcome like churn.

D. This option requires the most manual effort. SageMaker Ground Truth is for data labeling, and building a custom TensorFlow model is a complex, code-intensive task, directly contradicting the "minimal manual effort" requirement.

References

1. Amazon SageMaker Autopilot Documentation: "Amazon SageMaker Autopilot automatically builds, trains, and tunes the best machine learning models based on your data, while allowing you to maintain full control and visibility... For an Autopilot experiment, SageMaker Clarify is used to help explain how the models make predictions." (Source: AWS SageMaker Developer Guide, "Automate model development with Amazon SageMaker Autopilot").

2. Amazon SageMaker Clarify Documentation: "SageMaker Clarify provides tools to help you understand why your machine learning models make the predictions that they do... Clarify uses a feature attribution approach based on the concept of a Shapley value..." (Source: AWS SageMaker Developer Guide, "Explainability for Amazon SageMaker Autopilot models").

3. Amazon SageMaker Data Wrangler Documentation: "Amazon SageMaker Data Wrangler reduces the time it takes to aggregate and prepare data for machine learning (ML) from weeks to minutes." Its focus is on the data preparation stage. (Source: AWS SageMaker Developer Guide, "Prepare ML Data with Amazon SageMaker Data Wrangler").

4. Amazon SageMaker Built-in Algorithms - K-Means: "The Amazon SageMaker k-means algorithm is an unsupervised learning algorithm. It attempts to find discrete groupings within data..." (Source: AWS SageMaker Developer Guide, "K-Means Algorithm").

Q: 9

Scenario: A claims automation system uses SageMaker AI, predicting claim approval based on vehicle damage severity and other features (age, mileage). The model must be continuously monitored for feature attribution drift in production (i.e., if the model starts prioritizing less relevant features like vehicle age over damage severity). Question- Which solution should be implemented? Options:

Options

Discussion

QuickOps7160 Feb 13, 2026 11:18 am

Probably A here. Only ModelExplainabilityMonitor with SHAP directly catches attribution drift in production, which is exactly what the scenario describes. Options like D are more for training time and B focuses on quality metrics, not feature importance. Seen this wording trip people up before, but pretty sure A is the fit.

Alex J. Feb 20, 2026 5:24 pm

Option D makes more sense here. ModelExplainabilityMonitor with SHAP is designed to track feature attribution drift specifically, not just input or output distribution shifts (like C does). C is a common trap but doesn't really capture changes in how the model weighs features. Agree?

Quinn Y. Feb 11, 2026 4:20 am

A tbh, since D is more for bias/fairness and not ongoing feature drift in production.

Cameron A. Feb 23, 2026 8:12 am

A - official AWS docs and the practice test both mention ModelExplainabilityMonitor for attribution drift.

MethodicalConsultant4152 Feb 25, 2026 5:13 pm

Not sure I agree with A here, I think C. DataCapture lets you log inputs and outputs, so you could spot if the feature distributions or model outputs are changing. Maybe I’m missing something, but isn’t that still useful for drift? Open to being corrected though.

Alex G. Feb 13, 2026 5:30 am

Yeah this one's definitely D

Layla J. Feb 24, 2026 6:26 am

Not seeing how any but A directly handles feature attribution drift. ModelExplainabilityMonitor with SHAP is made for this, tracks importance shift in prod. Others focus on metrics or bias, not feature priority changes. Anyone see a case for D here?

HelpfulLead8377 Feb 26, 2026 7:03 pm

A makes sense here since ModelExplainabilityMonitor is set up for tracking changes in feature importance using SHAP, which is exactly what's needed for attribution drift. The other options might help with performance or data drift, but not with how features are weighed. Pretty sure about A but open to other takes.

Karan W. Feb 14, 2026 4:43 am

Why not B here? Isn't ModelQualityMonitor only about accuracy metrics, not how features are weighed?

Nina M. Feb 18, 2026 6:04 pm

Be respectful. No spam.

Correct Answer:

Explanation

The core requirement is to continuously monitor for feature attribution drift in production. This means tracking changes in how the model weighs different input features when making predictions. The Amazon SageMaker ModelExplainabilityMonitor is the specific tool designed for this purpose. It uses explainability algorithms like SHAP (SHapley Additive exPlanations) to compute feature importance scores for real-time inferences. It then compares these scores against a pre-established baseline (generated from the training data) to detect significant drift, which directly addresses the scenario of the model incorrectly prioritizing features like vehicle age over damage severity over time.

Why Incorrect

B. ModelQualityMonitor tracks predictive performance metrics (e.g., accuracy, recall) by comparing predictions to ground truth. It measures what the model predicts, not why, and would not directly detect feature attribution drift.

C. This describes a manual, custom implementation for data or prediction drift monitoring. While possible, it is not the purpose-built, managed solution for feature attribution monitoring provided by SageMaker, making it less efficient and direct than option A.

D. SageMaker Clarify is the underlying technology, but this option incorrectly frames its use for a one-time analysis on the training dataset. The question requires continuous monitoring in production, which is the specific function of the ModelExplainabilityMonitor.

References

1. AWS SageMaker Developer Guide - Monitor model explainability: "You can also use Model Monitor to monitor for drift in feature attribution... Model Monitor detects drift by comparing the feature attributions for predictions that your model makes in production to the feature attributions from a baseline. Model Monitor uses SHAP values to explain the relative importance of each feature to the predictions that a model makes." This directly supports option A.

2. AWS SageMaker Developer Guide - How Model Explainability Monitoring Works: "The ModelExplainabilityMonitor class can inspect your model and generate a baseline for you... When you start a monitoring schedule, it launches a processing job that uses the baseline to analyze the predictions that your model makes on the data that it receives in production." This confirms the operational flow described in option A.

3. AWS SageMaker Developer Guide - Monitor model quality: "Model quality monitoring compares the predictions that your model makes with the actual ground truth labels that you store in Amazon S3... SageMaker compares the model's predictions to the ground truth labels to compute quality metrics." This confirms that ModelQualityMonitor (Option B) focuses on performance, not explainability.

4. AWS SageMaker Python SDK Documentation - sagemaker.modelmonitor.ModelExplainabilityMonitor: The documentation for this class explicitly states it "Handles Amazon SageMaker Model Monitor explainability monitoring jobs," confirming it as the correct tool for the task.

Q: 10

Scenario: SageMaker notebook instances are deployed inside an isolated VPC with interface endpoints, yet unauthorized external users can still access them through the internet. Question- How can the team limit access to the SageMaker notebook instances, ensuring only authorized VPC users can connect?. Options:

Options

Discussion

Olivia L. Feb 13, 2026 6:22 am

C. The IAM policy option is what actually blocks presigned notebook access from outside the VPC interface endpoint. Seen this approach recommended in AWS docs, since security groups alone (D) can't stop URL creation elsewhere. Pretty confident that's what they want here, but let me know if you think otherwise.

Nathan G. Feb 17, 2026 1:17 pm

Yeah, C makes the most sense here.

IvyN Feb 24, 2026 11:58 am

C imo, D's just network level but C's IAM policy actually blocks presigned URL creation from outside, so more secure.

Aisha K. Feb 15, 2026 8:00 pm

Why not just use a Glue job over VPN? That way, you only pull the non-sensitive bits right from SQL and make sure nothing confidential ever leaves. D seems risky since the whole DB backup moves first. Anyone see an issue with C's process?

Luna W. Feb 23, 2026 6:57 am

Honestly, this one is all about where presigned URLs get created. If the question asked for just network-level blocks, D could be right. Here though, C nails it, since locking down CreatePresignedNotebookInstanceUrl to VPC endpoints means nobody outside makes a valid URL in the first place. Unless there's another way to create that URL externally, C fits. Agree?

Noah D. Feb 26, 2026 9:47 pm

Its C, saw something like this in a practice. The IAM policy with VPC endpoint condition stops outside URL creation, so only users in the VPC can actually access. Not 100 percent but fits the scenario.

AishaM Feb 26, 2026 6:19 pm

Had something like this in a mock and went with D. If you update the security group to only allow VPC CIDRs, it should block incoming traffic from outside sources-so external users can't connect in directly. Not totally sure it's enough if a presigned URL leaks though, open to corrections.

Taylor J. Feb 15, 2026 12:09 pm

C over D. Security groups (D) just filter traffic but don’t stop someone from creating a presigned URL elsewhere. C’s IAM policy locks down presigned URL generation to VPC endpoints, which is what actually limits access. Pretty sure that’s it but open to other takes.

Meera X. Feb 11, 2026 4:01 pm

If users outside can still generate presigned URLs, isn't locking down with just security groups (D) not enough? Security groups can help, but the IAM policy in C actually blocks presigned URL creation unless it's through the VPC endpoint. I think that's the extra step needed to really restrict access, but open to input if I'm missing a scenario here.

EthanA Mar 3, 2026 5:18 am

Seen similar practice questions and pretty sure the key is restricting presigned URL creation. That's why C makes sense, only lets users in via the VPC endpoint. Let me know if you see it differently.

Be respectful. No spam.

Correct Answer:

Explanation

The most effective and specific method to ensure only users within the VPC can access the SageMaker notebook is to control the generation of the access URL itself. By applying an IAM policy that restricts the sagemaker:CreatePresignedNotebookInstanceUrl action with the aws:sourceVpce condition key, you ensure that the presigned URL required to access the notebook's interface can only be created when the request originates from the VPC interface endpoint. This directly links the authorization to connect with the user's network location, effectively preventing anyone outside the VPC from initiating an access session, which is the core of the requirement.

Why Incorrect Options are Wrong:

A. Set up VPC Traffic Mirroring to capture traffic to and from the notebook instances and identify unauthorized access attempts, enabling enhanced monitoring.

VPC Traffic Mirroring is a passive monitoring tool for inspecting network traffic. It does not block or prevent access, making it a detective control, not a preventative one.

B. Apply VPC Endpoint Policies to control which IAM users or services can access SageMaker AI through the VPC interface endpoint, providing more granular access control for interactions with SageMaker AI.

VPC Endpoint Policies govern which principals can use the endpoint to make API calls. This does not control the network path to the notebook's web UI or prevent a user from using a presigned URL from outside the VPC.

D. Update the security group for the notebook instances to restrict incoming traffic to only the CIDR blocks associated with the VPC. Apply this security group across all interfaces linked to the SageMaker notebook instances.

While a necessary network-level control, this is less precise than option C. It doesn't prevent a user inside the VPC from sharing a valid presigned URL with an external party who could potentially use it if any other network path exists. Option C prevents the URL's creation from outside the VPC entirely.

---

References:

1. Amazon SageMaker Developer Guide - Connect to a Notebook Instance Through a VPC Interface Endpoint: This official guide explicitly recommends the solution in option C. It states, "To ensure that users can access the notebook instance only when they are in your private VPC, create an IAM policy that allows the sagemaker:CreatePresignedNotebookInstanceUrl operation only from a specific VPC endpoint..." This directly supports using an IAM policy with a condition key as the primary mechanism.

2. AWS Identity and Access Management User Guide - AWS global condition context keys: This document details the aws:sourceVpce condition key, explaining that it is used to "check if the request is coming from a specific VPC endpoint." This is the technical foundation for the policy described in option C.

3. Amazon SageMaker API Reference - CreatePresignedNotebookInstanceUrl: The documentation for this API action confirms that it is the function used to "get a URL that you can use to connect to your notebook instance." Therefore, controlling this specific action is the most direct way to manage access to the notebook's UI.

Question 1 of 20 · Page 1 / 2

Premium Access Includes

✓ Quiz Simulator
✓ Exam Mode
✓ Progress Tracking
✓ Question Saving
✓ Flash Cards
✓ Drag & Drops
✓ 3 Months Access
✓ PDF Downloads

Get Premium Access

📖 About this Domain

🎓 What You Will Learn

🛠️ Skills You Will Build

💡 Top Tips to Prepare

📖 About this Domain

🎓 What You Will Learn

🛠️ Skills You Will Build

💡 Top Tips to Prepare

📖 About this Domain

🎓 What You Will Learn

🛠️ Skills You Will Build

💡 Top Tips to Prepare

Premium Access Includes

FLASH OFFER

avail 10% DISCOUNT on YOUR PURCHASE