The best option for using Vertex AI Model Monitoring for drift detection while minimizing cost is to monitor both the features and the feature attributions, and to set a prediction-sampling-rate value that is closer to 0 than to 1. This lets you detect feature drift in the input prediction requests of custom models while reducing the storage and computation costs of the model monitoring job.

Vertex AI Model Monitoring is a service that monitors a deployed model's prediction input data for feature skew and drift. Feature drift occurs when the distribution of feature values in production changes over time. If the original training data is not available, you can still enable drift detection to monitor your models for feature drift. Vertex AI Model Monitoring uses TensorFlow Data Validation (TFDV) to calculate a distribution and a distance score for each feature, and compares them with a baseline distribution. For skew detection, the baseline is the statistical distribution of the feature's values in the training data. For drift detection, where training data is unavailable, the baseline is the statistical distribution of the feature's values seen in production in the recent past, so each new window of production data is compared against an earlier window. If the distance score for a feature exceeds an alerting threshold that you set, Vertex AI Model Monitoring sends you an email alert.
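For intuition: per the Vertex AI documentation, the distance score for numerical features is the Jensen-Shannon divergence between the baseline and current distributions (L-infinity distance is used for categorical features). Below is a minimal sketch of that comparison logic, assuming two already-binned distributions and a hypothetical per-feature threshold of 0.3:

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence (base 2) between two binned distributions."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log2(a / b)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Baseline: binned feature distribution from the reference window.
baseline = [0.25, 0.50, 0.25]
# Current: binned distribution from the latest production traffic.
current = [0.10, 0.40, 0.50]

ALERT_THRESHOLD = 0.3  # hypothetical alerting threshold for this feature
score = js_divergence(baseline, current)
if score >= ALERT_THRESHOLD:
    print(f"Drift alert: JS divergence {score:.3f} exceeds {ALERT_THRESHOLD}")
else:
    print(f"No alert: JS divergence {score:.3f}")
```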
For custom models, you can also enable feature attribution monitoring, which provides additional insight into feature drift. Feature attribution monitoring analyzes the feature attributions, that is, the contribution of each feature to the prediction output. It can help you identify the features that have the most impact on model performance and the features that drift most significantly over time, and it can help you understand the relationship between the features and the prediction output, as well as the correlations among the features [1].
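Note that feature attribution monitoring requires the model to be deployed with Vertex Explainable AI configured. As a quick sanity check that attributions are available, you can request an explanation from the endpoint; the project, endpoint ID, and feature names below are hypothetical:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # hypothetical

# Hypothetical endpoint ID; the deployed model must have an
# explanation spec configured for explain() to succeed.
endpoint = aiplatform.Endpoint("1234567890")

response = endpoint.explain(instances=[{"age": 42, "income": 55000.0}])
for explanation in response.explanations:
    for attribution in explanation.attributions:
        # Per-feature contribution of each input to this prediction.
        print(attribution.feature_attributions)
```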
The prediction-sampling-rate parameter determines the percentage of prediction requests that are logged and analyzed by the model monitoring job. A lower prediction-sampling-rate reduces the storage and computation costs of the monitoring job, but it also reduces the quality and validity of the monitored data: sampling fewer requests can introduce sampling bias and noise, and the monitoring job may miss important features or patterns in the data. Conversely, a higher prediction-sampling-rate increases the amount of data that must be processed and analyzed, and with it the storage and computation costs. There is therefore a trade-off between the prediction-sampling-rate and the cost and accuracy of the monitoring job, and the optimal value depends on the business objective and the data characteristics [2]. By monitoring the features and the feature attributions and setting a prediction-sampling-rate closer to 0 than to 1, you can use Vertex AI Model Monitoring for drift detection while minimizing cost.
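As a concrete illustration, such a job could be created with the Vertex AI Python SDK roughly as follows. This is a sketch, not a definitive recipe: the project, endpoint ID, feature names, and thresholds are hypothetical, and the exact SDK surface may vary by version.

```python
from google.cloud import aiplatform
from google.cloud.aiplatform import model_monitoring

aiplatform.init(project="my-project", location="us-central1")  # hypothetical

# Log and analyze ~10% of prediction requests (closer to 0 than to 1)
# to keep storage and computation costs down.
sampling = model_monitoring.RandomSampleConfig(sample_rate=0.1)

# No training data available: configure drift detection only,
# on both the features and the feature attributions.
drift = model_monitoring.DriftDetectionConfig(
    drift_thresholds={"age": 0.3, "income": 0.3},            # feature drift
    attribute_drift_thresholds={"age": 0.3, "income": 0.3},  # attribution drift
)
objective = model_monitoring.ObjectiveConfig(
    drift_detection_config=drift,
    explanation_config=model_monitoring.ExplanationConfig(),
)

job = aiplatform.ModelDeploymentMonitoringJob.create(
    display_name="drift-monitoring-job",
    endpoint="1234567890",  # hypothetical endpoint ID
    logging_sampling_strategy=sampling,
    schedule_config=model_monitoring.ScheduleConfig(monitor_interval=1),  # hours
    alert_config=model_monitoring.EmailAlertConfig(user_emails=["ml-team@example.com"]),
    objective_configs=objective,
)
```

The equivalent gcloud flags are --prediction-sampling-rate, --monitoring-frequency, --feature-thresholds, and --feature-attribution-thresholds on gcloud ai model-monitoring-jobs create.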
The other options are not as good as option D, for the following reasons:
Option A: Using only the features for monitoring and setting a monitoring-frequency value higher than the default would not enable feature attribution monitoring, and it could increase the cost of the model monitoring job. The monitoring-frequency parameter determines how often the monitoring job analyzes the logged prediction requests and calculates the distributions and distance scores for each feature. A higher monitoring-frequency makes the monitoring job more frequent and timely, but it also increases its computation costs. Moreover, monitoring only the features forgoes feature attribution monitoring, which provides additional insight into feature drift and model performance [1].
Option B: Using only the features for monitoring and setting a prediction-sampling-rate value closer to 1 than to 0 would not enable feature attribution monitoring, and it could increase the cost of the model monitoring job. A higher prediction-sampling-rate improves the quality and validity of the monitored data, but it also increases the storage and computation costs of the monitoring job. Moreover, monitoring only the features forgoes feature attribution monitoring, which provides additional insight into feature drift and model performance [1][2].
Option C: Using the features and the feature attributions for monitoring and setting a monitoring-frequency value lower than the default would enable feature attribution monitoring, but it would reduce the frequency and timeliness of the model monitoring job. A lower monitoring-frequency reduces the computation costs of the monitoring job, but it also makes the job less responsive and less effective at detecting and alerting on feature drift [1].
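In the Python SDK this trade-off corresponds to the monitor_interval of the schedule configuration; a sketch, assuming the interval is expressed in hours:

```python
from google.cloud.aiplatform import model_monitoring

# A longer interval between analyses lowers computation cost,
# but drift alerts can lag by up to a full interval.
low_frequency = model_monitoring.ScheduleConfig(monitor_interval=24)  # daily
high_frequency = model_monitoring.ScheduleConfig(monitor_interval=1)  # hourly
```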
References:
Preparing for Google Cloud Certification: Machine Learning Engineer, Course 3: Production ML Systems, Week 4: Evaluation
Google Cloud Professional Machine Learning Engineer Exam Guide, Section 3: Scaling ML models in production, 3.3: Monitoring ML models in production
Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 6: Production ML Systems, Section 6.3: Monitoring ML Models
Using Model Monitoring
Understanding the score threshold slider