A . Black box attacks based on adversarial examples create an exact duplicate model of the original.
Black box attacks do not create an exact duplicate model. Instead, they exploit the model by querying
it and using the returned outputs to craft adversarial examples, without any knowledge of its internal
workings such as the architecture, weights, or gradients.
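As a rough illustration, the sketch below shows a simple score-based black-box attack in Python. The query_model function is a hypothetical API that returns class probabilities for an input; the attacker relies only on those query outputs, never on the model's weights or gradients.

import numpy as np

def black_box_attack(x, true_label, query_model, eps=0.03, steps=500, seed=0):
    """Greedy random search: keep perturbations that lower the true-class score."""
    rng = np.random.default_rng(seed)
    x_adv = x.copy()
    best_score = query_model(x_adv)[true_label]
    for _ in range(steps):
        # Propose a small random perturbation, kept inside an eps-ball around x
        # and inside the valid pixel range [0, 1].
        candidate = np.clip(x_adv + rng.uniform(-eps, eps, size=x.shape), x - eps, x + eps)
        candidate = np.clip(candidate, 0.0, 1.0)
        score = query_model(candidate)[true_label]
        if score < best_score:              # accept only based on query feedback
            x_adv, best_score = candidate, score
        if np.argmax(query_model(x_adv)) != true_label:
            break                           # stop once the predicted label flips
    return x_adv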
B . These attack examples cause a model to predict the correct class with slightly less accuracy even
though they look like the original image.
Adversarial examples typically cause the model to predict an incorrect class outright, rather than
merely reducing its accuracy slightly. These examples are designed to be visually indistinguishable
from the original image to a human observer, yet they lead the model to an incorrect classification.
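One common way such examples are crafted when gradients are available is the fast gradient sign method. The minimal PyTorch sketch below assumes a trained classifier model, a batch of images x scaled to [0, 1], and their integer labels y, all of which are placeholders here; the per-pixel budget eps is kept small so the change is hard to see.

import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, eps=8 / 255):
    """Return an adversarially perturbed copy of x that stays within eps per pixel."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step in the direction that increases the loss, then clip back to a valid image.
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()

# In practice, model(fgsm_example(model, x, y)).argmax(dim=1) often disagrees with y
# even though the perturbed images look unchanged to a human.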
C . These attacks can't be prevented by retraining the model with these examples augmented to the
training data.
This statement is incorrect because retraining the model with adversarial examples included in the
training data can help the model learn to resist such attacks, a technique known as adversarial
training.
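A minimal sketch of that idea, reusing the hypothetical fgsm_example helper above and assuming a PyTorch model, a data loader of (images, labels) batches, and an optimizer: each batch is augmented with adversarial versions so the model sees them during training.

import torch.nn.functional as F

def adversarial_training_epoch(model, loader, optimizer, eps=8 / 255):
    model.train()
    for x, y in loader:
        # Craft adversarial versions of the batch against the current model,
        # then train on both the clean and the adversarial inputs.
        x_adv = fgsm_example(model, x, y, eps)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()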
D . These examples are model specific and are not likely to cause another model trained on the same
task to fail.
Adversarial examples are often model-specific, meaning they exploit the particular weaknesses of the
model they were generated against. While some adversarial examples do transfer between models, many
are tailored to that one model and may not affect other models trained on the same task.
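A simple way to gauge this is to craft an example against one model and evaluate it on another. The sketch below reuses the fgsm_example helper above and assumes two independently trained PyTorch models, model_a and model_b, for the same task; both names are placeholders. A smaller accuracy drop on model_b than on model_a is consistent with the example being tailored to model_a.

import torch

@torch.no_grad()
def accuracy(model, x, y):
    return (model(x).argmax(dim=1) == y).float().mean().item()

def transfer_check(model_a, model_b, x, y, eps=8 / 255):
    # Craft adversarial inputs using model_a only, then score both models on them.
    x_adv = fgsm_example(model_a, x, y, eps)
    return accuracy(model_a, x_adv, y), accuracy(model_b, x_adv, y)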
Therefore, the correct answer is D because adversarial examples are typically model-specific and may
not cause another model trained on the same task to fail.