Q: 20
[Data Engineering]
A company's machine learning (ML) specialist is building a computer vision model to classify 10
different traffic signs. The company has stored 100 images of each class in Amazon S3, and the
company has another 10,000 unlabeled images. All the images come from dash cameras and are
224 x 224 pixels in size. After several training runs, the model is overfitting on the training
data.
Which actions should the ML specialist take to address this problem? (Select TWO.)
Options
Discussion
C/E? Had something like this in a mock. Data augmentation (C) adds variety so the model generalizes better, and semi-supervised with k-NN (E) helps expand the labeled set without manual effort. Not 100 percent certain but this matches most practice explanations, agree?
Probably C and E. Data augmentation (C) increases diversity of training data, which helps prevent overfitting, and using k-NN for labeling (E) lets you leverage the big set of unlabeled images instead of just sticking to a small labeled set. Pretty sure this is right but open to other ideas.
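To make the augmentation point concrete, here is a minimal sketch, assuming NumPy is available and images are float arrays in [0, 1] with shape (224, 224, 3). The function name `augment` and its parameters are made up for illustration and are not from the question. It uses a random shift plus brightness jitter rather than horizontal flips, since mirroring a traffic sign could change its meaning (for example, left-turn versus right-turn arrows).

```python
import numpy as np


def augment(image, rng, max_shift=16, max_brightness=0.2):
    """Return a randomly shifted, brightness-jittered copy of an HxWxC image.

    Hypothetical helper for illustration; assumes float pixels in [0, 1].
    """
    h, w = image.shape[:2]
    # Random vertical and horizontal offset in [-max_shift, max_shift].
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    # Pad by repeating edge pixels so the shifted window stays in bounds.
    padded = np.pad(
        image,
        ((max_shift, max_shift), (max_shift, max_shift), (0, 0)),
        mode="edge",
    )
    top, left = max_shift - dy, max_shift - dx
    shifted = padded[top:top + h, left:left + w]
    # Scale brightness by a random factor around 1.0, then clamp to [0, 1].
    scale = 1.0 + rng.uniform(-max_brightness, max_brightness)
    return np.clip(shifted * scale, 0.0, 1.0)


rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))  # stand-in for one dash-camera frame
variants = [augment(image, rng) for _ in range(5)]
```

Each call yields a slightly different view of the same labeled image, so 100 images per class can produce many distinct training samples per epoch.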
It's C and E. Saw a similar one in exam reports: data augmentation and semi-supervised labeling both help with overfitting.
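For the k-NN labeling idea, here is a rough sketch of how pseudo-labeling could work, assuming each image has already been turned into a feature vector by some embedding model (that step is not shown). The function `knn_pseudo_labels`, the `min_agreement` threshold, and the toy 2-D features are all hypothetical. Only unlabeled points whose nearest labeled neighbors agree strongly enough get a label; the rest stay unlabeled.

```python
import math
from collections import Counter


def knn_pseudo_labels(labeled_feats, labels, unlabeled_feats, k=3, min_agreement=1.0):
    """Assign labels to unlabeled vectors by majority vote of the k nearest labeled ones.

    Hypothetical helper: returns {unlabeled_index: label} only for points where
    the winning label's share of the k votes is at least min_agreement.
    """
    pseudo = {}
    for i, query in enumerate(unlabeled_feats):
        # Indices of the k labeled vectors closest to the query (Euclidean distance).
        nearest = sorted(
            range(len(labeled_feats)),
            key=lambda j: math.dist(query, labeled_feats[j]),
        )[:k]
        label, votes = Counter(labels[j] for j in nearest).most_common(1)[0]
        if votes / k >= min_agreement:
            pseudo[i] = label
    return pseudo


# Toy 2-D features standing in for image embeddings.
labeled_feats = [[0, 0], [0, 1], [1, 0], [4, 4], [4, 5], [5, 4]]
labels = ["stop", "stop", "stop", "yield", "yield", "yield"]
unlabeled_feats = [[0.2, 0.3], [4.5, 4.2], [2.0, 2.2]]

print(knn_pseudo_labels(labeled_feats, labels, unlabeled_feats))
# prints {0: 'stop', 1: 'yield'}; the third point sits between clusters and is skipped
```

Requiring unanimous neighbors is a conservative choice: fewer of the 10,000 unlabeled images get used, but wrong pseudo-labels are less likely to leak into training.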