Q: 18
[Machine Learning Implementation and Operations]
A bank's Machine Learning team is developing an approach for credit card fraud detection. The
company has a large dataset of historical data labeled as fraudulent. The goal is to build a model to
take the information from new transactions and predict whether each transaction is fraudulent or
not.
Which built-in Amazon SageMaker machine learning algorithm should be used for modeling this
problem?
Options
Discussion
B. Had something like this in a mock exam; labeled fraud data fits XGBoost.
D. Random Cut Forest is for anomaly detection, and fraud is basically an anomaly, right? Whenever I see "fraud detection" it's easy to fall for the unsupervised option. Unless the question strictly requires using those labels, I could see D being picked. Someone double-check me on that.
C or D would work if the dataset didn't have fraud labels, right? For unsupervised scenarios, K-means or Random Cut Forest are common picks, from what I remember of the official guide and practice tests.
D imo. Random Cut Forest is used for anomaly detection, which sounds like it could work for fraud. But does the question say if they're expected to use supervised labels or just find outliers? If it wasn't labeled data, I'd definitely pick D instead of B here.
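To make the labeled-versus-unlabeled reasoning in this thread concrete, here is a rough rule-of-thumb sketch. The function `pick_builtin_algorithm` and its `goal` strings are hypothetical names made up for illustration; they are not part of the SageMaker SDK, and the mapping only covers the three algorithms mentioned in the comments above.

```python
def pick_builtin_algorithm(has_labels: bool, goal: str) -> str:
    """Rule of thumb only (hypothetical helper, not an AWS API).

    has_labels: whether the historical data carries ground-truth labels.
    goal: "classify", "find_outliers", or "group_similar" (made-up values).
    """
    if has_labels and goal == "classify":
        # Labeled data plus a yes/no prediction is supervised classification.
        return "XGBoost"
    if not has_labels and goal == "find_outliers":
        # No labels, looking for unusual points: unsupervised anomaly detection.
        return "Random Cut Forest"
    if not has_labels and goal == "group_similar":
        # No labels, grouping similar records: unsupervised clustering.
        return "K-Means"
    raise ValueError(f"no rule for has_labels={has_labels}, goal={goal!r}")


# The question describes labeled history and a fraud / not-fraud prediction.
print(pick_builtin_algorithm(has_labels=True, goal="classify"))
```

Under these assumptions, the scenario in the question (labeled history, predict fraud or not) lands on the supervised branch, while the Random Cut Forest and K-Means branches only apply when labels are missing.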