Q: 3
A company is building an ML model. The company collected new data and analyzed the data by
creating a correlation matrix, calculating statistics, and visualizing the data.
Which stage of the ML pipeline is the company currently in?
Options
Discussion
C. Correlation matrix and stats are textbook exploratory data analysis steps. You're not creating features here, just understanding the data shape. Pretty sure that's what AWS expects.
C not A. They're just analyzing and visualizing data here so that's classic EDA. Don't see any mention of cleaning or transforming.
A. stats and correlation checks are part of pre-processing sometimes right?
I don't think it's C, I'd pick A. Calculating stats and looking at correlations usually happens during data pre-processing, before EDA or feature steps. There's a trap here because EDA sounds close, but A fits better in my opinion.
Maybe A. I think data pre-processing includes calculating stats and checking correlations, right?
C imo. Making correlation matrices and stats is classic for exploratory data analysis, not actually modifying or creating new features. Unless they mention making new columns from old data, it's not B. Anyone disagree?
This looks super close to one I had on a mock, the correlation matrix part basically nails it as C.
Skip B, C here. Calculating stats and making correlation matrices is classic EDA, not engineering features. Trap answer is B.
C or B. If "analyzed" also means creating new features, would that count as feature engineering instead? Just want to be sure what they mean by "analyzed."
This sounds almost identical to a practice question I did: correlation matrix and data visualization always pointed to C (exploratory data analysis). No mention of handling missing values or creating features, so C fits.
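For anyone still torn between A and C, here is a rough sketch of what the activities in the question look like in code. The dataset and column names are made up for illustration, and it uses only the standard library (in practice you would likely reach for a dataframe and plotting library). The point is that nothing here changes or adds to the data: it only summarizes it, which is why it reads as EDA rather than pre-processing or feature engineering.

```python
import statistics

# Hypothetical toy dataset: column names and values are invented for illustration.
data = {
    "sqft": [800, 1000, 1200, 1500, 1800],
    "rooms": [2, 3, 3, 4, 5],
    "price": [160, 200, 240, 300, 360],
}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# "Calculating statistics": per-column summary, read-only.
summary = {
    col: {
        "mean": statistics.mean(vals),
        "stdev": statistics.stdev(vals),
        "min": min(vals),
        "max": max(vals),
    }
    for col, vals in data.items()
}

# "Creating a correlation matrix": pairwise correlations, also read-only.
corr = {a: {b: pearson(data[a], data[b]) for b in data} for a in data}

for col, stats in summary.items():
    print(col, stats)
for a in data:
    print(a, [round(corr[a][b], 2) for b in data])
```

If the snippet had filled in missing values or rescaled columns, that would lean toward A; if it had built a new column such as price per square foot, that would lean toward B.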