Question 17

Question

[Modeling]
An online reseller has a large, multi-column dataset with one column missing 30% of its data. A
Machine Learning Specialist believes that certain columns in the dataset could be used to reconstruct
the missing data.
Which reconstruction approach should the Specialist use to preserve the integrity of the dataset?

Accepted Answer

Multiple imputation

AveryZ · Answer

I don’t think it’s B. Multiple imputation (C) is more robust here since it uses other columns to estimate missing values, which helps maintain statistical integrity. Last observation carried forward works best for time series but not general datasets like this.
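
To make the "uses other columns" point concrete, here is a rough, standard-library-only sketch of the idea, not a production implementation. The data, the column roles (`xs` complete, `ys` with 30% missing), and the helper names `fit_line`, `multiply_impute`, and `pool_means` are all made up for illustration. It is also simplified: a full multiple-imputation procedure would additionally redraw the regression coefficients for each imputation, whereas this sketch only adds random residual noise.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in residuals) / (len(xs) - 2)) ** 0.5
    return a, b, sd


def multiply_impute(xs, ys, m=5, seed=0):
    """Return m completed copies of ys; None entries are predicted from x plus noise."""
    rng = random.Random(seed)
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, sd = fit_line([x for x, _ in observed], [y for _, y in observed])
    completed = []
    for _ in range(m):
        filled = [
            y if y is not None else a + b * x + rng.gauss(0, sd)
            for x, y in zip(xs, ys)
        ]
        completed.append(filled)
    return completed


def pool_means(completed):
    """Average the per-dataset means and report the between-imputation variance."""
    means = [statistics.fmean(c) for c in completed]
    pooled = statistics.fmean(means)
    between = statistics.variance(means) if len(means) > 1 else 0.0
    return pooled, between


# Made-up data: xs is fully observed, ys is missing in 3 of 10 rows (30%).
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [3.1, 4.9, None, 9.2, 10.8, None, 15.1, 17.0, None, 20.9]

completed = multiply_impute(xs, ys, m=5)
pooled, between = pool_means(completed)
print("pooled mean of y:", round(pooled, 3))
print("between-imputation variance:", between)
```

The point of generating several completed datasets instead of one is that the spread between them (`between`) carries the uncertainty about the missing values into whatever you estimate afterwards.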

Maya · Answer

Seen similar on the official practice test, pretty sure it's C.

Nora · Answer

C or D? Mean substitution is quick and keeps the dataset size but with 30% missing, results can get skewed.
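
A quick way to see the skew from mean substitution, using made-up numbers (7 observed values, 3 missing, so 30%): the mean stays the same, but the spread shrinks because every filled row sits exactly on the mean. This is just an illustrative sketch.

```python
import statistics

# Made-up observed values; 3 of 10 rows are assumed missing (30%).
observed = [3.1, 4.9, 9.2, 10.8, 15.1, 17.0, 20.9]
n_missing = 3

# Mean substitution: every missing row gets the observed mean.
fill = statistics.fmean(observed)
mean_filled = observed + [fill] * n_missing

sd_observed = statistics.stdev(observed)
sd_filled = statistics.stdev(mean_filled)

print("mean before/after:", fill, statistics.fmean(mean_filled))
print("std dev before/after:", sd_observed, sd_filled)
```

The artificially narrow spread also weakens correlations with the other columns, which is the information the question says should be used.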

Anita M. · Answer

It's B. Had something like this in a mock and used last observation carried forward for filling missing values, since it reuses real data and keeps the dataset size stable. Pretty sure that's the best for integrity. Anyone got a different take?
