Option B makes sense here. When features have very different scales, normalization is key: it evens out their influence on the cost function and keeps gradient updates well-conditioned during backprop. Regularization (C) helps with overfitting but doesn't fix scaling problems directly. I'm confident about B, but open to hearing another take.
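To make this concrete, here's a minimal sketch of z-score normalization (standardization) with NumPy. The feature matrix is made up for illustration; the point is just that after normalization each column has mean 0 and standard deviation 1, so no single feature dominates the cost function:

```python
import numpy as np

# Hypothetical feature matrix: column 0 is in the thousands, column 1 is near 1
X = np.array([[1200.0, 0.5],
              [3400.0, 0.9],
              [2100.0, 0.2]])

# Z-score normalization: subtract each column's mean, divide by its std
mu = X.mean(axis=0)
sigma = X.std(axis=0)
X_norm = (X - mu) / sigma

print(X_norm.mean(axis=0))  # each column now has mean ~0
print(X_norm.std(axis=0))   # each column now has std 1
```

In practice you'd compute `mu` and `sigma` on the training set only and reuse them on test data, otherwise you leak test statistics into training.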
I don't think it's A. Dimensionality reduction drops features but doesn't solve the scaling issue. I usually see B (data normalization) in practice for this, since it keeps big features from overpowering the others during backprop. C is tempting but more about overfitting. Anyone see it different?
It's B here, but here's a gotcha: if you were using something like tree-based models (say, XGBoost), feature scale actually wouldn't impact convergence much, since splits only depend on feature order. Since this is about neural nets and backprop, normalization's necessary. Wouldn't pick B for every ML algorithm, though; convergence depends on the optimizer too.
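To show the convergence point above, here's a toy sketch (my own example, not part of the original question): gradient descent on a quadratic loss where one direction is 100x steeper, which is what mismatched feature scales effectively produce. The learning rate is capped by the steepest direction, so the flat direction crawls; once the curvatures match (as after normalization), the same kind of problem converges in far fewer steps:

```python
import numpy as np

def gd_steps(curvatures, lr, tol=1e-3, max_steps=10000):
    """Minimize f(w) = 0.5 * sum(c_i * w_i^2) by gradient descent.

    Returns the number of steps until ||w|| < tol, starting from w = (1, ..., 1).
    """
    w = np.ones(len(curvatures))
    for step in range(1, max_steps + 1):
        w = w - lr * curvatures * w  # gradient of f is c_i * w_i
        if np.linalg.norm(w) < tol:
            return step
    return max_steps

# Badly scaled: one direction 100x steeper; lr must stay below 2/100 for stability
bad = gd_steps(np.array([100.0, 1.0]), lr=0.01)

# Well scaled: equal curvatures, so a large lr is safe in every direction
good = gd_steps(np.array([1.0, 1.0]), lr=0.5)

print(bad, good)  # the badly scaled run needs far more iterations
```

This is the intuition behind "gradients flow better": normalization roughly equalizes the curvature across directions, so one learning rate works for all of them. Adaptive optimizers like Adam reduce this sensitivity somewhat, which is the "depends on optimizer" caveat.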