I saw a similar question in practice and picked D. My thinking was that fine-tuning sometimes requires looking at the architecture, like the number of layers, before training. Not totally sure, but that was my logic. Can someone confirm?
Q: 9
When fine-tuning an LLM for a specific application, why is it essential to perform exploratory data
analysis (EDA) on the new training dataset?
Options
Discussion
A. The main thing with EDA is that you dig into the training data to spot any weird patterns or mistakes before you start the actual fine-tuning. You're not setting learning rates or messing with layers at this stage. I think it's pretty clearly focused on catching data quality issues, but I'm open to pushback if someone sees it differently.
Official guide recommends EDA for spotting patterns and data issues, not selecting hyperparams or layers. Practice tests back this up too.
Saw something like this before and picked C.
A tbh. EDA is all about digging into the dataset for weird patterns or issues before any modeling steps happen. Stuff like learning rate (B) or picking number of layers (D) comes after, once you understand your data. Option C is also off, since hardware planning isn't really EDA's job. Pretty sure A is best here, but open to other takes if I missed something.
Not C, it's really A. EDA is for discovering patterns or data issues, not figuring out hardware needs. I know the compute question can be tempting but that's a different step.
Makes sense to choose A. EDA is all about understanding your dataset: finding patterns, outliers, or errors before you touch any hyperparameters. Pretty sure that's what they're after here, but if anyone sees a curveball let me know.
A imo. EDA helps you find patterns or data problems, not tweak model parameters or resources.
For me, A here. EDA is all about finding patterns and issues in your data, not tweaking learning rates (B) or model layers (D); those come after analyzing the dataset. C trips people up, but that's more resource planning than EDA. If you disagree let me know!
C doesn’t fit here. A is what EDA actually does: it helps you find data issues or trends before fine-tuning. Confident it’s A.
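For anyone wondering what "finding patterns and data issues" looks like in practice, here is a minimal sketch, assuming the fine-tuning set is a list of prompt/response pairs. The field names, the `eda_report` helper, the toy rows, and the 2-standard-deviation cutoff are all my own assumptions for illustration, not something from the official guide.

```python
from collections import Counter
from statistics import mean, pstdev


def eda_report(records, z_cutoff=2.0):
    """Summarize basic data-quality signals for prompt/response pairs.

    `records` is a list of dicts with "prompt" and "response" keys
    (a hypothetical schema chosen for this sketch).
    """
    pairs = [(r.get("prompt") or "", r.get("response") or "") for r in records]

    # Rows where either side is missing or whitespace-only.
    empty = [i for i, (p, r) in enumerate(pairs) if not p.strip() or not r.strip()]

    # Exact duplicate pairs beyond the first occurrence.
    duplicates = sum(c - 1 for c in Counter(pairs).values() if c > 1)

    # Response length in whitespace-separated words, plus simple z-score outliers.
    lengths = [len(r.split()) for _, r in pairs]
    mu, sigma = mean(lengths), pstdev(lengths)
    outliers = [
        i for i, n in enumerate(lengths)
        if sigma > 0 and abs(n - mu) > z_cutoff * sigma
    ]

    return {
        "rows": len(records),
        "empty": empty,
        "duplicates": duplicates,
        "mean_response_words": mu,
        "length_outliers": outliers,
    }


# Toy data, made up for this example.
sample = [
    {"prompt": "Summarize the memo.", "response": "The memo announces a new policy."},
    {"prompt": "Summarize the memo.", "response": "The memo announces a new policy."},
    {"prompt": "Translate to French: hello", "response": ""},
    {"prompt": "Define EDA.", "response": "word " * 200},
    {"prompt": "Capital of France?", "response": "Paris is the capital of France."},
    {"prompt": "What is EDA?", "response": "Exploratory data analysis inspects data first."},
    {"prompt": "What is fine-tuning?", "response": "It means adapting a pretrained model."},
    {"prompt": "What is a token?", "response": "A token is a text unit."},
]

report = eda_report(sample)
print(report)
```

Nothing in there touches learning rate, layer count, or hardware, which is pretty much the argument for A: these checks only tell you what is in the data and what needs cleaning before training starts.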