Q: 5
In the context of quantizing large language models (LLMs), which of the following statements best
describes the key trade-offs between model size, performance, and accuracy when using quantization
techniques?
Options
Discussion
D . Quantization shrinks the model but you might lose some accuracy if it’s too aggressive. Others don’t really capture the typical trade-off I’ve seen. Pretty sure about this but open to other takes.
D . Shrinking the model is great but sometimes you trade off a bit of accuracy.
D may only be strictly correct if the question means lower-bit quantization rather than mixed-precision, but I'd still answer D.
Don't think it's C, since quantization doesn't guarantee zero accuracy loss, and sometimes post-quant fine-tuning is still needed. D captures the real trade-off: you shrink the model but risk accuracy hits if you go too far. Saw a similar question in practice sets; C feels like a trap.
I don’t think it’s D. C.
Probably D. Quantization cuts model size a lot, but from what I've seen in official guides and some exam reports, it can hurt accuracy if it's applied too aggressively. I've never seen quantization double compute or guarantee zero loss. Anyone seen something different in practice exams?
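For anyone who wants to see the trade-off concretely, here's a minimal sketch of symmetric int8 quantization of a weight vector (a toy illustration, not any particular library's API): storage drops from 32 bits to 8 bits per weight, but dequantized values no longer match the originals exactly, and that rounding gap is where the accuracy loss comes from.

```python
# Toy symmetric int8 quantization: illustrates the size/accuracy
# trade-off discussed above. Values and function names are made up
# for illustration only.

def quantize_int8(weights):
    """Map float weights to int codes in [-127, 127] with one scale."""
    scale = max(abs(w) for w in weights) / 127.0
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate floats from int codes."""
    return [c * scale for c in codes]

weights = [0.312, -1.074, 0.005, 0.998, -0.441]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# 32-bit floats -> 8-bit ints: roughly a 4x storage reduction.
# Reconstruction is close but not exact; the gap is the accuracy cost.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(codes)
print(max_err)  # small but nonzero quantization error
```

Going from 8 bits down to 4 or 2 bits shrinks storage further but widens that per-weight error, which is exactly the "too aggressive" case D warns about.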