Q: 2
During the creation of a new large language model (LLM), an organization procured training data
from multiple sources. Which of the following is MOST likely to address the CISO's security and
privacy concerns?
Options
Discussion
B. Data minimization.
I’d say it's B. Minimizing the data collected is the main protection for both privacy and security concerns in this scenario.
Maybe C, since you need to label the data before you can focus on privacy risks. Not 100% sure.
Honestly, these questions always overcomplicate things. Minimization is what CISOs actually care about for privacy, not just labeling stuff. B.
B, since minimizing the data set actually removes sensitive info that could leak out of the LLM later. Classification helps control it but doesn't prevent risky data from getting in. Seen similar wording on practice questions, and B fits best unless they're asking for just an inventory, which they're not. Open to other views but this is pretty clear to me.
Feels like it's B. Minimizing the data is the only option that actually deals with privacy risks up front; labeling or discovering it doesn't.
I get why B is tempting but C makes more sense if you want to actually address privacy up front. C.
C
It's B, data minimization. That actually reduces the amount of sensitive data in training sets, directly cutting down privacy risk. C is a common distractor but just labeling doesn't fix exposure. Pretty sure B is what CISOs want here.
C tbh, since classification has to come first when pulling from multiple sources. Can't minimize what you haven't labeled yet. Pretty sure that's how most orgs would approach security here, but open to other thoughts.