Q: 11
You are monitoring the resource utilization of a DGX SuperPOD cluster using NVIDIA Base Command
Manager (BCM). The system is experiencing slow performance, and you need to identify the cause.
What is the most effective way to monitor GPU usage across nodes?
Options
Discussion
B makes sense. The dashboard gives you real-time cluster stats, so you’re not checking every node one by one.
Don't think D is right here, since nvidia-smi only reports on the node it runs on and isn't practical for tracking many nodes in a cluster. B's dashboard shows all nodes together, which is what the question asks for.
Had something like this in a mock, pretty sure the Base View dashboard (B) is what you want for cluster-wide GPU stats.
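To illustrate the point about nvidia-smi being per node, here is a minimal Python sketch of the manual aggregation you would end up doing without a cluster-wide view. It assumes you have already collected each node's nvidia-smi output somehow (ssh, pdsh, a job script). The node names, sample numbers, and helper names are made up for illustration; only the nvidia-smi query flags are real. This is not how BCM works internally, just the per-node legwork the dashboard saves you.

```python
import subprocess

# Real nvidia-smi query flags; the output is one CSV line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def read_local_gpus():
    """Run nvidia-smi on the current node only and return its raw CSV text."""
    return subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout


def parse_gpu_csv(text):
    """Parse 'index, util, mem_used, mem_total' lines into a list of dicts."""
    gpus = []
    for line in text.strip().splitlines():
        index, util, used, total = [field.strip() for field in line.split(",")]
        gpus.append({
            "index": int(index),
            "util_pct": float(util),
            "mem_used_mib": float(used),
            "mem_total_mib": float(total),
        })
    return gpus


def mean_util_per_node(raw_by_node):
    """Hypothetical helper: map node name -> mean GPU utilization in percent."""
    result = {}
    for node, text in raw_by_node.items():
        gpus = parse_gpu_csv(text)
        if gpus:  # skip nodes that returned no GPU lines
            result[node] = sum(g["util_pct"] for g in gpus) / len(gpus)
    return result


# Made-up sample output for two hypothetical nodes (not real measurements).
sample = {
    "node01": "0, 95, 70000, 81920\n1, 85, 65000, 81920\n",
    "node02": "0, 10, 1000, 81920\n1, 20, 2000, 81920\n",
}
```

Calling `mean_util_per_node(sample)` gives one average per node, but you still have to gather the raw output from every node yourself, which is why D doesn't scale and B does.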