Q: 13
In managing an AI data center, you need to ensure continuous optimal performance and quickly
respond to any potential issues. Which monitoring tool or approach would best suit the need to
monitor GPU health, usage, and performance metrics across all deployed AI workloads?
Options
Discussion
It's B: Prometheus with Node Exporter collects system metrics, and you can add a separate exporter for GPU metrics. Not 100% sure, but I've seen setups use it to monitor a range of hardware.
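To make the "add exporters for GPU" part concrete, here is a rough sketch of what a GPU exporter does: it takes per-GPU readings and turns them into Prometheus text-format gauges. It assumes the input looks like the output of `nvidia-smi --query-gpu=index,utilization.gpu,memory.used,temperature.gpu --format=csv,noheader,nounits`. The metric names (`gpu_utilization_percent` etc.), the `to_prometheus` helper, and the sample values are made up for illustration; an off-the-shelf exporter such as NVIDIA's DCGM exporter uses its own metric names.

```python
import csv
import io

# Hypothetical metric suffixes, in the same order as the queried columns
# (after the leading GPU index column). Not the names a real exporter uses.
FIELDS = ["utilization_percent", "memory_used_mib", "temperature_celsius"]


def to_prometheus(csv_text: str) -> str:
    """Render nvidia-smi style CSV rows as Prometheus text-format gauges."""
    # Each row is: index, utilization.gpu, memory.used, temperature.gpu
    rows = [
        [cell.strip() for cell in row]
        for row in csv.reader(io.StringIO(csv_text))
        if row  # skip blank lines
    ]
    lines = []
    for col, field in enumerate(FIELDS, start=1):
        name = f"gpu_{field}"
        lines.append(f"# TYPE {name} gauge")
        for row in rows:
            # One sample per GPU, labelled with the GPU index.
            lines.append(f'{name}{{gpu="{row[0]}"}} {float(row[col])}')
    return "\n".join(lines) + "\n"


# Made-up sample standing in for real nvidia-smi output (two GPUs).
SAMPLE = "0, 87, 30512, 64\n1, 12, 2048, 41\n"

if __name__ == "__main__":
    print(to_prometheus(SAMPLE), end="")
```

One way to wire this in, if I remember the setup right, is to write the output to a `.prom` file in the directory Node Exporter watches via `--collector.textfile.directory`, so Prometheus scrapes the GPU gauges along with the regular system metrics. For anything beyond a quick test, a dedicated GPU exporter is probably the better choice.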