Q: 4
You are deploying a large-scale AI model training pipeline on a cloud-based infrastructure that uses
NVIDIA GPUs. During the training, you observe that the system occasionally crashes due to memory
overflows on the GPUs, even though the overall GPU memory usage is below the maximum capacity.
What is the most likely cause of the memory overflows, and what should you do to mitigate this
issue?
Options
Discussion
D, not A. Batch size matters for total capacity, but if overall usage stays below the maximum, the crashes point to memory fragmentation, as D describes: the allocator can't find a large enough contiguous block even though enough total memory is free. Unified memory management mitigates exactly that kind of failure. I think D is right, but open to other takes.
A Why wouldn't batch size be the main problem here?
D Unified memory management helps when fragmentation causes overflows even if usage looks fine. Seen this in practice, pretty sure that's it.
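For anyone who wants to see what "unified memory management" looks like in practice, here's a minimal CUDA sketch (my own illustration, not from the exam; the 1 GiB size and the prefetch hint are just for demonstration). cudaMallocManaged returns an allocation the driver can migrate between host and device on demand, which is part of why it can ride out fragmentation that would make a plain cudaMalloc fail for lack of a contiguous device block.

```cuda
// Minimal sketch, assuming an NVIDIA GPU and the CUDA runtime API.
// The buffer size and device id below are illustrative only.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = size_t(1) << 30;  // 1 GiB, illustrative only
    float *buf = nullptr;

    // Unified (managed) allocation: pages can live in host memory and be
    // migrated to the GPU when touched, easing pressure when the device
    // heap is fragmented.
    cudaError_t err = cudaMallocManaged(&buf, bytes);
    if (err != cudaSuccess) {
        std::printf("cudaMallocManaged failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Optional hint: prefer residency on device 0 when capacity allows.
    cudaMemPrefetchAsync(buf, bytes, 0);

    // ... training kernels would use buf here ...

    cudaFree(buf);
    return 0;
}
```

In a framework-based pipeline you usually don't call these APIs directly; the point is just that managed memory decouples "allocation succeeds" from "a contiguous device block is available right now," which is the failure mode the question describes.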