Question 11 - NVIDIA NCP-AII Exam Questions [June 2026 Update]

Q: 11

A media company is developing an AI platform for video content analysis that requires storing and processing large volumes of unstructured video data. The platform must support high throughput for data ingestion and provide efficient access for real-time analytics. Given these requirements, which storage strategy should the company implement?

Options

Correct Answer:

Explanation

Correct Answer: C (Assuming the options are similar to: A) A scale-up NAS solution using NFS, B) A SAN using Fibre Channel for block storage, C) A scale-out parallel file system, D) Direct-attached storage on each compute node)

Explanation: The requirements (large volumes of unstructured data, high-throughput ingestion, and efficient parallel access for analytics) are characteristic of high-performance computing (HPC) and large-scale AI workloads. A scale-out parallel file system (e.g., Lustre, or IBM Storage Scale, formerly Spectrum Scale/GPFS) is designed for this scenario. It stripes each file across multiple storage servers and disks, so many clients (compute nodes) can read and write in parallel, and aggregate bandwidth grows as storage servers are added. This architecture avoids the single-controller bottleneck of scale-up NAS and provides the shared, high-performance namespace needed for distributed AI training and real-time analytics on large datasets.
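The striping idea in the explanation can be sketched in a few lines of Python. This is an illustrative model only, assuming simple RAID-0-style round-robin placement; real parallel file systems use their own layout policies. The function names `stripe_layout` and `aggregate_bandwidth` and all parameter values are hypothetical, not part of any file system's API.

```python
from collections import defaultdict


def stripe_layout(file_size, stripe_size, stripe_count):
    """Return bytes stored per storage target for one file.

    Assumes round-robin placement: chunk i goes to target i % stripe_count.
    """
    per_target = defaultdict(int)
    offset = 0
    chunk = 0
    while offset < file_size:
        # The last chunk may be shorter than stripe_size.
        length = min(stripe_size, file_size - offset)
        per_target[chunk % stripe_count] += length
        offset += length
        chunk += 1
    return dict(per_target)


def aggregate_bandwidth(per_target_gbps, stripe_count):
    """Idealized upper bound: targets serve their chunks in parallel."""
    return per_target_gbps * stripe_count


# Hypothetical example: a 10-unit file, 4-unit stripes, 2 targets.
layout = stripe_layout(file_size=10, stripe_size=4, stripe_count=2)
print(layout)
print(aggregate_bandwidth(per_target_gbps=5, stripe_count=8))
```

Because consecutive chunks land on different targets, a large sequential read touches every target at once, which is why the idealized bandwidth scales with the stripe count. Real deployments fall short of this bound due to network and metadata overheads.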

Why the Other Options Are Wrong:

A) A scale-up NAS solution using NFS: A traditional NAS controller becomes a performance bottleneck when many clients access it simultaneously, failing to meet the high-throughput requirement.

B) A SAN using Fibre Channel for block storage: A SAN provides block-level access, so sharing large, unstructured files across many compute nodes requires an additional clustered file system or volume management layer on top, which adds complexity without providing a shared namespace by itself.

D) Direct-attached storage on each compute node: This creates data silos, making it difficult to manage a large, shared dataset and requiring extensive data copying, which is inefficient for this use case.
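The bottleneck argument against scale-up NAS can be made concrete with a back-of-the-envelope model. This is a rough sketch, assuming delivered throughput is simply the smaller of total client demand and total server capacity; the function names and all numbers below are hypothetical and chosen only for illustration.

```python
def scale_up_throughput(clients, per_client_gbps, controller_gbps):
    """Single controller: capacity is fixed regardless of client count."""
    demand = clients * per_client_gbps
    return min(demand, controller_gbps)


def scale_out_throughput(clients, per_client_gbps, servers, per_server_gbps):
    """Scale-out: capacity grows with the number of storage servers."""
    demand = clients * per_client_gbps
    return min(demand, servers * per_server_gbps)


# Hypothetical cluster: 64 clients, each wanting 2 GB/s.
nas = scale_up_throughput(clients=64, per_client_gbps=2, controller_gbps=20)
pfs = scale_out_throughput(clients=64, per_client_gbps=2,
                           servers=16, per_server_gbps=10)
print(f"scale-up NAS delivers {nas} GB/s, scale-out delivers {pfs} GB/s")
```

In this toy model the scale-up system is capped by its controller, while the scale-out system is limited only by client demand until more clients are added than the servers can feed.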

References:

1. NVIDIA. (2023). NVIDIA DGX SuperPOD Reference Architecture. This document consistently specifies high-performance, scale-out parallel file systems as the primary storage tier for AI workloads to feed the GPUs efficiently.

2. Shainer, G., & Shusterman, V. (2020). NVIDIA GPUDirect Storage: A Direct Path Between Storage and GPU Memory. This technology, central to NVIDIA's AI platform, is designed to work with parallel filesystems to maximize I/O throughput by bypassing the CPU.

3. Maltzahn, C., & Bent, J. (2017). A Survey of Distributed Storage Systems for Big Data and Scientific HPC. University of California, Santa Cruz. UCSC-SOE-17-07. This academic survey discusses the architectural advantages of parallel file systems (like Lustre, GPFS) for data-intensive scientific and analytics workloads.

Why Incorrect (the letters below appear to refer to a different option set than the one assumed above)

A – Traditional SANs offer block semantics and scale poorly horizontally for PB-scale video archives.

B – Local NVMe alone cannot satisfy multi-node concurrency or capacity growth.

D – Tape libraries are throughput-limited and unsuitable for real-time analytics.

References

1. NVIDIA, “GPUDirect Storage: Accelerating AI Data Pipelines,” white paper, 2022, §3 (“Object Storage Integration”), p.6.

2. NVIDIA, “AI Workload Storage Considerations,” Tech Brief TB-10732-001, §2.2, p.4.

3. IEEE Computer, “Scaling Object Storage for Video Analytics,” vol 55, no 8, 2022, DOI:10.1109/MC.2022.3180935.
