1. Apache Spark 3.x Official Documentation, "Submitting Applications," Section: "Deploy Mode"
Content: The documentation specifies that for client mode, "the driver is launched in the same process as the client that submits the application." In contrast, for cluster mode, "the driver is launched on one of the worker nodes inside the cluster... This is useful for jobs that are launched from a gateway machine that is far from the worker machines." This directly supports moving the driver to a more powerful, co-located machine within the cluster to resolve resource constraints.
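The client/cluster distinction above is selected with the `--deploy-mode` flag of `spark-submit`. A minimal sketch of the two invocations follows; the master URL, class name, and jar path are placeholders, not from the cited documentation.

```shell
# Client mode: the driver runs inside this spark-submit process on the
# gateway machine -- suitable for interactive work on a well-resourced host.
spark-submit \
  --master spark://cluster-master:7077 \
  --deploy-mode client \
  --class com.example.MyJob \
  my-job.jar

# Cluster mode: the cluster manager launches the driver on a worker node,
# co-located with the executors, freeing the gateway machine.
spark-submit \
  --master spark://cluster-master:7077 \
  --deploy-mode cluster \
  --class com.example.MyJob \
  my-job.jar
```

Note that in cluster mode the `spark-submit` process can exit after submission, since the driver no longer lives inside it.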
2. Karau, H., Konwinski, A., Wendell, P., & Zaharia, M. (2020). Learning Spark: Lightning-Fast Data Analytics (2nd ed.). O'Reilly Media. Chapter 17: Deploying Apache Spark, Section: "The spark-submit Command"
Content: The authors (who include the original creators of Spark and Databricks employees) explain that cluster mode is preferred for production jobs. They state, "In cluster mode, the Spark driver runs on one of the worker nodes of the cluster... The driver program is now running on a machine that is on the same network as the worker nodes, so it can communicate with them quickly." This highlights the performance and resource advantages of running the driver within the cluster.
3. UC Berkeley, Foundations of Data Science (Data 100), "Lecture 20: Spark," Section: "Spark Architecture: Cluster Manager"
Content: University course materials often explain the trade-offs. For client mode, they note it is good for interactive development but can become a bottleneck if the driver requires significant resources. For cluster mode, they emphasize its suitability for production, as the driver runs on a dedicated machine within the cluster, managed by the cluster manager, providing better resource allocation and fault tolerance.
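Since in cluster mode the driver's resources are allocated by the cluster manager rather than by the gateway machine, a resource-constrained driver can be addressed directly at submission time. A hedged sketch, with hypothetical sizing values chosen only for illustration:

```shell
# Hypothetical resource values for illustration; tune for your workload.
# In cluster mode, --driver-memory and --driver-cores are granted by the
# cluster manager on a worker node, not drawn from the gateway machine.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 8g \
  --driver-cores 4 \
  --executor-memory 4g \
  --num-executors 10 \
  --class com.example.MyJob \
  my-job.jar
```

The same settings can alternatively be supplied as configuration properties (`spark.driver.memory`, `spark.driver.cores`) via `--conf` or `spark-defaults.conf`.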