1. Apache Spark Official Documentation
RDD Programming Guide: In the "Lazy Evaluation" section, it states: "All transformations in Spark are lazy, in that they do not compute their results right away. Instead, they just remember the transformations applied to some base dataset... The transformations are only computed when an action requires a result to be returned to the driver program."
Source: Apache Spark Documentation, RDD Programming Guide, Section: Lazy Evaluation.
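
A minimal Scala sketch of the behavior the guide describes, assuming a local Spark setup (the object name, dataset, and values here are illustrative, not from the documentation):

    import org.apache.spark.{SparkConf, SparkContext}

    object LazyRddDemo {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("lazy-demo").setMaster("local[*]"))

        // Transformation: nothing is computed here. Spark only records
        // the lineage (parallelize -> map) against the base dataset.
        val base    = sc.parallelize(1 to 1000)
        val doubled = base.map(_ * 2) // lazy: no job is launched yet

        // Action: this is the point where the remembered transformations
        // actually execute and a result is returned to the driver program.
        val total = doubled.reduce(_ + _)
        println(s"total = $total")

        sc.stop()
      }
    }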
2. Databricks Documentation
Introduction to Apache Spark: "Apache Spark uses lazy evaluation for transformations. Transformations are lazy operations, meaning that they are not executed until an action is called. This allows Spark to optimize the query plan by pipelining transformations."
Source: Databricks Documentation, "Developer tools, languages, and APIs", "Introduction to Apache Spark".
3. Zaharia, M., et al. (2010). Spark: Cluster Computing with Working Sets. This foundational academic paper on Spark states: "RDDs support two types of operations: transformations, which create a new dataset from an existing one, and actions, which return a value to the driver program after running a computation on the dataset. [...] All transformations in Spark are lazy, in that they do not compute their results right away."
Source: Zaharia, M., et al. (2010). Spark: Cluster Computing with Working Sets. USENIX HotCloud'10, Page 2, Section 3.1 RDD Operations.
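
The paper's two-way split between transformations and actions can be made concrete with one more short sketch (a local setup is assumed; the data and names are illustrative). The transformation yields a new, still-unevaluated dataset, while each action returns a plain value to the driver:

    import org.apache.spark.{SparkConf, SparkContext}

    object OperationsDemo {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("ops-demo").setMaster("local[*]"))

        val words = sc.parallelize(Seq("spark", "lazy", "evaluation"))

        // Transformation: creates a new dataset from an existing one,
        // without computing anything yet.
        val lengths = words.map(_.length)

        // Actions: each returns a concrete value to the driver program
        // after running a computation on the dataset.
        val n   = lengths.count()   // Long
        val all = lengths.collect() // Array[Int]
        println(s"n = $n, lengths = ${all.mkString(", ")}")

        sc.stop()
      }
    }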