Question 11 - Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Real Exam Questions [Feb 2026 Update]

Q: 11

17 of 55. A data engineer has noticed that upgrading the Spark version in their applications from Spark 3.0 to Spark 3.5 has improved the runtime of some scheduled Spark applications. Looking further, the data engineer realizes that Adaptive Query Execution (AQE) is now enabled. Which operation should AQE be implementing to automatically improve the Spark application performance?

Options

Correct Answer:

Explanation

Adaptive Query Execution (AQE) is a query re-optimization framework in Spark SQL that uses runtime statistics to improve query plans. It has been enabled by default since Spark 3.2.0, which is why an upgrade from Spark 3.0 to Spark 3.5 turns it on without any code change. One of its key features is the ability to switch join strategies dynamically. For instance, AQE can change a plan from a sort-merge join to a more efficient broadcast hash join if it observes during execution that one side of the join is small enough to be broadcast. Because this adjustment is based on actual intermediate data sizes rather than pre-execution estimates, it can significantly improve performance. AQE also dynamically coalesces shuffle partitions and handles data skew in joins.
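The join-strategy switch described above can be illustrated with a small, simplified model in plain Python. This is only a sketch of the decision rule, not Spark's implementation: the `JoinSide` class, the `choose_join` function, and the table sizes are hypothetical names and numbers made up for illustration. The 10 MB figure is the default of `spark.sql.autoBroadcastJoinThreshold`, which `spark.sql.adaptive.autoBroadcastJoinThreshold` falls back to when it is not set.

```python
from dataclasses import dataclass

# Default of spark.sql.autoBroadcastJoinThreshold (10 MB). The adaptive
# threshold uses the same value unless it is configured separately.
DEFAULT_BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


@dataclass(frozen=True)
class JoinSide:
    """Hypothetical model of one join input."""
    name: str
    estimated_bytes: int  # size the planner assumed before execution
    runtime_bytes: int    # size measured after the shuffle stage finished


def choose_join(left_bytes: int, right_bytes: int,
                threshold: int = DEFAULT_BROADCAST_THRESHOLD_BYTES) -> str:
    """Pick a broadcast hash join if the smaller side fits the threshold."""
    if min(left_bytes, right_bytes) <= threshold:
        return "broadcast_hash_join"
    return "sort_merge_join"


# Illustrative sizes: the filter on `customers` turns out to be far more
# selective at runtime than the planner estimated.
orders = JoinSide("orders", estimated_bytes=5 * 1024**3, runtime_bytes=5 * 1024**3)
customers = JoinSide("customers", estimated_bytes=2 * 1024**3, runtime_bytes=4 * 1024**2)

# Without AQE, the plan is fixed using the estimates only.
static_plan = choose_join(orders.estimated_bytes, customers.estimated_bytes)

# With AQE, the plan is revisited once the real sizes are known.
adaptive_plan = choose_join(orders.runtime_bytes, customers.runtime_bytes)

print(static_plan)    # sort_merge_join
print(adaptive_plan)  # broadcast_hash_join
```

In a real Spark application none of this logic is written by hand. With `spark.sql.adaptive.enabled` set to `true`, Spark makes the equivalent decision itself at shuffle boundaries.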

Why Incorrect

B. Persistent table statistics are collected and stored in the metastore with commands like ANALYZE TABLE and are used by the Cost-Based Optimizer (CBO) at planning time. This is not a runtime feature of AQE.

C. AQE re-optimizes query plans at stage boundaries (shuffles). A single-stage job has no shuffles, so AQE has no opportunity to re-plan and improve its performance.

D. Optimizing the layout of Delta files (e.g., file compaction) is a data management operation specific to Delta Lake, performed by commands like OPTIMIZE, and is separate from AQE's query plan optimization.

References

1. Databricks Documentation

"Adaptive query execution": This document explicitly lists "Dynamically switches join strategies" as a primary feature of AQE. It states: "AQE converts a sort-merge join to a broadcast hash join when the runtime statistics of any join side is smaller than the broadcast hash join threshold."

2. Apache Spark 3.5.0 Documentation

"SQL Guide > Performance Tuning > Adaptive Query Execution": The official Spark documentation details the three main components of AQE. Under the section "Dynamically Switching Join Strategies," it explains how AQE can convert a sort-merge join to a broadcast hash join based on runtime statistics.

3. Apache Spark 3.5.0 Documentation

"SQL Guide > Performance Tuning > Cost-Based Optimizer": This section describes how Spark uses table-level statistics for query optimization, which are collected via the ANALYZE TABLE command and persisted. This confirms that persistent statistics are part of CBO, not AQE.

4. Databricks Documentation

"Optimize data file layout": This source describes the OPTIMIZE command for Delta Lake, clarifying that it is a file layout and compaction utility, which is distinct from the runtime query plan optimizations performed by AQE.
