Question 15 - Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Real Exam Questions [March 2026 Update]

Q: 15

A developer notices that all the post-shuffle partitions in a dataset are smaller than the value set for spark.sql.adaptive.maxShuffledHashJoinLocalMapThreshold. Which type of join will Adaptive Query Execution (AQE) choose in this case?

Options

A. Cartesian join

B. Shuffled hash join

C. Broadcast nested loop join

D. Sort-merge join

Correct Answer: B (shuffled hash join)

Explanation

Adaptive Query Execution (AQE) re-optimizes query plans at runtime, using statistics gathered once shuffle stages have finished. One of its features is changing the join strategy after the shuffle has run. The configuration spark.sql.adaptive.maxShuffledHashJoinLocalMapThreshold sets the maximum size, in bytes per partition, that is allowed for building a local hash map. When every post-shuffle partition is no larger than this threshold, and the threshold is not smaller than spark.sql.adaptive.advisoryPartitionSizeInBytes, AQE prefers a shuffled hash join over the initially planned sort-merge join, regardless of the value of spark.sql.join.preferSortMergeJoin. Building a hash table per partition avoids sorting both sides of the join, which is cheaper when the partitions are small enough to hold in memory. The threshold defaults to 0, so the conversion does not happen unless the value is raised. The scenario in the question meets the size condition, so AQE chooses a shuffled hash join.
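The size check described above can be sketched in plain Python. This is a simplified model of the rule as stated in the configuration's documentation, not Spark's implementation; the function name, the parameter names, and the 64 MiB advisory size used as a default are assumptions made for illustration.

```python
from typing import Sequence

MIB = 1024 * 1024


def prefers_shuffled_hash_join(
    partition_sizes: Sequence[int],
    local_map_threshold: int,
    advisory_partition_size: int = 64 * MIB,  # assumed value for illustration
) -> bool:
    """Model of the AQE preference for a shuffled hash join.

    partition_sizes: post-shuffle partition sizes in bytes.
    local_map_threshold: stands in for
        spark.sql.adaptive.maxShuffledHashJoinLocalMapThreshold.
    advisory_partition_size: stands in for
        spark.sql.adaptive.advisoryPartitionSizeInBytes.
    """
    # The threshold must not be smaller than the advisory partition size.
    # A threshold of 0 therefore never enables the conversion.
    if local_map_threshold < advisory_partition_size:
        return False
    # Every post-shuffle partition must fit under the threshold.
    return all(size <= local_map_threshold for size in partition_sizes)


small_partitions = [10 * MIB, 20 * MIB, 32 * MIB]
skewed_partitions = [10 * MIB, 20 * MIB, 200 * MIB]

print(prefers_shuffled_hash_join(small_partitions, 64 * MIB))   # True
print(prefers_shuffled_hash_join(skewed_partitions, 64 * MIB))  # False
print(prefers_shuffled_hash_join(small_partitions, 0))          # False
```

The second call returns False because a single oversized partition is enough to keep the sort-merge join, which mirrors the "all post-shuffle partitions" wording in the question.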

Why Incorrect

A. A Cartesian join is a cross-product used when no join keys are specified and is unrelated to this AQE optimization threshold.

C. A broadcast nested loop join is a fallback strategy, often for non-equi joins, and is not the target optimization for this specific threshold.

D. A sort-merge join is the likely initial plan, but AQE converts it to a shuffled hash join because the size condition is met.
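To try the conversion on a real cluster, the relevant settings could be applied roughly as follows. This is a minimal sketch: the helper name apply_shuffled_hash_join_settings is hypothetical, it assumes an existing SparkSession is passed in, and disabling automatic broadcast joins is an extra step assumed here so that small tables are not broadcast before the threshold comes into play.

```python
def apply_shuffled_hash_join_settings(spark, threshold_bytes: int) -> dict:
    """Set the AQE options relevant to the sort-merge to shuffled hash
    join conversion on an existing SparkSession (hypothetical helper)."""
    settings = {
        # AQE must be enabled for any runtime re-planning to happen.
        "spark.sql.adaptive.enabled": "true",
        # Per-partition size limit for building a local hash map.
        "spark.sql.adaptive.maxShuffledHashJoinLocalMapThreshold": str(threshold_bytes),
        # Assumption for demonstration only: turn off automatic broadcast
        # joins so a broadcast hash join does not take precedence.
        "spark.sql.autoBroadcastJoinThreshold": "-1",
    }
    for key, value in settings.items():
        spark.conf.set(key, value)
    return settings


# Usage, assuming a SparkSession named `spark` and two DataFrames
# `orders` and `customers` that share a `customer_id` column:
#
#   apply_shuffled_hash_join_settings(spark, 64 * 1024 * 1024)
#   joined = orders.join(customers, "customer_id")
#   joined.collect()   # run the query so AQE can finalize the plan
#   joined.explain()   # inspect the final plan for the chosen join
```

Because AQE decides at runtime, the plan is only final after an action has executed, which is why the usage runs the query before calling explain().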

References

1. Databricks Documentation

Adaptive query execution: In the section "Optimize joins," the documentation states: "AQE can convert a sort-merge join to a shuffled hash join when one side of the join is small enough. This is controlled by the configuration spark.sql.adaptive.maxShuffledHashJoinLocalMapThreshold."

Source: Databricks Documentation > Optimizations > Adaptive query execution > Optimize joins.

2. Apache Spark 3.x Official Documentation

SQL Guide > Performance Tuning: In the section on Adaptive Query Execution, under "Converting sort-merge join to shuffled hash join," it explains that AQE converts a sort-merge join to a shuffled hash join when all post-shuffle partitions are smaller than the threshold configured by spark.sql.adaptive.maxShuffledHashJoinLocalMapThreshold.

Source: Apache Spark Documentation > SQL Guide > Performance Tuning > Adaptive Query Execution > Converting sort-merge join to shuffled hash join.
