1. Databricks Documentation
"Adaptive query execution": This document explicitly lists "Dynamically switches join strategies" as a primary feature of AQE. It states, "AQE converts a sort-merge join to a broadcast hash join when the runtime statistics of any join side is smaller than the broadcast hash join threshold."
2. Apache Spark 3.5.0 Documentation
"SQL Guide > Performance Tuning > Adaptive Query Execution": The official Spark documentation details the three main components of AQE. Under the section "Dynamically Switching Join Strategies," it explains how AQE can convert a sort-merge join to a broadcast hash join based on runtime statistics.
3. Apache Spark 3.5.0 Documentation
"SQL Guide > Performance Tuning > Cost-Based Optimizer": This section describes how Spark uses table-level statistics, which are collected via the ANALYZE TABLE command and persisted, for query optimization. This confirms that persistent statistics are part of CBO, not AQE.
4. Databricks Documentation
"Optimize data file layout": This source describes the OPTIMIZE command for Delta Lake, clarifying that it is a file layout and compaction utility, which is distinct from the runtime query plan optimizations performed by AQE.