1. Official Apache Spark Documentation: The documentation for LinearRegression in MLlib explicitly details the optimization algorithms used. It states: "The implementation is based on the MLlib LBFGS optimizer for L2-regularized linear regression... For unregularized linear regression, the implementation uses a wrapper for the NormalEquation and Cholesky solvers... The normal equation solver is limited to at most 4096 features." This confirms that for a large number of variables (beyond 4096), the iterative L-BFGS optimizer is the method employed; the snippet below shows how the solver can be selected explicitly.
Source: Apache Spark 3.5.0 MLlib Guide, Classification and regression > Linear methods > Linear regression > Mathematical detail.
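For concreteness, the DataFrame-based API exposes this choice through LinearRegression's setSolver parameter (valid values are "l-bfgs", "normal", and "auto"). Below is a minimal sketch; the local SparkSession, toy three-feature dataset, and hyperparameter values are illustrative assumptions, not values from the documentation:

```scala
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.SparkSession

object SolverSelectionSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("SolverSelectionSketch").master("local[*]").getOrCreate()

    // Toy (label, features) data; the 4096-feature cutoff only matters at scale.
    val training = spark.createDataFrame(Seq(
      (1.0, Vectors.dense(0.0, 1.1, 0.1)),
      (0.0, Vectors.dense(2.0, 1.0, -1.0)),
      (3.0, Vectors.dense(2.0, 1.3, 1.0))
    )).toDF("label", "features")

    val lr = new LinearRegression()
      .setRegParam(0.1)          // L2 strength (illustrative value)
      .setElasticNetParam(0.0)   // 0.0 selects a pure L2 penalty
      .setSolver("l-bfgs")       // force the iterative solver explicitly
    val model = lr.fit(training)

    println(s"Coefficients: ${model.coefficients}")
    spark.stop()
  }
}
```

With the default solver ("auto"), Spark falls back to L-BFGS whenever the normal equation solver is unavailable, which per the quoted passage includes any problem with more than 4096 features.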
2. Academic Publication: The foundational paper on Spark's machine learning library highlights the design choice of iterative methods for scalability: "For many ML algorithms, we can express the optimization problem as a sum of loss terms... and solve it with gradient descent. The gradient can be computed on a cluster by summing gradients computed on subsets of the data in parallel... MLlib has a general-purpose gradient descent optimizer." L-BFGS is a more advanced quasi-Newton iterative optimization method built on the same principles; a sketch of this parallel gradient summation follows the citation below.
Source: Meng, X., Bradley, J., Yavuz, B., Sparks, E., Venkataraman, S., Liu, D., ... & Talwalkar, A. (2016). MLlib: Machine Learning in Apache Spark. Journal of Machine Learning Research, 17(34), 1-7. (Section 3.1: Implementation).
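To make the quoted decomposition concrete, here is a minimal sketch of one gradient descent loop for squared loss, in the spirit of the paper rather than MLlib's actual optimizer code. treeAggregate and broadcast are real Spark RDD APIs; the dataset, step size, and iteration count are illustrative assumptions:

```scala
import org.apache.spark.sql.SparkSession

object ParallelGradientSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("ParallelGradientSketch").master("local[*]").getOrCreate()
    val sc = spark.sparkContext

    // Toy (label, features) pairs; illustrative only.
    val data = sc.parallelize(Seq(
      (1.0, Array(0.0, 1.1, 0.1)),
      (0.0, Array(2.0, 1.0, -1.0)),
      (3.0, Array(2.0, 1.3, 1.0))
    )).cache()

    val n = data.count().toDouble
    val stepSize = 0.1
    var w = Array.fill(3)(0.0)

    for (_ <- 1 to 50) {
      val wB = sc.broadcast(w)
      // Each task accumulates squared-loss gradients (w.x - y) * x on its
      // slice of the data; treeAggregate then sums the partial results
      // across the cluster: the "sum of loss terms" the paper describes.
      val grad = data.treeAggregate(Array.fill(3)(0.0))(
        (acc: Array[Double], point: (Double, Array[Double])) => {
          val (y, x) = point
          val err = wB.value.zip(x).map { case (wi, xi) => wi * xi }.sum - y
          for (i <- acc.indices) acc(i) += err * x(i)
          acc
        },
        (a: Array[Double], b: Array[Double]) => {
          for (i <- a.indices) a(i) += b(i)
          a
        }
      )
      w = w.zip(grad).map { case (wi, gi) => wi - stepSize * gi / n }
      wB.destroy()
    }
    println(s"Learned weights: ${w.mkString(", ")}")
    spark.stop()
  }
}
```

Each iteration scans the data once and ships back only a single gradient vector per partition, which is the property that lets this pattern scale to large clusters.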
3. Official Databricks Documentation: The Databricks documentation on Linear Regression likewise confirms the use of iterative optimization: "The training algorithm uses the L-BFGS optimizer." This directly identifies an iterative optimizer as the standard in their implementation; the sketch after this item calls that optimizer directly through the spark.mllib API.
Source: Databricks Documentation, Machine Learning Guide > MLflow > MLflow models > spark.mllib > Linear Regression.
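The RDD-based spark.mllib API also exposes this optimizer directly as LBFGS.runLBFGS, paired here with a least-squares gradient and an L2 updater (LeastSquaresGradient and SquaredL2Updater are real spark.mllib classes); the toy data and hyperparameter values are illustrative assumptions:

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.optimization.{LBFGS, LeastSquaresGradient, SquaredL2Updater}
import org.apache.spark.sql.SparkSession

object LbfgsDirectSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("LbfgsDirectSketch").master("local[*]").getOrCreate()
    val sc = spark.sparkContext

    // runLBFGS expects an RDD of (label, features) pairs.
    val data = sc.parallelize(Seq(
      (1.0, Vectors.dense(0.0, 1.1, 0.1)),
      (0.0, Vectors.dense(2.0, 1.0, -1.0)),
      (3.0, Vectors.dense(2.0, 1.3, 1.0))
    )).cache()

    val (weights, lossHistory) = LBFGS.runLBFGS(
      data,
      new LeastSquaresGradient(),  // squared-error loss, i.e. linear regression
      new SquaredL2Updater(),      // L2 regularization
      10,                          // numCorrections: history size of the quasi-Newton approximation
      1e-6,                        // convergence tolerance
      100,                         // max iterations
      0.1,                         // regParam (illustrative)
      Vectors.dense(0.0, 0.0, 0.0) // initial weights
    )
    println(s"Weights: $weights, final loss: ${lossHistory.last}")
    spark.stop()
  }
}
```

The numCorrections parameter is what makes the method "limited-memory": the quasi-Newton curvature estimate is rebuilt from only the last few gradient differences, so per-iteration memory stays linear in the number of features, which is why the solver scales past the normal equation's 4096-feature limit.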