1. Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.). Springer. In Chapter 7, Section 7.5, "The Akaike Information Criterion," pages 233-234, the text explains that AIC is a widely used criterion for model selection that trades off model fit against the number of parameters, making it suitable for comparing different models.
2. Burnham, K. P., & Anderson, D. R. (2002). Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (2nd ed.). Springer-Verlag. Chapter 2, Section 2.2, "Akaike's Information Criterion," pages 60-66, details the application of AIC for selecting the best model from a set of candidates, emphasizing its utility in comparing models that may not be nested.
3. Pennsylvania State University. (n.d.). STAT 501: Regression Methods, Lesson 11: Model Selection & Validation. Eberly College of Science. In Section 11.2, "AIC, BIC, and Mallows' Cp," the courseware describes AIC as an estimate of prediction error used to select the best model from a set of candidates, where the model with the lowest AIC is chosen; a brief illustrative sketch follows this list.
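The selection rule these sources describe uses the standard definition AIC = 2k - 2 ln(L_hat), where k is the number of fitted parameters and L_hat is the maximized likelihood. The following minimal Python sketch shows how that rule picks the lowest-AIC candidate; the model names, log-likelihood values, and parameter counts are hypothetical and serve only to illustrate the fit-versus-complexity trade-off.

```python
def aic(log_likelihood: float, k: int) -> float:
    """Akaike Information Criterion: AIC = 2k - 2*ln(L_hat)."""
    return 2 * k - 2 * log_likelihood

# Hypothetical candidates: (name, maximized log-likelihood, number of fitted parameters).
# These values are made up for illustration only.
candidates = [
    ("linear",    -120.4, 3),
    ("quadratic", -115.1, 4),
    ("cubic",     -114.8, 5),
]

scores = {name: aic(ll, k) for name, ll, k in candidates}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: AIC = {score:.1f}")
print(f"selected model (lowest AIC): {best}")
```

In this made-up example the cubic model has the best fit (highest log-likelihood), but its extra parameter is not worth the penalty, so the quadratic model attains the lowest AIC and would be selected.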