1. Project Management Institute. (2023). Project Management for the AI-Powered Organization. Newtown Square, PA: Project Management Institute. The "AI Project Lifecycle" section details the "Data Sourcing and Preparation" phase, which explicitly identifies data quality assessment and ETL as foundational activities for any AI initiative.
2. Shickel, B., Tighe, P. J., Bihorac, A., & Rashidi, P. (2018). Deep EHR: A survey of recent advances in deep learning techniques for electronic health record (EHR) analysis. IEEE Journal of Biomedical and Health Informatics, 22(5), 1589-1604. Section III, "DATA PREPARATION AND PREPROCESSING," discusses the critical need for robust data preprocessing pipelines (i.e., ETL) to handle the heterogeneity and quality issues in EHR data before it can be used for machine learning. https://doi.org/10.1109/JBHI.2017.2767063
3. Stanford University. (n.d.). CS 229: Machine Learning - Course Notes. Stanford, CA: Stanford University. The course materials emphasize that a significant portion of applied machine learning work involves the data pipeline. This includes data cleaning, preprocessing, and feature engineering, all core components of an ETL process, which together ensure data quality and consistency.