1. Stanford University Courseware (CS229): In the context of supervised learning, a loss function (or cost function) J(θ) is defined to measure the error between the hypothesis hθ(x) and the actual output y. The objective is to find parameters θ that minimize this function. This directly supports the claim that a loss function measures how wrong the model's predictions are.
Source: Ng, A. (2023). CS229: Machine Learning Course Notes, Supervised Learning. Stanford University. Section: "Cost Function".
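As a concrete illustration of the cost function described above, here is a minimal sketch of the mean-squared-error form of J(θ) for linear regression. The 1/(2m) scaling follows the convention used in the CS229 notes; the toy data and function names are illustrative, not taken from the course materials.

```python
import numpy as np

def cost(theta, X, y):
    """J(theta) = (1/2m) * sum((h_theta(x) - y)^2), the MSE cost."""
    m = len(y)
    residuals = X @ theta - y  # h_theta(x) - y for each example
    return (1.0 / (2 * m)) * np.sum(residuals ** 2)

# Toy data: y = 2x, with a bias column prepended to X.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])

print(cost(np.array([0.0, 2.0]), X, y))  # 0.0 -- perfect fit
print(cost(np.array([0.0, 1.0]), X, y))  # positive -- worse parameters, higher cost
```

A perfect hypothesis drives J(θ) to zero; any mismatch between hθ(x) and y raises it, which is exactly the sense in which the cost quantifies prediction error.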
2. MIT OpenCourseWare (6.S191): The course materials explain that a loss function computes a single value that quantifies how poorly a model performed on a given example. The goal of training is to find a set of model weights that minimizes this loss.
Source: Amini, A., & Soleimany, A. (2023). MIT 6.S191: Introduction to Deep Learning, Lecture 1: Foundations of Deep Learning. MIT OpenCourseWare. Section: "Defining a Loss Function".
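The MIT materials frame training as finding weights that minimize the loss. A minimal sketch of that idea, assuming a one-parameter model y = w·x with a per-example squared-error loss and plain gradient descent (the data, learning rate, and variable names are illustrative assumptions, not from the lecture):

```python
import numpy as np

def per_example_loss(w, x, y):
    # A single scalar quantifying how poorly the model did on one example.
    return (w * x - y) ** 2

# Toy dataset generated by the true relation y = 3x.
xs = np.array([1.0, 2.0, 3.0])
ys = 3.0 * xs

w = 0.0
lr = 0.05
for _ in range(200):
    # Gradient of the mean squared-error loss with respect to w.
    grad = np.mean(2 * (w * xs - ys) * xs)
    w -= lr * grad

print(round(w, 3))  # converges close to 3.0, the loss-minimizing weight
```

Each iteration moves the weight in the direction that reduces the average loss, so the training loop is literally a search for the loss-minimizing parameters.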
3. Oracle Cloud Infrastructure (OCI) Documentation: The OCI Data Science service documentation, in describing the model training process, relies on the concept of minimizing a loss function. For example, algorithms such as logistic regression are trained by minimizing a loss function (e.g., log loss) to find the best-fitting model. This minimization is what drives improvement in the model's predictions.
Source: Oracle Cloud Infrastructure Documentation. (2024). Data Science, Concepts, About Model Training. Section: "Training Process".
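To make the log loss mentioned above concrete, here is a minimal sketch of binary cross-entropy (log loss), the quantity logistic regression minimizes during training. The function name, the clipping epsilon, and the example probabilities are illustrative assumptions, not taken from the OCI documentation.

```python
import math

def log_loss(y_true, p_pred, eps=1e-12):
    """Binary cross-entropy averaged over examples."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Confident, correct predictions yield low loss; confident, wrong
# predictions are penalized heavily.
print(log_loss([1, 0], [0.9, 0.1]))  # small (about 0.105)
print(log_loss([1, 0], [0.1, 0.9]))  # large (about 2.303)
```

Because wrong confident predictions incur large loss, minimizing log loss pushes the model's predicted probabilities toward the true labels, which is the "best-fitting model" behavior the documentation describes.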