1. Databricks Documentation
summary function: The summary function in Databricks computes descriptive statistics for numeric and string columns. For numeric columns, it calculates by default count, mean, stddev, min, approximate quartiles (25%, 50%, 75%), and max, which are all quantitative summaries. This aligns with the definition of using summary statistics to quantitatively describe data.
Source: Databricks Documentation > Spark SQL > DataFrame API > pyspark.sql.DataFrame.summary.
2. University Courseware
Penn State University STAT 200: "Descriptive statistics consists of methods for organizing and summarizing information... Numerical methods for summarizing data include calculating measures of center, such as the mean or median, and measures of spread, such as the standard deviation or interquartile range." This directly supports the concept of using quantitative measures to summarize data.
Source: Penn State University, Eberly College of Science, STAT 200: Elementary Statistics, Lesson 1.1: "What is Statistics?".
3. University Courseware
MIT OpenCourseWare: "Descriptive statistics: describing data with graphs and numerical summaries." This definition emphasizes the use of numerical (quantitative) methods to describe the data.
Source: MIT OpenCourseWare, 18.05 Introduction to Probability and Statistics, Spring 2014, Topic 1: "Introduction to Statistics".
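
The numerical summaries named in the references above (count, mean, standard deviation, min, max, median, interquartile range) can be illustrated with a minimal sketch in plain Python. This is not the Spark implementation: the describe helper and the sample values are hypothetical, and the quartiles are computed exactly with the standard library's "inclusive" method, whereas Spark reports approximate percentiles that may differ slightly.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of at least two numbers.

    Hypothetical helper for illustration only; it mirrors the kinds of
    statistics a summary function reports, not any library's internals.
    """
    # Quartile cut points (Q1, Q2, Q3) using linear interpolation
    # over the data treated as the full population ("inclusive").
    q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "stddev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,  # interquartile range: spread of the middle half
        "max": max(values),
    }


# Made-up sample data, used only to exercise the helper.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The mean and median are the measures of center mentioned in the Penn State quote, while the standard deviation and interquartile range are its measures of spread.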