1. Official Vendor Documentation (NVIDIA): The NVIDIA RAPIDS cuDF library, designed for GPU-accelerated data science, provides string methods that accept regular expressions for filtering and manipulation. For example, StringMethods.contains tests whether a pattern or regex occurs within each string of a Series or Index and returns a boolean result that can be used to filter rows.
Source: NVIDIA RAPIDS cuDF v24.04.00 documentation, cudf.core.column.string.StringMethods.contains.
2. University Courseware (Stanford University): In natural language processing (NLP) courses, regular expressions are taught as a foundational technique for processing text. They are used for tasks such as tokenization and identifying specific entities, both of which select text by pattern, including matching on keywords.
Source: Jurafsky, D., & Martin, J. H. (2023). Speech and Language Processing (3rd ed. draft). Chapter 2: "Regular Expressions, Text Normalization, Edit Distance". (This textbook is standard courseware for Stanford's CS224N and other leading NLP courses.)
3. University Courseware (University of California, Berkeley): Data science curricula introduce regular expressions as a core tool for working with text data. They are used to clean, validate, and filter data based on string patterns, including selecting the lines or documents that contain specific keywords.
Source: UC Berkeley, Data 100: Principles and Techniques of Data Science, Fall 2023, Lecture 10: "Text: Regex".
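
The keyword-based selection that all three sources describe can be sketched with Python's standard re module. This is a minimal illustration, not code taken from any of the cited sources: the keyword list, the sample lines, and the helper name select_lines are hypothetical, and the standard library is used here so the sketch runs without a GPU.

```python
import re

# Hypothetical keyword list, chosen only for illustration.
KEYWORDS = ["gpu", "regex"]

# Build one alternation pattern. \b restricts matches to whole words, and
# re.escape protects against keywords that contain regex metacharacters.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def select_lines(lines):
    """Return the lines that contain at least one keyword as a whole word."""
    return [line for line in lines if PATTERN.search(line)]


# Hypothetical sample data.
sample = [
    "cuDF runs string operations on the GPU.",
    "Tokenization splits text into words.",
    "A regex can validate an email field.",
    "Regexes are compiled once and reused.",
]

selected = select_lines(sample)
for line in selected:
    print(line)
```

The word-boundary anchors are a design choice: without them, "regex" would also match inside "Regexes". On a pandas or cuDF Series the analogous filter would presumably pass the same pattern string to the str.contains method; treat that as an assumption to check against the library version in use.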