Mori, S., Suen, C. Y., & Yamamoto, K. (1992). Historical review of OCR research and development. Proceedings of the IEEE, 80(7), 1029-1058.
DOI: https://doi.org/10.1109/5.156468
Reference (Section I, Introduction): The paper defines the fundamental goal of OCR as the "problem of converting handwritten or machine-printed text into machine-readable text." This directly aligns with the historian's goal of turning scanned newspapers into digitized text.
Holley, R. (2009). How Good Can It Get? Analysing and Improving OCR Accuracy in Large Scale Historic Newspaper Digitisation Programs. D-Lib Magazine, 15(3/4).
DOI: https://doi.org/10.1045/march2009-holley
Reference (Abstract): This article explicitly discusses the use of "Optical Character Recognition (OCR)" as the core technology for "historic newspaper digitisation programs," which is the exact scenario in the question.
Smith, R. (2007). An Overview of the Tesseract OCR Engine. Proceedings of the Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), Vol. 2, pp. 629-633.
DOI: https://doi.org/10.1109/ICDAR.2007.4376991
Reference (Section II.A, Architecture): "The Tesseract OCR engine... takes a binary image as input... and produces text output." This describes the technical process of converting an image (the scanned newspaper) into text (the digitized version).
Zou, Z., Shi, Z., Guo, Y., & Ye, J. (2019). Object Detection in 20 Years: A Survey. arXiv preprint arXiv:1905.05055.
DOI: https://doi.org/10.48550/arXiv.1905.05055
Reference (Section 1, Introduction): This paper defines object detection as a task to "localize and classify" objects. It distinguishes this from image classification (labeling the whole image) and segmentation. None of these tasks involves extracting text.
Zhao, W., Chellappa, R., Phillips, P. J., & Rosenfeld, A. (2003). Face recognition: A literature survey. ACM Computing Surveys (CSUR), 35(4), 399-458.
DOI: https://doi.org/10.1145/954339.954342
Reference (Section 1, Introduction): This survey defines face recognition (a component of facial analysis) as an application for "automatically identifying or verifying a person from a digital image." This clearly differentiates it from the text-based task.