1. Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., & Zhou, D. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. Advances in Neural Information Processing Systems, 35, 24824-24837. In Section 2, the paper introduces CoT and demonstrates its effectiveness on arithmetic reasoning benchmarks like GSM8K, showing it "improves the performance of large language models by a large margin."
2. NVIDIA Technical Blog. (2023, August 16). An Introduction to Large Language Models: Prompt Engineering and P-Tuning. The section "Advanced Prompting Techniques" explicitly describes Chain-of-Thought as a method to "guide the LLM through the reasoning process for complex queries," highlighting its suitability for multi-step reasoning tasks.
3. Zhao, W. X., Zhou, K., Li, J., Tang, T., Wang, X., Hou, Y., Min, Y., Zhang, B., Zhang, J., Dong, Z., Du, Y., Yang, C., Chen, Y., Chen, Z., Jiang, J., Ren, R., Li, Y., Tang, X., Liu, Z., Liu, P., Nie, J.-Y., & Wen, J.-R. (2023). A Survey of Large Language Models. arXiv preprint. Section 4.2.2, "Chain-of-Thought (CoT) Prompting," states that CoT is a key technique to "enhance the reasoning ability of LLMs" by generating intermediate reasoning steps. https://doi.org/10.48550/arXiv.2303.18223
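The technique the references above describe can be sketched as plain prompt assembly: a few-shot exemplar that includes intermediate reasoning steps before the final answer, versus a standard exemplar that gives only the answer. The sketch below is illustrative only (the model call itself is omitted); the tennis-ball exemplar mirrors the canonical example shown in Figure 1 of Wei et al. (2022), and the function names are assumptions, not an API from any of the cited sources.

```python
def build_cot_prompt(question: str) -> str:
    """Assemble a one-shot Chain-of-Thought prompt: the exemplar shows
    intermediate reasoning steps before stating the final answer."""
    exemplar = (
        "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
        "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
        "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
        "6 tennis balls. 5 + 6 = 11. The answer is 11.\n\n"
    )
    return exemplar + f"Q: {question}\nA:"


def build_direct_prompt(question: str) -> str:
    """Assemble a standard few-shot prompt: same exemplar question, but the
    answer is given without any intermediate reasoning."""
    exemplar = (
        "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
        "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
        "A: The answer is 11.\n\n"
    )
    return exemplar + f"Q: {question}\nA:"


if __name__ == "__main__":
    q = ("The cafeteria had 23 apples. If they used 20 to make lunch and "
         "bought 6 more, how many apples do they have?")
    # The CoT variant is what the papers above report as improving
    # performance on benchmarks such as GSM8K.
    print(build_cot_prompt(q))
```

Either string would then be sent to the LLM as-is; the only difference between the two conditions is whether the exemplar demonstrates the reasoning chain, which is exactly the manipulation studied in the cited work.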