1. Cisco Systems, Inc. (2023). Cisco Responsible AI Framework.
Reference: Page 6, "Accountability" Principle.
Quote: "We are accountable for our AI systems, and we implement human-in-the-loop governance to ensure our AI systems operate as intended. This includes monitoring and measuring system performance and impact on an ongoing basis." This ties human-in-the-loop (HITL) governance directly to ensuring that AI systems operate as intended, which includes preventing harmful outputs.
2. National Institute of Standards and Technology (NIST). (2023). AI Risk Management Framework (AI RMF 1.0). NIST AI 100-1.
Reference: Page 16, Section 3.2, "GOVERN" Function.
Details: The GOVERN function is a core part of the framework and emphasizes that "human oversight is present" and that policies should be in place for "human review, and intervention." This establishes HITL as a key policy for managing AI risks, including the generation of harmful content.
3. Weidinger, L., et al. (2022). Taxonomy of Risks posed by Language Models. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22).
Reference: Page 225, Section 5.1, "Mitigation approaches".
Details: This peer-reviewed paper discusses mitigation strategies for risks from large language models. It explicitly lists "Human-in-the-loop (HITL) content moderation and review" as a key mitigation approach for risks such as discrimination, hate speech, and misinformation, which fall under the umbrella of harmful content.
DOI: https://doi.org/10.1145/3531146.3533088