1. Microsoft Learn, "Azure OpenAI Service content filtering," Introduction. The document states: "The Azure OpenAI Service includes a content filtering system that works alongside core models. The system works by running both the prompt and completion through an ensemble of classification models aimed at detecting and preventing the output of harmful content." It explicitly lists "Hate" as a core filtered category.
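The ensemble described above reports a per-category verdict for each prompt and completion. The sketch below shows one way a client might inspect such a result; the payload shape (a `content_filter_results` mapping with per-category `filtered` and `severity` fields) follows what the Azure OpenAI documentation describes for its annotations, but the exact field names here should be treated as illustrative rather than authoritative.

```python
# Inspect per-category results from a content filtering ensemble.
# The categories and the {"filtered": bool, "severity": str} shape
# mirror Azure OpenAI's documented annotations; field names are
# illustrative assumptions, not a pinned API contract.

FILTER_CATEGORIES = ("hate", "sexual", "violence", "self_harm")

def blocked_categories(content_filter_results: dict) -> list[str]:
    """Return the categories the classifier ensemble flagged as filtered."""
    return [
        category
        for category in FILTER_CATEGORIES
        if content_filter_results.get(category, {}).get("filtered", False)
    ]

# Example annotation resembling a completion-side result:
sample = {
    "hate": {"filtered": True, "severity": "medium"},
    "sexual": {"filtered": False, "severity": "safe"},
    "violence": {"filtered": False, "severity": "safe"},
    "self_harm": {"filtered": False, "severity": "safe"},
}

print(blocked_categories(sample))  # prints ['hate']
```

Because the ensemble runs on both the prompt and the completion, a client would typically apply a check like this twice: once on the prompt-side annotation and once on the completion-side annotation.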
2. Microsoft Learn, "Responsible AI practices for Azure OpenAI models," Mitigations section. This section details content filtering as a primary mitigation strategy. It explains: "In addition to the safety system, the Azure OpenAI Service features a content filtering system... The system detects and takes action on specific categories of potentially harmful content in both input prompts and output completions."
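"Takes action" on an input prompt means the request itself is rejected: the service is documented to return an HTTP 400 error whose code is "content_filter" when the prompt triggers the filter. The helper below is a minimal sketch of recognizing that case from the error payload; the payload shape is an assumption based on that documented error code, not an SDK call.

```python
# Sketch of detecting a prompt rejected by the content filter.
# Assumes an error body of the form {"error": {"code": ..., "message": ...}},
# with "content_filter" as the documented rejection code; treat the
# exact structure as illustrative.

def is_content_filter_error(error_body: dict) -> bool:
    """True if the service rejected the input prompt via its content filter."""
    return error_body.get("error", {}).get("code") == "content_filter"

rejected = {"error": {"code": "content_filter",
                      "message": "The prompt was filtered."}}
other = {"error": {"code": "rate_limit_exceeded",
                   "message": "Slow down."}}

print(is_content_filter_error(rejected))  # prints True
print(is_content_filter_error(other))     # prints False
```

Distinguishing this error from transient failures matters in practice: a filtered prompt should not be retried as-is, whereas a rate-limit error usually should.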
3. Microsoft Learn, "Responsible AI standard v2," Section 2.3, Safety. This document outlines Microsoft's principles. Under the Safety principle, it discusses the need to "prevent models from generating harmful content," a goal directly achieved through technical implementations like content filtering.