New and improved content moderation tooling

To help developers protect their applications against possible misuse, we’re introducing the faster and more accurate Moderation endpoint. This endpoint gives OpenAI API developers free access to GPT-based classifiers that detect undesired content, an instance of using AI systems to assist with human supervision of these systems. We have also released both a technical paper describing our methodology and the dataset used for evaluation.

When given a text input, the Moderation endpoint assesses whether the content is sexual, hateful, violent, or promotes self-harm, all of which are prohibited by our content policy. The endpoint has been trained to be quick, accurate, and to perform robustly across a range of applications. Importantly, this reduces the chances of products “saying” the wrong thing, even when deployed to users at scale. As a consequence, AI can unlock benefits in sensitive settings, like education, where it could not otherwise be used with confidence.
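To make the request and response flow concrete, here is a minimal Python sketch of how an application might send text to the endpoint and read back the flagged categories. The URL, the `input` request field, and the `results` / `categories` response shape are assumptions that are not stated in this post, so check them against the API reference before relying on them. The helper names (`build_request`, `flagged_categories`, `moderate`) are hypothetical and exist only for illustration.

```python
import json
import urllib.request

# Assumed URL of the Moderation endpoint; it is not given in this post.
MODERATION_URL = "https://api.openai.com/v1/moderations"


def build_request(text: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the text to classify."""
    # Assumed request body: a JSON object with a single "input" field.
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        MODERATION_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def flagged_categories(response: dict) -> list:
    """Return the sorted names of categories marked true in the first result.

    Assumes a response shaped like:
    {"results": [{"flagged": bool, "categories": {name: bool, ...}}]}
    """
    result = response["results"][0]
    return sorted(name for name, hit in result["categories"].items() if hit)


def moderate(text: str, api_key: str) -> list:
    """Send the text over the network and return the flagged category names."""
    with urllib.request.urlopen(build_request(text, api_key)) as resp:
        return flagged_categories(json.load(resp))
```

An application could call `moderate(user_text, api_key)` before displaying or acting on user-supplied text and withhold the content whenever the returned list is non-empty. Keeping the parsing in `flagged_categories` separate from the network call makes the decision logic easy to test against a stored sample response.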
