Anthropic details Constitutional Classifiers, a system that aims to guard AI models against jailbreaks, monitoring both inputs and outputs for harmful content (Cristina Criddle/Financial Times)
Cristina Criddle reports on Constitutional Classifiers, a system developed to protect AI models from jailbreaks by monitoring inputs and outputs ...