Anthropic makes ‘jailbreak’ advance to stop AI models producing harmful results
Anthropic, a San Francisco-based start-up, has developed "constitutional classifiers" to protect its AI models, such as its Claude chatbot, from being manipulated into producing harmful content, as tech giants like Microsoft and Meta work to address the risks posed by AI technology. The new system acts as a safeguard layer on top of language models, monitoring both inputs and outputs for dangerous information. Anthropic offered bug bounties to testers who attempted to bypass the security measures, with the system rejecting...
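To illustrate the "safeguard layer" idea described above, here is a minimal conceptual sketch of an input/output classifier wrapper. This is not Anthropic's implementation; the scorer callables, the `ClassifierGuard` name, and the threshold are hypothetical and purely illustrative.

```python
# Conceptual sketch of a safeguard layer that screens prompts and completions.
# NOT Anthropic's constitutional classifiers; names and values are assumptions.

from dataclasses import dataclass
from typing import Callable


@dataclass
class ClassifierGuard:
    score_input: Callable[[str], float]   # hypothetical harm scorer for prompts
    score_output: Callable[[str], float]  # hypothetical harm scorer for completions
    threshold: float = 0.5                # illustrative cutoff for flagging content

    def generate(self, prompt: str, model: Callable[[str], str]) -> str:
        # Screen the incoming prompt before it reaches the language model.
        if self.score_input(prompt) >= self.threshold:
            return "Request declined: the prompt was flagged as potentially harmful."
        completion = model(prompt)
        # Screen the model's output before returning it to the user.
        if self.score_output(completion) >= self.threshold:
            return "Response withheld: the output was flagged as potentially harmful."
        return completion
```

In this sketch the guard sits entirely outside the language model: the underlying model is unchanged, and the classifiers simply decide whether a given request or response is allowed to pass through.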