Claude Opus 3
X @Anthropic
Anthropic · 2025-07-08 22:11
Our new study found that only 5 of 25 models showed higher compliance in the “training” scenario. Of those, only Claude Opus 3 and Sonnet 3.5 showed >1% alignment-faking reasoning. We explore why these models behave differently, and why most models don't show alignment faking. https://t.co/24K0iNxDpQ ...