X @Anthropic
Anthropic·2025-10-06 17:15

Last week we released Claude Sonnet 4.5. As part of our alignment testing, we used a new tool to run automated audits for behaviors like sycophancy and deception. Now we’re open-sourcing the tool to run those audits. https://t.co/cCJGNaVFrl ...