X @Anthropic - Reportify

New Anthropic Research: A new set of evaluations for sabotage capabilities. As models gain more agentic abilities, we need to get smarter in how we monitor them. We’re publishing a new set of complex evaluations that test for both sabotage and sabotage-monitoring capabilities. https://t.co/spQHC2BNAd ...