X @Anthropic - Reportify

Current models are not effective saboteurs, nor are they good monitors. But our evals are designed for the future: smarter AIs will do better on these tasks. Our evals will be useful to help developers assess their capabilities. Read the full paper: https://t.co/PVuiEl1z9Z ...