X @Anthropic
Anthropic·2025-07-24 17:22
This project was an Anthropic Alignment Science × Interpretability collaboration. To support further research, we're releasing an open-source replication of our evaluation agent and materials for our other agents: https://t.co/0fZSRIMeqF ...