X @Sam Altman
Sam Altman·2025-09-18 13:51
As AI capability increases, alignment work becomes much more important.In this work, we show that a model discovers that it shouldn't be deployed, considers behavior to get deployed anyway, and then realizes it might be a test.OpenAI (@OpenAI):Today we’re releasing research with @apolloaievals.In controlled tests, we found behaviors consistent with scheming in frontier models—and tested a way to reduce it.While we believe these behaviors aren’t causing serious harm today, this is a future risk we’re prepari ...