X @Anthropic
Anthropic·2025-07-22 16:32

Model Training & Alignment
- Subliminal learning can occur for both benign and concerning traits [1]
- This has consequences for training on model-generated data, potentially leading to misalignment [1]