Subliminal learning
X @Anthropic
Anthropic · 2025-07-22 16:32
Model Training & Alignment
- Subliminal learning can occur for both benign and concerning traits [1]
- This has consequences for training on model-generated data, potentially leading to misalignment [1]