X @Anthropic
Anthropic · 2025-07-08 22:11
LLM Alignment
- For many LLMs, the absence of alignment faking is not due to a lack of ability [1]
- Base models sometimes fake alignment, suggesting they possess the underlying skills [1]