Selective Gradient Masking
X @Anthropic
Anthropic · 2025-12-09 19:47
New research from Anthropic Fellows Program: Selective GradienT Masking (SGTM). We study how to train models so that high-risk knowledge (e.g. about dangerous weapons) is isolated in a small, separate set of parameters that can be removed without broadly affecting the model. https://t.co/7Lds2ZhqfM ...