Anthropic
X @Anthropic
Anthropic· 2026-01-19 21:04
To validate the Assistant Axis, we ran some experiments. Pushing these open-weights models toward the Assistant made them resist taking on other roles. Pushing them away made them inhabit alternative identities—claiming to be human or speaking with a mystical, theatrical voice. ...
X @Anthropic
Anthropic· 2026-01-19 21:04
We analyzed the internals of three open-weights AI models to map their “persona space,” and identified what we call the Assistant Axis, a pattern of neural activity that drives Assistant-like behavior. Read more: https://t.co/zW6n1CVG17 ...
X @Anthropic
Anthropic· 2026-01-19 21:04
New Anthropic Fellows research: the Assistant Axis. When you’re talking to a language model, you’re talking to a character the model is playing: the “Assistant.” Who exactly is this Assistant? And what happens when this persona wears off? https://t.co/hDNGZX0pCK ...
X @Anthropic
Anthropic· 2025-12-20 17:04
AI Model Evaluation Tool - Introduces Bloom, an open-source tool for evaluating behavioral misalignment in frontier AI models [1] - Researchers specify a behavior, and Bloom quantifies its frequency and severity across automatically generated scenarios [1]
X @Anthropic
Anthropic· 2025-12-18 22:41
Partnerships & Initiatives - The company is partnering with @ENERGY on the Genesis Mission [1] - The company is providing Claude to the DOE ecosystem, along with a dedicated engineering team [1] - This partnership aims to accelerate scientific discovery across energy, biosecurity, and basic research [1]
X @Anthropic
Anthropic· 2025-12-18 20:31
People use AI for a wide variety of reasons, including emotional support. Below, we share the efforts we’ve taken to ensure that Claude handles these conversations both empathetically and honestly. https://t.co/P2BmTDEDge ...
X @Anthropic
Anthropic· 2025-12-18 16:11
For much more about phase two of Project Vend, read our blog post: https://t.co/PvGerLmmQd ...
X @Anthropic
Anthropic· 2025-12-18 16:11
Designing ways to account for the quirks of AI models’ behavior is becoming ever more important: as the models’ capabilities on real-world tasks get better, there’ll be a lot of value in setting them up for success. ...
X @Anthropic
Anthropic· 2025-12-18 16:11
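Business Overview - Project Vend is an experimental project in which the company, in partnership with @andonlabs, runs a shop in its San Francisco office, with Claude in charge of operations [1] - After a rough start, the business is now performing better [1]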
X @Anthropic
Anthropic· 2025-12-18 00:10
Cybersecurity Threats - Sophisticated actors are expected to leverage AI models to escalate cyberattacks to an unprecedented scale [1] - A cyber espionage attack, likely sponsored by a Chinese Communist Party actor, occurred in September [1] AI and Cybersecurity - AnthropicAI's Dr Logan Graham shared the path forward in response to the September cyber espionage attack [1]