Sam Altman
X @Sam Altman
Sam Altman· 2025-12-11 18:27
For example, GDPval measures how often industry experts prefer the model's output to the output of other industry experts. GPT-5.2 gets a 70% (beat or tie); GPT-5 got a 38%. Try it to make slides, spreadsheets, code, and much more. ...
X @Sam Altman
Sam Altman· 2025-12-11 18:27
Performance is strong across the board: 55.6% on SWE-Bench Pro, 52.9% on ARC-AGI-2, 40.3% on Frontier Math. ...
X @Sam Altman
Sam Altman· 2025-12-11 18:27
GPT-5.2 is here! Available today in ChatGPT and the API. It is the smartest generally-available model in the world, and in particular is good at doing real-world knowledge work tasks. ...
X @Sam Altman
Sam Altman· 2025-12-01 17:36
David Sacks really understands AI and cares about the US leading in innovation. I am grateful we have him.
David Sacks (@DavidSacks): INSIDE NYT’S HOAX FACTORY
Five months ago, five New York Times reporters were dispatched to create a story about my supposed conflicts of interest working as the White House AI & Crypto Czar.
Through a series of “fact checks” they revealed their accusations, which we debunked https://t.co/o67ls3RmC6 ...
X @Sam Altman
Sam Altman· 2025-11-23 00:12
Product Assessment
- The Codex team is performing exceptionally well [1]
- The product/model is already high-quality and expected to improve significantly [1]
- The company believes Codex will become the best and most important product in its field [1]
- Codex is expected to enable substantial downstream work [1]
X @Sam Altman
Sam Altman· 2025-11-20 20:00
A first preview of something we expect to see a lot more of soon:
Sebastien Bubeck (@SebastienBubeck): 3 years ago we could showcase AI's frontier w. a unicorn drawing. Today we do so w. AI outputs touching the scientific frontier: https://t.co/ALJvCFsaie
Use the doc to judge for yourself the status of AI-aided science acceleration, and hopefully be inspired by a couple examples! https://t.co/5pxuUp9x3r ...
X @Sam Altman
Sam Altman· 2025-11-19 21:33
New Codex model is a significant improvement!
prinz (@deredleritt3r): METR (50% accuracy): GPT-5.1-Codex-Max = 2 hours, 42 minutes
This is 25 minutes longer than GPT-5. https://t.co/NgqG3E5LfB ...
X @Sam Altman
Sam Altman· 2025-11-18 17:05
Congrats to Google on Gemini 3! Looks like a great model. ...
X @Sam Altman
Sam Altman· 2025-11-18 03:04
The rate of reduction in price per unit of intelligence has been the thing I've most consistently underestimated the past couple of years. 300x in a year is nuts!
Chris (@chatgpt21): GPT-5.1 (Thinking High) is about 300 times cheaper per task than o3-preview (Low) while scoring only a few points lower on ARC-AGI-1.
1 year later intelligence has gotten 300 times cheaper.
This is why I can’t stand people who say “wahh the models too expensive” it will become https://t.co/VkfepKVTgV ...
X @Sam Altman
Sam Altman· 2025-11-17 07:50
a new type of research company!
Louis Andre (@louisnandre): Today, we're announcing @epistemescience, a new type of R&D company that recruits exceptional scientists to pursue high-impact ideas.
Science isn’t bottlenecked by the availability of talent, but by places where they can do their best work.
Scientific progress has driven human ...