Gemini 3.1 Pro Benchmarks
Matthew Berman·2026-02-20 17:13
Crazy. Look at this. Humanity's Last Exam with no tools scores 44.4%. It absolutely dominates on ARC-AGI-2. If you're not familiar, you have an initial pattern. It gives you the solved version of it. Then it gives you a new version and says, "Solve it." 94.3%. We have a fantastic coding score with SWE-bench Verified, 80.6%, basically tying Opus 4.6. And six. ...