Model Performance & Benchmarks
- Gemini 3 surpasses previous frontier models in benchmarks, demonstrating significant advancements in AI capabilities [1]
- Gemini 3 achieves 45.8% with code execution and search on Humanity's Last Exam, compared to Gemini 2.5 Pro at 21%, Claude Sonnet 4.5 at 13%, and GPT-5.1 at 26.5% [2]
- On the Vending Bench benchmark, Gemini 3's net worth reached $5,478.16, significantly outperforming Claude Sonnet 4.5 at $3,800 [4]
- Gemini 3 Deep Think scores 41% on Humanity's Last Exam, compared to Gemini 3 Pro at 37.5%, Claude Sonnet 4.5 at 13%, GPT-5 Pro at 30%, and GPT-5.1 at 26.5% [9][10]
- Gemini 3 Deep Think achieves 45.1% on ARC-AGI-2 visual reasoning puzzles, a 10x improvement over Gemini 2.5 Pro [12]

Enterprise Applications & Features
- Box.com's benchmark shows a 22-point performance increase for Gemini 3 Pro versus Gemini 2.5 Pro, with scores of 85% and 63% respectively [6]
- Industry subsets in Box.com's benchmark show significant performance jumps: Healthcare and Life Sciences (45% to 94%), Media and Entertainment (47% to 92%), and Financial Services (51% to 60%) [6]
- Gemini 3 excels in complex multi-step reasoning and task automation, as highlighted by Box's new benchmark [7]
- Gemini 3 supports multiple modalities, including text, images, video, audio, and code, with a unique focus on video understanding [12]
- Gemini 3 can analyze YouTube videos frame by frame, understanding the content in detail [13]

Google Integration & New Products
- Gemini 3 is integrated into Google Search, dynamically generating user interfaces based on user queries [17]
- Google launched Antigravity, a coding platform forked from VS Code that supports Gemini models as well as other models such as GPT-OSS and Anthropic's Sonnet [20]
- The updated Gemini app features a Gemini Agent capability, enabling the AI to complete real tasks on the user's behalf and create dynamic UIs [24]

Model Architecture & Specifications
- Gemini 3 is a brand new foundation model, not a modification of a prior model [27]
- The model accepts text, images, audio, and video files as inputs, with a context window of up to 1 million tokens and up to 64,000 output tokens [28]
- Gemini 3 is a sparse mixture-of-experts model that runs on Google's custom TPUs for both pre-training and inference [28]
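
To make the token limits quoted in [28] concrete, here is a minimal Python sketch of a pre-flight check for a request. The function name `fits_limits` and the constants are hypothetical and not part of any Google API. The sketch also assumes the input and output limits apply independently, which the source does not state.

```python
# Limits as quoted in the summary [28]; all names here are hypothetical.
MAX_INPUT_TOKENS = 1_000_000   # context window of up to 1 million tokens
MAX_OUTPUT_TOKENS = 64_000     # up to 64,000 output tokens


def fits_limits(input_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if a request stays within both quoted limits.

    Assumption: the two limits are checked independently; the source
    does not say whether output tokens count against the context window.
    """
    if input_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        input_tokens <= MAX_INPUT_TOKENS
        and requested_output_tokens <= MAX_OUTPUT_TOKENS
    )


if __name__ == "__main__":
    # A long video transcript plus a modest answer budget.
    print(fits_limits(250_000, 8_000))
```

A real client would take the token counts from the provider's own tokenizer or token-counting endpoint rather than estimating them.
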
Gemini 3 is the best model on earth