Gemini 2.5 PRO

Grok 4 takes the internet by storm, passes the bouncing-ball coding test; Epic founder: "This is AGI"
猿大侠· 2025-07-12 01:45
By Keleixi (克雷西), reporting from Aofeisi. QbitAI | WeChat official account: QbitAI. Less than a day after its release, Musk's Grok 4 already has users going wild. For example, one user reports that Grok-4 has passed the well-known hexagon bouncing-ball vibe-coding test. As the hexagon keeps rotating, the balls drop through the opening one after another. Users hunting for bugs with a microscope noticed that the balls pass through the walls when they return to the center, but the author says this is intentional. Plutus (@PlutusCosmos), 17 hours ago: "The balls penetrate the walls when the go back to the center. Is it intended?" Flavio Adamo (@flavioAd), 17 hours ago: "yes" SoyTeslike (@soyteslike), 16 hours ago: "damn, already screenshotted but it wa ...
AI Engineering with the Google Gemini 2.5 Model Family - Philipp Schmid
AI Engineer· 2025-07-11 19:00
A hands-on workshop on using Gemini 2.5 Pro in combination with agentic tooling and MCP servers. About Philipp Schmid: Philipp Schmid is a Senior AI Developer Relations Engineer at Google DeepMind working on Gemini and Gemma, with the mission of helping every developer and builder create and benefit from AI in a responsible way. Recorded at the AI Engineer World's Fair in San Francisco. Stay up to date on our upcoming events and content by joining our newsletter here: https://www.ai.engineer/newsletter ...
Glidelogic Corp. Announces Release of First AI-Generated Novel "The Thirteenth Proposal"
Globenewswire· 2025-07-11 17:02
LAS VEGAS, July 11, 2025 (GLOBE NEWSWIRE) -- via IBN -- Glidelogic Corp. (USOTC: GDLG, "Glidelogic", "the Company") today announced the release of its first fully AI-generated novel titled The Thirteenth Proposal. The new novel, a political thriller of approximately 80,000 English words (with a 140,000-character Chinese edition), was written entirely by artificial intelligence systems. The Thirteenth Proposal, created by Glidelogic's proprietary Novagen AI based on outputs from Google Gemini 2.5 Pro and dev ...
Grok 4 takes the internet by storm, passes the bouncing-ball coding test; Epic founder: "This is AGI"
量子位· 2025-07-11 07:20
By Keleixi (克雷西), reporting from Aofeisi. QbitAI | WeChat official account: QbitAI. Less than a day after its release, Musk's Grok 4 already has users going wild. For example, one user reports that Grok-4 has passed the well-known hexagon bouncing-ball vibe-coding test. As the hexagon keeps rotating, the balls drop through the opening one after another. Users hunting for bugs with a microscope noticed that the balls pass through the walls when they return to the center, but the author says this is intentional. Plutus (@PlutusCosmos), 17 hours ago: "The balls penetrate the walls when the go back to the center. Is it intended?" Flavio Adamo (@flavioAd), 17 hours ago: "yes" SoyTeslike (@soyteslike), 16 hours ago: "damn, already screenshotted but it ...
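The world's strongest AI model? Musk releases Grok 4! Fund 589520, heavily invested in the domestic AI industry chain, draws 39.22 million yuan in a single day!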
Xin Lang Ji Jin· 2025-07-11 01:17
Group 1: AI Model Development
- xAI's Grok 4 achieved an accuracy rate of 25.4% on "Humanity's Last Exam," surpassing Google's Gemini 2.5 Pro at 21.6% and OpenAI's o3 at 21% [1]
- The emergence of multi-modal large models is expected to create significant investment opportunities in both computational power and applications [1]
- The AI sector is likely to see further catalytic events in the second half of the year, including the release of new models and platforms from companies like OpenAI and NVIDIA [1]
Group 2: Investment Trends
- The AI investment trend is gaining momentum, particularly following NVIDIA's market capitalization reaching $4 trillion [2]
- The Huabao ETF, focused on the domestic AI industry chain, saw a net inflow of 39.22 million yuan on July 10, with 8 of the last 10 trading days showing net inflows totaling 50.65 million yuan [2]
- Analysts emphasize the importance of experiencing the benefits of the AI era and recognizing the long-term investment value in the rapidly evolving AI technology landscape [4]
Group 3: Domestic AI Development
- Domestic AI model DeepSeek has made significant advancements, breaking through overseas computational barriers and establishing a foundation for local AI companies [5]
- The Huabao ETF is strategically positioned in the domestic AI industry chain, benefiting from the acceleration of AI integration in edge computing and software [5]
How much substance is there in Musk's newly released "world's strongest model"?
第一财经· 2025-07-10 15:07
Core Viewpoint
- The article discusses the launch of Grok 4, an AI model developed by xAI, which is claimed to be the most powerful AI model globally, surpassing existing top models in various benchmarks [1][2].
Group 1: Grok 4 Performance
- Grok 4 achieved a perfect score in the AIME25 mathematics competition and scored 26.9% on "Humanity's Last Exam" (HLE), which consists of 2,500 expert-level questions across multiple disciplines [1].
- Grok 4's score on the Artificial Analysis Intelligence Index reached 73, making it the top-ranked model, ahead of OpenAI's o3 and Google's Gemini 2.5 Pro, both at 70 [2].
- Grok 4 set a historical high score of 24% in the HLE, surpassing the previous record of 21% held by Google's Gemini 2.5 Pro [5].
Group 2: Development and Training
- Grok 4's training volume is 100 times that of Grok 2, with over 10 times the computational power invested in the reinforcement learning phase compared to other models [5].
- The subscription fee for Grok 4 is set at $30 per month, while a more advanced version, Grok 4 Heavy, costs $300 per month [5].
Group 3: Financial Aspects and Funding
- xAI has raised a total of $10 billion in its latest funding round, which includes $5 billion in debt and $5 billion in equity, bringing its total funding since 2024 to $22 billion [10].
- Despite the substantial funding, xAI faces high operational costs, reportedly spending $1 billion per month, with only $4 billion in cash remaining as of March 2025 [11].
- xAI's projected revenue for 2025 is $5 billion, significantly lower than OpenAI's expected $12.7 billion, indicating a lag in commercial progress [11].
Group 4: Future Outlook
- xAI aims to leverage the vast data from X to train its models, potentially avoiding high data costs, with a goal to achieve profitability by 2027 [12].
- Upcoming releases include a programming model in August, a multi-agent model in September, and a video generation model in October, although previous delays raise questions about these timelines [12].
Grok 4 the "strongest model in the universe"? The AI race enters "Musk tempo"
21 Shi Ji Jing Ji Bao Dao· 2025-07-10 14:09
Core Insights
- Musk's xAI has launched its latest AI model, Grok 4, which is touted as the "strongest model in the universe" and is claimed to outperform competitors in various academic assessments [2][5][6]
- Grok 4 achieved a 38.6% accuracy rate on "Humanity's Last Exam," surpassing Google's Gemini 2.5 Pro and OpenAI's o3 [2][5]
- The model's training involved a significant increase in computational resources, utilizing a supercomputing center with 100,000 H100 GPUs, resulting in a training volume 10 times that of Grok 3 and 100 times that of Grok 2 [2][6]
Performance Metrics
- Grok 4 demonstrated exceptional reasoning capabilities, scoring 88.9% on the Graduate-Level Google-Proof Q&A benchmark (GPQA) and achieving a perfect score on the American Invitational Mathematics Examination (AIME25) [6]
- In a commercial simulation task, Grok 4 achieved an average net worth of $4,684.15, double that of its closest competitor, showcasing its long-term planning and multi-step reasoning abilities [6]
Business Strategy
- xAI has introduced a premium subscription plan for Grok 4 at $300 per month, which is 50% more expensive than OpenAI's top-tier subscription [7]
- The API pricing is also aggressive, charging $3 per million tokens for input and $15 per million tokens for output, reflecting the high training costs associated with Grok 4 [7]
Future Developments
- Musk plans to integrate Grok 4 with humanoid robots and aims to create high-precision physical simulators, including black hole simulations, to test AI against physical laws [7][8]
- Grok 4 is expected to be embedded in Tesla's latest firmware, potentially serving as the brain for voice assistance and autonomous driving [3][7]
Industry Context
- The AI arms race is intensifying, with Musk's aggressive pace in advancing AI models and applications positioning xAI as a formidable competitor in the market [9]
- The integration of various technologies, including autonomous driving and commercial space ventures, is creating a closed-loop system that enhances the capabilities of Grok 4 [9]
Musk releases Grok 4, billed as the "world's strongest AI model"
Zheng Quan Shi Bao Wang· 2025-07-10 11:44
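Fresh funding in one hand, a new large model in the other: Grok 4, built by Musk at great expense, has officially arrived! On July 10, xAI, the artificial intelligence company owned by Tesla founder and CEO Elon Musk, officially released Grok 4. During the nearly one-hour launch livestream, xAI released two models in the series: Grok 4 (the single-agent version) and Grok 4 Heavy (the multi-agent version). The latter supports four agents thinking in parallel, cross-checking and coordinating with one another during reasoning, and draws on larger-scale computing resources to complete more complex and more precise tasks. As the fourth major update since xAI launched its first-generation large model in 2023, Grok 4 achieved 25.4% accuracy on Humanity's Last Exam, exceeding the 21.6% of Google's Gemini 2.5 Pro and the 21% of OpenAI's o3 (high), and has been called the "world's strongest AI model". According to xAI researchers, Humanity's Last Exam contains 2,500 questions in total, covering mathematics, the natural sciences, engineering, and all the humanities. The questions are wide-ranging and pitched at doctoral or even advanced research level, making them extremely challenging, yet Grok 4 scores well on them. In addition, according to the launch event, on GPQA, AIME25, LCB (Jan-May), HMMT25 ...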
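Musk releases Grok 4, the "world's strongest AI model", saying this is the first time AI can solve difficult, complex engineering problems in the real world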
Sou Hu Cai Jing· 2025-07-10 11:42
Core Insights
- Musk announced the release of Grok 4, claiming it is the first AI capable of solving complex engineering problems whose answers cannot be found on the internet or in books [4]
Group 1: Product Features
- Grok 4 is a reasoning model that supports both text and image inputs, function calls, and structured outputs [2]
- It has a context window of 256K tokens, which is smaller than Gemini 2.5 Pro's 1M tokens but larger than those of Claude 4 Sonnet and Opus (200K tokens) and R1 0528 (128K tokens) [2]
- The pricing for Grok 4 is similar to Grok 3, at $3/$15 per million input/output tokens, with cached input tokens priced at $0.75 per million [2]
Group 2: Performance Metrics
- Grok 4 outputs 75 tokens per second, which is slower than o3 (188 tokens/s), Gemini 2.5 Pro (142 tokens/s), and Claude 4 Sonnet Thinking (85 tokens/s), but faster than Claude 4 Opus Thinking (66 tokens/s) [3]
- It ranks first in various benchmarks such as Humanity's Last Exam, MMLU-Pro, AIME 2024, AIME 25, and GPQA, outperforming OpenAI's o3 and Google's Gemini 2.5 Pro [3]
Group 3: Future Developments
- xAI announced upcoming products, including an AI programming model set to launch in August, a multimodal agent in September, and a video generation model in October [5]
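To make the per-token pricing and throughput figures in this summary concrete, the following is a minimal Python sketch of the arithmetic. The constants are taken from the summary ($3, $15, and $0.75 per million tokens; 75 output tokens per second). The function names and the assumption that cached and uncached input tokens are billed separately are illustrative only and do not describe any official SDK or billing rule.

```python
# Figures quoted in the summary (USD per million tokens, tokens per second).
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00
CACHED_INPUT_PRICE_PER_M = 0.75
OUTPUT_TOKENS_PER_SECOND = 75


def estimate_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the cost in USD of one request.

    Assumption (hypothetical): cached input tokens are counted separately
    from `input_tokens` and billed at the cached rate.
    """
    total = (
        input_tokens * INPUT_PRICE_PER_M
        + cached_input_tokens * CACHED_INPUT_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total / 1_000_000


def estimate_generation_seconds(output_tokens):
    """Estimate streaming time at the quoted output speed."""
    return output_tokens / OUTPUT_TOKENS_PER_SECOND


if __name__ == "__main__":
    # 100K uncached input tokens and 2K output tokens.
    print(f"cost: ${estimate_cost(100_000, 2_000):.2f}")
    # Time to stream 1,500 output tokens at 75 tokens/s.
    print(f"time: {estimate_generation_seconds(1_500):.1f} s")
```

Under these assumptions, output tokens dominate the bill for long responses, since each costs five times as much as an uncached input token.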
Musk leads the xAI team in releasing Grok 4: how much substance is there in the "world's strongest model"?
Di Yi Cai Jing· 2025-07-10 08:19
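The launch started about an hour later than scheduled, and Musk looked somewhat haggard. At 12 noon on July 10, after the delay of the previous-generation model and the postponement of this livestream, Elon Musk finally appeared to open the Grok 4 launch event. He looked somewhat haggard on screen; a week earlier he had mentioned "polishing the model through the night with the xAI team", and the release appears to have been long in preparation. In its post, the company called Grok 4 "the world's most powerful AI model", while Musk said during the livestream that "Grok 4 is smarter than human graduate students in almost every subject". How much substance is there to these claims? The data show that Grok 4 performs strongly on a number of benchmarks, surpassing existing top models. On the AIME25 math competition, Grok 4 achieved a perfect score; on Humanity's Last Exam (HLE), it scored 26.9% without tools. The test contains 2,500 expert-level questions spanning more than a hundred subjects. The evaluation firm Artificial Analysis obtained early access and published Grok 4 benchmark results after the launch event. It noted that Grok 4's score on the Artificial Analysis Intelligence Index reached 73, "the first time our Intelligence Index has ranked xAI first". By the numbers, Grok 4 leads OpenAI's o3 (70), Google's Gemini 2.5 Pro (70), Anthropic's Claude 4 ...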