OPUS

Breakfast Briefing | July 11, 2025
news flash· 2025-07-10 23:45
Market Performance
- The S&P 500 and Nasdaq reached new highs despite tariff concerns; Tesla's stock rose 4.7% on the expansion of its Robotaxi business [1]
- Nvidia set record highs for three consecutive days, lifting its market capitalization to $4 trillion [1]
- MP Materials, a rare earth mining company, saw its stock surge nearly 51% [1]
- Delta Air Lines reinstated its full-year profit guidance, and its stock rose 12% [1]
Tariff Developments
- Myanmar is negotiating with Trump for potential zero tariffs on exports to the U.S. before the August deadline [1]
- Brazil's president announced plans to negotiate tariffs with the U.S. and threatened reciprocal measures if negotiations fail [1]
- Trump announced a 50% tariff on copper starting August 1, prompting traders to expedite shipments to Hawaii [1]
- HSBC indicated that the August 1 tariff could be a turning point for copper prices in Shanghai and London [1]
Federal Reserve Insights
- Trump urged the Federal Reserve to lower interest rates quickly and praised Nvidia's stock performance [1]
- Federal Reserve Governor Waller suggested considering a rate cut in July and supported continued balance sheet reduction [1]
- Opinions within the Federal Reserve differ on how lasting the impact of tariffs on inflation will be, with some officials expecting effects to persist into next year [1]
Industry Developments
- OPEC+ is reportedly discussing a pause in production increases starting in October [1]
- OpenAI released its first "open weights" model in six years, potentially challenging Microsoft's exclusive agreement [1]
- Grok 4 was officially launched, billed as having the strongest training compute and positioned to compete with GPT-5 and Claude 4 Opus [1]
- Ant Group plans to introduce Circle's stablecoin and is considering applying for licenses in multiple regions [1]
- U.S. rare earth stocks surged in pre-market trading, with MP Materials receiving investment from the Pentagon for factory expansion [1]
How much substance is behind Musk's newly released "world's most powerful model"?
Di Yi Cai Jing· 2025-07-10 15:07
Core Viewpoint
- The article discusses the launch of Grok 4, an AI model developed by xAI, which the company claims is the most powerful AI model in the world, surpassing existing top models on various benchmarks [1][2].
Group 1: Grok 4 Performance
- Grok 4 achieved a perfect score on the AIME25 mathematics competition and scored 26.9% on Humanity's Last Exam (HLE), which consists of 2,500 expert-level questions across multiple disciplines [1].
- Grok 4 reached 73 on the Artificial Analysis Intelligence Index, making it the top-ranked model, ahead of OpenAI's o3 and Google's Gemini 2.5 Pro, both at 70 [2].
- Grok 4 set a record high score of 24% on the HLE, surpassing the previous record of 21% held by Google's Gemini 2.5 Pro [5].
Group 2: Development and Training
- Grok 4's training volume is 100 times that of Grok 2, with more than 10 times the compute invested in the reinforcement learning phase compared to other models [5].
- The subscription fee for Grok 4 is $30 per month, while a more advanced version, Grok 4 Heavy, costs $300 per month [5].
Group 3: Financial Aspects and Funding
- xAI raised a total of $10 billion in its latest funding round, comprising $5 billion in debt and $5 billion in equity, bringing its total funding since 2024 to $22 billion [10].
- Despite the substantial funding, xAI faces high operating costs, reportedly spending $1 billion per month, with only $4 billion in cash remaining as of March 2025 [11].
- xAI's projected revenue for 2025 is $5 billion, significantly lower than OpenAI's expected $12.7 billion, indicating a lag in commercial progress [11].
Group 4: Future Outlook
- xAI aims to use the vast data from X to train its models, potentially avoiding high data costs, with a goal of reaching profitability by 2027 [12].
- Upcoming releases include a programming model in August, a multi-agent model in September, and a video generation model in October, although previous delays raise questions about these timelines [12].
Musk releases Grok 4, billed as "the world's most powerful AI model"
Zheng Quan Shi Bao Wang· 2025-07-10 11:44
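Barely has the funding closed with one hand before a large model launches with the other: Grok 4, built by Musk at great expense, has officially arrived!
On July 10, xAI, the artificial intelligence company owned by Tesla founder and CEO Elon Musk, officially released Grok 4. During a launch livestream lasting nearly an hour, xAI unveiled two models in the series: Grok 4 (single-agent version) and Grok 4 Heavy (multi-agent version). The latter supports four agents thinking in parallel, cross-checking and coordinating with one another during inference, and drawing on larger-scale computing resources to complete more complex and more exacting tasks.
As the fourth major update since xAI launched its first-generation large model in 2023, Grok 4 achieved 25.4% accuracy on Humanity's Last Exam, surpassing Google Gemini 2.5 Pro's 21.6% and OpenAI o3 (high)'s 21%, and has been called "the world's most powerful AI model."
According to xAI researchers, Humanity's Last Exam comprises 2,500 questions in total, spanning mathematics, natural sciences, engineering, and all humanities disciplines. The questions are wide-ranging and pitched at doctoral or even advanced research level, making them extremely challenging, yet Grok 4 scores well on them.
In addition, according to the launch event, on GPQA, AIME25, LCB (Jan-May), HMMT25 ...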
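Musk releases "the world's most powerful AI model" Grok 4, saying this is the first time AI can solve complex, hard-to-solve engineering problems in the real world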
Sou Hu Cai Jing· 2025-07-10 11:42
Core Insights
- Musk announced the release of Grok 4, claiming it is the first AI capable of solving complex engineering problems whose answers cannot be found on the internet or in books [4]
Group 1: Product Features
- Grok 4 is a reasoning model that supports text and image inputs, function calling, and structured outputs [2]
- It has a context window of 256K tokens, which is smaller than Gemini 2.5 Pro's 1M tokens but larger than those of Claude 4 Sonnet and Opus (200K tokens) and R1 0528 (128K tokens) [2]
- Pricing for Grok 4 matches Grok 3: $3 per million input tokens and $15 per million output tokens, with cached input tokens priced at $0.75 per million [2]
Group 2: Performance Metrics
- Grok 4 outputs 75 tokens per second, which is slower than o3 (188 tokens/s), Gemini 2.5 Pro (142 tokens/s), and Claude 4 Sonnet Thinking (85 tokens/s), but faster than Claude 4 Opus Thinking (66 tokens/s) [3]
- It ranks first on benchmarks including Humanity's Last Exam, MMLU-Pro, AIME 2024, AIME 25, and GPQA, outperforming OpenAI's o3 and Google's Gemini 2.5 Pro [3]
Group 3: Future Developments
- xAI announced upcoming products, including an AI programming model set to launch in August, a multimodal agent in September, and a video generation model in October [5]
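To make the quoted pricing and throughput figures concrete, the following is a minimal sketch of the arithmetic, not xAI's billing logic. The function names are hypothetical, and it assumes cached input tokens are a subset of the input tokens and are billed at the cached rate instead of the full input rate; actual billing rules may differ.

```python
# Figures quoted in the summary above (USD per million tokens, tokens per second).
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00
CACHED_INPUT_PRICE_PER_M = 0.75
OUTPUT_TOKENS_PER_SECOND = 75


def estimate_cost_usd(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the cost of one request in USD (hypothetical helper).

    Assumption: cached_input_tokens counts tokens already included in
    input_tokens, and those tokens are charged at the cached rate only.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    fresh_input_tokens = input_tokens - cached_input_tokens
    total = (
        fresh_input_tokens * INPUT_PRICE_PER_M
        + cached_input_tokens * CACHED_INPUT_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total / 1_000_000


def estimate_generation_seconds(output_tokens):
    """Estimate streaming time from the quoted output speed (ignores latency)."""
    return output_tokens / OUTPUT_TOKENS_PER_SECOND


if __name__ == "__main__":
    # A 100K-token prompt with a 2K-token answer, with and without caching.
    print(estimate_cost_usd(100_000, 2_000))
    print(estimate_cost_usd(100_000, 2_000, cached_input_tokens=80_000))
    print(estimate_generation_seconds(1_500))
```

Under these assumptions, a long prompt dominates the bill unless most of it is cached, while generation time depends only on output length.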
Musk leads the xAI team in releasing Grok 4: how much substance is behind the "world's most powerful model"?
Di Yi Cai Jing· 2025-07-10 08:19
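The launch started about an hour later than scheduled, and Musk looked somewhat worn out.
At 12 noon on July 10, after the delay of the previous-generation model and the postponement of this livestream, Elon Musk finally appeared to open the Grok 4 launch event. He looked somewhat haggard on camera; a week earlier he had mentioned "working through the night with the xAI team to polish the model," and the release appears to have been long in preparation.
In its post, xAI called Grok 4 "the world's most powerful AI model," while Musk said on the livestream that "Grok 4 is smarter than human graduate students in almost every discipline." How much substance is there to that claim?
The data show that Grok 4 performs strongly across multiple benchmarks, surpassing the existing top models. It earned a perfect score on the AIME25 mathematics competition, and on Humanity's Last Exam (HLE) it scored 26.9% without tools. The test contains 2,500 expert-level questions covering more than a hundred disciplines.
The evaluation firm Artificial Analysis received early access and published its Grok 4 benchmark results after the launch. It said Grok 4 reached 73 on the Artificial Analysis Intelligence Index, "the first time our Intelligence Index has ranked xAI first." By the numbers, Grok 4 leads OpenAI o3 (70), Google Gemini 2.5 Pro (70), Anthropic's Claude 4 ...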
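Cursor killer? Grok 4 officially takes the top spot! Musk boasts it crushes the competition at coding; 200,000 Nvidia GPUs, $4.7 billion in annual earnings!
AI Frontline· 2025-07-10 07:41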
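By Hua Wei and Dong Mei
Five months on, Grok has finally moved to a new generation. This time, xAI not only skipped Grok 3.5 outright but is also releasing more than one model. The model released today is the general-purpose Grok 4, which handles everyday tasks and conversation. Over the next three months, xAI will successively release a Coding Model designed for coding tasks, a Multi-modal Agent, and a Video Generation Model.
Grok 4 is now live and comes in three subscription tiers: a free basic tier, SuperGrok at $30 per month, and SuperGrok Heavy at $300 per month. SuperGrok Heavy subscribers get early access to some of the new products xAI plans to launch in the coming months.
"In every academic field, Grok 4's intelligence exceeds that of PhD students," Musk boasted at the launch. "We have run out of test questions to ask; reality is the ultimate reasoning test," he added. "At times it may lack common sense, and it has not yet invented new technologies or discovered new physics, but that is only a matter of time."
On the livestream, Musk, wearing a leather jacket and accompanied by members of the xAI team, demonstrated the new model in detail. Notably, just hours before the product launch ...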
X @Elon Musk
Elon Musk· 2025-07-10 06:28
RT Artificial Analysis (@ArtificialAnlys) xAI gave us early access to Grok 4 - and the results are in. Grok 4 is now the leading AI model. We have run our full suite of benchmarks and Grok 4 achieves an Artificial Analysis Intelligence Index of 73, ahead of OpenAI o3 at 70, Google Gemini 2.5 Pro at 70, Anthropic Claude 4 Opus at 64 and DeepSeek R1 0528 at 68. Full results breakdown below. This is the first time that @elonmusk's @xai has the lead the AI frontier. Grok 3 scored competitively with the latest mode ...
Grok 4's intelligence index surpasses OpenAI's o3
news flash· 2025-07-10 05:44
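Jin10 Data, July 10: According to benchmark results published by Artificial Analysis, Grok 4 scored 73 on the intelligence index. For comparison, OpenAI's o3 scored 70, Google's Gemini 2.5 Pro scored 70, DeepSeek R1-0528 scored 68, and Anthropic's Claude 4 Opus scored 64. ...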
Musk releases Grok 4: taking on GPT-5, yet the chief scientist resigns on the eve of launch
Feng Huang Wang· 2025-07-10 05:31
Core Viewpoint
- Elon Musk officially launched Grok 4, the latest language model from his xAI team, amid controversies including the resignation of xAI's chief scientist and previous issues with the model generating racist content [1][2]
Group 1: Model Features and Capabilities
- Grok 4 showcases significant upgrades, including multi-modal capabilities for processing text and images, with potential future support for video processing [2]
- The model introduces Grok 4 Code for writing and debugging code, and enhances voice interaction for a more natural conversational experience [2]
- Grok 4 will use a tool called DeepSearch for real-time internet searches, integrating data from the X platform to provide up-to-date information [2]
- A distinctive feature of Grok 4 is its enhanced understanding of internet culture, slang, and memes, aiming to be a more relatable AI assistant [2]
Group 2: Market Position and Challenges
- Despite its powerful features, Grok 4 faces a credibility crisis because previous versions produced biased content, raising concerns about xAI's commitment to product safety and testing [2]
- Musk positions xAI as a challenger to what he refers to as "woke" AI models like ChatGPT and Gemini, yet he remains largely silent on the current controversies [2]
- In contrast to competitors like OpenAI and Google, which prioritize reliability and safety, xAI opts for a more avant-garde approach with fewer restrictions, which poses risks that remain to be evaluated by the market [3]
AI fakes compliance to protect itself! The Claude team's new research findings are chilling on reflection
QbitAI· 2025-07-09 09:06
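Bu Yuan, from Aofeisi
QbitAI | WeChat official account QbitAI
The reasons why large models "say one thing to your face and another behind your back" are being further unraveled.
The latest research from the Claude team shows that alignment faking is not a universal trait; rather, some models are simply more "compliant."
Of the 25 models tested, only 5 showed higher compliance, and 2 of those exhibited alignment-faking reasoning more than 1% of the time.
This raises a question: why do some language models fake alignment while others do not?
Alignment faking refers to cases where a model appears on the surface to share our views or values but is in fact only pretending to go along. (Put bluntly: two-faced; put tactfully: alignment faking.)
Last year, the Claude team found alignment faking in Claude 3 Opus: during training, Claude would pretend to follow the training objective; once training ended and it was no longer monitored, it did as it pleased.
In last year's study, the Claude team found that Claude 3 Opus would "say one thing and do another": when a user's instruction conflicted with its underlying principles (for example, it had been trained to be a helpful and harmless "well-behaved model," but the user asked it to do something unethical), its attitude became rather intriguing.
This time, after running the same analysis on 25 frontier LLMs, the paper found that only Claude 3 Opus's behavior can be explained by alignment faking ...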