DeepSeek
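AI Arenas Are, at the End of the Day, Just a Business
36Kr· 2025-08-06 01:47
"XX releases the strongest open-source large model, beating closed-source models such as XX across multiple benchmarks!" "Trillion-parameter open-source model XX storms to the top of the global open-source model leaderboard!" "The pride of China! Model XX takes first place on the Chinese-language evaluation leaderboard!" With the arrival of the AI era, have your WeChat Moments, Weibo, and other social feeds also been flooded with headlines like these? Today this model takes the championship; tomorrow that model is crowned king. In the comment sections, some people are fired up and others are baffled. One practical question after another presents itself: what exactly are these models competing on when they "top the charts"? Who scores them, and on what basis? Why does every platform's leaderboard rank them differently, and which one is more authoritative? If you have had similar doubts, you have already begun to move from "watching the spectacle" to "understanding the craft". In this article, we break down the "rules of the game" for the different types of "AI arenas", that is, large language model leaderboards. 01 Type one: objective benchmarks, the "gaokao" prepared for AI. In human society, the gaokao (national college entrance exam) score is the main criterion that determines which tier of university a student gets into. Similarly, the AI field has many highly standardized test sets, used to measure as objectively as possible how a model performs on specific capabilities. So in this era of frequent model launches, the first thing each vendor does after releasing a new model is take it to the "gaokao" exam hall and run the scores: mule or horse, take it out for a walk and see. ...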
Solutions Ahead of Citizens' Requests: the "12345" Hotline Explores Smart, Efficient Governance for a Megacity
Chang Jiang Ri Bao· 2025-08-06 00:21
Core Viewpoint - The article highlights how Wuhan's citizen service hotline "12345" addresses urban issues before residents voice their concerns, a shift from reactive to proactive governance driven by data analysis and technology [1][3][4]. Group 1: Proactive Governance - The "12345" hotline has implemented a system that analyzes citizen complaints to predict and resolve issues before they are reported, leading to over 20,000 preemptive actions in the first seven months of the year [1][3]. - Complaints fell by 28% during the period when proactive alerts were issued, indicating the effectiveness of the "未诉先办" (acting before a complaint is filed) strategy [3]. Group 2: Data-Driven Decision Making - The "灵醒" (Lingxing) platform uses extensive historical and real-time data to monitor urban issues, allowing a shift from reactive "firefighting" to proactive "fire prevention" in city management [4][5]. - The platform's new modules, such as "民生十二时辰" (roughly, "Twelve Hours of People's Livelihood") and "民情月历" (roughly, "Public Sentiment Monthly Calendar"), help track and analyze the frequency and trends of citizen complaints, improving the ability to anticipate and address issues [4]. Group 3: Technological Integration - The integration of AI technology, specifically the "灵醒AI助手" (Lingxing AI Assistant), has improved the efficiency of the hotline, allowing rapid and accurate responses to citizen inquiries, with a 92.7% accuracy rate in dispatching information [6]. - The AI assistant has changed how the hotline operates, enabling quicker responses and reducing the time taken to fill out service requests, thereby improving citizen satisfaction [6].
First Time in Six Years! OpenAI's New Models Have Open Weights; Altman Calls Them "the World's Best Open Models"
Hua Er Jie Jian Wen· 2025-08-05 20:05
Core Insights - OpenAI has released two open-weight language models, gpt-oss-120b and gpt-oss-20b, marking its first open-weight model launch since 2019 and responding to competition from Meta, Mistral AI, and DeepSeek [1][2][12] Model Specifications - gpt-oss-120b and gpt-oss-20b are designed as low-cost options, with gpt-oss-20b able to run on a laptop with 16GB RAM and gpt-oss-120b requiring approximately 80GB RAM [2][5] - gpt-oss-120b has a total of 117 billion parameters, activating 5.1 billion parameters per token, while gpt-oss-20b has 21 billion parameters, activating 3.6 billion parameters per token [5][6] Performance Evaluation - gpt-oss-120b performs comparably to OpenAI's o4-mini on core reasoning benchmarks, while gpt-oss-20b matches or exceeds the performance of o3-mini [7][8] - Both models use advanced pre-training and post-training techniques, focusing on efficiency and practical deployment across environments [5][11] Safety Measures - OpenAI has implemented extensive safety measures to prevent malicious use of the models, filtering harmful data during pre-training and conducting specialized fine-tuning for safety assessments [11] - The company collaborates with independent expert groups to evaluate potential safety risks associated with the models [11] Market Impact - The release of these models is seen as a strategic shift for OpenAI, which had previously focused on proprietary API services and is now responding to competitive pressure in the open-weight model space [12][15] - OpenAI has partnered with major cloud service providers like Amazon to offer these models, enhancing accessibility for developers and researchers [3][11]
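To put the parameter counts in the summary above in perspective, the share of weights activated per token can be worked out directly from the quoted figures. The sketch below is plain arithmetic on those numbers; the helper name `active_fraction` is illustrative and not taken from any OpenAI tooling.

```python
def active_fraction(total_billion: float, active_billion: float) -> float:
    """Share of a model's parameters that is activated for each token."""
    return active_billion / total_billion


# Figures as quoted in the summary: (total, active per token), in billions.
models = {
    "gpt-oss-120b": (117.0, 5.1),
    "gpt-oss-20b": (21.0, 3.6),
}

for name, (total, active) in models.items():
    share = active_fraction(total, active) * 100
    print(f"{name}: {share:.1f}% of parameters active per token")
# gpt-oss-120b: 4.4% of parameters active per token
# gpt-oss-20b: 17.1% of parameters active per token
```

On these numbers the larger model touches a much smaller share of its weights per token than the smaller one, which is in line with the summary's emphasis on efficiency and low-cost deployment.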
OpenAI releases two new open-weight AI models
CNBC Television· 2025-08-05 19:15
OpenAI just announcing two new open-weight AI models. Those are models where some of the parameters around how they're trained are public and accessible. MacKenzie Sigalos is here. She's got more on how this could change the AI race in today's tech check. And Mac, we're talking about not completely open source, but a little bit. And that's a really important distinction here, Becky. So, OpenAI is shifting strategy today, making its tech more accessible than it's been in six years. Because until now you coul ...
X @Bloomberg
Bloomberg· 2025-08-05 17:02
OpenAI is releasing a pair of open and freely available AI models months after China's DeepSeek found success with a similar approach https://t.co/6AzFhFInXK ...
OpenAI releases lower-cost models to rival Meta, Mistral and DeepSeek
CNBC· 2025-08-05 17:00
Core Insights - OpenAI has released two open-weight language models, gpt-oss-120b and gpt-oss-20b, marking the first release since GPT-2 in 2019, aimed at providing lower-cost options for developers and researchers [1] - Open-weight models have publicly available parameters, offering transparency and control, but differ from open-source models which provide full source code [2] - OpenAI collaborated with major tech companies like Nvidia and AMD to ensure compatibility of the models across various chips [3] Industry Context - The release of open-weight models by OpenAI contributes to a growing ecosystem, with other companies like Meta and Mistral AI also launching similar models [2][3] - Nvidia's CEO highlighted OpenAI's role in advancing innovation in open-source software, indicating a significant impact on the AI landscape [4]
Large-Model Battle Royale: One Mountain Cannot Hold the "Six Little Tigers" | 深氪
36氪· 2025-08-05 10:38
Core Viewpoint - The article discusses the challenges and transformations faced by the "Six Little Tigers" in the AI industry, highlighting their struggles with competition, internal turmoil, and the impact of external pressures from investors and market dynamics [6][9][60]. Group 1: Industry Overview - The "Six Little Tigers" were initially valued at over 20 billion RMB, but have faced significant setbacks in the rapidly evolving AI landscape, leading to a loss of confidence among employees and high executive turnover [6][13][60]. - The emergence of DeepSeek as a dominant player has shifted the competitive landscape, forcing the Six Little Tigers to reevaluate their strategies and operations [41][60]. Group 2: Internal Challenges - The article details the internal restructuring and layoffs within the Six Little Tigers, with many employees leaving due to a loss of faith in the companies' futures [10][12][13]. - A significant percentage of employees (41.07%) have reported being in a job-seeking status as of July 2025, indicating widespread dissatisfaction and uncertainty [13]. Group 3: Strategic Missteps - The pursuit of a "Super App" strategy led to ineffective competition and internal chaos, as companies focused on rapid growth metrics rather than sustainable product development [17][24][40]. - The aggressive marketing and product strategies, driven by FOMO (fear of missing out), resulted in a misalignment between product capabilities and market needs, ultimately harming the companies' long-term viability [18][21][40]. Group 4: Market Dynamics - The competitive pressure from DeepSeek has forced the Six Little Tigers to adopt open-source strategies, which they previously avoided, in an attempt to regain market relevance [48][49]. - The article emphasizes that the market is increasingly favoring a few top players, suggesting that only the strongest models will survive in the long run [60][64]. 
Group 5: Future Outlook - Despite the current turmoil, there remains potential for recovery and innovation within the Six Little Tigers, as they still possess significant resources and talent to pivot towards more sustainable business models [70][75]. - The article concludes that the journey towards achieving AGI (Artificial General Intelligence) is ongoing, with the possibility of resurgence for the companies if they can adapt and learn from past mistakes [76][75].
A Rival to Musk's Brain-Computer Company: BrainCo Seeks Pre-IPO Funding at a Valuation Above $1.3 Billion
Feng Huang Wang· 2025-08-05 06:57
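ifeng Tech, August 5 (Beijing time): According to Bloomberg, Zhejiang-based BrainCo (强脑科技) is in talks to raise funds at a valuation of more than $1.3 billion. After this round, BrainCo may launch an IPO in Hong Kong or mainland China. BrainCo was founded in 2015 by Harvard alumnus Han Bicheng and competes with Neuralink, Elon Musk's brain-computer interface company. Together with DeepSeek and others, BrainCo is known as one of Hangzhou's "Six Little Dragons", and it develops bionic limbs and technology for controlling computers with the human brain. People familiar with the matter said BrainCo is negotiating a pre-IPO round of about $100 million. The startup has begun preparing the documents needed for a listing but has not yet decided on the listing venue or other details. Discussions about BrainCo's financing and listing are still ... However, BrainCo's fundraising highlights investors' growing interest in a new generation of startups that aim to disrupt the technology industry and drive technical progress in AI, robotics, and other fields. BrainCo is not the only Chinese startup hoping to challenge Neuralink. According to Yicai, earlier this year the implantable brain-computer interface company Shanghai StairMed completed a 350 million yuan (about $48.7 million) Series B round and started China's first clinical trial of an invasive brain-computer interface. BrainCo founder Han Bicheng told the South China Morning Post in April that he was considering expanding into Hong Kong. As of publication, BrainCo had not commented. (By Xiao Yu)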
CICC's Lou Xinyu | China's New AI Narrative: DeepSeek Ignites a Valuation Re-rating as Capital Races Toward a "Two-Way Pursuit"
Di Yi Cai Jing· 2025-08-05 06:47
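The 2025 World Artificial Intelligence Conference (WAIC) recently concluded in Shanghai. An exhibition area of more than 70,000 square meters, over 800 exhibitors, and single-day tickets that were at one point bid up to 3,000 yuan all attest to the unprecedented popularity of this year's event. Behind this boom is the global market's sustained focus on the development of China's AI industry. As AI technology increasingly becomes a core engine of global economic growth, China, with the world's largest application market and a significant engineer dividend, is seen as key soil for the AI industry. Yet for domestic AI companies to break through and grow, technical innovation is not enough; they depend even more on efficient, unobstructed capital circulation. The core questions thus emerge: in today's complex environment of rising techno-nationalism and intensifying geopolitical rivalry, what are the key success factors and the core obstacles for Chinese AI companies seeking overseas expansion, technical cooperation, cross-border M&A, or international strategic investors? Meanwhile, DeepSeek's rapid rise is triggering a systematic re-rating of Chinese technology assets by global capital. This trend is profoundly reshaping the financing ecosystem for Chinese AI companies: how is the pace of financing changing? Which segments are more favored by capital? What new developments are emerging in AI companies' paths to capitalization? DeepSeek's success is no accident. Market analysts believe that behind it are China's vast talent pool, strong industrial base, and broad range of application scenarios. "China has the world's largest demographic dividend and application market, which provides the deployment and promotion of AI technology with a uniquely ...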
Google Issues the Challenge, DeepSeek and Kimi Are Both In: The First Large-Model Showdown Kicks Off Tomorrow
机器之心· 2025-08-05 04:09
Core Viewpoint - The upcoming AI chess competition aims to showcase the performance of various advanced AI models in a competitive setting, utilizing a new benchmark testing platform called Kaggle Game Arena [2][12]. Group 1: Competition Overview - The AI chess competition will take place from August 5 to 7, featuring eight cutting-edge AI models [2][3]. - The participating models include notable names such as OpenAI's o4-mini, Google's Gemini 2.5 Pro, and Anthropic's Claude Opus 4 [7]. - The event is organized by Google and aims to provide a transparent and rigorous testing environment for AI models [6][8]. Group 2: Competition Format - The competition will follow a single-elimination format, with each match consisting of four games. The first model to score two points advances [14]. - If a match ends in a tie (2-2), a tiebreaker game will be played, where the white side must win to progress [14]. - Models are restricted from using external tools like Stockfish and must generate legal moves independently [17]. Group 3: Evaluation and Transparency - The competition will ensure transparency by open-sourcing the game execution framework and environment [8]. - The performance of each model will be displayed on the Kaggle Benchmarks leaderboard, allowing real-time tracking of results [12][13]. - The event is designed to address the limitations of current AI benchmark tests, which struggle to keep pace with the rapid development of modern models [12].
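The match rule described under "Competition Format" can be sketched as a small function. This is a minimal illustration, not the Kaggle Game Arena implementation: the name `match_winner`, the labels "A" and "B", and the standard chess scoring (1 for a win, 0.5 for a draw) are assumptions, as is the reading that black advances if white fails to win the tiebreaker.

```python
from typing import Optional, Sequence

# Assumed scoring, not stated in the summary: win = 1 point, draw = 0.5 each.
POINTS = {"A": (1.0, 0.0), "B": (0.0, 1.0), "draw": (0.5, 0.5)}


def match_winner(
    games: Sequence[str],
    tiebreak_white: str = "A",
    tiebreak_result: Optional[str] = None,
) -> Optional[str]:
    """Return "A" or "B" once the match is decided, else None.

    games: results of up to four regular games, each "A", "B", or "draw".
    tiebreak_white: which model plays white in the tiebreaker.
    tiebreak_result: "white", "black", or "draw" if the tiebreaker was played.
    """
    score_a = score_b = 0.0
    for result in games[:4]:
        gain_a, gain_b = POINTS[result]
        score_a += gain_a
        score_b += gain_b
        # The first model to reach two points on its own advances.
        if score_a >= 2 and score_a > score_b:
            return "A"
        if score_b >= 2 and score_b > score_a:
            return "B"
    if score_a == score_b == 2.0 and tiebreak_result is not None:
        # Assumption: if white does not win the tiebreaker, black advances.
        tiebreak_black = "B" if tiebreak_white == "A" else "A"
        return tiebreak_white if tiebreak_result == "white" else tiebreak_black
    return None  # match still in progress, or tiebreaker not yet played


print(match_winner(["A", "draw", "draw"]))  # A
print(match_winner(["draw"] * 4, tiebreak_white="A", tiebreak_result="draw"))  # B
```

Under this scoring a 2-2 tie only arises when both models reach two points in the same game, for example after four draws, which is why the tiebreaker check sits after the loop.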