Reasoning Models

Six Domestic Reasoning Models Take On OpenAI?
创业邦· 2025-04-30 10:09
The following article is from 光子星球 (ID: TMTweb), by Hao Xin; edited by Wang Pan; image source: Midjourney. "DeepSeek-R1 is like the first satellite the Soviet Union rushed into orbit: a Sputnik moment that opens a new era for AI." Before the 2025 Spring Festival, DeepSeek lit up the sky ahead of the New Year's Eve fireworks. With only a few hours left before the New Year's Eve dinner, an engineer at a domestic cloud-server provider was suddenly pulled into a work group and handed an urgent task: rapidly tune the chips to support the newly released DeepSeek-R1 model. The engineer told us, "From onboarding to completion, the whole process took less than a week." On the second day of the New Year, the head of a vendor doing Agent-to-B business had his phone ringing off the hook; customers' demands were blunt: verify the model's real performance immediately and put deployment on the schedule as soon as possible. Before the holiday there were many large models; after the holiday there was only DeepSeek. DeepSeek-R1 is a watershed that rewrote the narrative logic of China's large models. Starting from OpenAI's release of the GPT-3.5-based ChatGPT in November 2022, China set off on the road of chasing OpenAI. In 2023, large models sprang up like bamboo shoots after rain; there was no AI without a large model, vendors chased one another, and the "battle of a hundred models" took shape. One act followed another, and in 2024 the protagonist became ...
Drop the Thinking Process, and Reasoning Models Can Be Even Stronger | New Research from UC Berkeley and Others
量子位· 2025-04-29 08:02
Experimental data show that under low-resource settings (i.e., fewer tokens or fewer model parameters) or low-latency settings, the NoThinking method outperforms the Thinking method across the board, achieving a better accuracy-latency trade-off than conventional thinking. In other settings, NoThinking can also surpass Thinking on some datasets. By Hengyu, 量子位 | QbitAI. In fact... reasoning models can reason effectively without long stretches of thinking! Counterintuitive? In the common impression, reasoning models owe their strength and their accurate answers precisely to their lengthy reasoning process. That process often takes a long time, which translates into heavy compute consumption. Some studies have tried to improve reasoning efficiency, but most still rely on an explicit thinking process. The latest results from a team at UC Berkeley and the Allen Institute break this stereotype: bypassing the "thinking" stage with a simple prompt and generating the solution directly can be just as effective, or even better. This approach is called the "NoThinking" method. "Thinking" vs. "NoThinking": the research team built the NoThinking method on the DeepSeek-R1-Distill-Qwen model. Let's first distinguish where Thinking and NoThinking differ. Thin ...
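The core trick described above can be sketched in a few lines. This is a minimal illustration, not the paper's verbatim templates: the exact prefill wording and tag names are assumptions, but the idea is that NoThinking pre-fills the model's thinking block with a trivial, already-"finished" thought, so decoding jumps straight to the final answer.

```python
# Sketch of Thinking vs. NoThinking prompting. The dummy thought text and
# <think> tags are illustrative assumptions about the prompt format.

THINK_OPEN, THINK_CLOSE = "<think>", "</think>"

def thinking_prompt(question: str) -> str:
    """Standard prompt: the model fills in its own chain of thought."""
    return f"{question}\n{THINK_OPEN}\n"

def nothinking_prompt(question: str) -> str:
    """NoThinking prompt: the thinking block is pre-filled and closed,
    so generation starts directly at the solution."""
    dummy_thought = "Okay, I have finished thinking."
    return f"{question}\n{THINK_OPEN}\n{dummy_thought}\n{THINK_CLOSE}\n"

prompt = nothinking_prompt("What is 17 * 24?")
# The thinking block is already closed before the model generates anything.
assert prompt.endswith(f"{THINK_CLOSE}\n")
```

Because the closed block consumes only a handful of tokens, the accuracy-latency trade-off reported in the study comes almost entirely from skipping the long chain-of-thought decode.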
Altman Boasts "At or Near Genius Level": OpenAI's Blockbuster Release
Zheng Quan Shi Bao· 2025-04-17 04:31
OpenAI has released its most intelligent reasoning models to date. Today OpenAI released its two latest o-series reasoning models, o3 and o4-mini, the first in the series that can use images in chain-of-thought reasoning, i.e., "think with images." o3 is its most powerful reasoning flagship, leading benchmarks across programming, math, science, and visual perception; o4-mini is a smaller model optimized for fast, efficient, cost-effective reasoning. A debut for visual reasoning, plus the ability to execute tasks autonomously: after the two o-series models were released, OpenAI CEO Sam Altman reposted a tester's tweet and said the new models are "at or near genius level." Altman also said he expects to upgrade o3 to a professional o3-pro within the coming weeks. According to OpenAI, the newly released o3 and o4-mini are trained to think longer before responding. They are the smartest models the company has released so far and represent a major leap in ChatGPT's capabilities. During the half-hour livestreamed launch, OpenAI President Greg Brockman, who had previously been on an extended leave, also appeared as a presenter to introduce and demo o3 and o4-mini. Per the introduction and demos, o3 and o4-mini have the following highlights: first, performance ...
OpenAI to Release "o3 or o4-mini" as Early as This Week: Is "PhD-Level AI" Coming?
硬AI· 2025-04-15 15:34
By Li Xiaoyin, edited by 硬AI. OpenAI's latest model has made a breakthrough: the capacity for original ideas. According to media reports citing people familiar with the matter, OpenAI will release a new model, code-named o3 or o4-mini, as early as this week. The model can not only summarize research papers and solve math problems, but can also independently propose new ideas, connect concepts across fields, and design innovative experiments. The forthcoming model can reportedly draw on physics, engineering, biology, and other fields simultaneously to provide interdisciplinary solutions, achieving results that would normally require scientists to collaborate across disciplines; in effect, a "PhD-level AI." OpenAI President Greg Brockman said at an "AI workshop" event in February: "Our real direction is to develop models that can spend a lot of time thinking hard about important scientific questions. I hope that within the next few years this will make everyone 10x or 100x more productive."
Zhipu Wants to Ambush DeepSeek
Hu Xiu· 2025-03-31 12:39
Core Viewpoint - The article discusses the competitive landscape between Zhipu and DeepSeek, highlighting Zhipu's recent product launches and pricing strategies aimed at challenging DeepSeek's dominance in the AI model market [2][10]. Product Launches - On March 31, Zhipu launched the "AutoGLM Thinking Model" and the inference model "GLM-Z1-Air," claiming that Air can match the performance of DeepSeek's R1 model with only 32 billion parameters, compared to R1's 671 billion parameters [2]. - Zhipu's model is priced at 0.5 yuan per million tokens, roughly 1/30 of DeepSeek's price [2]. Market Dynamics - The article notes a shift in the AI model industry, with some companies, including Baichuan Intelligence and Lingyi Wanwu (01.AI), undergoing strategic pivots or downsizing, indicating a loss of investor patience with AI startups [3][4]. - Despite the challenges, Zhipu continues to secure funding from state-owned enterprises, positioning itself as a leader among the "six small tigers" of the large-model sector [4][6]. Commercialization Challenges - The commercialization of large models remains a significant hurdle for the industry, with Zhipu acknowledging the need to pave the way for an IPO while facing uncertain market conditions [6]. - Zhipu is focusing on penetrating sectors including finance, education, healthcare, and government, while also establishing an alliance with ASEAN countries and Belt and Road nations for collaborative model development [6]. Strategic Positioning - Zhipu's CEO emphasizes the company's commitment to pre-training models, despite industry trends moving toward post-training and inference models [3][12]. - The company aims to balance its technological advancements with commercial strategies, ensuring that the two support each other dynamically [21].
Future Outlook - The article suggests that Zhipu is optimistic about achieving significant growth in 2025, with expectations of a tenfold increase in market opportunities, while maintaining a stable commercialization strategy [22].
Sip Some VC | a16z's Internal Review of DeepSeek: Reasoning-Model Innovation and a New AI Landscape Under a 20x Compute Challenge
Z Potentials· 2025-03-23 05:10
Core Insights - The article discusses the emergence and significance of DeepSeek, a new high-performance reasoning model from China, highlighting its open-source nature and the implications for the AI landscape [3][4][12]. Group 1: DeepSeek Overview - DeepSeek has gained attention for its performance on AI model rankings, raising both interest and concerns [3]. - The model's open-source release of weights and technical details provides valuable insights into reasoning models and their future development [4][12]. Group 2: Training Process - The training of DeepSeek involves three main steps: pre-training on vast datasets, supervised fine-tuning (SFT) with human-generated examples, and reinforcement learning from human feedback (RLHF) [6][9][10]. - The training process is designed to enhance the model's ability to provide accurate and contextually relevant answers, moving beyond simple question-answering to more complex reasoning [11][12]. Group 3: Innovations and Techniques - DeepSeek R1 represents a culmination of various innovations, including self-learning capabilities and multi-stage training processes that improve reasoning abilities [11][13][14]. - The model employs a mixture-of-experts (MoE) architecture, which allows for efficient training and high performance in reasoning tasks [15][30]. Group 4: Performance and Cost - The cost of training DeepSeek V3 was approximately $5.5 million, with the transition to R1 being less expensive due to the focus on reasoning and smaller-scale SFT [27][29]. - The article notes that the performance of reasoning models has significantly improved, with DeepSeek R1 demonstrating capabilities comparable to leading models in the industry [31][35]. Group 5: Future Implications - The rise of reasoning models like DeepSeek indicates a shift in the AI landscape, necessitating increased computational resources for inference and testing [31][34].
- The open-source nature of these models fosters innovation and collaboration within the AI community, potentially accelerating advancements in the field [36][39].
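The mixture-of-experts idea mentioned in Group 3 can be sketched as follows. This is a toy illustration, not DeepSeek's actual configuration: a gating function scores each expert for a given token and only the top-k experts run, so most parameters stay idle on any single forward pass. Expert count, scores, and k are all made up for the example.

```python
# Toy top-k MoE routing: score experts, keep the best k, renormalize weights.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(gate_scores, k=2):
    """Pick the k highest-scoring experts and renormalize their weights."""
    topk = sorted(range(len(gate_scores)), key=lambda i: -gate_scores[i])[:k]
    weights = softmax([gate_scores[i] for i in topk])
    return list(zip(topk, weights))

# 8 experts, but only 2 are activated for this token.
chosen = route([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
assert [i for i, _ in chosen] == [1, 4]          # experts 1 and 4 win
assert abs(sum(w for _, w in chosen) - 1.0) < 1e-9  # weights renormalized
```

This sparsity is what lets an MoE model carry a very large total parameter count while keeping per-token compute, and hence training cost, comparatively low.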
Decoding Nvidia's Latest GPU Roadmap
半导体行业观察· 2025-03-20 01:19
Core Viewpoint - High-tech companies consistently develop roadmaps to mitigate risks associated with technology planning and adoption, especially in the semiconductor industry, where performance and capacity limitations can hinder business operations [1][2]. Group 1: Nvidia's Roadmap - Nvidia has established an extensive roadmap that includes GPU, CPU, and networking technologies, aimed at addressing the growing demands of AI training and inference [3][5]. - The roadmap indicates that the "Blackwell" B300 GPU will enhance memory capacity by 50% and increase FP4 performance to 150 petaflops, compared to previous models [7][11]. - The upcoming "Vera" CV100 Arm processor is expected to feature 88 custom Arm cores, doubling the NVLink C2C connection speed to 1.8 TB/s, enhancing overall system performance [8][12]. Group 2: Future Developments - The "Rubin" R100 GPU will offer 288 GB of HBM4 memory and a bandwidth increase of 62.5% to 13 TB/s, significantly improving performance for AI workloads [9][10]. - By 2027, the "Rubin Ultra" GPU is projected to achieve 100 petaflops of FP4 performance, with a memory capacity of 1 TB, indicating substantial advancements in processing power [14][15]. - The VR300 NVL576 system, set for release in 2027, is anticipated to deliver 21 times the performance of current systems, with a total bandwidth of 4.6 PB/s [17][18]. Group 3: Networking and Connectivity - The ConnectX-8 SmartNIC will operate at 800 Gb/s, doubling the speed of its predecessor, enhancing network capabilities for data-intensive applications [8]. - The NVSwitch 7 ports are expected to double bandwidth to 7.2 TB/s, facilitating faster data transfer between GPUs and CPUs [18]. Group 4: Market Implications - Nvidia's roadmap serves as a strategic tool to reassure customers and investors of its commitment to innovation and performance, especially as competitors develop their own AI accelerators [2][4]. 
- The increasing complexity of semiconductor manufacturing and the need for advanced networking solutions highlight the competitive landscape in the AI and high-performance computing sectors [1][4].
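The percentage deltas quoted in Groups 1 and 2 can be sanity-checked with simple arithmetic. The baseline figures below (B200 memory, the Rubin predecessor's HBM bandwidth) are assumptions supplied for illustration; the article itself states only the deltas and the resulting totals.

```python
# Cross-check the roadmap percentages against assumed baselines.
b200_memory_gb = 192                    # assumed Blackwell B200 baseline
b300_memory_gb = b200_memory_gb * 1.5   # "+50% memory capacity"
assert b300_memory_gb == 288

hbm3e_bandwidth_tbs = 8                             # assumed predecessor bandwidth
r100_bandwidth_tbs = hbm3e_bandwidth_tbs * 1.625    # "+62.5% to 13 TB/s"
assert r100_bandwidth_tbs == 13
```

Both deltas are internally consistent with the totals the roadmap quotes, which is a useful check when extraction errors in spec tables are common.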
From Tencent and Baidu to Carmakers and Brokerages: Why Does "Everything" Want to Integrate DeepSeek?
声动活泼· 2025-03-14 05:45
According to a Guotai Junan research report, demand for integrating DeepSeek's large models surged in a short period after DeepSeek's breakout. Since early February, Tencent, Baidu, Alibaba, and other internet giants have not only put the DeepSeek models on their cloud-computing platforms; in their consumer-facing businesses, even though these giants all have their own large models, they have still connected some of their apps to DeepSeek, including WeChat, with 1.38 billion monthly active users, and Baidu, which had previously worried about AI search hurting its advertising revenue. Beyond the internet giants, dozens of carmakers such as Geely and FAW-Volkswagen, mainstream phone makers such as Huawei, and the three major telecom operators also completed integrations within a short time. Even some banks, brokerages, public mutual funds, and various government departments in parts of the country joined in. Some banks, for example, apply DeepSeek to customer-facing intelligent support; governments in Shenzhen, Guangzhou, Hohhot, Wuxi, and elsewhere have announced DeepSeek integrations into their e-government systems, hoping to raise administrative efficiency and improve the public's experience of government services. So, from car brands to brokerages and even governments, why does everyone want to integrate DeepSeek? ▲ Geely Auto recently announced that its in-house large model has completed deep integration with DeepSeek. | Source: Geely Auto Group WeChat account. A Caixin report points out that Tencent and other giants' active integration of Deep ...
AI Pivots to the "Era of Reasoning Models and Agents": What Does It Mean for AI Trades?
硬AI· 2025-03-10 10:32
If scaling laws continue to hold, stay bullish on AI system-component suppliers (chips, networking gear, etc.) and be cautious about the tech giants forced into sustained, massive capital expenditure. If pre-training scaling stalls, favor the tech giants (because free cash flow will recover) and watch application stocks with large user bases that stand to benefit from falling inference costs. By 硬AI. Still clinging to "bigger is better" AI models? Wall Street bank Barclays' latest research report makes a disruptive prediction: the AI industry is undergoing a "Big Shift." "Reasoning models" and "Agents" will ride the new wave, while the brute-force traditional large models may soon fall out of fashion. At the heart of this shift is the evolution of AI models from rote memorization to reasoning by analogy. In the past we pursued bigger models, more parameters, and ever more training data, convinced that quantity would produce a qualitative leap. Now, Barclays argues, that road may have reached its end. Bottomless compute demand, soaring costs, and returns that fail to keep up: the "arms race" of traditional large models has left many tech giants struggling. Worse still, do users really need models that "big"? In many scenarios a smarter, more reasoning-capable small model delivers more precise, more efficient service. What exactly is going on, and for investors ...
National Supercomputing Internet Platform Launches QwQ-32B API Service, Offering 1 Million Free Tokens
Zheng Quan Shi Bao Wang· 2025-03-09 03:44
Core Viewpoint - The National Supercomputing Internet Platform announced the launch of Alibaba's open-source inference model QwQ-32B API interface service, offering users 1 million free tokens [1] Group 1: Product Launch - The QwQ-32B is the latest inference model released by Alibaba's Qwen team, built on Qwen2.5-32B with reinforcement learning [1] - The API service for QwQ-32B will be available starting this week [1] Group 2: Performance Metrics - According to official benchmark results, QwQ-32B performs comparably to DeepSeek-R1 on the AIME24 assessment set for mathematical capabilities and significantly outperforms o1-mini and similarly sized R1 distilled models in code evaluation on LiveCodeBench [1]