Jensen Huang's New-Year Keynote Goes Heavy on Chinese AI, Using DeepSeek and Kimi to Validate the Next-Gen Chip
36Kr· 2026-01-07 01:35
On the giant CES screen, Jensen Huang's slides read like a who's-who of Chinese AI. With DeepSeek and Kimi front and center, a new era of compute has arrived. At the much-anticipated CES 2026, a single slide instantly set the AI community ablaze. In Huang's keynote, the Chinese large models Kimi K2, DeepSeek V3.2, and Qwen appeared on screen among the world's leading open-source models, with performance closing in on closed-source models. It was a highlight moment for Chinese AI. OpenAI's GPT-OSS and NVIDIA's own Nemotron were also marked on the slide. Moreover, DeepSeek-R1, Qwen3, and Kimi K2 represent top-scale efforts along the MoE route: only a small fraction of parameters is activated per token, sharply cutting compute and the pressure on HBM memory bandwidth. In the core segment unveiling the next-generation Rubin architecture, Huang chose DeepSeek and Kimi K2 Thinking to demonstrate performance. With Rubin's brute-force boost, Kimi K2 Thinking's inference throughput jumped 10x, and token cost dropped to one tenth of what it was. This "exponential" gain in cost efficiency effectively announces that AI inference is entering a truly "affordable era." On the slide about surging compute demand, the 480B Qwen3 and the 1T-parameter Kimi K2 served as the representative models, confirming that parameter scale is growing roughly tenfold per year ...
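The MoE efficiency claim above (only a few experts activate per token, so compute and memory traffic stay far below total parameter count) can be made concrete with a toy top-k router. This is an illustrative sketch of generic mixture-of-experts routing, not the actual implementation in DeepSeek, Qwen, or Kimi; all names and dimensions here are invented for the example.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route a token vector x to its top-k experts and mix their outputs."""
    logits = gate_w @ x                          # one gating score per expert
    top = np.argsort(logits)[-k:]                # indices of the k best experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                                 # softmax over selected experts only
    # Only k experts actually run; the rest stay idle, which is why activated
    # parameters (and HBM bandwidth pressure) stay far below total parameters.
    return sum(wi * experts[i](x) for wi, i in zip(w, top))

rng = np.random.default_rng(0)
d, n_experts = 16, 8
gate_w = rng.normal(size=(n_experts, d))
# Each "expert" is just a linear map here; real experts are small FFNs.
experts = [(lambda W: (lambda v: W @ v))(rng.normal(size=(d, d)))
           for _ in range(n_experts)]
x = rng.normal(size=d)
y = moe_forward(x, gate_w, experts, k=2)
print(y.shape)  # (16,)
```

With k=2 of 8 experts, only a quarter of the expert parameters touch this token, which is the mechanism behind the "activate few parameters" framing in the keynote.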
Scale Retrieval Up, Keep Generation Light: CMU Team Systematically Evaluates the Corpus-Model Trade-off in RAG
机器之心· 2026-01-06 00:31
Core Insights
- The core argument of the research is that expanding the retrieval corpus can significantly enhance Retrieval-Augmented Generation (RAG) performance, often providing benefits that can partially substitute for increasing model parameters, although diminishing returns occur at larger corpus sizes [4][22].

Group 1: Research Findings
- The study reveals that the performance of RAG is determined by both the retrieval module, which provides evidence, and the generation model, which interprets the question and integrates evidence to form an answer [7].
- The research indicates that smaller models can achieve performance levels comparable to larger models by increasing the retrieval corpus size, with a consistent pattern observed across multiple datasets [11][12].
- The findings show that the most significant performance gains occur when moving from no retrieval to having retrieval, with diminishing returns as the corpus size increases [13].

Group 2: Experimental Design
- The research employed a full factorial design, varying only the corpus size and model size while keeping other variables constant, using a large dataset of approximately 264 million real web documents [9].
- The evaluation covered three open-domain question-answering benchmarks: Natural Questions, TriviaQA, and WebQuestions, using common metrics such as F1 and Exact Match [9].

Group 3: Mechanisms of Improvement
- The increase in corpus size enhances the probability of retrieving answer-containing segments, leading to more reliable evidence for the generation model [16].
- The study defines the Gold Answer Coverage Rate, which measures the probability that at least one of the top chunks provided to the generation model contains the correct answer string, showing a monotonic increase with corpus size [16].

Group 4: Practical Implications
- The research suggests that when resources are constrained, prioritizing the expansion of the retrieval corpus and improving coverage can allow medium-sized generation models to perform close to larger models [20].
- The study emphasizes the importance of tracking answer coverage and utilization rates as diagnostic metrics to identify whether bottlenecks are in the retrieval or generation components [20].
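The Gold Answer Coverage Rate described above can be sketched as a simple top-k substring metric. This is a hedged approximation: the paper's exact matching rule is not given here, so the function name `gold_answer_coverage` and the case-insensitive substring check are assumptions for illustration.

```python
def gold_answer_coverage(retrieved_chunks_per_query, gold_answers):
    """Fraction of queries whose retrieved top-k chunks contain at least one
    gold answer string (case-insensitive substring match as an approximation)."""
    hits = 0
    for chunks, answers in zip(retrieved_chunks_per_query, gold_answers):
        text = " ".join(chunks).lower()
        if any(a.lower() in text for a in answers):
            hits += 1
    return hits / len(gold_answers)

# Toy example: 2 of 3 queries are covered by their retrieved chunks.
chunks = [
    ["Paris is the capital of France.", "France is in Europe."],
    ["The Nile flows through Egypt."],
    ["Photosynthesis occurs in chloroplasts."],
]
answers = [["Paris"], ["Amazon"], ["chloroplasts"]]
print(round(gold_answer_coverage(chunks, answers), 3))  # 0.667
```

Tracking this number alongside answer F1, as the study recommends, shows whether a failure is a retrieval miss (low coverage) or a generation miss (high coverage, wrong answer).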
2025's AI Application Boom: What Opportunities Do Ordinary People Have in 2026?
36Kr· 2025-12-26 08:59
Core Insights
- The AI industry is experiencing significant growth, but there is a stark income disparity, with Nvidia capturing nearly 90% of market profits, leading to concerns about the sustainability of the ecosystem [3][4].
- The global AI application market is projected to see substantial increases in spending, with enterprise GenAI expenditures expected to rise from $11.5 billion in 2024 to $37 billion in 2025, marking a year-on-year growth of approximately 320% [3].
- The commercialization of AI applications has formed a clear hierarchy, with general large models leading the first tier, while vertical applications are rapidly gaining traction in specific sectors [5][6].

Group 1: Market Dynamics
- The AI application market is not as dire as perceived, with significant growth in consumer spending on applications like ChatGPT, which is expected to reach $2.48 billion in 2025, up from $487 million in 2024, representing a 408% increase [4].
- The first tier of commercial applications is dominated by general large models, with OpenAI leading at an annual recurring revenue (ARR) of $10 billion and a projected compound annual growth rate (CAGR) of 260% from 2023 to 2025 [5].
- Chinese applications are currently positioned in the second tier, with ARR between 100 million and 1 billion yuan, focusing on vertical applications that demonstrate clear cost reduction benefits [5][8].

Group 2: Application Development
- Over 200 AI applications have been launched between July and November, with a significant focus on vertical applications that address specific user needs, such as AI image processing and efficiency tools [6].
- In the global top 50 generative AI apps, 22 are developed by Chinese teams, indicating that Chinese applications are competitive, although there remains a significant income gap compared to the U.S. market [8].
- The cost of producing AI dynamic animations has drastically decreased, with production costs now ranging from 50,000 to 100,000 yuan, only 10% to 30% of traditional methods [17].

Group 3: Challenges and Opportunities
- Quality remains a major bottleneck for AI applications, with 33% of respondents identifying it as the primary challenge, particularly in terms of accuracy and consistency of output [11][13].
- The current landscape shows that AI applications are primarily limited to high-cost scenarios like programming and customer service, with significant cost-saving potential but insufficient revenue generation [14].
- The AI industry is moving towards a phase where understanding AI's application in business is crucial, as evidenced by the rising interest in AI-driven content creation, particularly in the animation sector [16][19].
2025's AI Application Boom: What Opportunities Do Ordinary People Have in 2026?
首席商业评论· 2025-12-26 08:24
Core Viewpoint
- The AI industry in China is rapidly catching up, but there remains a significant income gap compared to the US, primarily due to the profit distribution imbalance where Nvidia captures nearly 90% of the market profits, leaving downstream application developers struggling with high computing costs and low profitability [4][10].

Group 1: Market Dynamics
- The AI application market is expected to see substantial growth, with enterprise GenAI spending projected to rise from $11.5 billion in 2024 to $37 billion in 2025, marking a year-on-year increase of approximately 320% and accounting for 6% of the global SaaS market [4].
- The global AI application commercialization has formed a clear hierarchy, with general large models leading the first tier, while vertical applications rapidly gain traction in AI programming, multi-modal, and AI search fields [6].
- OpenAI leads the first tier with an annual recurring revenue (ARR) of $10 billion and a projected compound annual growth rate (CAGR) of 260% from 2023 to 2025, driven primarily by its consumer product ChatGPT [6].

Group 2: Application Trends
- Over 200 AI applications have emerged from July to November, with a significant focus on vertical applications that address specific user needs, such as AI image processing and AI professional consulting [7].
- Chinese applications are gaining traction globally, with ByteDance's Dola and DeepSeek ranking fourth and fifth in global monthly active users (MAU) [7][10].
- In the top 50 generative AI apps globally, 22 are developed by Chinese teams, although only three are primarily used in China, highlighting the disparity in revenue generation between Chinese and American AI applications [10].

Group 3: Challenges and Opportunities
- The main challenge for AI applications is ensuring quality, with one-third of respondents identifying it as a primary bottleneck, particularly in terms of accuracy and consistency of output [16].
- The cost of generating AI dynamic animations has significantly decreased, with production costs dropping to 10-30% of traditional methods, indicating a potential for high ROI in the AI animation sector [22].
- The current AI applications are becoming more user-friendly, allowing individuals without technical backgrounds to engage with AI tools effectively, although understanding the business context remains crucial for success [21][25].
X @Avi Chawla
Avi Chawla· 2025-12-20 06:31
Deploy and run LLMs directly on your phone! Unsloth now lets you fine-tune LLMs and deploy them 100% locally on iOS/Android devices. The video shows this in action, where I ran Qwen3 on an iPhone 17 Pro at ~25 tokens/s. I have shared a guide in the replies. https://t.co/p4NqLj0jRE ...
Behind the Viral Rise of Ant Group's 蚂蚁阿福: Big Tech Is Dominating AI in 2025
36Kr· 2025-12-17 02:24
China's AI narrative is quietly shifting. Since 2025, the startups once dubbed the "AI Six Tigers," which stirred the field with novel concepts, have faced deepening competitive pressure. The traditional internet giants, by contrast, have been moving into AI with growing frequency and force. The clearest sign is how deep their deployments now run. Beyond racing to release new foundation models and applications, the giants have publicly pledged larger capital and technology investments in AI; players with standout progress such as Alibaba and ByteDance already possess full-stack AI capabilities and have moved on to staking out core scenarios with "killer" applications. For example, on December 15 Ant Group announced a comprehensive upgrade of its AI health app 蚂蚁阿福; buoyed by features such as health companionship, the new app quickly climbed to third place on the Apple App Store free chart, drawing wide industry attention. It is the latest in a string of recent Alibaba-affiliated moves, following the launch of Quark AI glasses, the release of the 千问 (Qwen) and 灵光 apps, and the creation of a Qwen consumer business group. On the product side, the breakout of DeepSeek and similar products early in the year delivered a round of AI market education with unprecedented force, and public perception of AI products genuinely shifted from a novel technology to a handy everyday tool. Meanwhile Alibaba, ByteDance, Tencent, and other giants, leveraging massive marketing budgets and their natural ecosystem entry points, hold firm control of large numbers of users' "screens." Judging from a16z's global top-100 consumer Gen AI app ranking released this year ...
Stanford Report: AI Transparency Is Collectively Regressing. IBM Takes First Place, Musk's xAI Comes Last
Sohu Finance· 2025-12-16 10:28
(Source: The 2025 Foundation Model Transparency Index) Recently, a team at Stanford University and partner institutions released a report titled The 2025 Foundation Model Transparency Index (FMTI). IBM took first place with the highest transparency score, while xAI and Midjourney came last. Looking at the overall results, although the 2024 report showed a brief improvement in model transparency, the 2025 edition finds the index regressing: the average score across foundation models fell from 58 in 2024 to 40 in 2025, nearly back to the level of the inaugural 2023 report (note: 100 is a perfect score). The report reveals a worrying trend: even as foundation models keep improving in performance, their transparency around data use, model training, and downstream impact has collectively regressed. This is the third annual edition since 2023, functioning like a quantified transparency "check-up" for major foundation model developers. The evaluation covers 13 foundation model companies, including first-time entrants such as Alibaba, DeepSeek, and Musk's xAI, ...
From DeepSeek V3 to V3.2: The Full Evolution in One Article
机器之心· 2025-12-08 04:27
Core Insights
- DeepSeek has released two new models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, which have generated significant interest and discussion in the AI community [2][5][11].
- The evolution from DeepSeek V3 to V3.2 includes various architectural improvements and the introduction of new mechanisms aimed at enhancing performance and efficiency [10][131].

Release Timeline
- The initial release of DeepSeek V3 in December 2024 did not create immediate buzz, but the subsequent release of the DeepSeek R1 model changed the landscape, making DeepSeek a popular alternative to proprietary models from companies like OpenAI and Google [11][14].
- The release of DeepSeek V3.2-Exp in September 2025 was seen as a preparatory step for the V3.2 model, focusing on establishing the necessary infrastructure for deployment [17][49].

Model Types
- DeepSeek V3 was initially launched as a base model, while DeepSeek R1 was developed as a specialized reasoning model through additional training [19][20].
- The trend in the industry has seen a shift from hybrid reasoning models to specialized models, with DeepSeek seemingly reversing this trend by moving from specialized (R1) to hybrid models (V3.1 and V3.2) [25].

Evolution from V3 to V3.1
- DeepSeek V3 utilized a mixture-of-experts model and multi-head latent attention (MLA) to optimize memory usage during inference [29][30].
- DeepSeek R1 focused on Reinforcement Learning with Verifiable Rewards (RLVR) to enhance reasoning capabilities, particularly in tasks requiring symbolic verification [37][38].

Sparse Attention Mechanism
- DeepSeek V3.2-Exp introduced a non-standard sparse attention mechanism, which significantly improved efficiency in training and inference, especially in long-context scenarios [49][68].
- The DeepSeek Sparse Attention (DSA) mechanism allows the model to selectively focus on relevant past tokens, reducing computational complexity from quadratic to linear [68].

Self-Verification and Self-Correction
- DeepSeekMath V2, released shortly before V3.2, introduced self-verification and self-correction techniques to improve the accuracy of mathematical reasoning tasks [71][72].
- The self-verification process involves a verifier model that assesses the quality of generated proofs, while self-correction allows the model to iteratively improve its outputs based on feedback [78][92].

DeepSeek V3.2 Architecture
- DeepSeek V3.2 maintains the architecture of its predecessor, V3.2-Exp, while incorporating improvements aimed at enhancing overall model performance across various tasks, including mathematics and coding [107][110].
- The model's training process has been refined to include updates to the RLVR framework, integrating new reward mechanisms for different task types [115][116].

Performance Benchmarks
- DeepSeek V3.2 has shown competitive performance in various benchmarks, achieving notable results in mathematical tasks and outperforming several proprietary models [127].
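The sparse-attention idea summarized above, where a lightweight indexer selects which past tokens each query actually attends to, can be illustrated with a minimal single-query sketch. This is a generic top-k attention toy, not DeepSeek's actual DSA implementation: the indexer here is a stand-in dot-product score, and all dimensions are invented for the example.

```python
import numpy as np

def sparse_attention(q, K, V, idx_scores, k=4):
    """Attend only to the k past tokens ranked highest by a cheap indexer
    score, instead of all T tokens (cost per query scales with k, not T)."""
    top = np.argsort(idx_scores)[-k:]            # indexer selects candidates
    scores = K[top] @ q / np.sqrt(q.shape[0])    # full attention on k tokens only
    w = np.exp(scores - scores.max())
    w /= w.sum()                                  # softmax over selected tokens
    return w @ V[top]

rng = np.random.default_rng(1)
T, d = 32, 8                                      # 32 past tokens, head dim 8
q = rng.normal(size=d)
K = rng.normal(size=(T, d))
V = rng.normal(size=(T, d))
idx_scores = K @ q        # stand-in for DSA's learned lightweight indexer
out = sparse_attention(q, K, V, idx_scores, k=4)
print(out.shape)  # (8,)
```

Because k stays fixed as the context T grows, per-token attention cost becomes roughly linear in sequence length, which is the long-context win the article attributes to DSA.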
The Qwen3-VL Multimodal Model, Illustrated
自动驾驶之心· 2025-11-29 02:06
Author: 阿杰 (阿杰不敲代码时), a ten-year technical veteran with a background in big-data modeling, backend architecture design, and algorithm optimization on systems serving tens of millions of users; he shares hands-on technical notes, postmortems, and industry-trend commentary. A while back I wrote a primer on VLMs; whether because of the content or the title, it did not seem to attract much interest. But if you want to understand the internals of Qwen3-VL and your foundations are shaky or missing, that earlier piece is still worth reading first; I assume most readers arriving here have read it, so I will not repeat that material, though experienced readers may of course skip it. Corrections and criticism are welcome. Vision-language models (VLMs) are autoregressive AI models that take both text and images as input. In this article we will walk through the source code in detail to see how the Qwen3-VL model ...
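Before diving into source code, the basic input path of an autoregressive VLM can be sketched: image patch features are projected into the text embedding space and spliced into the token sequence where an image placeholder sits. This is a generic illustration, not Qwen3-VL's actual code; the placeholder id, the `build_inputs` helper, and all dimensions are invented for the example.

```python
import numpy as np

IMG_TOKEN = -1  # hypothetical placeholder id marking where image embeddings go

def build_inputs(token_ids, patch_feats, embed_table, proj):
    """Build one (seq_len, d_model) input matrix for the language model by
    mixing text embeddings with projected vision features."""
    rows = []
    for t in token_ids:
        if t == IMG_TOKEN:
            rows.extend(patch_feats @ proj)   # vision features -> text space
        else:
            rows.append(embed_table[t])       # ordinary text token embedding
    return np.stack(rows)

rng = np.random.default_rng(0)
vocab, d_model, d_vis, n_patches = 100, 32, 48, 4
embed_table = rng.normal(size=(vocab, d_model))
proj = rng.normal(size=(d_vis, d_model))      # the vision-to-text projector
patch_feats = rng.normal(size=(n_patches, d_vis))

# 3 text tokens with one image placeholder between them.
seq = build_inputs([5, IMG_TOKEN, 9, 7], patch_feats, embed_table, proj)
print(seq.shape)  # (7, 32): 3 text tokens + 4 image patches
```

From the language model's point of view the result is just a longer embedding sequence, which is what makes the "autoregressive model that processes text and images as input" framing work.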