AI Industry Special Report: Exploring the Progress and Boundaries of Model Capabilities and Applications
Guoxin Securities · 2025-08-25 13:15
August 25, 2025 · Securities Research Report | AI Industry Special Report (11): Exploring the Progress and Boundaries of Model Capabilities and Applications. Industry research · Industry special report · Internet · Internet II. Investment rating: Outperform (maintained).
Analysts: Zhang Lunke (zhanglunke@guosen.com.cn, S0980521120004), Chen Shuyuan (chenshuyuan@guosen.com.cn, S0980524030003), Liu Zitan (liuzitan@guosen.com.cn, S0980525060001), Zhang Haochen (zhanghaochen1@guosen.com.cn, S0980525010001). Tel: 0755-81982651 / 021-60375431
Report Summary
- This report focuses on model development in China and abroad, exploring the progress and boundaries of model capabilities and applications. We believe overseas models are developing along differentiated paths, with enterprises weighing cost-effectiveness when choosing which models to call. OpenAI currently holds a relative lead on the technology path, focusing on reinforced reasoning and professional ...
- Risk warnings: macroeconomic volatility, advertising growth below expectations, intensifying industry competition, AI technology progress falling short of expectations, etc.
Hands-On with DeepSeek V3.1: More Than an Extended Context Window
自动驾驶之心 · 2025-08-21 23:34
Author: 量子位. Original link: https://mp.weixin.qq.com/s/x0X481MgH_8ujjB0_XL4SQ. This article is shared for academic purposes only; contact us for removal in case of infringement.
How does DeepSeek V3.1 actually differ from V3? The official notes are vague, mentioning only that the context length has been extended to 128K and that multiple tensor formats are supported. But we have already run hands-on tests and can offer more fresh detail.
Comparing V3.1 against V3, we observed changes of varying degrees in programming performance, creative writing, translation quality, and response tone. The most visible update, though, is that the "Deep Thinking (R1)" button on the DeepSeek web interface has quietly become just "Deep Thinking". The mobile app is still catching up (laughs).
DeepSeek V3.1 Base can currently be downloaded from Hugging Face, and the full version is available via the web, the app, and the mini program.
The back-to-school exam starts now. Since the web interface has been fully switched to V3.1, we called the DeepSeek V3 API through Alibaba Cloud (maximum ...
Hands-On with DeepSeek V3.1: More Than an Extended Context Window
量子位 · 2025-08-20 07:48
Core Insights
- The article compares DeepSeek V3.1 with its predecessor V3, finding improvements in programming performance, creative writing, translation quality, and response tone [2][6][40].
Group 1: Model Features
- DeepSeek V3.1 extends the context length to 128K tokens, while V3 tops out at 65K tokens [8][7].
- The new version supports multiple tensor formats, broadening its usability across platforms [1][6].
- The V3 API still operates with a 65K-token maximum context, underscoring the scale of the V3.1 upgrade (a reproduction sketch follows this summary) [7][8].
Group 2: Performance Comparison
- In programming tasks, V3.1 took a more comprehensive approach, providing detailed code and usage instructions where V3 did not [12][13].
- In creative writing, V3.1 produced a more poetic and emotional response, in contrast with V3's straightforward style [20][18].
- Both versions solved a mathematical problem correctly, but their presentation differed, with V3.1 offering a clearer explanation [23][24].
Group 3: Translation Capabilities
- V3.1 handled complex sentences better in translation tasks, although it left some simple words untranslated [29][26].
- Translating a biology paper's abstract revealed V3.1's stronger grasp of specialized terminology relative to V3 [28][27].
Group 4: Knowledge and Reasoning
- In a knowledge query about a specific fruit type, both versions correctly identified it as a drupe, though V3.1's reasoning strayed off topic [30][36].
- V3.1 scored 71.6% on the Aider benchmark, outperforming V3 and signaling a competitive edge on non-reasoning tasks [42][40].
Group 5: User Feedback and Market Response
- The V3.1 release has drawn significant interest, trending on social media platforms [40][41].
- Users report improved physical understanding and the introduction of new tokens, though some issues with the online API have surfaced [45][49].
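For readers who want to reproduce this kind of side-by-side comparison, here is a minimal sketch. It assumes an OpenAI-compatible chat-completions endpoint of the kind the article describes (the authors reached DeepSeek V3 through Alibaba Cloud); the base URL, model names, and environment variables below are placeholders, not confirmed values.

```python
import os
from openai import OpenAI  # pip install openai

# Placeholder endpoint and model IDs: substitute the values for whichever
# OpenAI-compatible provider you actually use (the article's authors went
# through Alibaba Cloud for V3 and the DeepSeek web app for V3.1).
client = OpenAI(
    base_url=os.environ.get("LLM_BASE_URL", "https://example-gateway/v1"),
    api_key=os.environ.get("LLM_API_KEY", "sk-..."),
)

PROMPT = "Write a Python function that checks whether a string is a palindrome."

def ask(model: str, prompt: str) -> str:
    """Send one prompt to one model and return the reply text."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
    )
    return resp.choices[0].message.content

# Run the identical prompt against both versions and eyeball the difference.
for model in ("deepseek-v3-placeholder", "deepseek-v3.1-placeholder"):
    print(f"=== {model} ===")
    print(ask(model, PROMPT))
```

Holding the prompt fixed and varying only the model is the same methodology the article applies to its programming, writing, and translation tests.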
A 10,000-Word Deep Dive into the DeepSeek MoE Architecture!
自动驾驶之心 · 2025-08-14 23:33
Core Viewpoint
- The article gives a comprehensive overview of the Mixture of Experts (MoE) architecture, focusing on the evolution and implementation of DeepSeek's MoE models (V1, V2, V3) and their optimizations for token distribution and load balancing [2][21][36].
Group 1: MoE Architecture Overview
- MoE, or Mixture of Experts, is a model architecture that combines multiple expert networks to raise capacity, with sparsity making it well suited to cloud-scale deployment [2][3].
- Interest in MoE surged with the release of Mistral.AI's Mixtral model, which highlighted the potential of sparse architectures [2][3].
- The Switch Transformer introduced a routing mechanism in which each token selects its top-K experts, letting different experts specialize in different knowledge [6][10].
Group 2: DeepSeek V1 Innovations
- DeepSeek V1 targets two problems in prior MoE practice, knowledge mixing and knowledge redundancy, both of which hinder expert specialization [22][24].
- The model introduces fine-grained expert division plus shared experts, improving specialization, reducing redundancy, and capturing common knowledge more efficiently [25][26].
- A load-balancing mechanism spreads tokens evenly across experts, mitigating training inefficiencies [32].
Group 3: DeepSeek V2 Enhancements
- V2 builds on V1's design with three key optimizations centered on load balancing [36].
- It caps the number of devices involved in expert routing to cut communication overhead during training and inference [37].
- A new communication load-balancing loss ensures equitable token distribution across devices [38].
Group 4: DeepSeek V3 Developments
- V3 changes the MoE layer's gating computation, replacing the softmax function with a sigmoid function to improve computational efficiency [44].
- It drops the auxiliary load-balancing losses in favor of a learnable bias term that steers routing, improving balance during training [46].
- A sequence-level auxiliary loss guards against extreme imbalance within individual sequences, stabilizing training [49].
- A toy routing sketch combining these ideas follows this summary.
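To make the routing mechanics concrete, here is a toy PyTorch sketch written from the summary above rather than from DeepSeek's released code: top-K gating over routed experts, always-on shared experts (the V1 idea), and an optional V3-style variant where a sigmoid gate plus a learnable per-expert selection bias replace softmax gating and auxiliary losses. Dimensions, expert counts, and the dense dispatch loop are illustrative only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_ffn(d_model: int) -> nn.Module:
    """One expert: a standard two-layer feed-forward block."""
    return nn.Sequential(
        nn.Linear(d_model, 4 * d_model),
        nn.GELU(),
        nn.Linear(4 * d_model, d_model),
    )

class ToyMoELayer(nn.Module):
    """Top-K gating over routed experts plus always-on shared experts,
    with an optional V3-style sigmoid gate and learnable selection bias."""

    def __init__(self, d_model: int = 64, n_routed: int = 8,
                 n_shared: int = 2, top_k: int = 2, v3_style: bool = False):
        super().__init__()
        self.top_k = top_k
        self.v3_style = v3_style
        self.routed = nn.ModuleList(make_ffn(d_model) for _ in range(n_routed))
        self.shared = nn.ModuleList(make_ffn(d_model) for _ in range(n_shared))
        self.router = nn.Linear(d_model, n_routed, bias=False)
        # V3-style bias: nudged during training to steer tokens toward
        # underused experts; it affects selection only, not mixing weights.
        self.select_bias = nn.Parameter(torch.zeros(n_routed))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d_model)
        logits = self.router(x)
        if self.v3_style:
            scores = torch.sigmoid(logits)               # sigmoid gate (V3)
            _, idx = torch.topk(scores + self.select_bias, self.top_k, dim=-1)
            gate = scores.gather(-1, idx)
        else:
            scores = F.softmax(logits, dim=-1)           # softmax gate (V1/V2)
            gate, idx = torch.topk(scores, self.top_k, dim=-1)
        out = torch.zeros_like(x)
        # Dense dispatch loop for clarity; real systems batch by expert.
        for k in range(self.top_k):
            for e, expert in enumerate(self.routed):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += gate[mask, k:k + 1] * expert(x[mask])
        for expert in self.shared:   # shared experts see every token
            out = out + expert(x)
        return out

x = torch.randn(16, 64)
print(ToyMoELayer(v3_style=True)(x).shape)   # torch.Size([16, 64])
```

The device-limited routing and communication-balancing loss from V2 are omitted; they only matter once experts are sharded across machines.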
Where Did All the Users Go? Is DeepSeek Usage Falling Off a Cliff?
菜鸟教程 · 2025-07-23 02:10
Core Viewpoint
- DeepSeek R1, initially a phenomenon in the AI sector, now faces user attrition and declining traffic, raising questions about its market strategy and user experience [8][11].
Group 1: Market Performance
- DeepSeek R1's early growth was remarkable: daily active users (DAU) reached 22.15 million within 20 days of launch, and the app topped the iOS App Store in over 140 countries [2].
- Web traffic has since fallen sharply, from 614 million visits in February to 436 million in May, a 29% decline [9].
- Competitors have moved the other way: ChatGPT's web visits rose 40.6% over the same period, and Claude's also grew [9].
Group 2: User Experience Issues
- Users are migrating to third-party platforms; third-party deployment usage of DeepSeek models has grown nearly 20-fold since launch [16].
- Key pain points are high token latency and a smaller 64K context window, which limits analysis of large codebases or documents [21][23].
- DeepSeek's strategy of prioritizing low cost over user experience produces longer wait times than third-party services offer [21].
Group 3: Strategic Choices
- DeepSeek's posture reflects a focus on research and development over near-term profit, positioning it more as a computational laboratory than a commercial operation [26].
- The company has deliberately declined to address user-experience issues, reserving internal compute for AGI development [26].
Group 4: Competitive Landscape
- Competition is intense, with new models such as GPT-4.5 and Gemini 2.5 being released, contributing to user migration away from DeepSeek [38].
- Anthropic, facing similar constraints, has focused on model optimization and cloud-provider partnerships to expand compute [30].
Group 5: Public Perception
- Domestic users voice mixed feelings, citing slow speeds and server issues, while others back the company's long-term vision [34][40].
- With new model iterations arriving rapidly, retaining users is an uphill battle for DeepSeek [38][47].
A Timely Rain for Liang Wenfeng
虎嗅APP · 2025-07-16 00:05
Core Viewpoint
- The article examines the competitive landscape of AI models, focusing on DeepSeek's struggle to hold user engagement and market position against emerging rivals such as Kimi and the rest of the "AI Six Dragons" group.
Group 1: DeepSeek's Performance and Challenges
- DeepSeek's monthly active users peaked at 169 million in January and had declined 5.1% by May [1][2].
- Its download ranking has plummeted from the top of the App Store charts to outside the top 30 [2].
- User engagement has fallen from 7.5% at the start of the year to 3% by the end of May, alongside a 29% drop in website traffic [2][3].
Group 2: Competition and Market Dynamics
- Rivals such as Kimi are shipping new models rapidly; Kimi K2 posted strong benchmark results at competitive prices [1][8].
- Kimi K2's pricing closely tracks DeepSeek's API pricing, making it a direct competitor on cost [8].
- Other players likewise emphasize lower cost and better performance, eroding DeepSeek's established reputation for cost-effectiveness [7][8].
Group 3: Technological and Strategic Implications
- DeepSeek's reliance on the H20 chip has been hit by export restrictions, hampering its ability to scale and innovate [3][4].
- The absence of major model updates has created a perception of stagnation while competitors iterate and improve quickly [6][12].
- DeepSeek also lacks multi-modal capabilities, which may limit its appeal in a market that increasingly values them [13].
Group 4: Future Outlook
- To regain market interest, DeepSeek needs to expedite new models such as V4 and R2 and strengthen tool capabilities for developers [12][13].
- Without significant updates or innovations, it risks losing further ground to rivals in a fast-shifting landscape [12][14].
- Sustaining developer engagement and user interest is crucial to DeepSeek's long-term success in the evolving AI market [11].
Will the K2 Open-Source Large Model Be Kimi's DeepSeek Moment?
Hu Xiu · 2025-07-14 03:20
Core Insights
- The article covers MoonShot's latest open-source model, K2, whose 1-trillion-parameter scale makes it the largest open-source model currently available [2].
- K2's benchmark performance positions it as a strong competitor to established models such as Claude 4 Opus and GPT-4.1, underscoring China's growing weight in the global AI landscape [2][4].
- Competition in AI is intensifying, with Chinese companies such as MoonShot and MiniMax leading open-source innovation and challenging Western counterparts [4][6].
Company Developments
- K2 quickly became the top trending open-source model on Hugging Face shortly after its release [4].
- Its architecture uses fewer attention heads and more experts, improving efficiency on long contexts relative to previous models [8][10].
- MoonShot's disclosed funding totals roughly $1.5 billion, far below that of its Western competitors, suggesting a leaner operating model [6].
Market Impact
- K2's compatibility with OpenAI's and Anthropic's API formats positions it favorably in AI application development, potentially winning significant market share (see the sketch after this summary) [7].
- Competition between MoonShot and DeepSeek has intensified, with both releasing multiple models aimed at varied AI applications [5][12].
- The focus on multi-agent collaboration and integrating various models into K2 may improve its commercial viability and market appeal [12].
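The API-compatibility point is concrete: an application already written against OpenAI-style chat completions can, in principle, target K2 by swapping only the endpoint and model name. A hedged sketch follows; the base URL and model identifier are placeholders I invented for illustration, not confirmed values.

```python
from openai import OpenAI  # pip install openai

# Both values below are placeholders, not confirmed endpoints or model IDs;
# consult the provider's documentation for the real ones.
client = OpenAI(base_url="https://api.moonshot.example/v1", api_key="sk-...")

reply = client.chat.completions.create(
    model="kimi-k2-placeholder",
    messages=[{"role": "user", "content": "Summarize the MoE idea in one line."}],
)
print(reply.choices[0].message.content)
```

That two-line switching cost is why format compatibility matters commercially: existing tooling needs no rewrite to trial a new model.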
As the AI Coding Wave Hits, What Should Programmers Do? IDEA Research Institute's Zhang Lei: Low-Level Systems Skills Are the Moat
AI前线 · 2025-07-13 04:12
Core Viewpoint
- The article discusses the challenges and opportunities in developing multi-modal intelligent agents, emphasizing the effective integration of perception, cognition, and action in AI systems [1][2][3].
Multi-modal Intelligent Agents
- An intelligent agent needs three components: "seeing" (understanding input), "thinking" (processing information), and "doing" (executing actions); a minimal loop tying them together is sketched after this summary [2][3].
- Research should target practical problems with real-world applications rather than purely academic pursuits [2][3].
Visual Understanding and Spatial Intelligence
- Visual input is complex and high-dimensional, demanding a deep understanding of three-dimensional structure and object interaction [3][5].
- Current models, such as vision-language-action (VLA) models, struggle with precise object understanding and positioning, yielding low operational success rates [5][6].
- High accuracy is essential in robotic operations: even a small failure rate leads to user dissatisfaction [5][8].
Research and Product Balance
- Industrial researchers must balance foundational research against the practical application of their findings [10][11].
- The ideal outcome combines research value with application value, avoiding work that lacks significance in either area [11][12].
Recommendations for Young Professionals
- Young professionals should build solid computer-science fundamentals, including operating systems and distributed systems, rather than focusing solely on model tuning [16][17].
- The ability to optimize systems and understand underlying principles matters more than merely adjusting parameters in AI models [17][18].
- A strong grounding in basic disciplines confers a lasting competitive advantage in the evolving AI landscape [19][20].
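The "seeing / thinking / doing" decomposition maps naturally onto a perceive-reason-act control loop. Below is a minimal, purely illustrative Python sketch of that flow; the interfaces and names are invented for this example and do not correspond to any system named in the article.

```python
from dataclasses import dataclass
from typing import Protocol

@dataclass
class Observation:
    description: str          # e.g. the output of a vision model

@dataclass
class Action:
    name: str
    args: dict

class Perceiver(Protocol):    # "seeing": turn raw input into structure
    def observe(self) -> Observation: ...

class Planner(Protocol):      # "thinking": decide what to do next
    def plan(self, obs: Observation) -> Action: ...

class Actuator(Protocol):     # "doing": execute and report success
    def execute(self, action: Action) -> bool: ...

def agent_loop(see: Perceiver, think: Planner, act: Actuator,
               max_steps: int = 10) -> bool:
    """Run perceive -> reason -> act until done, failure, or step limit.
    Real agents add retries and recovery here; the article's point about
    operational success rates is that this loop must tolerate imperfect
    perception, since one failed grasp can end the whole task."""
    for _ in range(max_steps):
        obs = see.observe()
        action = think.plan(obs)
        if action.name == "done":
            return True
        if not act.execute(action):
            return False
    return False
```

The sketch also makes the article's accuracy argument visible: success is the product of per-step success rates, so even a 95%-reliable "doing" stage fails often over a ten-step task.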
Tencent Research Institute AI Express, 20250710
腾讯研究院 · 2025-07-09 14:49
Group 1: Veo 3 Upgrade
- The Google Veo 3 upgrade enables audio and video generation from a single image, maintaining high consistency across multiple angles [1]
- The feature ships via the Flow platform's "Frames to Video" option with stronger camera-movement capabilities, though the Gemini Veo 3 entry is currently unavailable [1]
- User tests show natural expressions and effective performances, a significant step for AI storytelling in advertising and animation [1]
Group 2: Hugging Face 3B Model
- Hugging Face released the open-source 3B-parameter model SmolLM3, outperforming Llama-3.2-3B and Qwen2.5-3B while supporting a 128K context window and six languages [2]
- A dual-mode system lets users switch between deep-thinking and non-thinking modes [2]
- It was trained on 11.2 trillion tokens with a three-stage mixed training strategy, and all technical details, including architecture and data-mixing methods, are public [2]
Group 3: Kunlun Wanwei Skywork-R1V 3.0
- Kunlun Wanwei open-sourced the Skywork-R1V 3.0 multimodal model, scoring 142 on high-school mathematics and 76 on the MMMU evaluation, surpassing some closed-source models [3]
- The model uses a reinforcement-learning strategy (GRPO) and key entropy-driven mechanisms, reaching this performance with only 12,000 supervised samples and 13,000 reinforcement-learning samples [3]
- It excels at physical, logical, and mathematical reasoning, setting a new performance benchmark for open-source models and demonstrating cross-disciplinary generalization [3]
Group 4: Vidu Q1 Video Creation
- Vidu Q1's multi-reference video feature lets users upload up to seven reference images, enabling strong character consistency and storyboard-free video generation [4]
- Users can combine multiple subjects with simple prompts; clarity is upgraded to 1080P, and character materials can be stored for repeated use [5]
- Tests show it suits multi-character animation trailers, supports frame extraction and quality enhancement, and cuts production costs to under 0.9 yuan per video [5]
Group 5: VIVO BlueLM-2.5-3B Model
- VIVO launched the BlueLM-2.5-3B edge multimodal model, which performs strongly across more than 20 evaluations and supports GUI interface understanding [6]
- It switches flexibly between long and short thinking modes and introduces a thinking-budget control mechanism to balance reasoning depth against computational cost [6]
- Its ViT+Adapter+LLM structure and four-stage pre-training strategy improve efficiency and mitigate the text-capability forgetting common in multimodal models [6]
Group 6: DeepSeek-R1 System
- The X-Masters system, developed by Shanghai Jiao Tong University and DeepMind Technology, scored 32.1 on Humanity's Last Exam (HLE), surpassing OpenAI and Google [7]
- Built on the DeepSeek-R1 model, it moves smoothly between internal reasoning and external tool use, with code as the interaction language [7]
- X-Masters uses a decentralized, stacked multi-agent workflow in which solvers, critics, rewriters, and selectors collaborate to extend reasoning breadth and depth; the solution is fully open-sourced (a workflow sketch follows this digest) [7]
Group 7: Zhihui Jun's Acquisition
- Zhihui Jun's Zhiyuan Robot has acquired control of the listed company Shuangwei New Materials for 2.1 billion yuan, targeting a 63.62%-66.99% stake [8]
- After the acquisition, Shuangwei New Materials' stock resumed trading at the daily limit-up, reaching a market value of 3.77 billion yuan, with actual control passing to Zhiyuan CEO Deng Taihua and core team members including "Zhihui Jun" Peng Zhihui [8]
- The deal, structured as "agreement transfer + active invitation," is seen as a landmark case for new-productivity enterprises in A-shares following the implementation of national policies [8]
Group 8: AI Model Usage Trends
- In the first half of 2025, the Gemini series captured nearly half of the large-model API market, with Google leading at 43.1%, followed by DeepSeek at 19.6% and Anthropic at 18.4% [9]
- DeepSeek V3 has held a high user-retention rate since launch, ranking among the top five in usage, while OpenAI's model usage has fluctuated significantly [9]
- The competitive landscape is differentiating: Claude-Sonnet-4 leads programming (44.5%), Gemini-2.0-Flash excels in translation, GPT-4o leads marketing (32.5%), and role-playing remains highly fragmented [9]
Group 9: AI User Trends
- A Menlo Ventures report counts 1.8 billion AI users globally, with a paid-user rate of only 3%; student usage is high at 85%, and parents are becoming heavy users [10]
- AI is used mainly for writing emails (19%), researching topics of interest (18%), and managing to-do lists (18%), with no single task exceeding one-fifth of usage [10]
- Six trends are expected over the next 18-24 months: the rise of vertical tools, complete process automation, multi-person collaboration, a voice-AI boom, physical AI in households, and diversified business models [10]
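The solver/critic/rewriter/selector pipeline described for X-Masters is a general pattern. The sketch below shows only its shape; the role prompts, selection rule, and function names are invented for illustration and are not X-Masters' actual implementation.

```python
from typing import Callable, List

LLM = Callable[[str], str]   # any text-in, text-out model call

def solve(llm: LLM, question: str, n_solvers: int = 3) -> str:
    """Solver -> critic -> rewriter -> selector pipeline: several solvers
    draft answers, a critic flags flaws in each, a rewriter revises using
    the critique, and a selector picks the final answer."""
    drafts: List[str] = [
        llm(f"Solve step by step: {question}") for _ in range(n_solvers)
    ]
    revised: List[str] = []
    for draft in drafts:
        critique = llm(f"List flaws in this solution:\n{draft}")
        revised.append(llm(
            "Rewrite the solution, fixing the listed flaws.\n"
            f"Solution:\n{draft}\nFlaws:\n{critique}"))
    menu = "\n\n".join(f"[{i}] {r}" for i, r in enumerate(revised))
    choice = llm(f"Reply with only the index of the best solution:\n{menu}")
    digits = "".join(ch for ch in choice if ch.isdigit())
    # A production system needs sturdier output parsing than this.
    return revised[int(digits) % len(revised)] if digits else revised[0]
```

Scattering work across parallel solvers widens the search, and the stacked critique-rewrite-select stages deepen it, which matches the breadth-and-depth framing in the digest item.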
Large-Model Usage in H1 2025: The Gemini Series Takes Half the Market Share, and DeepSeek V3 User Retention Is Extremely High
Founder Park · 2025-07-09 06:11
Core Insights
- The article surveys the state and trends of the large-model API market in 2025, highlighting rapid growth and shifting market share among the key players [1][2][25].
Token Usage Growth
- In Q1 2025, total token usage for AI models grew nearly fourfold quarter over quarter, then stabilized at around 2 trillion tokens per week [7][25].
- The top models by token usage are Gemini-2.0-Flash, Claude-Sonnet-4, and Gemini-2.5-Flash-Preview-0520, with Gemini-2.0-Flash holding a strong position thanks to low pricing and high performance [2][7].
Market Share Distribution
- Google leads with a dominant 43.1% market share, followed by DeepSeek at 19.6% and Anthropic at 18.4% [8][25].
- OpenAI's models show significant usage volatility, with GPT-4o-mini fluctuating notably, particularly in May [8][25].
Segment-Specific Insights
- In programming, Claude-Sonnet-4 leads with a 44.5% share, followed by Gemini-2.5-Pro [12].
- In translation, Gemini-2.0-Flash dominates with a 45.7% share, indicating wide integration into translation software [17].
- The role-playing segment is fragmented: small models collectively hold 26.6% of the share, with DeepSeek leading the category [21].
API Usage Trends
- The most-used APIs on OpenRouter are primarily for code writing, led by Cline and RooCode [25].
- The overall trend shows a strong preference for tools that support coding and application development [25].
Competitive Landscape
- DeepSeek's V3 model shows strong user retention and is preferred over its predecessor, likely due to faster processing times [25].
- Meta's Llama series is declining in popularity, while Mistral AI holds roughly 3% of the market, mostly among users fine-tuning open-source models [25].
- X-AI's Grok series is still establishing its market position, and the Qwen series holds a modest 1.6% share, leaving room for growth [25].