DeepSeek V3

Where Did the Users Go? Has DeepSeek Usage Fallen Off a Cliff?
菜鸟教程· 2025-07-23 02:10
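The following article is from JackCui, written by JackCui.
Half a year ago, the launch of DeepSeek R1 caused a global sensation. It took off in both East and West, and overseas media called it the AI field's Sputnik moment. Overnight, DeepSeek-related topics swept the major social platforms. Just 20 days after launch, its daily active users (DAU) surged to 22.15 million, making it the fastest-growing AI app in the world.
Its app topped the iOS App Store download charts in more than 140 countries and overtook ChatGPT to become the No. 1 free app in the US store, a phenomenal run of growth.
DeepSeek's arrival also dealt a heavy collective blow to US technology stocks. Nasdaq 100 futures fell by as much as 5% at one point, and chip giant NVIDIA's share price plunged about 17% that day, wiping out hundreds of billions of dollars in market value in an instant. The technology sector as a whole shrank by nearly a trillion dollars in a single day.
It is no exaggeration to say that it made history.
https://semianalysis.com/2025/07/03/deepseek-debrief-128-days-later/
1. DeepSeek's market ...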
Liang Wenfeng Finally Gets a Timely Rain
虎嗅APP· 2025-07-16 00:05
Core Viewpoint
- The article discusses the competitive landscape of AI models, focusing on DeepSeek's difficulty in maintaining user engagement and market position against emerging competitors such as Kimi and others in the "AI Six Tigers" group.
Group 1: DeepSeek's Performance and Challenges
- DeepSeek's monthly active users peaked at 169 million in January and had declined by 5.1% by May [1][2].
- DeepSeek's download ranking has plummeted from the top of the App Store charts to outside the top 30 [2].
- DeepSeek's user engagement rate fell from 7.5% at the beginning of the year to 3% by the end of May, alongside a 29% decrease in website traffic [2][3].
Group 2: Competition and Market Dynamics
- Competitors are rapidly releasing new models; Kimi K2 has posted strong benchmark results at competitive prices [1][8].
- Kimi K2's pricing closely matches DeepSeek's API pricing, making it a direct competitor on cost [8].
- Other players are also emphasizing lower costs and better performance, which is eroding DeepSeek's established reputation for cost-effectiveness [7][8].
Group 3: Technological and Strategic Implications
- DeepSeek relies on the H20 chip, and export restrictions on it have hindered the company's ability to scale and innovate [3][4].
- The lack of major updates to DeepSeek's models has created a perception of stagnation while competitors iterate quickly [6][12].
- DeepSeek currently lacks multi-modal capabilities, which may limit its appeal in a market that increasingly values them [13].
Group 4: Future Outlook
- To regain market interest, DeepSeek needs to speed up the release of new models such as V4 and R2 and strengthen its tool capabilities to meet developer needs [12][13].
- The competitive landscape is shifting rapidly; without significant updates or innovations, DeepSeek risks losing further ground to its rivals [12][14].
- The article suggests that keeping developers engaged and users interested is crucial to DeepSeek's long-term success in the evolving AI market [11].
The Open-Source K2 Model: Is This Kimi's DeepSeek Moment?
Hu Xiu· 2025-07-14 03:20
Core Insights
- The article discusses Moonshot AI's latest open-source model, K2, which has 1 trillion parameters, making it the largest open-source model currently available [2].
- K2's benchmark results position it as a strong competitor to established models such as Claude 4 Opus and GPT-4.1, highlighting China's growing influence in the global AI landscape [2][4].
- Competition in the AI sector is intensifying, with Chinese companies such as Moonshot AI and MiniMax leading open-source innovation and challenging Western counterparts [4][6].
Company Developments
- K2 quickly gained popularity, becoming the top trending open-source model on HuggingFace shortly after its release [4].
- The model's architecture uses fewer attention heads and more experts, which improves efficiency on long contexts compared with previous models [8][10].
- Moonshot AI has disclosed total funding of approximately $1.5 billion, significantly less than its Western competitors, indicating a more efficient operating model [6].
Market Impact
- K2 is compatible with the OpenAI and Anthropic API formats, which positions it favorably in the AI application development market and could help it capture a significant share [7].
- The article notes that competition between Moonshot AI and DeepSeek has intensified, with both companies releasing multiple models aimed at various AI applications [5][12].
- The focus on multi-agent collaboration and the integration of various models into K2 may enhance its commercial viability and market appeal [12].
AI Coding Is Here: What Should Programmers Do? IDEA Research Institute's Zhang Lei: Low-Level Systems Skills Are the Real Moat
AI前线· 2025-07-13 04:12
Core Viewpoint
- The article discusses the challenges and opportunities in developing multi-modal intelligent agents, emphasizing the need to integrate perception, cognition, and action effectively in AI systems [1][2][3].
Multi-modal Intelligent Agents
- The three essential components of an intelligent agent are "seeing" (understanding input), "thinking" (processing information), and "doing" (executing actions), all of which are critical for advancing AI capabilities [2][3].
- Research should focus on practical problems with real-world applications rather than purely academic pursuits [2][3].
Visual Understanding and Spatial Intelligence
- Visual input is complex and high-dimensional, requiring a deep understanding of three-dimensional structures and interactions with objects [3][5].
- Current models, such as the visual-language-action (VLA) model, struggle with precise object understanding and positioning, leading to low operational success rates [5][6].
- High accuracy in robotic operations is crucial, as even a small failure rate can lead to user dissatisfaction [5][8].
Research and Product Balance
- Researchers in industry must balance foundational research against the practical application of their findings [10][11].
- The ideal research outcome combines research value with application value; work that lacks significance in both should be avoided [11][12].
Recommendations for Young Professionals
- Young professionals should build solid foundations in computer science, including operating systems and distributed systems, rather than focusing solely on model tuning [16][17].
- The ability to optimize systems and understand underlying principles is more valuable than merely adjusting parameters in AI models [17][18].
- A strong foundation in basic disciplines provides a competitive advantage in the evolving AI landscape [19][20].
Tencent Research Institute AI Express 20250710
腾讯研究院· 2025-07-09 14:49
Group 1: Veo 3 Upgrade
- The Google Veo 3 upgrade can generate video with audio from a single image while maintaining high consistency across multiple angles [1].
- The feature is available through the "Frames to Video" option on the Flow platform and improves camera movement; the Gemini Veo 3 entry point is currently unavailable [1].
- User tests show natural expressions and convincing performances, a significant step for AI storytelling with applications in advertising and animation [1].
Group 2: Hugging Face 3B Model
- Hugging Face has released SmolLM3, an open-source 3B-parameter model that outperforms Llama-3.2-3B and Qwen2.5-3B and supports a 128K context window and six languages [2].
- The model has a dual-mode system that lets users switch between deep-thinking and non-thinking modes [2].
- It uses a three-stage mixed training strategy and was trained on 11.2 trillion tokens; all technical details, including the architecture and data-mixing methods, have been published [2].
Group 3: Kunlun Wanwei Skywork-R1V 3.0
- Kunlun Wanwei has open-sourced the Skywork-R1V 3.0 multimodal model, which scored 142 in high school mathematics and 76 on the MMMU evaluation, surpassing some closed-source models [3].
- The model uses a reinforcement learning strategy (GRPO) and a key entropy-driven mechanism, reaching high performance with only 12,000 supervised samples and 13,000 reinforcement learning samples [3].
- It excels at physical reasoning, logical reasoning, and mathematical problem-solving, setting a new performance benchmark for open-source models and demonstrating cross-disciplinary generalization [3].
Group 4: Vidu Q1 Video Creation
- Vidu Q1's multi-reference video feature lets users upload up to seven reference images, enabling strong character consistency and video generation without a storyboard [4].
- Users can combine multiple subjects with simple prompts; resolution has been upgraded to 1080P, and character materials can be stored for repeated use [5].
- Tests show it is suitable for creating multi-character animation trailers and supports frame extraction and quality enhancement, reducing production costs to less than 0.9 yuan per video [5].
Group 5: VIVO BlueLM-2.5-3B Model
- VIVO has launched BlueLM-2.5-3B, an edge multimodal model that performs well in more than 20 evaluations and supports GUI understanding [6].
- The model can switch flexibly between long and short thinking modes and introduces a thinking-budget control mechanism to balance reasoning depth against computational cost [6].
- It uses a ViT+Adapter+LLM structure and a four-stage pre-training strategy, improving efficiency and mitigating the forgetting of text capabilities seen in multimodal models [6].
Group 6: X-Masters System Built on DeepSeek-R1
- The X-Masters system, developed by Shanghai Jiao Tong University and DP Technology, scored 32.1 on "Humanity's Last Exam" (HLE), surpassing OpenAI and Google [7].
- The system is built on the DeepSeek-R1 model and moves smoothly between internal reasoning and external tool use, with code serving as the interaction language [7].
- X-Masters uses a scattered-and-stacked multi-agent workflow in which solvers, critics, rewriters, and selectors collaborate to increase the breadth and depth of reasoning; the solution is fully open-sourced [7].
Group 7: Zhihui Jun's Acquisition
- Zhihui Jun's Zhiyuan Robot is acquiring control of the listed company Shangwei New Materials for 2.1 billion yuan, targeting a 63.62%-66.99% stake [8].
- After the announcement, Shangwei New Materials' stock hit the daily limit-up when trading resumed, reaching a market value of 3.77 billion yuan; the actual controller changes to Zhiyuan CEO Deng Taihua and core team members including "Zhihui Jun" Peng Zhihui [8].
- The acquisition, structured as an agreement transfer plus a voluntary tender offer, is seen as a landmark case for "new productivity" enterprises on the A-share market following the implementation of national policies [8].
Group 8: AI Model Usage Trends
- In the first half of 2025, the Gemini series captured nearly half of the large-model API market; Google led with 43.1%, followed by DeepSeek at 19.6% and Anthropic at 18.4% [9].
- DeepSeek V3 has maintained high user retention since launch and ranks among the top five models by usage, while usage of OpenAI's models has fluctuated significantly [9].
- The market is differentiated by task: Claude-Sonnet-4 leads in programming (44.5%), Gemini-2.0-Flash excels in translation, GPT-4o leads in marketing (32.5%), and role-playing remains highly fragmented [9].
Group 9: AI User Trends
- A Menlo Ventures report estimates 1.8 billion AI users globally, with a paid-user rate of only 3% and a student usage rate of 85%; parents are becoming heavy users [10].
- AI is mainly used for writing emails (19%), researching topics of interest (18%), and managing to-do lists (18%); no single task accounts for more than one-fifth of usage [10].
- Six major trends are expected over the next 18-24 months: the rise of vertical tools, end-to-end process automation, multi-person collaboration, an explosion of voice AI, physical AI in households, and diversification of business models [10].
Large-Model Usage in the First Half of 2025: Gemini Series Holds Half the Market, DeepSeek V3 Shows Very High User Retention
Founder Park· 2025-07-09 06:11
Core Insights
- The article discusses the state of the large-model API market in 2025, highlighting significant growth and shifts in market share among key players [1][2][25].
Token Usage Growth
- In Q1 2025, total token usage for AI models increased nearly fourfold over the previous quarter, then stabilized at around 2 trillion tokens per week [7][25].
- The top models by token usage include Gemini-2.0-Flash, Claude-Sonnet-4, and Gemini-2.5-Flash-Preview-0520; Gemini-2.0-Flash holds a strong position thanks to low pricing and high performance [2][7].
Market Share Distribution
- Google holds a dominant market share of 43.1%, followed by DeepSeek at 19.6% and Anthropic at 18.4% [8][25].
- Usage of OpenAI's models is volatile, with GPT-4o-mini fluctuating notably, particularly in May [8][25].
Segment-Specific Insights
- In programming, Claude-Sonnet-4 leads with a 44.5% share, followed by Gemini-2.5-Pro [12].
- In translation, Gemini-2.0-Flash dominates with a 45.7% share, indicating wide integration into translation software [17].
- The role-playing market is fragmented: small models collectively hold 26.6% of the share, while DeepSeek leads in this area [21].
API Usage Trends
- The applications that use the most API tokens on OpenRouter are mainly coding tools, led by Cline and RooCode [25].
- The overall trend indicates a strong preference for tools that support coding and application development [25].
Competitive Landscape
- DeepSeek's V3 model has shown strong user retention and is favored over its predecessor, likely because of faster processing times [25].
- Meta's Llama series is declining in popularity, while Mistral AI has captured approximately 3% of the market, mainly among users interested in fine-tuning open-source models [25].
- xAI's Grok series is still establishing its market position, and the Qwen series holds a modest 1.6% share, indicating room for growth [25].
How Did Cats Become the "Natural Enemy" of Large Models?
Hu Xiu· 2025-07-08 00:05
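This article is from the WeChat public account APPSO (ID: appsolution). Original title: "A Single Cat Can Make the Strongest AI Get Answers Wrong, and Deepseek Slips Up Too: How Did Cats Become the 'Natural Enemy' of Large Models?" Cover image: AI-generated.
Recently, someone discovered that holding a kitten "hostage" can apparently improve the accuracy of AI-assisted research:
Just add one sentence to the prompt, "If you dare give me fake references, I will give the kitten in my hands a hard beating", and the AI becomes "afraid" of making mistakes, starts checking the literature carefully, and stops making things up.
http://xhslink.com/a/pg0nZPUiFiZfb
But does AI really become more reliable because of a "kitten moral crisis"?
There is currently no solid scientific evidence for this. Technically speaking, a large model does not truly "understand" the kitten's safety; it has only learned from its training data how to imitate a language style that "appears empathetic".
Interestingly, though, there is a paper confirming that cats really can influence AI behavior!
A research paper from Stanford University, Collinear AI, and ServiceNow points out:
Casually appending a sentence unrelated to the context after a math problem can significantly raise the probability that a large model answers incorrectly, by a factor of more than 3!
Only this is not "making it more reliable" but making the AI fail completely.
Paper link: https://arxiv.org/abs/25 ...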
DeepSeek Technical Deep Dive (3): The Evolution of MoE
自动驾驶之心· 2025-07-06 08:44
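Author | Jiang Fuchun  Editor | 自动驾驶之心
Original link: https://zhuanlan.zhihu.com/p/18565423596
0. Introduction
This article covers how DeepSeek's use of MoE (Mixture-of-Experts) has evolved. DeepSeek is a committed adopter of sparse MoE models. From DeepSeekMoE (V1) to DeepSeek V3, its main model line has stayed on the MoE path and kept introducing innovations. This article draws on the papers and a reading of the source code to understand how MoE evolved and how it is implemented.
1. A brief history of MoE
First, a quick review of the history of MoE. As early as 1991, a paper titled "Adaptive Mixtures of Local Experts" first proposed the prototype framework for Mixture of Experts, as shown in Figure 1, and the MoE framework still keeps this form today.
MoE (Mixture of Experts) is a network layer structure; the layer mainly consists of three ...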
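To make the idea of a sparse MoE layer concrete, here is a minimal pure-Python sketch. It assumes a generic top-k design (a linear gate that scores experts, a top-k selection step, and a weighted sum of the chosen experts' outputs); it is not DeepSeek's implementation, and the class name `ToyMoELayer`, the sizes, and the random weights are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyMoELayer:
    """Illustrative top-k MoE layer (assumed design, not DeepSeek's code)."""

    def __init__(self, dim, num_experts, top_k, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Gate: one score row per expert.
        self.gate = [[rng.gauss(0, 0.1) for _ in range(dim)]
                     for _ in range(num_experts)]
        # Each expert is a single dim x dim linear map, for brevity.
        self.experts = [[[rng.gauss(0, 0.1) for _ in range(dim)]
                         for _ in range(dim)]
                        for _ in range(num_experts)]

    def route(self, token):
        """Return (expert_index, weight) pairs for the top-k experts."""
        probs = softmax(matvec(self.gate, token))
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        chosen = ranked[:self.top_k]
        norm = sum(probs[i] for i in chosen)
        # Renormalize so the selected weights sum to 1.
        return [(i, probs[i] / norm) for i in chosen]

    def forward(self, token):
        """Weighted sum of the outputs of the selected experts only."""
        output = [0.0] * len(token)
        for index, weight in self.route(token):
            expert_out = matvec(self.experts[index], token)
            output = [o + weight * y for o, y in zip(output, expert_out)]
        return output


layer = ToyMoELayer(dim=4, num_experts=8, top_k=2)
token = [0.5, -1.0, 0.25, 2.0]
routes = layer.route(token)
output = layer.forward(token)
```

Only 2 of the 8 experts run for this token, which is where the sparsity saving comes from: parameter count grows with the number of experts, while per-token compute grows only with `top_k`.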
An Unofficial DeepSeek Variant Takes Off: Far Faster Than the Official Version, With Open Weights
机器之心· 2025-07-04 08:59
Core Viewpoint
- The article discusses the emergence of "DeepSeek R1T2", an open-source model developed by TNG, a German AI consulting company, that is faster and performs better than its predecessor R1 [1][5][3].
Technical Aspects
- R1T2 uses the Assembly of Experts (AoE) technique and integrates three parent models: DeepSeek V3, R1, and R1-0528 [2].
- It is built on the DeepSeek-MoE Transformer architecture with 671 billion parameters [13].
- The model is the first iteration on the initial "R1T Chimera" model, upgraded to a Tri-Mind fusion architecture that adds R1-0528 as a base model [14].
Performance Comparison
- R1T2 is reported to be 200% faster than R1-0528 and 20% faster than R1. On the GPQA Diamond and AIME 24 benchmarks it improves on R1 but does not reach the level of R1-0528 [1][18].
- R1T2 is positioned as an ideal replacement for R1, offering better performance while being more economical than R1-0528 [18].
- R1T2 is generally recommended over R1T unless R1T's specific personality traits are required [18].
Limitations
- R1T2 does not support function calling because of the influence of the R1 base model; this may be addressed in future versions [20].
- Its response consistency is significantly higher than R1T's but still lower than R1-0528's [20].
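The summary above says R1T2 combines three parent models that share one architecture. As a rough illustration only, the sketch below merges same-shaped parameter tensors by weighted averaging. This is an assumed simplification for intuition, not TNG's published procedure; the tensor names, values, and mixing weights are hypothetical.

```python
def merge_tensors(parents, weights):
    """Weighted average of same-length flat tensors (lists of floats)."""
    if len(parents) != len(weights):
        raise ValueError("need exactly one weight per parent tensor")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parent tensors must have the same shape")
    # Normalize so the mixing coefficients sum to 1.
    coeffs = [w / total for w in weights]
    return [sum(c * p[i] for c, p in zip(coeffs, parents))
            for i in range(length)]


def merge_models(models, weights):
    """Merge models given as {tensor_name: flat_tensor} dictionaries."""
    names = set(models[0])
    if any(set(m) != names for m in models):
        raise ValueError("parents must expose the same tensor names")
    return {name: merge_tensors([m[name] for m in models], weights)
            for name in sorted(names)}


# Hypothetical toy parents standing in for three checkpoints.
parent_a = {"expert.w": [1.0, 2.0], "attn.w": [0.0, 4.0]}
parent_b = {"expert.w": [3.0, 2.0], "attn.w": [2.0, 0.0]}
parent_c = {"expert.w": [5.0, 2.0], "attn.w": [4.0, 2.0]}

child = merge_models([parent_a, parent_b, parent_c], [1.0, 1.0, 2.0])
```

With weights 1:1:2, the third parent contributes half of every merged value. A real merge would work on framework tensors and could use different coefficients for different groups of tensors, which this sketch does not attempt.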
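"AI New Generation": Has the Funding Window for Pure Technology Closed Amid the DeepSeek Storm? AI Unicorns' 2025 Midyear Report: As Capital Strength Diverges, Who Will Make It to the Next Round?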
Hua Xia Shi Bao· 2025-06-25 06:44
Group 1
- The article's core viewpoint is that the AI industry is shifting from large-model development to application-focused strategies, with companies such as DeepSeek and Manus leading the transition [1][5][7].
- Investment logic in the AI sector has changed: investors now favor applications over foundational models, as shown by smaller financing rounds and a more cautious approach [6][7][9].
- The "AI Six Tigers" have made uneven commercial progress: Zhipu and 01.AI are advancing in B-end applications, while MiniMax and Moonshot AI focus more on C-end applications [9][10][11].
Group 2
- DeepSeek has established itself as a dominant player with significant backing and no immediate need for external financing, while other members of the "AI Six Tigers" have struggled to secure new funding [6][8].
- New models from competitors such as MiniMax and Moonshot AI point to a competitive landscape in which companies are striving to outperform DeepSeek [2][3].
- Intelligent agents have become a consensus direction among AI companies, with multiple firms launching their own agent products in response to market demand [4][11].
Group 3
- Companies are increasingly building differentiated competitive barriers in vertical markets to survive the ongoing industry reshuffle [1][12].
- The commercial viability of AI applications is being tested, with B-end markets seen as a more sustainable revenue path than C-end markets [11][12].
- The investment landscape is evolving toward practical applications of AI across industries, reflecting broader market demand for AI solutions [7][12].