Official announcement! DeepSeek-V3.1 released, with API prices as low as 0.5 CNY per million tokens
Xin Lang Ke Ji· 2025-08-21 07:05
Core Insights
- DeepSeek announced the release of DeepSeek-V3.1 and will adjust the API pricing effective September 6, 2025 [1][3]
- The new pricing structure includes input prices of 0.5 CNY per million tokens for cache hits and 4 CNY per million tokens for cache misses, with output prices set at 12 CNY per million tokens [1]

Group 1: Upgrade Features
- The V3.1 upgrade introduces a hybrid reasoning architecture that supports both thinking and non-thinking modes within a single model [3]
- Enhanced thinking efficiency allows DeepSeek-V3.1-Think to provide answers in a shorter time than its predecessor, DeepSeek-R1-0528 [3]
- Improved agent capabilities through post-training optimization significantly enhance the model's performance in tool usage and agent tasks [3]

Group 2: User Experience
- The official app and web versions have been upgraded to DeepSeek-V3.1, allowing users to switch freely between thinking and non-thinking modes via a "deep thinking" button [3]
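The pricing above lends itself to a quick back-of-the-envelope cost estimate. A minimal sketch, assuming the announced per-million-token prices; the function name and the example traffic mix are illustrative, not part of any official SDK:

```python
# Prices announced for September 6, 2025, in CNY per million tokens.
PRICE_INPUT_HIT = 0.5    # input tokens served from cache
PRICE_INPUT_MISS = 4.0   # input tokens not in cache
PRICE_OUTPUT = 12.0      # output tokens

def estimate_cost_cny(input_tokens: int, output_tokens: int,
                      cache_hit_rate: float) -> float:
    """Estimate API cost in CNY for a given traffic mix.

    cache_hit_rate is the fraction of input tokens served from cache (0.0-1.0).
    """
    hit_tokens = input_tokens * cache_hit_rate
    miss_tokens = input_tokens - hit_tokens
    return (hit_tokens * PRICE_INPUT_HIT
            + miss_tokens * PRICE_INPUT_MISS
            + output_tokens * PRICE_OUTPUT) / 1_000_000

# Example: 2M input tokens at a 75% cache-hit rate, 0.5M output tokens.
print(round(estimate_cost_cny(2_000_000, 500_000, 0.75), 2))  # → 8.75
```

As the example shows, the cache-hit discount dominates input cost: at a 75% hit rate, the cached input tokens cost less than a third of what the uncached ones do.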
DeepSeek-V3.1 Released
Zheng Quan Shi Bao Wang· 2025-08-21 07:01
Core Insights
- DeepSeek has officially released DeepSeek-V3.1, which includes significant upgrades in its architecture and performance [1]

Group 1: Key Features of DeepSeek-V3.1
- Hybrid reasoning architecture: the model supports both thinking and non-thinking modes simultaneously [1]
- Enhanced thinking efficiency: DeepSeek-V3.1-Think can provide answers in a shorter time compared to DeepSeek-R1-0528 [1]
- Improved agent capabilities: the new model shows significant improvements in tool usage and agent tasks through post-training optimization [1]
DeepSeek-V3.1 officially released: the first step toward the Agent era
Hua Er Jie Jian Wen· 2025-08-21 06:39
Group 1
- DeepSeek officially released DeepSeek-V3.1, featuring a hybrid reasoning architecture that supports both thinking and non-thinking modes [1]
- The new version, DeepSeek-V3.1-Think, offers higher thinking efficiency, providing answers in a shorter time compared to DeepSeek-R1-0528 [1]
- Enhanced agent capabilities have been achieved through post-training optimization, significantly improving performance in tool usage and agent tasks [1]

Group 2
- Starting from September 6, 2025, the pricing for API calls on the DeepSeek open platform will be adjusted: input costs will be 0.5 yuan per million tokens (cache hit) and 4 yuan per million tokens (cache miss), while output costs will be 12 yuan per million tokens [1]
DeepSeek-V3.1 officially released
Di Yi Cai Jing· 2025-08-21 06:37
According to DeepSeek's official WeChat account, DeepSeek-V3.1 has been officially released. This upgrade includes the following main changes: a hybrid reasoning architecture, with one model supporting both thinking and non-thinking modes; higher thinking efficiency, with DeepSeek-V3.1-Think giving answers in less time than DeepSeek-R1-0528; and stronger agent capabilities, with post-training optimization substantially improving the new model's performance in tool use and agent tasks. The official app and web versions have been upgraded to DeepSeek-V3.1. Users can switch freely between thinking and non-thinking modes via the "deep thinking" button. (Source: Di Yi Cai Jing)
DeepSeek and Yushu Technology make the 2025 Fortune China Tech 50 list
Feng Huang Wang· 2025-08-21 05:21
Core Insights - The "Fortune China Top 50 Technology Companies" list was released, featuring companies like Huawei, DeepSeek, and Yushu Technology [1] Group 1: DeepSeek - DeepSeek is recognized as a leading AI large model product in China, with its DeepSeek-R1 model scoring 88.5 on the MMLU benchmark test, which is lower than OpenAI's GPT-4 (92.0) and Google's Gemini Pro (90.0), but higher than Meta's Llama 3 (82.0) and Anthropic's Claude 2 (85.1) [1] - DeepSeek ranks among the top 10 globally in terms of open-source large model downloads, indicating strong market presence [1] - As of June 2025, DeepSeek is projected to have 163 million monthly active users, making it the leading application in AI-generated content globally [1] Group 2: Yushu Technology - In 2024, Yushu Technology achieved global sales of 18,000 quadruped robots, capturing a 23% market share, ranking second only to Boston Dynamics [1] - Yushu Technology was awarded the WIPO 2025 Global Award, distinguishing it as the only representative from China among 780 applicants from 95 countries and regions [1] - The company's success is attributed to innovations in robotic motion control, high-performance joint motors, and real-time systems, along with a comprehensive global intellectual property strategy [1]
DeepSeek updates again, as the industry awaits Liang Wenfeng's next showstopper
Hu Xiu· 2025-08-21 02:28
Core Insights
- DeepSeek has released an updated version of its model, V3.1, which shows significant improvements in context length and user interaction, although it is not the highly anticipated R2 model [2][4][14]
- The model now supports a context length of 128K, enhancing its ability to handle longer texts and improving its programming capabilities [5][10]
- The update merges the functionalities of V3 and R1, leading to reduced deployment costs and improved efficiency [13][25]

Group 1: Model Improvements
- The new V3.1 model has a parameter count of 685 billion, only a slight increase over the previous version, V3, which had 671 billion parameters [7]
- User experience has been enhanced with more natural language responses and the use of tables for information presentation [8][10]
- The programming capabilities of V3.1 have been validated through tests, achieving a score of 71.6% in multi-language programming, outperforming Claude 4 Opus [10]

Group 2: Market Context
- The release of V3.1 comes seven months after the launch of R1, during which time other major companies have also released new models, using R1 as a benchmark [3][16]
- Despite the improvements in V3.1, the industry is still eagerly awaiting the release of the R2 model, which has not been announced [4][20]
- The competitive landscape includes companies like Alibaba and ByteDance, which have launched models that claim to surpass DeepSeek R1 in various metrics [17][19]

Group 3: Future Outlook
- There are indications that the merging of V3 and R1 may be a preparatory step toward the release of a multi-modal model [25]
- Industry insiders suggest that the focus will shift toward innovations in economic viability and usability for future models [24]
- The absence of the R2 model in the current update has heightened expectations for its eventual release, with speculation that it may not arrive until later [21][22]
DeepSeek updates again, as the industry awaits Liang Wenfeng's next showstopper
Xin Lang Ke Ji· 2025-08-21 00:52
Core Viewpoint
- The recent upgrade of DeepSeek to version 3.1 has shown significant improvements in context length and user interaction, while also merging features from previous models to reduce deployment costs [1][11][12]

Group 1: Model Improvements
- DeepSeek V3.1 now supports a context length of 128K, enhancing its ability to handle longer texts [4]
- The model's parameter count increased slightly from 671 billion to 685 billion, but the user experience has improved noticeably [5]
- The model's programming capabilities have been highlighted, achieving a score of 71.6% in multi-language programming tests, outperforming Claude 4 Opus [7]

Group 2: Economic Efficiency
- The merger of the V3 and R1 models allows for reduced deployment costs, requiring only 60 GPUs instead of the previous 120 [12]
- Developers noted that performance could improve by 3-4 times with the new model due to increased cache size [12]
- The open-source release of DeepSeek V3.1-Base on Hugging Face indicates a move toward greater accessibility and collaboration in the AI community [13]

Group 3: Market Context
- The AI industry is closely watching DeepSeek's developments, especially in light of the absence of the anticipated R2 model [19]
- Competitors like OpenAI, Google, and Alibaba have released new models, using R1 as a benchmark for their advancements [1][15]
- The market is eager for DeepSeek's next steps, particularly regarding the potential release of a multi-modal model following the V3.1 update [23]
Hands-on with DeepSeek's quietly launched new model: it beats Claude 4 at programming, but for writing... maybe not
3 6 Ke· 2025-08-20 12:14
Core Insights
- DeepSeek has officially launched and open-sourced its new model, DeepSeek-V3.1-Base, following the release of GPT-5, despite not having released R2 yet [1]
- The new model features 685 billion parameters and supports multiple tensor types, with significant optimizations in inference efficiency and an expanded context window of 128K [1]

Model Performance
- Initial tests show that DeepSeek V3.1 achieved a score of 71.6% on the Aider Polyglot programming benchmark, outperforming other models, including Claude 4 Opus [5]
- The model successfully processed a long text and provided relevant literary recommendations, demonstrating its capability in handling complex queries [4]
- In programming tasks, DeepSeek V3.1 generated code that effectively handled collision detection and included realistic physical properties, showcasing its advanced programming capabilities [8]

Community and Market Response
- Hugging Face CEO Clément Delangue noted that DeepSeek V3.1 quickly climbed to fourth position on the trends chart, later reaching second place, indicating strong market interest [79]
- The update removed the "R1" label from the deep thinking mode and introduced native "search token" support, enhancing the search functionality [79][80]

Future Developments
- The company plans to discontinue the mixed thinking mode in favor of training separate Instruct and Thinking models to ensure higher quality outputs [80]
- As of the latest update, the model card for DeepSeek-V3.1-Base has not yet been released, but further technical details are anticipated [81]
After the DeepSeek V3.1 release, investors should consider these four future-defining questions
3 6 Ke· 2025-08-20 10:51
Core Insights
- DeepSeek has quietly launched its new V3.1 model, which has generated significant buzz in both the tech and investment communities due to its impressive performance metrics [1][2][5]
- The V3.1 model outperformed the previously dominant Claude Opus 4 in programming capabilities, achieving a score of 71.6% in the Aider programming benchmark [2]
- The cost efficiency of V3.1 is notable, with a complete programming task costing approximately $1.01, making it 68 times cheaper than Claude Opus 4 [5]

Group 1: Performance and Cost Advantages
- The V3.1 model's programming capabilities have surpassed those of Claude Opus 4, marking a significant achievement in the open-source model landscape [2]
- The cost to complete a programming task with V3.1 is only about $1.01, a drastic reduction compared to competitors, indicating a strong cost advantage [5]

Group 2: Industry Implications
- The emergence of V3.1 raises questions about the future dynamics between open-source and closed-source models, particularly regarding the erosion and reconstruction of competitive advantages [8]
- The shift toward a "hybrid model" is becoming prevalent among enterprises, combining private deployments of fine-tuned open-source models with the use of powerful closed-source models for complex tasks [8][9]

Group 3: Architectural Innovations
- The removal of the "R1" designation and the introduction of new tokens in V3.1 suggest a potential exploration of "hybrid reasoning" or "model routing" architectures, which could have significant commercial implications [11]
- The concept of a "hybrid architecture" aims to optimize inference costs by using a lightweight scheduling model to allocate tasks to the most suitable expert models, potentially enhancing unit economics [12]

Group 4: Market Dynamics and Business Models
- The drastic reduction in inference costs could lead to a transformation in AI application business models, shifting from per-call or token-based billing to more stable subscription models [13]
- As foundational models become commoditized due to open-source competition, the profit distribution within the value chain may shift toward application and solution layers, emphasizing the importance of high-quality private data and industry-specific expertise [14]

Group 5: Future Competitive Landscape
- The next competitive battleground will focus on "enterprise readiness," encompassing stability, predictability, security, and compliance, rather than solely on performance metrics [15]
- Companies that can provide comprehensive solutions, including models, toolchains, and compliance frameworks, will likely dominate the trillion-dollar enterprise market [15]
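The "model routing" idea described in Group 3 can be sketched in a few lines: a lightweight dispatcher sends each request either to a cheap non-thinking model or an expensive thinking model. This is a hypothetical illustration, not DeepSeek's actual mechanism; the model names, keyword markers, and length threshold are all invented for the example:

```python
def route(prompt: str) -> str:
    """Pick a model tier with a crude complexity heuristic (illustrative only)."""
    # Keywords that hint the request needs deeper reasoning (invented markers).
    hard_markers = ("prove", "step by step", "debug", "optimize")
    if len(prompt) > 2000 or any(m in prompt.lower() for m in hard_markers):
        return "thinking-model"      # higher cost, deeper reasoning
    return "non-thinking-model"      # lower cost, fast answers

print(route("What is the capital of France?"))        # → non-thinking-model
print(route("Prove that the algorithm terminates."))  # → thinking-model
```

In a production system, the heuristic would itself be a small classifier model; the point of the architecture is that routing cost is negligible next to the savings from not running the expensive model on easy requests.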
Altman: DeepSeek and Kimi were key reasons OpenAI went open source
Huan Qiu Wang Zi Xun· 2025-08-20 08:21
Core Viewpoint
- OpenAI's founder Sam Altman believes that the U.S. is underestimating the threat posed by China's next-generation artificial intelligence, and that chip regulations alone are not an effective solution [1][3]

Group 1: AI Competition
- Altman stated that China can develop faster in reasoning capabilities and has strengths in research and product development [3]
- The AI competition between the U.S. and China is deeply intertwined, going beyond simple rankings [3]

Group 2: OpenAI's Strategic Shift
- OpenAI recently released its first open-weight models, gpt-oss-120b and gpt-oss-20b, marking a significant strategic shift from its long-standing closed-source approach [3]
- The decision to release open-weight models was influenced by competition from Chinese models, such as DeepSeek and Kimi K2 [3]
- Altman emphasized that if OpenAI did not act, Chinese open-source models would gain widespread adoption, making this a significant factor in their decision [3]