DeepSeek R2 Model

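X @外汇交易员 (Forex Trader)
外汇交易员 (Forex Trader)· 2025-08-29 13:11
The Information: People familiar with the matter say DeepSeek has decided to use Huawei's AI chips to train some of its models, a sign that it is reducing its reliance on Nvidia chips. The decision came after the Chinese government put pressure on domestic tech companies. https://t.co/DLk8n2yMUL 外汇交易员 (@myfxtrader): FT: The delayed release of the DeepSeek R2 model is linked to the use of Huawei Ascend chips. Two people familiar with the matter said Huawei sent a team of engineers to DeepSeek's office to help on site, but the effort did not succeed. DeepSeek is still working with Huawei to make the model compatible with Ascend chips for inference. The sources added that Liang Wenfeng has expressed dissatisfaction internally with R2's progress. https://t.co/4GVJbkwCG6 ...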
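E-Town Semiconductor (屹唐股份) sues Applied Materials; Alibaba launches hiring drive for nearly 1,000 people | Fresh Morning Tech (新鲜早科技)
21st Century Business Herald· 2025-08-14 01:47
Compiled by the New Quality Productive Forces Research Institute of 21st Century Business Herald. Alibaba launches hiring drive for nearly 1,000 people: The Intelligent Information Business Group under Alibaba Group recently launched a large-scale AI talent recruitment plan, with experienced and campus hiring together totaling nearly 1,000 positions. The recruitment focuses on frontier areas including large language models, multimodal recognition and understanding, multimodal training engineering, agent applications, and AI hardware, with positions in Beijing, Shanghai, Hangzhou, Guangzhou, and other cities. Good morning, a new day has begun. What interesting things happened in the tech industry over the past 24 hours? Take a look with 21tech. [Big Tech Watch] ChatGPT's latest feature update: On August 13, OpenAI CEO Sam Altman posted on social media about the latest ChatGPT feature updates. He noted that GPT-5 currently offers three response modes, "Auto", "Fast", and "Deep Thinking". Altman said "Auto" suits most users, but the added controls will help users with specific needs. He also said the new GPT-5 personality under development will feel warmer than the current one while avoiding the excessive enthusiasm of GPT-4o (based on feedback from most users). "The team has learned a deep lesson: in the future we must offer more personalized customization of AI personality," he said. (Zhitong Finance) Apple responds to Musk's monopoly accusation: In response to Tesla CEO Elon Musk's accusation that Apple's App Store favors OpenAI and violates antitrust law, Apple ...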
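Expert Interview Roundup: DeepSeek's second-generation model hits development trouble amid chip shortage
Alpha Factory Research Institute (阿尔法工场研究院)· 2025-06-29 13:15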
Group 1: AI and Technology
- The satellite internet and quantum technology sectors are showing positive performance, with companies in telecommunications, optical communications, and satellite internet expected to experience a new growth phase [1]
- The demand for AI continues to grow, particularly as large enterprises like Oracle and Meta increase capital expenditures, indicating strong growth potential for optical modules as foundational components of computing clusters [1]
- DeepSeek's next-generation R2 AI model development is facing challenges due to a shortage of Nvidia H20 processors in the Chinese market, impacting the training process of the model [3][2]
- The reliance of top Chinese AI companies on American hardware is highlighted by the export restrictions, which poses a significant vulnerability despite DeepSeek's claims of lower resource investment compared to American firms like OpenAI [2]
Group 2: Precious and Industrial Metals
- The demand for gold remains strong due to U.S. fiscal issues and a weakening dollar credit system, with expectations for gold prices to continue rising [1]
- The supply-demand gap for gold is expected to persist throughout the year, with a gradual improvement in fundamentals and a potential downward convergence of the gold-silver ratio, suggesting silver may enter a phase of catch-up [1]
- The demand for energy metals is supported by the robust outlook for the electric vehicle and photovoltaic industries, although the supply side remains in an oversupply situation, keeping prices at the bottom range [1]
- Economic growth significantly impacts the prices of non-ferrous metals, with manufacturing PMI new orders closely correlating with metal prices, while discrepancies in U.S. manufacturing orders and inventory data indicate potential price uncertainties [3]
- Changes in overseas inventory are negatively correlated with metal prices, particularly for tin, copper, lead, and aluminum, suggesting significant impacts from inventory fluctuations [3]
DeepSeek R1 model completes a "minor version trial upgrade"; programming and logical understanding step up a level!
Wallstreetcn (华尔街见闻)· 2025-05-29 00:57
Core Viewpoint - DeepSeek has released an updated version of its R1 model, enhancing its capabilities in semantic understanding, complex logical reasoning, and long text processing stability, amidst escalating competition in the AI sector [1][2].
Group 1: Model Enhancements
- The R1 model has significantly improved its understanding capabilities, with user feedback indicating a notable increase in performance, particularly in activating parameters and presenting key information logically [3].
- Programming capabilities have also seen a substantial upgrade, with users reporting the ability to generate over 1000 lines of code without bugs [4].
- The R1 model is now considered competitive with Claude 4, a leading programming model [5].
Group 2: Previous Model Performance
- Earlier this year, DeepSeek released the DeepSeek-V3-0324 model, which outperformed Claude-3.7-Sonnet in various assessments, particularly in mathematics and coding tasks, and was noted for its strong performance in reasoning tasks despite being a non-reasoning model [6].
- The cost-effectiveness of the R1 model is highlighted, being priced at only 1/11 of Claude-3.7-Sonnet and 1/277 of GPT-4.5, while also being open-source and free for commercial use [7].
Group 3: Market Impact
- The emergence of the R1 model has led to a decline in global tech stocks, as investors question the necessity of significant investments by companies like Microsoft in developing advanced AI models and services [8].
Group 4: Future Developments
- There is ongoing speculation regarding the release of the R2 model, which is expected to enhance code generation capabilities and reasoning in multiple languages. Initial plans for its release were set for early May [9].
- The R2 model is anticipated to use a more advanced mixture-of-experts (MoE) architecture, with a total parameter count projected to reach 1.2 trillion, significantly reducing inference costs compared to GPT-4 [10].
- Despite the speculation, DeepSeek has not officially confirmed any details regarding the R2 model's release timeline [11].
Still waiting for DeepSeek R2? The DeepSeek R1 model's minor version trial upgrade has just been completed! Here is what was optimized
National Business Daily (每日经济新闻)· 2025-05-28 13:03
Core Viewpoint - DeepSeek has announced the completion of a minor version upgrade for its R1 model, inviting users to test the new features on its official website, app, and mini-programs while maintaining existing API interfaces and usage methods [1].
Group 1: Upgrade Features
The upgrade focuses on several key areas:
1. Response quality optimization: higher accuracy in complex reasoning and multi-step calculations, better coherence and clarity in long text understanding and generation, and more reliable specialized outputs such as mathematics and programming [2].
2. A slight improvement in response speed: a 10% to 20% reduction in latency, particularly when processing long text inputs across web, app, and API interfaces [2][4].
3. Enhanced dialogue stability: improved context memory, especially in long conversations, supporting up to 128K context and reducing instances of "forgetting settings" or "going off track" [4].
4. Stable API and interface compatibility: no changes to API calling methods, parameters, or return structures, so users can move to the new version without adjustments [5].
Group 2: Upgrade Process
The upgrade is termed a "trial upgrade" because:
1. It is a "gray release," in which a portion of users experience the upgrade first [6].
2. The company will collect feedback to ensure stability before a full rollout [6].
3. Users of the official app, website, or mini-program may already be using the upgraded version in "Deep Thinking" mode [6].
Group 3: Future Developments
- There is ongoing speculation regarding the release of the DeepSeek R2 model, with the company previously denying rumors about its launch on March 17 [6].
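The Whole Internet Is Waiting for Liang Wenfeng
ifeng Finance (凤凰网财经)· 2025-04-29 12:39
Source | ifeng Tech (凤凰网科技) Author | Jiang Fan Editor | Dong Yuqing With May approaching, Chinese and American tech giants may be heading into a new round of showdown. First, in mid-April, OpenAI released the GPT-4.1, o3, and o4-mini model series in one go; Google brought out Gemini 2.5 Flash Preview, a hybrid reasoning model; and on the same day as Google, Doubao officially released its 1.5 Deep Thinking model at a roadshow in Hangzhou, showing stronger multimodal capabilities. ifeng Tech has learned from industry sources that Alibaba's next-generation large model, Qwen3, will also be released this month. Amid the melee, that "mysterious force from the East" also seems to be quietly preparing a new release. With nerves this sensitive, even the faintest clue gets magnified. Yesterday, Clément Delangue, CEO of Hugging Face, the world's largest open-source AI community, posted an intriguing update on social media. It consisted of only three eyes emojis, along with a link to the DeepSeek team's official repository on Hugging Face. This suspenseful combination set off heated discussion in tech circles, and the industry widely speculates that the DeepSeek R2 model has entered its release countdown. 01 Has the DeepSeek R2 release entered its countdown? Nearly half a ...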
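The Whole Internet Is Waiting for Liang Wenfeng
ChinaVenture (投中网)· 2025-04-29 06:21
Is the DeepSeek R2 model coming? Author | Jiang Fan Editor | Dong Yuqing Source | ifeng Tech (凤凰网科技) With May approaching, Chinese and American tech giants may be heading into a new round of showdown. First, in mid-April, OpenAI released the GPT-4.1, o3, and o4-mini model series in one go; Google brought out Gemini 2.5 Flash Preview, a hybrid reasoning model; and on the same day as Google, Doubao officially released its 1.5 Deep Thinking model at a roadshow in Hangzhou, showing stronger multimodal capabilities. ifeng Tech has learned from industry sources that Alibaba's next-generation large model, Qwen3, will also be released this month. Amid the melee, that "mysterious force from the East" also seems to be quietly preparing a new release. With nerves this sensitive, even the faintest clue gets magnified. Yesterday, Clément Delangue, CEO of Hugging Face, the world's largest open-source AI community, posted an intriguing update on social media. It consisted of only three eyes emojis, along with a link to the DeepSeek team's official repository on Hugging Face. This suspenseful combination set off heated discussion in tech circles, and the industry widely speculates that DeepS ...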
Express | DeepSeek accelerates R2 model development and plans a launch before May; the new model will strengthen coding capabilities
Z Finance· 2025-02-26 08:19
Core Viewpoint - DeepSeek's low-cost AI inference model has caused over $1 trillion in market fluctuations globally, outperforming many Western competitors [1][2].
Group 1: DeepSeek's AI Model
- DeepSeek is accelerating the launch of the successor to its R1 model. The release was initially planned for May and is now targeted for earlier, though no specific date has been given [1].
- The new R2 model is expected to enhance code generation capabilities and expand to more non-English languages [1].
- The R1 model, which was built on relatively weaker Nvidia chips, competes effectively with high-end AI models developed by major US tech companies that have invested billions [1].
Group 2: Industry Impact and Competition
- The release of DeepSeek's R2 model may serve as a pivotal moment in the AI industry, potentially prompting global companies to accelerate their own R&D efforts and challenge the current dominance of a few major players [1].
- The U.S. government may express increased concerns regarding the R2 launch, as it could further motivate Chinese companies to enhance their AI strategies [1].
- Numerous Chinese firms have already indicated plans to integrate DeepSeek's models into their products [1].
Group 3: Company Background
- Information about DeepSeek is limited. Its founder, Liang Wenfeng, became a billionaire through the quantitative hedge fund High-Flyer (Huanfang Quant) [2].
- Former employees and industry professionals describe the company as more of a research laboratory than a traditional profit-driven enterprise [2].