Breaking: Chinese Team Releases SRDA, a New Computing Architecture That Tackles AI Compute Costs at the Root. Has DeepSeek's "Prophecy" Come True?
Xin Lang Cai Jing· 2025-06-09 13:27
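By the Yupan AI (玉盘 AI) team | Reviewed by Hua Wei (华卫)

"For every $1 of value a large model generates, $3 must be paid in compute costs." That compute cost is a challenge is no longer in dispute. Software-level optimizations keep appearing, but solutions that start at the hardware source are few, and most of the new computing hardware on the market, Groq included, was finalized before the large-model boom, so it struggles to fully match what large models need.

Many of the ideas DeepSeek has raised from the user's perspective coincide with what Yupan's SRDA is doing, including IO fusion and 3D-stacked DRAM. Yupan goes further and proposes a more complete architecture design, which may formally open the next generation of computing architectures dedicated to large models.

Today, the China-based team Yupan AI released the white paper "SRDA: A Dedicated Computing Architecture for Large AI Models" (《SRDA AI 大模型专用计算架构》), proposing an entirely new computing architecture: SRDA (System-level Simplified Reconfigurable Dataflow Architecture), which aims to resolve the core bottlenecks of current AI compute at the hardware source.

Meanwhile, half a month ago DeepSeek published the paper "Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI ...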
Report: Core DeepSeek Executive Leaves to Start a Company, Targeting the Agent Track
news flash· 2025-06-09 13:02
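Huxiu (虎嗅) has learned that a core DeepSeek executive has quietly left to start a company and plans to release an Agent product around Christmas 2025. One source told Huxiu that the executive was DeepSeek's former CTO. Another person familiar with the matter, however, told Huxiu that DeepSeek has no position formally titled "CTO," although someone there does carry out the corresponding work. The same person added that the executive's startup has already secured funding from a leading VC firm. ...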
Issue 18 of 2025 (No. 899 Overall): Open-Source Large Model DeepSeek Achieves Three "Firsts...
Sou Hu Cai Jing· 2025-06-07 08:35
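Shared today: Issue 18 of 2025 (No. 899 overall): Open-source large model DeepSeek achieves three "firsts"; open source should be used to advance the inclusive and equitable development of AI. The report runs 10 pages.

DeepSeek's Innovative Practice as an Open-Source Large Model and the Path to Inclusive AI

I. DeepSeek: A New Global Benchmark for Open-Source Large AI Models

Open-sourcing a large AI model requires meeting three core criteria: complete code, public model parameters, and transparent training data. This makes it more complex than open-sourcing traditional software. Until now, most large-model vendors have taken a fully closed or "semi-open" route: OpenAI's GPT-4 is closed, for example, while Meta's Llama 3 is only partially open and carries commercial-use restrictions. Only a few institutions have achieved full-stack open source. DeepSeek sets a new example with full-stack open source and a permissive license: it not only makes its code, weights, and documentation available for download and discloses technical details such as the GRPO training algorithm, but also adopts the MIT license, which imposes no commercial restrictions, and allows users to perform "model distillation," giving the industry a transparent and open technical foundation.

II. DeepSeek's Three Breakthrough "Firsts"

1. A new technical path: opening a second route for large-model development

DeepSeek-R1, trained through pure reinforcement learning (RL), demonstrates that a "small but beautiful" path is feasible, breaking the "resources are everything" mindset that relies on the "Scaling Law." Its inference cost and pricing are markedly lower than those of mainstream international models, offering resource-constrained countries a low-cost, high-efficiency technical option and helping narrow the global ...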
The DeepSeek Moment for China's Innovative Drugs: From "Following" to "Leading" in Some Areas
Core Insights
- The recent $1.25 billion upfront payment from Pfizer to 3SBio for the PD-1/VEGF bispecific antibody license marks a significant milestone for the Chinese pharmaceutical industry, reflecting a shift from "follower" to "leader" in innovation [1]
- The transaction highlights the evolution of Chinese pharmaceutical companies from producing "me-too" products to developing "first-in-class" innovative drugs, which lets them gain pricing power based on unique technologies [2]
- The global pharmaceutical industry is seeing a new value chain model in which Chinese companies apply their engineering and cost advantages to early-stage development, while multinational firms contribute their strengths in regulatory science and global market access [2]

Industry Transformation
- The integration of artificial intelligence (AI) into drug development is turning a traditional, experience-based process into a data-driven, predictable, and optimized industrial process, significantly reducing time and cost [3]
- China's large pool of high-quality engineering talent gains further leverage as drug design becomes more algorithmic, sharpening the country's competitive edge in pharmaceutical innovation [4]
- China's vast data resources, a product of its large patient base and improving healthcare information systems, are becoming a strategic asset for innovation in the AI era [4]

Collaborative Ecosystem
- China is building a comprehensive AI and biopharmaceutical innovation ecosystem, supported by policy reforms that shorten drug review times and improve market access for innovative drugs [4]
- The dual drive of technological and policy innovation is raising the pharmaceutical industry's overall efficiency and commercialization success rates [4]

Future Outlook
- The ongoing AI-driven industrial revolution presents unprecedented opportunities for China's innovative pharmaceutical sector, with new industry leaders potentially emerging from advances in ADC, cell therapy, gene editing, and AI drug design [5]
- The ability to seize these opportunities will shape the industry landscape for the next decade and beyond, making it a critical consideration both for entrepreneurs and for China's broader economic transformation [5]
Morgan Stanley: DeepSeek R2, the Next-Generation AI Reasoning Giant?
Morgan · 2025-06-06 02:37
Investment Rating
- The semiconductor production equipment industry is rated Attractive [5][70].

Core Insights
- The imminent launch of DeepSeek R2, which features 1.2 trillion parameters and significant cost efficiencies, is expected to positively impact the Japanese semiconductor production equipment (SPE) industry [3][7][11].
- The R2 model's capabilities include enhanced multilingual support, broader reinforcement learning, multi-modal functionality, and improved inference-time scaling, which could democratize access to high-performance AI models [7][9][11].
- The development of efficient AI models like R2 is anticipated to increase demand for AI-related SPE, benefiting companies such as DISCO and Advantest [11].

Summary by Sections

DeepSeek R2 Launch
- DeepSeek's R2 model is reported to have 1.2 trillion parameters, a significant increase from R1's 671 billion, and uses a hybrid Mixture-of-Experts architecture [3][7].
- R2 offers cost efficiencies, with input costs of $0.07 per million tokens and output costs of $0.27 per million tokens, compared with R1's $0.15-0.16 and $2.19 respectively [3][7].

Industry Implications
- The launch of R2 is expected to broaden the use of generative AI, increasing demand for AI-related SPE across the supply chain, including equipment such as dicers, grinders, and testers [11].
- The report reiterates an Overweight rating on DISCO and Advantest, which are positioned to benefit from the anticipated increase in demand for AI-related devices [11].

Company Ratings
- DISCO (6146.T) is rated Overweight with a target P/E of 25.1x [12].
- Advantest (6857.T) is also rated Overweight, with a target P/E of 14.0x [15].
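To make the reported price gap concrete, the following is a minimal sketch that applies the per-million-token prices quoted above to a sample workload. The workload size (10 million input tokens, 2 million output tokens) and the helper name `workload_cost` are illustrative assumptions, not figures or terms from the report.

```python
# Prices in USD per million tokens, as quoted in the summary above.
R1_INPUT_RANGE = (0.15, 0.16)  # R1 input price is reported as a range
R1_OUTPUT = 2.19
R2_INPUT = 0.07
R2_OUTPUT = 0.27


def workload_cost(input_tokens, output_tokens, input_price, output_price):
    """Return the USD cost of a workload given per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload (an assumption for illustration, not from the report).
INPUT_TOKENS = 10_000_000
OUTPUT_TOKENS = 2_000_000

r2_cost = workload_cost(INPUT_TOKENS, OUTPUT_TOKENS, R2_INPUT, R2_OUTPUT)
r1_cost_low = workload_cost(INPUT_TOKENS, OUTPUT_TOKENS, R1_INPUT_RANGE[0], R1_OUTPUT)
r1_cost_high = workload_cost(INPUT_TOKENS, OUTPUT_TOKENS, R1_INPUT_RANGE[1], R1_OUTPUT)

# Relative reduction in the output-token price alone.
output_price_cut = 1 - R2_OUTPUT / R1_OUTPUT

print(f"R2 cost: ${r2_cost:.2f}")
print(f"R1 cost: ${r1_cost_low:.2f} to ${r1_cost_high:.2f}")
print(f"Output price reduction: {output_price_cut:.1%}")
```

Under these assumptions most of the saving comes from the output-token price, so workloads that generate long responses would see the largest difference.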
Morgan Stanley: DeepSeek R2 May Be Released Soon - Impact on Japan's SPE Industry
Morgan · 2025-06-06 02:37
Investment Rating
- The semiconductor production equipment industry is rated Attractive [5].

Core Insights
- The imminent launch of DeepSeek R2, which features 1.2 trillion parameters and significant cost efficiencies, is expected to positively impact the Japanese semiconductor production equipment (SPE) industry [3][7].
- The development of lightweight, high-performing AI models like DeepSeek R2 is anticipated to democratize access to generative AI, thereby expanding the market for AI-related SPE [11].

Summary by Sections

DeepSeek R2 Characteristics
- DeepSeek R2 is reported to have 1.2 trillion parameters, of which 78 billion are active, and to use a hybrid Mixture-of-Experts architecture [3].
- R2's input cost is $0.07 per million tokens, significantly lower than R1's $0.15-0.16, while its output cost is $0.27 compared with R1's $2.19 [3][7].
- Enhanced multilingual capabilities and broader reinforcement learning are key upgrades in R2, which can handle several data types, including text, image, voice, and video [9][11].

Market Implications
- The anticipated launch of R2 is expected to boost demand for AI-related devices, including GPUs and HBM as well as custom chips and other AI devices [11].
- The report reiterates an Overweight rating on DISCO and Advantest, which are expected to benefit from increased demand for AI-related devices [7][11].

Company Ratings
- Advantest (6857.T) is rated Overweight with a target price of ¥10,300 based on the expected earnings peak [16].
- DISCO (6146.T) is also rated Overweight, with a target P/E of 25.1x based on earnings estimates [13].
How Does DeepSeek's New R1 Model Actually Perform? A Third-Party Evaluation Is In
Nan Fang Du Shi Bao· 2025-06-05 12:26
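DeepSeek also noted that, compared with the old R1, the new model performs significantly better on complex reasoning tasks. For example, on AIME 2025, a test of mathematical reasoning ability, the new model's accuracy rose from the old version's 70% to 87.5%.

Earlier, when updating the R1 model, DeepSeek said the new R1 had been optimized to address "hallucination." Compared with the old version, the updated model's hallucination rate fell by roughly 45%-50% in scenarios such as rewriting and polishing, summarization, and reading comprehension, allowing it to deliver more accurate and reliable results.

SuperCLUE's evaluation shows the new R1 surpassing o3 on the leaderboard and ranking fourth with a total score of 63.55, 1.61 points higher than the old R1. By comparison, o4-mini (high) scored highest among the tested models at 70.51, and Gemini 2.5 Pro Preview 05-06 ranked second at 66.48.

On May 29, DeepSeek (深度求索) released an upgraded version of the R1 model, four months after the original. Results published on June 4 by SuperCLUE, an authoritative evaluation body for Chinese-language large models, show that the new R1 improves on the old version overall and surpasses OpenAI's o3 model, but still trails models such as o4-mini (high) and Google's Gemini 2.5 Pro Preview 05-06.

In addition, R ...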
DeepSeek's Birthplace Rolls Out Another Plan for an AI Innovation Hub! STAR Market AI ETF (588930) Up More Than 2%, Real-Time Turnover Tops 60 Million Yuan
Mei Ri Jing Ji Xin Wen· 2025-06-05 06:55
Group 1
- The core of the news is the significant development of and investment in artificial intelligence (AI) in Hangzhou, with specific targets set for 2025, including market-scale computing power exceeding 50 EFLOPS and AI core industry revenue exceeding 390 billion yuan [1]
- Hangzhou's implementation plan for AI innovation aims to cultivate two internationally leading foundational models and more than 25 influential industry-specific models, and to establish more than 700 large-scale enterprises in the AI sector [1]
- The A-share market fluctuated slightly, but AI-related stocks surged, with notable gains in companies such as Yuke Technology and Chipone Technology, indicating strong market interest in AI themes [1]

Group 2
- Shanxi Securities highlighted growing global demand for AI computing power, driven particularly by large-model training and inference, which presents significant opportunities for domestic AI and server manufacturers [2]
- Domestic demand for AI computing power remains strong, especially from major internet companies and intelligent computing centers, with IDC predicting that China's accelerated server market will reach $25.3 billion by 2028, growing at a compound annual growth rate of over 20% from 2024 to 2028 [2]
- The introduction of DeepSeek R1 is expected to lower the barriers to AI application development and deployment, making inference demand a primary growth driver for AI computing power and expanding the market for domestic manufacturers [2]
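As a rough check on the IDC forecast quoted above, the sketch below works backward from the 2028 figure to the 2024 market size it implies. It assumes four annual compounding periods between 2024 and 2028 and uses 20% as the lower bound of the stated growth rate; the function name `implied_base` is illustrative, and the result is an inferred upper bound, not a number from the report.

```python
def implied_base(final_value, cagr, years):
    """Back out the starting value implied by a final value and a CAGR."""
    return final_value / (1 + cagr) ** years


FORECAST_2028_USD_BN = 25.3  # IDC forecast quoted above, in billions of USD
MIN_CAGR = 0.20              # "over 20%": treated as a lower bound (assumption)
YEARS = 2028 - 2024          # four compounding periods (assumption)

# A growth rate above 20% implies a smaller 2024 base, so this is an upper bound.
base_2024_upper = implied_base(FORECAST_2028_USD_BN, MIN_CAGR, YEARS)

print(f"Implied 2024 market size: at most about ${base_2024_upper:.1f} billion")
```

A growth rate of exactly 20% roughly doubles the market over four years, since 1.2 to the fourth power is about 2.07.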
How Good Are Midea Air Conditioners? With DeepSeek, They Look Really Appealing!
Cai Fu Zai Xian· 2025-06-04 06:39
Core Viewpoint
- The Midea Fresh Air Machine T6 is positioned as a multifunctional air management solution that prioritizes health and comfort, particularly for families with children, by addressing a range of air quality concerns and taking a holistic approach to indoor air management [1][10].

Group 1: Product Features
- The Midea Fresh Air Machine T6 integrates six functions: air conditioning, fresh air, air purification, disinfection, dehumidification, and humidification, making it a versatile "air steward" [3].
- The air conditioning function uses a design with irregular micro-holes to soften strong airflow, so the room is cooled comfortably without the sensation of direct wind [3].
- The device includes a 3-liter water tank for independent humidification at a rate of 450ml/h, providing continuous moisture for up to 6 hours, and a dehumidification capacity of 5.03kg/h to combat humidity [5].

Group 2: Health and Safety
- The air purification and fresh air functions can quickly restore clean air in the home, effectively removing odors and airborne particles [7].
- The device can eliminate common pathogens such as E. coli bacteria and the H1N1 virus, helping maintain a healthy air environment [8].
- It features DeepSeek technology that automatically senses and adjusts air humidity, temperature, and airflow, improving convenience and health for users [8].

Group 3: User Experience
- The Midea Fresh Air Machine T6 is designed for ease of use, with strong voice interaction capabilities that let users, including children, operate it effortlessly [8].
- The product is presented not just as a machine but as a comprehensive approach to air quality management, reflecting growing awareness of the importance of air quality in daily life [10].
DeepSeek vs. ChatGPT: The Logic Behind Choosing Free or Paid
Sou Hu Cai Jing· 2025-06-04 06:29
Core Insights
- The emergence of DeepSeek, a domestic open-source AI model, has sparked discussion because it is free, yet many users still prefer to pay for ChatGPT, raising questions about user preferences and the quality of AI outputs [1][60].
- The output quality of AI tools is significantly influenced by user interaction, with 70% of output quality depending on how users design their prompts [4][25].

Technical Differences
- DeepSeek uses a Mixture-of-Experts model with a training cost of $5.5 million, making it a cost-effective alternative to ChatGPT, whose training costs run into the hundreds of millions [2].
- In the Chatbot Arena test, DeepSeek ranked third, demonstrating competitive performance, and it excels in mathematical reasoning in particular, with a 97.3% accuracy rate on the MATH-500 test [2].

Performance in Specific Scenarios
- DeepSeek has shown superior performance in detailed analyses and creative writing tasks, providing comprehensive insights and deep thinking capabilities [3][17].
- The model's reasoning process is more transparent but requires structured prompts for optimal use, so user guidance is crucial to getting the most from it [7][12].

Cost and Efficiency
- DeepSeek's pricing is 30% lower than ChatGPT's, its processing efficiency is 20% higher, and its energy consumption is 25% lower [8][9].
- For enterprises, private deployment of DeepSeek can be cost-effective in the long run, with a one-time server investment of around $200,000 that avoids ongoing API fees [9][10].

Deployment Flexibility
- DeepSeek offers flexible deployment: individual developers can run the 7B model on standard hardware, while enterprise setups can support high concurrency [11][10].
- The model's ability to run on lightweight devices significantly lowers the barrier to AI applications [11].

Advanced Prompting Techniques
- Mastering advanced prompting techniques, such as "prompt chaining" and "reverse thinking," can significantly improve DeepSeek's effectiveness [13][14].
- The model's performance can be optimized with multi-role prompts, which let it balance professionalism and readability [15][16].

Language Processing Capabilities
- DeepSeek shows a 92.7% accuracy rate in Chinese semantic understanding, surpassing ChatGPT's 89.3%, and supports classical literature analysis and dialect recognition [17].

Industry Applications
- In finance, DeepSeek improved investment decision efficiency by 40% for a securities company [18].
- In the medical field, it achieved an 85% accuracy rate in disease diagnosis, approaching the level of professional doctors [19].
- In programming assistance, DeepSeek's error rate is 23% lower than GPT-4.5's, and its response time is 40% faster [20].

Complementary Nature of AI Tools
- DeepSeek and ChatGPT are not mutually exclusive but complementary tools, each suited to different tasks depending on user needs [21][22].
- DeepSeek is preferable for deep reasoning, specialized knowledge, and data privacy, while ChatGPT excels in multi-modal interaction and creative content generation [24][22].

Importance of Prompting Skills
- The ability to design effective prompts is becoming a core competency in the AI era and directly influences the quality of AI outputs [54][55].
- The book "DeepSeek Application Advanced Tutorial" aims to improve users' prompting skills and unlock the model's full potential [61].