Seek .(SKLTY)
"Qianfan" Series Ascend DeepSeek Technology Salon Successfully Held in Chongqing
Sou Hu Cai Jing· 2025-06-10 23:57
Group 1
- The event "DeepSeek Technology Salon" was successfully held in Chongqing, focusing on the collaboration between Ascend AI and DeepSeek technology, with over 100 experts from more than 40 industry clients and partners participating [1][4]
- A joint solution for educational intelligent applications was launched by JiFa Education and Huawei, integrating multi-modal AI and adaptive learning algorithms to support various educational scenarios [2]
- The Chongqing Artificial Intelligence Innovation Center aims to leverage its computing power resources to create a leading public service platform for AI in the western region, contributing to the development of the AI industry nationwide [4][13]

Group 2
- Huawei's strategy is shifting from traditional "software customization + services" to a model combining "computing power + data + large models," enhancing service value and reshaping the industry landscape [8][10]
- The Chongqing OpenLab is designed to provide a comprehensive platform for innovation, integrating various centers to facilitate the successful implementation of industry solutions and business scenarios [11]
- The Chongqing Artificial Intelligence Innovation Center's first phase has a computing power of 400P, focusing on promoting AI technology innovation, industry development, and talent cultivation [13]
Ten Reasoning Models Take On the 2025 Gaokao Math Exam: DeepSeek-R1 and Tencent Hunyuan T1 Tie for First, While Musk's Grok 3 Meets Its "Waterloo"
Mei Ri Jing Ji Xin Wen· 2025-06-10 13:53
Core Insights
- The difficulty of the mathematics paper in the 2025 college entrance examination (gaokao) remains a hot topic, and this evaluation looks at how AI reasoning models perform on a standardized test based on the New Curriculum Standard Mathematics Paper I [1]

Group 1: AI Model Performance
- The evaluation tested ten AI reasoning models, including DeepSeek-R1, Tencent's Hunyuan T1, OpenAI's o3, Google's Gemini 2.5 Pro, and xAI's Grok 3, to assess their mathematical capabilities [1]
- DeepSeek-R1 and Tencent's Hunyuan T1 tied for first place with scores of 117, demonstrating exceptional performance in algebra and function problems [4]
- The scores of other models included: iFlytek's Spark X1 with 112, Gemini 2.5 Pro with 109, OpenAI's o3 with 107, Alibaba's Qwen 3 with 106, and Doubao's Deep Thinking with 104 [2][7]

Group 2: Evaluation Methodology
- The assessment used a standardized paper with a total score of 150, but excluded questions requiring graphical analysis to ensure a level playing field among the models [3]
- Scoring followed high school examination standards; for open-ended questions, only the final answer was graded, not the working [3]

Group 3: Notable Failures
- Grok 3, developed by xAI and touted as the "strongest AI," ranked third from the bottom with a score of 91, primarily due to its inability to correctly interpret multiple-choice questions [8]
- Second from the bottom was the Zhipu Qingyan reasoning model, scoring 78, which often faltered at the final step of reasoning and lost points as a result [8][10]
- Kimi k1.5 ranked last, losing significant points on the final two challenging questions [10]
Understanding DeepSeek and OpenAI in One Article: Why Do Entrepreneurs Need Cognitive Innovation?
Sou Hu Cai Jing· 2025-06-10 12:49
Core Insights
- The article emphasizes the transformative impact of AI on business innovation and the necessity for companies to adapt their strategies to remain competitive in the AI era [1][4][40]

Group 1: OpenAI's Journey
- OpenAI was founded in 2015 by Elon Musk and Sam Altman with the mission to counteract the monopolistic tendencies of tech giants and promote open, safe, and accessible AI [4][7]
- The development of large language models (LLMs) by OpenAI is attributed to the effective use of the Transformer architecture and the Scaling Law, which predicts a linear relationship between model size, training data, and computational resources [8][11]
- The emergence of capabilities in models like GPT is described as a phenomenon of "emergence," where models exhibit unexpected abilities when certain thresholds of parameters and data are reached [12][13]

Group 2: DeepSeek's Strategy
- DeepSeek adopts a "Limited Scaling Law" approach, focusing on maximizing efficiency and performance with limited resources, contrasting with the resource-heavy strategies of larger AI firms [18][22]
- The company employs innovative model architectures such as Multi-Head Latent Attention (MLA) and Mixture of Experts (MoE) to optimize performance while minimizing costs [20][21]
- DeepSeek's R1 model, released in January 2025, showcases its ability to perform complex reasoning tasks without human feedback, marking a significant advancement in AI capabilities [23][25]

Group 3: Organizational Innovation
- DeepSeek promotes an AI Lab paradigm that encourages open collaboration, resource sharing, and dynamic team structures to foster innovation in AI development [27][28]
- The organization emphasizes self-organization and autonomy among team members, allowing for a more flexible and responsive approach to research and development [29][30]
- The company's success is attributed to breaking away from traditional corporate constraints, enabling a culture of creativity and exploration in foundational research [34][38]
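The Mixture of Experts (MoE) idea mentioned in the DeepSeek strategy summary can be illustrated with a toy example. The sketch below is a generic top-k gating scheme, not DeepSeek's actual implementation: the expert functions, the router scores, and the names `route_top_k` and `moe_forward` are all hypothetical. It only shows why MoE can reduce compute: for each token, a router picks a few experts and the rest are never evaluated.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Weights are the softmax probabilities of the selected experts,
    renormalized so that they sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_logits, k=2):
    """Evaluate only the selected experts and mix their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Hypothetical toy experts: each one is just a scalar function.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]
# Made-up router scores for a single token (assumption, not real model output).
router_logits = [0.1, 2.0, -1.0, 2.0]

selected = route_top_k(router_logits, k=2)
output = moe_forward(3.0, experts, router_logits, k=2)
print(selected, output)
```

With `k=2`, only two of the four toy experts run for this token. In a real model the experts would be feed-forward sub-networks and the router scores would be learned, but the selection-and-mixing step has the same shape.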
Breaking: Chinese Team Releases New SRDA Computing Architecture to Tackle AI Compute Costs at the Root; Has DeepSeek's "Prophecy" Come True?
Xin Lang Cai Jing· 2025-06-09 13:27
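Author | Yupan AI (玉盘 AI) team  Reviewer | Hua Wei (华卫)

"For every $1 of value a large model generates, $3 must be paid in compute costs": the compute-cost challenge is no longer in dispute. Software-level optimizations keep emerging, but solutions that start from the hardware itself are few and far between. Most of the new computing hardware on the market, Groq included, was finalized before the large-model boom and struggles to fully match the needs of large models.

Many of the ideas DeepSeek has put forward from a user's perspective coincide with what Yupan's SRDA is doing, including IO fusion and 3D-stacked DRAM. Yupan goes further and proposes a more complete architecture design, which may formally open the era of next-generation computing architectures dedicated to large models.

Today, the Chinese team Yupan AI released the white paper "SRDA AI 大模型专用计算架构" (SRDA: a dedicated computing architecture for AI large models), proposing a brand-new computing architecture: the System-level Simplified Reconfigurable Dataflow Architecture (SRDA), which addresses the core bottlenecks of current AI compute at the hardware source.

Meanwhile, half a month ago DeepSeek published the paper "Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI ...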
Report: Core DeepSeek Executive Leaves to Start a Company, Targeting the Agent Track
news flash· 2025-06-09 13:02
Core Insights
- A core executive from DeepSeek has quietly left to start a new venture, planning to launch an Agent product around Christmas 2025 [1]
- The departing executive is reported to be the former CTO of DeepSeek, although there is no official CTO position within the company [1]
- The new startup has secured funding from a prominent venture capital firm [1]
Core DeepSeek Executive Leaves to Start a Company, Targeting the Agent Track | Exclusive
Hu Xiu· 2025-06-09 08:24
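Produced by | Huxiu Technology Group  Author | Song Sihang  Editor | Miao Zhengqing  Header image | Visual China (视觉中国)

Huxiu has learned from multiple independent sources that half a year ago a core DeepSeek executive quietly left to start a company and will release an Agent product around Christmas 2025. One source told Huxiu that the executive was DeepSeek's former CTO. Another person familiar with the matter, however, told Huxiu that DeepSeek has no clearly defined CTO role: no position inside the company is formally titled "CTO", although someone does carry out the corresponding work.

The same person further told Huxiu that this executive's startup has secured financing from IDG Capital. Huxiu sought confirmation from IDG, and an IDG representative said they were not aware of the matter. An industry insider told Huxiu that this kind of "no comment" stance is not unusual in investment circles, especially where highly sensitive talent moves and frontier technology tracks are involved.

Rewinding half a year, December 2024 to January 2025 was exactly when DeepSeek was at its hottest: in that period it released and open-sourced, one after the other, the V3 model, with its extreme cost reduction and efficiency gains, and the reasoning model R1. Yet this executive chose precisely that moment to leave and enter the Agent track, a choice of timing that is intriguing.

This raises several questions: why did this executive choose to leave when DeepSeek was at its hottest? If the goal is to build an Agent, why ...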
2025 Issue 18 (Issue 899 Overall): Open-Source Large Model DeepSeek Achieves Three "First
Sou Hu Cai Jing· 2025-06-07 08:35
Core Insights
- DeepSeek has established itself as a new benchmark in the global open-source AI model landscape, adhering to three core standards: complete code, public model parameters, and transparent training data, which sets it apart from traditional software open-source practices [1][13][14].

Group 1: DeepSeek's Innovations
- DeepSeek has achieved three groundbreaking "firsts" in the AI model domain:
  1. It has pioneered a second development path for large models through pure reinforcement learning (RL), demonstrating a viable "small but beautiful" approach that significantly reduces inference costs compared to mainstream models, thus aiding resource-limited countries [2][17].
  2. The application of DeepSeek has surged, with its app reaching 16 million downloads in just 18 days and daily active users surpassing 30 million, setting industry records and attracting global media attention [3][18].
  3. DeepSeek has initiated an "Android moment" in the AI field by fostering a comprehensive ecosystem that integrates models, chips, and systems, attracting numerous hardware and software manufacturers globally [4][20].

Group 2: Recommendations for AI Inclusivity
- To promote AI inclusivity and equity, the following strategies are recommended:
  1. Strengthen collaborative innovation by leveraging open-source platforms like GitHub and Hugging Face to encourage enterprises and research institutions to engage in secondary development based on DeepSeek's open-source achievements [5][21].
  2. Accelerate the application of open-source large models across various industries, developing specialized models and high-quality datasets to support the modernization of industries [6][21].
  3. Enhance public understanding of AI through educational initiatives, fostering partnerships between enterprises and educational institutions to build development platforms and organize events to raise awareness of AI technologies [7][22].

Group 3: Conclusion
- The emergence of DeepSeek signifies a transition from technical exploration to ecosystem construction in open-source large models, with its low-cost, high-performance, and fully open characteristics reshaping the competitive landscape and providing a feasible path for global AI inclusivity and equity [8].
The DeepSeek Moment for China's Innovative Drugs: From "Following" to "Leading" in Some Areas
Core Insights
- The recent $1.25 billion upfront payment from Pfizer to 3SBio for a license to its PD-1/VEGF bispecific antibody marks a significant milestone for the Chinese pharmaceutical industry, reflecting a shift from "follower" to "leader" in innovation [1]
- This transaction highlights the evolution of Chinese pharmaceutical companies from producing "me-too" products to developing "first-in-class" innovative drugs, allowing them to gain pricing power based on unique technologies [2]
- The global pharmaceutical industry is witnessing a new value chain model where Chinese companies leverage their engineering and cost advantages for early-stage development, while multinational firms utilize their strengths in regulatory science and global market access [2]

Industry Transformation
- The integration of artificial intelligence (AI) in drug development is transforming the traditional, experience-based process into a data-driven, predictable, and optimized industrial process, significantly reducing time and costs [3]
- China's large pool of high-quality engineering talent is being further amplified as drug design becomes more algorithmic, enhancing the country's competitive edge in pharmaceutical innovation [4]
- The vast data resources available in China, due to its large patient base and improving healthcare information systems, are becoming a strategic asset for innovation in the AI era [4]

Collaborative Ecosystem
- China is building a comprehensive AI and biopharmaceutical innovation ecosystem, supported by policy reforms that shorten drug review times and improve market access for innovative drugs [4]
- The dual drive of technological and policy innovation is enhancing the overall efficiency and commercialization success rates of the pharmaceutical industry [4]

Future Outlook
- The ongoing industrial revolution, driven by AI, presents unprecedented opportunities for the Chinese innovative pharmaceutical sector, with the potential for new industry leaders emerging from advancements in ADC, cell therapy, gene editing, and AI drug design [5]
- The ability to seize these opportunities will shape the industry landscape for the next decade and beyond, making it a critical consideration for both entrepreneurs and the broader economic transformation in China [5]
Morgan Stanley: DeepSeek R2, the Next-Generation AI Reasoning Giant?
Morgan · 2025-06-06 02:37
Investment Rating
- The semiconductor production equipment industry is rated as Attractive [5][70].

Core Insights
- The imminent launch of DeepSeek R2, which features 1.2 trillion parameters and significant cost efficiencies, is expected to positively impact the Japanese semiconductor production equipment (SPE) industry [3][7][11].
- The R2 model's capabilities include enhanced multilingual support, broader reinforcement learning, multi-modal functionalities, and improved inference-time scaling, which could democratize access to high-performance AI models [7][9][11].
- The development of efficient AI models like R2 is anticipated to increase demand for AI-related SPE, benefiting companies such as DISCO and Advantest [11].

Summary by Sections

DeepSeek R2 Launch
- DeepSeek's R2 model is reported to have 1.2 trillion parameters, a significant increase from R1's 671 billion parameters, and utilizes a hybrid Mixture-of-Experts architecture [3][7].
- The R2 model offers cost efficiencies with input costs at $0.07 per million tokens and output costs at $0.27 per million tokens, compared to R1's $0.15-0.16 and $2.19 respectively [3][7].

Industry Implications
- The launch of R2 is expected to broaden the use of generative AI, leading to increased demand for AI-related SPE across the supply chain, including devices like dicers, grinders, and testers [11].
- The report reiterates an Overweight rating on DISCO and Advantest, which are positioned to benefit from the anticipated increase in demand for AI-related devices [11].

Company Ratings
- DISCO (6146.T) is rated Overweight with a target P/E of 25.1x [12].
- Advantest (6857.T) is also rated Overweight, with a target P/E of 14.0x [15].
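The R2 and R1 per-token prices quoted in the summary above can be put side by side with a quick calculation. The sketch below uses only the reported figures (USD per million tokens); the helper names `reduction` and `job_cost` and the 10M-input / 2M-output workload are hypothetical, chosen purely for illustration, and the R2 figures are reported numbers for an unreleased model rather than confirmed pricing.

```python
# Prices in USD per million tokens, as quoted in the report summary.
R1_INPUT_RANGE = (0.15, 0.16)  # R1 input price is given as a range
R1_OUTPUT = 2.19
R2_INPUT = 0.07                # reported, unconfirmed
R2_OUTPUT = 0.27               # reported, unconfirmed


def reduction(old_price, new_price):
    """Fractional price cut when moving from old_price to new_price."""
    return 1.0 - new_price / old_price


def job_cost(input_tokens, output_tokens, input_price, output_price):
    """Total USD cost of a job, given per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


input_cut_low = reduction(R1_INPUT_RANGE[0], R2_INPUT)
input_cut_high = reduction(R1_INPUT_RANGE[1], R2_INPUT)
output_cut = reduction(R1_OUTPUT, R2_OUTPUT)

# Hypothetical workload: 10M input tokens and 2M output tokens.
r1_cost = job_cost(10_000_000, 2_000_000, R1_INPUT_RANGE[0], R1_OUTPUT)
r2_cost = job_cost(10_000_000, 2_000_000, R2_INPUT, R2_OUTPUT)

print(f"input price cut: {input_cut_low:.1%} to {input_cut_high:.1%}")
print(f"output price cut: {output_cut:.1%}")
print(f"example job: R1 ${r1_cost:.2f} vs R2 ${r2_cost:.2f}")
```

On these figures the output price falls far more steeply than the input price, so the savings on any given job depend heavily on how output-heavy the workload is.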
Morgan Stanley: DeepSeek R2 May Launch Soon, Implications for Japan's SPE Industry
Morgan · 2025-06-06 02:37
Investment Rating
- The semiconductor production equipment industry is rated as Attractive [5]

Core Insights
- The imminent launch of DeepSeek R2, which features 1.2 trillion parameters and significant cost efficiencies, is expected to positively impact the Japanese semiconductor production equipment (SPE) industry [3][7]
- The development of lightweight, high-performing AI models like DeepSeek R2 is anticipated to democratize access to generative AI, thereby expanding the market for AI-related SPE [11]

Summary by Sections

DeepSeek R2 Characteristics
- DeepSeek R2 is reported to have 1.2 trillion parameters, of which 78 billion are active, and to use a hybrid Mixture-of-Experts architecture [3]
- The input cost for R2 is $0.07 per million tokens, significantly lower than R1's $0.15-0.16, while the output cost is $0.27 per million tokens compared to R1's $2.19 [3][7]
- Enhanced multilingual capabilities and broader reinforcement learning are key upgrades in R2, and the model can handle various data types including text, image, voice, and video [9][11]

Market Implications
- The anticipated launch of R2 is expected to boost demand for AI-related devices, including GPUs and HBM, as well as custom chips and other AI devices [11]
- The report reiterates an Overweight rating on DISCO and Advantest, which are expected to benefit from increased demand for AI-related devices [7][11]

Company Ratings
- Advantest (6857.T) is rated Overweight with a target price of ¥10,300 based on expected peak earnings [16]
- DISCO (6146.T) is also rated Overweight with a target P/E of 25.1x based on earnings estimates [13]