DeepSeek
Zhipu AI CEO Zhang Peng: DeepSeek Has Had a Fairly Big Impact on Us
Xin Lang Cai Jing· 2026-01-08 03:33
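Sina disclaimer: All conference transcripts are compiled from on-site shorthand notes and have not been reviewed by the speakers. Sina publishes this article to convey more information, which does not imply endorsement of its views or confirmation of its descriptions. Editor in charge: Li Siyang. Special feature: 未竟之约, Zhang Xiaojun Interviews.
Recently, on the program 未竟之约, Zhipu AI CEO Zhang Peng, in conversation with Zhang Xiaojun, said of DeepSeek that it "has had a fairly big impact on us."
He said DeepSeek's impact was considerable at the research level, the engineering level, and even the market level. He said the team began discussing the matter intensively as soon as they returned from the Spring Festival holiday, that it offered many insights and reminders, and that they learned a great deal from it.
He noted that the conclusion of those discussions was that they should be more open, broaden their own horizons, and view large-model research and the market with an open mind. These factors are often tangled together and are hard to sort out and separate cleanly, so coordination among all parties is still needed, approaching these matters with a more open attitude, their own research ...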
After xAI's $20 Billion: The Large-Model Race Starts Competing on Delivery
Tai Mei Ti APP· 2026-01-08 01:43
Core Insights
- The article emphasizes a shift in the AI industry from model-centric competition to delivery-centric competition: models determine the upper limit of capability, but infrastructure and delivery mechanisms are crucial for scaling and monetizing that capability [1][10][13]
Group 1: Shift in Focus from Models to Delivery
- The transition from model competition to delivery competition is driven by three constraints: rising training and inference costs, accelerated capability diffusion, and the need for a closed commercial loop [2][8]
- The marginal cost of reaching cutting-edge capability is increasing, so leading models need lower inference costs and stable delivery quality to realize their advantages in scalable scenarios [2][9]
Group 2: xAI's $20 Billion Significance
- xAI's $20 billion investment is aimed at strengthening its second and third layers of competitive capability, focusing on infrastructure and delivery systems rather than model development alone [3][10]
- The investment emphasizes expanding computational infrastructure and establishing a visible asset base of over one million H100-equivalent GPUs, thereby improving supply certainty [3][6]
Group 3: Competitive Landscape and Capability Layers
- The competitive landscape is structured into three layers: model and training methods (first layer), infrastructure and supply chain (second layer), and distribution and entry points (third layer) [3][4]
- Major players like Google excel across all three layers, while others like OpenAI and Meta have strengths in specific areas, indicating that companies need to strengthen their infrastructure and delivery capabilities to remain competitive [6][10]
Group 4: Future Competition Dynamics
- Future competition is expected to resemble a platform war rather than a model elimination race, with a focus on scaling delivery capabilities and ensuring compliance and stability [10][11]
- The probability of a single company dominating the global market is low because user preferences and regulatory environments are decentralized, leading to a scenario where platforms compete on delivery and compliance [11][13]
Group 5: Key Indicators for Future Success
- Companies should track three leading indicators to assess competitive positioning: unit inference cost curves, entry-point penetration rates, and delivery capabilities [9][13]
- The ability to convert model capabilities into scalable cash flows will depend on performance in these three areas, marking a significant shift in how success is measured in the AI industry [9][10]
First Bombshell of the New Year! DeepSeek Proposes the mHC Architecture to Crack Large-Model Training Challenges
Sou Hu Cai Jing· 2026-01-07 09:13
Core Insights
- DeepSeek has introduced a new architecture called mHC, aimed at addressing stability issues in large-scale model training while maintaining performance improvements [1][11].
Group 1: Problem Identification
- Large models face a dilemma in training stability: traditional single-channel connections lead to information congestion as model size increases [3][5].
- Previous solutions, like the hyper-connection approach, improved efficiency but introduced new issues such as uncontrolled information amplification or suppression, leading to gradient explosion and training failures [5][7][9].
Group 2: mHC Architecture
- The mHC architecture incorporates an intelligent scheduling system for multi-channel connections, using the Sinkhorn-Knopp algorithm to maintain energy conservation during information transmission [11][13].
- Additional design features include non-negative constraints on input-output mappings to prevent useful signal loss due to coefficient cancellation [15].
Group 3: Infrastructure Optimization
- DeepSeek has optimized its infrastructure by merging multiple computation steps into a single operator, reducing memory read/write cycles, and by employing recomputation strategies to lower memory usage [16][18].
- These optimizations have resulted in significant stability improvements with minimal increases in training time, even at an expansion factor of 4 [18].
Group 4: Performance Validation
- Testing on various model sizes, particularly a 27-billion-parameter model, demonstrated that mHC effectively resolved training instability issues, achieving lower loss values than traditional baseline models [21][22].
- The performance advantages of mHC were consistent across different model sizes, indicating its practical value for both small and large models [24].
Group 5: Industry Implications
- The introduction of mHC suggests a shift in the industry towards refined architectural designs rather than merely increasing parameters and computational power, potentially lowering entry barriers for smaller companies in the large-scale model domain [26][29].
- This pragmatic technological innovation is expected to facilitate the deployment of AI technologies, making it easier for more enterprises to engage in large-scale model development [29].
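The summary above says mHC uses the Sinkhorn-Knopp algorithm to keep information transmission "energy-conserving" and imposes non-negativity constraints. The following is a minimal sketch of the classical Sinkhorn-Knopp iteration only, not DeepSeek's implementation. It assumes that "energy conservation" means the channel-mixing matrix should have every row and every column sum to 1 (a doubly stochastic matrix). The `raw_scores` values, the use of `exp()` to obtain positive entries, and the 4x4 size (chosen to mirror the expansion factor of 4 mentioned above) are all illustrative assumptions.

```python
import math


def sinkhorn_knopp(matrix, iterations=100):
    """Alternately rescale rows and columns of a strictly positive square matrix.

    For strictly positive input, the iterates approach a doubly stochastic
    matrix, i.e. one whose rows and columns each sum to 1.
    """
    m = [list(row) for row in matrix]
    n = len(m)
    for _ in range(iterations):
        # Row step: rescale each row so it sums to 1.
        for i in range(n):
            row_sum = sum(m[i])
            m[i] = [v / row_sum for v in m[i]]
        # Column step: rescale each column so it sums to 1.
        for j in range(n):
            col_sum = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= col_sum
    return m


# Hypothetical raw mixing scores for 4 parallel channels (illustrative values).
raw_scores = [
    [0.5, -1.0, 0.2, 1.5],
    [1.0, 0.3, -0.7, 0.0],
    [-0.2, 0.8, 1.1, -0.5],
    [0.0, 0.4, -0.3, 0.9],
]

# exp() makes every entry strictly positive (an assumed way to enforce
# non-negativity; the article does not say how mHC does this).
positive = [[math.exp(s) for s in row] for row in raw_scores]
mixing = sinkhorn_knopp(positive)

for row in mixing:
    print([round(v, 3) for v in row])
```

Under this assumption, multiplying the channel states by `mixing` forms weighted averages whose weights sum to 1 in both directions, so the mixing step redistributes signal across channels without scaling the total up or down, which is one plausible reading of how the constraint avoids the uncontrolled amplification described in Group 1.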
Annual Feature | 2025 Influential Women: They Invented Their Own Battlefields
Xin Lang Cai Jing· 2026-01-07 08:26
Core Insights
- The narrative around women's influence has fundamentally changed over the past year, showcasing women as powerful figures in various fields rather than as people seeking empowerment from others [1][38].
Part 1: The World Modeler
- Fei-Fei Li, founder of World Labs, has focused on "Spatial Intelligence," launching the Marble product, which creates high-fidelity 3D worlds from images, videos, or text prompts [1][2][3].
Part 2: The Pain Translator
- Han Kang, a Nobel laureate, sparked global discussions on "female bodily sovereignty" and "historical trauma" with her works, including "The Vegetarian" and "The White Book," which became bestsellers [5][7].
Part 3: The Gold Standard
- Caitlin Clark, a WNBA star, doubled viewership and sponsorship fees, proving that female athletes can generate significant commercial value when given equal exposure [11][13].
- Qinwen Zheng, a tennis champion, became a global brand ambassador for Dior and earned $22.6 million in 2025, with 93% from endorsements, redefining the public image of East Asian female athletes [13][17].
Part 4: The Heritage Hacker
- Zong Fuli, president of Hongsheng Beverage Group, undertook digital reforms and brand rejuvenation, applying for a new trademark "Wawa Xiaozong" to establish her own identity separate from her father's legacy [14][16][17].
Part 5: The AI Ethicist
- Mira Murati, former CTO of OpenAI, founded Thinking Machines Lab with a $12 billion valuation, focusing on creating safer and more reliable AI systems and addressing the gap in public understanding of AI [18][20][21].
Part 6: The Invisible Heroine
- Female data annotators in rural China are crucial in training AI models; the work provides stable income and a connection to modern technology, making them visible contributors to the evolution of AI [22][24].
Part 7: The Strategy Sovereign
- Meng Wanzhou, rotating chairwoman of Huawei, shifted the company's focus from survival to leadership in AI, achieving significant milestones in various sectors, including the Harmony ecosystem and AI computing [25][27][28].
Part 8: The Grassroots Healer
- Dr. Lu Shengmei, a pediatrician, has dedicated her life to serving the community, significantly reducing infant mortality rates and becoming a symbol of enduring value in a rapidly changing world [30][31].
Part 9: The Supply Chain Queen
- Wang Laichun, chairwoman of Luxshare Precision, transformed the company from a traditional manufacturer to a technology platform, focusing on high-precision manufacturing and expanding into new markets [32][33][34].
Part 10: The Wilderness Chronicler
- Li Juan, author of "My Altay," received the 2025 China Copyright Golden Award, solidifying her status as a literary figure who connects individual souls with nature and offers a counter-narrative to modern anxieties [35][37].
Redefining Entrepreneurial Spirit in a Period of Transition Through Innovation
Yicai· 2026-01-07 02:34
Core Viewpoint
- Innovation will be the main theme for the "14th Five-Year Plan," as emphasized by Chinese Premier Li Qiang during his visit to Guangdong, highlighting the importance of innovation for economic and social development [2].
Group 1: Innovation as a Core Element
- Companies are the main entities for innovation, and market application is essential for nurturing technological advancements. Successful companies are those that embrace and practice innovation [2].
- The current economic transition and technological revolution require a redefinition of entrepreneurial spirit, shifting from exploiting scarcity to creating and innovating scarcity [3].
Group 2: Market Orientation and Consumer Respect
- Entrepreneurs must focus on the market and respect consumers to drive innovation. Companies lacking this respect will struggle to identify market opportunities and sustain growth [4].
- Chinese entrepreneurs need a cognitive transformation that prioritizes market orientation and consumer needs, moving away from reliance on past successes and superficial marketing tactics [4].
Group 3: Regulatory and Consumer Rights
- Improving legislative quality and enforcement is crucial for protecting consumer rights and ensuring fair market competition. Regulatory bodies must take a firm stance against misleading advertising and practices that infringe on consumer rights [5].
- There is a call to strengthen consumer rights through collective litigation and dispute resolution mechanisms, because the currently low cost to businesses of misleading consumers hinders innovation and economic growth [5].
Group 4: Sustainable Development through Innovation
- Companies must understand that quality products and services are achieved through genuine innovation rather than mere marketing gimmicks. Respecting consumers and focusing on market needs is fundamental for sustainable development and innovation [5].
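Lei Jun Responds to Small-Print Marketing: An Industry Bad Habit, but We Will Change It / DeepSeek's Year-Opening "Trump Card": Paper Co-Signed by Liang Wenfeng Released / Musk Sets a New Year's Goal: Mass Production of Brain-Computer Interfaces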
Sou Hu Cai Jing· 2026-01-06 13:46
Group 1
- Lei Jun, the founder of Xiaomi, addressed the controversy surrounding "small font marketing," stating it is an industry habit that needs to be changed, emphasizing the importance of legal compliance while acknowledging the need for clearer communication with consumers [3][4]
- Xiaomi plans to standardize product annotations using larger fonts in the future, aiming to improve clarity and consumer understanding [4]
- In a recent live stream, Lei Jun revealed that Xiaomi's automotive division aims to deliver over 410,000 vehicles by 2025, with the Xiaomi YU7 model becoming the best-selling mid-to-large SUV for four consecutive months [5][7]
Group 2
- BMW China announced a systematic price adjustment for 31 key models starting January 1, 2026, with the highest price drop reaching 300,000 yuan, reflecting a long-term strategy rather than a short-term price war [11][12]
- The flagship electric model i7 M70L saw a price reduction from 1.899 million yuan to 1.598 million yuan, a decrease of approximately 16%, while the iX1 eDrive25L's price dropped by 24% [12]
- The automotive industry is experiencing significant shifts, with multiple companies reporting their sales figures for 2025, indicating a competitive landscape [7]
Group 3
- OpenAI is reportedly working on multiple AI hardware projects, including a pen-shaped device and portable audio equipment, aiming to create an ecosystem of products rather than a single offering [9][10]
- The new audio model being developed by OpenAI is expected to provide more natural and expressive responses, enhancing user interaction with AI devices [10]
Group 4
- Elon Musk announced that Neuralink plans to begin large-scale production of brain-machine interface devices in 2026, with a focus on simplifying the surgical process for implantation [16][18]
- The company aims to enable users to control computers directly through neural signals, with previous successful trials involving a limited number of patients [18]
Group 5
- Microsoft CEO Satya Nadella emphasized that 2026 will be a pivotal year for AI, marking a transition from initial exploration to widespread application, with a focus on reshaping human-AI relationships and engineering paradigms [27][29][30]
- Nadella highlighted the need for AI to demonstrate tangible positive impacts in the real world to gain societal acceptance [30]
Computer Industry Weekly: Xiaohongshu's Video-Thinker Breaks Tool Dependence, DeepSeek Launches mHC-20260106
Huaxin Securities· 2026-01-06 12:34
Investment Rating
- The report maintains a "Buy" rating for several companies in the AI and computing sectors, including Weike Technology (301196.SZ), Nengke Technology (603859.SH), Hehe Information (688615.SH), and Maixinlin (688685.SH) [9].
Core Insights
- The report highlights the introduction of the Video-Thinker model by Xiaohongshu, which breaks the dependency on external tools for video reasoning, achieving state-of-the-art (SOTA) performance with a 7B parameter version [3][22].
- DeepSeek's new architecture, mHC, shows significant performance improvements with only a 6.7% increase in training time, marking a breakthrough in model efficiency [31][32].
- Kimi, a Chinese AI startup, completed a $500 million Series C funding round, with a post-money valuation of $4.3 billion, focusing on the development of its K3 model and talent incentives for 2026 [4][44].
Summary by Sections
1. Computing Dynamics
- The report notes stable pricing in computing power leasing, with specific rates for various configurations [21].
- Xiaohongshu's Video-Thinker model integrates key capabilities such as temporal grounding and visual description, achieving new benchmarks in video reasoning [22][23].
- The model's training paradigm includes a two-stage process that enhances its reasoning capabilities while reducing reliance on external tools [26][27].
2. AI Application Dynamics
- Character.AI experienced an 8.32% increase in weekly traffic, indicating growing interest in AI applications [30].
- DeepSeek's mHC architecture addresses traditional bottlenecks in model efficiency, providing a robust framework for enhancing model capabilities [31][32].
3. AI Financing Trends
- Kimi's recent funding round will support the development of its K3 model and expansion of its talent pool, following significant technological advancements in 2025 [4][44].
- Meta's acquisition of Manus for $4-5 billion underscores the strategic importance of AI applications and the integration of advanced AI capabilities into its ecosystem [5][6].
4. Market Performance
- The report provides comparative performance metrics for various AI models, showcasing the advancements made by Video-Thinker over existing solutions [28][29].
- The overall market sentiment remains positive, with a focus on the long-term growth potential of AI applications and computing technologies [7].
AI Series Tracking (88): AI Chipmakers List in Quick Succession, DeepSeek Proposes a New Architecture, AI Industrialization Accelerates Again
Changjiang Securities· 2026-01-06 11:10
Investment Rating
- The report maintains a "Positive" investment rating for the industry [7]
Core Insights
- Recent developments in the AI sector include the successful listing of Biren Technology on the Hong Kong Stock Exchange and Baidu's plan to spin off and list Kunlun Chip. DeepSeek has proposed a new mHC architecture that reduces the energy and computational requirements for training advanced AI, potentially accelerating the industrialization of AI [2][4]
- The report highlights the upcoming IPOs of AI companies Zhipu and MiniMax on January 8 and 9, respectively, and notes Doubao's partnership with the Spring Festival Gala as a significant event [2][10]
- The report identifies several promising investment opportunities within the AI sector, including high-quality IP benefiting from AI technology advancements, internet giants with advantages in traffic, models, and data, and vertical sectors like advertising, e-commerce, and education where overseas business models have been successfully replicated in China [2][10]
Summary by Sections
Recent Events
- Biren Technology has successfully listed on the Hong Kong Stock Exchange, filling an important gap in the computing power sector. The company has developed full-chain capabilities from high-end AI chips to computing clusters, built on its self-developed "Biren" GPGPU architecture and related hardware products. The stock surged by 75.82% on its first day, indicating a new phase for the domestic computing power industry [10]
- Baidu's planned spin-off listing of Kunlun Chip is intended to improve valuation transparency and attract investors focused on hard technology. The Kunlun Chip P800 cluster, capable of supporting multiple large models, marks a significant milestone in domestic computing power [10]
- DeepSeek's new mHC architecture addresses issues in the existing Hyper-Connections structure, showing a mere 6.7% increase in training time while achieving significant performance improvements, thus lowering the costs associated with AI model training [10]
Investment Opportunities
- The report emphasizes the accelerating marginal growth in AI and focuses on investment opportunities in the sector. It highlights the potential of high-quality IP benefiting from AI advancements, internet giants with data advantages, and vertical sectors that can replicate successful overseas business models [2][10]
Large Models Usher in a Year of Architectural Revolution; New Shifts Brew Across the AI Industry Chain
Sou Hu Cai Jing· 2026-01-06 10:27
Core Insights
- The AI industry is experiencing a wave of architectural innovations, with significant developments from DeepSeek and from a collaboration between Princeton and UCLA, as well as a new model architecture from Yann LeCun expected within 12 months [1]
- The emergence of new model architectures is seen as a crucial driver for overcoming the diminishing returns associated with existing models, particularly as companies like OpenAI face limitations in performance improvements [1]
- The integration of different architectural innovations will lead to a new phase of application acceleration in the AI sector, emphasizing the importance of user experience and model capability enhancement [1]
Group 1: Architectural Innovations
- DeepSeek introduced the mHC framework, which enhances stability and scalability while reducing computational and energy demands, making it valuable for industrial applications [6]
- The DDL architecture allows neural networks to dynamically manage their internal memory states, catering to complex information processing needs in government, research, and commercial sectors [6]
- Yann LeCun's JEPA architecture breaks the reliance on text, enabling understanding through video and spatial data, which is essential for applications like smart glasses and robotics [6][7]
Group 2: Market Dynamics
- The competition among AI model vendors is intensifying, with varying progress and strategies in implementing new architectures, leading to differentiation in capabilities across different fields [1][5]
- Lenovo's multi-model integration strategy allows users to access various top-tier models, enhancing the "AI Twin" experience and addressing diverse application needs [6][9]
- The rise of large-scale AI enterprises, such as Lenovo's Tianxi AI with over 280 million active users, poses challenges for smaller startups in selecting appropriate architectural paths [3][5]
Group 3: Infrastructure and Hardware
- The new model architectures necessitate upgrades in computational infrastructure, presenting opportunities for suppliers like Lenovo, which has shown proactive market responsiveness [2][10]
- Lenovo's introduction of the WR5215 G5 server, featuring significant performance improvements and compatibility with domestic software ecosystems, highlights its strategic alignment with emerging architectural needs [10][12]
- The collaboration between Lenovo and NVIDIA to develop a revolutionary server indicates a strong focus on integrating advanced chip technology to support diverse AI applications [12][13]
Group 4: Future Outlook
- Anticipation surrounds potential major releases from DeepSeek, which could further stimulate practical applications and economic value in the AI sector [14]
- The ability of companies to adapt to the evolving landscape of model architectures and effectively integrate them into their offerings will be critical for maintaining competitive advantages [9][14]
A Conversation with Zhou Qiren: AI Cannot Replace "What You Truly Love"
Huxiu APP· 2026-01-06 09:13
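This article comes from the WeChat official account China Entrepreneur Magazine (中国企业家杂志); editor: Mi Na; author: Liang Xiao.
When entrepreneurs ask where the road lies, he gives only a three-character answer: "看着办" (roughly, "use your own judgment").
Even without having sat in the classroom of Zhou Qiren, professor at the National School of Development at Peking University (hereafter "PKU NSD"), it is not hard to see in a two-hour interview the sharpness, and the wit, that this economist, known for his incisive and outspoken style, brings to research and scholarship:
You will not hear abstruse concepts and theories from him, only concrete, fine-grained business practice. He is always wary of "vague macro talk" and seemingly universal formulations: "parroting what others say is meaningless." He often turns the tables, shifting from interviewee to questioner, one follow-up after another, until the answer emerges from the back-and-forth. Ideally the other side disagrees: rather than talking to himself, he clearly prefers to argue things out in the "free market of ideas." "Come on, fight," he says, even encouraging the argument to continue.
It is said that many students still miss "Professor Zhou's lunches" after graduating: every Saturday at noon after class, Zhou Qiren would invite some students who were good at asking questions to talk over a meal. One can imagine what a high-energy "brainstorm" that was.
Wang Dingding, also an economics professor at PKU NSD, particularly admires him: "His insight is very different from other people's. He can grasp the details, and the details he grasps ...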