Large Language Models
New DeepSeek Paper Signed by Liang Wenfeng: "Breaking Through GPU Memory Limits"
Guan Cha Zhe Wang· 2026-01-13 12:28
Core Insights
- DeepSeek, a Chinese AI startup, has published a technical paper introducing a new model training technique that bypasses GPU memory limitations, highlighting its focus on cost efficiency despite existing gaps with leading US firms [1][2]
- The new technique, termed "Engram," addresses the bottleneck of limited high-bandwidth memory (HBM) in scaling AI models, an area where the hardware gap between China and the US is significant [3][4]
- The paper has drawn attention from industry professionals in both China and the US, underscoring DeepSeek's role as a leading AI innovator over the past year [1][2]

Technical Developments
- The paper, titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models," presents "conditional memory" technology aimed at improving the efficiency of AI models when processing long contexts, a major challenge for AI chatbots [2][3]
- The Engram technique decouples computation from storage, letting the model retrieve foundational information more efficiently [3][4]
- The technology was validated on a model with 27 billion parameters, showing performance improvements on key industry benchmarks [3]

Market Position and Competition
- DeepSeek's previous model, DeepSeek-R1, was trained in two months at a cost of $5.5 million, far less than competitors such as OpenAI spent, while achieving comparable performance [6][7]
- Microsoft President Brad Smith has noted that US AI companies are being overtaken by Chinese competitors like DeepSeek, particularly in emerging markets, thanks to the low cost and ease of use of Chinese open-source models [7]
- Anticipation is building for DeepSeek's upcoming V4 model, expected to launch in mid-February and reported to have strong programming capabilities [8]
A Review of Tesla's FSD Evolution: Pushing End-to-End Toward the Driverless Endgame
36Kr· 2026-01-13 12:14
Core Insights
- Tesla's FSD V14 has demonstrated significant advances in autonomous driving, completing a cross-country journey of 2,732 miles (approximately 4,400 kilometers) with zero human intervention [2][7][35]
- The evolution of Tesla's FSD system from V12 to V14 shows a shift from rule-based to data-driven approaches, improving the system's ability to learn and adapt to complex driving scenarios [19][45][86]

Group 1: Tesla's FSD Development
- Tesla's FSD V14 completed a cross-country trip, showcasing advanced autonomous driving with zero human intervention [2][7]
- A comparable test by Delphi in 2015 took 9 days and required significant human intervention, highlighting how far Tesla's technology has come [5][6]
- FSD V14 is seen as a potential industry benchmark, with Nvidia's Jim Fan suggesting it may have passed a "physical Turing test" [8][9]

Group 2: Technical Evolution of FSD
- The transition from FSD V12 to V14 represents a major leap in capability: V12 focused on end-to-end learning, while V13 enhanced contextual understanding [18][24][35]
- FSD V13 introduced a new hardware platform (HW4) with a fivefold increase in AI computing power, enabling more complex decision-making [31][32]
- FSD V14 further extends the system's capabilities, allowing it to operate under L4 conditions and paving the way for the commercial rollout of Robotaxi services [35][40]

Group 3: Competitive Landscape
- Domestic competitors are narrowing the gap with Tesla, with some claiming the distance has shrunk from three years to one year in technology terms [12][13]
- The competitive focus is shifting from generational differences to engineering efficiency, as companies optimize their models and data within limited resources [86]
- Tesla's distinctive approach, integrating autonomous driving with robotics and leveraging extensive data and computing resources, sets it apart from domestic players [67][70][76]
Dragon-Tiger List Recap | AI Healthcare Stocks Rally Across the Board; Top Hot Money Locks Up an Aerospace Leader That Swung From Limit-Down to Limit-Up
Xuan Gu Bao· 2026-01-13 10:49
Group 1
- The core point of the news is that institutional investors are trading actively, with 67 stocks appearing on the institutional leaderboard: 47 saw net buying and 20 saw net selling [1]
- The top three stocks by institutional net buying are China Satellite (¥679 million), Yonyou Network (¥655 million), and Hengwei Technology (¥404 million) [1]
- Yonyou Network's stock rose 7.87%; the company has invested over ¥10 billion in product upgrades over the past two years [2][3]

Group 2
- AI for Science (AI4S) is identified as one of the three core directions of artificial intelligence, alongside large language models and embodied intelligence, focusing on accelerating scientific research with AI [3]
- The AI4S sector is expected to grow significantly, particularly in pharmaceutical research, materials science, and energy chemistry, with 2026 projected as a potential breakthrough year for AI4S technology [3]
- Demand for AI in healthcare is highlighted, with recent developments indicating a real need for AI in medical applications, as evidenced by the launch of "Ant Financial's AI" and the progress of companies such as Zhipu and MiniMax [3]
In Conversation with Qianxun Intelligence's Han Fengtao: Real Robots Are Productivity, Not Exhibits or Toys
雷峰网· 2026-01-13 10:20
Core Viewpoint
- The article discusses the launch of Spirit v1.5, now the world's strongest open-source embodied model, surpassing the previous benchmark Pi0.5 and marking a significant advance in embodied intelligence [3][6]

Group 1: Development of Embodied Intelligence
- The launch of Spirit v1.5 marks a pivotal moment for embodied intelligence, posting a task success rate of over 50% in real-world scenarios versus Pi0.5's 42.67% [6]
- The founder believes 2026 will be a competitive year for embodied intelligence, similar to the rapid advances large language models saw in 2023 [6][9]
- The company plans to expand its data collection team to nearly 1,000 people to improve the quality and quantity of data, both crucial for model performance [6][36]

Group 2: Historical Context and Market Position
- China's industrial robot market has risen sharply, with domestic robots' market share growing from 3% in 2015 to over 50% by 2024 [8][12]
- The founder emphasizes that the current era of embodied intelligence is driven by revolutionary changes in AI technology that let robots perform meaningful tasks [9][20]
- The company aims to differentiate itself by putting AI at the core of its operations rather than merely producing hardware [18][24]

Group 3: Data Collection and Model Training
- The company is building its own data collection systems to gather enough data for training, targeting 1 million hours of data to reach better model performance [36][40]
- The founder stresses collecting real-world data through the company's own robots rather than relying on third-party data, which may not be effective for training [37][38]
- The company believes a model capable of L2 tasks, such as folding clothes, is essential for commercial viability and for getting the data flywheel turning [32][40]

Group 4: Future Outlook and Market Potential
- The company anticipates that the market for capable robots will grow substantially, with potential sales volumes approaching those of automobiles and smartphones [28][29]
- The founder predicts 2026 will be a critical year for embodied intelligence, akin to the pivotal moments in the development of large language models, drawing more investment and interest to the sector [44]
A Survey of AI Application Investment Opportunities
2026-01-13 01:10
Summary: AI Application Investment Opportunities, 2026-01-12
- AI applications are improving markedly at the margin, and large language model iteration accelerated to a quarterly cadence in 2025. Leading labs such as Google (Gemini), Anthropic, and OpenAI are competing fiercely; model performance advances in pulses driven by paradigm shifts, with online learning or lifelong learning emerging as a new direction.
- Multimodal models hold enormous potential. They are still at an early stage but are expected to develop by leaps and bounds. OpenAI's weekly active users (WAU) are approaching 1 billion and could reach 2 billion by the end of 2026; AI has become a force that cannot be ignored in the global traffic landscape.
- Differences in payment habits between domestic and overseas users shape China's AI application market: the overseas consumer subscription model has struggled to gain traction domestically, and B2B monetization is also difficult. Value-added services such as education still offer room for revenue growth, and companies with demonstrable AI results will draw more attention.
- Among Hong Kong-listed companies, Alibaba, Kuaishou, Meitu, and Fubo lead in AI applications and are worth watching. Alibaba is actively deploying AI to optimize its supply chain and customer experience; Kuaishou uses AI to improve content recommendation; Meitu enhances image processing with AI; Fubo has advanced AI technology in specific domains.
- OpenAI has sharply raised its 2026-2029 revenue forecasts and is exploring e-commerce and advertising to monetize free users, planning in 2026 to realize $3 billion in free-user ...
Just In: Liang Wenfeng Signs an Open-Source "Memory" Module, Bringing DeepSeek V4 Into Sharper Focus
程序员的那些事· 2026-01-13 00:56
Core Insights
- DeepSeek has released a new research paper, "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models," in collaboration with Peking University, focusing on enhancing large language models (LLMs) through conditional memory and a new module called Engram [1][3][4]

Group 1: Research Background and Problem Statement
- Current large language models rely primarily on Mixture of Experts (MoE) for sparsity, but existing Transformer architectures lack native knowledge retrieval mechanisms, forcing them to simulate retrieval behavior inefficiently [3][9]
- DeepSeek proposes conditional memory as a complement to MoE, introducing the Engram module to address these limitations [4][9]

Group 2: Engram Module and Its Functionality
- The Engram module modernizes classic n-gram embeddings, enabling knowledge retrieval with O(1) time complexity (a minimal illustrative sketch follows this summary) [9]
- Engram separates static knowledge storage from dynamic computation, offloading the reconstruction burden from the model's shallow layers and freeing capacity for complex reasoning [11][13]

Group 3: Performance Improvements
- Engram has been scaled to 27 billion parameters, showing significant gains over a pure MoE baseline under equivalent parameter and FLOPs budgets [11]
- Engram notably improves knowledge retrieval, with gains on benchmarks such as MMLU (+3.4) and CMMLU (+4.0), and on general reasoning tasks such as BBH (+5.0) and ARC-Challenge (+3.7) [11][38]

Group 4: System Efficiency and Scalability
- Engram's deterministic addressing supports prefetching from host memory at runtime with minimal performance overhead, enabling efficient memory management (see the prefetch sketch below) [12][19]
- The architecture decouples parameter storage from computational resources, allowing parameters to scale linearly with the number of accelerators [21][22]

Group 5: Experimental Results
- Four models were trained: Dense-4B, MoE-27B, Engram-27B, and Engram-40B, all using the same training data and procedures [35][36]
- The sparse architectures (MoE-27B, Engram-27B/40B) significantly outperformed the dense model (Dense-4B) across benchmarks, demonstrating superior scaling properties [38]

Group 6: Long Context Training
- The Engram architecture shows clear advantages on long-context tasks by preserving valuable attention capacity for global context processing [41]
- Controlled experiments indicate that Engram outperforms MoE models on complex retrieval tasks, supporting the claimed architectural advantage [46]
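To make the "modernized n-gram embedding" idea concrete, here is a minimal sketch of a hashed n-gram lookup layer in PyTorch. It illustrates the general technique the summary describes, not DeepSeek's released code: the hash function, the n-gram orders, and how the retrieved vectors are merged into hidden states are all assumptions made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HashedNGramMemory(nn.Module):
    """Toy conditional-memory layer: each n-gram of recent token ids is
    hashed to a slot in a large embedding table, so retrieval is a single
    O(1) table read per n-gram, with no attention or matrix multiply."""

    def __init__(self, table_size: int, dim: int, max_n: int = 3):
        super().__init__()
        self.table_size = table_size
        self.max_n = max_n
        # The table holds the "static knowledge" parameters; it can be far
        # larger than the compute path because each token touches O(1) rows.
        self.table = nn.Embedding(table_size, dim)

    def _hash(self, grams: torch.Tensor) -> torch.Tensor:
        # Multiplicative hash over the token ids of each n-gram (illustrative;
        # any well-mixing deterministic hash gives the O(1) addressing).
        h = torch.full_like(grams[..., 0], grams.shape[-1])  # seed by order n
        for k in range(grams.shape[-1]):
            h = h * 1000003 + grams[..., k]  # int64 wrap-around is fine here
        return h.remainder(self.table_size)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq) -> memory vectors: (batch, seq, dim)
        out = torch.zeros(*token_ids.shape, self.table.embedding_dim,
                          device=token_ids.device)
        for n in range(2, self.max_n + 1):
            grams = token_ids.unfold(1, n, 1)      # (batch, seq-n+1, n)
            emb = self.table(self._hash(grams))    # O(1) lookups per n-gram
            # left-pad so position t receives the n-gram that ends at t
            out = out + F.pad(emb, (0, 0, n - 1, 0))
        return out  # the host model would add this to its hidden states
```

With a table of hundreds of millions of rows, such a layer adds billions of parameters but only a few embedding reads and vector additions per token, which is the sense in which lookup forms a sparsity axis distinct from MoE's routed experts.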
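The prefetching point in Group 4 can be sketched the same way. Because the hash address of an n-gram depends only on token ids that are already known, the rows needed by an upcoming step can be staged out of CPU RAM while the GPU computes the current step. A minimal sketch, assuming PyTorch with CUDA, a host-resident table, and a pinned staging buffer (all names and sizes here are illustrative):

```python
import torch

dim = 512
table_cpu = torch.randn(2_000_000, dim)        # host-resident memory table (~4 GB)
staging = torch.empty(4096, dim).pin_memory()  # pinned buffer enables async copy
copy_stream = torch.cuda.Stream()              # separate stream overlaps transfers

def prefetch(row_ids: torch.Tensor) -> torch.Tensor:
    """Stage table rows for an upcoming step while the GPU is busy."""
    n = row_ids.numel()
    # Deterministic addressing: row_ids are computable before the step runs.
    torch.index_select(table_cpu, 0, row_ids, out=staging[:n])
    with torch.cuda.stream(copy_stream):
        rows_gpu = staging[:n].to("cuda", non_blocking=True)
    return rows_gpu  # caller synchronizes copy_stream before reading
```

The design point is that nothing here depends on the outcome of the current forward pass, so the host-to-device copy hides behind compute rather than stalling it.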
Breaking: DeepSeek Drops a New Paper Signed by Liang Wenfeng: An Early Reveal of the New V4 Architecture?
AI前线· 2026-01-12 22:41
Core Insights
- DeepSeek has released a significant technical achievement by open-sourcing a new paper and module called Engram, which introduces a "lookup-computation separation" mechanism to improve large language model performance across a range of tasks [2][5]

Summary by Sections

Introduction of Engram
- Engram is a scalable, lookup-based memory module designed to improve language model efficiency by separating memory retrieval from computation [10][18]

Need for Engram
- Traditional large language models rely on Transformer and Mixture-of-Experts (MoE) architectures, which entangle memory and computation in ways that can cause inefficiency; Engram lets models handle factual memory and logical reasoning separately [8][9]

Core Technology of Engram
- Engram uses modernized hashed N-gram embeddings, achieving O(1) time complexity for memory retrieval and cutting computational cost while keeping retrieval fast [11][13]

Relationship with MoE
- Engram provides a new axis of sparsity that complements MoE by offering static memory retrieval, improving parameter efficiency: in a 27-billion-parameter model, Engram can devote a large share of parameters to memory while consuming minimal compute during inference (a back-of-envelope comparison follows this summary) [15][16]

Performance Metrics
- Engram improves performance across benchmarks, achieving a loss of 1.950 on the Pile dataset and 60.4% accuracy on MMLU with 5-shot learning, outperforming both Dense and MoE baselines [17]

Community Reception
- The Engram work has drawn positive feedback, with users highlighting its potential to separate memory pattern retrieval from neural computation, marking a new direction in model architecture design [18][19][21]

Future Implications
- Observers speculate that Engram will be a core component of DeepSeek's upcoming V4 model, signaling a significant architectural advance in memory-reasoning collaboration [22][23]
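To see why lookup is a cheap axis of sparsity relative to MoE, consider a back-of-envelope comparison; the numbers below are chosen for illustration, not taken from the paper:

```python
dim = 4096                      # hidden size (illustrative)
memory_params = 2_000_000_000   # parameters spent on the lookup table
rows_per_token = 4              # n-gram orders fetched per token

# A lookup moves parameters without matmul FLOPs: fetching and summing a few
# dim-sized rows costs on the order of rows_per_token * dim operations.
memory_flops = rows_per_token * dim

# An MoE FFN expert with 4x expansion has ~8*dim^2 parameters, and every
# activated parameter participates in a multiply-accumulate (~2 FLOPs).
expert_params = 2 * dim * (4 * dim)
expert_flops = 2 * expert_params

print(f"memory table: {memory_params:.1e} params, {memory_flops:.1e} FLOPs/token")
print(f"one expert:   {expert_params:.1e} params, {expert_flops:.1e} FLOPs/token")
# memory table: 2.0e+09 params, 1.6e+04 FLOPs/token
# one expert:   1.3e+08 params, 2.7e+08 FLOPs/token
```

The asymmetry is the point: table capacity can grow by orders of magnitude while per-token compute stays nearly flat, whereas every activated MoE parameter costs matmul FLOPs.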
Zhipu Surges 31.40% on Strategic Cooperation Agreement with Didi
Zheng Quan Shi Bao Wang· 2026-01-12 14:33
Group 1
- The core news is that Zhipu (02513.HK) has formed a strategic partnership with Didi to explore key technologies in artificial general intelligence (AGI) and their applications in transportation [1]
- Didi has been increasing its investment in large models and intelligent agents, producing innovations such as AI travel assistants and business travel assistants [1]
- Zhipu has a strong foundation in large model architecture, training paradigms, and intelligent agent technology; the partnership aims to advance the deployment of agents in complex business scenarios [1]

Group 2
- Zhipu was founded in 2019 and focuses on developing advanced general-purpose large models, launching China's first proprietary pre-trained large model framework, GLM, in 2021 [2]
- The company has grown its cloud-based Model-as-a-Service (MaaS) and subscription businesses significantly, with over 2.9 million users on its API platform [2]
- According to Frost & Sullivan, Zhipu ranks first among independent general large model developers in China and second overall, with a 6.6% market share as of 2024 [2]

Group 3
- Zhipu's R&D spending was 84 million yuan in 2022, 529 million yuan in 2023, and 2.195 billion yuan in 2024, plus 1.595 billion yuan in the first half of 2025 [3]
- The company has a research team of 657 members, with R&D personnel making up 74% of its workforce [3]
- On January 12, Zhipu's stock surged 31.40% to close at HK$208.4 per share, for a market capitalization of HK$91.744 billion and a cumulative gain of 79.35% since its IPO [3]
Will AI Take Finance Professionals' Jobs? Industry Heavyweights Share a Consensus: Machines Will Never Learn That 1% of Inspiration and Human Warmth
第一财经· 2026-01-12 11:23
Core Viewpoint
- The article argues that artificial intelligence (AI) is not merely a "replacement" for humans in the financial industry but a "creator" and "reshaper," highlighting the irreplaceable role of human judgment, creativity, and responsibility as the technology advances [3]

Group 1: Financial Innovation and the Human Role
- Liu Xiaochun, Vice President of the Shanghai New Finance Research Institute, argues that financial innovation should focus on the essence of finance rather than on technology, and that the core role of humans in applying technology remains unchanged [4]
- He categorizes the relevant technologies into three levels: financial technology (designing financial solutions), institutional technology (establishing a reasonable distribution of benefits and risks), and scientific technology, stressing the need to balance all three for effective implementation [5]
- Despite two decades of workforce-reduction efforts in banking, total employment has continued to grow, indicating that while technology replaces certain roles, it also creates new demand for technical services [5]

Group 2: Financial Intelligence and Technological Pathways
- Yuan Yue, Chairman of Zero Point Data, outlines the evolution of financial technology into a new phase centered on "financial intelligence," moving from early automation to intelligent decision-making [6]
- He introduces a framework using the A-Z method to identify core technologies supporting risk control and service optimization, while critiquing the current hype around large language models as insufficient for high-sensitivity financial applications [6][7]

Group 3: AI's Impact on Content Creation and Financial Services
- The conference also explored the intersection of financial technology and content creation, discussing how AI's rapid development fundamentally affects various industries, particularly finance [7]
- Zhang Wenyu of Zhejiang University notes that the emergence of ChatGPT marks a shift from "weak AI" to a new era of AI that mimics human-like responses, underscoring the importance of human creativity and insight in the face of AI advances [8]
- Financial entrepreneur Zhu Guangye acknowledges that many repetitive tasks in finance will be replaced by AI, but stresses that not all foundational work can be easily substituted, particularly where human judgment and experience are required [9]

Group 4: Tools and Applications in Financial Technology
- Tian Li, COO of Yingfan Technology, discusses the launch of the first intelligent agent swarm, aimed at deepening user understanding and turning passive data into personalized insights for financial institutions [9]
- The consensus among industry experts is that financial technology provides precise tools for content creation, while content creation supports user education and brand communication for financial products, a deep integration that will drive high-quality development of the industry [9]
The Age of AI for Science: Unlocking the Trillion-Yuan Scientific R&D Market
GOLDEN SUN SECURITIES· 2026-01-12 06:59
Investment Rating
- The industry investment rating is "Overweight" [5]

Core Insights
- The era of AI for Science (AI4S) is transforming scientific research, particularly materials development, which has grown increasingly complex due to multi-objective optimization requirements. AI4S uses AI algorithms to deepen insight into molecular structure via quantum-physics calculations and integrates real-world data from high-throughput robotic laboratories, significantly shortening research cycles [1]
- The potential AI4S market in pharmaceuticals is estimated at roughly $108.2 billion, based on the preclinical research segment's 33% value share of the $1.64 trillion global pharmaceutical market (a worked reconstruction of this figure follows this summary). Assuming a 25% penetration rate across chemicals, pharmaceuticals, new energy, alloys, displays, and semiconductors, total AI4S demand could reach about $148.6 billion [2]
- Key application areas include innovative drug development, where research complexity aligns well with AI's strengths, and space photovoltaics, particularly perovskite materials that can markedly improve satellite energy efficiency [3]

Summary by Sections

AI4S Empowerment in Scientific Research
- AI4S capabilities span "reading, computing, and doing." For instance, Tai Holdings has built a patent data mining platform that can extract literature and patent data in one hour with 95% accuracy, along with over 200 AI models that increase research speed and precision [1]

Market Size and Potential
- The pharmaceutical AI4S market potential is approximately $108.2 billion, while overall demand across the six sectors could reach about $148.6 billion under the 25% penetration assumption [2]

Notable Application Areas
- Innovative drug development is a primary focus for AI4S given the high investment and complexity involved; perovskite materials for space photovoltaics are another promising area, where AI can help address stability and efficiency challenges [3][4]
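As a reading note: the $108.2 billion figure skips an intermediate step. Taking the stated inputs at face value, the implied chain appears to be the following, where the roughly 20% AI4S share in the second step is our inference to reconcile the numbers, not a figure stated in the summary:

$$\$1.64\,\text{T} \times 33\% \approx \$541.2\,\text{B} \quad \text{(preclinical research value)}$$
$$\$541.2\,\text{B} \times 20\% \approx \$108.2\,\text{B} \quad \text{(AI4S potential in pharma)}$$

The cross-sector figure works the same way in reverse: $148.6 billion at the stated 25% penetration implies a combined addressable base of about $594.4 billion across the six sectors.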