Engram
Sixteen Days After the DeepSeek Paper, a Chinese Team Has Already Built a "Biological Dictionary" for the Model
机器之心· 2026-01-31 04:10
Core Insights
- The article discusses the introduction of Gengram, a genomic module inspired by the Engram technology, which improves the efficiency of genomic models by using a memory lookup system in place of traditional methods [1][4].

Group 1: Gengram Technology Overview
- Gengram employs a hash table to store common DNA sequences (k-mers) and allows models to reference this external memory, significantly reducing computational load (see the sketch after this summary) [3][11].
- The module is lightweight, with approximately 20 million parameters, and integrates seamlessly into larger models, improving their performance without substantial additional computational cost [15][19].

Group 2: Performance Improvements
- Models using Gengram showed significant gains across tasks, including a 16.1% increase in AUC for splice-site prediction and a 22.6% increase for epigenetic prediction tasks [17].
- Gengram lets models reach high performance with minimal training data, outperforming models trained on significantly larger datasets [18].

Group 3: Mechanisms and Adaptability
- Gengram features a dynamic gating mechanism that lets the model decide, based on context, when to consult the memory, optimizing resource usage [12][13].
- The module adapts well across different model architectures, improving training efficiency and balancing expert loads in mixture-of-experts (MoE) configurations [19][21].

Group 4: Scientific Insights and Innovations
- Gengram's design allows it to infer biological principles, such as the physical structure of DNA, without prior knowledge, showcasing its potential for scientific discovery [22][25].
- The choice of a 21-base-pair window for local aggregation aligns with the physical properties of DNA, indicating a sophisticated grasp of biological structure [23][24].

Group 5: Team Background and Capabilities
- The Genos Team behind Gengram is a collaboration between Zhejiang Lab and BGI-HangzhouAI, combining expertise in AI and the life sciences [33][34].
- The Genos model, which serves as the foundation for Gengram, reportedly surpasses leading industry benchmarks, indicating a strong competitive position in genomic modeling [35].
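A minimal sketch of the k-mer memory lookup described above: frequent 21-mers are mapped into a fixed embedding table, and a learned gate decides how much of the retrieved vector to mix into the backbone's per-position hidden state. The table size, hash scheme, and gate form here are illustrative assumptions, not Gengram's actual implementation.

```python
import numpy as np

K = 21                 # window size in base pairs, as reported in the article
TABLE_SIZE = 1 << 16   # toy bucket count; a real table would be far larger
DIM = 128              # hidden width of the backbone model (assumed)
BASE = {"A": 0, "C": 1, "G": 2, "T": 3}

rng = np.random.default_rng(0)
memory_table = rng.normal(scale=0.02, size=(TABLE_SIZE, DIM))  # trainable in practice
gate_w = rng.normal(scale=0.02, size=(DIM,))                   # gate projection

def kmer_bucket(kmer: str) -> int:
    """Map a 21-mer to a table bucket via its base-4 integer code (a stand-in hash)."""
    code = 0
    for ch in kmer:
        code = code * 4 + BASE[ch]
    return code % TABLE_SIZE

def lookup_with_gate(sequence: str, hidden: np.ndarray) -> np.ndarray:
    """Add a gated memory vector at each position covered by a full k-mer window."""
    out = hidden.copy()
    for i in range(len(sequence) - K + 1):
        mem = memory_table[kmer_bucket(sequence[i:i + K])]
        gate = 1.0 / (1.0 + np.exp(-gate_w @ hidden[i]))   # sigmoid: consult memory or not
        out[i] += gate * mem
    return out

# Example: a toy DNA sequence and random backbone hidden states.
seq = "ACGT" * 16
print(lookup_with_gate(seq, rng.normal(size=(len(seq), DIM))).shape)  # (64, 128)
```

The retrieval itself is a table lookup rather than a matrix multiplication, which is why it adds little compute relative to the backbone.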
Tech Bytes – DeepSeek: Doing More With Less
2026-01-22 02:44
Summary of DeepSeek's Innovation and Investment Implications

Company and Industry Overview
- **Company**: DeepSeek, a China-based AI company
- **Industry**: Artificial intelligence (AI) and semiconductor technology

Core Insights and Arguments
1. **Innovation in AI Architecture**: DeepSeek's Engram module reduces high-bandwidth memory (HBM) constraints and infrastructure costs by decoupling storage from compute, suggesting that future AI advances may come from efficient hybrid architectures rather than simply larger models (a rough sketch follows this summary) [1][2][9]
2. **Efficiency Gains**: The Engram approach improves efficiency for large language models (LLMs) by letting them retrieve essential information without overloading HBM, potentially reducing the need for costly HBM upgrades [2][3]
3. **Performance Metrics**: DeepSeek's findings indicate that hybrid architectures can outperform traditional models, with a minimum requirement of around 200GB of system DRAM, compared with existing systems that use significantly more [3][12]
4. **Next-Generation LLM**: The upcoming DeepSeek LLM V4 is expected to build on the Engram architecture, excelling in coding and reasoning tasks, and may run efficiently on consumer-grade hardware [4][5]

Investment Implications
1. **Market Potential**: Although China's AI market is smaller than the US market, its growth momentum suggests that investment opportunities may be underestimated. The report favors Chinese memory and semiconductor-localization themes, highlighting companies such as Naura, AMEC, and JCET [5][9]
2. **Strategic Positioning**: By focusing on algorithmic efficiency rather than hardware expansion, DeepSeek shows how companies can navigate geopolitical and supply-chain constraints, potentially leading to a more cost-effective and scalable AI ecosystem in China [21][16]

Additional Important Insights
1. **Performance Comparison**: Over the past two years, Chinese AI models have significantly closed the performance gap with leading models such as ChatGPT 5.2, emphasizing efficiency-driven innovation rather than sheer parameter growth [10][16]
2. **Conditional Memory Concept**: Engram separates static memory from dynamic reasoning, optimizing GPU usage and improving long-context handling, which has been a challenge for many large models [11][24]
3. **Benchmark Performance**: Engram has shown improved benchmark results, particularly on long-context inputs, which enhances the utility of AI models [20][21]

This summary encapsulates the key points from the conference call regarding DeepSeek's innovations, their implications for the AI industry, and potential investment opportunities in China's evolving AI landscape.
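A rough sketch of the storage/compute decoupling the report describes: the large static memory table stays in ordinary system DRAM, and only the rows a batch actually references are copied into a small device-side buffer, so HBM never has to hold the full table. Sizes and names here are assumptions for illustration, not DeepSeek's implementation.

```python
import numpy as np

ROWS, DIM = 2_000_000, 64                               # toy table; a real one would be far larger
host_table = np.zeros((ROWS, DIM), dtype=np.float32)    # stand-in for the DRAM-resident table

def gather_for_batch(row_ids: np.ndarray) -> np.ndarray:
    """Copy only the referenced rows into a compact buffer (what HBM would hold)."""
    unique_ids, inverse = np.unique(row_ids, return_inverse=True)  # dedupe lookups in the batch
    device_buffer = host_table[unique_ids]                         # small host-to-device copy
    return device_buffer[inverse]                                  # re-expand to batch order

# Example: a batch touching 4,096 rows moves about 1 MB instead of the ~0.5 GB table.
batch_ids = np.random.randint(0, ROWS, size=4096)
print(gather_for_batch(batch_ids).shape)  # (4096, 64)
```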
Computer Industry Weekly: Qianwen App Integrates with Alibaba Ecosystem Services
Guoxin Securities Co., Ltd· 2026-01-21 13:25
Investment Rating
- The report gives a "Positive" rating for the computer industry, expecting the industry index to outperform the market index by more than 5% over the next six months [33].

Core Insights
- The computer industry index rose 3.82% from January 12 to January 16, outperforming the CSI 300 index by 4.39 percentage points and making it the top-performing sector [2][11].
- Key stocks that performed well include Tongda Hai (+39.73%), Haohan Deep (+30.57%), and Jiechuang Intelligent (+28.95%). Conversely, *ST Lifang fell 33.66%, followed by Aerospace Information (-14.46%) and Haixia Innovation (-13.40%) [14][15].
- Significant developments include the announced integration of the Qianwen App into Alibaba's ecosystem, enabling AI-driven services for tasks such as ordering food and booking flights [3][31].

Market Performance
- The computer industry comprises 335 listed companies, of which 234 rose, accounting for 69.85% of the sector [14].
- The report highlights the performance of individual stocks, with notable gains and losses during the period [15].

Recent Developments
- Elon Musk announced the open-sourcing of X's latest recommendation algorithm, promising updates every four weeks [3].
- Apple and Google have entered a partnership in which Google's Gemini will support Apple's AI initiatives, with Apple expected to pay around $1 billion annually for the technology [18][19].
- Meta CEO Mark Zuckerberg announced the Meta Compute initiative, aiming to build GW-level AI infrastructure over the next decade [21][22].
- The U.S. has relaxed export controls on NVIDIA's H200 chips to China, which is expected to restart shipments to Chinese customers [24].
AI and Semiconductors: TSMC Sharply Raises 2026 Capital Expenditure
Huajin Securities· 2026-01-18 05:55
Investment Rating
- The industry investment rating is "Outperform the Market" (maintained) [1][36]

Core Insights
- TSMC significantly raised its 2026 capital expenditure forecast to between $52 billion and $56 billion, up from $40.9 billion in 2025, driven by strong demand for AI hardware [3][6]
- The report highlights robust growth in AI-driven high-performance computing (HPC) platforms, which accounted for 55% of TSMC's Q4 revenue, with advanced process nodes (3nm and 5nm) contributing 63% of Q4 revenue [3][6]
- The report emphasizes the anticipated AI-driven semiconductor upcycle, recommending a focus on the entire semiconductor supply chain, including key stocks such as SMIC, Hua Hong Semiconductor, and others [3][34]

Summary by Sections

Industry Dynamics
- TSMC reported Q4 revenue of $33.73 billion, up 25.5% year on year and 1.9% quarter on quarter, with a gross margin of 62.3% [6]
- TSMC's net profit reached NT$505.74 billion, a 35% year-on-year increase, with a net profit margin of 48.3% [6]

Market Review
- The electronics industry rose 3.77% in the week of January 12 to January 16, with the computer sector leading at 3.82% [9][11]
- Among electronics sub-sectors, integrated circuit packaging and testing posted the largest gain at 14.47% [11]

High-Frequency Data Tracking
- TV panel prices are expected to rise mildly in January, driven by tight supply control from manufacturers and rising material costs [14]
- Memory prices trended upward, with DDR5 16G rising from $32.50 to $35.00 and DDR4 16Gb from $70.50 to $76.13 between January 12 and January 16 [19]
Tencent Research Institute AI Digest 20260114
腾讯研究院· 2026-01-13 16:29
Group 1
- Anthropic has launched an AI office tool called Cowork, designed to automate daily tasks such as document creation, planning, data analysis, and file organization [1]
- Cowork features proactive and autonomous capabilities, allowing it to create plans and sync progress in real time, and integrates with external information sources and Chrome [1]
- Development of Cowork took only a week and a half, with 100% of the code written by Claude Code, while keeping the user in control and able to halt operations at any time [1]

Group 2
- Apple has announced a partnership with Google to develop the next generation of its foundational model based on Gemini, which will also overhaul Siri [2]
- The Apple AI team has experienced significant talent loss, with dozens of core members leaving, making the collaboration with Google a necessary choice given Gemini's 1.2 trillion parameters versus Apple's 150 billion [2]
- Google processes 13 trillion tokens monthly, and Gemini has captured over 20% of the global market share, while Elon Musk criticized the concentration of power in this partnership [2]

Group 3
- DeepSeek has published a new paper proposing a conditional memory module called Engram, which complements MoE conditional computation and addresses the lack of native knowledge retrieval in Transformers [3]
- Engram significantly outperforms pure MoE baselines, improving MMLU by 3.4, BBH by 5.0, and HumanEval by 3.0, while raising long-context retrieval accuracy from 84.2% to 97.0% [3]
- The outline of the upcoming DeepSeek V4 is becoming clearer, with conditional memory expected to be a core modeling primitive for the next generation of sparse large models [3]

Group 4
- OpenAI has acquired AI healthcare startup Torch for approximately $100 million, with $60 million paid upfront and the remainder reserved for employee retention incentives [4]
- Torch integrates with healthcare systems such as Kaiser Permanente and Apple Health, allowing unified access to lab results, prescriptions, and medical records, while using AI for classification and health insights [4]
- The Torch founding team has joined OpenAI to develop the ChatGPT Health module, building on their previous experience with an online clinic platform [4]

Group 5
- Anthropic has launched HIPAA-compliant AI services for healthcare, enabling institutions and individuals to process protected health data while referencing authoritative databases [6]
- Claude can export personal health data from applications such as Apple Health for aggregation and understanding, with a commitment not to use any medical user data for model training [6]
- Over 22,000 clinical service providers from Banner Health are using Claude, with 85% reporting increased work efficiency, and collaborations with major healthcare institutions are underway [6]

Group 6
- Baichuan has released the open-source medical model M3, achieving a top score of 65.1 on HealthBench and winning the Hard category with a score of 44.4, surpassing GPT-5.2 [7]
- M3 introduces native end-to-end serious-inquiry capabilities, follows the SCAN principles, and demonstrates stronger inquiry abilities than the average human doctor [7]
- M3 employs a dynamic Verifier System and a new SPAR algorithm to address long-dialogue training issues, with applications already integrated for doctors and patients [7]

Group 7
- OpenAI is set to produce a special audio product called "Sweetpea," designed to replace AirPods, with mass production planned at Foxconn by Q4 2028 [8]
- The device, designed by Jony Ive's team, features a metallic, pebble-like design and includes two capsule-like units worn behind the ear, with a focus on local AI processing [8]
- The product is expected to launch in September 2026, with an estimated first-year shipment of 40-50 million units, allowing users to control functions via commands instead of an iPhone [8]

Group 8
- Meituan has introduced a new sparse attention mechanism called LoZA, replacing 50% of low-performance MLA modules with a streaming sparse attention structure (see the sketch after this list) [9]
- The new mechanism improves decoding speed for 128K contexts by 10x and preloading speed for 256K contexts by 50%, while reducing computational complexity to linear O(L·S) [9]
- LoZA can be adopted without retraining from scratch, with a design that balances local detail and overall logic within sparse windows [9]

Group 9
- MIT Technology Review has released its list of the top ten breakthrough technologies for 2026, including large-scale AI data centers, sodium-ion batteries, base editing, and advanced nuclear reactors [10][11]
- The report highlights the significant energy consumption of large-scale data centers and the successful application of sodium-ion batteries in specific vehicle models [11]
- It emphasizes the shift in AI development focus from "what can be done" to "what should be done," with ethical considerations becoming a central theme in the life sciences [11]

Group 10
- The CEO of the Fal platform revealed that generating a 5-second, 24-frame video consumes 12,000 times the compute of generating 200 tokens of text, with 4K resolution requiring ten times more [12]
- The platform supports over 600 generative media models, with top clients using an average of 14 different models simultaneously, indicating a trend toward scaling AI-generated content [12]
- The discussion suggests that as content generation becomes limitless, finite intellectual property will gain more value, with education and personalized advertising identified as promising application areas [12]
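An illustrative sketch of the streaming (sliding-window) sparse attention pattern behind the O(L·S) claim in Group 8: each query position attends only to the last S cached key/value pairs instead of all L preceding positions. Window size, shapes, and naming are assumptions for illustration, not LoZA's actual implementation.

```python
import numpy as np

def streaming_attention(q, k, v, window: int):
    """q, k, v: (L, d) arrays; position i attends only to keys in (i - window, i]."""
    L, d = q.shape
    out = np.zeros_like(v)
    for i in range(L):
        lo = max(0, i - window + 1)
        scores = q[i] @ k[lo:i + 1].T / np.sqrt(d)     # at most `window` scores per position
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        out[i] = weights @ v[lo:i + 1]                 # O(window * d) work per position
    return out

# Example: 1,024 positions with a 128-token window -> roughly L*S work instead of L^2.
rng = np.random.default_rng(0)
L, d, S = 1024, 64, 128
q, k, v = (rng.normal(size=(L, d)) for _ in range(3))
print(streaming_attention(q, k, v, S).shape)  # (1024, 64)
```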
Liang Wenfeng Co-Authors Latest DeepSeek Paper Proposing a New Method to Break Through GPU Memory Limits
Xin Lang Cai Jing· 2026-01-13 12:33
Core Viewpoint
- DeepSeek, a Chinese AI startup, has developed a new model-training technique that works around GPU memory limitations, improving cost efficiency and performance in AI model training [1][3].

Group 1: Technology and Innovation
- DeepSeek and researchers from Peking University introduced a "conditional memory" technique called "Engram" to address the limitations of high-bandwidth memory (HBM) in scaling AI models [3][4].
- The Engram technology allows more efficient retrieval of foundational information by decoupling computation from storage, improving the model's handling of long contexts [4][6].
- In a model with 27 billion parameters, the new technique improved performance on key industry benchmarks by several percentage points while preserving capacity for complex reasoning tasks [4][6].

Group 2: Competitive Landscape
- The HBM gap between China and the US is significant, with Chinese memory-chip manufacturers lagging behind their US and South Korean counterparts [4].
- DeepSeek's previous model, DeepSeek-R1, was trained in two months at a cost of $5.5 million, far below the expenses incurred by US companies such as OpenAI, while achieving comparable performance [6][7].
- Microsoft President Brad Smith noted that Chinese companies like DeepSeek are rapidly gaining ground in the global AI market, particularly in emerging markets, thanks to their low-cost open-source models [7].

Group 3: Future Developments
- Anticipation is building for DeepSeek's upcoming V4 model, expected to launch in mid-February and said to have strong programming capabilities [7].
Liang Wenfeng Co-Authors New DeepSeek Paper on "Breaking Through GPU Memory Limits"
Guan Cha Zhe Wang· 2026-01-13 12:28
Core Insights
- DeepSeek, a Chinese AI startup, has published a technical paper introducing a new model-training technique that works around GPU memory limitations, highlighting its focus on cost efficiency despite remaining gaps with leading US firms [1][2]
- The new technique, termed "Engram," addresses the bottleneck of limited high-bandwidth memory (HBM) in scaling AI models, a significant gap between China and the US in AI hardware [3][4]
- The paper has drawn attention from industry professionals in both China and the US, underscoring DeepSeek's role as a leader in AI innovation over the past year [1][2]

Technical Developments
- The paper, titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models," presents the "conditional memory" technology, which aims to improve the efficiency of AI models when processing long contexts, a major challenge for AI chatbots [2][3]
- The Engram technique decouples computation and storage, enhancing the model's ability to retrieve foundational information efficiently [3][4]
- The technology was validated on a model with 27 billion parameters, showing performance improvements on key industry benchmarks [3]

Market Position and Competition
- DeepSeek's previous model, DeepSeek-R1, was trained in two months at a cost of $5.5 million, significantly lower than competitors such as OpenAI, while achieving comparable performance [6][7]
- Microsoft President Brad Smith has noted that US AI companies are being overtaken by Chinese competitors such as DeepSeek, particularly in emerging markets, due to the low cost and ease of use of Chinese open-source models [7]
- Anticipation is building for DeepSeek's upcoming V4 model, expected to launch in mid-February and said to have strong programming capabilities [8]
DeepSeek Open-Sources Engram: How Does It Keep Inference Loss to Just 3%?
Tai Mei Ti APP· 2026-01-13 08:44
Core Insights
- DeepSeek has released a new module called Engram, which adds conditional memory to large language models, aiming to enhance efficiency and reduce computational costs [1][4]
- The company emphasizes innovation in architecture and methodology to break through the constraints of computational cost, with Engram restructuring how memory is stored at the architectural level [4][6]

Group 1: Engram Module
- Engram is designed as a differentiable, trainable component that separates the memory load from the main computation, allowing efficient retrieval of frequently occurring knowledge [4][6]
- The module uses deterministic retrieval based on N-grams and hash mapping to access vectors in a large static embedding table, significantly speeding up retrieval without complex neural computation (see the sketch after this summary) [4][6]

Group 2: Memory Functionality
- Engram incorporates a lightweight gating mechanism to judge whether the retrieved memory fits the current context, improving both memory retention and output coherence [6]
- The architecture divides the model's capabilities into three independent yet cooperating dimensions: model depth for logical reasoning, the computational sparsity represented by MoE, and the storage sparsity introduced by Engram [6][7]

Group 3: Performance and Future Developments
- Testing indicates that even with a memory bank of up to 100 billion parameters, inference throughput loss remains below 3% [7]
- DeepSeek plans to release its latest V4 model around the Chinese New Year, which is expected to significantly improve performance on complex tasks and coding, potentially surpassing competitors such as Anthropic [7]
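A minimal sketch of the deterministic N-gram lookup plus gating described in Groups 1 and 2: the trailing N token ids are hashed into a static embedding table, and a lightweight sigmoid gate decides how strongly the retrieved vector is blended into the hidden state. The hash, table size, and gate form are assumptions for illustration, not DeepSeek's released code.

```python
import numpy as np

N = 3                  # N-gram order (assumed)
TABLE_ROWS = 1 << 16   # toy table; the paper-scale memory bank is far larger
DIM = 256              # hidden width (assumed)

rng = np.random.default_rng(0)
memory_table = rng.normal(scale=0.02, size=(TABLE_ROWS, DIM)).astype(np.float32)
gate_w = rng.normal(scale=0.02, size=(DIM,)).astype(np.float32)

def ngram_row(token_ids) -> int:
    """Deterministically map an N-gram of token ids to a table row (FNV-style toy hash)."""
    h = 1469598103934665603
    for t in token_ids:
        h = ((h ^ int(t)) * 1099511628211) % (1 << 64)
    return h % TABLE_ROWS

def engram_step(context_ids, hidden: np.ndarray) -> np.ndarray:
    """Retrieve memory for the trailing N-gram and mix it in through a gate."""
    mem = memory_table[ngram_row(context_ids[-N:])]
    gate = 1.0 / (1.0 + np.exp(-float(gate_w @ hidden)))   # gate in (0, 1)
    return hidden + gate * mem                              # a lookup instead of extra matmuls

# Example: one decoding step with a toy context and hidden state.
h = rng.normal(size=DIM).astype(np.float32)
print(engram_step([17, 4052, 981], h).shape)  # (256,)
```

Because the lookup is a deterministic address computation rather than learned attention over the table, the memory bank can grow without adding per-token compute, which is consistent with the reported sub-3% throughput loss.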
DeepSeek发布梁文锋署名新论文
券商中国· 2026-01-13 06:25
Group 1
- The article discusses a new paper released by DeepSeek on December 12, titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models," co-authored with Peking University [1]
- The paper introduces the concept of conditional memory, which significantly enhances model performance in knowledge retrieval, reasoning, coding, and mathematical tasks under equal parameter and compute budgets [1]
- DeepSeek has open-sourced a related memory module called Engram as part of the advances discussed in the paper [1]
A Glimpse of the DeepSeek V4 Roadmap? Major Paper Co-Authored by Liang Wenfeng Focuses on a Conditional Memory Module for Large Models
Jin Rong Jie· 2026-01-13 04:38
Core Insights
- DeepSeek has released a significant research paper focusing on a conditional memory module for large models, indicating it will be a core modeling primitive in the next generation of sparse large models [1][4]
- The upcoming flagship model V4 is expected to be unveiled around the Spring Festival, and the new research results may outline its core research roadmap [1][4]

Summary by Sections

Research Findings
- The paper, titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models," was co-authored by DeepSeek and Peking University, with DeepSeek founder Liang Wenfeng among the authors [4]
- The core insight of the paper is that large models handle two distinct types of work: deep dynamic computation for combinatorial reasoning and static knowledge retrieval [4]
- Existing Transformer architectures lack a native knowledge-retrieval mechanism, leading to inefficient computation when simulating the retrieval process [4]

Proposed Solutions
- To address these inefficiencies, DeepSeek proposes conditional memory as a complementary axis of sparsity, implemented through a module called Engram [5]
- The team observed a "U-shaped scaling law": a mixed sparse-capacity allocation between MoE experts and Engram memory significantly outperforms pure MoE baseline models (see the illustrative sketch after this summary) [5]
- The Engram module is designed to balance neural computation (MoE) against static memory, improving efficiency and performance across general reasoning, coding, and mathematics [5]

Future Developments
- DeepSeek plans to release its next-generation flagship model V4 in February, with preliminary internal tests showing its programming capabilities surpass existing top models [6]
- The V4 model is anticipated to be a focal point for the industry, especially following the success of the V3 model released at the end of 2024, which outperformed OpenAI's GPT-5 and Google's Gemini 3.0 Pro in several benchmark tests [6]
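A hypothetical sweep illustrating the kind of experiment behind the "U-shaped scaling law" above: hold the total sparse-parameter budget fixed and vary the fraction allocated to Engram memory versus MoE experts. The budget, expert size, and row width below are assumed values, not the paper's actual configuration.

```python
# Assumed budget split; each grid point would correspond to one trained model.
TOTAL_SPARSE_PARAMS = 20_000_000_000   # fixed sparse budget (assumed)
PARAMS_PER_EXPERT = 250_000_000        # assumed parameters per MoE expert
PARAMS_PER_MEMORY_ROW = 4_096          # assumed width of one Engram table row

def make_config(memory_fraction: float) -> dict:
    """Split the fixed sparse budget between Engram memory rows and MoE experts."""
    mem_params = int(TOTAL_SPARSE_PARAMS * memory_fraction)
    return {
        "memory_fraction": memory_fraction,
        "engram_rows": mem_params // PARAMS_PER_MEMORY_ROW,
        "num_experts": (TOTAL_SPARSE_PARAMS - mem_params) // PARAMS_PER_EXPERT,
    }

# Plotting validation loss against memory_fraction and seeing a U shape means a
# mixed allocation beats both the pure-MoE (0.0) and pure-memory (1.0) extremes.
for frac in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
    print(make_config(frac))
```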