PriceSeek Alert: Yahua's lithium ore shipped back to China boosts lithium hydroxide supply
Xin Lang Cai Jing· 2026-01-14 04:09
Core Viewpoint
- Yahua Group has successfully shipped lithium ore from Zimbabwe back to China for lithium hydroxide production, pointing to a stable and growing raw-material supply that may lift production capacity and put downward pressure on lithium hydroxide prices [1][4].

Group 1: Company Developments
- Yahua Group announced on January 13 that a bulk shipment of lithium ore from Zimbabwe has been brought back for domestic processing [1][4].
- The returned ore is expected to stabilize and potentially increase lithium hydroxide output, a key input for battery manufacturing [1][4].

Group 2: Market Implications
- The larger raw-material supply is likely to raise market expectations for lithium hydroxide availability, potentially putting downward pressure on spot prices as shortages ease [2][5].
- Overall market sentiment is rated slightly bearish (-1): the substantial supply increase is expected to weigh on prices, though not to an extreme degree [2][5].
Huanfang Quantitative returned 56.6% last year, providing "super ammunition" for DeepSeek
21 Shi Ji Jing Ji Bao Dao· 2026-01-14 02:16
Core Insights
- The article highlights the strong performance of Huanfang Quantitative, which achieved an average return of 56.55% in 2025, ranking second among quantitative private equity firms in China, behind only Lingjun Investment at 73.51% [2]
- Huanfang Quantitative's assets under management exceed 70 billion yuan, and its average returns over the past three and five years are 85.15% and 114.35%, respectively [2]
- These strong returns provide substantial funding support for DeepSeek, an AI model developer founded by Liang Wenfeng [2][4]

Company Overview
- Huanfang Quantitative was established in 2015 and specializes in AI quantitative trading, investing consistently in AI algorithm research [2][4]
- The company's team combines experts in mathematics, physics, and computer science, enabling it to tackle challenges in deep learning and big data modeling [2]
- The firm has grown rapidly, surpassing 10 billion yuan in assets under management in 2019 and now exceeding 70 billion yuan [2][4]

Financial Performance
- Based on industry estimates, Huanfang Quantitative's strong performance last year could generate over 700 million USD in revenue, assuming a 1% management fee and a 20% performance fee (a back-of-envelope version of this estimate is sketched below) [6]
- Funding for DeepSeek's research comes from Huanfang Quantitative's R&D budget, with Liang Wenfeng holding a majority stake in both companies [4][5]

AI Model Development
- DeepSeek, incubated by Huanfang Quantitative, aims to advance general artificial intelligence; the training cost budget for its V3 model was 5.57 million USD [7]
- DeepSeek plans to release its next-generation AI model, DeepSeek V4, around the Lunar New Year, and it is expected to surpass existing top models in programming capabilities [7]
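For readers who want to see how the article's revenue estimate hangs together, the sketch below redoes the arithmetic using only the figures quoted above (roughly 70 billion yuan under management, a 56.55% average return, a 1% management fee, a 20% performance fee). The exchange rate and the assumption that both fees apply to the full asset base are mine, so treat the output as a rough ceiling rather than the article's exact number.

```python
# Back-of-envelope sketch of the fee-revenue estimate described above.
# Inputs are the article's round numbers; real fee income depends on which
# funds charge which fees, high-water marks, and timing, none of which are
# given here, so this is an order-of-magnitude check only.

AUM_YUAN = 70e9          # assets under management: ~70 billion yuan
AVG_RETURN = 0.5655      # average 2025 return reported for the funds
MGMT_FEE = 0.01          # 1% management fee (article's assumption)
PERF_FEE = 0.20          # 20% performance fee (article's assumption)
CNY_PER_USD = 7.2        # rough exchange rate (my assumption)

management_income = AUM_YUAN * MGMT_FEE
performance_income = AUM_YUAN * AVG_RETURN * PERF_FEE
total_yuan = management_income + performance_income

print(f"management fee income : {management_income / 1e9:.1f} bn yuan")
print(f"performance fee income: {performance_income / 1e9:.1f} bn yuan")
print(f"total (rough ceiling) : {total_yuan / 1e9:.1f} bn yuan "
      f"(~{total_yuan / CNY_PER_USD / 1e9:.1f} bn USD)")
```

Run as-is this prints roughly 8 to 9 billion yuan (over 1 billion USD), which is consistent with, though larger than, the article's "over 700 million USD"; in practice the fees are not charged on every fund at the full rate, so the article's figure sits below this ceiling.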
Huanfang Quantitative returned 56.6% last year, providing "super ammunition" for DeepSeek
21 Shi Ji Jing Ji Bao Dao· 2026-01-14 02:15
Core Insights
- The article highlights the strong returns of Huanfang Quantitative, which achieved an average return of 56.55% in 2025, ranking second among quantitative private equity firms in China, behind only Lingjun Investment at 73.51% [1]
- Huanfang Quantitative's average return over the past three years is 85.15%, and 114.35% over the past five years, providing substantial funding support for DeepSeek's large-model research [2]
- Founded in 2015 by Liang Wenfeng, Huanfang Quantitative focuses on AI quantitative trading; its assets under management currently exceed 70 billion yuan, keeping it among the leaders in China's private quantitative investment sector [2][3]

Company Overview
- Huanfang Quantitative's team includes award-winning mathematicians, physicists, and AI experts, using interdisciplinary collaboration to tackle challenges in deep learning, big data modeling, and quantitative analysis [2]
- Liang Wenfeng's team has used machine learning for fully automated quantitative trading since 2008, and the firm has expanded rapidly since its founding [2]
- Significant investments were made in AI training platforms, with the "Firefly No. 1" cluster built in 2019 and "Firefly No. 2" in 2021, leading to the establishment of DeepSeek in July 2023 [3]

Financial Performance
- Liang Wenfeng holds a majority stake in Huanfang Quantitative, which has stopped taking in external capital, indicating a strong accumulation of its own capital to support large-model research [4]
- Huanfang Quantitative's strong performance is estimated to have generated over 700 million USD in revenue last year, assuming a 1% management fee and a 20% performance fee [4]

DeepSeek Developments
- DeepSeek's V3 model had a total training cost budget of 5.57 million USD, while competitors such as Zhipu and MiniMax have reported far larger R&D expenditures [5]
- DeepSeek plans to release its next-generation AI model, DeepSeek V4, around the Lunar New Year, and it is expected to surpass current leading models in programming capabilities [5]
DeepSeek paper reveals a new model mechanism; demand for SSDs and other storage may take another step up, while a sector leader also posted blowout results
Xuan Gu Bao· 2026-01-13 23:24
Group 1
- DeepSeek released a new paper proposing "conditional memory" as a new axis of sparsity, using an Engram module to optimize large language models [1]
- The existing Transformer architecture lacks a native knowledge-retrieval mechanism, so it simulates retrieval behavior inefficiently [1]
- Conditional memory complements the MoE (Mixture of Experts) approach and, at equal parameter and compute budgets, significantly improves model performance on knowledge retrieval, reasoning, coding, and mathematical tasks [1]

Group 2
- The Engram module is a large, scalable embedding table that acts as external memory for the Transformer, allowing relevant content to be looked up efficiently [2]
- Engram caches frequently accessed embeddings in faster storage tiers while keeping less frequently accessed entries in larger, slower storage, maintaining low access latency (a minimal caching sketch follows this summary) [2]
- The NAND industry is expected to see limited capital expenditure over the next two years, with leading manufacturers likely to prioritize HBM over NAND, while AI applications are anticipated to drive SSD demand [2]

Group 3
- Baiwei Storage forecasts a net profit of 850 million to 1 billion yuan for the year, representing year-on-year growth of 427.19% to 520.22% [2]
- Jiangbolong has launched several high-speed enterprise-grade eSSD products, covering mainstream capacities from 480GB to 7.68TB [3]
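To make the hot/cold caching idea above concrete, here is a minimal Python sketch of a two-tier embedding store: a small LRU cache stands in for the fast tier (HBM) and a large in-memory table stands in for the slower, cheaper tier (host DRAM or SSD). The class name, sizes, and eviction policy are illustrative assumptions, not details from the DeepSeek paper.

```python
# Minimal sketch of a two-tier embedding store of the kind described above:
# a small "fast" cache (standing in for GPU HBM) in front of a large "slow"
# table (standing in for host DRAM or SSD). Names and sizes are illustrative.
from collections import OrderedDict
import numpy as np

class TieredEmbeddingStore:
    def __init__(self, vocab_size: int, dim: int, hot_capacity: int):
        rng = np.random.default_rng(0)
        # Large, rarely updated table kept in the slow/cheap tier.
        self.cold = rng.standard_normal((vocab_size, dim), dtype=np.float32)
        self.hot = OrderedDict()          # LRU cache of frequently used rows
        self.hot_capacity = hot_capacity

    def lookup(self, idx: int) -> np.ndarray:
        if idx in self.hot:               # fast path: already cached
            self.hot.move_to_end(idx)
            return self.hot[idx]
        vec = self.cold[idx]              # slow path: fetch from the big table
        self.hot[idx] = vec               # promote to the hot tier
        if len(self.hot) > self.hot_capacity:
            self.hot.popitem(last=False)  # evict the least recently used entry
        return vec

store = TieredEmbeddingStore(vocab_size=1_000_000, dim=64, hot_capacity=10_000)
vec = store.lookup(42)                    # first access pays the slow-tier cost
vec = store.lookup(42)                    # repeat access is served from the cache
```

The point of the design, as the summary describes it, is that access latency stays low for the embeddings that actually recur, while the bulk of the table can live in cheap, large storage such as SSD.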
DeepSeek's parent company took in 5 billion yuan last year, enough to bankroll about 2,380 R1 training runs
36 Ke· 2026-01-13 13:02
Core Insights
- DeepSeek has not engaged in new financing or significant commercialization despite the buzz around large-model players in the market [1]
- DeepSeek continues to produce high-quality research papers, indicating a steady stream of research output [2]
- The financial success of its parent company, Huanfang Quantitative, which took in roughly 5 billion yuan (about 700 million USD) last year, provides substantial funding for DeepSeek's research [6][8]

Group 1: Financial Performance
- Huanfang Quantitative's funds are posting impressive returns, with nearly all of them projected to yield over 55% in 2025 [3]
- The average return for quantitative funds in China last year was 30.5%, significantly outperforming global competitors [4]
- Huanfang Quantitative manages more than 70 billion yuan in assets, contributing to its substantial earnings [7]

Group 2: Research and Development
- DeepSeek's research spending is comparatively low, with the latest V3 training run costing about 5.576 million USD and R1 about 294,000 USD, so the available funds could pay for a large number of such models (a back-of-envelope count follows this summary) [6]
- DeepSeek has maintained a focus on AGI research without pressure for immediate financial returns, as it has not accepted external funding and is not tied to any major tech company [11][15]
- The company has consistently released significant research outputs, including recent advances in OCR and V3.2, while also open-sourcing components such as the memory module [9][10]

Group 3: Market Position and Strategy
- DeepSeek operates with a business model that lets it focus solely on AGI without the distraction of monetization pressure [10][12]
- The company benefits from a stable and committed research team, with minimal turnover and even some returning members, indicating a strong internal culture [28][30]
- DeepSeek's research outputs have become valuable to investors, as its technical papers provide insights that move the share prices of related hardware companies [34][39]

Group 4: Competitive Landscape
- Compared with other major players such as OpenAI, DeepSeek's approach is characterized by the absence of aggressive monetization, focusing instead on pure research [26][9]
- The ability to cross-subsidize AI research from a mature existing business is often underestimated by the market [19][20]
- DeepSeek's model combines the strengths of established companies and pure AI startups, positioning it uniquely in the competitive landscape [26]
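The headline's "2,380 R1s" claim is just a division, reproduced below with the figures cited in this article (about 5 billion yuan of income, roughly 294,000 USD per R1 run, 5.576 million USD per V3 run). The exchange rate is an assumption, and per-run training cost excludes researchers, failed experiments, and infrastructure, so this is a back-of-envelope count only.

```python
# Rough count of how many R1-scale training runs last year's income could
# cover, using the figures quoted above. Purely illustrative arithmetic.
PARENT_INCOME_YUAN = 5e9          # ~5 billion yuan taken in last year
CNY_PER_USD = 7.2                 # assumed exchange rate
R1_TRAINING_COST_USD = 294_000    # reported R1 training-run cost
V3_TRAINING_COST_USD = 5_576_000  # reported V3 training-run cost

income_usd = PARENT_INCOME_YUAN / CNY_PER_USD
print(f"income: ~{income_usd / 1e6:.0f} million USD")
print(f"R1-scale runs it could fund: ~{income_usd / R1_TRAINING_COST_USD:,.0f}")
print(f"V3-scale runs it could fund: ~{income_usd / V3_TRAINING_COST_USD:,.0f}")
```

With these inputs the count lands around 2,300 to 2,400 R1-scale runs (and roughly 125 V3-scale runs), in line with the figure in the headline.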
Liang Wenfeng co-authors DeepSeek's latest paper, proposing a new method to break through GPU memory limits
Xin Lang Cai Jing· 2026-01-13 12:33
Core Viewpoint
- DeepSeek, a Chinese AI startup, has developed a new model-training technique that works around GPU memory limitations, improving cost efficiency and performance in AI model training [1][3]

Group 1: Technology and Innovation
- DeepSeek and researchers from Peking University introduced a "conditional memory" technique called Engram to address the limits that high-bandwidth memory (HBM) places on scaling AI models (a rough memory-footprint sketch follows this summary) [3][4]
- The Engram technique decouples computation from storage, letting the model retrieve foundational information more efficiently and improving its handling of long contexts [4][6]
- In a model with 27 billion parameters, the new technique lifted performance on key industry benchmarks by several percentage points while preserving capacity for complex reasoning tasks [4][6]

Group 2: Competitive Landscape
- The HBM gap between China and the US is significant, with Chinese memory-chip manufacturers lagging behind their US and South Korean counterparts [4]
- DeepSeek's previous model, DeepSeek-R1, was trained in two months at a cost of $5.5 million, far less than US companies such as OpenAI spend, while achieving comparable performance [6][7]
- Microsoft President Brad Smith has noted that Chinese companies like DeepSeek are rapidly gaining ground in the global AI market, particularly in emerging markets, thanks to their low-cost open-source models [7]

Group 3: Future Developments
- Anticipation is building for DeepSeek's upcoming V4 model, expected to launch in mid-February and said to have strong programming capabilities [7]
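A quick footprint calculation shows why the compute/storage decoupling matters: a memory table in the hundred-billion-parameter range (a figure mentioned in other coverage of the same paper) is far larger than the HBM on a single accelerator, so it has to live in host memory or on SSD and be paged in on demand. The embedding width, precision, and 80 GB HBM figure below are illustrative assumptions, not numbers from the paper.

```python
# Back-of-envelope footprint check motivating the compute/storage split
# described above: a very large embedding ("memory") table quickly exceeds
# the HBM on a single accelerator. All sizes here are assumptions.
def table_size_gb(num_entries: int, dim: int, bytes_per_value: int = 2) -> float:
    """Size of an embedding table in GB (bytes_per_value=2 for fp16/bf16)."""
    return num_entries * dim * bytes_per_value / 1e9

HBM_PER_GPU_GB = 80                  # typical high-end accelerator HBM
memory_params = 100e9                # a 100-billion-parameter memory bank
dim = 1024                           # assumed embedding width
entries = int(memory_params / dim)   # number of table rows implied above

size = table_size_gb(entries, dim)
print(f"table size: ~{size:.0f} GB vs {HBM_PER_GPU_GB} GB of HBM per GPU")
print("=> the table must sit in host DRAM/SSD, with only hot rows cached on-GPU")
```

Under these assumptions the table alone is roughly 200 GB, several times a single GPU's HBM, which is the gap the conditional-memory design is meant to bridge.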
Liang Wenfeng co-authors new DeepSeek paper on "breaking through GPU memory limits"
Guan Cha Zhe Wang· 2026-01-13 12:28
Core Insights
- DeepSeek, a Chinese AI startup, has published a technical paper introducing a new model-training technique that works around GPU memory limitations, underscoring its focus on cost efficiency despite remaining gaps with leading US firms [1][2]
- The new technique, termed Engram, addresses the bottleneck of limited high-bandwidth memory (HBM) when scaling AI models, a significant gap between China and the US in AI hardware [3][4]
- The paper has drawn attention from industry professionals in both China and the US, reflecting DeepSeek's role as a leader in AI innovation over the past year [1][2]

Technical Developments
- The paper, titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models", presents "conditional memory" technology aimed at improving the efficiency of AI models when processing long contexts, a major challenge for AI chatbots [2][3]
- The Engram technique decouples computation from storage, enhancing the model's ability to retrieve foundational information efficiently [3][4]
- The technology was validated on a model with 27 billion parameters, showing performance improvements on key industry benchmarks [3]

Market Position and Competition
- DeepSeek's previous model, DeepSeek-R1, was trained in two months at a cost of $5.5 million, far less than competitors like OpenAI spend, while achieving comparable performance [6][7]
- Microsoft President Brad Smith has said that US AI companies are being overtaken by Chinese competitors like DeepSeek, particularly in emerging markets, because Chinese open-source models are low-cost and easy to use [7]
- Anticipation is building for DeepSeek's upcoming V4 model, expected to launch in mid-February and said to have strong programming capabilities [8]
DeepSeek open-sources Engram: how does it keep the inference penalty to just 3%?
Tai Mei Ti APP· 2026-01-13 08:44
Core Insights
- DeepSeek has released a new module called Engram, which provides conditional memory for large language models with the aim of improving efficiency and reducing computational cost [1][4]
- The company emphasizes innovation in architecture and methodology to break through the constraints of computational cost, with Engram restructuring how memory is stored at the architectural level [4][6]

Group 1: Engram Module
- Engram is designed as a differentiable, trainable component that separates the memory load from the main computation, allowing frequently occurring knowledge to be retrieved efficiently [4][6]
- The module uses deterministic retrieval based on N-grams and hash mapping to fetch vectors from a large static embedding table, which is far faster than running complex neural computation (a minimal sketch of this lookup-and-gate pattern follows this summary) [4][6]

Group 2: Memory Functionality
- Engram includes a lightweight gating mechanism that decides how appropriate the retrieved memory is for the current context, improving both memory retention and output coherence [6]
- The architecture divides the model's capabilities into three independent but cooperating dimensions: model depth for logical reasoning, the computational sparsity represented by MoE, and the storage sparsity introduced by Engram [6][7]

Group 3: Performance and Future Developments
- Testing indicates that even with a memory bank of up to 100 billion parameters, the loss in inference throughput stays below 3% [7]
- DeepSeek plans to release its latest V4 model around the Chinese New Year, and it is expected to significantly improve performance on complex tasks and coding, potentially surpassing competitors such as Anthropic [7]
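The lookup-and-gate pattern described above can be sketched in a few lines: hash the trailing N-gram of token ids into a fixed-size table (a deterministic, O(1) lookup with no neural search), then let a small learned gate decide how much of the retrieved vector to blend into the hidden state. The table size, N-gram order, and gate form here are illustrative choices, not the paper's exact design.

```python
# Minimal sketch of the N-gram hash lookup plus lightweight gate described
# above. Everything here is an illustrative stand-in for the real module.
import numpy as np

rng = np.random.default_rng(0)
TABLE_SIZE, DIM, N = 1_000_003, 64, 2          # prime-sized table, bigram keys
memory_table = rng.standard_normal((TABLE_SIZE, DIM)).astype(np.float32)
gate_w = (rng.standard_normal(2 * DIM) * 0.01).astype(np.float32)

def ngram_slot(token_ids: tuple) -> int:
    """Deterministically hash an N-gram of token ids to a table row."""
    h = 0
    for t in token_ids:
        h = (h * 1_000_003 + int(t)) % TABLE_SIZE
    return h

def engram_like_step(hidden: np.ndarray, last_tokens: tuple) -> np.ndarray:
    retrieved = memory_table[ngram_slot(last_tokens)]       # O(1) table lookup
    gate_in = np.concatenate([hidden, retrieved])
    g = 1.0 / (1.0 + np.exp(-gate_in @ gate_w))             # scalar gate in (0, 1)
    return hidden + g * retrieved                           # gated residual mix

hidden_state = rng.standard_normal(DIM).astype(np.float32)
out = engram_like_step(hidden_state, last_tokens=(1542, 907))  # bigram of token ids
```

The retrieval itself involves no matrix multiplies over the table, which is why, per the article, the memory bank can grow very large while the throughput penalty stays small; the gate is what keeps an irrelevant lookup from polluting the hidden state.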
DeepSeek open-sources a large-model memory module; new paper co-authored by Liang Wenfeng previews the next generation of sparse models
36 Ke· 2026-01-13 07:14
Core Insights
- DeepSeek has introduced a new paradigm called "conditional memory" to give the Transformer the knowledge-retrieval capability it previously lacked [1][4][31]
- The Engram module significantly improves model efficiency, letting simpler tasks be completed with fewer layers and freeing up resources for more complex reasoning [4][21]

Group 1: Conditional Memory and Engram Module
- The paper presents conditional memory as an essential modeling primitive for the next generation of sparse models [1][4]
- Engram lets the model handle in one or two layers lookups that previously required about six layers of attention, optimizing how resources are allocated [4][21]
- The Engram design uses a very large vocabulary-style table for static knowledge retrieval, allowing information to be fetched in O(1) time [4][6]

Group 2: Performance and Efficiency
- The optimal split of parameters between MoE (Mixture of Experts) experts and Engram memory was found to be roughly 20% to 25% for memory, which reduced model validation loss [17][21]
- In experiments, the Engram-27B model outperformed the MoE-27B baseline across knowledge-intensive tasks, with notable gains in general reasoning and code/math [21][22]
- The Engram-40B model further increased memory parameters and showed continued improvement, indicating that memory capacity had not yet saturated [25][31]

Group 3: Hardware Optimization
- The Engram module allows its large parameter tables to be offloaded to CPU memory while keeping inference delay low and throughput high (a minimal offload-and-prefetch sketch follows this summary) [29][30]
- The "hardware-aware efficiency" design principle decouples storage from computation, making it practical to use massive parameter tables without significant performance cost [31]
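The offload-and-prefetch idea in Group 3 can be illustrated with a small host-side simulation: the large table stays in CPU memory, and the rows needed for the next step are fetched on a background thread while the current step computes, so the transfer cost is hidden. In a real deployment the copy would run host-to-GPU on a separate stream; the thread-plus-NumPy version below is only a stand-in and not DeepSeek's implementation.

```python
# Minimal sketch of offloading a large table to host memory and prefetching
# the next step's rows while the current step computes. Illustrative only.
from concurrent.futures import ThreadPoolExecutor
import numpy as np

rng = np.random.default_rng(0)
host_table = rng.standard_normal((1_000_000, 64)).astype(np.float32)  # "offloaded" table

def fetch_rows(indices: np.ndarray) -> np.ndarray:
    """Stand-in for a host-to-device copy of the requested table rows."""
    return host_table[indices].copy()

def compute_step(step: int, rows: np.ndarray) -> float:
    """Stand-in for the model's per-step compute using the prefetched rows."""
    return float(rows.sum())

steps = [rng.integers(0, host_table.shape[0], size=256) for _ in range(8)]

with ThreadPoolExecutor(max_workers=1) as pool:
    pending = pool.submit(fetch_rows, steps[0])              # prefetch for step 0
    for i in range(len(steps)):
        rows = pending.result()                              # blocks only if fetch is slow
        if i + 1 < len(steps):
            pending = pool.submit(fetch_rows, steps[i + 1])  # overlap the next fetch
        compute_step(i, rows)                                # compute overlaps the fetch
```

Because the lookup indices for an N-gram-style memory are known as soon as the tokens are, the fetch for step i+1 can be issued before step i finishes, which is what keeps the reported throughput loss small even with the table living off-GPU.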
New DeepSeek paper co-authored by Liang Wenfeng takes aim at large models' "memory" weakness
Bei Ke Cai Jing· 2026-01-13 04:41
Core Insights
- The paper published by DeepSeek addresses the memory limitations of current large language models and introduces the concept of "conditional memory" [2]
- DeepSeek proposes a module named Engram, which splits language modeling into two branches: "static pattern retrieval" for quick access to deterministic knowledge, and "dynamic combinatorial reasoning" for complex logical operations [2]
- The paper argues that conditional memory is an essential modeling primitive for the next generation of sparse models, and there is speculation that DeepSeek's next model may be released before the Spring Festival [3]

Group 1
- The paper, titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models", was co-authored by Peking University and DeepSeek [1]
- The introduction of "conditional memory" aims to strengthen the memory capabilities of large language models [2]
- The Engram module improves the efficiency of language modeling by separating tasks into static and dynamic components [2]

Group 2
- The paper emphasizes the importance of conditional memory for future sparse-model development [3]
- There is speculation that DeepSeek's next-generation model may be released around the Spring Festival, potentially replicating the success of previous launches [3]