This Artificial Intelligence Stock Could Be the Biggest Bargain Buy of 2026
Yahoo Finance· 2026-01-01 14:04
Core Viewpoint
- The AI sector continues to show strong performance, with significant returns for investors, highlighted by the 30% gain of the Global X Artificial Intelligence & Technology ETF in 2025 [1]

Group 1: Market Performance and Trends
- Despite early challenges in 2025, including trade wars and concerns over AI infrastructure spending, the AI sector performed well [2]
- Major AI stocks such as Nvidia, Palantir, Broadcom, and Snowflake are trading at high sales and earnings multiples, indicating a potentially overheated market [3]

Group 2: Micron Technology's Valuation and Growth Potential
- Micron Technology is identified as a standout investment opportunity, trading at a trailing earnings multiple of 27 despite a 57% year-over-year revenue increase and a 167% rise in non-GAAP earnings [5]
- The company expects a 132% year-over-year revenue increase in the current quarter, projecting $18.7 billion in revenue and a more than fivefold increase in adjusted earnings [5]
- Consensus estimates suggest Micron's earnings could nearly quadruple in the next fiscal year to $32.14 per share, implying a forward earnings multiple of just 9, far below the Nasdaq-100 average of 26 [6]

Group 3: Market Dynamics and Future Outlook
- The memory chip market is booming, driven by demand that exceeds supply, particularly for the high-bandwidth memory used in AI applications [8]
- The shortage has pushed memory chip prices higher, benefiting Micron Technology as it capitalizes on the favorable market dynamics of AI infrastructure buildout [9]
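The valuation claim above is simple arithmetic to verify. A minimal sketch using only the figures cited in the article (the consensus EPS, forward multiple, and Nasdaq-100 average come from the article; the derived share price is illustrative):

```python
# Illustrative check of the cited forward multiple: with consensus EPS
# of $32.14 and a forward P/E of about 9, the implied share price is
# roughly $289, and the multiple sits far below the index average.
eps_next_year = 32.14           # consensus EPS cited in the article
forward_pe = 9                  # forward multiple cited in the article
implied_price = eps_next_year * forward_pe
print(round(implied_price, 2))  # 289.26

nasdaq_avg_pe = 26              # Nasdaq-100 average cited in the article
discount = 1 - forward_pe / nasdaq_avg_pe
print(f"{discount:.0%}")        # roughly 65% below the index average multiple
```

The point of the exercise: at a forward P/E of 9, Micron would trade at about a third of the index's average multiple, which is the article's core bargain argument.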
DeepSeek Kicks Off the New Year with a Bang! Paper Co-Authored by Liang Wenfeng Released
Di Yi Cai Jing· 2026-01-01 13:44
Core Viewpoint
- DeepSeek has introduced a new network architecture called mHC (Manifold-Constrained Hyper-Connections) aimed at addressing instability issues in large-scale model training, potentially guiding the evolution of next-generation infrastructure [1][3][4]

Group 1: Technical Innovations
- The mHC architecture improves on traditional hyper-connection frameworks by balancing performance and efficiency, akin to adding "traffic rules" to information channels, ensuring stable information flow during model training [4]
- The research highlights that mHC can enhance the stability and scalability of large models, making it easier to deploy in complex scenarios such as multimodal models and industrial decision-making systems [5]

Group 2: Industry Implications
- mHC may reduce the hardware investment and training time needed to develop larger foundational models, lowering the barrier for small and medium AI enterprises to build more complex models [5]
- The innovation is seen as a fundamental advance on core issues within the Transformer architecture, with expectations of significant updates in DeepSeek's upcoming V4 version [5]

Group 3: Recent Developments
- Despite not launching major versions like R2 or V4 in 2025, DeepSeek has continued to innovate, releasing DeepSeek-V3.2 and DeepSeek-Math-V2, the latter being the first math model to reach international Olympiad gold-medal standard [6]
DeepSeek Proposes All-New mHC Architecture; Anker Innovation Responds to "30% Layoff" Rumors; Tesla's HarmonyOS App Opens for Early Access...
Sou Hu Cai Jing· 2026-01-01 13:18
Group 1
- DeepSeek has released a new paper proposing a novel mHC architecture, with CEO Liang Wenfeng listed as one of the authors [1]
- Anker Innovation has responded to rumors of a 30% layoff, stating that the reported figure is significantly exaggerated and that the adjustments are part of a strategic upgrade [2]
- Tesla has launched a HarmonyOS version of its app in the Huawei app store, supporting features such as remote vehicle control and a mobile key [3]

Group 2
- Xiaomi has announced a limited-time offer for the YU7 model, allowing customers to choose between a tax subsidy and a three-year interest-free financing option for orders placed before December [5]
- The Redmi Note 15 series has been officially launched, starting at 999 yuan, with various color options available [6]
- Huawei has released the Smart Screen V6, priced from 7,999 to 14,999 yuan, and is offering a limited-time discount on its high-end ADS feature package [7]

Group 3
- Apple has updated its "vintage products" list to include the iPhone 11 Pro and the last Intel-based MacBook Air [8]
- Seres Group announced that AITO deliveries exceeded 57,000 units in December, a new monthly record, with total deliveries surpassing 420,000 units for the year [9]
- Li Auto plans to focus on adjusting its models in the 300,000-400,000 yuan price range while continuing to iterate on its pure-electric i8 series [10]

Group 4
- Li Auto has surpassed 1.5 million cumulative vehicle deliveries, the first of China's new-force brands to reach that milestone [12]
- Huawei's Enjoy series met its annual challenge goals for 2025, with plans to introduce more diverse products in 2026 [13]
- TrendForce reports that Samsung is strictly executing its production halt plan, which may drive a significant increase in DDR4 memory prices in 2026 [14]
Just In: DeepSeek Drops a Bombshell, Co-Signed by Liang Wenfeng! Aggressively Optimizing AI Architecture
程序员的那些事· 2026-01-01 13:15
Core Insights
- DeepSeek introduced a new architecture called Manifold-Constrained Hyper-Connections (mHC), which improves performance with only a 6.7% increase in training time on a 27-billion-parameter model [3][36]
- The mHC architecture optimizes the residual connection space by projecting connection matrices onto constrained manifolds, ensuring stability while significantly expanding the residual stream width without substantial computational cost [8][25]

Group 1: Performance Improvements
- In system-level benchmark tests, the mHC architecture consistently outperformed baseline models and Hyper-Connections (HC) across a range of tasks, demonstrating its effectiveness in large-scale pre-training [22][51]
- On specific metrics, mHC achieved a 2.1% improvement on the BBH benchmark and a 2.3% improvement on the DROP benchmark compared to HC [52][54]

Group 2: Technical Details
- The core idea of mHC is to restore the identity mapping property under the hyper-connection topology, giving the approach practical value in large-scale training and real-world foundation model tasks [25]
- mHC imposes a doubly stochastic matrix constraint to maintain stability while enhancing interaction between residual streams, which is crucial for realizing the potential of multi-stream architectures [26][27]

Group 3: Engineering Optimizations
- The implementation of mHC involved several engineering optimizations, including reordering operations for efficiency and using mixed-precision strategies to maximize numerical accuracy without sacrificing computational speed [38][42]
- The DualPipe scheduling strategy was extended to overlap communication and computation effectively, addressing the significant communication delays introduced by the n-stream residual structure [46][48]
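The doubly stochastic constraint mentioned above can be sketched with the classic Sinkhorn-Knopp iteration, which alternately normalizes rows and columns of a positive matrix until both sum to 1. A minimal numpy sketch (the function name, matrix size, and iteration count are illustrative, not taken from the paper):

```python
import numpy as np

def sinkhorn_knopp(M, iters=50, eps=1e-8):
    """Drive a non-negative matrix toward the doubly stochastic
    manifold by alternately normalizing its rows and columns."""
    M = np.maximum(M, eps)                      # keep all entries positive
    for _ in range(iters):
        M = M / M.sum(axis=1, keepdims=True)    # rows sum to 1
        M = M / M.sum(axis=0, keepdims=True)    # columns sum to 1
    return M

# A random 4x4 mixing matrix, e.g. for 4 parallel residual streams
rng = np.random.default_rng(0)
H = sinkhorn_knopp(np.abs(rng.normal(size=(4, 4))))
print(H.sum(axis=0))   # each column sums to ~1
print(H.sum(axis=1))   # each row sums to ~1
```

Because rows and columns of the result both sum to 1, mixing residual streams with such a matrix redistributes signal between streams without changing the total, which is one intuition for why the constraint stabilizes training.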
AI Evolution Express | DeepSeek Proposes New mHC Architecture
Di Yi Cai Jing· 2026-01-01 13:05
Core Insights
- DeepSeek has released a new paper proposing the mHC (Manifold-Constrained Hyper-Connections) architecture [1]

Group 1
- Zhiyuan has launched an integrated embodied "large brain" system called GenieReasoner [1]
- Moonshot AI has released a new multimodal model at the start of the new year [1]
- DeepSeek's new paper focuses on the mHC architecture, which aims to enhance hyper-connection capabilities [1]
DeepSeek's Latest Release!
券商中国· 2026-01-01 12:40
Core Viewpoint
- DeepSeek has introduced a new architecture called mHC (Manifold-Constrained Hyper-Connections) to address the instability of traditional hyper-connections during large-scale model training while preserving their significant performance gains [1][3]

Summary by Sections

Research and Development
- The paper notes that recent advances in hyper-connections (HC) have broadened the residual stream width and diversified connection patterns, extending the residual connection paradigm that has dominated for the past decade. However, these improvements weaken the inherent identity mapping property of residual connections, leading to severe training instability and limited scalability, along with significant memory access overhead [3]
- To tackle these challenges, DeepSeek proposed the mHC framework, which projects the HC residual connection space onto a specific manifold, restoring the identity mapping property, and integrates strict infrastructure optimizations to ensure operational efficiency [3]

Experimental Results
- Internal large-scale training results indicate that mHC effectively supports scalable training, with an additional time overhead of only 6.7% at an expansion rate of 4 [4]

Conclusion and Future Directions
- The paper concludes that empirical results demonstrate mHC's ability to restore the identity mapping property, achieving stable large-scale training with superior scalability compared to traditional HC. Importantly, mHC delivers these improvements with negligible computational overhead through efficient infrastructure-level optimizations [6]
- As a generalized extension of the HC paradigm, mHC opens up several important research directions. While this study used a doubly stochastic matrix constraint to ensure stability, the framework is compatible with a variety of manifold constraints designed for specific learning objectives. In-depth research on differentiated geometric constraints may yield new methods that better balance plasticity and stability [6]
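The identity-mapping point is easy to see numerically: a doubly stochastic mixing matrix cannot amplify the residual signal as layers stack, whereas an unconstrained hyper-connection matrix can. A toy illustration (both matrices are invented for demonstration and are not from the paper):

```python
import numpy as np

# Doubly stochastic: rows and columns each sum to 1, so the all-ones
# vector is a fixed point and repeated mixing cannot blow up.
H_ds = np.array([[0.6, 0.4],
                 [0.4, 0.6]])
# Unconstrained, as in plain HC: largest eigenvalue is 1.5 here,
# so repeated application amplifies the signal exponentially.
H_free = np.array([[1.3, 0.4],
                   [0.2, 1.1]])

x = np.ones(2)
for _ in range(40):          # 40 stacked layers' worth of stream mixing
    x = H_ds @ x
print(x)                     # stays at [1. 1.]

y = np.ones(2)
for _ in range(40):
    y = H_free @ y
print(np.linalg.norm(y))     # explodes to ~1e7
```

This mirrors the instability argument in the summary: once the mixing matrix drifts off the constrained manifold, depth turns small per-layer amplification into divergence.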
Today's Top 10 Financial News | January 1, 2026
Xin Lang Cai Jing· 2026-01-01 12:33
Group 1
- DeepSeek released a new paper on New Year's Day proposing a new architecture called mHC (Manifold-Constrained Hyper-Connections), aimed at addressing instability issues in traditional hyper-connections during large-scale model training while maintaining significant performance gains [1]
- The paper's first authors are Zhenda Xie, Yixuan Wei, and Huanqi Cao, with DeepSeek founder and CEO Liang Wenfeng also listed as an author [1]

Group 2
- The EU's Carbon Border Adjustment Mechanism (CBAM) officially takes effect on January 1, 2026, with the EU recently releasing legislative proposals and implementation details [2]
- China has expressed concern over the EU's high default carbon emission intensity values for Chinese products, which it deems unfair and discriminatory; the EU plans to raise these default values further over the next three years [2]
- The EU plans to expand the CBAM's scope to roughly 180 steel- and aluminum-intensive downstream products by 2028, a move China views as unilateral and protectionist [2]

Group 3
- Multiple electric vehicle manufacturers have reported delivery data for December 2025 and the full year: Li Auto delivered 44,246 vehicles in December and 1,540,215 vehicles cumulatively since inception [6][16]
- NIO delivered 48,135 vehicles in December, a 54.6% year-on-year increase, and 326,028 vehicles for the year, up 46.9% [6][16]
- Xpeng Motors delivered 37,508 vehicles in December and 429,445 vehicles for the year, a 126% year-on-year increase [6][16]

Group 4
- Warren Buffett officially retired as CEO of Berkshire Hathaway on December 31, 2025 [7][18]
DeepSeek Opens the Year with a New Paper: All-New mHC Architecture Proposed, Liang Wenfeng Appears on the Author List
Xin Lang Cai Jing· 2026-01-01 12:24
Core Insights
- DeepSeek has introduced a new architecture called mHC (Manifold-Constrained Hyper-Connections) aimed at addressing instability issues in traditional hyper-connections during large-scale model training while maintaining significant performance gains [1][6]

Group 1: Research and Development
- The paper presents mHC as a general framework that projects the residual connection space of hyper-connections onto a specific manifold to restore the identity mapping property [6]
- The paper's authors include Zhenda Xie, Yixuan Wei, Huanqi Cao, and Liang Wenfeng, the founder and CEO of DeepSeek [1]

Group 2: Performance and Scalability
- Empirical experiments indicate that mHC is effective for large-scale training, delivering tangible performance improvements and excellent scalability [6]
- The architecture is expected to deepen understanding of topological architecture design and offer promising directions for the evolution of foundation models [6]
"A Golden Age for Stocks"! Global Markets Post Double-Digit Gains for the Third Straight Year
华尔街见闻· 2026-01-01 12:20
Core Viewpoint
- The global stock market achieved double-digit growth for the third consecutive year in 2025, despite uncertainty from Trump's trade policies and concerns over an AI-sector bubble. The MSCI global index rose over 20% for the year, outperforming most analysts' expectations [1]

Group 1: US Market Performance
- The US market began the year with a significant downturn: DeepSeek's release of a large language model shocked Silicon Valley and dragged down tech stocks, and Trump's announcement of sweeping tariffs in April triggered sell-offs in stocks, bonds, and the dollar. Strong corporate earnings, expectations of Fed rate cuts, and better-than-expected economic growth then quickly brought investors back, leaving the S&P 500 up nearly 16.5% for the year [2]
- Despite the strong US performance, markets in China, Japan, the UK, and Germany outperformed the S&P 500 this year, and emerging-market indices also beat US stocks, as investors sought more diversified allocations after the early-year volatility in the US [4]

Group 2: Economic Resilience and Market Support
- The resilience of the US economy, combined with the clear shift in Fed monetary policy toward rate cuts, has been the core support for market performance, driving significant capital inflows into equities and reinforcing long-term bets on AI's potential. Better-than-expected US economic growth data also eased market anxiety and boosted risk appetite [8]

Group 3: Valuation Concerns
- Market valuations are well above historical averages, and analysts warn that the current rally, driven by tech giants, may not be sustainable. The Shiller cyclically adjusted price-to-earnings ratio for the S&P 500 is nearing 40, its second-highest level since the early-2000s internet bubble [6][10]
- After such a strong rally, sentiment has begun to turn cautious, with some investors and analysts questioning the sustainability of current conditions. The rally shows pronounced structural concentration and valuation divergence, driven primarily by a few tech giants, leading to a substantial deviation from long-term historical averages [10]

Group 4: Concentration Risk
- A rally driven by a small number of stocks is accumulating structural risk. The so-called "Magnificent Seven" US tech giants now account for about a quarter of the MSCI global developed-market index, tying global index movements tightly to the performance of a few individual companies and increasing overall market fragility [12]
- The growing concentration is prompting close scrutiny of the deal frenzy in the AI sector, which has created a complex, interdependent financial network: OpenAI, for example, holds stakes in key infrastructure suppliers while also receiving substantial investment from other industry participants, potentially amplifying systemic risk [14]
Just In: DeepSeek's New Year's Day Paper, Co-Authored by Liang Wenfeng, Aims to Open a New Chapter in Architecture
华尔街见闻· 2026-01-01 12:20
Core Insights
- DeepSeek has introduced a new architecture, Manifold-Constrained Hyper-Connections (mHC), to address the instability of traditional hyper-connections during large-scale model training while maintaining their significant performance gains [1][6][8]

Group 1: mHC Architecture
- mHC extends the single residual stream of traditional Transformers into a multi-stream parallel structure, using the Sinkhorn-Knopp algorithm to constrain the connection matrix to the manifold of doubly stochastic matrices [1][8]
- The core objective of mHC is to retain the performance gains from widening the residual stream while resolving training instability and excessive memory consumption [8][9]
- Empirical evidence shows that mHC not only resolves the stability issues but also scales exceptionally well in large-scale training: on a 27-billion-parameter model it increased training time by only 6.7% while delivering significant performance improvements [8][32]

Group 2: Challenges with Traditional Hyper-Connections
- Traditional hyper-connections (HC) suffer severe training instability and limited scalability because they fundamentally disrupt the inherent identity mapping property that is crucial for stable training [5][9]
- Widening the information channels in HC also increases memory access overhead, contributing to the "memory wall" problem [5][9]

Group 3: Implementation and Efficiency
- DeepSeek designed tailored infrastructure for mHC, including kernel fusion, selective recomputation, and an extended DualPipe communication-overlap strategy, to minimize memory usage and improve efficiency [23][25][27]
- The Sinkhorn-Knopp algorithm keeps the residual connection matrix stable and doubly stochastic, which helps mitigate gradient explosion [16][21]

Group 4: Experimental Validation
- The research team validated mHC with language-model pre-training experiments, comparing it against baseline models and traditional HC [28][32]
- Across a range of downstream benchmarks, mHC consistently outperformed baseline models and often surpassed HC, demonstrating its effectiveness in large-scale pre-training [33][34]
- Scalability experiments show that mHC retains its performance advantage at higher compute budgets, with only slight degradation [36][37]
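Putting the pieces together, an n-stream residual block with a doubly stochastic mixing matrix might look like the following. This is a hypothetical sketch of the idea described above, not the paper's actual design: the function names, the mean-aggregation step, and the toy layer are all my assumptions.

```python
import numpy as np

def mhc_style_block(streams, H, f):
    """One hypothetical mHC-style block: mix n parallel residual
    streams with a doubly stochastic matrix H, then add the layer
    output back into every stream.
    streams: (n, d) array holding n residual streams of width d."""
    mixed = H @ streams          # cross-stream mixing; H preserves total signal
    x = mixed.mean(axis=0)       # aggregate view fed to the layer (assumption)
    return mixed + f(x)          # broadcast the layer output to all streams

# Toy layer: a small fixed linear map standing in for attention/FFN
rng = np.random.default_rng(1)
W = 0.1 * rng.normal(size=(8, 8))
f = lambda x: W @ x

H = np.full((4, 4), 0.25)        # uniform matrix: trivially doubly stochastic
streams = rng.normal(size=(4, 8))
out = mhc_style_block(streams, H, f)
print(out.shape)                 # (4, 8): stream count and width preserved
```

With `f = lambda x: np.zeros_like(x)` the block reduces to pure mixing, and because H is doubly stochastic the mixing alone neither grows nor shrinks the total signal across streams, which is the identity-mapping-style guarantee the paper emphasizes.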