DeepSeek
DeepSeek's New Model MODEL1 Revealed
Jin Rong Jie· 2026-01-20 23:59
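On the first anniversary of DeepSeek-R1's release, a new model, "MODEL1", has come to light. DeepSeek updated the FlashMLA code on GitHub: across 114 files there are 28 mentions of MODEL1, where it appears as a model distinct from V32. V32 is known to be DeepSeek-V3.2, so MODEL1 is very likely a new architecture. The concrete differences in the code concern KV cache layout, sparsity handling, and FP8 decoding, with several differences in memory optimization. Earlier reports said DeepSeek would release its next-generation flagship model around the Spring Festival in mid-February. ...
Cost Reduction and Efficiency Gains in China's Large Models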
Xin Lang Cai Jing· 2026-01-20 17:50
Core Insights
- MiniMax, an AI company founded by Yan Junjie, focuses on transforming large model capabilities into consumer-grade products, achieving significant international reach with over 212 million users [1][3]
- The company reported a revenue of $53.43 million in the first nine months of 2025, with over 70% of this revenue coming from overseas markets [1][8]

Group 1: Company Overview
- MiniMax was established in 2022 and quickly became a competitor to OpenAI, launching its first text model in April 2022 and its first AI-native multimodal interaction platform, Talkie, in early 2023 [3][5]
- The company went public on January 9, 2026, after just four years of operation, with a market capitalization of HKD 122.3 billion [4][5]

Group 2: Product Offerings
- MiniMax has developed several AI-native products, including Talkie, Hailuo AI, and MiniMax Voice, which collectively contributed 71.1% of the company's revenue in the first nine months of 2025 [6][5]
- Talkie and Xingye generated $758,000 in revenue in 2023, while Hailuo AI began contributing revenue in 2024 [6]

Group 3: Market Strategy
- MiniMax's strategy emphasizes delivering products rather than just APIs, aiming to enhance brand image while providing scalable user applications [7][9]
- The company serves over 200 million individual users and more than 100,000 enterprises across 200 countries, with 70% of its revenue derived from international markets [8][9]

Group 4: Competitive Positioning
- The API price of MiniMax's latest language model, MiniMax M2, is approximately 30% of that of leading overseas models [9]
- The company aims to expand application scenarios, differentiating itself from competitors like DeepSeek, which focus on optimizing infrastructure [9]
From Liang Wenfeng to Yan Junjie: "Cost Reduction" and "Efficiency Gains" in China's Large Models
Bei Jing Shang Bao· 2026-01-20 12:32
Core Insights
- MiniMax, founded by Yan Junjie, is an AI company that has gained significant traction with 212 million users, focusing on transforming large model capabilities into consumer-grade products [2][3]
- In the first nine months of 2025, 71.1% of the company's revenue came from AI-native products [6]

Group 1: Company Overview
- MiniMax was established in 2022 and has quickly positioned itself in the global market, launching products like Talkie and Hailuo AI [3][5]
- The company went public on January 9, 2026, after just four years of operation, highlighting its rapid growth trajectory [3]

Group 2: Product Development and Revenue
- MiniMax's revenue for the first nine months of 2025 reached $53.43 million, with over 70% derived from international markets [2][6]
- The revenue contributions from various products include Talkie/Xingye (星野) at 35.1%, Hailuo AI at 32.6%, MiniMax Voice at 2%, and MiniMax Agent at 1.4% [6]

Group 3: Market Strategy
- MiniMax differentiates itself by focusing on product development rather than just API offerings, aiming to enhance user engagement and brand image [8]
- The company has successfully penetrated over 200 countries, serving more than 200 million individual users and over 100,000 enterprises [9]

Group 4: Competitive Positioning
- MiniMax's pricing strategy for its latest language model, MiniMax M2, is significantly lower than that of leading overseas models, making it competitive in the global market [9]
- The company's approach contrasts with DeepSeek, which focuses on infrastructure optimization, while MiniMax emphasizes application expansion [9]
With Cracks Appearing in Its Relationship with the U.S., Europe Wants to Follow China's Example and Build Its Own DeepSeek
Feng Huang Wang· 2026-01-20 08:21
Core Insights
- European AI companies are seeking to innovate and reduce reliance on American technology amid rising geopolitical tensions with the U.S. [4]
- The success of the Chinese AI startup DeepSeek has inspired European researchers to explore alternative paths for developing competitive AI products [5]
- European governments are committing hundreds of millions of dollars to decrease dependence on foreign AI suppliers [5]

Group 1: Current Landscape
- U.S. companies dominate the AI industry across various segments, including processor design, data center capacity, and application development [4]
- The perception that innovation is solely occurring in the U.S. is considered dangerous, as it may discourage European efforts to compete [5]
- European AI labs may have an advantage in open research and development, allowing for collaborative improvements on models [5]

Group 2: Urgency for Autonomy
- The changing geopolitical landscape has heightened the urgency for Europe to achieve self-sufficiency in AI technology [6]
- Tensions between European leaders and the Trump administration have raised concerns about the future of NATO and the reliance on U.S. technology [6][7]
- European dependence on U.S. AI services is viewed as a potential liability in trade negotiations [7]

Group 3: Strategies for Development
- European countries are attempting to localize AI development through funding initiatives, regulatory adjustments, and partnerships with academic institutions [8]
- There is a focus on creating competitive large language models tailored for European languages [8]
- The ongoing success of U.S. platforms like ChatGPT makes it harder for European AI companies to catch up [9]

Group 4: Policy and Market Dynamics
- There is ambiguity regarding how far Europe intends to push for "digital sovereignty" and whether it requires complete self-sufficiency or just local alternatives [10]
- Some European suppliers advocate for strategies that prioritize local AI products, while others warn against excluding U.S. companies [10]
- Europe has yet to reach consensus on the policy measures needed to achieve self-sufficiency in AI [10]

Group 5: Future Aspirations
- Despite limited budgets, European AI labs believe they can close the performance gap with U.S. leaders, as demonstrated by DeepSeek [11]
- Projects like SOOFI aim to develop competitive language models with around 100 billion parameters [11]
- The future progress in AI may not solely depend on the largest GPU clusters, indicating a shift in the competitive landscape [11]
Unknown Institution: The Value of Foundation Model Vendors Remains Underestimated (Huatai Computer, 0120) -20260120
Unknown Institution · 2026-01-20 02:10
Summary of Conference Call Notes

Industry and Companies Involved
- The discussion primarily revolves around the AI model industry, specifically focusing on companies such as Zhipu and MiniMax, which are involved in foundational model training and deployment [1][2].

Core Insights and Arguments
- **Misunderstanding of Business Models**: Many leaders still perceive Zhipu as a company focused on B2B project deployment and MiniMax as a B2C internet application provider. The report argues that these applications are merely commercial representations to provide visible returns to investors, while the true core lies in their foundational model training capabilities, which are among the top tier globally in open-source models [1].
- **Valuation of Kimi**: Kimi, a pre-IPO company, completed a $500 million financing round at the end of December, achieving a valuation of $4.3 billion. Shortly after, Kimi initiated another financing round at a pre-money valuation of $4.8 billion. This rapid increase indicates a fear of missing out (FOMO) in the primary market regarding investments in large models, suggesting a re-evaluation of the value of domestic large models [1].
- **Recognition of AI Model Companies**: MiniMax's founder, Yan Junjie, participated in a significant roundtable discussion, becoming the second representative from an AI large model company to do so, following DeepSeek's founder. This participation highlights the industry's acknowledgment of the position of large model manufacturers [2].
- **Differences in AI Development**: The current wave of AI differs fundamentally from the previous wave of computer vision. While computer vision primarily addressed single recognition tasks, general large models possess greater potential across various domains such as work, life, and scientific discovery. The report suggests that this time, there will not be a decline in technology premium due to control by terminal manufacturers [3].
- **Market Potential of Foundational Models**: The report emphasizes the need to evaluate the valuations of Zhipu and MiniMax from a higher perspective, considering the contribution of large models to global GDP and the market share they could capture in the future. It suggests that the commercialization of large models is still evolving, with many pathways yet to be explored [3].

Other Important but Potentially Overlooked Content
- **Product Launches and Market Awareness**: The recent launch of Anthropic's CoWork Agent product, which was entirely coded using Claude Code, quickly gained popularity, further highlighting the potential embedded within foundational model manufacturers [3].
He Fan's New Year Speech: This Is the Dividend for Young People!
Sou Hu Cai Jing· 2026-01-19 17:08
Core Insights
- The central theme of the discussion revolves around the "Beauty Revolution" as a new approach to escape the phenomenon of "involution" in China, emphasizing the importance of aesthetic values in various industries and individual lives [4][18].

Group 1: Economic and Cultural Shifts
- 2025 is identified as a pivotal year marked by significant changes, with a shift in the "offensive and defensive" dynamics in the economy [8][11].
- The emergence of DeepSeek, an AI model company, symbolizes a breakthrough in the tech landscape, challenging the notion that only Silicon Valley giants can dominate this field [9][10].
- Macro policies are expected to shift focus from security to domestic welfare and consumption, reflecting a more confident stance in policy-making [12][14].

Group 2: The Aesthetic Revolution
- The "Beauty Revolution" is presented as a response to the overwhelming pressures of competition and the desire for a better quality of life [18][20].
- Aesthetic values are seen as essential skills and core competitive advantages for individuals and businesses in the future [5][48].
- The discussion highlights how different economic periods influence cultural trends, with the current focus on subcultures driven by youth and marginalized groups [22][30].

Group 3: Youth as Trendsetters
- Young people are identified as the primary creators and drivers of cultural trends, with their aesthetic preferences shaping the future of beauty and lifestyle [34][36].
- The popularity of basic fashion items among youth reflects their growing confidence in personal style, moving away from reliance on luxury brands [37][38].
- Nostalgic trends, such as the Y2K style, indicate a desire to connect with the vibrancy of past economic upswings, rather than a literal return to those times [40][41].

Group 4: Business Opportunities in Aesthetic Values
- The economic downturn presents unique opportunities for small and niche brands that understand and cater to specific lifestyle aesthetics [48][49].
- Examples of innovative products, such as wearable sleeping bags and designer thermoses, illustrate how aesthetic considerations can drive consumer interest and market success [55][56].
- The case of Hoto, a design-focused tool brand, demonstrates how aesthetic improvements can disrupt traditional markets characterized by low differentiation [60][68].

Group 5: Changing Consumer Preferences
- The hospitality industry is adapting to new consumer behaviors, with hotels evolving to meet the diverse needs of younger guests [70][76].
- Real estate developments are shifting from merely selling properties to offering lifestyle experiences, as seen in the case of the "Tangu" project [78][90].
- The integration of art and culture into rural areas is transforming local economies and attracting younger populations, enhancing the vibrancy of these communities [100][106].

Group 6: The Role of Aesthetics in Competitiveness
- Aesthetic sensibility is becoming a crucial competitive edge in various industries, as businesses that embrace this shift can better connect with consumers [92][112].
- The trend of "urban nomads" highlights the need for cities to adapt to transient populations, focusing on creating environments that foster creativity and community [109][111].
After Liang Wenfeng, Another Large-Model Company Representative Attends the Premier's Symposium
Bei Ke Cai Jing· 2026-01-19 14:08
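Beijing News Shell Finance (reporter Luo Yidan): On the afternoon of January 19, Li Qiang, member of the Standing Committee of the Political Bureau of the CPC Central Committee and Premier of the State Council, chaired a symposium with experts, entrepreneurs, and representatives from the education, science, culture, health, and sports sectors to hear their opinions and suggestions on the drafts for comment of the Government Work Report and the Outline of the 15th Five-Year Plan (Draft). Yan Junjie, founder and CEO of MiniMax (稀宇科技), attended the symposium and spoke, becoming the second representative of an AI large-model company to attend, after DeepSeek founder Liang Wenfeng.
Yan Junjie, born in 1989 and a native of Henan, is the founder and CEO of MiniMax. He received his PhD from the Chinese Academy of Sciences in July 2015 and has published about 200 academic papers in top conferences and journals. MiniMax was founded in 2022 and listed in Hong Kong on January 9, 2026. Its current total market capitalization is HKD 118.6 billion; to date it has more than 212 million users across over 200 countries and regions, and more than 70% of its revenue comes from overseas markets.
Editor: Yue Caizhou. Proofreader: Jia Ning. ...
No Business Model: DeepSeek's Strongest "Moat"
Hua Er Jie Jian Wen · 2026-01-19 09:46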
Core Viewpoint
- DeepSeek's unique advantage lies in its lack of a commercial model, allowing it to focus solely on its AGI (Artificial General Intelligence) aspirations without external pressures or funding requirements [3][8][12].

Group 1: Market Expectations and Competition
- The market's expectations for DeepSeek's upcoming model are tempered by the saturation of open-source models, making it less likely to shock the world again as it did previously [3][4].
- DeepSeek is no longer the only or the most open player in the market, as other labs have quickly followed suit with their own models [5][8].

Group 2: Funding and Control
- DeepSeek's founder, Liang Wenfeng, has maintained a "zero external financing" approach, prioritizing control over financial gain, which is unique among top labs [3][9].
- The success of Liang's quantitative fund, which generated over $700 million in profit with a 53% return rate, allows DeepSeek to fund its operations without external investment [3][11].

Group 3: Advantages of No Commercial Model
- The absence of external funding means DeepSeek is not burdened by commercial KPIs, allowing it to focus purely on technological advancements [3][12].
- The lack of external financial pressure supports a flat organizational structure, reducing the internal competition and bureaucracy that can hinder innovation [14][15].

Group 4: Research and Resource Allocation
- DeepSeek's limited resources do not impede its research quality, as good research does not necessarily require excessive computational power [13][14].
- The organization can prioritize innovative ideas without the distractions and conflicts that often accompany larger, well-funded labs [15][18].
He Rented 8 H100s and Successfully Reproduced DeepSeek's mHC, with Results Even More Striking Than the Official Report
Ji Qi Zhi Xin · 2026-01-19 08:54
Core Insights
- DeepSeek's mHC architecture addresses numerical instability and signal explosion issues in large-scale training by extending traditional Transformer residual connections into a multi-stream parallel architecture [1][5]
- The mHC model has garnered significant attention in the AI community, with successful reproductions yielding better results than the original DeepSeek paper [5][6]

Group 1: mHC Architecture
- The mHC model utilizes the Sinkhorn-Knopp algorithm to constrain the connection matrix to a doubly stochastic matrix manifold, ensuring stability during training [1][25]
- Traditional residual connections in Transformers have remained unchanged since 2016, relying on a single information flow, while mHC introduces multiple parallel streams for enhanced expressiveness [9][14]
- The mHC architecture maintains stability by preventing signal amplification, which can lead to catastrophic failures in large models [20][28]

Group 2: Experimental Results
- In experiments with 10M parameters, the original hyper-connection (HC) model exhibited a signal amplification of 9.2 times, while mHC maintained stability with an amplification of 1.0 [36][61]
- Scaling up to 1.7B parameters, the HC model showed an alarming amplification of 10,924 times, highlighting the instability associated with larger models [54][66]
- The experiments demonstrated that while HC models accumulate instability, mHC models consistently maintain structural integrity across different training conditions [70][71]

Group 3: Implications and Future Directions
- The findings suggest that while traditional residual connections are stable, they may not be optimal for larger models, as mHC offers a balance between expressiveness and stability [57][58]
- Future research aims to explore scaling laws further, particularly at the 10B parameter scale, where significant amplification trends are anticipated [101]
- The mHC approach not only mitigates instability but also eliminates the risk of catastrophic failures in large-scale training scenarios [93][96]
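
The stability argument in this summary can be illustrated with a small sketch. The code below is not DeepSeek's implementation or the reproducer's code: the function names, the four-stream setup, the depth of 32, and the use of the maximum absolute row sum as a gain proxy are all assumptions made for illustration. It applies Sinkhorn-Knopp (alternating row and column normalisation of a positive matrix) to random stream-mixing matrices and compares the gain of the stacked product with and without the doubly stochastic constraint.

```python
import math
import random


def sinkhorn_knopp(logits, n_iters=100):
    """Approximately map a square matrix of logits to a doubly stochastic matrix."""
    # Exponentiate so that every entry is strictly positive.
    m = [[math.exp(x) for x in row] for row in logits]
    n = len(m)
    for _ in range(n_iters):
        # Normalise each row to sum to 1.
        m = [[v / sum(row) for v in row] for row in m]
        # Normalise each column to sum to 1.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return m


def matmul(a, b):
    """Plain square matrix product, kept dependency-free on purpose."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_abs_row_sum(m):
    """Infinity norm: a hypothetical proxy for worst-case signal gain."""
    return max(sum(abs(v) for v in row) for row in m)


def stacked_gain(n_streams=4, depth=32, constrained=True, seed=0):
    """Gain of the product of `depth` random stream-mixing matrices."""
    rng = random.Random(seed)
    n = n_streams
    prod = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(depth):
        raw = [[rng.gauss(0.0, 0.5) for _ in range(n)] for _ in range(n)]
        if constrained:
            # mHC-style assumption: mixing weights are doubly stochastic.
            layer = sinkhorn_knopp(raw)
        else:
            # Unconstrained assumption: identity plus free-form mixing weights.
            layer = [[(1.0 if i == j else 0.0) + raw[i][j] for j in range(n)]
                     for i in range(n)]
        prod = matmul(layer, prod)
    return max_abs_row_sum(prod)


if __name__ == "__main__":
    print("constrained gain:  ", stacked_gain(constrained=True))
    print("unconstrained gain:", stacked_gain(constrained=False))
```

Because a product of doubly stochastic matrices is itself doubly stochastic, the constrained gain stays at about 1 regardless of depth, whereas the unconstrained product is free to grow or shrink with depth. The sketch shows only this qualitative contrast and makes no attempt to match the amplification figures reported above.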
Having No Business Model Is DeepSeek's Strongest "Moat"
36 Ke· 2026-01-19 08:22
Core Insights
- The article discusses the upcoming anniversary of DeepSeek and the expectations surrounding its new model release, emphasizing that the market should temper its expectations as the AI landscape has evolved significantly since last year [1][10].

Group 1: Business Model and Funding
- DeepSeek's strongest competitive advantage is its unique model of zero external financing, allowing it to pursue its AGI dream without commercial pressures [2][15].
- The founder, Liang Wenfeng, prioritizes control over financial backing, making DeepSeek an outlier in a capital-driven AI industry [3][18].
- DeepSeek's funding comes from its profitable quantitative fund, Huanfang Quantitative, which generated over $700 million (approximately 5 billion RMB) in profit last year, allowing for investment in resources without external investor pressure [4][18].

Group 2: Market Position and Competition
- The article warns that while DeepSeek previously led the market with its models, it is no longer the only or the most open player, as many competitors have emerged with open-source models [10][11].
- The expectation that DeepSeek will release a groundbreaking model is tempered by the reality that the market is now saturated with open-source alternatives, diminishing its unique position [10][14].

Group 3: Internal Dynamics and Research Quality
- The absence of external funding allows DeepSeek to maintain a flat organizational structure, reducing internal competition and bureaucracy, which can hinder research quality [20][22].
- The article highlights that excessive funding can lead to "big company syndrome," where resources are mismanaged and research quality suffers, a situation DeepSeek avoids by self-funding [6][20].
- The focus on research quality over sheer computational power is emphasized, with insights from Ilya Sutskever suggesting that significant breakthroughs do not necessarily require vast computational resources [7][21].

Group 4: Investor Perspective
- The author expresses a paradoxical desire to invest in DeepSeek while recognizing that accepting external funding would compromise its unique characteristics and mission [9][25].
- The article concludes that DeepSeek's lack of a commercial model is its enduring strength, allowing it to align its internal goals with its AGI research without external pressures [25].