MiMo
The Battle of 2025: Large Models Revolve Around Open Source
36Kr· 2025-12-25 10:29
Core Viewpoint
- By 2025, open-source will dominate the landscape of large models, with a significant increase in the number of users adopting open-source models globally, marking a shift in the competitive dynamics between open and closed-source approaches [1][20].

Group 1: Open-Source vs Closed-Source Dynamics
- The debate between open-source and closed-source large models has been ongoing, with both sides presenting strong arguments, but the trend is shifting towards open-source as more major internet companies adopt this approach [1][5].
- Closed-source models, initially seen as the only viable path due to advantages in data security and commercial monetization, are now facing challenges in areas like AI accessibility and ecosystem development [3][10].
- The emergence of open-source models has created a new competitive landscape, with companies like Meta and Alibaba leading the charge in open-source initiatives [5][10].

Group 2: Impact of DeepSeek
- The introduction of DeepSeek has significantly altered the competitive balance, demonstrating that open-source models can achieve high performance at lower costs, thus attracting more companies to switch to open-source strategies [7][20].
- DeepSeek's training cost was approximately $294,000, with a training duration of about 80 hours, showcasing a more efficient approach compared to traditional methods [7].
- Open-source models like DeepSeek and Qwen have reportedly matched or even surpassed the performance of leading international products, shifting the focus of competition from pure performance to cost, efficiency, and commercialization capabilities [8][20].

Group 3: Market Trends and User Engagement
- The AI application market is rapidly evolving, with mobile and PC active user numbers reaching 729 million and 200 million respectively by September 2025, indicating a shift towards more specialized and efficient applications [11][13].
- Open-source models are seen as the quickest path to market, fostering a collaborative ecosystem that enhances user engagement and accelerates innovation [13][14].
- Companies are increasingly recognizing the long-term commercial value of high user engagement within open-source ecosystems, leading to a competitive race among internet giants to provide comprehensive open-source solutions [15][19].

Group 4: Commercialization of Open-Source
- Open-source does not equate to free; companies are exploring various monetization strategies, including enterprise versions, commercial APIs, and cloud services, to sustain their open-source initiatives [18][19].
- Alibaba has open-sourced over 300 models, generating more than 170,000 derivative models, positioning itself as a leader in the global open-source model landscape [16].
- Baidu is integrating its self-developed Kunlun chips with open-source models, adopting a full-stack autonomous approach to enhance its competitive edge [17].
Behind the debut of "genius girl" Luo Fuli: handpicked by Lei Jun himself, can she become Xiaomi's new trump card?
Sohu Finance· 2025-12-18 12:26
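This statement echoes the large-model portfolio Xiaomi is rolling out. This year Xiaomi has shipped models at an almost monthly cadence: the reasoning model MiMo in April, the multimodal MiMo-VL in May, the end-to-end speech model MiMo-Audio in September, and Miloco for home scenarios in November.

As of the third quarter of this year, the number of devices connected worldwide to Xiaomi's AIoT platform passed 1 billion for the first time, reaching 1.04 billion. Luo Fuli's arrival means Xiaomi is starting to tackle a longer-term question: within the "people, car, home" ecosystem, how should models understand the world and keep taking part in it?

And with the debut over, the test Lei Jun has set for Luo Fuli is only beginning.

A strong interest in "physical AI"

The first public appearance of Luo Fuli, head of Xiaomi's MiMo large model, quickly caused a stir in the industry.

On December 17, at Xiaomi's "People-Car-Home Full Ecosystem Conference", Luo Fuli stepped forward as a Xiaomi executive and gave a talk that was almost academic in style. Its focus was neither parameter counts nor benchmark scores; instead she put forward one judgment after another, setting out a line of thinking at the technical level.

Part of the attention she draws comes from the "AI genius girl" label. Born in 1995 in Yibin, Sichuan, she studied computer science as an undergraduate at Beijing Normal University and earned a master's degree in computational linguistics from Peking University's Institute of Computational Linguistics. In 2019 she drew outside attention for publishing eight papers at ACL, a top international conference in artificial intelligence.

On joining ...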
"Genius girl" Luo Fuli steps into the spotlight
Wallstreetcn· 2025-12-17 12:35
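Author | Zhou Zhiyu  Editor | Zhang Xiaoling

The appearance of a post-95 scientist drew outsized attention to what was otherwise a routine Xiaomi conference.

On December 17, Luo Fuli appeared at Xiaomi's partner conference. A former core member of DeepSeek, the post-95 scientist whom the industry has dubbed a "genius girl" made her debut as head of Xiaomi's MiMo large model.

On stage Luo Fuli seemed slightly unpolished, a touch out of place next to a Xiaomi long known for its seasoned marketing. But both the open-source model she presented, MiMo-V2-Flash, and the plan announced by Xiaomi Group President Lu Weibing to invest 200 billion yuan in R&D over the next five years gave outsiders a glimpse of this hardware maker's ambitions in the AI era.

Margins in traditional hardware manufacturing keep shrinking, while intelligent services built around large models have become a key variable in transforming business models and lifting market value. To keep Xiaomi from missing the next decade, Lei Jun spared no expense to recruit Luo Fuli; this is not only technical catch-up but also a strategic defense grounded in commercial logic.

The power map of the tech business is constantly being redrawn. Only by truly owning the core intelligent "brain" can Xiaomi's vast hardware ecosystem defend its moat rather than do the work for others' benefit.

The MiMo-V2-Flash that Luo Fuli demonstrated is aggressively doing subtraction. It uses an MoE (mixture-of-experts) architecture: although total parameters reach 309B, only 15B are activated at run time. Only a model this lightweight can ...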
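To put the two MoE figures quoted above (309B total parameters, 15B activated) side by side, here is a minimal arithmetic sketch. The function name `active_fraction` is a hypothetical helper, not anything from Xiaomi; the only inputs are the two numbers in the article.

```python
# Figures quoted in the article: 309B total parameters, 15B activated at run time.
TOTAL_PARAMS_B = 309.0
ACTIVE_PARAMS_B = 15.0


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters that are activated (hypothetical helper)."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
ratio = TOTAL_PARAMS_B / ACTIVE_PARAMS_B

print(f"activated share of parameters: {fraction:.1%}")
print(f"total-to-active ratio: {ratio:.1f}x")
```

On these figures roughly one parameter in twenty is active at a time. How that translates into latency or memory on a given device depends on details the article does not state.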
China's Xiaomi says returns from AI investments 'far exceed expectations'
Yahoo Finance· 2025-12-05 09:30
Core Insights
- Xiaomi's investment in artificial intelligence (AI) has yielded returns in 2025 that significantly exceeded expectations, according to the company's president Lu Weibing [1]
- The company is shifting focus towards embodied AI after substantial investments in general AI over recent quarters, paralleling strategies seen in companies like Tesla [2]
- Xiaomi's advancements in AI large models and applications have surpassed initial expectations, with a belief that the integration of AI with the physical world represents the next generation of intelligent technology [3]

AI Developments
- Xiaomi launched its first AI model, MiMo, in April and has recently open-sourced MiMo-Embodied, which showcases advanced performance in autonomous driving and embodied AI tasks [4]
- The company has seen increased interest in AI applications within the electric vehicle (EV) sector, highlighted by the introduction of its premium SU7 Ultra, which features Hyper-Autonomous Driving technology [5]
- There is a growing interest in embodied AI, leading Xiaomi to enhance its investments in robotics, following the introduction of its robot dog in 2021 and a humanoid prototype in 2022 [6]

Talent Acquisition
- Xiaomi is actively seeking talent to bolster its AI initiatives, recently hiring Luo Fuli, a former researcher from DeepSeek, to lead the MiMo team [7]

Financial Performance
- Xiaomi's smart EV and AI initiatives turned a profit for the first time in Q3, generating a record revenue of 29 billion yuan (approximately US$4.1 billion), marking a 199% year-on-year increase [8]
- The company's total revenue for the quarter rose by 22% year-on-year to 113.1 billion yuan [8]
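The year-on-year percentages above imply prior-year figures that the summary does not state. The sketch below back-calculates them from the quoted numbers only; `implied_prior_value` is a hypothetical helper, and treating the 29 billion yuan as a segment of the 113.1 billion yuan total is an assumption, not something the summary confirms.

```python
def implied_prior_value(current: float, yoy_growth_pct: float) -> float:
    """Back out last year's value from this year's value and YoY growth in percent."""
    return current / (1.0 + yoy_growth_pct / 100.0)


# Figures quoted in the summary, in billions of yuan.
ev_ai_revenue = 29.0    # smart EV and AI initiatives, up 199% year on year
total_revenue = 113.1   # total quarterly revenue, up 22% year on year

ev_ai_prior = implied_prior_value(ev_ai_revenue, 199.0)
total_prior = implied_prior_value(total_revenue, 22.0)

# Assumption: the EV and AI figure is counted inside total revenue.
ev_ai_share = ev_ai_revenue / total_revenue

print(f"implied prior-year EV and AI revenue: {ev_ai_prior:.1f}B yuan")
print(f"implied prior-year total revenue: {total_prior:.1f}B yuan")
print(f"EV and AI share of total revenue: {ev_ai_share:.1%}")
```

These are derived estimates rather than reported figures; rounding in the quoted percentages shifts them slightly.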
After Luo Fuli's official announcement, Xiaomi unveils its first big AI move: one-click large-model access for 1 billion IoT devices
36Kr· 2025-11-14 11:16
Core Insights
- Xiaomi has launched its first "large model + smart home" solution called Xiaomi Miloco, which is a local AI assistant designed to enhance smart home interactions [1][3].

Product Overview
- Xiaomi Miloco utilizes the MiMo-VL-Miloco-7B model, which is based on the previously released MiMo model, and connects to various IoT devices in the home [2][11].
- The solution aims to provide a new interaction paradigm through natural language processing, allowing users to communicate with their smart home systems [5][6].

Features and Capabilities
- Miloco features a new interaction paradigm that allows users to set rules and control devices using natural language [5].
- It leverages visual data from Xiaomi's cameras to analyze home events and respond to user queries [5].
- The model operates on edge devices, ensuring privacy and security by processing data locally without sending it to external servers [14].

Ecosystem Integration
- Miloco connects with the Xiaomi ecosystem and supports integration with third-party IoT platforms, enhancing its functionality [6][9].
- The hardware requirements for deploying Miloco are minimal, needing only x64 architecture and a GPU from NVIDIA's 30 series or higher [6].

Market Context
- The launch of Miloco is seen as a significant moment for smart home technology, comparable to the impact of ChatGPT in the AI space [3][14].
- Xiaomi's move follows similar advancements from competitors, indicating a competitive landscape in the smart home sector [14].
Luo Fuli takes center stage at Xiaomi in her first official announcement since leaving DeepSeek
猿大侠· 2025-11-14 04:11
Core Viewpoint
- Luo Fuli has officially joined Xiaomi as the head of the MiMo team, focusing on advancing multi-modal spatial intelligence, which is a crucial step towards achieving true Artificial General Intelligence (AGI) [4][24].

Timeline of Events
- Rumors about Luo Fuli joining Xiaomi surfaced at the end of last year, with reports indicating that Lei Jun offered her a salary in the millions to lead Xiaomi's AI efforts [5][10].
- Significant milestones include the launch of DeepSeek-V3 on December 25, followed by media reports of Xiaomi assembling a GPU cluster the next day [6][7].
- On December 31, 2024, Lei Jun publicly shared Xiaomi's ambitions in AI during a New Year's live stream [8].

Background of Luo Fuli
- Luo Fuli holds a Bachelor's degree in Computer Science from Beijing Normal University and a Master's degree in Computational Linguistics from Peking University, where she has published papers in top NLP conferences [15].
- She has worked at Alibaba's DAMO Academy and later at DeepSeek, contributing to the development of various deep learning models [17].
- Her academic work has garnered over 11,000 citations, with approximately 8,000 citations added in the past year alone [18].

Xiaomi's AI Strategy
- The MiMo initiative is central to Xiaomi's efforts in developing large models, with a focus on "spatial intelligence," which aims to bridge the gap between information AI and physical AI [24][26].
- Luo Fuli's role is seen as pivotal in connecting Xiaomi's AI research with academic institutions, particularly with her former mentor from Peking University [22].

Concept of Spatial Intelligence
- Spatial intelligence is described as the ultimate goal of integrating information AI with physical AI, facilitating a seamless connection between the digital and physical worlds [26].
- This concept aligns with Xiaomi's broader ecosystem strategy, which encompasses people, vehicles, and home integration [26].
Luo Fuli takes center stage at Xiaomi in her first official announcement since leaving DeepSeek
量子位· 2025-11-12 08:01
Core Insights
- Luo Fuli has officially announced her position at Xiaomi, leading the MiMo team to advance the development of multi-modal spatial intelligence, a key step towards achieving Artificial General Intelligence (AGI) [1][3][7]

Group 1: Background and Context
- Rumors about Luo Fuli joining Xiaomi surfaced at the end of last year, with reports indicating that she was recruited by Lei Jun with a salary of tens of millions [4][10]
- Significant events include the launch of DeepSeek-V3 on December 25, followed by media reports of Xiaomi assembling a GPU cluster [5][6]
- Luo Fuli's name appeared in Xiaomi's AI team papers as an independent researcher prior to her official announcement [11][20]

Group 2: Luo Fuli's Profile
- Luo Fuli holds a Bachelor's degree in Computer Science from Beijing Normal University and a Master's degree in Computational Linguistics from Peking University, with numerous publications in top NLP conferences [15][17]
- She has over 11,000 citations for her academic papers, with approximately 8,000 citations added in the current year alone [18]
- Luo previously worked at Alibaba's DAMO Academy and DeepSeek, contributing to the development of various deep learning models [17]

Group 3: Xiaomi's AI Ambitions
- Xiaomi aims to enter the deep waters of AI following the establishment of its automotive business, with a focus on spatial intelligence [9][24]
- The concept of spatial intelligence, as articulated by Luo Fuli, involves bridging the gap between information AI and physical AI, which aligns with Xiaomi's ecosystem of people, vehicles, and homes [23][25]
Official: post-95 "AI genius girl" Luo Fuli joins Xiaomi as Lei Jun finally succeeds in "poaching" her
Sou Hu Cai Jing· 2025-11-12 07:43
Core Insights
- The article highlights the rise of Luo Fuli, a talented AI researcher, who gained significant attention after her involvement in the successful development of the DeepSeek-V2 model, which is recognized as a leading Chinese AI model [2][3].

Group 1: Luo Fuli's Background and Achievements
- Luo Fuli, known as a "genius girl," began her journey in AI while studying computational linguistics at Peking University, where she published eight papers at the prestigious ACL conference in 2019 [2].
- Her notable contributions to the DeepSeek-V2 model, which offers high cost-effectiveness at 1 yuan per million input tokens, positioned her as a key figure in the AI community [2].

Group 2: Transition to Xiaomi
- Reports indicate that Luo Fuli left DeepSeek in February 2025, and her name appeared in a paper co-authored by Xiaomi's AI team and Peking University in October 2025, suggesting her transition to Xiaomi [5].
- Xiaomi's acquisition of Luo Fuli is seen as a strategic move, as the company is building a robust AI ecosystem, including the MiMo model and a GPU cluster, which can leverage her expertise [6].

Group 3: Talent Competition in AI
- The article emphasizes the intense competition for top AI talent, with companies vying for individuals capable of developing practical AI products [6].
- Luo Fuli's rise to prominence reflects the scarcity of elite AI professionals, making her a highly sought-after asset in the industry [6].

Group 4: Personal Attributes and Work Ethic
- Despite her accolades, Luo Fuli maintains a humble approach, focusing on technical challenges and expressing a desire to work quietly on meaningful projects [8].
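Supervised learning isn't dead: train on one problem for five hours and take off! Chinese scholars' new method unlocks LLM reasoning with 20x training efficiency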
量子位· 2025-08-04 07:00
Core Viewpoint
- The article discusses the breakthrough of One-Shot Critique Fine-Tuning (One-Shot CFT) in enhancing reasoning capabilities of large language models (LLMs) with minimal data and computational resources, outperforming traditional reinforcement learning (RL) methods and small-scale supervised fine-tuning (SFT) approaches [1][3][14].

Group 1: One-Shot CFT Methodology
- One-Shot CFT is a new method that allows models to learn reasoning by analyzing the quality of answers rather than merely imitating them, thus providing a deeper learning signal [3][12].
- The process involves selecting a representative task, generating multiple answers using various models, and then having a more powerful model critique these answers, which serves as the supervision signal for training [4][5].
- The entire training process requires only one question, multiple answers, and critiques, taking approximately 5 GPU hours, significantly less than RL methods [5][14].

Group 2: Performance and Results
- In experiments, Qwen2.5-Math-7B achieved a 15% accuracy increase after One-Shot CFT fine-tuning on a single question, surpassing both RL and full supervised fine-tuning models that used tens of thousands of training samples [9][10].
- The method demonstrated strong performance across various mathematical and logical reasoning tasks, with accuracy improvements ranging from 10% to 16% in specific sub-tasks [10][11].
- One-Shot CFT showed stability and reproducibility across different tasks and model configurations, indicating its robustness [11][13].

Group 3: Advantages of One-Shot CFT
- The method emphasizes critical learning, allowing models to understand why answers are correct or incorrect, which enhances the depth of learning compared to traditional SFT [12].
- It introduces multi-perspective inputs by generating multiple answers and critiques for a single task, closely mimicking human learning processes [12].
- The training signals from critiques are highly generalizable, reducing the risk of overfitting and allowing for easier transfer to new tasks [12].

Group 4: Accessibility and Practical Implications
- One-Shot CFT's low computational cost makes it accessible for individual researchers, resource-limited labs, and startups, providing a cost-effective solution for enhancing reasoning capabilities [14][15].
- The entire process is open-source, including training scripts, model parameters, and datasets, which significantly lowers the barrier for replication and experimentation [17].
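The data-construction step described under Group 1 (one question, several candidate answers, one critique per answer used as the training target) can be sketched as follows. This is a minimal illustration and not the authors' released code: `CritiqueExample`, `build_one_shot_cft_dataset`, the prompt wording, and `toy_critic` (a stand-in for the stronger critic model) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CritiqueExample:
    prompt: str  # question plus one candidate answer, shown to the model being tuned
    target: str  # critique text the model is trained to produce


def build_one_shot_cft_dataset(
    question: str,
    candidate_answers: List[str],
    critique_fn: Callable[[str, str], str],
) -> List[CritiqueExample]:
    """Pair each candidate answer to the single question with its critique."""
    examples = []
    for answer in candidate_answers:
        prompt = (
            f"Question:\n{question}\n\n"
            f"Candidate solution:\n{answer}\n\n"
            "Critique the solution and state whether it is correct."
        )
        examples.append(CritiqueExample(prompt, critique_fn(question, answer)))
    return examples


def toy_critic(question: str, answer: str) -> str:
    """Hypothetical stand-in for the stronger model that writes critiques."""
    verdict = "correct" if answer.strip().endswith("4") else "incorrect"
    return f"The final answer is {verdict}."


dataset = build_one_shot_cft_dataset(
    "What is 2 + 2?",
    ["2 + 2 = 4", "2 + 2 = 5"],
    toy_critic,
)
for example in dataset:
    print(example.target)
```

In a real run the candidate answers would come from several generator models and the resulting prompt and target pairs would feed an ordinary supervised fine-tuning loop; both are outside this sketch.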
Apple and Meta scramble for AI, poaching talent and pursuing acquisitions
Huxiu· 2025-06-23 23:27
Core Insights
- Apple and Meta are intensifying their efforts in AI, realizing its potential to disrupt device experiences and advertising models [1][2]
- Both companies face challenges in talent acquisition and strategic direction, risking marginalization in the AI landscape [3][12]

Group 1: AI Competition and Acquisitions
- Apple and Meta are competing against AI giants like Microsoft, Amazon, Google, and OpenAI, with significant valuations for potential acquisition targets such as Perplexity at $14 billion and Thinking Machines Lab at $10 billion [2][23]
- Meta has acquired nearly half of Scale AI for $14.3 billion and is considering other acquisitions like SSI, valued at $32 billion, and several other AI companies with valuations ranging from $4.5 billion to $62 billion [2][21]

Group 2: Strategic Challenges
- Both companies are struggling with a lack of direction and talent, leading to confusion in strategic execution [3][12]
- Apple has not delivered substantial AI innovations at its recent developer conference, raising concerns about its future in the AI ecosystem [6][13]

Group 3: Market Position and Threats
- Apple is losing its dominance in the smartphone market, with competitors like Huawei and Xiaomi advancing rapidly in AI capabilities [8][22]
- Google is solidifying its position in AI search and video, posing a direct threat to Meta's advertising market, particularly in short videos [7][10]

Group 4: Talent Acquisition Efforts
- Zuckerberg is actively recruiting top talent in AI, emphasizing the importance of building a strong team to drive Meta's AI initiatives [15][18]
- Apple is also seeking to enhance its AI capabilities by potentially acquiring or collaborating with companies like Mistral and Thinking Machines Lab [19][21]

Group 5: Future Outlook
- The competition for AI talent and technology is intensifying, with both Apple and Meta needing to adapt quickly to avoid being left behind [12][23]
- The ongoing mergers and acquisitions in Silicon Valley signal a new wave of consolidation in the AI sector, with both companies needing to act decisively [23]