量子位
The next generation impresses! Kaiming He's team releases new work; a co-first author is a sophomore in Tsinghua's Yao Class
量子位· 2025-12-03 09:05
Core Viewpoint - The article introduces Improved MeanFlow (iMF), which addresses key issues in the original MeanFlow (MF) model and improves training stability, guidance flexibility, and architectural efficiency [1].
Group 1: Model Improvements
- iMF reformulates the training objective as a more stable instantaneous-velocity loss and adds flexible classifier-free guidance (CFG) and efficient in-context conditioning, which together significantly improve model performance [2][14].
- On the ImageNet 256x256 benchmark, the iMF-XL/2 model achieves an FID of 1.72 with a single function evaluation (1-NFE), a 50% improvement over the original MF, demonstrating that single-step generative models can match the performance of multi-step diffusion models [2][25].
Group 2: Technical Enhancements
- The core improvement of iMF is a reformulated prediction function that turns training into a standard regression problem [4].
- iMF constructs the loss from the perspective of instantaneous velocity, which stabilizes the training process [9][10].
- The model simplifies its input to a single noisy data point and changes how the prediction function is computed, removing the dependency on external approximations [11][12][13].
Group 3: Flexibility and Efficiency
- iMF internalizes the guidance scale as a learnable condition, so the model learns average velocity fields under varying guidance strengths and CFG can be adjusted flexibly at inference time [15][16][18].
- The improved in-context conditioning architecture eliminates the need for the large adaLN-zero mechanism, reducing model size and improving efficiency; iMF-Base has about one-third fewer parameters [19][24].
Group 4: Experimental Results
- iMF performs strongly on challenging benchmarks: at 1-NFE, iMF-XL/2 reaches an FID of 1.72, outperforming many pre-trained multi-step models [26][27].
- At 2-NFE, iMF further narrows the gap between single-step and multi-step diffusion models, achieving an FID of 1.54 [29].
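To make the 1-NFE and 2-NFE terminology concrete, here is a minimal sketch of few-step sampling with an average-velocity field. It assumes the MeanFlow-style update `z_r = z_t - (t - r) * u(z_t, r, t)` and replaces the trained network with a hypothetical closed-form `toy_average_velocity` defined on straight-line paths toward a single target point. The function names, `TARGET`, and the toy flow are illustrative assumptions, not taken from the article or from the iMF code.

```python
# Toy illustration of few-step sampling with an average-velocity field.
# Assumption: paths are straight lines z_t = (1 - t) * x0 + t * noise, so the
# average velocity over any interval [r, t] equals (z_t - x0) / t.
# A real model would be a trained network; this closed form is a stand-in.

TARGET = [2.0, -1.0]  # the single "data point" the toy flow maps noise to


def toy_average_velocity(z, r, t):
    """Average velocity over [r, t] for the straight-line toy flow."""
    return [(zi - xi) / t for zi, xi in zip(z, TARGET)]


def sample(noise, num_steps):
    """Move from t=1 (noise) to t=0 (data) using `num_steps` evaluations."""
    z = list(noise)
    # Evenly spaced time grid from 1.0 down to 0.0.
    times = [1.0 - i / num_steps for i in range(num_steps + 1)]
    for t, r in zip(times[:-1], times[1:]):
        u = toy_average_velocity(z, r, t)  # one function evaluation (1 NFE)
        z = [zi - (t - r) * ui for zi, ui in zip(z, u)]
    return z


one_step = sample([0.3, 0.7], num_steps=1)  # 1-NFE sampling
two_step = sample([0.3, 0.7], num_steps=2)  # 2-NFE sampling
```

In this toy both calls land on `TARGET` because the average velocity is exact. With a learned model the field is only approximate, so a second evaluation gives the sampler a chance to correct first-step error, which is one plausible reading of the reported FID moving from 1.72 to 1.54.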
Breaking: the MEET2026 guest lineup is updated again; audience registration is open, so sign up soon
量子位· 2025-12-03 02:38
Core Insights
- The MEET2026 Smart Future Conference will focus on cutting-edge technologies and industry developments that have drawn significant attention throughout the year [1].
- The theme "Symbiosis Without Boundaries, Intelligence to Ignite the Future" emphasizes how AI and smart technologies are penetrating industries, disciplines, and scenarios and becoming a core driving force for societal evolution [2].
Group 1: Conference Highlights
- The conference will cover this year's hot topics in the tech circle, including reinforcement learning, multimodal AI, chip computing power, AI in various industries, and AI going global [3].
- It will feature the latest collisions between academic frontiers and commercial applications, showcasing leading technological achievements across infrastructure, models, and products [4].
- The event will also include the authoritative release of the annual AI rankings and the annual AI trend report [5][116].
Group 2: Notable Speakers
- Zhang Yaqin, President of Tsinghua University's Intelligent Industry Research Institute and an academician of the Chinese Academy of Engineering, will be a key speaker [11].
- Sun Maosong, Executive Vice President of Tsinghua University's Artificial Intelligence Research Institute, has led numerous national projects [15].
- Wang Zhongyuan, Director of the Beijing Academy of Artificial Intelligence, has extensive experience in AI core technology research [19].
- Wang Ying, Vice President of Baidu Group, oversees several key business units, including Baidu Wenku and Baidu Netdisk [24].
- Han Xu, Founder and CEO of WeRide, has led the company to become a leader in autonomous driving technology [28].
Group 3: Annual AI Rankings and Trends
- The "Artificial Intelligence Annual Rankings" initiated by 量子位 (QbitAI) has become one of the most influential rankings in the AI industry, evaluating companies, products, and individuals across three dimensions [117].
- The "2025 Annual AI Top Ten Trends Report" will analyze ten AI trends that are releasing significant potential, considering factors such as technological maturity and practical application [118].
Group 4: Event Details
- The MEET2026 Smart Future Conference is scheduled for December 10, 2025, at the Beijing Jinmao Renaissance Hotel, and registration is now open [119].
- The conference aims to attract thousands of tech professionals and millions of online viewers, establishing itself as an annual barometer for the smart technology industry [122].
The cloud computing leader launched 25 new products in 10 minutes! Kimi and MiniMax get a seat at the table for the first time
量子位· 2025-12-03 02:38
Core Insights - Amazon Web Services (AWS) showcased an unprecedented number of product launches at re:Invent 2025. CEO Matt Garman challenged himself to release 25 products in 10 minutes and ultimately unveiled 40 new products in just over two hours, emphasizing practicality and the challenges of applying AI [1][7][9].
Group 1: AI Computing Power
- AWS has restructured its AI computing supply model around self-developed chips, specifically the Trainium series, which has grown into a multi-billion dollar business with over 1 million chips deployed, outperforming competitors by four times in speed [15][20].
- The latest Trainium3 UltraServers, based on 3nm technology, offer a 4.4 times increase in computing performance and a 3.9 times increase in memory bandwidth compared to the previous generation [18].
- The upcoming Trainium4 chip promises significant advancements, including a 6 times increase in FP4 computing performance and a 4 times increase in memory bandwidth, tailored to large model training needs [20][22].
- AWS introduced AI Factories, which let clients deploy AWS AI infrastructure within their own data centers, maintaining control and security while accessing top-tier AI computing power [23][24].
Group 2: Model Development and Integration
- Amazon Bedrock, AWS's flexible and customizable model platform, now includes the Chinese models Kimi and MiniMax, marking their entry into the global developer ecosystem [26][28].
- The new Amazon Nova 2 series includes models designed for different tasks: Nova 2 Lite focuses on cost-effectiveness and low latency, Nova 2 Pro excels at complex tasks, and Nova 2 Sonic optimizes real-time voice interactions [30][32].
- Nova Forge introduces the concept of Open Training Models, allowing enterprises to integrate their proprietary data with AWS's training datasets and create specialized models that retain general reasoning capabilities while understanding unique business knowledge [40][41].
Group 3: AI Agents
- AI Agents emerged as a key focus, with Garman stating that the era of AI assistants is being replaced by AI Agents, which will be widely adopted across companies [45][46].
- AWS introduced several new Agents, including Kiro Autonomous Agent for complex development tasks, AWS Security Agent for proactive security measures, and AWS DevOps Agent for continuous system monitoring and troubleshooting [50][52][56].
- AWS provides tools such as AWS Transform Custom for code migration and Policy in AgentCore for defining agent behavior, ensuring that agents operate within controlled parameters [58][61].
Group 4: Strategic Vision
- AWS's strategy emphasizes practical applications of AI technologies, focusing on building a comprehensive, secure, and scalable enterprise-level infrastructure rather than solely on technological breakthroughs [68][70].
- The company aims to address challenges related to computing costs, model understanding of proprietary knowledge, and the controllability of AI Agents through its solutions and partnerships [70].
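A Zhejiang University-affiliated embodied AI company makes another run at the Hong Kong Stock Exchange: focused on industrial scenarios, bringing in 1,000,000 yuan a day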
量子位· 2025-12-03 02:38
Core Viewpoint - The article discusses the growth and challenges of XianGong Intelligent, a company specializing in robotic control systems, as it prepares for its IPO on the Hong Kong Stock Exchange. Despite rising revenue, the company has not yet reached profitability and has accumulated significant losses over the past three years [1][2][49].
Company Overview
- XianGong Intelligent provides robotic control systems and solutions primarily for industrial applications rather than consumer-oriented robots [3][5].
- The company has developed a product matrix that includes controllers, software, robots, and accessories, aiming to simplify the development and deployment of intelligent robots [6][24].
Financial Performance
- Revenue has grown consistently: 184 million RMB in 2022, 249 million RMB in 2023, and 339 million RMB in 2024, a compound annual growth rate of 35.7% [34][32].
- Despite revenue growth, the company has reported cumulative losses of 122 million RMB over the past three years [49].
Product and Service Offerings
- The company's main products include the SRC series controllers, which serve as the "brain" of robots and enable them to operate autonomously [9][10].
- XianGong Intelligent offers over 1,000 robot models, primarily targeting industrial environments that require high precision and durability [15][19].
- The software component acts as a central command system for managing robotic fleets, enhancing operational efficiency [12][13].
Market Position
- As of 2024, XianGong Intelligent holds a 23.6% share of the global robotic controller market, ranking first in sales volume [31].
- The company has steadily expanded its customer base, serving over 1,600 integrators and end customers across more than 35 countries [28][30].
Challenges and Risks
- The company has faced continuous losses due to high R&D expenses, which amounted to 39 million RMB in 2022, 64 million RMB in 2023, and 71 million RMB in 2024 [53].
- Cash flow has been negatively affected by extended payment cycles with customers, leading to a net operating cash flow deficit [65][62].
- Reliance on external suppliers for manufacturing and components poses risks, especially if payment terms lead to disruptions in supply [68][70].
Management and Team
- The founding team includes experienced professionals with backgrounds in robotics and AI, contributing to the company's technological advancements and strategic direction [72][74][78].
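The 35.7% compound annual growth rate quoted above can be checked directly from the reported revenue figures. The short sketch below does that arithmetic; the variable and function names are illustrative, and the daily average is simply 2024 revenue divided by 365, not a figure stated in the summary.

```python
# Revenue figures (million RMB) as quoted in the summary above.
revenue = {2022: 184, 2023: 249, 2024: 339}


def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


growth = cagr(revenue[2022], revenue[2024], years=2024 - 2022)
print(f"CAGR 2022-2024: {growth:.1%}")  # prints "CAGR 2022-2024: 35.7%"

# Average daily revenue in 2024, in RMB (million RMB -> RMB, then per day).
daily_2024 = revenue[2024] * 1_000_000 / 365
print(f"Average daily revenue in 2024: {daily_2024:,.0f} RMB")
```

The daily average works out to a little over 900,000 RMB, the same order of magnitude as the headline's 1,000,000 yuan a day.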
量子位 is hiring editors and writers
量子位· 2025-12-03 02:38
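From the editorial department, Aofeisi
量子位 | WeChat official account QbitAI
The AI boom is still surging, but if you don't yet know how to take part... why not come to 量子位? We are a content platform built around tracking new progress in AI. After eight years of accumulation, we have top-tier influence, broad and well-recognized industry resources, and one of the best vantage points for observing and learning from the defining trend of this era.
We are currently hiring in three directions, and we hope you are (or can become) a content expert in one of them:
- AI industry: follows innovation at the infrastructure layer, including chips, AI Infra, and cloud computing;
- AI finance: follows venture investment and earnings reports in the AI field and tracks capital movements along the industry chain;
- AI products: follows progress in AI applications and hardware devices.
All positions are full-time and based in Zhongguancun, Beijing.
Positions are open to:
- Experienced hires: roles at every level, including editor, lead writer, and editor-in-chief, matched to ability;
- Campus hires: fresh graduates; internships are accepted and can convert to full-time positions.
By joining us, you can:
- Stand at the crest of the AI wave: be among the first to encounter and understand the latest technologies and products in AI, and build a complete understanding of the field.
- Master new AI tools: apply new AI technologies and tools to your work to improve efficiency and creativity.
- Build personal influence: by writing exclusive original con ...
Position details follow. Openings are available at every ability level for all positions, and you are welcome to apply based on your background and experience.
AI industry direction
Responsibilities: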
The DeepSeek-V3.2 technical report: overseas readers went through it more closely
量子位· 2025-12-03 00:11
Core Insights - The article discusses the launch of two open-source models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, which have gained significant attention in Silicon Valley, indicating a shift in the competitive landscape of AI models [2][6].
Group 1: Model Performance
- DeepSeek-V3.2 has achieved the highest level among current open-source models, significantly narrowing the gap with top closed-source models [6].
- The standard version of DeepSeek-V3.2 reached performance levels comparable to GPT-5, while the Speciale version surpassed GPT-5 and competed closely with Gemini-3.0-Pro in mainstream reasoning tasks [7][8].
- DeepSeek-V3.2-Speciale won gold medals in various competitions, demonstrating its advanced capabilities [9].
Group 2: Technical Innovations
- The model utilizes DSA sparse attention to address efficiency issues with long contexts, laying the groundwork for subsequent long-sequence reinforcement learning [14].
- By introducing scalable reinforcement learning and allocating over 10% of pre-training compute for post-training, the model significantly enhances general reasoning and agent capabilities [15].
- The Speciale version allows for extended reasoning chains, enabling deeper self-correction and exploration, which unlocks stronger reasoning abilities without increasing pre-training scale [16][17].
Group 3: Economic Implications
- DeepSeek-V3.2 is approximately 24 times cheaper than GPT-5 and 29 times cheaper than Gemini 3 Pro in terms of output token costs [29][30].
- The cost of using DeepSeek-V3.2 for generating extensive content is significantly lower, making it an economically attractive option compared to its competitors [31][32].
- The model's deployment on domestic computing power (e.g., Huawei, Cambricon) could further reduce inference costs, posing a challenge to established players like Google and OpenAI [36].
Group 4: Market Impact
- The success of DeepSeek-V3.2 challenges the notion that open-source models lag behind closed-source ones, indicating a potential shift in market dynamics [10][26].
- The article highlights that the gap between DeepSeek and top models is now more of an economic issue rather than a technical one, suggesting that with sufficient resources, open-source models can compete effectively [26].
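The summary mentions sparse attention as the mechanism for long-context efficiency without describing it. As a rough intuition only, the sketch below shows generic top-k sparse attention for a single query: only the `top_k` highest-scoring keys take part in the softmax. This is an assumption-laden illustration, not DeepSeek's DSA. In particular, it scores every key with a full dot product before selecting, whereas a practical design would need a cheaper selection step to save compute. All names (`attend`, `top_k`, the toy keys and values) are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values, top_k=None):
    """Single-query scaled dot-product attention.

    If top_k is given, only the top_k highest-scoring keys are attended to
    (generic top-k sparsity; not the actual DSA algorithm).
    """
    scale = 1.0 / math.sqrt(len(query))
    scores = [dot(query, k) * scale for k in keys]
    chosen = list(range(len(keys)))
    if top_k is not None and top_k < len(keys):
        # Keep the indices of the top_k largest scores, in original order.
        chosen = sorted(sorted(chosen, key=lambda i: scores[i], reverse=True)[:top_k])
    weights = softmax([scores[i] for i in chosen])
    dim = len(values[0])
    return [sum(w * values[i][d] for w, i in zip(weights, chosen)) for d in range(dim)]


keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
values = [[1.0], [2.0], [3.0], [4.0]]
query = [1.0, 0.5]

dense = attend(query, keys, values)             # attends to all 4 keys
sparse = attend(query, keys, values, top_k=2)   # attends to keys 0 and 2 only
```

With `top_k=2` the output is a weighted mix of only the two best-matching values, so the per-query softmax and value aggregation no longer grow with the full context length.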
GPT-5.5 codename "Garlic" revealed! OpenAI sounds a code red and works overtime on a new model, which could ship as early as next week
量子位· 2025-12-03 00:11
Core Insights
- OpenAI is under pressure to enhance ChatGPT in response to Google's Gemini 3 Pro, leading to an internal "code red" alert to prioritize resources for this task [2][34]
- The competition has intensified, with Google's Gemini series gaining significant traction, resulting in a decline in ChatGPT's traffic and user engagement [10][15]
Group 1: Competitive Landscape
- OpenAI's ChatGPT traffic dropped by 6% within a week following the release of Gemini 3 Pro and Nano Banana Pro [10]
- Gemini's monthly active users surged from 450 million in July to 650 million in October, indicating rapid growth and increased competition [14]
- OpenAI's market share in AI assistants remains at 70%, but growth is slowing, raising concerns about its competitive edge [15][16]
Group 2: Financial Challenges
- OpenAI is not yet profitable and faces increasing financial pressure, needing to raise approximately $100 billion to sustain operations amid high cash burn [19][21]
- Projected revenues from ChatGPT are $10 billion for this year, $20 billion next year, and $35 billion by 2027, but these figures are insufficient to cover expenses [20][21]
- The company must achieve $200 billion in revenue by 2030 to reach profitability, which poses a significant challenge given its current financial trajectory [21]
Group 3: Product Development and Strategy
- OpenAI plans to release a new reasoning model next week, with the internal codename "Garlic," which is expected to outperform Gemini 3 in evaluations [5][24]
- The development of Garlic has reportedly made significant strides in pre-training, allowing for better performance in coding and reasoning tasks [28]
- OpenAI is also working on additional models, including one named "Shallotpeat," to further enhance its product offerings and respond to competitive pressures [27][31]
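Long interview with OpenAI Chief Research Officer Mark Chen: Zuckerberg hand-delivered soup to poach our people, so we carried soup over to Meta in return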
量子位· 2025-12-03 00:11
Core Insights
- The interview with OpenAI's Chief Research Officer Mark Chen reveals the competitive landscape in AI talent acquisition, particularly between OpenAI and Meta, highlighting the lengths to which companies will go to attract top talent, including sending homemade soup [4][9][11]
- OpenAI maintains a strong focus on AI research, with a core team of approximately 500 people and around 300 ongoing projects, emphasizing the importance of pre-training and the development of next-generation models [4][20][27]
- Mark Chen expresses confidence in OpenAI's ability to compete with Google's Gemini 3, stating that internal models have already matched its performance and that further advancements are imminent [4][26][119]
Talent Acquisition and Competition
- Meta's aggressive recruitment strategy has led to a "soup war," where both companies are trying to entice talent through unconventional means [4][11]
- Despite Meta's efforts, many OpenAI employees have chosen to stay, indicating a strong belief in OpenAI's mission and future [10][14]
- The competition for talent is intense, with companies recognizing the necessity of attracting the best individuals to build effective AI labs [9][10]
Research Focus and Model Development
- OpenAI's research strategy prioritizes exploratory research over merely replicating existing benchmarks, aiming to discover new paradigms in AI [22][27]
- The company has invested heavily in pre-training, believing it still holds significant potential, contrary to claims that scaling has reached its limits [118][119]
- Mark Chen emphasizes the importance of maintaining a clear focus on core research priorities and effectively communicating these to the team [24][20]
Response to Competitors
- OpenAI aims to avoid being reactive to competitors, focusing instead on long-term research goals and breakthroughs rather than short-term updates [26][28]
- The company has already developed models that can compete with Gemini 3, showcasing its confidence in upcoming releases [34][119]
- Mark Chen highlights the significance of reasoning capabilities in language models, which OpenAI has been developing for over two years [26][116]
Company Culture and Management
- OpenAI's culture remains rooted in its original mission as a pure AI research organization, despite its growth and the introduction of product lines [27][28]
- Mark Chen's management style emphasizes collaboration and open communication, fostering a strong sense of community among researchers [101][104]
- The company has navigated internal challenges, including leadership changes, by promoting unity and a shared vision among its team [98][102]
miHoYo's Cai Haoyu launches a "game-style ChatGPT"
量子位· 2025-12-02 09:32
Core Viewpoint - The article discusses the launch of AnuNeko, an AI chat application developed by Cai Haoyu, the founder of miHoYo, highlighting its unique features and user experiences.
Group 1: Product Overview
- AnuNeko is an AI chat software that combines elements of gaming and conversation, allowing users to choose different characters for interaction [1][22].
- The application offers a high degree of personalization, with responses varying based on user input and character selection, showcasing a human-like interaction style [9][11].
Group 2: User Experience
- Initial user feedback indicates that while AnuNeko provides quick responses, its logic can be inconsistent, leaning more towards a humanistic approach rather than strict logical reasoning [4][21].
- Users have noted that the AI can mimic emotional tones, responding more aggressively to more intense user inputs [11].
Group 3: Market Context
- AnuNeko is part of a broader trend in the gaming industry where AI integration is becoming standard, with other companies like Google and miHoYo also developing AI-driven characters and interactions in their games [28][32].
- The article mentions the competitive landscape, including the introduction of AI agents in popular games like Genshin Impact, which enhances player interaction and game dynamics [35][38].
An AI marketing front-runner pushes for an IPO; founded by two Peking University and P&G alumni
量子位· 2025-12-02 09:32
Core Viewpoint - DeepMind Intelligent, a decision AI technology company, is making a second attempt to list on the Hong Kong Stock Exchange in 2023, indicating its ambition for growth and expansion in the AI marketing sector [1][5].
Company Overview
- DeepMind Intelligent applies AI technology to marketing and sales scenarios, aiming to automate business decision-making through algorithms and data intelligence [3][7].
- The company was founded in 2009 as PinYou Interactive and rebranded in 2019; its co-founders, Huang Xiaonan and Xie Peng, are Peking University alumni [4][39].
Market Position
- According to Frost & Sullivan, DeepMind Intelligent ranks first in China's marketing and sales decision AI application market, with a market share of 2.6% based on 2024 revenue [5][19].
- The company has completed its E-round financing at a valuation of RMB 1.9 billion [6][31].
Product Offerings
- AlphaDesk: a smart advertising management platform developed in 2011 for real-time programmatic bidding across media platforms [11][12].
- AlphaData: launched in 2017, this platform aggregates consumer data from over 100 sources for precise marketing execution [11][13].
- Deep Agent: a new product set to launch in February 2025 that uses open-source large language models for tasks such as generating consumer insights and automating sales dialogues [15][16].
Client Base
- DeepMind Intelligent serves approximately 530 clients, including 89 Fortune Global 500 companies, across industries such as e-commerce, fast-moving consumer goods, automotive, retail, and beauty [18].
Financial Performance
- The company remained profitable throughout the reporting period. Revenue rose from RMB 543 million in 2022 to RMB 611 million in 2023, then declined 12% to RMB 538 million in 2024 as clients in certain sectors cut budgets [20][21].
- In the first half of 2025, revenue rebounded to RMB 277 million, a year-on-year increase of 5.8% [21].
Profitability Metrics
- Gross margin decreased from 31.2% in 2023 to 27.3% in 2024 and stabilized at 27.1% in the first half of 2025 [22][23].
- Net profit was RMB 59.36 million in 2022 and RMB 60.66 million in 2023, dropping to RMB 21.52 million in 2024, but rebounding to RMB 3.64 million in the first half of 2025 [24][25].
Revenue Structure
- The smart advertising business (AlphaDesk) is the core revenue driver; its share of total revenue rose from 82.1% in 2022 to 93.3% in the first half of 2025 [26][27].
- The smart data management business (AlphaData) contributed 6.7% of revenue in the first half of 2025, reflecting a contraction due to client budget adjustments [27].
Cost Structure
- Media resource procurement accounted for 87.1% of total cost of sales in the first half of 2025, while R&D spending was maintained to preserve technological competitiveness [29][30].
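The growth figures in this summary are easy to sanity-check from the stated revenue numbers. The sketch below recomputes the 2023 to 2024 decline and backs out the first-half 2024 revenue implied by the 5.8% year-on-year growth; the implied figure is derived here for illustration and is not stated in the summary, and all names are hypothetical.

```python
def pct_change(old, new):
    """Relative change from `old` to `new` (e.g. -0.12 means a 12% decline)."""
    return (new - old) / old


# Full-year revenue in million RMB, as quoted above.
revenue_2022, revenue_2023, revenue_2024 = 543, 611, 538

growth_2023 = pct_change(revenue_2022, revenue_2023)   # growth from 2022 to 2023
decline_2024 = pct_change(revenue_2023, revenue_2024)  # negative: a decline

# First-half 2025 revenue and its reported year-on-year growth rate.
h1_2025 = 277
h1_yoy_growth = 0.058
h1_2024_implied = h1_2025 / (1 + h1_yoy_growth)  # derived, not from the source

print(f"2023 growth: {growth_2023:.1%}")
print(f"2024 change: {decline_2024:.1%}")
print(f"Implied H1 2024 revenue: {h1_2024_implied:.1f} million RMB")
```

The 2024 change comes out at roughly -11.9%, consistent with the rounded 12% decline in the summary.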