The physical AI Huawei just backed: China's first homegrown world-model company
量子位· 2025-11-12 04:08
Yu Yang, from Aofeisi. QbitAI | WeChat official account QbitAI

Huawei has made another move on world models: it has invested in a physical-AI company. 极佳视界 has just closed an A1 funding round worth over 100 million yuan, jointly invested by Huawei's Hubble Investment and Huakong Fund. It is the company's third funding round in two months. Founded in 2023, it can claim to be China's first "pure-bred" physical-AI company: it was started specifically to build world models. In two years its products have grown to cover a full hardware and software stack, from autonomous-driving world models and embodied foundation models to a world-model platform, and its application areas are exactly the directions Huawei keeps betting on: autonomous driving and embodied intelligence.

"World models are an important source of high-quality data for the future." The core team comes from Tsinghua University, the Chinese Academy of Sciences, USTC and other well-known institutions. The company first drew wide attention at the 2024 MiraclePlus (奇绩创坛) demo day, where, together with Tsinghua's Department of Automation, it released "视界一粟 YiSu", China's first video generation model natively supporting ultra-long 16-second clips. Even then it emphasized video generation models' understanding of the physical world and demonstrated their application potential in autonomous driving and embodied intelligence.

Founder and CEO Huang Guan (黄冠) holds a PhD in AI from Tsinghua's Department of Automation. Before founding the company he worked on algorithms at Microsoft, Samsung and Horizon Robotics, and he is a serial entrepreneur in AI and autonomous driving. In a public talk in 2024, Huang Guan ...
Fed up at last, LeCun quits! Meta's market value evaporates 140 billion on the news
量子位· 2025-11-11 16:01
Core Viewpoint - Yann LeCun's departure from Meta signifies a critical shift in the company's AI strategy, moving away from long-term foundational research towards a more aggressive, product-driven approach [1][5][36]. Group 1: Departure and Immediate Impact - LeCun announced his departure plans to colleagues, intending to pursue entrepreneurship [2]. - Following the news of his departure, Meta's market value dropped by 1.5%, equating to over $20 billion [4]. - The decision to leave was influenced by ongoing dissatisfaction with Meta's AI strategy and organizational changes, including significant layoffs within his team [6][10][22]. Group 2: Strategic Shifts at Meta - Meta's AI strategy has undergone multiple reorganizations, with four internal restructurings in just six months, hindering research progress [10][11]. - The appointment of a new chief scientist for the MSL lab effectively sidelined LeCun, altering his influence within the organization [13][14]. - Meta's shift towards a more aggressive AI strategy under CEO Mark Zuckerberg contrasts sharply with LeCun's long-term vision for foundational AI research [27][36]. Group 3: Philosophical and Ideological Differences - LeCun advocates for a "world model" approach, which he believes is essential for true AI understanding, while Meta is focusing on large language models (LLMs) for immediate product development [24][25]. - The ideological clash is further emphasized by the internal discussions at Meta regarding the potential closure of Llama's future versions, which LeCun opposes [26]. - LeCun's commitment to open-source principles stands in stark contrast to the direction taken by Meta's new leadership [26]. Group 4: Historical Context and Legacy - LeCun joined Meta in 2013 and established the FAIR lab, which was known for its academic freedom and focus on foundational research [31][32]. 
- His contributions to AI were recognized with the Turing Award in 2018, marking a peak in Meta's reputation in AI research [33]. - The end of LeCun's tenure at Meta represents the conclusion of a decade-long era of academic-style research within the company [37].
6666! The perfect-score NeurIPS paper is here
量子位· 2025-11-11 11:11
Core Insights - The article discusses a groundbreaking paper that challenges the prevailing belief that reinforcement learning (RL) is essential for enhancing reasoning capabilities in large language models (LLMs), suggesting instead that model distillation may be more effective [1][5][12]. Group 1: Research Findings - The paper titled "Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?" received a perfect score at NeurIPS, indicating its significant impact [5][6]. - The research team from Tsinghua University and Shanghai Jiao Tong University found that RL primarily reinforces existing reasoning paths rather than discovering new ones, which contradicts the common assumption that RL can expand a model's reasoning capabilities [10][12]. - The study utilized the pass@k metric to evaluate model performance, revealing that RL models perform better at small sampling budgets (low k) but are overtaken by their base models as the sampling budget grows (high k), indicating that the base model's reasoning abilities may be underestimated [14][20]. Group 2: Methodology - The research involved testing various models across three key application areas: mathematical reasoning, code generation, and visual reasoning, using authoritative benchmark datasets [17][19]. - The models compared included mainstream LLMs like Qwen2.5 and LLaMA-3.1, with RL models trained using algorithms such as PPO, GRPO, and Reinforce++ [18][19]. - The analysis focused on the differences in pass@k performance between RL and base models, as well as the trends in performance as sampling increased [21][22]. Group 3: Implications for the Industry - The findings suggest that the substantial investments and explorations surrounding RLVR may need to be reevaluated, as the actual benefits of RL in enhancing reasoning capabilities could be overestimated [4][12]. 
- The research highlights the potential of model distillation as a more promising approach for expanding reasoning capabilities in LLMs, which could shift industry focus and funding [10][12].
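The pass@k crossover described above can be made concrete with the standard unbiased pass@k estimator (the HumanEval formulation that this style of evaluation builds on). A minimal sketch follows; the per-problem correct counts are invented for illustration and are not the paper's data:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of k
    samples drawn without replacement from n generations is correct,
    given that c of the n generations are correct."""
    if n - c < k:
        return 1.0  # too few incorrect samples to fill a size-k draw
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical correct counts out of n=256 samples per problem: the RL
# model is sharp on problem A but never solves problem B; the base model
# solves both problems, rarely.
n = 256
base_counts = [2, 2]
rl_counts = [30, 0]

for k in (1, 256):
    base = sum(pass_at_k(n, c, k) for c in base_counts) / 2
    rl = sum(pass_at_k(n, c, k) for c in rl_counts) / 2
    print(f"pass@{k}: base={base:.4f}  rl={rl:.4f}")
```

At k=1 the RL model looks stronger, but at k=256 the base model's broader coverage wins, matching the reported pattern that RL sharpens existing reasoning paths rather than expanding the set of solvable problems.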
Google paid 19.2 billion yuan to bring him back; now it just wants him to shut up
量子位· 2025-11-11 11:11
Core Viewpoint - The controversy surrounding Noam Shazeer's statements at Google highlights the ongoing tension between talent retention and adherence to company values, particularly regarding inclusivity and free speech within the organization [4][9][19]. Group 1: Incident Overview - Noam Shazeer, a key figure in the development of the Transformer model, sparked significant internal debate at Google with his controversial remarks on gender issues [6][5]. - The internal forum discussions quickly polarized employees into two opposing camps, with many arguing that Shazeer's comments were provocative and challenged Google's established norms on inclusivity [7][9]. - Google's management intervened by deleting some of Shazeer's comments, which escalated the controversy rather than resolving it, leading to accusations of suppressing free speech [8][9]. Group 2: Noam Shazeer's Contributions - Shazeer is recognized as one of the eight authors of the Transformer paper and is credited with making the most significant contributions, including rewriting the project code to enhance its capabilities [20]. - His return to Google was seen as a strategic move, with estimates suggesting that his work on the Gemini project alone is valued at $2.5 billion [14]. - The company invested $2.7 billion to bring Shazeer back, which many consider a worthwhile investment given his pivotal role in AI advancements [28]. Group 3: Historical Context - The current situation draws parallels to the 2017 James Damore incident, where another Google employee was fired for similar issues related to gender discussions [12][19]. - Historical patterns at Google show a recurring theme of conflicts between high-profile employees and management over issues of academic freedom and corporate values, as seen in the cases of Timnit Gebru and Jeff Dean [29][31].
Yang Zhilin responds: Kimi K2 was trained on H800s! But as for "it only cost $4.6 million"…
量子位· 2025-11-11 11:11
Core Insights - The Kimi K2 Thinking model reportedly cost only $4.6 million to train, which is lower than the $5.6 million for DeepSeek V3, raising questions about the valuation of closed-source giants in Silicon Valley [13][14]. - The Kimi K2 model is causing a migration trend in Silicon Valley as it offers superior performance at a lower cost compared to existing models [5][6]. - The Kimi K2 model utilizes innovative engineering techniques, including a self-developed MuonClip optimizer, which allows for stable gradient training without human intervention [18]. Training Cost and Performance - The training cost of Kimi K2 is claimed to be $4.6 million, significantly lower than other models, prompting reflection within the industry [13][14]. - Investors and companies are migrating to Kimi K2 due to its strong performance and cost-effectiveness, with reports of it being five times faster and 50% more accurate than closed-source models [8][6]. Technical Innovations - Kimi K2 has optimized its architecture by increasing the number of experts in the MoE layer from 256 to 384 while reducing the number of active parameters during inference from approximately 37 billion to 32 billion [16]. - The model employs Quantization-Aware Training (QAT) to achieve native INT4 precision inference, which roughly doubles inference speed while cutting resource consumption [21]. Community Engagement and Future Developments - The team behind Kimi K2 engaged with the developer community through a three-hour AMA session, discussing future architectures and the potential for a next-generation K3 model [22][24]. - The team revealed that the unique writing style of Kimi K2 results from a combination of pre-training and post-training processes, and they are exploring longer context windows for future models [26][27].
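Quantization-Aware Training works by simulating low-precision rounding in the forward pass ("fake quantization") so the network learns weights that survive INT4 inference. Below is a minimal plain-Python sketch of symmetric per-tensor INT4 fake quantization; it illustrates the general mechanism only and is not Kimi K2's actual implementation:

```python
def fake_quant_int4(weights):
    """Snap each float to one of the 16 signed INT4 levels [-8, 7],
    scaled so the largest magnitude maps to level 7, then dequantize.
    In QAT this runs during training (with a straight-through gradient
    estimator), so the model adapts to the rounding error in advance."""
    scale = max(abs(w) for w in weights) / 7.0
    if scale == 0.0:
        return list(weights)

    def quantize(w):
        return min(7, max(-8, round(w / scale)))  # clamp to INT4 range

    return [quantize(w) * scale for w in weights]

weights = [0.91, -0.40, 0.07, -1.30]
print(fake_quant_int4(weights))  # every value lands on a representable level
```

Because only 16 levels exist, an inference kernel can pack two weights per byte; running natively at this precision is where the reported speed and resource gains come from.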
Code from a screenshot, a web page for 0.3 yuan! ByteDance's new AI coding model is brutally competitive
量子位· 2025-11-11 06:59
Core Viewpoint - Volcano Engine has launched a new code model, Doubao-Seed-Code, optimized for Agentic programming tasks, showcasing significant advancements in performance, pricing, and migration costs [2][4][7]. Group 1: Performance - Doubao-Seed-Code achieves state-of-the-art (SOTA) performance, integrating deeply with the TRAE development environment, and ranks at the top of the SWE-Bench Verified leaderboard with a resolution rate of 78.80% [4][63]. - The model is capable of handling multimodal software issues, including those described with images, indicating its versatility in problem-solving [5][64]. - It demonstrates strong capabilities in coding tasks, efficiently completing basic functions and complex interactions, as evidenced by its performance in various coding tests [13][20][28]. Group 2: Pricing - Volcano Engine offers the lowest API call prices in the domestic market, with a subscription plan starting at just 9.9 yuan, making it accessible for developers [6][58]. - The overall usage cost has been reduced by 62.7% compared to industry averages, with Doubao-Seed-Code costing approximately 0.34 yuan for the same token volume that costs 4.05 yuan with Claude Sonnet 4.5 [55][56]. Group 3: Migration Costs - Doubao-Seed-Code is natively compatible with the Anthropic API, allowing for seamless migration with virtually zero configuration costs, making it easy for developers to switch from other models [7][56]. Group 4: Technical Advancements - The model supports visual understanding capabilities, allowing it to generate code from UI design drafts or screenshots, a feature that sets it apart in the domestic market [43][56]. - Doubao-Seed-Code is built on a robust training library with over 100,000 container images and utilizes end-to-end reinforcement learning for efficient optimization [66][67]. 
Group 5: Market Position - Volcano Engine's Doubao-Seed-Code is positioned as a competitive player in the AI coding landscape, emphasizing performance, affordability, and user-friendly migration, which are critical in the current market [52][74].
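The per-token pricing claim above can be sanity-checked with simple arithmetic. The two cost figures below are the article's reported numbers for the same token volume; note that the article's separate 62.7% figure is measured against an industry average, not against this single-model comparison:

```python
# Reported cost (yuan) for the same token volume, per the article.
doubao_cost = 0.34   # Doubao-Seed-Code
claude_cost = 4.05   # Claude Sonnet 4.5

# Relative saving versus Claude Sonnet 4.5 alone.
saving_vs_claude = 1 - doubao_cost / claude_cost
print(f"Saving vs Claude Sonnet 4.5: {saving_vs_claude:.1%}")  # ~91.6%
```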
The iPhone Air isn't selling, and Cook tearfully cuts the production line… just one month after launch
量子位· 2025-11-11 04:24
Core Viewpoint - The iPhone Air has faced significant challenges since its launch, leading to production cuts and a lack of consumer interest, ultimately resulting in its withdrawal from the market [5][25][31] Market Response - The iPhone Air's initial sales were disappointing, with just over 50,000 activations in its first week, which is less than one-tenth of the iPhone 17 Pro Max's performance [5] - In major online sales channels like JD and Tmall, the iPhone Air's sales lagged behind both the iPhone 17 series and the older iPhone 16 models, failing to make it into the top ten of small-screen bestsellers [6] - On Amazon, the iPhone Air received a rating of only 4.4, with many users citing serious battery and performance issues compared to the Pro Max [8] Product Positioning - Apple aimed to create a "non-Pro flagship" by simplifying features, removing high-refresh screens and advanced camera capabilities, while retaining the A-series chip and main camera [12] - However, the iPhone Air's stripped-down features did not meet consumer expectations, leading to a perception of it being underwhelming and overpriced compared to the iPhone 17 Pro, which is only $100 more but offers significantly better specifications [15][16] User Experience Issues - The removal of the physical SIM card slot in favor of eSIM led to complications for users in regions where eSIM is not widely supported, causing frustration and connectivity issues [19][20] - The iPhone Air lacked unique features or configurations that could stimulate accessory development or software adaptation, making it less appealing to consumers [22][23] Competitive Landscape - The withdrawal of the iPhone Air has created a gap in the "light flagship" market, which domestic brands like Xiaomi, OPPO, and Honor have already capitalized on, offering competitive products with strong ecosystems [26][27] - Huawei has quickly responded by launching the Mate 70 Air, which is thinner and lighter at a lower price point, indicating a swift shift in market dynamics following Apple's retreat [28][29] Future Prospects - Although the iPhone Air 2 project has been removed from the main production schedule, internal development continues, with plans for improvements in weight, battery capacity, and camera performance [31][32] - The potential for a future return of the iPhone Air remains, though it may not happen soon [33]
60 days undercover at a Silicon Valley AI unicorn: no KPIs, voluntary 996, and no remote work allowed
量子位· 2025-11-11 04:24
Lu Yu, from Aofeisi. QbitAI | WeChat official account QbitAI

No logo, no job postings, no KPIs (that root of all evil). Instead, the whole team voluntarily works 996, engineers pick apart each other's code, and the most stressed person in the company is not a programmer but the cook…

I have been reborn, back in Cursor's scrappy early days. Founded less than two years ago and already valued at over $10 billion, it set off the "vibe coding" craze across the internet the moment it launched; it does not just make coding faster, it redefines what writing code means.

"A new way to build software."

- "The two batches were night and day: adoption shot from single digits to over 80%. It spread like wildfire, and the very best developers are all using Cursor."
- "The most useful AI tool I have ever paid for, without question, is Cursor. It is fast, completes intelligently at the right time and place, handles brackets well, has sensible keyboard shortcuts, and supports bring-your-own models. Everything about it is polished."
- "The best LLM apps all have an 'autonomy slider': you decide how much autonomy to give the AI. In Cursor you can use Tab auto-complete, Cmd+K for targeted edits, or hand things entirely to the fully autonomous agent mode."

Diana Hu 胡 ...
Bridging the data-quality gap! The Tsinghua-Tencent Bee project releases a 15-million-sample high-quality dataset, setting a new SOTA for fully open-source MLLMs
量子位· 2025-11-11 04:24
Core Insights - The article discusses the launch of the Bee project by Tsinghua University and Tencent's Hunyuan team, aimed at bridging the performance gap between fully open-source multimodal large language models (MLLMs) and their closed or semi-open counterparts [2][5][26]. Group 1: Background and Motivation - The current MLLM landscape exhibits a three-tier structure: (1) top-tier closed-source models (e.g., Gemini 2.5, GPT-5), (2) semi-open models with private data (e.g., Qwen2.5-VL), and (3) significantly underperforming fully open-source models [5]. - The core bottleneck is identified as the "data quality gap" rather than model architecture [2][10]. Group 2: Key Contributions of the Bee Project - **Honey-Data-15M**: A high-quality SFT dataset comprising 15 million samples, enhanced through a dual-layer Chain of Thought (CoT) approach [6][16]. - **HoneyPipe & DataStudio**: An open-source, end-to-end data enhancement pipeline that provides a transparent and reproducible methodology for data cleaning and CoT augmentation [6][12]. - **Bee-8B**: A new 8 billion parameter model trained on Honey-Data-15M, achieving state-of-the-art (SOTA) results in various benchmarks, rivaling or surpassing mainstream semi-open models [6][21][26]. Group 3: Data Quality Issues - Existing open-source datasets suffer from two main issues: pervasive noise (e.g., factual inaccuracies, mismatched images) and a lack of complex reasoning data [11][14]. - The Bee project emphasizes that the most viable path for the open-source community is to focus on "data quality" rather than merely increasing "data quantity" [11][26]. Group 4: HoneyPipe Process - The HoneyPipe process involves a meticulous "filter-enhance-validate" workflow that produces high-quality datasets [15][18]. - The process includes three stages: noise and irrelevance filtering, short CoT enhancement and validation, and long CoT enhancement for complex queries [18]. 
Group 5: Performance of Bee-8B - Bee-8B demonstrates superior performance across various benchmarks, including MathVerse and LogicVista, where it achieved scores of 67.0 and 61.3, respectively, outperforming semi-open models [28]. - In general VQA tasks, Bee-8B achieved excellent SOTA scores in multiple benchmarks, including MMStar and CountBench [28]. Group 6: Conclusion - The Bee project effectively addresses the core data quality issues hindering the development of fully open-source MLLMs, advocating for a methodology that prioritizes data quality over sheer volume [26].
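The three-stage filter-enhance-validate workflow described for HoneyPipe can be sketched as a staged pipeline. Every name below (the stage functions, the sample fields, and the toy difficulty gate) is hypothetical scaffolding illustrating the stages as described, not the project's actual code:

```python
def noise_filter(sample):
    """Stage 1: drop samples with obvious noise, e.g. empty fields.
    (The real pipeline also checks factual and image-text consistency.)"""
    return bool(sample["question"]) and bool(sample["answer"])

def enhance_short_cot(sample):
    """Stage 2: attach and validate a short chain of thought. A real
    pipeline would call an LLM here; we only tag the sample."""
    return dict(sample, cot="short")

def needs_long_cot(sample):
    """Stage 3 gate: route complex queries to long-CoT enhancement."""
    return sample.get("difficulty") == "hard"

def honeypipe(samples):
    out = []
    for s in samples:
        if not noise_filter(s):      # stage 1: filter
            continue
        s = enhance_short_cot(s)     # stage 2: enhance + validate
        if needs_long_cot(s):        # stage 3: long CoT for complex queries
            s = dict(s, cot="long")
        out.append(s)
    return out

raw = [
    {"question": "2+2?", "answer": "4"},
    {"question": "", "answer": "noise"},  # dropped in stage 1
    {"question": "prove it", "answer": "...", "difficulty": "hard"},
]
clean = honeypipe(raw)
print([s["cot"] for s in clean])  # ['short', 'long']
```

The design point is that each stage only ever narrows or annotates the sample stream, which is what makes the methodology reproducible end to end.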
From "giving answers" to "teaching how to think": AI is teaching this generation of elementary students to think for themselves
量子位· 2025-11-11 04:24
Core Viewpoint - The article discusses the evolution of AI in education, highlighting the transition from traditional tutoring methods to advanced AI-driven personalized learning experiences, exemplified by the "Xueersi Learning Machine T4" and its "Xiao Si AI 1-on-1" feature, which aims to enhance student engagement and understanding through interactive and adaptive teaching methods [2][38]. Group 1: Current AI Education Landscape - Various AI education products are emerging, including ChatGPT's learning mode and Google's "Learn Your Way" tool, indicating a growing trend in AI integration within education [2][4]. - Many existing AI education tools focus on efficiency, providing quick answers without addressing deeper understanding, leading to a cycle of rote learning and superficial engagement [2][10]. Group 2: Features of Xiao Si AI 1-on-1 - The "Xiao Si AI 1-on-1" feature represents a significant advancement, functioning as an interactive AI tutor that guides students through problem-solving rather than simply providing answers [4][10]. - It utilizes multimodal perception capabilities to understand both written and verbal inputs, creating a more immersive learning experience [5][10]. - The AI encourages students to write out problem-solving steps, providing real-time feedback and corrections, which fosters critical thinking and deeper comprehension [11][14]. Group 3: Personalized Learning Approach - Xiao Si adapts its teaching strategies based on individual student performance, adjusting the pace and methods to ensure effective learning [21][22]. - It generates dynamic learning profiles for each student, allowing for tailored educational experiences that move away from a one-size-fits-all approach [22][27]. Group 4: Technological Infrastructure - The integration of hardware and software is crucial for achieving low-latency, multimodal interactions, which are essential for creating a native AI teaching experience [30][31]. 
- The "Nine Chapters Model" (MathGPT) is employed for comprehensive subject tutoring, having received high-level certifications for its capabilities [34][36]. Group 5: Future of AI in Education - The industry is moving towards a model where AI can serve as a complete educational companion, potentially replacing traditional tutoring roles [39][42]. - The article outlines a framework for evaluating AI teachers, suggesting that current AI capabilities are approaching the L3 stage, indicating significant progress in personalized and interactive learning [41][44].