Artificial Intelligence
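How Will RL Improve the Generalization of Embodied VLA Large Models? A Tsinghua University Team's NeurIPS 2025 Paper Analyzes the Generalization Gap Between RL and SFT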
机器之心· 2025-10-12 02:41
Core Insights
- The article discusses the potential of Vision-Language-Action (VLA) large models in embodied intelligence, highlighting how poorly current supervised fine-tuning (SFT) methods generalize to new environments and tasks. It emphasizes the advantages of Reinforcement Learning (RL) in enhancing the generalization capabilities of VLA models [2][4].
Group 1: Research Findings
- A new evaluation benchmark was created to address the limited generalization of VLA models, comparing how well RL and SFT improve model robustness across visual, semantic, and execution challenges [4].
- Experiments revealed that RL algorithms such as Proximal Policy Optimization (PPO) significantly improved the model's robustness in semantic understanding and task execution, while maintaining performance comparable to SFT in visually varied scenarios [4][11].
Group 2: RL Methodology
- The research team tested three RL algorithms: PPO, Direct Preference Optimization (DPO), and Group Relative Policy Optimization (GRPO). The results showed that PPO outperformed DPO and GRPO in multi-step decision tasks due to the partially observable Markov decision process (POMDP) characteristics of robotic tasks [9][11].
- To make PPO training on VLA models more efficient, three key innovations were introduced: a shared Actor-Critic architecture that reduces memory usage by 45% and increases training speed by 35%; a warm-up strategy that uses 140 high-quality trajectories to improve convergence speed by 50%; and cutting the number of PPO training epochs to just one, which significantly reduced training time [13][15].
Group 3: Comparison of SFT and RL
- The research explored the data scale limits of SFT, finding that performance saturated at around 16,000 demonstration trajectories. In contrast, RL achieved a 42.6% performance improvement on out-of-distribution tasks, indicating superior generalization capabilities [18][19].
- A comprehensive evaluation benchmark was constructed to dissect the generalization differences between SFT and RL across visual, semantic, and execution dimensions, with RL showing clear advantages in semantic understanding and execution robustness [21][23].
Group 4: Practical Implications
- The study underscores the core value of RL in developing truly generalizable embodied agents, which is increasingly important as robotic applications become more complex and variable. The team has open-sourced RLinf, a large-scale RL framework for embodied intelligence, to facilitate further research [25].
- Visual analysis of specific cases revealed deeper differences, such as RL's ability to maintain task stability under noise and to handle unseen objects effectively, in contrast with SFT's tendency to get stuck in repetitive actions [26].
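For readers unfamiliar with the algorithms named under RL Methodology, the following is a minimal sketch of the textbook PPO clipped surrogate loss and of GRPO-style group-relative advantages. It is not the Tsinghua team's implementation: the function names, the clipping value of 0.2, and the toy inputs are all assumptions chosen for illustration. The contrast it shows is that PPO takes per-step advantages (typically from a learned critic), whereas GRPO derives advantages by normalizing rewards within a group of rollouts.

```python
import math


def ppo_clipped_loss(new_logps, old_logps, advantages, clip_eps=0.2):
    """Textbook PPO clipped surrogate loss, averaged over samples.

    new_logps / old_logps: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_eps=0.2 is an assumed default.
    Returns a value to be minimized (the negated surrogate objective).
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        ratio = math.exp(new_lp - old_lp)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize rewards within one group of rollouts."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical toy batch: the policy doubled the probability of one action.
    print(ppo_clipped_loss([math.log(2.0)], [0.0], [1.0]))
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

With a positive advantage, the clip caps how much credit a large probability ratio can earn, which is what keeps each PPO update conservative.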
He Once Turned Down $1.5 Billion: Super-Genius Andrew Tulloch Returns to Meta, and Thinking Machines Lab Loses a Co-Founder
机器之心· 2025-10-12 02:41
Core Viewpoint
- Meta's aggressive recruitment strategy, particularly the high-profile attempt to lure Andrew Tulloch back, highlights the company's ongoing efforts to strengthen its AI capabilities despite previous rejections of lucrative offers [1][11].
Group 1: Recruitment and Offers
- Mark Zuckerberg's recruitment efforts have included a dramatic offer exceeding $1 billion to Andrew Tulloch, which was initially declined [2][11].
- Tulloch, a prominent figure in AI, has a strong academic background and extensive experience at Meta and OpenAI, making him a valuable asset for any tech company [7][8].
- Despite rejecting the initial offer, Tulloch ultimately decided to join Meta, indicating a shift in his career path [5][12].
Group 2: Background of Andrew Tulloch
- Andrew Tulloch graduated with top honors in mathematics from the University of Sydney and later earned a master's degree from Cambridge University [7].
- He has over 11 years of experience at Meta, contributing significantly to the development of machine learning systems and advertising platforms [7].
- After leaving Meta, Tulloch played a key role at OpenAI, working on the development of advanced AI models like GPT-4 and GPT-4.5 [9][11].
Group 3: Implications for Meta
- Tulloch's return to Meta comes at a time of internal management changes, raising questions about the potential impact of his expertise on the company's AI initiatives [12].
Hands-On With the "Tsinghua Special Scholarship Sora": One Image Plus One Prompt Generates a Video, and Its Lip-Sync Reigns Supreme
量子位· 2025-10-12 02:05
Core Insights
- The article discusses the launch of GAGA-1, a video generation model developed by Sand.ai, which focuses on audio-visual synchronization and performance [1][24][30].
- GAGA-1 allows users to create videos by simply uploading an image and providing a prompt, making the process user-friendly and accessible [4][7][8].
Group 1: Model Features
- GAGA-1 excels in generating videos where characters can "speak" and perform, showcasing a strong capability in lip-syncing and expression [23][30].
- The platform does not require an invitation code, allowing users to access it freely [4].
- Users can generate images within the platform, streamlining the process from image to video [7][8].
Group 2: Performance Evaluation
- Initial tests show that GAGA-1 can produce high-quality video outputs with natural expressions and synchronized lip movements [11][12].
- However, some minor bugs were noted, such as stiffness in character expressions and slight misalignment in audio [13][23].
- The model performs well in simple scenarios but struggles with complex scenes involving multiple characters and actions [23][30].
Group 3: Team Background
- Sand.ai, the team behind GAGA-1, previously developed the Magi-1 model, known for its high-quality video generation [25][29].
- The founder, Cao Yue, has a strong academic background, including a PhD from Tsinghua University and recognition for his contributions to AI research [26][29].
Group 4: Market Position
- GAGA-1 differentiates itself by focusing on audio-visual synchronization rather than attempting to be an all-encompassing model [29][30].
- The model's strength in dialogue and performance positions it as a leading player in the AI-generated video market [30][31].
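Sample Footage From Google's Next-Generation Text-to-Video Model Leaks: 8-Second 720p Videos With a Built-In Soundtrack; Figure AI Releases Figure 03, Which Can Serve Tea and Water, and More | AIGC Daily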
创业邦· 2025-10-12 01:08
Group 1
- OpenAI and Sur Energy have signed a memorandum of understanding to initiate the "Stargate Argentina" project, which includes the construction of a large data center with a capacity of up to 500 megawatts. With an investment of up to $25 billion, it is one of the largest tech and energy infrastructure projects in Argentina's history [2].
- Figure AI has launched its third-generation humanoid robot, Figure 03, which showcases multi-tasking capabilities across home and commercial settings. It offers improved speed and torque density and is designed for high-volume manufacturing to reduce costs [2].
- Google's new video generation model, Veo 3.1, has entered testing. It lets users create 8-second 720p videos with synchronized audio, and its background music is more emotionally expressive than in previous versions [2].
- Solidigm has established an AI central laboratory in Rancho Cordova, California, equipped with high-performance, high-density storage testing clusters, to study real AI workloads and enhance cluster efficiency [3].
Thinking Machines Lab Co-Founder Departs for Meta
WSJ· 2025-10-11 18:12
Core Insights
- Andrew Tulloch, a prominent AI researcher, has joined a major social media company, indicating a strategic move to enhance its AI capabilities [1].
Company Developments
- The recruitment of Andrew Tulloch highlights the company's commitment to advancing its artificial intelligence initiatives [1].
Industry Trends
- The trend of attracting top AI talent is becoming increasingly common among leading tech firms, reflecting the growing importance of AI in the social media landscape [1].
'We're kind of in a fragile state now': Why the AI bubble might be about to burst — how to protect yourself
Yahoo Finance· 2025-10-11 15:57
Core Insights
- Concerns are rising about a potential AI bubble, drawing parallels to the dot-com bubble of 2000, with significant market reactions observed recently [1][4][5].
- Despite fears, investment in AI continues, highlighted by OpenAI's multibillion-dollar deal with AMD, which significantly boosted AMD's stock [1][3].
- Reports indicate a troubling trend in the AI sector, including high failure rates of generative AI projects and restructuring efforts at major companies like Meta [2][3].
Investment Trends
- OpenAI's recent release of GPT-5 received negative feedback, prompting the reinstatement of the previous model for paying customers and indicating challenges in maintaining user satisfaction [3].
- The Nasdaq index fell over 3%, and the S&P 500 lost $1 trillion in value amid market fears, reflecting the volatility in tech stocks [1][3].
- AI stocks such as C3.ai experienced significant declines, with a drop of 28.2% in August, showcasing the market's sensitivity to negative news [1][2].
Market Sentiment
- Industry leaders, including OpenAI's CEO Sam Altman, acknowledge the existence of a bubble in AI investments, suggesting that investor enthusiasm may be excessive [2][8].
- Analysts say that while the AI theme remains strong, there is a saturation point, and the market is vulnerable to rapid shifts in sentiment [7][8].
- The current state of the market is described as fragile, with volatility expected as investors navigate the uncertain landscape of AI [10][9].
What could burst the AI bubble?
TechXplore· 2025-10-11 15:10
Core Insights
- Major tech firms have significantly increased in value over the past year, driven by advancements in AI, with expectations of transformative impacts across various sectors [1][2].
- OpenAI's valuation has surged to US$500 billion from US$157 billion in the previous year, while Anthropic's valuation has nearly tripled, raising concerns from the Bank of England about a potential market correction [2][3].
- The sustainability of these valuations is questioned, with discussions around whether they are based on realistic future profitability or merely speculative optimism [3][7].
Valuation and Profitability
- OpenAI, despite its high valuation, has not yet achieved profitability and may need to increase revenue tenfold to do so [11][12].
- The US$500 billion valuation is particularly striking given OpenAI's reported loss of US$7.8 billion in the first half of the year, with some value attributed to a deal with Nvidia involving mutual investments [12][13].
- AI firms, in general, are currently not profitable, with investments being made based on anticipated future gains rather than current financial performance [13][15].
Market Dynamics and Risks
- The rapid rise in valuations is seen as an early sign of a potential bubble, with the possibility of a correction if investor confidence wanes [6][10].
- Historical parallels are drawn to the dotcom bubble, suggesting that minor events can trigger a reevaluation of investments, leading to a sell-off [10].
- The major tech companies are investing heavily in AI infrastructure, which could lead to a bubble burst if the anticipated future does not materialize [15].
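Beijing State Capital Steps In: Mianbi Intelligent (面壁智能) Completes a New Financing Round of Several Hundred Million Yuan to Step Up On-Device Large Model R&D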
Sou Hu Cai Jing· 2025-10-11 14:16
Core Insights
- The article highlights that Mianbi Intelligent has completed a new round of financing amounting to several hundred million yuan, aimed at enhancing the development of large models and accelerating commercialization efforts [1][4].
Company Overview
- Mianbi Intelligent was established in 2022, focusing on innovation and application transformation in large model technology [4].
- The founding team comes from the Natural Language Processing Laboratory at Tsinghua University [4].
- Li Dahai, previously with Zhihu, became the CEO of Mianbi Intelligent in 2023, while Liu Zhiyuan, a co-founder and chief scientist, is an associate professor in the Department of Computer Science and Technology at Tsinghua University [4].
Financing Details
- The recent financing round was led by Beijing's state-owned investment platform "Jingguorui" and the market-oriented venture capital fund "Mijuhua" [1].
- The funds raised will primarily be used to increase research and development on on-device large models and to promote the commercialization process [1].
2025 State of AI Report: AI's Physical Boundaries, as Compute, Energy, and Geopolitics Reshape the Global Intelligence Race
欧米伽未来研究所2025· 2025-10-11 13:47
Core Insights
- The narrative of artificial intelligence (AI) development is undergoing a fundamental shift, moving from algorithm breakthroughs to being constrained by physical world limitations, including energy supply and geopolitical factors [2][10][12].
- The competition in AI is increasingly focused on reasoning capabilities, with a shift from simple language generation to complex problem-solving through multi-step logic [3][4].
- The AI landscape is expanding with three main camps: closed-source models led by OpenAI, Google, and Anthropic, and emerging open-source models from China, particularly DeepSeek [4][9].
Group 1: Reasoning Competition and Economic Dynamics
- The core of the AI research battlefield has shifted to reasoning, with models like OpenAI's o1 demonstrating advanced problem-solving abilities through a "Chain of Thought" approach [3].
- Leading AI labs are competing not only for higher intelligence levels but also for lower costs, with the Intelligence to Price Ratio doubling every 3 to 6 months for flagship models from Google and OpenAI [5].
- Despite high training costs for "super intelligence," inference costs are rapidly decreasing, leading to a "Cambrian explosion" of AI applications across various industries [5].
Group 2: Geopolitical Context and Open Source Movement
- The geopolitical landscape, particularly the competition between the US and China, shapes the AI race, with the US adopting an "America First" strategy to maintain its leadership in global AI [7][8].
- China's AI community is rapidly developing an open-source ecosystem, with models like Qwen gaining significant traction, surpassing US models in download rates [8][9].
- By September 2025, Chinese models are projected to account for 63% of global regional model adoption, while US models will only represent 31% [8].
Group 3: Physical World Constraints and Energy Challenges
- The pursuit of "super intelligence" is leading to unprecedented infrastructure investments, with AI leaders planning trillions of dollars in capital for energy and computational needs [10][11].
- Energy supply is becoming a critical bottleneck for AI development, with predictions of a significant increase in power outages in the US due to rising AI demands [10].
- AI companies are increasingly collaborating with the energy sector to address these challenges, although short-term needs may lead to a delay in transitioning away from fossil fuels [11].
Group 4: Future Outlook and Challenges
- The report highlights that AI's exponential growth is constrained by linear limitations from the physical world, including capital, energy, and geopolitical tensions [12].
- The future AI competition will not only focus on algorithms but will also encompass power, energy, capital, and global influence [12].
- Balancing speed with safety, openness with control, and virtual intelligence with physical reality will be critical challenges for all participants in the AI landscape [12].
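To put the report's "doubling every 3 to 6 months" figure for the Intelligence to Price Ratio in perspective, here is a small compounding calculation. The function name `growth_factor` and the 12-month horizon are assumptions made for illustration; the report itself is only cited for the doubling period.

```python
def growth_factor(months, doubling_period_months):
    """Multiplicative growth after `months` if a quantity doubles
    every `doubling_period_months` months."""
    return 2.0 ** (months / doubling_period_months)


if __name__ == "__main__":
    # Bounds implied by a 3-to-6-month doubling period over one year.
    print(growth_factor(12, 3))  # 16.0
    print(growth_factor(12, 6))  # 4.0
```

Under these assumptions, the ratio would improve somewhere between 4x and 16x over a single year, which is why a modest-sounding doubling period translates into steep year-over-year cost declines.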
The Full AI Industry Chain Moves as a Group: 17 Leading Optics Valley Companies Take the Lead
Chang Jiang Ri Bao· 2025-10-11 13:41
Core Insights
- The "Optics Valley in Action" initiative has been launched, accelerating the development of the AI industry in Optics Valley [1].
- A total of 17 core enterprises from the industry chain, including Fenghuo Communication and Dameng Data, have come together to respond to national strategies [1].
Group 1: Government Initiatives
- Hubei Province and Wuhan City have introduced a series of action plans and policies this year to seize new industrial heights [4].
- The East Lake High-tech Zone has established a dedicated AI+ task force to implement strong measures for industrial advancement [4].
Group 2: Industry Collaboration
- The "Optics Valley in Action" initiative includes a joint proposal that emphasizes six action directions: enhancing innovation, promoting deep integration, building an open ecosystem, focusing on public welfare, improving governance capabilities, and deepening open cooperation [9].
- Companies such as Changjiang Computing, Zhidong Taichu, and Dameng Data have signed the "AI+ Innovation Cooperation Plan" aimed at creating a domestically controlled AI industry ecosystem [13].
Group 3: Industry Landscape
- Optics Valley has gathered over 700 AI enterprises, forming a comprehensive industrial layout in hardware fields such as storage chips, high-speed optical modules, and computing servers [13].
- Leading companies in the development of large models, such as Zhidong Taichu and Kingsoft Office, are at the forefront of their respective industries nationwide [13].
- Six humanoid robot manufacturers are innovating in cutting-edge fields like intelligent manufacturing and smart healthcare, also ranking among the top in the country [13].
Group 4: Infrastructure Development
- Fenghuo Communication showcased its robust AI infrastructure capabilities, offering solutions for efficient computing power interconnectivity and presenting the "Tongzhi Chao" integrated computing power base developed by Changjiang Computing [13].