Reinforcement Learning
Meta Loses Two Key Researchers in One Day: Is Zuckerberg's Money Power Failing?
机器之心· 2025-08-26 08:53
Core Viewpoint - Meta is losing significant talent, particularly top AI researchers, because of internal management problems and a mismatch between those researchers and the company's vision and culture [1][9][39].
Group 1: Talent Departure
- Two senior researchers, Rishabh Agarwal and Bert Maher, recently announced their departure from Meta. Agarwal has not disclosed his next move; Maher is joining Anthropic [3][24].
- Agarwal's exit supports the argument that high pay alone cannot retain top talent; in his farewell he cited Zuckerberg's own advice about taking risks in a rapidly changing world [14][39].
- Maher worked at Meta for 12 years and contributed to major projects such as PyTorch and HHVM, so his departure means a loss of valuable expertise [25][27].
Group 2: Internal Management Issues
- Meta's internal management culture is cited as a reason for its low employee retention rate of 64%, compared with Anthropic's 80% [30][33].
- Earlier complaints from former employees, including John Carmack and Tijmen Blankevoort, point to poor resource utilization, performance-evaluation pressure, and internal competition [33][34].
- The lack of a strong CTO to balance the CEO's power is seen as a potential risk to the company's future stability [11].
Group 3: Cultural Misalignment
- Many top researchers are leaving Meta because its focus on speed and profitability conflicts with their values of safety, independence, and long-term research [39][40].
- The absence of a compelling mission makes it hard for some people to justify joining or staying, as illustrated by Tesla engineer Yun-Ta Tsai's decision to remain with his current employer because of its meaningful goals [40][42].
- The perception that Meta's culture puts financial gain ahead of meaningful work is making potential recruits reluctant to join [39][42].
Meta's Reinforcement Learning Star with 10,000+ Citations Departs, Quoting Zuckerberg's Own Words as a Stinging Farewell
量子位· 2025-08-26 04:36
Core Viewpoint - Rishabh Agarwal's departure from Meta points to a possible trend of attrition at the company and raises concerns about internal conflict and employee satisfaction in the middle of a hiring spree [1][22][24].
Group 1: Rishabh Agarwal's Departure
- Rishabh Agarwal, a prominent reinforcement learning researcher at Meta, is leaving the company, closing a 7.5-year run that spans Google Brain, DeepMind, and Meta, and says he wants to explore a completely different path [1][17].
- His contributions include significant work on models such as Gemini 1.5 and Gemma 2, and he received an Outstanding Paper Award at NeurIPS 2021 for research on statistical instability in deep reinforcement learning [4][14][13].
- His next step remains uncertain, but there is speculation that he may start a company [17].
Group 2: Employee Turnover at Meta
- Agarwal's exit is part of a broader trend: another employee, with 12 years at Meta, also announced a departure and is joining a competitor, Anthropic [18][19].
- Reports indicate that tension between new and existing employees over pay disparities has caused dissatisfaction, prompting some researchers to threaten to resign [23][24].
- Meta's current hiring surge may be aggravating these internal conflicts and contributing to the departure of experienced employees [22][24].
New Agent Operates Phones and Computers Automatically, Sweeping Open-Source SOTA on 10 Benchmarks | Tongyi Lab
量子位· 2025-08-25 23:05
Core Viewpoint - The article covers Tongyi Lab's launch of the Mobile-Agent-v3 framework, which achieves state-of-the-art (SOTA) performance in automating tasks on mobile and desktop platforms and carries out complex tasks through a multi-agent system [2][9].
Group 1: Framework and Capabilities
- The Mobile-Agent-v3 framework can execute complex tasks independently from a single command and switch roles seamlessly within a multi-agent framework [3][9].
- It achieves SOTA performance on ten major GUI benchmarks, demonstrating both foundational capabilities and reasoning generalization [9][11].
Group 2: Data Production and Model Training
- The framework relies on cloud infrastructure built on Alibaba Cloud, which enables large-scale parallel task execution and data collection [11][13].
- A self-evolving data production chain automates data collection and model optimization, creating a feedback loop for continuous improvement [13][15].
- The model is trained on high-quality trajectory data generated by combining historical task data with large-scale pre-trained language models [22][23].
Group 3: Task Execution and Understanding
- The framework emphasizes precise localization of interface elements, which lets the AI understand the graphical interface effectively [18][19].
- It incorporates complex task planning, so the AI can strategize before executing and can better handle long-horizon and cross-application tasks [21][22].
- The model understands the causal relationship between actions and interface changes, which is crucial for effective task execution [24][25].
Group 4: Reinforcement Learning and Performance
- The Mobile-Agent team uses reinforcement learning (RL) to improve the model's decision-making through real-time interaction [28][29].
- An innovative TRPO algorithm addresses the sparse and delayed reward signals of GUI tasks and significantly improves learning efficiency [31][36].
- The framework shows a performance gain of nearly 8 percentage points in dynamic environments, indicating its potential for self-evolution [36][40].
Group 5: Multi-Agent Collaboration
- The Mobile-Agent-v3 framework supports multi-agent collaboration, with different agents handling task execution, planning, reflection, and memory [33][34].
- This collaboration forms a closed-loop enhancement pipeline that improves the overall efficiency and effectiveness of task execution [34][35].
- The design lets the AI act with purpose, adjust based on feedback, and retain critical information for later tasks [35][36].
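The sparse, delayed reward problem mentioned under Group 4 can be illustrated with a small sketch. The summary does not describe how the team's algorithm works, so what follows is only one generic way to handle an end-of-task reward, not the Mobile-Agent-v3 method: score each trajectory once, normalize the scores across a group of rollouts, and copy the resulting advantage to every step. The function name `trajectory_advantages` and the sample numbers are hypothetical.

```python
from statistics import mean, pstdev


def trajectory_advantages(trajectory_rewards, steps_per_trajectory, eps=1e-8):
    """Turn one end-of-task reward per trajectory into per-step advantages.

    Assumption (not from the article): rewards are normalized across the
    group of rollouts, and each step inherits its trajectory's advantage.
    """
    mu = mean(trajectory_rewards)
    sigma = pstdev(trajectory_rewards)  # population std over the group
    advantages = []
    for reward, n_steps in zip(trajectory_rewards, steps_per_trajectory):
        adv = (reward - mu) / (sigma + eps)
        # Every step in the trajectory receives the same learning signal.
        advantages.append([adv] * n_steps)
    return advantages


# Hypothetical group of four GUI rollouts: 1.0 = task completed, 0.0 = failed.
rewards = [1.0, 0.0, 0.0, 1.0]
steps = [3, 5, 2, 4]
per_step = trajectory_advantages(rewards, steps)
```

With this scheme, successful rollouts push up the probability of all their steps and failed rollouts push theirs down, so no per-step reward model is needed. The trade-off is coarse credit assignment: a good step inside a failed trajectory is still penalized.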
1-on-1 Paper Mentoring for VLA, Reinforcement Learning, and VLN
具身智能之心· 2025-08-25 06:00
Group 1
- The article announces 1-on-1 paper mentoring in embodied intelligence, focused on three areas: VLA, reinforcement learning, and sim2real [1].
- The mentoring mainly targets submissions to major conferences such as CVPR, ICCV, ECCV, ICLR, CoRL, ICML, and ICRA [1].
- The instructors are active in embodied intelligence research and bring innovative ideas [1].
Group 2
- Anyone interested is encouraged to add a specific WeChat contact or scan a QR code to ask about the mentoring [2].
What Are the Entry Points for Moving from Autonomous Driving to Embodied Intelligence?
自动驾驶之心· 2025-08-24 23:32
Core Viewpoint - The article discusses the transition from autonomous driving to embodied intelligence and highlights how the two fields resemble and differ from each other in algorithms and tasks [1].
Group 1: Algorithm and Task Comparison
- Embodied intelligence largely carries over the algorithms used in robotics and autonomous driving, such as training and fine-tuning methods and large models [1].
- The specific tasks differ noticeably, including the data collection methods and the emphasis on execution hardware and structure [1].
Group 2: Community and Learning Resources
- A full-stack learning community named "Embodied Intelligence Heart" (具身智能之心) has been set up to share knowledge about algorithms, data collection, and hardware solutions in embodied intelligence [1].
- Its key topics include VLA, VLN, Diffusion Policy, reinforcement learning, robotic arm grasping, pose estimation, robot simulation, multimodal large models, chip deployment, sim2real, and robot hardware structure [1].
Major Release: New Zhejiang University Survey Decodes 40+ Years of Legged Robot Evolution and Future Challenges
机器人大讲堂· 2025-08-24 13:15
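Recently, a research team from the State Key Laboratory of Fluid Power and Mechatronic Systems at Zhejiang University published a systematic survey in the international journal Cyborg and Bionic Systems. It reviews the evolution and future challenges of single-legged robots across the core areas of structural design, modeling methods, and control strategies.
The paper, titled "Bridging the Gap to Bionic Motion: Challenges in Legged Robot Limb Units Design, Modeling, and Control", was written by a team led by an academician of the Chinese Academy of Engineering. It systematically examines the key paths to achieving "bionic motion" and offers new ideas for the fundamental problem of making robots walk as flexibly as living creatures.
The distinctive value of the study is that it not only traces more than forty years of evolution from simple telescoping structures to complex joint systems but, more importantly, reveals the scientific significance of the single-legged robot as the "basic unit" of multi-legged robots: by focusing on the essence of legged locomotion while reducing system complexity, this line of work laid the theoretical foundation for the success of commercial quadruped robots such as Boston Dynamics' Spot and DEEP Robotics' Jueying (云深处绝影).
Article link: https://spj.science.org/doi/10.34133/cbsystems.0365
▍ Why start from ...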
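After a Year and a Half Training Agents at OpenAI, He Returned to China and Built the First Open-Source Agent Training Framework. Yet This 30-Year-Old Tsinghua Prodigy Says a Startup Is Not a Technical Matter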
AI前线· 2025-08-23 05:32
Core Viewpoint - The article traces the career and achievements of Wu Yi, a prominent figure in AI and reinforcement learning, and emphasizes the distinctive position of his startup, BianSai Technology, which focuses on the AReaL framework for training large models [2][4][8].
Group 1: Career and Achievements
- Wu Yi has a distinguished background as an ACM World Medalist and a coach for the IOI team, with significant experience at Facebook, ByteDance, and OpenAI [2][4].
- His startup, BianSai Technology, was acquired by Ant Group in 2024. The team has developed an asynchronous reinforcement learning framework called AReaL, which has reached 2.4k stars on GitHub [2][4][8].
Group 2: Insights from OpenAI Experience
- Wu Yi's decision to join OpenAI was somewhat serendipitous: he initially aimed for Google Brain but found OpenAI more accommodating because of its non-profit structure [4][5].
- He stresses evidence-driven decision-making in AI development and advocates a flexible approach that allows rapid adjustment to new findings [5][13].
Group 3: Reinforcement Learning and Competitions
- Wu Yi discusses why AI models perform differently in competitions such as IOI and CCPC, attributing failures to how well prepared the models were rather than to inherent limits of AI [6][7].
- He compares AI in competitive programming to sports, where psychological factors and skill both play a significant role [6][7].
Group 4: AReaL Framework and Market Position
- AReaL is positioned as a framework for training agent models, and Wu Yi asserts that it currently has no direct competitors [2][33][36].
- The framework aims to make agent model training faster and more effective, with a focus on ease of use and performance [36][37].
Group 5: Future Directions and Challenges
- Wu Yi expects multi-agent systems to become increasingly important as agent workflows grow more complex, creating new opportunities for algorithm development [41][42].
- He is confident that agents will become a mainstream form of AI interaction and will take on more autonomous and proactive roles [42].
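The summary calls AReaL an asynchronous reinforcement learning framework without describing its internals. As a rough illustration of what "asynchronous" usually means in this setting, the sketch below decouples rollout generation from training with a queue and discards samples produced by a policy that is too many versions old. It is a generic pattern built on the Python standard library, not AReaL's API; `SharedPolicy`, `rollout_worker`, `trainer`, and `max_staleness` are hypothetical names.

```python
import queue
import threading


class SharedPolicy:
    """Stand-in for model weights: only a thread-safe version counter."""

    def __init__(self):
        self._lock = threading.Lock()
        self._version = 0

    def version(self):
        with self._lock:
            return self._version

    def bump(self):
        with self._lock:
            self._version += 1
            return self._version


def rollout_worker(policy, out_queue, n_rollouts):
    # Generates rollouts without waiting for the trainer to finish a step.
    for i in range(n_rollouts):
        out_queue.put({"id": i, "policy_version": policy.version()})
    out_queue.put(None)  # sentinel: no more rollouts


def trainer(policy, in_queue, max_staleness):
    used, dropped = [], []
    while True:
        sample = in_queue.get()
        if sample is None:
            break
        staleness = policy.version() - sample["policy_version"]
        if staleness > max_staleness:
            dropped.append(sample["id"])  # too off-policy, skip it
            continue
        used.append((sample["id"], staleness))
        policy.bump()  # pretend one gradient step was taken
    return used, dropped


def run(n_rollouts=20, max_staleness=4):
    policy = SharedPolicy()
    q = queue.Queue()
    worker = threading.Thread(target=rollout_worker, args=(policy, q, n_rollouts))
    worker.start()
    used, dropped = trainer(policy, q, max_staleness)
    worker.join()
    return used, dropped, policy.version()


used, dropped, final_version = run()
```

The exact split between used and dropped samples depends on thread timing, which is the point of the pattern: generation never blocks on training, and the staleness bound limits how off-policy the training data can become.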
Helped Another Student Land a VLA Algorithm Role......
具身智能之心· 2025-08-22 16:03
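Yesterday afternoon a student with a solid background, about to start the third year of a master's program at a C9 university and in the middle of autumn recruiting, came to Feng Ge (峰哥) to vent: "A labmate landed a VLA algorithm position (at a particularly well-funded embodied intelligence company), and it is too late for me to switch... We both started out in traditional robotics, doing SLAM-related work. I don't know what project he did afterwards to move so fast, but he passed interviews at several companies. He only recommended your community to me in the last couple of days. The curriculum is very complete, but I'm afraid it's a bit late."
In August, students have been coming to Feng Ge one after another, either having received a verbal offer or wanting to move into embodied intelligence but worried that it is too late. Autumn recruiting is close, but the saying still holds: "It is never too late." The top priority is to fill in the complete embodied intelligence roadmap as soon as possible, especially data collection, algorithms, and simulation. If you lack strong skills in independent study and problem searching, you can join our embodied intelligence community, the 具身智能之心 Knowledge Planet (知识星球), currently the largest and most comprehensive embodied intelligence learning platform in China.
The 具身智能之心 Knowledge Planet combines videos, illustrated articles, learning roadmaps, Q&A, and job-hunting exchange in one comprehensive embodied intelligence community, now with nearly 2,000 members. We hope to grow to close to 10,000 members within the next two years and to build a hub for exchange and technical sharing that many beginners and advancing learners visit often.
The community also regularly answers practical questions: How do you use the equipment? How do you collect data effectively? How do you deploy VA and VLA models? Is the collection background too complex, or is the data rather dirt ...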
Three Sets of Keywords That Capture the Recent Views of Everyone Bullish on Li Auto
理想TOP2· 2025-08-22 13:29
Core Viewpoint - The article contrasts the VC (venture capital) and PE (private equity) mindsets toward 理想 (Li Auto) and shows how each shapes the evaluation of the company's potential and performance.
VC Mindset
- The VC mindset focuses on long-term potential, typically over a 3-5 year horizon, and analyzes 理想's core value and potential as a leading physical AI company [2].
- VCs are more tolerant of mistakes and failures on the way to a long-term goal because they believe in the transformative potential of AI technology [2][5].
- The VC perspective emphasizes the low marginal cost of software and the large potential for future value creation, regardless of immediate financial metrics [9].
PE Mindset
- The PE mindset is more short-term, typically evaluating the company over less than a year and focusing on concrete financial metrics such as sales volume, revenue, and profit margins [3].
- PEs require solid evidence of value and are less forgiving of short-term misjudgments, which leads to a more critical view of 理想's recent performance [4][19].
- The PE perspective is shaped by recent financial data, which has been disappointing and has produced a low valuation based on specific performance metrics [15][16].
Physical AI
- 理想's approach to physical AI combines AI software with hardware, a significant advance over traditional software-hardware integration [6][7].
- The article stresses 理想's ability to integrate AI software and hardware closely, a capability that people focused only on hardware or on traditional software may underestimate [7].
Recent Performance and Criticism
- Recent performance metrics have drawn criticism from the PE perspective, particularly on delivery targets and product expectations [15][16].
- The specific issues include unmet delivery guidance, product delays, and customer dissatisfaction, all of which have contributed to a negative perception among PE investors [16].
- The article notes that the VC mindset may overlook these issues because of its long-term focus, while the PE mindset is less tolerant of such discrepancies [18][19].
Still Not Sure How to Start a VLA Paper? Some Students Already Have CCF-A Publications......
自动驾驶之心· 2025-08-22 12:00
Core Insights - The article discusses advances in the Li Auto VLA driver model, highlighting its improved semantic understanding, reasoning, and trajectory planning, all of which are crucial for autonomous driving [1][3][5].
Group 1: VLA Model Capabilities
- The VLA model shows stronger semantic understanding through multimodal input, better reasoning through chains of thought, and a closer match to human driving intuition through trajectory planning [1].
- Four core abilities of the VLA model are showcased: spatial understanding, reasoning, communication and memory, and behavior [1][3].
Group 2: Research and Development Trends
- The VLA model evolved from VLM+E2E and integrates cutting-edge techniques such as end-to-end learning, trajectory prediction, visual language models, and reinforcement learning [5].
- Industry is still optimizing traditional perception and planning tasks, while academia is shifting toward large models and VLA, which leaves many subfields open for exploration [5].
Group 3: VLA Research Guidance Program
- A VLA research paper guidance program has been launched and has received positive feedback. It aims to help participants systematically master key theory and develop their own research ideas [6].
- The program follows a structured 14-week curriculum, covering topics from traditional end-to-end autonomous driving to methods for writing research papers [9][11][30].
Group 4: Course Structure and Requirements
- Each session takes at most 8 participants and targets people with a background in VLA and autonomous driving at various academic levels [12][15].
- Participants are expected to have a basic understanding of deep learning, Python programming, and PyTorch, and specific hardware is suggested for optimal performance [21][22].
Group 5: Expected Outcomes
- Participants will gain insight into classic and cutting-edge research papers, coding skills, and methods for writing and submitting research papers, and will finish with a draft paper [20][34].
- The program aims to deepen participants' understanding of algorithms and their strengths and weaknesses, and to stimulate research ideas through structured guidance [20][34].