Reinforcement Learning
Explainer: Deconstructing large-model post-training, and the origins and evolution of GRPO and its successors
机器之心· 2025-09-01 02:49
Core Viewpoint
- The article traces the evolution and significance of the Group Relative Policy Optimization (GRPO) algorithm for large language models and reinforcement learning, highlighting its advantages and limitations relative to earlier methods such as Proximal Policy Optimization (PPO) [4][38].

Summary by Sections

Development of Large Language Models
- The rapid advancement of large language models has spawned a variety of post-training methods, with GRPO standing out as a notable innovation in the reinforcement learning paradigm [3][5].

Post-Training and Reinforcement Learning
- Post-training is crucial for refining a model's capabilities in specific domains, improving adaptability and flexibility to meet diverse application needs [12][11].
- Reinforcement learning, particularly reinforcement learning from human feedback (RLHF), plays a central role in the post-training phase, optimizing model outputs according to human preferences [14][19].

GRPO and Its Advantages
- GRPO eliminates the separate critic (value) model, cutting memory and computational costs significantly compared to PPO's dual-network setup [30][35].
- Instead of a learned value baseline, GRPO scores each response against the average reward of a group of responses sampled for the same prompt, simplifying the training process [34][35] (a minimal sketch of this group-relative advantage follows this summary).

Comparison of GRPO and PPO
- GRPO offers substantial improvements in memory footprint and training speed, making it a more efficient choice for large language model training [37].
- Despite these advantages, GRPO still exhibits stability issues similar to PPO's, particularly in smaller-scale reinforcement learning tasks [39].

Recent Innovations: DAPO, GSPO, and GFPO
- DAPO extends GRPO with techniques such as Clip-Higher and dynamic sampling to address practical issues encountered during training [41][42].
- GSPO shifts importance sampling from the token level to the sequence level, significantly improving training stability [48][49].
- GFPO optimizes multiple response attributes simultaneously, addressing GRPO's limitations around scalar feedback and multi-round reasoning tasks [61][63].

Conclusion
- The evolution of post-training methods, from PPO to GRPO and beyond, traces a clear trajectory in optimizing large language models, with GRPO serving as a pivot for further advances in the field [81][82].
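To make the group-relative baseline concrete, here is a minimal, illustrative sketch (not the article's or any reference implementation) of how GRPO-style advantages can be computed from a group of responses sampled for one prompt; the reward values and the mean/std normalization choice are assumptions for illustration.

```python
import numpy as np

def grpo_advantages(group_rewards, eps=1e-6):
    """Group-relative advantages: score each sampled response against the mean
    (and std) of rewards for the same prompt, replacing PPO's learned critic."""
    r = np.asarray(group_rewards, dtype=np.float64)
    return (r - r.mean()) / (r.std() + eps)

# Example: 4 responses sampled for one prompt, scored by a reward model or verifier.
rewards = [1.0, 0.0, 0.5, 1.0]          # e.g. correctness of each response
advantages = grpo_advantages(rewards)    # every token of response i shares advantages[i]
print(advantages)
```

In GSPO's sequence-level variant, the importance ratio used alongside such advantages would be computed once per response rather than per token, which is the stability change the summary above describes.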
RLinf, the first large-scale reinforcement learning framework built for embodied intelligence, open-sourced by Tsinghua, Beijing Zhongguancun College, Infinigence AI (无问芯穹), and others
机器之心· 2025-09-01 02:49
Core Viewpoint
- The article covers the launch of RLinf, a large-scale reinforcement learning framework designed for embodied intelligence, emphasizing a flexible, scalable architecture that unifies training, rendering, and inference [5][7].

Group 1: Development of the RL Framework
- The shift in artificial intelligence from "perception" to "action" underscores the importance of embodied intelligence, which is drawing growing attention in both academia and industry [2][4].
- RLinf is developed jointly by Tsinghua University, Beijing Zhongguancun College, and Infinigence AI (无问芯穹), aiming to address the limitations of existing frameworks in supporting embodied intelligence [5][7].

Group 2: Features of RLinf
- RLinf's architecture consists of six layers, namely user, task, execution, scheduling, communication, and hardware, enabling a hybrid execution mode that achieves over a 120% system speedup [7][12].
- The framework introduces a Macro-to-Micro Flow (M2Flow) mechanism, enabling flexible construction of training pipelines while preserving programming flexibility and ease of debugging [14][15].

Group 3: Execution Modes
- RLinf supports three execution modes, Collocated Mode, Disaggregated Mode, and Hybrid Mode, letting users place components for optimal resource utilization [19][20]; a toy placement sketch follows this summary.
- The framework integrates low-intrusion, multi-backend support to serve the diverse needs of researchers in embodied intelligence [16][20].

Group 4: Communication and Scheduling
- RLinf provides an adaptive communication library designed for reinforcement learning, optimizing data exchange between components to improve system efficiency [22][28].
- An automated scheduling module minimizes resource idling by profiling component performance and selecting the best execution mode, markedly improving training stability [24][25].

Group 5: Performance Metrics
- RLinf shows superior performance on embodied intelligence tasks, with more than a 120% efficiency improvement over existing frameworks in specific tests [27][33].
- Trained models show large success-rate gains across tasks, reaching up to 97.3% success in specific scenarios [31][35].

Group 6: Future Development and Community Engagement
- The RLinf team emphasizes open-source principles, providing comprehensive documentation and support to improve the user experience and foster collaboration [40][41].
- The team is actively recruiting for multiple roles to further develop and maintain RLinf, and invites community engagement and feedback [42][43].
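The sketch below is not RLinf's actual API; it only illustrates, under assumed names, the difference between collocated, disaggregated, and hybrid placement of the training (actor) and rollout components that the execution-mode description refers to.

```python
from dataclasses import dataclass

@dataclass
class Placement:
    actor_gpus: list      # GPUs running policy training
    rollout_gpus: list    # GPUs running environment rendering / inference

def make_placement(mode: str, gpus: list) -> Placement:
    """Toy illustration of the three placement strategies described for RLinf."""
    if mode == "collocated":        # every component shares all GPUs, time-sliced
        return Placement(actor_gpus=gpus, rollout_gpus=gpus)
    if mode == "disaggregated":     # components get disjoint GPU pools, run concurrently
        half = len(gpus) // 2
        return Placement(actor_gpus=gpus[:half], rollout_gpus=gpus[half:])
    if mode == "hybrid":            # mix: some GPUs shared, others dedicated to one role
        return Placement(actor_gpus=gpus[: len(gpus) * 3 // 4],
                         rollout_gpus=gpus[len(gpus) // 4:])
    raise ValueError(f"unknown mode: {mode}")

print(make_placement("hybrid", [0, 1, 2, 3]))
```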
R-Zero deep dive: How can AI self-evolve without human data?
机器之心· 2025-08-31 03:54
Core Viewpoint
- The article analyzes the R-Zero framework, which lets AI models self-evolve from "zero data" through the co-evolution of two roles, a Challenger and a Solver, aiming to overcome the dependence of traditional large language models on extensive human-annotated data [2][3].

Group 1: R-Zero Framework Overview
- R-Zero is designed to let AI generate its own learning tasks and improve its reasoning capabilities without human intervention [11].
- The framework consists of two independent but cooperating agents: a Challenger (Qθ) and a Solver (Sϕ) [6].
- The Challenger acts as a curriculum generator, creating tasks at the edge of the Solver's current capabilities and focusing on tasks with high information gain [6].

Group 2: Iterative Process
- In each iteration, the Challenger is trained against a frozen Solver to generate questions that maximize the Solver's uncertainty [8].
- After each iteration, the strengthened Solver becomes the new target for the Challenger's training, producing a spiral increase in both agents' capabilities [9].

Group 3: Implementation and Results
- Pseudo-labels are produced with a self-consistency strategy: the Solver generates multiple candidate answers per question and the most frequent answer is taken as the pseudo-label [17]; a small sketch of this step follows this summary.
- A filtering mechanism retains only questions whose empirical accuracy falls within a specified band, improving the quality of the training signal [18].
- Experiments show significant gains in reasoning: the Qwen3-8B-Base model's average score on mathematical benchmarks rises from 49.18 to 54.69 after three iterations (+5.51) [18].

Group 4: Generalization and Efficiency
- The model generalizes well: average scores on general reasoning benchmarks such as MMLU-Pro and SuperGPQA improve by 3.81 points, suggesting stronger core reasoning rather than rote memorization of specific knowledge [19].
- R-Zero can serve as an efficient intermediate training stage, maximizing the value of human-annotated data used in subsequent fine-tuning [22].

Group 5: Challenges and Limitations
- A key challenge is the declining accuracy of pseudo-labels, which drops from 79.0% in the first iteration to 63.0% in the third, indicating noisier supervision as task difficulty rises [26].
- The framework's reliance on domains with objective, verifiable answers limits its applicability in areas with subjective evaluation criteria, such as creative writing [26].
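A minimal sketch of the self-consistency pseudo-labeling and accuracy-band filtering steps described above; the `solver` callable, the sample count, and the 30-70% band are illustrative assumptions, not R-Zero's published hyperparameters.

```python
from collections import Counter

def pseudo_label(solver, question, n_samples=8):
    """Self-consistency: sample several answers and take the majority vote."""
    answers = [solver(question) for _ in range(n_samples)]
    label, votes = Counter(answers).most_common(1)[0]
    return label, votes / n_samples          # pseudo-label and its empirical agreement rate

def filter_questions(solver, questions, lo=0.3, hi=0.7):
    """Keep only questions the Solver answers consistently at an intermediate rate:
    questions that are too easy or too hard carry little learning signal."""
    kept = []
    for q in questions:
        label, acc = pseudo_label(solver, q)
        if lo <= acc <= hi:
            kept.append((q, label))
    return kept
```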
Boston Dynamics' robot dog finally has a new trick! Engineers: we didn't expect it to pull it off either
机器人大讲堂· 2025-08-30 14:59
Core Viewpoint
- Boston Dynamics' Spot robot has demonstrated impressive new capabilities, including backflips, highlighting its advanced engineering and its potential applications across industries [1][3][5].

Group 1: Technical Achievements
- Spot can perform multiple backflips and other complex movements, showing agility comparable to a gymnast's [3][5].
- The engineering team, led by Arun Kumar, initially doubted that Spot could backflip at all, underscoring the experimental nature of the project [5].
- The training is not merely for show: it aims to ensure Spot can recover quickly from falls while carrying heavy loads in industrial settings [8][10].

Group 2: Training and Development Process
- Development proceeds through iterative testing in simulation before successful movements are deployed to the physical robot [11].
- The team uses reinforcement learning to push Spot's performance, reaching speeds above 5.2 meters per second, more than three times the default controller's maximum [13].

Group 3: Practical Applications
- Since its commercial launch in 2020, Spot has been used in a range of industrial applications, including surveying at Ford factories and safety inspections at Kia [14][17].
- Spot has also performed radiation surveys for Dominion Energy and automated inspections at Chevron facilities, demonstrating its versatility across environments [16][17].

Group 4: Public Perception and Engagement
- Public performances, such as appearances on "America's Got Talent," aim to reframe robots as engaging and beneficial rather than threatening [20][22].
- Deployments for unusual tasks, such as delivering pizza for Domino's, illustrate Spot's adaptability and the breadth of potential applications [18].
After a year out of the spotlight, Kimi's Yang Zhilin in his latest conversation: "Standing at the beginning of infinity"
创业邦· 2025-08-30 03:19
Core Viewpoint
- The article recounts a conversation with Moonshot AI founder Yang Zhilin on the evolution of AI, focusing on the Kimi K2 model, its remaining technical challenges, and the philosophical implications of problem-solving in AI development [4][5][12].

Group 1: Kimi K2 Model Development
- Kimi K2, built on a MoE architecture, represents a significant advance, released as an open-source model able to program and interact with the digital world [4][5].
- Its release in July 2025 marked a return to public attention for Moonshot AI after a period of relative silence from founder Yang Zhilin [4][5].
- Development shifted from a pre-training plus supervised fine-tuning pipeline toward pre-training plus reinforcement learning, which significantly changed how the company operates [27][28].

Group 2: Philosophical Insights
- Yang Zhilin argues that human civilization is a continuous process of conquering problems and expanding the boundary of knowledge, drawing on David Deutsch's book "The Beginning of Infinity" [5][12].
- The idea that every solved problem creates new questions is central to ongoing AI development, implying an infinite journey of exploration and innovation [5][12].

Group 3: Technical Innovations
- K2 aims to maximize token efficiency, so the model learns more from the same amount of data, which is crucial given the slow growth of high-quality data [29][30].
- The Muon optimizer significantly improves token efficiency, letting the model learn from data more effectively than traditional optimizers such as Adam [30][31]; an illustrative sketch of a Muon-style update follows this summary.
- The model's ability to carry out complex tasks over long horizons without human intervention is a notable advance, pointing toward end-to-end automation in AI applications [17][44].

Group 4: Agentic Capabilities
- K2 is characterized as an agentic model, capable of multi-turn interactions and of using tools to connect with the external world, which strengthens its problem-solving ability [43][44].
- Multi-agent systems are highlighted as a way to improve task execution and collaboration among agents, enabling more complex problem solving [22][44].
- Generalization remains a challenge for agent models, with ongoing work on adapting them to varied tasks and environments [34][46].
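As context for the Muon discussion, here is a minimal sketch of a Muon-style update as commonly described in open-source write-ups: momentum on a 2D weight's gradient, orthogonalized with a Newton-Schulz iteration. The polynomial coefficients, momentum handling, and learning rate are assumptions taken from those public descriptions, not Moonshot AI's implementation, and real variants add shape-dependent scaling and other details.

```python
import torch

def newton_schulz_orthogonalize(m, steps=5, a=3.4445, b=-4.7750, c=2.0315):
    """Approximately map a matrix to the nearest orthogonal one (the UV^T of its SVD)
    via an iterative matrix polynomial; coefficients follow common Muon write-ups."""
    x = m / (m.norm() + 1e-7)
    transposed = x.shape[0] > x.shape[1]
    if transposed:
        x = x.T
    for _ in range(steps):
        a_mat = x @ x.T
        x = a * x + (b * a_mat + c * a_mat @ a_mat) @ x
    return x.T if transposed else x

def muon_step(weight, grad, momentum, lr=0.02, beta=0.95):
    """One Muon-style update for a 2D weight matrix: accumulate momentum, then
    orthogonalize the momentum buffer and step against it."""
    momentum.mul_(beta).add_(grad)
    update = newton_schulz_orthogonalize(momentum)
    weight.add_(update, alpha=-lr)
    return weight, momentum

w = torch.randn(64, 32)
m = torch.zeros_like(w)
w, m = muon_step(w, torch.randn_like(w), m)
```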
Sequoia US: The five AI tracks we are focusing on over the next year
Founder Park· 2025-08-29 12:19
Core Viewpoint
- Sequoia Capital argues that the AI revolution will be a transformation comparable to the Industrial Revolution, representing a $10 trillion opportunity in the service industry, of which only about $20 billion has so far been automated by AI [2][11].

Investment Themes
- Over the next 12-18 months, Sequoia will focus on five key investment themes: persistent memory, communication protocols, AI voice, AI security, and open-source AI [2][30].

Historical Context
- The article draws parallels between the current AI revolution and milestones of the Industrial Revolution, stressing the role of specialization in building complex systems [5][7][10].

Market Potential
- The U.S. service industry is valued at $10 trillion, with only about $20 billion currently affected by AI, indicating enormous room for growth [11][13].

Investment Trends
- Five observed investment trends:
  1. Leverage over certainty, where AI agents can significantly increase productivity despite some residual uncertainty [21].
  2. Real-world validation of AI capabilities, moving beyond academic benchmarks [23].
  3. The practical application of reinforcement learning in industry [25].
  4. AI's integration into the physical world, enhancing processes and hardware [27].
  5. Computing becoming a new productivity function, with knowledge workers' computational needs expected to rise dramatically [29].

Focus Areas for Investment
- Persistent memory is crucial for AI to integrate deeply into business processes, and remains an open challenge [31].
- Seamless communication protocols are needed for AI agents to collaborate effectively, analogous to the TCP/IP standard in the internet revolution [34].
- AI voice technology is maturing, with applications in both consumer and enterprise settings [36][37].
- AI security represents a significant opportunity across the development and consumer-usage spectrum [39].
- Open-source AI is at a critical juncture, with the potential to compete with proprietary models and foster a more open future [41].
That's a Chinese robot for you: its table tennis is seriously impressive
量子位· 2025-08-29 11:37
Core Viewpoint
- The article covers advances in humanoid robots, focusing on a table tennis robot developed by Tsinghua University students that demonstrates high-level table tennis skills through a combination of hierarchical planning and reinforcement learning [7][8].

Group 1: Robot Performance
- The robot responds with a reaction time of 0.42 seconds and has sustained up to 106 consecutive hits in a rally [3][5][23].
- In real-world tests it returned 24 of 26 balls, achieving a hitting rate of 96.2% and a return rate of 92.3% [21].

Group 2: Technical Framework
- The team proposes a hierarchical framework separating high-level planning from low-level control, so the robot can predict ball trajectories and execute human-like strokes [9][11]; a toy trajectory-prediction sketch follows this summary.
- A model-based planner predicts the ball's position, speed, and timing, while a reinforcement-learning-based controller generates coordinated whole-body movements [10][16].

Group 3: Training Methodology
- The robot was trained on a standard table tennis setup, with its hand modified to act as a paddle [13].
- Training incorporated human motion references to encourage human-like swinging actions [18][19].

Group 4: Challenges in Robotics
- Table tennis is a demanding benchmark for robots because perception, prediction, planning, and execution must all happen within a very short time window [29][30].
- The sport requires agile full-body movement, including fast arm swings, waist rotation, and balance recovery, making it a complex task for humanoid robots [32][33].
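To illustrate the kind of quantity a model-based planner in such a hierarchy computes, here is a toy sketch that predicts when and where the ball crosses the robot's hitting plane under simple ballistic motion; it ignores spin, drag, and table bounces, all of which the planner described in the article would have to handle, and the numbers are invented for illustration.

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, -9.81])  # m/s^2

def predict_interception(p0, v0, hit_plane_x, dt=0.002, horizon=1.0):
    """Integrate a simple ballistic model until the ball reaches the hitting plane.
    Returns (time_to_hit, predicted_position, predicted_velocity), or None if it
    never crosses the plane within the planning horizon."""
    p, v, t = np.array(p0, float), np.array(v0, float), 0.0
    while t < horizon:
        p = p + v * dt
        v = v + GRAVITY * dt
        t += dt
        if p[0] >= hit_plane_x:        # ball has crossed the robot's hitting plane
            return t, p, v
    return None

# Ball launched toward the robot: hitting plane 1.5 m away, 2 m/s incoming speed.
result = predict_interception(p0=[0.0, 0.0, 0.9], v0=[2.0, 0.1, 0.5], hit_plane_x=1.5)
print(result)
```

The low-level reinforcement-learned controller would then be conditioned on this predicted interception point and time to generate the swing.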
Saining Xie recalls his OpenAI interview seven years ago: whiteboard coding and a five-hour meeting that ran until after dark
机器之心· 2025-08-29 09:53
Core Insights
- The article recounts the distinctive interview experiences of AI researchers at major tech companies, highlighting differences in interview style and in each company's research focus [1][9][20].

Group 1: Interview Experiences
- Lucas Beyer, a researcher with extensive experience at top AI firms, started a poll about memorable interview experiences at companies such as Google, Meta, and OpenAI [2][20].
- Saining Xie recalled that his interviews at various AI companies were unforgettable, in particular a grueling two-hour marathon at DeepMind involving more than 100 math and machine learning questions [5][6].
- The interview process at Meta was described as more academic, centered on discussions with prominent researchers rather than coding alone [6][7].

Group 2: Company-Specific Insights
- The Google Research interview resembled an academic job interview, emphasizing research discussions over coding challenges [7].
- OpenAI's process featured a lengthy session on a reinforcement learning problem, reflecting the company's deep research engagement [8][9].
- The interview questions mirror each company's research priorities, such as Meta's focus on computer vision and OpenAI's emphasis on reinforcement learning [9][20].

Group 3: Notable Interviewers and Candidates
- Figures such as John Schulman and Noam Shazeer appeared as interviewers, indicating the caliber of talent involved in hiring at these firms [7][9].
- Candidates shared memorable moments, such as solving complex problems on napkins or engaging in deep discussions about research topics [19][20].
Quadruped robot dog + single arm: start your embodied-learning journey on a budget
具身智能之心· 2025-08-29 04:00
Core Viewpoint
- Xdog is a low-cost, multifunctional development platform combining a quadruped robot dog and a robotic arm, designed for embodied-intelligence developers and accompanied by a comprehensive curriculum for research and learning in robotics [1][2].

Group 1: Hardware Overview
- Xdog integrates the robot dog and arm with functions such as voice control, sim2real, real2sim, target recognition and tracking, autonomous grasping, and reinforcement-learning gait control [2][5].
- The robot dog measures 25 cm × 20 cm × 30 cm, weighs 7.0 kg, and reaches a top speed of 7.2 km/h with a maximum rotation speed of 450 degrees per second [3][11].
- The main control chip is an Allwinner H616 with a quad-core 1.6 GHz CPU, 4 GB RAM, and 32 GB storage [4][5].

Group 2: Technical Specifications
- Battery capacity is 93.24 Wh, giving roughly 120 minutes of operation and about 6 hours of standby [5][11].
- The arm reaches a maximum height of 0.85 m and can grasp within 0.4 m of its base [7].
- The depth camera uses active dual infrared plus structured light, outputting depth at 1280 × 800 @ 30 fps over a working range of 0.2 m-10 m [14].

Group 3: Software and Functionality
- Supported control methods include voice control, keyboard control, visual control, and reinforcement learning for autonomous movement [15][17]; a minimal teleoperation sketch follows this summary.
- Development is based on ROS1 with Python as the primary programming language; a GPU of at least an RTX 2080 Ti is recommended for inference [16][24].
- Advanced functions include coordinated control of the arm and dog for target following, as well as autonomous grasping [19][20].

Group 4: Educational Curriculum
- The curriculum includes hands-on training in ROS project creation, MuJoCo simulation, and reinforcement learning principles, among other topics [22][23].
- Courses cover setup and use of the Xdog system, including network configuration, camera parameter tuning, and advanced algorithms for object recognition and tracking [22][23].
- The teaching team consists of experienced instructors responsible for project management, technical support, and algorithm training [22].

Group 5: Delivery and Support
- Delivery is completed within three weeks of payment, with a one-year after-sales warranty [25][26].
- The product ships as hardware plus accompanying courses; returns or exchanges are not accepted for non-quality issues [26].
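Since the platform is ROS1- and Python-based, a keyboard-style control loop typically reduces to publishing velocity commands. The sketch below follows common ROS conventions (a `/cmd_vel` topic carrying `geometry_msgs/Twist`); the topic name and message interface are assumptions for illustration, since Xdog's actual topics are not specified here.

```python
#!/usr/bin/env python
# Minimal ROS1 (rospy) sketch of velocity control for a quadruped platform.
import rospy
from geometry_msgs.msg import Twist

def send_velocity(pub, forward=0.3, turn=0.0, duration=2.0, rate_hz=10):
    """Publish a constant body velocity for `duration` seconds, then stop."""
    cmd = Twist()
    cmd.linear.x = forward      # m/s forward
    cmd.angular.z = turn        # rad/s yaw
    rate = rospy.Rate(rate_hz)
    end = rospy.Time.now() + rospy.Duration(duration)
    while rospy.Time.now() < end and not rospy.is_shutdown():
        pub.publish(cmd)
        rate.sleep()
    pub.publish(Twist())        # zero command stops the robot

if __name__ == "__main__":
    rospy.init_node("xdog_teleop_sketch")
    pub = rospy.Publisher("/cmd_vel", Twist, queue_size=1)  # assumed topic name
    rospy.sleep(0.5)            # give the publisher time to connect
    send_velocity(pub, forward=0.3, turn=0.2)
```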
Trajectory planning based on deep reinforcement learning
自动驾驶之心· 2025-08-28 23:32
Core Viewpoint
- The article surveys the progress and potential of reinforcement learning (RL) in autonomous driving, tracing its evolution and comparing it with other learning paradigms such as supervised learning and imitation learning [4][7][8].

Summary by Sections

Background
- The industry has recently focused on new technical paradigms such as VLA and reinforcement learning, with interest in RL growing after milestones such as AlphaZero and ChatGPT [4].

Supervised Learning
- In autonomous driving, perception tasks such as object detection are framed as supervised learning: a model is trained on labeled data to map inputs to outputs [5].

Imitation Learning
- Imitation learning trains models to replicate actions from observed behavior, much as a child learns from adults; it is the primary learning objective in end-to-end autonomous driving [6].

Reinforcement Learning
- Reinforcement learning differs from imitation learning in that it learns through interaction with the environment, using feedback on task outcomes to optimize the model; it is particularly suited to sequential decision-making in autonomous driving [7].

Inverse Reinforcement Learning
- Inverse reinforcement learning addresses the difficulty of hand-designing reward functions for complex tasks by learning a reward model from user feedback, which then guides training of the main model [8].

Basic Concepts of Reinforcement Learning
- Key concepts include policies, rewards, and value functions, which are essential to understanding how RL operates in autonomous driving contexts [14][15][16].

Markov Decision Process
- The Markov decision process is presented as the framework for modeling sequential tasks, applicable to a range of autonomous driving scenarios [10].

Common Algorithms
- Foundational algorithms include dynamic programming, Monte Carlo methods, and temporal-difference learning [26][30].

Policy Optimization
- On-policy and off-policy algorithms are contrasted, each with its own trade-offs in training stability and data utilization [27][28].

Advanced Reinforcement Learning Techniques
- Techniques such as DQN, TRPO, and PPO are introduced, showing how they improve training stability and efficiency [41][55]; a minimal sketch of PPO's clipped objective follows this summary.

Application in Autonomous Driving
- Reward design and closed-loop training are emphasized: the vehicle's actions influence the environment, which calls for sophisticated modeling techniques [60][61].

Conclusion
- Reinforcement learning algorithms and their application to autonomous driving are developing rapidly, and readers are encouraged to engage with the technology hands-on [62].
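As a concrete reference for the PPO discussion above, here is a minimal sketch of the clipped surrogate objective that PPO maximizes; the batch values, clip range, and use of plain NumPy are illustrative choices, not the article's implementation.

```python
import numpy as np

def ppo_clipped_objective(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    """PPO's clipped surrogate: limit how far the new policy's probability ratio
    can move from the old policy for each sampled action."""
    ratio = np.exp(log_probs_new - log_probs_old)
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    # Take the pessimistic (minimum) term so overly large policy updates are not rewarded.
    return np.mean(np.minimum(ratio * advantages, clipped * advantages))

# Toy batch: three sampled actions with their advantages.
obj = ppo_clipped_objective(
    log_probs_new=np.array([-0.9, -1.2, -0.3]),
    log_probs_old=np.array([-1.0, -1.0, -0.5]),
    advantages=np.array([1.5, -0.7, 0.4]),
)
print(obj)  # maximize this (or minimize its negative) by gradient ascent on the policy
```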