Reinforcement Learning
Bund Conference Again Shows Ant's True Colors: A Fintech Company
Mei Ri Shang Bao· 2025-09-11 23:04
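Shangbao News (reporters Zhang Lingli and Lü Wenjuan): Yesterday morning, the 2025 Inclusion·Bund Conference, themed "Reshaping Innovation Growth," opened at the Huangpu Expo Park in Shanghai. A total of 550 guests from 16 countries and regions attended and spoke, including newly named Turing Award winner Richard Sutton; Wang Jian, founder of Alibaba Cloud and director of Zhejiang Lab; Yuval Noah Harari, author of the "Sapiens" series; and Wang Xingxing, founder and CEO of Yushu Technology (Unitree Robotics). These leading scholars, industry figures, young entrepreneurs, and scientists jointly explored innovation paths and the future of business in the intelligent era.

This year's program is international and diverse. It focuses on five main tracks: "Fintech," "AI and Industry," "Innovation and Venture Investment Ecosystem," "Global Dialogue and Cooperation," and "Responsible Innovation and an Inclusive Future." It comprises 1 opening main forum, 44 insight forums, a technology exhibition, and a series of sci-tech innovation events. As one of the year's most closely watched fintech events, the Bund Conference draws global attention for its openness, diversity, and forward-looking outlook, and has been called one of "Asia's three major fintech events."

Turing Award winner Richard Sutton at the Bund Conference: AI is entering the "era of experience," with potential far beyond the past

At yesterday morning's opening main forum, Richard Sutton, the 2024 Turing Award winner and "father of reinforcement learning," delivered a keynote speech. In his view, the human data dividend is approaching ...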
Tencent Research Institute AI Express 20250912
Tencent Research Institute · 2025-09-11 16:01
Group 1
- Thinking Machines has released its first research blog addressing non-determinism in LLM inference, focusing on batch invariance [1]
- The research team improved RMSNorm, matrix multiplication, and attention mechanisms to achieve fully reproducible inference results with acceptable performance loss [1]
- The company's valuation has reached $12 billion, with a founding team primarily from OpenAI, and its first product is named Connection Machine [1]

Group 2
- OpenAI announced that ChatGPT now officially supports MCP (Model Context Protocol), allowing Plus and Pro users to automate operations with a single prompt [2]
- MCP standardizes interactions between AI models, tools, and data sources, enabling different models to share context and support plug-and-play functionality [2]
- Users can connect third-party services (like Stripe) in developer mode to complete complex tasks, although this cannot be used simultaneously with other ChatGPT features [2]

Group 3
- WeChat official account has launched an "Intelligent Reply" feature supported by Tencent's Hunyuan large model, addressing the issue of operators not being able to respond to reader inquiries in a timely manner [3]
- This feature automatically learns from the account's historical articles and reply styles, marking replies as "intelligent replies" and referencing relevant historical articles [3]
- Tencent Hunyuan will also introduce Roleplay models and AI avatar applications to provide immersive dialogue experiences, which individual creators can enable in the PC backend of the official account [3]

Group 4
- Kimi has open-sourced a new middleware called checkpoint-engine, capable of updating trillion-parameter models across thousands of GPUs in 20 seconds, significantly enhancing reinforcement learning efficiency [4]
- This technology employs a hybrid co-location architecture to manage parameter states through a distributed checkpoint engine, enabling parallel processing of parameter broadcasting and reloading [4]
- The system design supports complete decoupling of training and inference engines, using a pipeline approach for parameter updates to enhance stability against single-point failures [4]

Group 5
- NVIDIA has released a new AI Blueprint that allows 3D artists to quickly create scene prototypes using generative AI technology, generating up to 20 3D models from text prompts [5]
- It integrates Microsoft TRELLIS and NVIDIA NIM microservices, achieving speeds 20% faster than native applications, and supports RTX 50 and 40 series GPUs with over 16GB of memory [5]
- The workflow automates the conversion from concept to 3D model, with generated models exportable to platforms like Blender for further optimization, significantly reducing prototype design time for artists [5]

Group 6
- Baidu Academic has completed an AI reconstruction, launching features like AI academic search, AI literature summarization, AI reading, and paper mapping, creating the first one-stop AI academic platform in the industry [7]
- The platform covers the entire academic chain of "search, read, create, and edit," providing literature summarization, full-text translation, topic recommendations, and professional formatting, greatly enhancing research efficiency [7]
- It has indexed 690 million literature resources, covering 1.04 million academic sites, and established 4.2 million scholar profiles, with plans to build an academic identity system supported by Baidu's full traffic [7]

Group 7
- Tencent Meeting has launched an AI hosting feature in collaboration with Yuanbao, allowing users to have the AI listen to meetings in advance and record in real-time, addressing issues like tardiness and overlapping meetings [8]
- Users can activate "AI hosting" on the meeting page or list, enabling Yuanbao to automatically join the meeting and generate intelligent AI minutes, ensuring no content is missed [8]
- After the meeting, users can directly ask Yuanbao about the meeting content to assist in decision-making, ensuring that key meetings are always "present" [8]

Group 8
- Wang Xingxing, founder of Yushu Technology, expressed regret for not focusing on AI since 2011, believing that the current fields for AI application remain "desolate" [9]
- Yushu Technology has announced its IPO plan, expecting to submit an application by the end of 2025, with projected revenue exceeding 1 billion yuan in 2024 and four consecutive years of profitability, aiming to become the largest "quadruped and humanoid robot" stock globally [9]
- Wang revised his previous views on data, acknowledging that both robot data and models are core issues, advising young entrepreneurs to embrace current AI technological innovations [9]

Group 9
- Sutton, known as the "father of reinforcement learning," stated in a speech that AI is entering an "experience era," where intelligence will be gained from continuous learning rather than static knowledge accumulation [10]
- He emphasized that fears surrounding AI are exaggerated, suggesting that AI and human prosperity stem from decentralized collaboration, allowing intelligent agents to coexist peacefully under different objectives [10]
- Sutton proposed four predictive principles, asserting that human intelligence will be surpassed, power will shift to the smartest agents, and AI is an inevitable next step in the evolution of the universe [10]
Foreseeing AI: Humanity Enters a New "Era of Experience"; Only an Artificial Sun Can Feed AI
Nan Fang Du Shi Bao· 2025-09-11 15:58
Group 1: AI and Innovation
- The 2025 Inclusion·Bund Conference in Shanghai focused on "Reshaping Innovation Growth," featuring discussions on AI as a key theme, with over 40 forums and a significant technology exhibition [1]
- Richard Sutton, the 2024 Turing Award winner, emphasized that humanity is entering a new "Era of Experience," where AI's replacement is inevitable, and the data era is nearing its end [3][4]
- Sutton highlighted that the core of intelligence lies in experience, which involves observation, action, and reward, and pointed out the need for continual learning and meta-learning technologies to unlock AI's full potential [3]

Group 2: Industry Perspectives
- Wang Jian, founder of Alibaba Cloud, stated that open data and computing resources are essential for advancing AI, marking a shift from code open-sourcing to resource sharing [5][6]
- Wang also introduced the concept of "computing satellites," which will leverage AI in space exploration, indicating a new frontier for AI applications beyond traditional devices [6]
- Wang Xingxing, CEO of Yushu Technology, expressed optimism about the AI era, noting that small organizations will increasingly have explosive growth potential, despite existing challenges in data quality and model algorithms [7][8]

Group 3: Organizational Challenges
- McKinsey's China Chairman, Li Yili, identified organizational culture as the biggest bottleneck in AI development, advocating for CEO-led transformations focused on profitability rather than just application scenarios [8][9]
- Li outlined three stages of globalization for Chinese enterprises, emphasizing the need for a global perspective and diverse collaboration models to enhance growth opportunities [10]

Group 4: Energy and AI
- Professor Sun Xuan from the University of Science and Technology of China proposed that nuclear fusion is the key to meeting the energy demands of AI, with 1 gram of fusion fuel equating to the energy of 8 tons of oil [11][12]
- Sun highlighted the significant energy gap that AI could create, predicting that AI's energy consumption could exceed 20% of the Earth's total energy supply in the future [11]
- The fusion industry is seeing increased investment, with a total of $7.1 billion raised globally, indicating a growing interest in commercializing fusion technology [12]
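As a rough sanity check on the "1 gram of fusion fuel equals 8 tons of oil" figure, the arithmetic can be sketched as follows. The summary does not say which fuel is meant; this sketch assumes deuterium-tritium fuel (about 17.6 MeV released per reaction, about 5.03 atomic mass units of fuel per reaction) and the standard tonne of oil equivalent of 41.868 GJ. Under those assumptions the number comes out close to 8.

```latex
\begin{align*}
N &= \frac{N_A}{m_{\mathrm{D}} + m_{\mathrm{T}}}
   \approx \frac{6.022\times10^{23}\ \mathrm{mol^{-1}}}{5.03\ \mathrm{g\,mol^{-1}}}
   \approx 1.20\times10^{23}\ \text{reactions per gram} \\
E &= N \times 17.6\ \mathrm{MeV} \times 1.602\times10^{-13}\ \mathrm{J\,MeV^{-1}}
   \approx 3.4\times10^{11}\ \mathrm{J\ per\ gram} \\
\frac{E}{41.868\ \mathrm{GJ}} &\approx \frac{3.4\times10^{11}\ \mathrm{J}}{4.19\times10^{10}\ \mathrm{J}}
   \approx 8\ \text{tonnes of oil equivalent}
\end{align*}
```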
Financial Large Models Enter the Battle for "Value": How to Clear Three Hurdles?
Di Yi Cai Jing· 2025-09-11 10:11
Core Insights
- The year 2025 is identified as a pivotal year for the large-scale implementation of AI in China's financial industry, transitioning from mere usage to creating real value [1][2]
- Financial institutions are increasingly focusing on the collaboration between technology and business departments to achieve actual benefits and cost control, with "value" becoming a common consensus in the industry [2][3]

AI Application in Finance
- AI applications in finance have evolved from simple human assistance to intelligent agents capable of perception, learning, action, and decision-making, applicable in areas like market analysis, risk assessment, and wealth management [2][3]
- The participation of business departments in AI development has significantly increased from 18% to 74%, indicating a shift towards practical applications of AI [3]

Accelerated Implementation
- Major banks are rapidly expanding AI applications, with examples such as ICBC's "Navi AI+" initiative introducing over 100 new AI application scenarios in key business areas [3]
- Postal Savings Bank has developed over 230 AI model scenarios, showcasing the industry's commitment to integrating AI into their operations [3]

Strategic Considerations
- Financial institutions are beginning to systematically consider their AI strategies, aiming to become more agile and better manage light capital businesses [3]
- There is a consensus that while AI can reshape business processes, it will take time to fully realize its potential, emphasizing the importance of building a robust AI framework in the next 1-2 years [3]

Data Utilization Challenges
- Companies face challenges in converting data resources into assets, with a need to bridge the gap between data, technology, and algorithms to support decision-making [4][5]
- The concept of insight platforms is proposed to activate approximately 70% of "sleeping" data, transforming it into valuable resources for AI model training [4]

Security and Trust Issues
- The application of domestic AI models in finance is transitioning from isolated breakthroughs to ecosystem reconstruction, but issues like algorithm bias and privacy breaches remain unresolved [6]
- The financial sector requires high precision in decision-making, making the introduction of reinforcement learning technology crucial for enhancing decision accuracy [6][7]

Uncertainty in AI Deployment
- The introduction of AI brings new challenges, particularly regarding uncertainty in investment returns and business outcomes, necessitating innovation in strategic planning and organizational design [7]
Fears of AI Are Exaggerated: "Father of Reinforcement Learning" Sutton's Bund Speech Offers Four Principles Predicting AI's Future
36Kr · 2025-09-11 08:34
Group 1
- The core idea presented is that the human data dividend is nearing its limit, and artificial intelligence (AI) is entering an "experience era" centered on continuous learning, which has the potential to exceed previous capabilities [1][9][44]
- AI's current training methods are primarily focused on transferring existing human knowledge to static models without autonomous learning capabilities, leading to a recognition of the limitations of this approach [10][14]
- The future of AI relies on the development of two currently immature technologies: continual learning and meta-learning, which are essential for unlocking the full potential of experience-based learning [16][14]

Group 2
- AI has become a highly politicized issue, with public fears about bias, unemployment, and even human extinction being exaggerated and fueled by certain organizations and individuals [16][18][25]
- The call for regulation and control of AI reflects a broader societal tendency to fear the unknown, which can hinder collaborative efforts necessary for progress [24][28]
- The concept of decentralized collaboration is emphasized as a superior alternative to centralized control, allowing for coexistence among diverse intelligent agents with different goals [20][26][21]

Group 3
- Four principles are proposed to predict the future of AI: the absence of a unified global opinion on how the world should operate, the eventual understanding and creation of intelligence by humans, the inevitable surpassing of current human intelligence by superintelligent entities, and the flow of power and resources towards the most intelligent agents [35][36][37]
- The inevitability of AI's replacement of human roles is acknowledged, framing it as a natural progression in the evolution of intelligence [38][44]
- The role of humans as catalysts and pioneers in the "design era" is highlighted, emphasizing the unique ability to push design to its limits through AI [42][43]
VLA: Some Call It the "Strongest Solution," Others Say It "Can't Run"
36Kr · 2025-09-11 08:17
Core Viewpoint
- The intelligent driving industry is at a critical juncture with the emergence of VLA (Vision-Language-Action) technology, leading to a division among key players regarding its potential and implementation [1][2][3].

Group 1: VLA Technology and Its Implications
- VLA is seen as a potential solution to the limitations of end-to-end systems in intelligent driving, which can only address about 90% of the challenges [6][10].
- The introduction of language as a bridge in the VLA model aims to enhance the system's understanding and decision-making capabilities, allowing for more complex and nuanced driving actions [12][14][18].
- VLA is believed to improve three key areas: understanding dynamic traffic signals, enabling natural voice interactions, and enhancing risk prediction capabilities [19][20][21].

Group 2: Challenges and Criticisms of VLA
- Despite the potential advantages, VLA faces significant challenges, including the need for substantial financial investment and the technical difficulties of aligning multimodal data [31][32].
- Critics argue that VLA may not be necessary for achieving higher levels of autonomous driving, with some suggesting it is more of a supplementary enhancement rather than a fundamental solution [35][36].
- The current limitations of existing intelligent driving chips hinder the effective deployment of VLA models, raising concerns about their practical application in real-world scenarios [31][32].

Group 3: Industry Perspectives and Strategies
- Companies like Li Auto, Yuanrong, and Xiaopeng are betting on VLA, emphasizing high investment and computational intensity to pursue its development [41][42].
- In contrast, players like Huawei and Horizon are focusing on structural solutions and world models, arguing that these approaches may offer more reliable paths to achieving advanced autonomous driving [43][46].
- The ongoing debate over VLA reflects broader strategic choices within the industry, with companies prioritizing different technological pathways based on their resources and market positioning [47].
Turing Award Winner Richard Sutton: Humanity Will Usher In the "Fourth Great Era of the Universe"
Core Insights
- Richard Sutton, the 2024 Turing Award winner, emphasizes the inevitability of AI replacing human roles in the development process of humanity [1][2]
- Sutton introduces four realistic "predictive principles" regarding the future of AI, highlighting the need for decentralized collaboration and the importance of experience in learning [2][3]

Group 1: AI and Learning
- Sutton argues that current machine learning primarily focuses on transferring existing human knowledge to static AI, which lacks autonomous learning capabilities [1][2]
- He identifies the need for a new data source generated through direct interaction between intelligent agents and the world, marking the transition into an "experience era" [1][2]
- The core of intelligence lies in the ability to predict and control input signals based on experience, which is essential for the development of AI [2]

Group 2: Future of AI
- Sutton's four predictive principles include the lack of consensus on how the world operates, the potential for humans to understand and create intelligence through technology, the likelihood of superintelligent AI surpassing human intelligence, and the concentration of power and resources among the most intelligent agents [2][3]
- He posits that humanity is currently in the "replicator era" and is on the verge of entering the "design era," where AI will play a crucial role [3][4]
- Sutton encourages embracing AI as a necessary step in the evolution of the universe, advocating for courage and a spirit of adventure in facing its challenges [4]
Kimi Open-Sources Another Big Release: Middleware That Updates a Trillion Parameters in 20 Seconds
QbitAI · 2025-09-11 05:19
Core Viewpoint
- The article discusses the introduction of a middleware called "checkpoint-engine" that enables the Kimi K2 model, which has one trillion parameters, to update its model weights in approximately 20 seconds across thousands of GPUs, marking a significant advancement in the efficiency of large language model training and inference [6][7].

Group 1: Middleware Functionality
- The checkpoint-engine is designed to facilitate the updating of model weights during the inference process of large language models [6].
- It allows for both simultaneous broadcasting of updated weights to all nodes and point-to-point dynamic updates [2][24].
- The middleware supports a pipeline approach for parameter updates, minimizing memory usage by updating parameters one at a time [19][20].

Group 2: System Architecture
- Kimi K2 employs a hybrid co-location architecture where the training and inference engines are deployed on the same set of nodes [8].
- During each reinforcement learning iteration, a centralized controller generates new training data using the inference engine and then instructs the training engine to update parameters based on this data [9].
- The system is optimized for high throughput, with each engine deeply optimized for performance [10].

Group 3: Parameter Update Process
- The training engine's parameters are unloaded to DRAM, allowing for quick activation of the training engine with minimal data transfer [12].
- The checkpoint engine manages parameter states by first obtaining local parameter copies from the training engine and then broadcasting the complete parameter set to all checkpoint nodes [16][17].
- The inference engine retrieves only the necessary parameter slices from the checkpoint engine, streamlining the update process [18].

Group 4: Performance Optimization
- The design sacrifices some data transfer efficiency for a simpler system architecture, which reduces the complexity of maintenance and testing [25][26].
- During the startup of the training engine, nodes selectively read parameters from disk to minimize expensive disk I/O operations [28].
- The checkpoint engine can independently restart in case of failures, enhancing system resilience [33].
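The update flow summarized above (gather local copies from the training ranks, assemble one parameter at a time, let each inference worker keep only its slice) can be illustrated with a toy, single-process sketch. This is not the checkpoint-engine API: the function names, the list-of-floats "tensors," and the even contiguous split across workers are all hypothetical simplifications chosen for illustration.

```python
from typing import Dict, Iterator, List, Tuple

Params = Dict[str, List[float]]  # toy stand-in for named tensors


def assemble_one_at_a_time(trainer_shards: List[Params]) -> Iterator[Tuple[str, List[float]]]:
    """Yield (name, full_parameter) pairs, one parameter at a time.

    Each trainer rank is assumed to hold a contiguous shard of every
    parameter. Only one fully assembled parameter exists at any moment,
    which mirrors the memory-saving idea of a pipelined update.
    """
    for name in trainer_shards[0]:
        full: List[float] = []
        for shard in trainer_shards:
            full.extend(shard[name])
        yield name, full


def slice_for_worker(full: List[float], rank: int, world: int) -> List[float]:
    """Return the slice one inference worker keeps (hypothetical even split)."""
    per_worker = len(full) // world
    return full[rank * per_worker:(rank + 1) * per_worker]


def update_inference_workers(trainer_shards: List[Params], num_workers: int) -> List[Params]:
    """Push new weights to every worker, each keeping only its own slice."""
    workers: List[Params] = [{} for _ in range(num_workers)]
    for name, full in assemble_one_at_a_time(trainer_shards):
        for rank in range(num_workers):
            workers[rank][name] = slice_for_worker(full, rank, num_workers)
    return workers


# Two trainer ranks, each holding half of parameters "w" and "b".
shards = [{"w": [1.0, 2.0], "b": [0.1]}, {"w": [3.0, 4.0], "b": [0.2]}]
workers = update_inference_workers(shards, num_workers=2)
```

In a real deployment the assembly step would be a collective broadcast across GPUs and the slices would follow the inference engine's own sharding layout; the sketch only shows the ordering of the steps.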
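The Era of Interaction Scaling Arrives: Shanghai Innovation Institute, Fudan, and ByteDance Release AgentGym-RL, Backed by Ascend, Pioneering a New Paradigm for Agent Training
Jiqizhixin · 2025-09-11 04:53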
Core Insights
- The article emphasizes the transition of artificial intelligence from a "data-intensive" to an "experience-intensive" era, where true intelligence is derived from active exploration and experience accumulation in real environments [10][11][50].
- The introduction of the AgentGym-RL framework represents a significant advancement in training autonomous LLM agents for multi-turn decision-making, addressing the limitations of existing models that rely on single-turn tasks and lack diverse interaction mechanisms [12][50].

Group 1: Framework and Methodology
- AgentGym-RL is the first end-to-end framework for LLM agents that does not require supervised fine-tuning, supports interactive multi-turn training, and has been validated in various real-world scenarios [3][15].
- The framework integrates multiple environments and rich trajectory data, simplifying complex environment configurations into modular operations, thus facilitating effective experience-driven learning [13][19].
- The ScalingInter-RL method introduces a progressive interaction round expansion strategy, allowing agents to gradually adapt to environments and optimize their interaction patterns, balancing exploration and exploitation [4][23][25].

Group 2: Performance and Results
- The research team achieved remarkable results with a 7B parameter model, which demonstrated complex task handling skills such as understanding task objectives and planning multi-step operations after extensive interaction training [5][29].
- In various testing environments, the model not only surpassed large open-source models over 100B in size but also matched the performance of top commercial models like OpenAI o3 and Google Gemini 2.5 Pro [5][29].
- The ScalingInter-RL model achieved an overall accuracy of 26.00% in web navigation tasks, significantly outperforming GPT-4o's 16.00% and matching the performance of DeepSeek-R1-0528 and Gemini-2.5-Pro [29][30].

Group 3: Future Directions
- Future research will focus on upgrading general capabilities to enable agents to make efficient decisions in new environments and with unknown tools [51].
- The team aims to expand into more complex scenarios that closely resemble the physical world, such as robotic operations and real-world planning [52].
- There is an intention to explore multi-agent collaboration training models to unlock more complex group decision-making capabilities [52].
Turing Award Winner Richard Sutton: AI Enters the "Era of Experience," With Greater Potential Than Ever
Bei Ke Cai Jing· 2025-09-11 04:47
Core Insights
- Richard Sutton, the 2024 Turing Award winner, emphasized that the human data dividend is nearing its limit, and artificial intelligence is entering an "experience era" centered on continuous learning, which has the potential to exceed previous capabilities [1][2]

Group 1: AI and Learning
- Sutton stated that most current machine learning aims to transfer existing human knowledge to static AI, which lacks autonomous learning capabilities. He believes we are reaching the limits of human data, and existing methods cannot generate new knowledge, making continuous learning essential for intelligence [2]
- He defined "experience" as the interaction of observation, action, and reward, which is crucial for an intelligent agent's ability to predict and control its input signals. Experience is the core of all intelligence [2]

Group 2: Collaboration and Future Predictions
- Addressing fears about AI causing bias, unemployment, or even human extinction, Sutton argued that such fears are exaggerated and often fueled by those who profit from them. He highlighted that economic systems function best when individuals have different goals and abilities, similar to how decentralized collaboration among intelligent agents can lead to win-win outcomes [3]
- Sutton proposed four predictive principles for the future of AI:
  1. There is no consensus on how the world should operate, and no single view can dominate [3]
  2. Humanity will truly understand intelligence and create it through technology [3]
  3. Current human intelligence will soon be surpassed by superintelligent AI or enhanced humans [3]
  4. Power and resources will flow to the most intelligent agents [3]

Group 3: Historical Context and Future Outlook
- Sutton categorized the history of the universe into four eras: the particle era, the star era, the replicator era, and the design era. He believes humanity's uniqueness lies in pushing design to its limits, which is the goal pursued through AI today [4]
- He described AI as the inevitable next step in the evolution of the universe, urging society to embrace it with courage, pride, and a spirit of adventure [4]

Group 4: Event Overview
- The 2025 Inclusion Bund Conference, themed "Reshaping Innovative Growth," took place in Shanghai from September 10 to 13, featuring a main forum, over 40 open insight forums, global theme days, innovation stages, a technology exhibition, and various networking opportunities [4]