Large Language Models
Two LLMs Go Head to Head and Reasoning Takes Off: Cornell Team Releases a GAN-Style Training Method for Large Models
机器之心· 2025-12-07 02:52
Core Insights
- The article discusses the development of a new GAN-like training framework called PasoDoble, aimed at enhancing the reasoning capabilities of large language models (LLMs) through adversarial training without external supervision [3][41]

Group 1: PasoDoble Framework
- PasoDoble consists of two models: Proposer, which generates challenging questions with standard answers, and Solver, which attempts to solve these questions [3][9]
- The training process involves Proposer generating question-answer pairs based on knowledge sampled from a knowledge base, while Solver generates multiple answers for each question [9][10]
- The framework does not rely on any supervisory signals throughout the training process, making it a fully unsupervised method [3][7]

Group 2: Performance Improvements
- The implementation of PasoDoble has led to significant performance improvements in mathematical tasks, with Qwen3-1.7B-Base showing an average performance increase of approximately 13 percentage points and Qwen3-4B-Base showing an increase of about 16 percentage points [7][28]
- The results from various models indicate that the performance enhancement is more pronounced with larger model sizes, demonstrating the scalability of the PasoDoble approach [28][41]

Group 3: Reward Mechanism
- The Proposer's reward mechanism is designed to encourage the generation of difficult and diverse questions, with rewards based on the difficulty and novelty of the questions generated [12][13]
- The Solver's training relies solely on correctness rewards, where each answer generated is compared to the standard answer provided by the Proposer [22][23]
- The effectiveness of the reward mechanisms is highlighted by the significant performance differences observed when using random rewards compared to the structured rewards from the PasoDoble framework [35][37]

Group 4: Experimental Results
- The article presents detailed experimental results across various mathematical benchmarks, showing that PasoDoble significantly enhances model performance, particularly in competitive math tasks [28][29]
- The results indicate that models trained with PasoDoble consistently outperform baseline models, with notable improvements in accuracy across different benchmarks [28][34]

Group 5: Future Directions
- Future research will explore extending the PasoDoble framework to other domains beyond mathematics, such as code generation and factual question answering, and investigate broader multi-model training paradigms [41]
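To make the Proposer/Solver interplay above concrete, here is a minimal Python sketch of one unsupervised training round. It follows the description in the summary (difficulty plus novelty reward for the Proposer, correctness-only reward for the Solver), but the exact reward definitions, the equal weighting, and all names are illustrative assumptions rather than the paper's actual implementation.

```python
# Minimal sketch of one PasoDoble-style training round as described above.
# The reward definitions (difficulty = 1 - Solver pass rate, binary novelty),
# the equal weighting, and all names are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class QAPair:
    question: str
    answer: str  # the "standard answer" emitted by the Proposer

def solver_reward(samples: List[str], standard_answer: str) -> List[float]:
    """Correctness-only reward: 1.0 if a sampled answer matches the Proposer's
    standard answer, else 0.0 (exact string match stands in for a verifier)."""
    return [1.0 if s.strip() == standard_answer.strip() else 0.0 for s in samples]

def proposer_reward(question: str, solver_rewards: List[float],
                    seen_questions: List[str]) -> float:
    """Reward hard and novel questions: difficulty is the Solver failure rate,
    novelty is 1.0 unless the exact question was proposed before."""
    difficulty = 1.0 - sum(solver_rewards) / max(len(solver_rewards), 1)
    novelty = 0.0 if question in seen_questions else 1.0
    return 0.5 * difficulty + 0.5 * novelty  # assumed equal weighting

def training_round(propose: Callable[[str], QAPair],
                   solve: Callable[[str], List[str]],
                   knowledge_snippet: str,
                   seen_questions: List[str]) -> Tuple[float, List[float]]:
    """One fully unsupervised round: the Proposer writes a QA pair from sampled
    knowledge, the Solver answers it several times, and both are scored
    without any external labels."""
    pair = propose(knowledge_snippet)      # question + standard answer
    samples = solve(pair.question)         # multiple Solver attempts
    r_solver = solver_reward(samples, pair.answer)
    r_proposer = proposer_reward(pair.question, r_solver, seen_questions)
    seen_questions.append(pair.question)
    return r_proposer, r_solver
```

In the actual framework these scalar rewards would presumably drive a policy-gradient update for each model; the sketch stops at reward computation.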
Taking Li Auto as an Example: Tracing the Evolutionary History of the Autonomous Driving "Brain" - An Analysis of the VLA Architecture
自动驾驶之心· 2025-12-07 02:05
Author | 我要吃鸡腿  Editor | 自动驾驶之心  Original link: https://zhuanlan.zhihu.com/p/1965839552158623077

In the rapidly iterating field of autonomous driving, technical paradigms turn over at a dizzying pace. The year before last, the industry spoke of little but BEV (bird's-eye view); last year, "end-to-end" became the new technical high ground. Yet each paradigm, while solving old problems, seems to breed new challenges of its own.

Traditional "end-to-end" autonomous driving, i.e. the VA (Vision-Action) model, exposes a deep contradiction: it is like a highly skilled but taciturn veteran driver. Drawing on an "intuition" trained from massive data, it can execute astonishingly smooth maneuvers in complex traffic. But when you sit in the passenger seat, your heart skips a beat, and you ask, "Why did you suddenly slow down just now?", it cannot answer.

This is the "black box" problem: the system can "do the right thing," but we do not know "why it did the right thing." Being unable to explain or communicate, it creates a serious crisis of trust.

[Figure] The three paradigm evolutions of autonomous driving. (a) ...
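The VA "black box" and a language-capable pipeline contrasted in the excerpt can be pictured as two inference flows: one maps camera frames straight to a trajectory, the other inserts a language reasoning step whose output can also be shown to the passenger. The sketch below is only a schematic reading of that contrast; the class names, method signatures, and the idea of returning a textual rationale alongside the action are assumptions for illustration, not Li Auto's or the article's actual architecture.

```python
# Schematic contrast between a VA (vision-action) pipeline and a VLA-style
# (vision-language-action) pipeline, as discussed in the excerpt.
# All classes, signatures, and returned values are illustrative assumptions.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Trajectory:
    waypoints: List[Tuple[float, float]]  # planned (x, y) points in meters

class VADriver:
    """End-to-end 'black box': camera frames in, trajectory out, no explanation."""
    def drive(self, frames: List[bytes]) -> Trajectory:
        # stand-in for a learned vision-to-action policy
        return Trajectory(waypoints=[(0.0, 0.0), (1.0, 0.2)])

class VLADriver:
    """Adds an intermediate language/reasoning step whose textual rationale is
    kept alongside the action, so the system can answer 'why did you brake?'."""
    def perceive(self, frames: List[bytes]) -> str:
        return "pedestrian stepping off the curb 12 m ahead"  # stand-in scene summary

    def reason(self, scene: str) -> str:
        return f"Slowing down because of: {scene}."

    def act(self, rationale: str) -> Trajectory:
        return Trajectory(waypoints=[(0.0, 0.0), (0.6, 0.1)])  # gentler plan

    def drive(self, frames: List[bytes]) -> Tuple[Trajectory, str]:
        scene = self.perceive(frames)
        rationale = self.reason(scene)
        return self.act(rationale), rationale  # action plus human-readable reason
```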
First in China: Alibaba Has 146 Papers Accepted at Top AI Conference NeurIPS 2025
Cai Jing Wang· 2025-12-05 09:02
NeurIPS is one of the most influential top conferences in artificial intelligence, having produced milestone research such as the Transformer and AlexNet. This year, the world's leading technology companies and institutions, including Google, Microsoft, OpenAI, Alibaba, and MIT, submitted more than 20,000 papers, of which only about 25% were accepted. Statistics show that Google, Microsoft, Meta, and Alibaba are the four technology companies with the most papers.

Reportedly, Alibaba's Qwen has open-sourced more than 300 models covering all modalities and sizes, with over 700 million downloads worldwide and more than 180,000 derivative models, ranking first globally. In Gartner's emerging-market quadrant reports covering four dimensions (GenAI cloud infrastructure, GenAI engineering, GenAI models, and AI knowledge management applications), Alibaba Cloud is placed in the Emerging Leaders quadrant in all four, the only Asia-Pacific vendor selected across all four.

According to the company, the 146 accepted papers span model training frameworks, datasets and foundational model research, and model inference optimization, demonstrating Alibaba's innovations across the full AI stack.

On December 5 it was reported that NeurIPS 2025, a top international conference in artificial intelligence, opened in San Diego, USA. Alibaba had 146 papers accepted at this year's conference, the most of any Chinese company. Among them, Qwen's work on gated attention mechanisms was named a best paper, making Alibaba the only Chinese company to win the award.

In training ...
Doubao Releases Speech Recognition Model 2.0, Supporting Multimodal Visual Recognition and Recognition of 13 Overseas Languages
Mei Ri Jing Ji Xin Wen· 2025-12-05 08:10
Core Viewpoint
- The article reports the official launch of Doubao-Seed-ASR-2.0, a voice recognition model by Huoshan Engine, which enhances contextual understanding and recognition accuracy through advanced technology [1]

Group 1: Model Features
- The 2.0 version of the model has improved inference capabilities, achieving a 20% increase in overall keyword recall rate [1]
- It supports multimodal visual recognition, allowing the model to understand both audio and visual inputs, thereby enhancing text recognition accuracy [1]
- The model can recognize 13 foreign languages, including Japanese, Korean, German, and French [1]

Group 2: Targeted Upgrades
- The model has been specifically upgraded to handle complex scenarios involving proper nouns, personal names, geographical names, brand names, and easily confused homophones [1]
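The keyword-recall and homophone improvements listed above are, in ASR systems generally, often approached through contextual biasing (hotword boosting), where candidate transcripts are rescored against a user-supplied list of names and terms. The sketch below illustrates only that generic idea; it is not Doubao-Seed-ASR-2.0's actual mechanism or API, and the boost weight and function names are assumptions.

```python
# Generic hotword-boosted rescoring of ASR n-best hypotheses.
# Illustrates the idea behind "keyword recall" improvements in general;
# NOT the Doubao-Seed-ASR-2.0 implementation, and the boost weight is assumed.
from typing import List, Tuple

def rescore_with_hotwords(nbest: List[Tuple[str, float]],
                          hotwords: List[str],
                          boost: float = 2.0) -> List[Tuple[str, float]]:
    """nbest: (transcript, score) pairs, higher score = better.
    Each hotword occurrence in a hypothesis adds `boost` to its score."""
    rescored = []
    for text, score in nbest:
        bonus = sum(boost for w in hotwords if w in text)
        rescored.append((text, score + bonus))
    return sorted(rescored, key=lambda p: p[1], reverse=True)

if __name__ == "__main__":
    hypotheses = [
        ("we met doctor zhang wei at the clinic", 10.0),
        ("we met doctor jiang way at the clinic", 10.3),  # homophone confusion
    ]
    # user-supplied context list (e.g. personal names, brand names)
    print(rescore_with_hotwords(hypotheses, hotwords=["zhang wei"]))
```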
Song Yang of Zhixing Technology (知行科技): Relying on a Vast Industrial Base and Abundant Scenarios, China Can Be the First to Achieve More Breakthroughs in AI
Xin Lang Cai Jing· 2025-12-05 08:07
The 2025 New Automobile Cooperation Ecosystem Exchange Conference was held in Suzhou on December 5-6. Song Yang, founder and CEO of Zhixing Technology (知行科技), attended and delivered a speech.

Song Yang said that in the short term, China produces 30 million vehicles a year, whereas only about 500,000 embodied-intelligence robots were produced last year, one sixtieth of the automotive figure. In the long term, an autonomous vehicle is essentially a wheeled robot, a branch of robotics, and over the long run their numbers will be enormous. How the two industries integrate and develop, both at the current stage and over the long term, is a question of balancing the short term against the long term.

He pointed out that large language models, used as foundation models, face a gap in the middle whether they are applied to autonomous driving or to robots. Take multimodal VLA and robots as an example: an action performed in one room is hard to generalize to a different scene, and closing that gap requires costly data collection. Likewise, adding a dimension such as gravity to a world model makes the required compute and cost rise sharply, with electricity and heat dissipation issues behind that; these are problems the whole industry faces.

Even so, he is optimistic. He believes that by relying on its vast industrial base and abundant scenarios, and by using those scenarios and data to drive artificial intelligence forward so that industry pulls AI along, China can be the first to achieve more breakthroughs in AI.
AI Is Not a Stochastic Parrot: How Do We Deal with AI That "Has Its Own Opinions"?
Guan Cha Zhe Wang· 2025-12-05 02:12
The year 2025 can be called the inaugural year of Chinese large language models: DeepSeek burst onto the scene, quickly whipped up a global storm, and even stole OpenAI's limelight.

Facing an interviewer with a humanities background, Terrence explained in plain language why AI technology, after 60 years of research, has only in the past few years advanced by leaps and bounds to become so powerful and widely used. With a scientist's passion and enthusiasm, he dispelled one by one our worries that large language models "fabricate facts," carry bias, take human jobs, or might even threaten humanity in the future. He also recounted how, in the 1980s, he and other scientists challenged the authorities of linguistics and mathematics by writing the connections within language as equations, thereby laying the foundation for large language models, and he used the story to encourage young people not to fear authority in scientific exploration: "When experts tell you something is impossible, don't listen to them."

In addition, regarding the AI regulation bills being rolled out by country after country, he repeatedly stressed that large language models are still at an early stage, and that regulating too early and in too much detail would hamper scientific progress. Only when a new technology is used at scale can scientists discover, through trial and error, ways to solve its problems. Below is the transcript of our conversation.

观察者网 (Guancha.cn) [Thinkers' Teahouse] connects with Terrence Sejnowski

[Dialogue / Guancha.cn, Gao Yanping]

Behind OpenAI's overnight fame lie 60 years of accumulation

Guancha.cn: Since the start of this year, with the spread of applications such as DeepSeek, we have begun to use AI tools more frequently. As a result, questions about ...
The World Is Too Small for All the World Models to Share
36Ke· 2025-12-04 09:29
Core Insights
- The AI industry is experiencing a chaotic evolution of "world models," with various interpretations and definitions emerging from leading figures in the field, all agreeing that world models are essential for achieving AGI [2][20][22]
- The concept of world models has expanded significantly, encompassing a wide range of technologies and applications, from embodied intelligence to video generation and 3D modeling [18][20]

Group 1: Definition and Evolution of World Models
- The term "world model" refers to the ability of AI to understand external world rules and predict changes, rather than a specific technical path [3][6]
- The idea of world models dates back to 1943 with Kenneth Craik's "mental models," which posited that the brain constructs miniature models of the external world for prediction [4]
- The modern framework for neural network world models was established by Jürgen Schmidhuber in 2018, defining a structure that includes visual and memory components [4]

Group 2: Different Approaches to World Models
- Current world models can be categorized into two main schools: the Representation school, which focuses on abstract state predictions, and the Generation school, which aims to reconstruct and simulate visual worlds [6][13]
- Yann LeCun represents the Representation school, emphasizing a minimalist approach that predicts abstract states rather than visual details [7][9]
- The Generation school, exemplified by OpenAI's Sora, focuses on creating visual simulations and understanding physical laws through video generation [13][14]

Group 3: Emerging Technologies and Concepts
- Interactive Generative Video (IGV) represents an advanced form of the Generation school, allowing real-time user interaction with generated environments, as seen in Google DeepMind's Genie 3 [14]
- Li Fei-Fei's concept of "Spatial Intelligence" aims to create a persistent, downloadable 3D environment, represented by the Marble project, which focuses on high-precision physical accuracy [16]
- The rise of world models is driven by a collective anxiety in the AI industry regarding the limitations of large language models (LLMs) and a shift towards understanding and simulating the physical world [23][20]
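The Representation-versus-Generation split summarized above comes down to what the model is trained to predict: the next abstract latent state, or the next frame in pixel space. The toy sketch below illustrates that difference with two tiny PyTorch modules; the layer sizes, the 4-dimensional action vector, and the JEPA-flavored stop-gradient latent loss are simplifying assumptions, not any lab's published architecture.

```python
# Toy contrast between a representation-style world model (predicts the next
# *latent* state) and a generation-style one (predicts the next frame in pixel
# space). Shapes, losses, and module sizes are illustrative assumptions.
import torch
import torch.nn as nn

LATENT, FRAME = 64, 32 * 32 * 3  # assumed latent size and flattened RGB frame

class LatentWorldModel(nn.Module):
    """Representation school: encode observations, predict the next latent."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(FRAME, 256), nn.ReLU(), nn.Linear(256, LATENT))
        self.predictor = nn.Sequential(nn.Linear(LATENT + 4, 128), nn.ReLU(), nn.Linear(128, LATENT))

    def loss(self, obs, action, next_obs):
        z = self.encoder(obs)
        z_next = self.encoder(next_obs).detach()            # stop-gradient target
        z_pred = self.predictor(torch.cat([z, action], dim=-1))
        return nn.functional.mse_loss(z_pred, z_next)       # error in latent space

class PixelWorldModel(nn.Module):
    """Generation school: predict the next observation in pixel space."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(FRAME + 4, 512), nn.ReLU(), nn.Linear(512, FRAME))

    def loss(self, obs, action, next_obs):
        frame_pred = self.net(torch.cat([obs, action], dim=-1))
        return nn.functional.mse_loss(frame_pred, next_obs)  # error per pixel

if __name__ == "__main__":
    obs = torch.rand(8, FRAME)      # batch of flattened frames
    action = torch.rand(8, 4)       # assumed 4-dim action vector
    next_obs = torch.rand(8, FRAME)
    print(LatentWorldModel().loss(obs, action, next_obs).item())
    print(PixelWorldModel().loss(obs, action, next_obs).item())
```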
Southern Power Grid Energy Institute | Business Updates (Issue No. 53)
Xin Lang Cai Jing· 2025-12-03 13:25
Group 1
- The strategic department director Zhang Xuan and postdoctoral researcher He Binghao attended the 13th meeting of the China-Germany Energy Working Group at the Global Energy Transition Forum, reviewing the 2024 work results and 2025 work plan, focusing on carbon capture, utilization, and storage, as well as power system flexibility [1]
- Senior researcher Yuan Kanglong presented on "Research on the Enhancement of Southern Power Grid's Backbone Network Planning" at the 2025 National Power Grid Technology Exchange Conference, discussing the construction history and effectiveness of the backbone network [3]
- The main network and system departments participated in a survey on "Key Equipment Technology and Engineering Applications of Flexible DC Grids," engaging with institutions like Zhejiang University and Tsinghua University to discuss foundational stability theories and key equipment development [5]

Group 2
- Researcher Wang Haijin presented a report on "Key Technologies for Electricity-Carbon Accounting Based on Large Language Models" at the 6th International Forum on New Power Systems, highlighting the potential of advanced AI tools in improving the accuracy and efficiency of carbon accounting [7]
- Researcher Li Yan discussed the planning layout and demonstration effectiveness of the new power system demonstration area in the southern region at the 2025 Autumn Academic Annual Meeting of "China Electric Power" [9]
- Researcher Wang Haijin elaborated on the methodology of electricity-carbon accounting driven by large language models at the IEEE International Conference on Energy Engineering and Power Systems [12]

Group 3
- The Guangzhou Electric Power Design Institute won three awards at the National Excellent Engineering Survey and Design Award, marking its first participation in this authoritative industry evaluation [10]
- The main network department participated in the 13th International Conference on Power System Control, Operation, and Management, sharing innovative results and practical experiences in power grid planning [8]
- The distribution network department conducted research on the "Electric Hong Internet of Things Operating System," focusing on digital architecture and smart terminal technologies to support the distribution network's planning [12]

Group 4
- The 2025 Standard Design and Typical Cost System Document Review Meeting was held in Guangzhou, aiming to provide a scientific and unified technical basis for the planning, construction, and operation management of the Southern Power Grid [13]
- The innovation management team from the enterprise management department visited Jiangsu Industrial Technology Research Institute to discuss typical experiences in traditional industry transformation [14]
- The Yulin Power Supply Bureau engaged in discussions with the Southern Power Grid Energy Institute on the transformation requirements of new distribution systems [16]

Group 5
- The investment department director Wu Hongliang and senior researcher Yang Yin held discussions with the deputy dean of Peking University's School of Urban Planning and Design on topics including the impact of ultra-fast charging technology on grid risks [19]
- Researcher Wang Fengyun spoke at the 32nd China International Power Equipment and Technology Exhibition, discussing the role of hydrogen energy in new power systems [18]
- Researcher Liu Ziyi participated in a preparatory meeting for the Global Sustainable Transportation Innovation Alliance, discussing green transformation and international carbon tax [21]
Tencent Vice President Jiang Jie: AI Is Improving Efficiency at Every Step of Advertising, and Tencent Will Bring in More AI Talent
36氪未来消费· 2025-12-03 12:50
Core Insights
- Tencent's advertising revenue growth reached 21% in Q3, marking the highest increase in six quarters, driven by improved ad loading rates and AI-driven ad targeting [2]
- The AIM+ smart advertising product suite significantly reduces operational tasks for advertisers, with an 80% decrease in required actions for ad spending and a 47% reduction in creative operations [2]
- Tencent's capital expenditure on AI is projected to grow by 221% in 2024, indicating a strong commitment to AI investments [2]

Group 1: AI and Advertising Efficiency
- AI is enhancing every aspect of advertising efficiency, including recommendation, creativity, and placement [7]
- The current ad loading rate for Tencent's video ads is approximately 4%, significantly lower than the industry average of 10%-15%, reflecting Tencent's cautious approach to commercialization [6]
- AI optimization has reportedly increased the click-through rate of some ad inventories to around 3.0%, a significant improvement from historical averages [10]

Group 2: Talent Acquisition and Competition
- The demand for AI talent is surging, with new AI job postings increasing over tenfold in the first half of 2025, highlighting a competitive landscape for skilled professionals [3][4]
- Tencent ranked fifth in new AI job postings among companies, with ByteDance, Xiaohongshu, Alibaba, and Ant Group leading the list [4]
- The "Tencent Advertising Algorithm Competition" attracted over 8,400 participants from nearly 30 countries, showcasing the company's efforts to recruit top talent [4]

Group 3: Future of Advertising Roles
- The role of advertising optimization specialists is evolving; they will focus more on creative aspects rather than traditional bidding and pricing tasks, as AI systems take over these functions [8]
- Future advertising systems will incorporate generative AI to address cold start problems, moving away from traditional discriminative models [7]
- The integration of AI in advertising will blur the lines between ads and native content, emphasizing the importance of original creativity [8]

Group 4: Technological Advancements
- Tencent is exploring advanced technologies, including large language models and multi-modal capabilities, to enhance advertising effectiveness [12][13]
- The company is investing in refining AI models to improve efficiency and reduce costs in generating advertising content [10]
- The future of advertising will involve real-time generation of interactive ad materials based on user interests, enhancing user engagement [11]
Latest Paper from Sun Yat-sen University Makes Cell's Headlines
生物世界· 2025-12-03 10:00
Core Insights
- The study demonstrates that large language models (LLMs) can significantly assist physicians in overcoming technical barriers in medical AI research, with project completion rates increasing from 25% to 87% when using LLMs [11][12]
- Despite the benefits, the research highlights potential risks associated with LLMs, including the possibility of dependency and the phenomenon of "hallucination" where AI may generate incorrect information [8][12]

Study Overview
- The research titled "The effectiveness of large language models in medical AI research for physicians: A randomized controlled trial" was published on November 26, 2025, in Cell Reports Medicine [4]
- Conducted by a team from Sun Yat-sen University, the study involved a randomized controlled trial with 64 primary ophthalmologists, assessing the effectiveness of LLMs in an "automated cataract identification" project [6][7]

Results
- The intervention group using ChatGPT-3.5 had a total project completion rate of 87.5%, compared to 25.0% in the control group, and a non-assisted completion rate of 68.7% versus 3.1% [7]
- After a washout period, 41.2% of successful intervention participants were able to complete new projects independently without LLM support [7]
- Concerns were raised among participants, with 42.6% worried about mindlessly repeating AI-generated information and 40.4% fearing that AI could foster lazy thinking [7]

Conclusion
- The study concludes that while LLMs can democratize medical AI research and help physicians navigate technical challenges, the long-term risks associated with their use, such as dependency, require further investigation [8][12]