Large Language Models
Fudan University's Deng Jianguo: the future is a world of human-machine coexistence, and the university's mission is to help people become better people
Xin Lang Cai Jing· 2025-12-08 12:31
Core Insights
- The future of human-machine coexistence is an inevitable trend. Universities should move beyond traditional knowledge transmission to cultivate meta-knowledge, tacit knowledge, and practical knowledge, while strengthening human empathy and collaborative abilities to meet the challenges posed by changing forms of communication [3][7]

Group 1: AI Development and Challenges
- Artificial intelligence rests on three elements: chips, data, and algorithms. Driven by Moore's Law, massive data generated by mobile sensors combined with powerful chip-level analysis has enabled the emergence of large language models [3][7]
- Large language models have a critical shortcoming: lacking physical embodiment, they cannot supply variables essential to human communication, such as gender, age, and region, making it difficult to establish stable trust relationships [3][7]

Group 2: Human Interaction and Education
- Even when AI takes physical form, humans still desire real-life interactions and connections. The essence of human learning and communication is a multi-faceted, multi-channel, social process that artificial voice or online interaction alone cannot satisfy [3][7]
- AI may replace certain types of knowledge production and some cognitive tasks, but human empathy and collaborative abilities, rooted in carbon-based life, remain irreplaceable core competencies [4][8]

Group 3: The Role of Universities
- The mission of universities is to help individuals become better people in a future of human-machine coexistence, maintaining their social and practical character while cultivating core knowledge and unique abilities [4][8]
- Facing the communication changes and knowledge iteration brought by AI, universities must help humanity preserve core values amid competition and collaboration with technology, achieving harmonious coexistence between humans and machines [4][8]
DeepSeek releases two models: one a "taciturn assistant", the other a "lopsided genius"
Ke Ji Ri Bao· 2025-12-08 10:03
Core Insights
- DeepSeek has released two new models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, which have drawn attention for their performance relative to leading models such as OpenAI's GPT-5 and Google's Gemini 3 Pro [1][2]

Model Features
- DeepSeek-V3.2 is positioned as a high-efficiency assistant with strong reasoning and agent capabilities, aimed at automating complex tasks such as report generation and coding [2]
- DeepSeek-V3.2-Speciale focuses on high-difficulty mathematical problems and academic research, pushing the limits of open-source model reasoning [2]

Technological Innovations
- The new models incorporate two significant breakthroughs: DeepSeek Sparse Attention (DSA) and thinking during tool invocation [2]
- DSA improves efficiency by letting the model attend only to the most relevant information, reducing resource consumption [2]
- Thinking during tool invocation enables multi-round reasoning interleaved with tool use, allowing the model to think, execute, and iterate on tasks autonomously [2]

Market Positioning
- The release aims to narrow the performance gap between open-source and closed-source models, giving open-source development a competitive edge [3][4]
- DeepSeek's focus on practicality and generalization is intended to put pressure on closed-source vendors, turning aspirations into competitive realities [4]

Community Engagement
- DeepSeek has updated its official web platform, app, and API to the new version, while the Speciale variant is currently available only through a temporary API for community evaluation [4]
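The summary describes DSA as letting the model retrieve only the most relevant information to cut resource consumption. A common mechanism with exactly that behavior is top-k sparse attention, sketched below as a toy in pure Python. Everything here (function name, dimensions, the top-k rule itself) is my own simplification for illustration, not DeepSeek's actual implementation:

```python
import math

def sparse_attention(query, keys, values, k=2):
    """Toy top-k sparse attention: score every key, but attend only to
    the k most relevant ones, skipping the rest entirely.
    (Illustrative only -- not DeepSeek's actual DSA.)"""
    # Score each key against the query (dot product).
    scores = [sum(q * kk for q, kk in zip(query, key)) for key in keys]
    # Keep only the indices of the k highest-scoring keys.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected scores only.
    exps = {i: math.exp(scores[i]) for i in top}
    z = sum(exps.values())
    weights = {i: e / z for i, e in exps.items()}
    # Weighted sum of the selected values; unselected keys cost nothing further.
    dim = len(values[0])
    return [sum(weights[i] * values[i][d] for i in top) for d in range(dim)]

out = sparse_attention([1.0, 0.0],
                       keys=[[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [0.0, 1.0]],
                       values=[[1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [2.0, 2.0]],
                       k=2)
```

The point of the sketch is the cost profile: the weighted sum at the end touches only the k selected rows, which is where the resource savings come from when sequences are long.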
Models can race, but compute must burn! UBS: as AI giants roll out new models in quick succession, compute investment will keep climbing
智通财经网· 2025-12-08 09:54
Core Insights
- UBS highlights recent AI advances, with new large language models (LLMs) launched by Google, Anthropic, and DeepSeek intensifying competition across the industry [1]
- The report stresses the continued relevance of "scaling laws" for model performance, indicating that computational power will remain a decisive factor in the AI competitive landscape [1]

Model Performance
- The latest generation of models shows significant breakthroughs: Gemini 3 Deep Think and Claude Opus 4.5 scored 45% and 38% respectively on multi-step reasoning tasks, surpassing previous models that scored between 10% and 20% [2]
- This performance is consistent with pre-training scaling laws, under which increased computational investment yields non-linear improvements in model capability [2]

Chip Technology Competition
- Google's Gemini 3 Pro is trained entirely on self-developed TPU chips, sparking discussion of the competition between GPUs and AI-specific ASIC chips [2]
- ASIC chips offer higher efficiency on specific AI tasks, while GPUs retain a 90% share of the data-center chip market thanks to their flexible architecture and extensive software ecosystem [2]
- Collaborations between OpenAI and Broadcom, and between Anthropic and Google, are expected to sharpen the focus on ASICs, with both chip types anticipated to coexist [2]

Market Trends
- Next-generation chips such as NVIDIA's Blackwell and Rubin are expected to sustain the race for computational expansion, prompting UBS to revise its AI capital-expenditure forecasts upward [3]
- Advances from Google, Anthropic, and DeepSeek are increasing competitive pressure on companies like OpenAI, pushing the AI industry toward a multi-model, multi-vendor landscape, a trend expected to persist at least through 2026 [3]
The mystery of unstable RL training for LLMs, unraveled by the Qwen team from a "first-order approximation" perspective
机器之心· 2025-12-07 04:33
Core Insights
- Reinforcement Learning (RL) has become a key technology paradigm for enhancing the complex reasoning and problem-solving capabilities of Large Language Models (LLMs) [2]
- The main challenge in RL for LLMs is the mismatch between sequence-level rewards and token-level optimization objectives, raising concerns about theoretical soundness and training stability [2][5]
- A new RL formulation proposed by Alibaba's Qwen team optimizes the expected sequence-level reward using a surrogate token-level objective as its first-order approximation [2][11]

Methodology
- The team models an autoregressive LLM as a policy π_θ and works with sequence-level rewards, assigning a scalar reward R(x, y) to the entire response y [6]
- They avoid value-function methods because constructing a general, scalable, and reliable value model is difficult [7]
- Directly optimizing the expected sequence-level reward is challenging because of numerical differences between training and inference [9]

Key Findings
- The team ran extensive experiments with a 30-billion-parameter MoE model, consuming hundreds of thousands of GPU hours [4]
- On-policy training with importance-sampling correction achieved the highest training stability [10]
- In off-policy updates, both clipping and Routing Replay are essential for stability; removing either degrades performance [23]

Experimental Results
- The MiniRL algorithm, which incorporates importance sampling, delivered the best performance and stability during training [22]
- Removing the importance-sampling correction led to rapid collapse and a sharp drop in entropy, confirming its critical role in the first-order approximation [22]
- Different cold-start initialization methods yielded similar final performance, indicating that attention should focus on the RL methods themselves rather than initialization details [27]
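The ingredients named above (a scalar sequence-level reward R(x, y), a token-level surrogate, importance-sampling correction, and clipping) can be sketched as a generic PPO-style objective. This is my own simplified sketch of those ingredients, not the team's actual MiniRL loss:

```python
import math

def surrogate_objective(logp_new, logp_old, reward, clip_eps=0.2):
    """Token-level surrogate for a sequence-level reward R(x, y): every
    token in the response shares the same scalar reward, and each token's
    importance ratio pi_new/pi_old is clipped so off-policy updates stay
    stable.  A generic PPO-style sketch, not the paper's exact formulation."""
    total = 0.0
    for lp_new, lp_old in zip(logp_new, logp_old):
        ratio = math.exp(lp_new - lp_old)          # importance-sampling correction
        clipped = max(1 - clip_eps, min(1 + clip_eps, ratio))
        # Pessimistic bound: take the worse of the raw and clipped terms.
        total += min(ratio * reward, clipped * reward)
    return total / len(logp_new)

# On-policy case: old and new policies agree, ratios are 1, and the
# surrogate reduces to the raw sequence reward.
obj = surrogate_objective([-1.0, -2.0], [-1.0, -2.0], reward=1.0)
```

The clipping term is what the ablations above probe: with it removed, a single token whose ratio drifts far from 1 can dominate the update, which matches the reported rapid collapse when the correction machinery is taken away.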
Will OpenAI be the first AI unicorn to fail?
Xin Lang Cai Jing· 2025-12-07 03:39
The AI battle is an ecosystem battle

Author | Lü Jingzhi  Source | 融中财经

On November 20, two days after Gemini 3 launched, OpenAI was voted "the second most likely AI unicorn to fail" at the Cerebral Valley AI Summit, an event known as the "Spring Festival Gala of Silicon Valley investors."

The same day, Sam Altman circulated an internal memo acknowledging that OpenAI had fallen behind Google in pre-training.

Ten days later, on December 1, Altman sent another all-hands internal letter, this time in sharper language, declaring an internal "code red": halting advertising commercialization and all AI agent efforts, and redirecting everyone's attention to improving ChatGPT's performance. Prominent Silicon Valley investor Deedy Das commented on X that in the fifteen days since Gemini 3 went live, ChatGPT's average daily visits had fallen by about 12 million, which he called the real reason OpenAI sounded the red alert.

As Google presses its pursuit, users and investors are starting to realize that the AI battle is not just over user data or rapid commercialization, but a long-term battle over ecosystems.

ChatGPT, robbed of 10 million in traffic by Google

On the sixteenth day after Gemini 3's launch, word emerged that OpenAI would release a new large model to mount a counterattack.

According to the latest report from The Information, OpenAI appears in recent weeks to have fallen behind in the AI development race ...
Two LLMs face off and reasoning takes flight: Cornell team unveils a GAN-style training method for large models
机器之心· 2025-12-07 02:52
Core Insights
- The article introduces PasoDoble, a new GAN-like training framework aimed at enhancing the reasoning capabilities of large language models (LLMs) through adversarial training without external supervision [3][41]

Group 1: PasoDoble Framework
- PasoDoble consists of two models: a Proposer, which generates challenging questions with reference answers, and a Solver, which attempts to solve them [3][9]
- During training, the Proposer generates question-answer pairs from knowledge sampled out of a knowledge base, while the Solver generates multiple answers for each question [9][10]
- The framework relies on no supervisory signal at any point in training, making it a fully unsupervised method [3][7]

Group 2: Performance Improvements
- PasoDoble yields significant gains on mathematical tasks: Qwen3-1.7B-Base improves by roughly 13 percentage points on average and Qwen3-4B-Base by about 16 percentage points [7][28]
- Gains are more pronounced at larger model sizes, demonstrating the scalability of the approach [28][41]

Group 3: Reward Mechanism
- The Proposer's reward encourages difficult and diverse questions, scoring the difficulty and novelty of what it generates [12][13]
- The Solver trains solely on correctness rewards, with each answer compared against the Proposer's reference answer [22][23]
- Replacing the structured rewards with random rewards produces markedly worse results, underscoring the importance of the reward design [35][37]

Group 4: Experimental Results
- Detailed results across mathematical benchmarks show that PasoDoble significantly enhances model performance, particularly on competitive math tasks [28][29]
- Models trained with PasoDoble consistently outperform baselines, with notable accuracy improvements across benchmarks [28][34]

Group 5: Future Directions
- Future work will extend PasoDoble to domains beyond mathematics, such as code generation and factual question answering, and investigate broader multi-model training paradigms [41]
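The reward split described above can be sketched in a few lines: the Solver earns a binary correctness reward against the Proposer's reference answer, while the Proposer is rewarded for questions that are hard but not unsolvable, and novel. The shaping below is hypothetical, written to illustrate the idea rather than reproduce the paper's exact formulas:

```python
def solver_reward(answer, reference):
    """Binary correctness reward against the Proposer's reference answer."""
    return 1.0 if answer.strip() == reference.strip() else 0.0

def proposer_reward(solver_answers, reference, seen_questions, question):
    """Reward hard-but-solvable, novel questions (illustrative shaping only).
    Difficulty: a low Solver pass rate is good, but an unsolved question
    (pass rate 0) earns no difficulty credit, so the Proposer cannot win
    by posing unfair or broken problems."""
    passes = sum(solver_reward(a, reference) for a in solver_answers)
    pass_rate = passes / len(solver_answers)
    difficulty = (1.0 - pass_rate) if pass_rate > 0 else 0.0
    novelty = 0.0 if question in seen_questions else 1.0
    return difficulty + novelty

# Half the Solver's samples are right -> moderate difficulty, plus novelty.
r = proposer_reward(["4", "5", "4", "3"], "4",
                    seen_questions={"2+3?"}, question="2+2?")
```

The "solvable" guard is the adversarial balance point: without it, the Proposer's easiest winning strategy would be unanswerable questions, and the Solver would receive no learning signal at all.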
The evolution of the autonomous-driving "brain", with Li Auto as a case study - a VLA architecture analysis
自动驾驶之心· 2025-12-07 02:05
Author | 我要吃鸡腿  Editor | 自动驾驶之心

Original link: https://zhuanlan.zhihu.com/p/1965839552158623077

In autonomous driving, a field iterating at breakneck speed, technical paradigms change faster than one can follow. The year before last, the industry talked of nothing but BEV (bird's-eye view); last year, "end-to-end" became the new high ground. Yet each paradigm, even as it solves old problems, seems to breed new challenges.

Traditional "end-to-end" autonomous driving, the VA (Vision-Action) model, exposes a deep contradiction: it is like a highly skilled but taciturn veteran driver. Drawing on an "intuition" trained from massive data, it can pull off astonishingly smooth maneuvers in complex traffic. But when you, sitting in the passenger seat with your heart skipping a beat, ask "why did you suddenly brake just now?", it has no answer.

This is the "black box" problem: the system can do the right thing, but we do not know why it did it. This inability to explain or communicate creates an enormous crisis of trust.

The three-paradigm evolution of autonomous driving. (a) ...
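The contrast between a mute VA model and an explainable VLA (Vision-Language-Action) model can be made concrete with a schematic sketch. The classes, thresholds, and hand-written rules below are invented stand-ins for learned components; the only point is the interface difference, an action alone versus an action plus a language rationale:

```python
from dataclasses import dataclass

@dataclass
class Observation:
    obstacle_distance_m: float   # from a (stubbed) vision module
    obstacle_kind: str

def va_policy(obs):
    """VA (Vision-Action): maps perception straight to control -- no rationale."""
    return {"brake": 0.8} if obs.obstacle_distance_m < 10 else {"brake": 0.0}

def vla_policy(obs):
    """VLA (Vision-Language-Action): same control output, plus a language
    trace explaining *why* -- the interpretability the article calls for.
    (Hand-written rules standing in for a learned model.)"""
    action = va_policy(obs)
    if action["brake"] > 0:
        rationale = (f"Braking: {obs.obstacle_kind} detected "
                     f"{obs.obstacle_distance_m:.0f} m ahead, inside safety margin.")
    else:
        rationale = "Path clear; maintaining speed."
    return action, rationale

action, why = vla_policy(Observation(obstacle_distance_m=6.0,
                                     obstacle_kind="pedestrian"))
```

In a real VLA stack the rationale is generated by the language component of the model itself rather than templated rules, but the design choice is the same: the explanation rides alongside the control signal instead of being reconstructed after the fact.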
First in China: 146 Alibaba papers accepted at top AI conference NeurIPS 2025
Cai Jing Wang· 2025-12-05 09:02
News of December 5: NeurIPS 2025, a top international conference in artificial intelligence, opened in San Diego, USA. Alibaba had 146 papers accepted, the most of any Chinese company. Among them, Alibaba Qwen's work on gated attention mechanisms was named a best paper, making Alibaba the only Chinese company to receive the award.

NeurIPS is one of the most influential conferences in AI, having given rise to milestone research such as the Transformer and AlexNet. This year, leading technology companies and institutions worldwide, including Google, Microsoft, OpenAI, Alibaba, and MIT, submitted more than 20,000 papers, of which only about 25% were accepted. Statistics show Google, Microsoft, Meta, and Alibaba as the top four technology companies by paper count.

The 146 accepted papers span model training frameworks, datasets and foundational model research, and model inference optimization, showcasing Alibaba's innovation across the full AI stack.

Alibaba Qwen has open-sourced more than 300 models covering all modalities and sizes, with over 700 million downloads worldwide and more than 180,000 derivative models, ranking first globally. In Gartner's emerging-market quadrant reports across four dimensions (GenAI cloud infrastructure, GenAI engineering, GenAI models, and AI knowledge-management applications), Alibaba Cloud sits in the Emerging Leaders quadrant in all four, the only Asia-Pacific vendor selected in all of them.

In training ...
Doubao releases speech recognition model 2.0, supporting multimodal visual recognition and 13 foreign languages
Mei Ri Jing Ji Xin Wen· 2025-12-05 08:10
Core Viewpoint
- The article reports the official launch of Doubao-Seed-ASR-2.0, a speech recognition model from Volcano Engine that improves contextual understanding and recognition accuracy [1]

Group 1: Model Features
- Version 2.0 has stronger inference capabilities, raising the overall keyword recall rate by 20% [1]
- It supports multimodal visual recognition, letting the model interpret both audio and visual inputs to improve text recognition accuracy [1]
- The model recognizes 13 foreign languages, including Japanese, Korean, German, and French [1]

Group 2: Targeted Upgrades
- The model has been specifically upgraded for complex scenarios involving proper nouns, personal names, place names, brand names, and easily confused homophones [1]
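The "keyword recall rate" cited above is a standard ASR evaluation metric: of the keywords known to occur in the audio, what fraction appears in the transcript? A minimal sketch of the computation follows, with made-up keywords and a made-up transcript rather than anything from Doubao's actual evaluation:

```python
def keyword_recall(reference_keywords, transcript):
    """Fraction of reference keywords that appear in the ASR transcript.
    (Toy version: exact substring match; real evaluations normalize case,
    tokenization, and homophones before matching.)"""
    hits = sum(1 for kw in reference_keywords if kw in transcript)
    return hits / len(reference_keywords)

# Hypothetical example: the transcript recovers 2 of the 3 keywords.
keywords = ["model", "attention", "GPU"]
recall = keyword_recall(keywords, "the new attention model shipped today")
```

A 20% improvement in this metric means proportionally more of the proper nouns, brand names, and other keywords listed in the targeted upgrades survive transcription intact.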
Zhixing Technology's Song Yang: with its vast industrial base and abundant scenarios, China can lead in achieving more AI breakthroughs
Xin Lang Cai Jing· 2025-12-05 08:07
Special topic: 2025 New Automobile Cooperation Ecosystem Exchange Conference

The 2025 New Automobile Cooperation Ecosystem Exchange Conference was held December 5-6 in Suzhou. Song Yang, founder and CEO of Zhixing Technology, attended and delivered a speech.

Song Yang said that in the short term, China produces 30 million vehicles a year, while only 500,000 embodied-intelligence robots were produced last year, one sixtieth of the vehicle figure. In the long term, an autonomous vehicle is essentially a wheeled robot, a branch of robotics, and will eventually exist in vast numbers. How the two industries converge, both now and over time, is a question of the short term versus the long term.

He pointed out that large language models used as base models, whether for autonomous driving or for robots, face a gap in the middle. Take multimodal VLA and robotics as an example: an action performed in one room generalizes poorly to another scene, requiring costly data collection. Likewise, adding a dimension such as gravity to a world model sharply increases the compute and cost required, with electricity and heat-dissipation problems behind that; these are problems the whole industry faces.

Still, he is optimistic. By leveraging its vast industrial base and abundant application scenarios, using those scenarios and data to drive the development of artificial intelligence, with industry pulling AI forward, he believes China can be the first to achieve more breakthroughs in AI.

Sina statement: all session transcripts are produced from on-site stenography and have not been reviewed by the speakers. Sina.com publishes this text to convey more information; this does not imply endorsement of the views expressed or confirmation of the descriptions.

Editor in charge: Wang Xiang