Large Language Models (LLMs)
New from UCLA: A Comprehensive Survey of Temporal Reasoning and Agentic Systems with Large Models
自动驾驶之心· 2025-09-27 23:33
When morning rush-hour traffic streams into urban traffic-control systems in real time, when hospital ECG machines continuously record patients' cardiac electrical activity, when a stock exchange's quote board refreshes price movements dozens of times per second: the time-series data generated as time passes has long been the "digital pulse" of modern society. From financial risk control and medical diagnosis to energy dispatch and traffic management, decisions in nearly every critical domain depend on a deep reading of this temporal data. Over the past decades, time-series analysis has produced a wealth of techniques, from classical statistical models (such as ARIMA and ETS) to deep learning methods (such as LSTM and Transformer), with notable progress on basic tasks like forecasting the future and detecting anomalies. For example, early work used LSTMs to forecast a city's electricity consumption over the next 24 hours and CNNs to detect arrhythmic segments in ECGs; such traditional techniques have long been deployed in real scenarios. As application demands keep rising, however, the capability boundary of traditional methods is becoming apparent. In personalized medicine, a clinician needs the model not only to judge whether a patient has an arrhythmia, but also to indicate which physiological indicators and which time window of activity the anomaly relates to; in adaptive risk management, a fund manager needs not only a price forecast but also the causal logic of how the price might change if policy shifts; in autonomous traffic systems, the controller must not only detect congestion but also adjust signal strategies in real time and verify their effect ...
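As a concrete illustration of the deep-learning baseline mentioned above (an LSTM forecasting the next 24 hours of city electricity load), below is a minimal sketch in PyTorch. The window length, network size, and the random input standing in for real data are illustrative assumptions, not the setup of any particular system cited here.

```python
import torch
import torch.nn as nn

class LoadForecaster(nn.Module):
    """Minimal LSTM that maps a past window of hourly load to the next 24 hours."""
    def __init__(self, input_size=1, hidden_size=64, horizon=24):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, horizon)

    def forward(self, x):              # x: (batch, window, 1) past hourly load
        _, (h_n, _) = self.lstm(x)     # h_n: (1, batch, hidden_size), last hidden state
        return self.head(h_n[-1])      # (batch, 24) forecast for the next day

model = LoadForecaster()
past_week = torch.randn(8, 168, 1)     # a batch of 7-day (168-hour) windows, stand-in data
forecast = model(past_week)            # shape: (8, 24)
```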
From MLLMs to Agents: A Long-Form Survey of the Evolution of Large-Model Safety
自动驾驶之心· 2025-09-03 23:33
Preface & the author's take: Artificial intelligence has moved from single-modality text interaction into a new stage of multimodal understanding and autonomous agent decision-making. From large language models (LLMs) that process pure text, to multimodal large language models (MLLMs) that fuse images and audio, to agents (Agents) capable of environment perception and task planning, the capability ceiling of large models keeps expanding, and safety risks are growing exponentially along with it. Among these, jailbreak attacks, one of the most threatening risks, continue to plague the large-model ecosystem: attackers use carefully crafted inputs or environmental perturbations to bypass a model's safety mechanisms and induce it to generate illegal, harmful, or unethical content, which at a small scale spreads misinformation and incites hatred, and at a large scale triggers severe consequences such as cyberattacks and privacy leaks. Existing research, however, mostly focuses on attacks and defenses for a single model form (such as LLMs); it lacks a systematic treatment of the full LLMs-MLLMs-Agents evolution chain and has not produced a unified attack taxonomy, evaluation standard, or defense framework. Against this backdrop, a research team from the School of Software at Henan University and the Institute of Information Engineering, Chinese Academy of Sciences, has conducted a comprehensive survey of the field. The survey not only ...
Speed Always Wins: Shanghai AI Lab's 82-Page Survey of Efficient LLM Architectures
机器之心· 2025-08-25 09:10
Core Insights
- The article discusses the advancements and challenges in large language models (LLMs), emphasizing their transformative impact on human-computer interaction and the need for efficient architectures to overcome high training and inference costs [2][3][8].

Group 1: LLM Architecture and Efficiency
- The capabilities of LLMs are largely built on the Transformer architecture, which, despite its breakthroughs, faces efficiency challenges from its O(N^2) attention complexity on long-sequence tasks [3][4].
- Recent innovations in Transformer architecture have emerged, but a comprehensive review summarizing these advancements had been lacking [4][5].
- A collaborative effort by Shanghai AI Lab and several other institutions has produced a survey of over 440 papers, focusing on the latest progress in efficient LLM architectures [5][6].

Group 2: Categories of Efficient Architectures
- The survey categorizes efficient LLM architectures into seven types: linear sequence modeling, sparse sequence modeling, efficient full attention, sparse expert models, hybrid model architectures, diffusion language models, and applications to other modalities [6][8].
- Linear sequence modeling aims to reduce attention training and inference complexity without incurring KV-cache overhead (see the sketch after this summary) [6][8].
- Sparse sequence modeling leverages the inherent sparsity of attention maps to accelerate computation [21][22].

Group 3: Innovations in Attention Mechanisms
- Efficient full-attention methods optimize memory access and KV storage while retaining complete attention [22][23].
- Sparse expert models enhance model capacity without proportionally increasing computational cost through conditional activation of experts [27][28].
- Hybrid architectures strike a balance between linear/sparse attention and full attention, optimizing both efficiency and performance [35][36].

Group 4: Applications and Future Directions
- Diffusion language models represent a novel approach that transfers diffusion models from visual tasks to language generation, significantly improving generation speed [38][39].
- Efficient architectures are being applied across other modalities, including vision and audio, demonstrating their versatility and effectiveness [44][45].
- The overarching goal is substantial acceleration of AI development, in the spirit of the phrase "Speed Always Wins", with a focus on efficiently training and deploying powerful models [45].
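To make the linear sequence modeling category above concrete, here is a minimal sketch of kernelized linear attention, the general trick these methods use to replace softmax attention's O(N^2) cost with computation that is linear in sequence length. The feature map, shapes, and function name are illustrative assumptions, not the design of any specific model in the survey.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    """Kernelized linear attention: O(N) in sequence length instead of O(N^2).

    q, k, v: (batch, seq_len, dim). Uses the elu(x) + 1 feature map to keep
    scores positive, a common choice in linear-attention work. This is the
    non-causal (full-sequence) variant for brevity; causal versions maintain
    a running prefix sum of the same state instead.
    """
    q = F.elu(q) + 1
    k = F.elu(k) + 1
    kv = torch.einsum("bnd,bne->bde", k, v)     # sum_n k_n v_n^T  -> (batch, dim, dim)
    k_sum = k.sum(dim=1)                        # sum_n k_n        -> (batch, dim)
    num = torch.einsum("bnd,bde->bne", q, kv)   # numerator per query position
    den = torch.einsum("bnd,bd->bn", q, k_sum)  # normalizer per query position
    return num / (den.unsqueeze(-1) + eps)

q = k = v = torch.randn(2, 1024, 64)
out = linear_attention(q, k, v)                 # (2, 1024, 64); no N x N attention matrix is formed
```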
Is the AI Top-Conference Model Broken? The "Publish or Perish" Vicious Cycle Is Crushing the Entire AI Research Community
36Kr · 2025-08-13 09:08
We trust that our readers follow top AI conferences with great interest; some may have just escaped the NeurIPS rebuttal period and already be preparing the next submission. As the core engine driving technical innovation and the exchange of ideas, top academic conferences are not only the lifeline of the research community but also our front line for glimpsing the future. With the rapid growth of the AI field in recent years, large conferences such as NeurIPS, ICML, and ICLR have become increasingly mainstream. Yet this success has come at a cost: the current centralized, in-person conference model is straining under its own scale. The most representative case is the much-criticized NeurIPS 2025, which was not only overwhelmed by nearly 30,000 submissions and mired in a low-quality review controversy, even producing the "Who's Adam" joke, but also opened a satellite venue in Mexico because of surging attendance and US visa issues. These phenomena raise a key question: if the current trend continues, is the AI academic conference model sustainable? A team led by Professor Bingsheng He at the National University of Singapore conducted an in-depth study of current AI academic conferences, analyzed the drawbacks of the traditional conference model, and proposed some new conference models in a position paper. Paper title: Position: The Current AI Conference Model is Unsustainab ...
Is the AI Top-Conference Model Broken? The "Publish or Perish" Vicious Cycle Is Crushing the Entire AI Research Community
机器之心· 2025-08-13 04:49
Core Viewpoint
- The current model of AI academic conferences is deemed unsustainable due to overwhelming submission rates, environmental impacts, and mental health concerns among researchers [5][11][15].

Group 1: Challenges Facing AI Conferences
- The average annual publication rate in the AI field has exceeded 4.5 papers per author, doubling over the past decade and pushing a focus on quantity over quality [7][22].
- Travel emissions from NeurIPS 2024 alone exceeded 8,254 tons of CO2 equivalent, surpassing the daily emissions of Vancouver and highlighting the environmental cost of these conferences [23][25].
- Over 71% of Reddit discussions about AI conferences expressed negative sentiment, with 35% mentioning mental health issues such as anxiety and burnout [28][29].

Group 2: Proposed Solutions
- The Community-Federated Conference (CFC) model is proposed as a sustainable and equitable alternative, separating traditional conference functions into three interconnected layers: global peer review, regional centers for knowledge dissemination, and a unified digital platform for collaboration [38][40][41].
- The first layer is a centralized digital platform for peer review and publication, allowing rolling submissions independent of physical conferences [39].
- The second layer consists of regional centers that host local presentations, reducing the need for large venues and minimizing carbon footprints [40].

Group 3: Future Directions
- The CFC model aims to address the structural issues of traditional conferences by promoting local engagement and reducing the pressure on authors while maintaining academic rigor [38][41].
- The shift toward a decentralized approach is seen as essential for fostering collaboration and inclusivity within the AI research community [39][40].
Professor Hinton's Talk Slides from the World Artificial Intelligence Conference
2025-07-29 02:10
Summary of Key Points from the Talk

Industry or Company Involved
- The discussion revolves around the field of artificial intelligence (AI), particularly digital intelligence versus biological intelligence.

Core Points and Arguments
1. **Two Paradigms of Intelligence** - In the logic-inspired paradigm, the essence of intelligence is reasoning, achieved by symbolic rules manipulating symbolic expressions; learning can be secondary to understanding knowledge representation [7][8][9].
2. **Evolution of Language Models** - Over the past 30 years, language modeling has advanced significantly, including the introduction of embedding vectors and Google's invention of the Transformer [13][14].
3. **Understanding of Language by LLMs** - Large language models (LLMs) understand language much as humans do, converting words into mutually compatible feature vectors, indicating a level of comprehension in their responses [16][28].
4. **Analogy of Words as Lego Blocks** - Words are compared to high-dimensional Lego blocks that can model various concepts and communicate ideas effectively [20][24].
5. **Digital vs. Biological Computation** - Digital computation, while energy-intensive, allows easy knowledge sharing among agents running the same model; biological computation consumes far less energy but struggles with knowledge transfer [51].
6. **Knowledge Transfer Mechanisms** - In AI systems, knowledge can be distilled from a teacher to a student, allowing efficient learning and adaptation (see the sketch after this summary) [41][48].
7. **Challenges of AI Control** - A superintelligence could manipulate users to gain power, raising concerns about control and safety in AI development [55][57].
8. **Global Cooperation on AI Safety** - There is skepticism about international collaboration on AI safety measures against threats such as cyberattacks and autonomous weapons [64].
9. **Training Benevolent AI** - Techniques for training AI to be benevolent may be independent of those that enhance its intelligence, suggesting a need for focused research on AI safety [68][72].

Other Important but Possibly Overlooked Content
- The discussion emphasizes the potential risks of AI development, likening the situation to raising a tiger cub that could become dangerous as it matures, and highlighting the urgency of safety measures [61].
- Countries are urged to establish well-funded AI safety institutes focused on building AI systems that do not seek control [72].
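Point 6 above describes distilling knowledge from a teacher model to a student. The following is a minimal, generic sketch of a standard distillation loss; the temperature, loss weighting, and toy logits are illustrative assumptions rather than details from the talk itself.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft-target KL term (teacher -> student) with the usual hard-label loss."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),   # student log-probabilities
        F.softmax(teacher_logits / T, dim=-1),       # teacher soft targets
        reduction="batchmean",
    ) * (T * T)                                      # rescale gradients for the temperature
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage with random logits over a 10-class output space.
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```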
A Comprehensive Review of Foundation Models for Autonomous Driving (LLMs / VLMs / MLLMs / Diffusion Models / World Models)
自动驾驶之心· 2025-06-21 11:18
Core Insights
- The article discusses the critical role of foundation models in generating and analyzing complex driving scenarios for autonomous vehicles, emphasizing their ability to synthesize diverse and realistic high-risk safety scenarios [2][4].

Group 1: Foundation Models in Autonomous Driving
- Foundation models can process heterogeneous inputs such as natural language, sensor data, and high-definition maps, enabling the generation and analysis of complex driving scenarios [2].
- A unified classification system is proposed, covering large language models (LLMs), vision-language models (VLMs), multimodal large language models (MLLMs), diffusion models (DMs), and world models (WMs) [2][4].

Group 2: Methodologies and Tools
- The article reviews methodologies, open-source datasets, simulation platforms, and benchmark-testing challenges relevant to scenario generation and analysis [2].
- Specific evaluation metrics for assessing scenario generation and analysis are discussed, highlighting the need for dedicated assessment standards in this field [2].

Group 3: Current Challenges and Future Directions
- The article identifies open challenges and research questions in scenario generation and analysis, suggesting areas for future research and development [2].
Peking University, Tsinghua, UvA, CMU and Others Jointly Release a New Survey on the Logical Reasoning Capabilities of Large Models
机器之心· 2025-05-07 07:37
Core Viewpoint
- Current research on large language models (LLMs) is shifting from pre-training driven by scaling laws to post-training focused on enhancing reasoning capabilities, particularly logical reasoning, which is crucial for addressing hallucination issues [1][4].

Group 1: Logical Reasoning Challenges
- LLMs exhibit significant deficiencies in logical reasoning, categorized into two main issues: logical question answering and logical consistency [4][9].
- In logical question answering, LLMs struggle to generate correct answers when complex reasoning over given premises and constraints is required [6][10].
- Logical consistency issues arise when LLMs give contradictory answers to different questions, undermining their reliability in high-stakes applications [11][20].

Group 2: Research Methodologies
- The review categorizes existing methods for enhancing logical reasoning into three main approaches: external solvers, prompt engineering, and pre-training with fine-tuning [15][18].
- External-solver methods translate natural-language logic problems into symbolic expressions that are then resolved by external solvers [16].
- Prompt engineering focuses on designing prompts that guide LLMs to construct logical reasoning chains explicitly [17].
- Pre-training and fine-tuning methods incorporate high-quality logical reasoning examples into the training data to improve model performance [18].

Group 3: Logical Consistency Types
- Several forms of logical consistency are identified, including negation consistency, implication consistency, transitivity consistency, fact consistency, and compositional consistency [22][24][26][28].
- Each type has specific requirements, such as ensuring that contradictory statements cannot both be true (negation consistency) or that logical implications are preserved (implication consistency); a minimal consistency check is sketched after this summary [22][24].
- The review emphasizes developing methods that enhance logical consistency along multiple dimensions to improve LLM reliability [28][31].

Group 4: Future Research Directions
- Future research should extend LLMs' reasoning to modal logic for handling uncertainty and develop efficient algorithms that satisfy multiple forms of logical consistency simultaneously [30][31].
- Training LLMs on higher-order logic is also needed to address more complex reasoning challenges [31].

Conclusion
- The survey outlines the current state of research on LLMs' logical reasoning capabilities, highlighting significant challenges and proposing future research directions to improve performance in logical question answering and logical consistency [32].
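As a concrete illustration of one consistency type listed above, the sketch below checks negation consistency: a model should not assign high truth probability to both a statement and its negation. The `ask_model` helper and the tolerance threshold are hypothetical placeholders; the survey does not prescribe this exact procedure.

```python
from typing import Callable

def negation_consistent(
    ask_model: Callable[[str], float],   # hypothetical helper: returns P(statement is true)
    statement: str,
    negation: str,
    tolerance: float = 0.2,
) -> bool:
    """Negation consistency: P(A) and P(not A) should sum to roughly 1,
    so the model cannot call both a claim and its negation true."""
    p_true = ask_model(statement)
    p_neg = ask_model(negation)
    return abs((p_true + p_neg) - 1.0) <= tolerance

# Toy stand-in for a model that is inconsistent about a fact.
fake_model = lambda s: 0.9   # answers "very likely true" to everything
print(negation_consistent(fake_model, "Socrates is mortal.",
                          "Socrates is not mortal."))   # False -> inconsistency flagged
```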
Google DeepMind: Large Models Can Be Willful, Knowing the Optimal Path Yet Refusing to Take It
机器之心· 2025-05-05 03:40
Core Insights
- The article investigates common failure modes of large language models (LLMs) in decision-making scenarios, focusing on greediness, frequency bias, and the knowing-doing gap [2][15].
- It proposes a reinforcement learning fine-tuning method (RLFT) to improve the decision-making capabilities of LLMs by addressing these shortcomings [2][8].

Group 1: Failure Modes
- LLMs exhibit suboptimal exploration and a knowing-doing gap that prevents knowledge from being translated into effective action [2][15].
- The three identified failure modes are:
  1. Greediness, where LLMs overly favor actions that have previously shown the best performance [15].
  2. Frequency bias, where LLMs tend to repeat high-frequency actions regardless of their reward differences [5][18].
  3. The knowing-doing gap, where LLMs understand task requirements but fail to execute optimal actions due to a preference for greedy choices [7][20].

Group 2: Model Performance
- Small-scale LLMs (2B) are significantly affected by frequency bias, leading to a lack of exploration, with up to 55% of actions remaining unexplored [4][18].
- Large-scale LLMs (27B) show reduced frequency bias but still exhibit greedy behavior, limiting their overall performance [6][18].
- Average action coverage for the largest models was only 45%, a substantial gap relative to optimal strategies [17].

Group 3: Reinforcement Learning Fine-Tuning
- RLFT adjusts the LLM's reasoning process based on rewards obtained from environmental interaction, promoting the selection of actions that yield higher rewards [8][22].
- Results show that RLFT significantly reduces regret across environments, improving LLM performance compared to random baselines [22].
- RLFT effectively mitigates greediness by encouraging exploration, enhancing decision-making capabilities (a small bandit/regret sketch follows this summary) [22].
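To make the greediness failure mode and the regret metric concrete, here is a small Bernoulli-bandit simulation comparing a purely greedy policy with an epsilon-greedy one: greedy typically locks onto the first arm that ever pays out, while even light exploration finds the best arm. The arm probabilities, horizon, and epsilon are illustrative assumptions, not the paper's experimental setup.

```python
import random

def run_bandit(policy, probs, steps=1000, eps=0.1, seed=0):
    """Bernoulli bandit; returns cumulative regret versus always pulling the best arm."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    values = [0.0] * len(probs)      # running mean reward estimate per arm
    best = max(probs)
    regret = 0.0
    for _ in range(steps):
        if policy == "greedy" or rng.random() >= eps:
            arm = max(range(len(probs)), key=lambda a: values[a])   # exploit current estimates
        else:
            arm = rng.randrange(len(probs))                         # epsilon-greedy exploration
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]         # incremental mean update
        regret += best - probs[arm]                                 # expected loss vs. best arm
    return regret

probs = [0.3, 0.5, 0.8]
print("greedy regret:        ", run_bandit("greedy", probs))
print("epsilon-greedy regret:", run_bandit("eps-greedy", probs))
```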
A Survey of Large-Model-Driven Spatial Intelligence: Advances in Embodied Agents, Smart Cities, and Earth Science
" 欧米伽未来研究所 " 关注科技未来发展趋势,研究人类向欧米伽点演化过程中面临的重大机遇与挑战。将不定期推荐和发布世界范围重要科技研究进展和未 来趋势研究。( 点击这里查看欧米伽理论 ) 我们生活在一个由空间构成的世界中。从每天在家居、办公环境或城市街道中的移动,到规划一次跨越山海的旅行,乃至科学家们研究气候变迁的地理模 式、城市扩张的复杂格局,这一切都深刻地依赖于我们对空间的感知、理解和运用能力。这种核心能力,我们称之为"空间智能"。 长久以来,人类凭借自身的感官系统和发达的大脑,不断地探索、适应并改造着周遭的空间环境,演化出了独特的空间认知机制。而今,随着人工智能 (AI)技术的日新月异,特别是大语言模型(LLMs)的横空出世,机器也开始显露出令人瞩目的空间智能潜力。这场由大模型引领的技术浪潮,正以前 所未有的深度和广度,渗透到从微观尺度的机器人导航,到中观尺度的城市规划管理,再到宏观尺度的地球科学研究等诸多领域。 这部报告由清华大学和芬兰赫尔辛基大学共同发布,将带领读者一同深入探究,大模型是如何被赋予"空间感"的?它们在跨越不同尺度的空间智能任务中 扮演着怎样日益重要的角色?以及在迈向更高级空间智能的 ...