Large Language Models (LLMs)
Nothing Beats Speed: Shanghai AI Lab's 82-Page Survey Takes You Through the Appeal of Efficient LLM Architectures
机器之心· 2025-08-25 09:10
Author: Sun Weigao, Shanghai Artificial Intelligence Laboratory. In recent years, large language models (LLMs) have demonstrated powerful language understanding and generation capabilities, driving breakthroughs in text generation, code generation, question answering, translation, and other tasks. Representative models such as GPT, Claude, Gemini, DeepSeek, and Qwen have profoundly changed how humans interact with machines. Nor are LLMs confined to language and simple Q&A: with the rise of multimodal models (VLMs) and reasoning models (LRMs), they keep expanding into multimodal understanding, generation, and complex reasoning scenarios. Behind the steady gains in model performance, however, lies rapid scaling of model size, data volume, and RL reasoning length, and a sharp rise in compute and storage consumption. The high cost of training and serving large models has become a practical bottleneck for broad deployment and application. This article dissects the efficiency secrets of large models from the architectural angle. At the heart of it all is the Transformer. Its self-attention mechanism brought a breakthrough in long-range modeling, yet its O(N²) complexity makes long-sequence tasks expensive. In emerging scenarios such as RAG, agents, long-chain reasoning, and multimodality, long-sequence demands are increasingly prominent, further amplifying the tension between efficiency and performance. Meanwhile, the FFN part of the Transformer uses dense MLP layers, which likewise face model ...
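The O(N²) bottleneck the survey targets is easy to see in vanilla scaled dot-product attention, where the score matrix Q·Kᵀ is N×N. A minimal NumPy sketch (shapes and names are illustrative, not from the survey):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Vanilla scaled dot-product self-attention.

    X: (N, d) sequence of N token embeddings.
    The score matrix Q @ K.T is (N, N): both computing and storing it
    grow quadratically with sequence length N, which is the O(N^2)
    bottleneck discussed above for long-sequence workloads.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # (N, N) -- quadratic in N
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # (N, d)

rng = np.random.default_rng(0)
N, d = 8, 4
X = rng.normal(size=(N, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)
```

Efficient-architecture work (linear attention, sparse attention, state-space models) amounts to replacing that (N, N) intermediate with something sub-quadratic.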
Is the AI Top-Conference Model Broken? The "Publish or Perish" Vicious Cycle Is Crushing the Entire AI Research Community
机器之心· 2025-08-13 04:49
Report by 机器之心. Editors: +0, 冷猫. We trust our readers follow the top AI conferences with great interest; some may have just escaped the NeurIPS rebuttal phase and already be preparing their next submission. As the core engine of technical innovation and the exchange of ideas, top academic conferences are not only the lifeline of the field but also our vantage point on the future. With the boom in AI over recent years, large conferences such as NeurIPS, ICML, and ICLR have gone increasingly mainstream. Yet this success has come at a cost: the current centralized, in-person conference model is buckling under its own scale. The most representative case is the much-disputed NeurIPS 2025, which was overwhelmed by nearly 30,000 submissions, mired in a low-quality-review controversy, and even produced the "Who's Adam" joke; it also opened a satellite venue in Mexico because of surging attendance and U.S. visa problems. These phenomena raise a key question: if current trends continue, is the AI academic-conference model sustainable? Professor Bingsheng He's team at the National University of Singapore conducted an in-depth study of current AI academic conferences, analyzed the drawbacks of the traditional conference model, proposed some new models, and published a position paper. Publication surge: over the past decade, the average annual publication rate per author has more than doubled, exceeding ...
Professor Hinton's Slides from His World Artificial Intelligence Conference Keynote
2025-07-29 02:10
Summary of Key Points from the Conference Call

Industry or Company Involved
- The discussion revolves around the field of Artificial Intelligence (AI), particularly Digital Intelligence versus Biological Intelligence.

Core Points and Arguments
1. **Two Paradigms of Intelligence** - In the symbolic paradigm, the essence of intelligence is reasoning, achieved by symbolic rules manipulating symbolic expressions; learning is secondary to getting the knowledge representation right [7][8][9].
2. **Evolution of Language Models** - Over the past 30 years, language modeling has advanced significantly, including the introduction of embedding vectors and Google's invention of the Transformer [13][14].
3. **Understanding of Language by LLMs** - Large Language Models (LLMs) understand language much as humans do, converting words into compatible feature vectors, indicating a level of comprehension in their responses [16][28].
4. **Analogy of Words as Lego Blocks** - Words are compared to high-dimensional Lego blocks that can model various concepts and communicate ideas effectively [20][24].
5. **Digital vs. Biological Computation** - Digital computation, while energy-intensive, allows easy knowledge sharing among agents running the same model; biological computation consumes less energy but struggles with knowledge transfer [51].
6. **Knowledge Transfer Mechanisms** - Knowledge can be distilled from a teacher to a student in AI systems, allowing efficient learning and adaptation [41][48].
7. **Challenges of AI Control** - A super-intelligence could manipulate users to gain power, raising concerns about control and safety in AI development [55][57].
8. **Global Cooperation on AI Safety** - There is skepticism about international collaboration on AI safety measures against threats such as cyber attacks and autonomous weapons [64].
9. **Training Benevolent AI** - Techniques for training AI to be benevolent may be independent of those that enhance its intelligence, suggesting a need for focused research on AI safety [68][72].

Other Important but Possibly Overlooked Content
- The discussion emphasizes the potential risks of AI development, likening the situation to owning a tiger cub that could become dangerous as it matures, and highlights the urgency of safety measures [61].
- Countries should establish well-funded AI safety institutes focused on building AI systems that do not seek control [72].
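The teacher-to-student knowledge distillation mentioned in point 6 is commonly implemented as a KL divergence between temperature-softened teacher and student distributions. A minimal sketch under that common formulation (the temperature value and names are illustrative; the talk does not specify an implementation):

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax over a logit vector."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The student is trained to match the teacher's soft targets, which
    carry more information per example than hard labels -- one view of
    why digital agents can share knowledge so efficiently.
    """
    p = softmax(teacher_logits, T)   # teacher soft targets
    q = softmax(student_logits, T)   # student predictions
    return float(np.sum(p * (np.log(p) - np.log(q)))) * T * T

teacher = [3.0, 1.0, 0.2]
student = [2.5, 1.2, 0.3]
print(distillation_loss(student, teacher))
```

The loss is zero when the student's logits induce the same softened distribution as the teacher's, and positive otherwise.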
A Complete Rundown of Foundation Models for Autonomous Driving (LLMs/VLMs/MLLMs/Diffusion Models/World Models)
自动驾驶之心· 2025-06-21 11:18
Core Insights
- The article discusses the critical role of foundation models in generating and analyzing complex driving scenarios for autonomous vehicles, emphasizing their ability to synthesize diverse and realistic high-risk safety scenarios [2][4].

Group 1: Foundation Models in Autonomous Driving
- Foundation models enable the processing of heterogeneous inputs such as natural language, sensor data, and high-definition maps, facilitating the generation and analysis of complex driving scenarios [2].
- A unified classification system is proposed, covering model types including Large Language Models (LLMs), Vision-Language Models (VLMs), Multimodal Large Language Models (MLLMs), Diffusion Models (DMs), and World Models (WMs) [2][4].

Group 2: Methodologies and Tools
- The article reviews methodologies, open-source datasets, simulation platforms, and benchmark testing challenges relevant to scenario generation and analysis [2].
- Specific evaluation metrics for assessing scenario generation and analysis are discussed, highlighting the need for dedicated assessment standards in this field [2].

Group 3: Current Challenges and Future Directions
- The article identifies open challenges and research questions in scenario generation and analysis, suggesting areas for future research and development [2].
Peking University, Tsinghua, UvA, CMU and Others Jointly Release the Latest Survey on the Logical Reasoning Abilities of Large Models
机器之心· 2025-05-07 07:37
Core Viewpoint
- Current research on large language models (LLMs) is shifting from pre-training based on scaling laws to post-training focused on enhancing reasoning capabilities, particularly logical reasoning, which is crucial for addressing hallucination issues [1][4].

Group 1: Logical Reasoning Challenges
- LLMs exhibit significant deficiencies in logical reasoning, categorized into two main issues: logical question answering and logical consistency [4][9].
- In logical question answering, LLMs struggle to generate correct answers when required to perform complex reasoning over given premises and constraints [6][10].
- Logical consistency issues arise when LLMs give contradictory answers to different questions, undermining their reliability in high-stakes applications [11][20].

Group 2: Research Methodologies
- The review categorizes existing methods for enhancing logical reasoning into three main approaches: external solvers, prompt engineering, and pre-training with fine-tuning [15][18].
- External-solver methods translate natural-language logic problems into symbolic expressions to be resolved by an external solver [16].
- Prompt engineering focuses on designing prompts that guide LLMs to construct logical reasoning chains explicitly [17].
- Pre-training and fine-tuning methods incorporate high-quality logical reasoning examples into the training data to improve model performance [18].

Group 3: Logical Consistency Types
- Several forms of logical consistency are identified, including negation consistency, implication consistency, transitivity consistency, fact consistency, and compositional consistency [22][24][26][28].
- Each type has specific requirements, such as ensuring that contradictory statements cannot both be true (negation consistency) or that logical implications are preserved (implication consistency) [22][24].
- The review emphasizes the importance of enhancing logical consistency across multiple dimensions to improve LLM reliability [28][31].

Group 4: Future Research Directions
- Future research should extend LLMs' reasoning to modal logic to handle uncertainty, and develop efficient algorithms that satisfy multiple forms of logical consistency simultaneously [30][31].
- Training LLMs in higher-order logic is needed to address more complex reasoning challenges [31].

Conclusion
- The survey outlines the current state of research on LLMs' logical reasoning capabilities, highlighting significant challenges and proposing future research directions to improve performance in logical question answering and consistency [32].
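Negation consistency, for instance, can be probed mechanically: a model's confidence in a statement and in its negation should sum to roughly one. A toy checker under that assumption (the probability inputs are stand-ins for model scores, not an API from the survey):

```python
def negation_consistent(p_stmt, p_negation, tol=0.05):
    """Check negation consistency: P(A) + P(not A) should be ~1.

    p_stmt / p_negation: the model's estimated probability that the
    statement (resp. its negation) is true. A model that confidently
    affirms both A and not-A violates negation consistency.
    """
    return abs((p_stmt + p_negation) - 1.0) <= tol

# Consistent: confident the statement is true and its negation false.
print(negation_consistent(0.92, 0.07))   # True
# Inconsistent: the model affirms both the statement and its negation.
print(negation_consistent(0.90, 0.85))   # False
```

The other consistency types (implication, transitivity, compositional) can be probed with analogous pairwise or chained checks over model outputs.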
Google DeepMind: Large Models Can Be Willful Too, Knowing the Optimal Path yet Running Headlong into the Wall
机器之心· 2025-05-05 03:40
Core Insights
- The article investigates common failure modes of Large Language Models (LLMs) in decision-making scenarios, focusing on greediness, frequency bias, and the knowing-doing gap [2][15].
- It proposes a reinforcement learning fine-tuning method (RLFT) that addresses these shortcomings to enhance the decision-making capabilities of LLMs [2][8].

Group 1: Failure Modes
- LLMs exhibit suboptimal exploration and a knowing-doing gap, which prevents effective translation of knowledge into action [2][15].
- The three identified failure modes are:
  1. Greediness: LLMs overly favor actions that have previously shown the best performance [15].
  2. Frequency bias: LLMs tend to repeat high-frequency actions regardless of their reward differences [5][18].
  3. Knowing-doing gap: LLMs understand task requirements but fail to execute optimal actions due to a preference for greedy choices [7][20].

Group 2: Model Performance
- Small-scale LLMs (2B) are significantly affected by frequency bias, leading to a lack of exploration, with up to 55% of actions remaining unexplored [4][18].
- Large-scale LLMs (27B) show reduced frequency bias but still exhibit greedy behavior, limiting their overall performance [6][18].
- The average action coverage for the largest models was only 45%, a substantial gap compared to optimal strategies [17].

Group 3: Reinforcement Learning Fine-Tuning
- RLFT adjusts the reasoning process of LLMs based on rewards obtained from environmental interactions, promoting the selection of actions that yield higher rewards [8][22].
- Results indicate that RLFT significantly reduces regret values in various environments, improving LLM performance over random baselines [22].
- RLFT effectively mitigates greediness by encouraging exploration, enhancing decision-making capabilities [22].
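The greediness failure mode is easiest to see in a multi-armed bandit, the kind of environment such studies use: a purely greedy policy can lock onto the first arm that pays out, while even a small exploration rate lets the agent discover the better arm. A minimal sketch (the arm payoffs and the 5% exploration rate are illustrative, not the paper's setup):

```python
import random

def run_bandit(epsilon, steps=2000, seed=0):
    """Two-armed Bernoulli bandit: arm 0 pays off w.p. 0.4, arm 1 w.p. 0.8.

    With epsilon=0 (pure greedy) the agent can fixate on whichever arm
    wins early -- the 'greediness' failure mode. epsilon > 0 forces
    occasional exploration, analogous to what an exploration incentive
    buys during RL fine-tuning. Returns the average reward per step.
    """
    rng = random.Random(seed)
    probs = [0.4, 0.8]
    counts, values = [0, 0], [0.0, 0.0]
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon or counts[0] + counts[1] == 0:
            arm = rng.randrange(2)                    # explore
        else:
            arm = 0 if values[0] > values[1] else 1   # exploit best estimate
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # incremental mean update of this arm's value estimate
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward
    return total / steps

print(run_bandit(epsilon=0.0))    # pure greedy: may stall near the 0.4 arm
print(run_bandit(epsilon=0.05))   # light exploration: tends toward 0.8
```

The same tension scales up: RLFT-style training rewards the model for exploring enough to find, and then commit to, the higher-return action.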
A Survey of Large-Model-Driven Spatial Intelligence: Advances in Embodied Agents, Smart Cities, and Earth Science
The 欧米伽未来研究所 (Omega Future Research Institute) tracks future trends in science and technology, studying the major opportunities and challenges humanity faces as it evolves toward the Omega point, and periodically recommends and publishes important research advances and trend studies from around the world. We live in a world made of space. From daily movement through homes, offices, and city streets, to planning a journey across mountains and seas, to scientists studying the geographic patterns of climate change and the complex dynamics of urban expansion, all of it depends deeply on our ability to perceive, understand, and use space. We call this core capability "spatial intelligence." For ages, humans have relied on their sensory systems and highly developed brains to explore, adapt to, and reshape their spatial surroundings, evolving unique mechanisms of spatial cognition. Now, with the rapid progress of artificial intelligence (AI), and especially the emergence of large language models (LLMs), machines are beginning to show striking potential for spatial intelligence. This wave of large-model technology is penetrating, with unprecedented depth and breadth, domains ranging from micro-scale robot navigation, to meso-scale urban planning and management, to macro-scale earth science. This report, jointly released by Tsinghua University and the University of Helsinki in Finland, takes readers on a deep dive into how large models are being endowed with a "sense of space," the increasingly important roles they play in spatial-intelligence tasks across scales, and, on the road toward higher-level spatial intelligence, the ...
欧米伽未来研究所2025· 2025-04-20 14:32
Core Viewpoint
- The article discusses the evolution of spatial intelligence through the development of large language models (LLMs) and their applications across scales, from micro-level robotics to macro-level earth sciences, highlighting both opportunities and challenges in this field [1][2][35].

Section Summaries

Section 1: The Foundation of Spatial Intelligence - How Large Models "Understand" Space
- To possess spatial intelligence, machines must develop effective spatial memory and flexible abstract spatial reasoning capabilities [2][3].

Section 2: Spatial Memory and Knowledge - The "Cognitive Map" in Large Models
- Large models acquire spatial information through "internal absorption" during pre-training and "external invocation" when real-time data is needed [4][5].

Section 3: Abstract Spatial Reasoning - Beyond Memorization
- Current large models primarily mimic spatial tasks through language modeling rather than possessing deep spatial reasoning akin to human cognition [9].

Section 4: Multi-Scale Spatial Intelligence Applications Driven by Large Models
- Large models play increasingly important roles in spatial-intelligence tasks across scales, from individual robots to urban environments and global systems [10][11].

Section 5: Embodied Intelligence - Enhancing Robot Spatial Understanding and Action
- The development of embodied intelligence focuses on enabling robots to perceive, understand, and navigate physical environments effectively [11][12].

Section 6: Urban Spatial Intelligence - Empowering Smarter, More Livable Cities
- Large models are applied in urban settings to enhance spatial understanding, reasoning, and decision-making for better city management [15][16].

Section 7: Earth Spatial Intelligence (ESI) - Insights into Our Planet
- ESI leverages AI and large models to analyze vast amounts of earth-observation data, addressing global challenges such as climate change and resource management [20][21].

Section 8: Challenges and Prospects - The Future of Spatial Intelligence
- Despite significant advances, challenges remain in spatial reasoning, data integration, and model interpretability, requiring ongoing research and development [29][30].