Large Language Models (LLMs)

Does Using ChatGPT Make People Dumber?
虎嗅APP· 2025-06-25 15:06
Core Viewpoint
- The article discusses a study conducted by MIT that explores the cognitive effects of relying on AI, specifically ChatGPT, for writing tasks, suggesting that over-dependence on AI may lead to a decline in critical thinking and creativity [3][25]

Group 1: Experiment Overview
- The experiment involved 54 university students from prestigious institutions who were divided into three groups: one using ChatGPT (AI group), one using Google search (search engine group), and one relying solely on memory (brain group) [6][11]
- Each group completed writing tasks based on SAT prompts, with their brain activity monitored using EEG technology to assess cognitive engagement [4][10]

Group 2: Cognitive Findings
- The brain activity of the brain group was the most active, indicating strong engagement in thinking, organizing, and executing tasks, while the AI group showed lower overall brain activity and declining attention over time [11][22]
- The study highlighted the concept of "cognitive debt," where reliance on AI for writing may enhance short-term efficiency but could degrade long-term cognitive abilities, such as critical thinking and creativity [8][12]

Group 3: Writing Quality and Perception
- Essays produced by the AI group were criticized for being grammatically correct but lacking depth and originality, while the search engine group produced more coherent and personalized content [8][12]
- Students using AI expressed mixed feelings about their work, often feeling a lack of ownership and clarity regarding the sources of their information [14][22]

Group 4: Long-term Effects of AI Usage
- In a follow-up round, students who switched from using AI to writing independently exhibited slower cognitive responses and difficulty recalling their writing processes, indicating a potential decline in cognitive skills due to prior reliance on AI [21][22]
- Conversely, students who transitioned from traditional writing to using AI showed increased brain activity and improved writing quality, suggesting that AI can enhance cognitive engagement when used appropriately [24][25]

Group 5: Conclusion and Implications
- The research culminated in a paper titled "Your Brain on ChatGPT," which sparked discussions about the implications of AI on cognitive development and writing skills [24][25]
- The study emphasizes the importance of maintaining active engagement in writing processes to foster critical thinking and creativity, warning against the risks of becoming overly reliant on AI tools [26][27]
Andrej Karpathy's Viral Talk Sweeps the Tech World: AI Ushers in Software 3.0, and the Era of Rewriting Everything Has Arrived!
AI前线· 2025-06-19 08:10
Core Viewpoint
- The article discusses a paradigm shift in software development driven by AI, marking the transition to "Software 3.0," where natural language replaces traditional coding as the primary interface for programming [1][2]

Group 1: Evolution of Software
- Software is undergoing a profound transformation: after roughly 70 years of relative stability, recent years have witnessed two major shifts [5]
- The emergence of "Software 2.0" replaces traditional code with neural network weights, marking a new software paradigm [8][16]
- The current "Software 3.0" allows developers to use natural language prompts to interact with large language models (LLMs), simplifying the programming process [17][19]

Group 2: Impact on Developers and Users
- The evolution of programming lowers barriers for developers and enhances user interaction, making software more intuitive and collaborative [2][4]
- The relationship between humans and machines is at a historical turning point, with future software acting as intelligent partners rather than mere tools [2][4]

Group 3: Characteristics of LLMs
- LLMs are likened to public utilities, requiring significant capital investment for training and offering services through APIs, similar to electricity distribution [29][31]
- LLMs exhibit properties of both a "wafer fab" and an "operating system," reflecting their complexity and the substantial infrastructure they require [38][39]
- The current state of LLMs is compared to the computing landscape of the 1960s, suggesting that they are still in their infancy [51][67]

Group 4: Opportunities and Challenges
- LLMs present opportunities for creating partially autonomous applications, allowing for more efficient workflows and collaboration between humans and AI [95][102]
- The need for effective context management and user interfaces is emphasized to enhance the interaction between users and LLMs [97][110]
- The article highlights the importance of refining documentation and tools to make them more accessible to LLMs, which can unlock new applications [152][161]

Group 5: Future Directions
- The future of software development will involve a gradual increase in the autonomy of AI systems, with a focus on maintaining human oversight [135][172]
- The concept of "vibe coding" is introduced as a new way for individuals to engage with programming, making it accessible to a broader audience [140][144]
- The article concludes with a call to action for developers to embrace the new paradigm and build systems that leverage the capabilities of LLMs effectively [170][172]
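The three paradigms the talk distinguishes can be put side by side on one toy task. This is an illustrative sketch, not code from the talk: `call_llm` and `stub_llm` are hypothetical stand-ins for any LLM completion API, and the hand-picked weights in the 2.0 stub stand in for a trained network.

```python
# The same task -- classify a review as positive or negative -- expressed in
# each of the three paradigms described above. All names are illustrative.

def software_1_0(review: str) -> str:
    """Software 1.0: a human writes explicit rules in code."""
    negative_words = {"bad", "terrible", "awful", "boring"}
    hits = sum(word in negative_words for word in review.lower().split())
    return "negative" if hits > 0 else "positive"

def software_2_0(review: str, weights: dict) -> str:
    """Software 2.0: the 'program' is a set of learned weights.
    (Stub: a real system would run a trained neural network here.)"""
    score = sum(weights.get(word, 0.0) for word in review.lower().split())
    return "negative" if score < 0 else "positive"

def software_3_0(review: str, call_llm) -> str:
    """Software 3.0: the program is a natural-language prompt to an LLM."""
    prompt = f"Classify the sentiment of this review as positive or negative:\n{review}"
    return call_llm(prompt)

# A stubbed LLM so the sketch runs end to end without any external service.
stub_llm = lambda prompt: "negative" if "boring" in prompt else "positive"

print(software_1_0("a boring, terrible film"))   # explicit rules
print(software_2_0("a boring, terrible film", {"boring": -1.0, "terrible": -1.0}))
print(software_3_0("a boring, terrible film", stub_llm))
```

The point of the contrast: in 1.0 the logic lives in code, in 2.0 it lives in weights, and in 3.0 it lives in the prompt itself.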
Chen Daisun Memorial Lecture in Economics, Registration Open | Xiong Wei: Structured Beliefs and Fund Investment
Sou Hu Cai Jing· 2025-06-17 08:25
Group 1
- The event is a lecture titled "Structured Beliefs and Fund Investment," scheduled for June 20, 2025, at Tsinghua University [2]
- The lecture will be presented by Xiong Wei, a professor at Princeton University, with a focus on the intersection of finance and economics [4][6]
- The event is organized by the Department of Finance at Tsinghua University's School of Economics and Management and the Global Institute for Common Development [2]

Group 2
- Xiong Wei's research interests include capital market imperfections, behavioral finance, the digital economy, and the Chinese economy [4][6]
- He has received several prestigious awards, including the 2018 China Economics Prize and the 2014 Sun Yefang Financial Innovation Award [4][6]
- The lecture will draw on insights from a study analyzing fund managers' perceptions of government policies and their impact on investment decisions and market outcomes [7][9]

Group 3
- The study constructs a countercyclical policy beliefs measure (CCP) to capture fund expectations about policies mitigating economic shocks [7][9]
- Findings indicate that fund managers' market beliefs positively predict market returns, and CCP beliefs enhance this predictive power, improving fund performance [8][9]
- The research emphasizes the significance of structured beliefs in shaping investment decisions and market results [8][9]

Group 4
- The event is open to Tsinghua University students, with specific registration instructions for students from different departments [10]
- The lecture will be conducted in English with Chinese explanations [11]
The "Next-Token" Paradigm Is Changing! Reinforcement Learning Pre-Training Has Just Arrived
机器之心· 2025-06-11 03:54
Core Viewpoint
- The article discusses the emerging importance of Reinforcement Learning (RL) in enhancing AI model capabilities, particularly through a new paradigm called Reinforcement Pre-Training (RPT), which redefines next-token prediction as a reasoning task [3][10][24]

Summary by Sections

Introduction
- Yann LeCun previously viewed reinforcement learning as a minor component in AI, but its significance is growing in model enhancement [3]

RPT Overview
- RPT transforms the next-token prediction task into a reasoning process, allowing models to receive verifiable rewards for correct predictions [6][25]
- This method leverages vast amounts of unannotated text data for general reinforcement learning without requiring domain-specific labeled answers [9][26]

Advantages of RPT
- RPT offers inherent scalability and generality by utilizing large unannotated datasets for training [28]
- It minimizes the risk of reward hacking by using direct, rule-based reward signals [29]
- The internal reasoning process during pre-training allows for deeper understanding and generalization beyond mere token memorization [30]
- RPT enhances prediction accuracy by allocating more computational resources to each prediction step [31]

Experimental Results
- RPT outperforms baseline methods in next-token prediction accuracy across various difficulty levels [40][41]
- The performance of RPT-14B is comparable to that of larger models, indicating its effectiveness in capturing complex reasoning signals [43]
- RPT's accuracy improves reliably with increased training computation, demonstrating its scaling characteristics [45]
- Models pre-trained with RPT achieve higher performance ceilings when further trained with RLVR, showcasing the transfer of learned reasoning patterns to downstream tasks [47]

Zero-Shot Performance
- RPT-14B consistently surpasses R1-Distill-Qwen-14B across all benchmark tests, even outperforming larger models in next-token prediction [49]

Reasoning Mode Analysis
- The reasoning process of RPT-14B differs qualitatively from that of R1-Distill-Qwen-14B, indicating a more thoughtful approach rather than simple pattern matching [51]
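The core loop the summary describes, where the model reasons first, then predicts the next token, and the prediction is scored directly against the corpus, can be sketched minimally. The model interface and all names here are illustrative assumptions, not the paper's API:

```python
# Minimal sketch of the RPT training signal (names are illustrative): the
# model emits a reasoning trace plus a next-token guess, and the reward is
# simply whether the guess matches the ground-truth corpus token.

def rpt_reward(predicted_token: str, ground_truth_token: str) -> float:
    """Rule-based, verifiable reward: 1.0 for an exact match, else 0.0.
    No learned reward model is involved, which limits reward hacking."""
    return 1.0 if predicted_token == ground_truth_token else 0.0

def rpt_rollout(model, context_tokens, ground_truth_token):
    """One pre-training step viewed as a one-step RL episode."""
    reasoning, prediction = model(context_tokens)  # reason first, then predict
    return reasoning, prediction, rpt_reward(prediction, ground_truth_token)

# Toy 'model': emits a canned trace, then guesses the most recent token.
toy_model = lambda ctx: ("the context seems to repeat the last word", ctx[-1])

corpus = ["the", "cat", "sat", "on", "the", "the"]
context, target = corpus[:-1], corpus[-1]
_, pred, reward = rpt_rollout(toy_model, context, target)
print(pred, reward)  # the toy model happens to be right here: reward 1.0
```

Because the reward checks an exact token match against raw text, any unannotated corpus supplies the supervision, which is the scalability point made above.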
Redis Creator Confirms: Human Programmers Still Outclass LLMs! Netizens Quip: That's Because You Haven't Seen a Mediocre Coder Get Crushed by AI
程序员的那些事· 2025-05-30 07:10
Core Viewpoint
- The article emphasizes that human programmers possess superior capabilities compared to large language models (LLMs), despite the usefulness of AI tools in assisting with programming tasks [3][10]

Group 1: Human vs. AI Capabilities
- The article discusses a scenario in which a complex bug in Redis was addressed, highlighting the limitations of LLMs in generating innovative solutions compared to human creativity [5][10]
- While LLMs can assist in problem-solving, they often lack the ability to think outside conventional frameworks, which is a significant advantage of human programmers [10]

Group 2: Practical Applications of LLMs
- The author shares experiences of using LLMs for code review and idea validation, indicating that these tools can enhance productivity but cannot fully replace the nuanced understanding required in software engineering [3][10]
- LLMs can serve as a sounding board for ideas, providing feedback that helps refine thought processes [13]

Group 3: Software Engineering Complexity
- The article points out that software engineering encompasses much more than just coding, including understanding client needs and requirements, which LLMs are currently ill-equipped to handle [14]
- It emphasizes the social attributes of software engineering, where human interaction and comprehension of client demands play a crucial role [14]
"Scientific Intelligence White Paper 2025" Released: China Leads in AI Application-Oriented Innovation
Di Yi Cai Jing· 2025-05-26 13:27
Core Insights
- By 2024, China's AI-related paper citations are expected to account for 40.2% of the global total, rapidly closing on the United States at 42.9% [1][8]
- The report, titled "Scientific Intelligence White Paper 2025," analyzes the integration of AI and scientific research across seven major research fields, covering 28 directions and nearly 90 key issues [1]
- The report highlights the mutual promotion and deep integration of AI innovation and scientific research, termed "AI for Science" [1]

Research Trends
- The number of global AI journal papers has surged nearly threefold over the past decade, from 308,900 to 954,500, an average annual growth rate of 14% [7]
- The share of core AI fields, such as algorithms and machine learning, has decreased from 44% to 38%, while the share of scientific intelligence has increased by 6 percentage points, with its annual growth rate rising from 10% before 2020 to 19% after [7]
- China's AI publication volume increased from 60,100 in 2015 to 300,400 in 2024, representing 29% of the global total [7][8]

Citation Impact
- AI-related paper citations in the U.S. reached 302,200 in 2020, while China's rose from 10,300 in 2015 to 144,800 in 2020, surpassing the EU for the first time in 2021 [8]
- By 2024, China is projected to account for 41.6% of global AI citations in patents, policy documents, and clinical trials, a significant lead [8]

Country-Specific Trends
- China leads at the intersection of AI with earth and environmental sciences, and since 2019 has taken the lead in AI combined with mathematics, materials science, and the humanities [9]
- The U.S. and EU maintain advantages in AI and life sciences, with China ranking third in this area [9]
- India shows significant progress across all fields, currently ranking third in earth and environmental sciences, engineering, and humanities [9]
Google DeepMind: Large Models Are Willful Too, Knowing the Optimal Path but Insisting on Hitting the Wall
机器之心· 2025-05-05 03:40
Core Insights
- The article investigates common failure modes of Large Language Models (LLMs) in decision-making scenarios, specifically focusing on greediness, frequency bias, and the knowing-doing gap [2][15]
- It proposes a reinforcement learning fine-tuning method (RLFT) to enhance the decision-making capabilities of LLMs by addressing these shortcomings [2][8]

Group 1: Failure Modes
- LLMs exhibit suboptimal exploration and a knowing-doing gap, which prevents effective translation of knowledge into action [2][15]
- The three identified failure modes are:
  1. Greediness, where LLMs overly favor actions that have previously shown the best performance [15]
  2. Frequency bias, where LLMs tend to repeat high-frequency actions regardless of their reward differences [5][18]
  3. Knowing-doing gap, where LLMs understand task requirements but fail to execute optimal actions due to a preference for greedy choices [7][20]

Group 2: Model Performance
- Small-scale LLMs (2B) are significantly affected by frequency bias, leading to a lack of exploration, with up to 55% of actions remaining unexplored [4][18]
- Large-scale LLMs (27B) show reduced frequency bias but still exhibit greedy behavior, limiting their overall performance [6][18]
- The average action coverage of even the largest models was only 45%, a substantial gap compared to optimal strategies [17]

Group 3: Reinforcement Learning Fine-Tuning
- RLFT adjusts the reasoning process of LLMs based on rewards obtained from environmental interactions, promoting the selection of actions that yield higher rewards [8][22]
- Results indicate that RLFT significantly reduces regret in various environments, improving LLM performance compared to random baselines [22]
- RLFT effectively mitigates greediness by encouraging exploration, thus enhancing decision-making capabilities [22]
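The greediness failure mode and the action-coverage gap described above can be reproduced in a toy multi-armed bandit. This is an illustrative experiment, not DeepMind's setup: a purely greedy chooser locks onto the first arm that looks best and leaves the rest of the action space unexplored, while even modest epsilon-greedy exploration covers far more of it.

```python
import random

# Toy bandit: arm i pays a Bernoulli reward with mean i/n_arms, so higher
# arms are strictly better. We measure what fraction of arms each policy
# ever tries ("action coverage", as in the article's 45% figure).

def run_policy(choose, n_arms=10, steps=200, seed=0):
    rng = random.Random(seed)
    means = [i / n_arms for i in range(n_arms)]
    counts, totals = [0] * n_arms, [0.0] * n_arms
    for _ in range(steps):
        arm = choose(rng, counts, totals)
        counts[arm] += 1
        totals[arm] += rng.random() < means[arm]   # Bernoulli reward
    return sum(c > 0 for c in counts) / n_arms     # fraction of arms explored

def greedy(rng, counts, totals):
    """Always exploit the best empirical arm (ties break to lowest index)."""
    ests = [t / c if c else 0.0 for c, t in zip(counts, totals)]
    return max(range(len(ests)), key=lambda i: ests[i])

def eps_greedy(rng, counts, totals, eps=0.2):
    """Explore a random arm with probability eps, otherwise exploit."""
    if rng.random() < eps:
        return rng.randrange(len(counts))
    return greedy(rng, counts, totals)

print("greedy coverage:    ", run_policy(greedy))      # stuck on one arm
print("eps-greedy coverage:", run_policy(eps_greedy))
```

With all estimates starting at zero, the greedy policy picks arm 0 forever (a 10% coverage floor), mirroring the locked-in behavior the study attributes to LLM decision-makers, while epsilon-greedy samples most of the arms.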
Built on Reward-Driven and Self-Organizing Evolution Mechanisms, the New ReSo Framework Reshapes Intelligent Collaboration in Complex Reasoning Tasks
机器之心· 2025-04-27 10:40
This work was jointly completed by Shanghai AI Laboratory, the University of Sydney, and the University of Oxford. The first authors are Zhou Heng, an intern at Shanghai AI Lab, and independent researcher Geng Hejia. The corresponding authors are Bai Lei, a young scientist at Shanghai AI Laboratory, and Yin Zhenfei, a visiting scholar at Oxford and PhD student at the University of Sydney; other team members include Shanghai AI Lab intern Xue Xiangyuan.

The ReSo framework (Reward-driven & Self-organizing) offers a new approach to multi-agent systems (MAS) for complex reasoning tasks: a complex task is first decomposed into a task graph, and the best-suited agent is then matched to each subtask. By combining task-graph generation with a reward-driven two-stage agent selection process, the method not only improves the efficiency of multi-agent collaboration but also opens a new path toward stronger multi-agent reasoning.

Research background: bottlenecks and breakthroughs in LLM reasoning

In recent years, increasing inference-time compute (Inference Time Scaling) has been widely regarded as an important way to improve the reasoning ability of Large Language Models (LLMs). On one hand, introducing reinforcement learning and reward models in the post-training stage can optimize a single model's reasoning path, letting it generate intermediate steps before answering and build stronger logical chains; on the other hand, some research attempts to construct multi-agent ...
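The two stages described above, building a task graph of subtasks and then selecting an agent per subtask by reward history, can be sketched as follows. The graph, agent names, and scores are invented for illustration; this is not the paper's implementation:

```python
from collections import defaultdict

# Stage 1 (illustrative): a task graph mapping each subtask to its
# prerequisite subtasks, as produced by ReSo-style task decomposition.
task_graph = {
    "parse_problem": [],
    "solve_equation": ["parse_problem"],
    "verify_answer": ["solve_equation"],
}

def topological_order(graph):
    """Schedule subtasks so every prerequisite runs before its dependents."""
    order, done = [], set()
    def visit(node):
        if node in done:
            return
        for dep in graph[node]:
            visit(dep)
        done.add(node)
        order.append(node)
    for node in graph:
        visit(node)
    return order

# Stage 2 (illustrative): reward-driven selection from per-subtask history.
reward_history = defaultdict(lambda: defaultdict(list))

def record(agent, subtask, reward):
    reward_history[subtask][agent].append(reward)

def select_agent(subtask, agents):
    """Pick the agent with the best average past reward on this subtask."""
    def avg(agent):
        hist = reward_history[subtask][agent]
        return sum(hist) / len(hist) if hist else 0.0
    return max(agents, key=avg)

agents = ["math_agent", "code_agent"]
record("math_agent", "solve_equation", 1.0)   # made-up scores for the demo
record("code_agent", "solve_equation", 0.4)

plan = [(sub, select_agent(sub, agents)) for sub in topological_order(task_graph)]
print(plan)
```

Feeding each subtask's outcome back into `reward_history` is what makes the selection self-organizing: agents that keep succeeding on a subtask type get chosen for it more often.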