Emergence
Stanford's latest paper reveals the foundations of Theory of Mind in large language models
36Kr · 2025-09-24 11:04
Core Insights
- The article discusses how AI, specifically large language models (LLMs), is beginning to exhibit "Theory of Mind" (ToM) capabilities traditionally considered unique to humans [2][5]
- A recent Stanford University study finds that the capacity for complex social reasoning in these models is concentrated in a mere 0.001% of their total parameters, challenging previous assumptions about how cognitive abilities are distributed in neural networks [8][21]
- The research highlights structured order and an understanding of sequence in language processing as foundational to the emergence of advanced cognitive abilities in AI [15][20]
Group 1: Theory of Mind in AI
- "Theory of Mind" refers to the ability to understand others' thoughts, intentions, and beliefs, which is crucial for social interaction [2][3]
- Recent benchmarks indicate that LLMs such as Llama and Qwen respond accurately to tests designed to evaluate ToM, suggesting they can simulate perspectives and understand information gaps [5][6]
Group 2: Key Findings from the Stanford Study
- The parameters driving ToM capabilities are highly concentrated, contradicting the belief that such abilities are widely distributed across the model [8][9]
- The researchers used a sensitivity analysis based on the Hessian matrix to pinpoint the parameters responsible for ToM, revealing a "mind core" that is critical for social reasoning [7][8]
Group 3: Mechanisms Behind Cognitive Abilities
- The findings suggest that the attention mechanism, particularly in models using RoPE (Rotary Position Embedding), is directly linked to social reasoning capabilities [9][14]
- Disrupting the identified "mind core" parameters in models that use RoPE collapses their ToM abilities, while models that do not use RoPE show resilience [8][14]
Group 4: Emergence of Intelligence
- The study posits that advanced cognitive abilities in AI emerge from a foundational understanding of sequence and structure in language, which is essential for higher-level reasoning [15][20]
- The emergence of ToM is seen as a byproduct of mastering basic language structures and statistical patterns in human language, rather than as a standalone cognitive module [20][23]
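
Since the summary ties ToM ability to RoPE, a small illustration of what RoPE does may help. The sketch below is not code from the Stanford study; it assumes the commonly published formulation in which consecutive pairs of a vector are rotated by position-dependent angles. The function names, the toy vectors, and the base of 10000 are illustrative choices.

```python
import math

def rope(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by angles proportional to `position`."""
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even number of dimensions")
    rotated = []
    for i in range(0, dim, 2):
        # Each pair gets its own frequency; later pairs rotate more slowly.
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        rotated += [x * c - y * s, x * s + y * c]
    return rotated

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Toy query/key vectors (made up for illustration).
query = [0.3, -1.2, 0.7, 0.5]
key = [1.0, 0.4, -0.6, 0.9]

# Same relative offset (3 tokens) at two different places in the sequence.
score_early = dot(rope(query, 5), rope(key, 2))
score_late = dot(rope(query, 105), rope(key, 102))
```

The two scores agree up to floating-point error: after rotation, the attention score depends only on the relative distance between the two positions, which is how this encoding injects sequence order into attention.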
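A Nobel Prize-winning physics result finally gets a mathematical proof 48 years later! Jun Yin of USTC's Special Class for the Gifted Young appears again
QbitAI (量子位) · 2025-08-24 04:38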
Core Viewpoint
- Two Chinese scholars have made a significant breakthrough in the mathematical proof of the Anderson model, a long-standing problem in condensed matter physics that explains how electrons in semiconductor materials transition from a conductive to a non-conductive state [1][2][19].
Group 1: Anderson Model Overview
- The Anderson model, proposed by Philip W. Anderson in 1958, describes how electrons go from moving freely (delocalized) to being trapped (localized) in a material as disorder increases [10][11][16].
- This phenomenon is crucial for understanding semiconductor materials, which can switch between conductive and non-conductive states and are therefore essential for chip technology [7][8][12].
Group 2: Breakthrough Achievements
- After 16 years of collaboration, Horng-Tzer Yau (Yao Hongze) and Jun Yin provided a mathematical proof for the Anderson model, marking the most significant progress since its inception [2][32].
- Their research initially focused on the one-dimensional case and later expanded to two and three dimensions, achieving notable advances in understanding electron behavior in complex matrices [33][35].
Group 3: Methodology and Challenges
- The scholars used random matrix theory to simplify the complex band matrix involved in the Anderson model, allowing them to prove that electrons remain delocalized when the bandwidth exceeds a certain threshold [27][31].
- They faced significant challenges in their calculations, which required extensive graphical analysis to simplify the equations and ultimately led to a breakthrough in understanding the conditions for electron localization [30][31].
Group 4: Background of Scholars
- Horng-Tzer Yau, a prominent mathematician, has made substantial contributions to probability, random processes, and quantum mechanics, and has been a professor at Harvard University since 2005 [36][38].
- Jun Yin, a professor at UCLA, has received several prestigious awards for his early career achievements in physics and mathematics, including the von Neumann Research Prize [47][50].
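
The link between disorder and localization described above can be illustrated numerically in the simplest setting. The sketch below is not the method of the proof (which, per the summary, works with random band matrices). It assumes the textbook one-dimensional tight-binding Anderson model, with on-site energies drawn uniformly from [-W/2, W/2], and estimates the Lyapunov exponent (the inverse localization length) with the standard transfer-matrix recursion. The function name, parameters, and seed are illustrative.

```python
import math
import random

def lyapunov_exponent(disorder, energy=0.0, steps=20000, seed=0):
    """Estimate the growth rate of solutions of
    psi[n+1] = (E - V[n]) * psi[n] - psi[n-1]
    where V[n] is uniform in [-disorder/2, disorder/2]."""
    rng = random.Random(seed)
    psi_prev, psi = 0.0, 1.0
    log_growth = 0.0
    for _ in range(steps):
        v = rng.uniform(-disorder / 2, disorder / 2)
        psi_prev, psi = psi, (energy - v) * psi - psi_prev
        # Renormalise every step so the numbers never overflow,
        # accumulating the logarithm of the growth factor instead.
        norm = math.hypot(psi_prev, psi)
        log_growth += math.log(norm)
        psi_prev /= norm
        psi /= norm
    return log_growth / steps

clean = lyapunov_exponent(0.0)   # no disorder: no exponential growth
weak = lyapunov_exponent(1.0)    # weak disorder: small positive exponent
strong = lyapunov_exponent(3.0)  # stronger disorder: larger exponent
```

A larger exponent means a shorter localization length, so the electron is trapped more tightly as disorder grows. In one dimension every disordered case localizes; the delocalized regime discussed in the article is a feature of the band-matrix and higher-dimensional settings that this toy model does not capture.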
AI's "Black Box" and Laozi's "Dao": A Striking Resonance Across 2,500 Years
Huxiu · 2025-08-08 03:57
Group 1
- The article discusses the concept of "Dao" as something that transcends language and rational understanding, suggesting that true knowledge cannot be fully articulated [1][2][12]
- It draws parallels between the philosophical notion of "Dao" and modern physics, particularly quantum mechanics, highlighting the challenges in comprehending phenomena that defy intuitive understanding [3][10][11]
- The article introduces the "black box" problem in AI, emphasizing that the complexity of AI models makes their decision-making processes difficult to explain, similar to the elusive nature of "Dao" [14][16][19]
Group 2
- The article suggests that both "Dao" and AI's "black box" represent emergent properties that exceed human cognitive boundaries, indicating a need for trust rather than complete understanding [20][23][24]
- It emphasizes the importance of collaboration between humans and AI, proposing that while AI can discover patterns, human experience and ethics remain essential in decision-making [26][29]
- The article warns about potential biases in AI, advocating for data governance and ethical scrutiny to ensure fairness in AI outcomes [30][31]
A conversation with Wen Xiaobai founder Li Yan: AI is an aesthetics of brute force, and small cannot be beautiful
Waves (暗涌) · 2025-07-07 07:16
Core Viewpoint
- The article discusses the approach of Yuan Shi Technology and its product Wen Xiaobai, which aims to redefine information retrieval and content generation in the AI era and positions itself as an AIGC content platform rather than a traditional chatbot or search engine [3][4][5].
Group 1: Company Background and Development
- Li Yan, the founder of Yuan Shi Technology, has a strong background in AI, having previously built the AI system at Kuaishou [2].
- Yuan Shi Technology has secured approximately $50 million in funding from notable investors, including Kuaishou's co-founder and venture capital firms [2].
- Wen Xiaobai combines active Q&A with passive content consumption, resembling an updated version of a news aggregation platform [3].
Group 2: Product Positioning and Differentiation
- Wen Xiaobai is defined as an AIGC content platform that lets users actively ask questions and passively consume information, in contrast with traditional UGC platforms [8][9].
- The platform emphasizes a user-friendly approach and aims to lower the psychological barrier for users, which is reflected in the name "Wen Xiaobai" [12].
- Content generation relies heavily on AI, with a multi-agent system that automates content creation and quality control [16][17].
Group 3: Market Perspective and Opportunities
- Li Yan believes the market for information retrieval is vast and that large companies cannot monopolize it entirely, leaving significant opportunities for startups [5][24].
- The article highlights the shift from traditional information retrieval to AI-driven content generation, suggesting that this transformation creates new market dynamics [24][25].
- The company aims to use AI to serve long-tail demand and underrepresented voices in the content landscape [26].
Group 4: Future Outlook and Strategy
- Yuan Shi Technology plans to expand its product offerings to international markets, focusing on a closed loop of generation, distribution, and consumption [53].
- The company is committed to developing its own models for user interest mapping, which it sees as a core differentiator [53].
- Li Yan emphasizes understanding user needs and adapting to market changes, indicating a flexible approach to product development and commercialization [52][53].
Understanding DeepSeek and OpenAI in one article: why do entrepreneurs need cognitive innovation?
Sohu Finance · 2025-06-10 12:49
Core Insights
- The article emphasizes the transformative impact of AI on business innovation and the need for companies to adapt their strategies to remain competitive in the AI era [1][4][40]
Group 1: OpenAI's Journey
- OpenAI was co-founded in 2015 by Elon Musk, Sam Altman, and others with the mission of countering the monopolistic tendencies of tech giants and promoting open, safe, and accessible AI [4][7]
- OpenAI's development of large language models (LLMs) is attributed to effective use of the Transformer architecture and the Scaling Law, which describes a power-law relationship between model performance and model size, training data, and computational resources [8][11]
- The capabilities of models like GPT are described as "emergence": models exhibit unexpected abilities once certain thresholds of parameters and data are reached [12][13]
Group 2: DeepSeek's Strategy
- DeepSeek adopts a "Limited Scaling Law" approach, focusing on maximizing efficiency and performance with limited resources, in contrast with the resource-heavy strategies of larger AI firms [18][22]
- The company employs model architectures such as Multi-Head Latent Attention (MLA) and Mixture of Experts (MoE) to optimize performance while minimizing costs [20][21]
- DeepSeek's R1 model, released in January 2025, showcases its ability to perform complex reasoning tasks without human feedback, marking a significant advance in AI capabilities [23][25]
Group 3: Organizational Innovation
- DeepSeek promotes an AI Lab paradigm that encourages open collaboration, resource sharing, and dynamic team structures to foster innovation in AI development [27][28]
- The organization emphasizes self-organization and autonomy among team members, allowing a more flexible and responsive approach to research and development [29][30]
- The company's success is attributed to breaking away from traditional corporate constraints, enabling a culture of creativity and exploration in foundational research [34][38]
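
The Scaling Law mentioned above is usually stated in the literature as a power law rather than a straight-line relationship. The sketch below assumes the common single-variable form L(N) = (N_c / N)^alpha for loss as a function of parameter count; the function names and the constants are placeholders chosen for illustration and are not figures from the article.

```python
def power_law_loss(n_params, n_c=8.8e13, alpha=0.076):
    """Predicted loss under L(N) = (N_c / N) ** alpha.

    n_c and alpha are illustrative placeholder constants.
    """
    return (n_c / n_params) ** alpha

def doubling_ratio(n_params):
    """Factor by which the predicted loss changes when the model doubles in size."""
    return power_law_loss(2 * n_params) / power_law_loss(n_params)

# Doubling the model shrinks the loss by the same factor at every scale.
ratio_small = doubling_ratio(1e9)
ratio_large = doubling_ratio(1e11)

# The absolute improvement from a fixed increase in size keeps shrinking.
gain_first = power_law_loss(1e9) - power_law_loss(2e9)
gain_second = power_law_loss(2e9) - power_law_loss(3e9)
```

Under a power law, each doubling buys the same relative improvement (a straight line on a log-log plot), while the absolute gain from adding a fixed number of parameters keeps shrinking. That is why scale-driven progress demands exponentially growing resources.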
Understanding DeepSeek and OpenAI in one article: why do entrepreneurs need cognitive innovation?
Hundun Academy (混沌学园) · 2025-06-10 11:07
Core Viewpoint
- The article emphasizes the transformative impact of AI technology on business innovation and the need for companies to adapt their strategies to remain competitive as AI evolves [1][2].
Group 1: OpenAI's Emergence
- OpenAI was co-founded in 2015 by Elon Musk, Sam Altman, and others with the mission of countering the monopolistic power of major tech companies in AI, aiming for open and safe AI for all [9][10][12].
- Google's introduction of the Transformer architecture in 2017 revolutionized language processing, enabling models to understand context better and significantly improving training speed [13][15].
- OpenAI's belief in the Scaling Law led to unprecedented investment in AI, resulting in groundbreaking language models that exhibit emergent capabilities [17][19].
Group 2: ChatGPT and Human-Machine Interaction
- The launch of ChatGPT marked a significant shift in human-machine interaction, allowing users to communicate in natural language rather than through complex commands and thus lowering the barrier to AI usage [22][24].
- ChatGPT's success established a user base for future AI applications and reshaped perceptions of human-AI collaboration, showcasing vast potential for future developments [25].
Group 3: DeepSeek's Strategic Approach
- DeepSeek adopted a "Limited Scaling Law" strategy, focusing on maximizing efficiency and performance with limited resources, in contrast with the resource-heavy approaches of larger AI firms [32][34].
- The company achieved high performance at low cost through innovative model architecture and training methods, emphasizing quality data selection and algorithmic efficiency [36][38].
- DeepSeek's R1 model, released in January 2025, demonstrated advanced reasoning capabilities without human feedback, marking a significant advance in AI technology [45][48].
Group 4: Organizational Innovation in AI
- DeepSeek's organizational model promotes an AI Lab paradigm that fosters emergent innovation, allowing open collaboration and resource sharing among researchers [54][56].
- The dynamic team structure and self-organizing management style encourage creativity and rapid iteration, which are essential in the unpredictable field of AI [58][62].
- The company's approach challenges traditional hierarchical models, advocating a culture that empowers individuals to explore and innovate freely [64][70].
Group 5: Breaking the "Thought Stamp"
- DeepSeek's achievements highlight a shift in mindset among Chinese entrepreneurs, demonstrating that original foundational research in AI is possible within China [75][78].
- The article calls for abandoning the belief that Chinese companies should focus only on application and commercialization, urging a commitment to long-term foundational research and innovation [80][82].
From OpenAI to DeepSeek: you need to know how much cognitive innovation matters to entrepreneurs
Hundun Academy (混沌学园) · 2025-06-05 09:28
Core Viewpoint
- The article discusses the emergence of AI and its transformative impact on industries, highlighting the importance of cognitive innovation and the role of organizations that can adapt and thrive in this new landscape [2][3][23].
Group 1: AI Development Milestones
- The introduction of the Transformer model by Google's Brain Team in June 2017 laid the foundation for subsequent language model advances [1].
- The explosive growth of ChatGPT in 2023 marked the beginning of AI commercialization, while DeepSeek's emergence in 2025 shifted industry perception by achieving technological parity at a fraction of the cost [3][12].
Group 2: Cognitive Innovation
- The article emphasizes that the evolution of AI is not merely a technical race but a revolution in the underlying logic of cognitive innovation [4].
- The course led by Professor Li Shanyou aims to dissect methods of innovation in the AI era, revealing the cognitive leap from technological breakthroughs to commercial applications [4][20].
Group 3: Case Studies and Competitive Dynamics
- The course will analyze the rise of OpenAI, tracing its journey from Musk's vision to the rapid adoption of ChatGPT, which reached over one million users in just five days [10][12].
- It will also explore DeepSeek's strategy of cutting training costs by 90% through its distinctive architecture, showing how a small team can outperform larger organizations [11][13].
Group 4: Practical Tools and Frameworks
- The course will introduce a practical framework for innovation, focusing on model building, single-point breakthroughs, and team organization, all of which are essential for navigating the AI landscape [11][25].
- Participants will learn how to identify their business's cognitive axes and value dimensions, as well as the management principles of emergent organizations [11][25].
Group 5: Target Audience
- The course is designed for innovators of various kinds, including entrepreneurs, executives, product managers, investors, and technology enthusiasts who seek to leverage cognitive advantages in the AI era [17][18].
Artificial intelligence is still not a modern science, yet people are keen to dress it up in four ways
Guancha.cn · 2025-05-21 00:09
Group 1
- The term "artificial intelligence" was formally introduced at a conference at Dartmouth College in 1956, marking the beginning of efforts to replicate human intelligence through modern science and technology [1]
- Alan Turing is recognized as the father of artificial intelligence for introducing the "Turing Test" in 1950, which provides a method to determine whether a machine can exhibit intelligent behavior equivalent to a human's [1][3]
- In the Turing Test, a human evaluator interacts with an isolated "intelligent agent" through a keyboard and display; if the evaluator cannot distinguish the machine from a human, the machine is considered intelligent [3][5]
Group 2
- The Turing Test is characterized as a subjective evaluation method rather than an objective scientific test, because it relies on human judgment rather than consistent, measurable criteria [6][9]
- Despite claims of machines passing the Turing Test, such as Eugene Goostman in 2014, there is no consensus that these machines possess human-like thinking capabilities, which highlights the limitations of the Turing Test as a scientific standard [6][8]
- Turing's original paper contains subjective reasoning and speculative assertions which, while valuable for exploration, do not meet the rigorous standards of scientific argumentation [8][9]
Group 3
- The field of artificial intelligence has been criticized for lacking a solid scientific foundation, often relying on conjecture and analogy rather than empirical evidence [10][19]
- The emergence of terms like "scaling law" in AI research reflects a trend of using non-scientific concepts to justify claims about machine learning performance, claims that may not hold up under scrutiny [16][17]
- Historical critiques, such as those from Hubert L. Dreyfus in 1965, emphasize the need for a deeper scientific understanding of AI rather than superficial advances based on speculative ideas [18][19]
Group 4
- AI as a practical technology has made significant progress, yet it remains a modern craft rather than a fully fledged scientific discipline [20][21]
- Future advances in AI should adhere to the rational norms of modern science and technology and avoid the influence of non-scientific factors on its development [21]
Li Shanyou: DeepSeek is the AI fulcrum of the nation's fortunes
Hundun Academy (混沌学园) · 2025-04-27 10:16
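On April 25, 2025, Li Shanyou's 2025 New Year lecture, which also served as the opening ceremony of the Hundun AI Innovation Institute, officially began. The theme of Day 1 was "The Advance of AI." In the morning lecture, the professor said with emotion: DeepSeek will be the AI fulcrum of the nation's fortunes. Below are notes from Professor Li Shanyou's lecture.
Speaker | Li Shanyou
I believe the next 20 years will surely be a golden 20 years for AI in China.
Before the lecture began, a colleague asked me: Professor, how long did you spend preparing for this class? I thought: taking the long view, perhaps ten years; taking the short view, perhaps 18 months.
So for these 18 months I have kept asking: what is the defining question of our era? Which question should Hundun respond to? I want to raise the question that is the greatest common denominator like a banner and take it up together with all our students. What exactly is that question? I have kept thinking about it.
Because Musk had seen something: Google had acquired DeepMind, then the most advanced AI lab. Musk harbored a deep worry: AI is more threatening than nuclear weapons, and if AI is left to develop unchecked, it will eventually turn around and control humanity, or even destroy it.
In fact, I think OpenAI is the pioneer of this round of the AI revolution. I feel that to OpenAI, the pioneer of the AI revolution, people all over the world should ...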