LLaMA Series Models
A Chinese Researcher's $200 Million Annual Pay Breaks the Ceiling as the AI Race Splits into Fire and Ice
Sou Hu Cai Jing (Sohu Finance) · 2025-07-11 06:03
Group 1
- Meta has offered Ruoming Pang, a prominent AI/ML expert from Apple, annual compensation of more than $200 million to strengthen its newly established "Superintelligence Labs" [4][8]
- Pang's package exceeds Apple CEO Tim Cook's pay of $74.6 million and approaches the earnings of sports stars such as Cristiano Ronaldo and Stephen Curry [4]
- Most of the compensation consists of stock options, signing bonuses, and performance-based incentives, which unlock only after years of service and once Meta meets its market-value growth targets [4]

Group 2
- Microsoft has laid off 15,000 employees, including 9,000 in its third round of layoffs, as part of a cost-cutting strategy amid a sharp increase in AI infrastructure investment [5][7]
- The layoffs reflect a broader tech-industry trend of restructuring to concentrate resources on AI: Amazon has cut 27,000 jobs, and firms such as Google and IBM have also reduced staff [7]
- The shift toward AI is displacing traditional IT roles: 40% of the positions affected by Microsoft's layoffs were software engineers, a sign of a significant workforce transformation [5][7]

Group 3
- Meta's recruitment of Pang is part of a larger strategy to strengthen its capabilities in large language models and intelligent assistants, in response to concerns that its AI progress lags behind competitors [9]
- Apple is reportedly considering abandoning in-house large language model development in favor of technology from Anthropic or OpenAI because internal progress has been slow, which has led several key AI engineers to leave [9]
- Competition for AI talent is intensifying, with Meta actively recruiting from leading tech firms to fill gaps in its AI research and development [9]
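Precisely Steering Large-Model Generation and Reasoning! A New Method from Zhejiang University & Tencent Tries to Inject a "Behavior-Directing Agent"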
QbitAI (量子位) · 2025-06-05 10:28
Core Viewpoint
- The article discusses the dilemma of controlling large AI models, which must balance intelligence against compliance, and presents the Steering Target Atoms (STA) method as a way to build AI that is both smart and obedient [1][6].

Method & Experimental Results
- The STA method enables "atomic-level" editing of large-model behavior, improving the robustness and safety of output control [2].
- Traditional methods often couple safety defenses with general intelligence, which can force performance trade-offs. STA instead intervenes at the level of internal neurons: it identifies and adjusts the specific neurons associated with harmful behaviors while preserving those linked to correct responses [4][5].
- STA has been tested on models such as Gemma and LLaMA, where it showed superior detoxification performance without a significant negative impact on general performance [10].

Experimental Setup
- The researchers manipulated the direction and amplitude of target atoms to regulate model behavior, with extensive testing across model configurations [9].

Key Experimental Results
- STA outperformed other techniques on detoxification while maintaining general performance, as shown in the article's comparative results table [10].

Steering Vectors vs. Prompt Engineering
- The article compares steering vectors with traditional prompt engineering and finds that steering is more robust against jailbreak attacks and allows finer-grained control [12][13].

Cognitive Intervention in Large Models
- The research also explored cognitive interventions in larger models such as DeepSeek-R1, enhancing reasoning capability by amplifying the weights of neurons associated with "thinking" [16][18].
- The findings indicate that although steering techniques are less convenient than prompts, they deliver more robust and precise interventions [18].

Open Source Contribution
- The research team has open-sourced some of its intervention methods to encourage further work on safe and controllable large models [19].
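
The neuron-level intervention summarized above can be illustrated with a toy sketch. This is not the research team's implementation: it assumes synthetic activations in place of a real model, picks "target" units by a simple difference of mean activations between two prompt sets, and applies a sparse steering vector to one hidden state. The function names `select_target_atoms` and `steer`, the planted units 3 and 7, and all numeric settings are hypothetical.

```python
import numpy as np


def select_target_atoms(harmful_acts, safe_acts, top_k):
    """Pick the top_k units whose mean activation differs most between
    the two prompt sets, and build a steering vector that is zero elsewhere."""
    diff = harmful_acts.mean(axis=0) - safe_acts.mean(axis=0)
    idx = np.argsort(-np.abs(diff))[:top_k]
    steering = np.zeros_like(diff)
    steering[idx] = diff[idx]
    return idx, steering


def steer(hidden, steering, strength):
    """Shift a hidden state against the steering direction.
    Units outside the selected set are left untouched."""
    return hidden - strength * steering


# Synthetic stand-in for activations collected from a model (assumption).
rng = np.random.default_rng(0)
dim = 16
safe_acts = rng.normal(size=(200, dim))
harmful_acts = rng.normal(size=(200, dim))
# Toy assumption: units 3 and 7 carry the unwanted behavior.
harmful_acts[:, [3, 7]] += 4.0

atom_idx, steering_vec = select_target_atoms(harmful_acts, safe_acts, top_k=2)
hidden = harmful_acts[0]
steered = steer(hidden, steering_vec, strength=1.0)

print("selected units:", sorted(atom_idx.tolist()))
print("units changed:", int(np.count_nonzero(steered - hidden)))
```

Restricting the edit to a few selected units is what keeps the rest of the representation intact in this toy; the `strength` argument plays the role of the amplitude that the summary says the researchers varied.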
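Chinese AI Models Surge Across the Board as the Comprehensive Open-Source Influence Ranking of AI Large-Model Technology Systems Is Released!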
AI科技大本营 (AI Tech Base Camp) · 2025-04-18 05:53
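At the mention of "large models," many people first think of "the model itself," the thing that can chat, write code, and draw. But a large model is far more than "a program that outputs results." Behind it stands a complex and extensive technology system: large-scale, high-quality, diverse data; advanced model architectures and training strategies; system capabilities such as inference deployment and resource scheduling that support real-world use; and an indispensable scientific evaluation mechanism. A large model is better understood as a "technology community" made up of models, data, systems, evaluation platforms, and other elements than as a stack of isolated modules.

Today, set against closed-source technical barriers and high commercial entry costs, open-source large models are rising quickly and becoming an important force in making AI technology broadly accessible. But with open-source AI model technologies emerging one after another, how should we choose among them? What are the strengths and weaknesses of each model technology system?

Against this backdrop, and to present systematically the state of open-source development in the global large-model ecosystem, CSDN and several partner institutions released the "Comprehensive Open-Source Influence Ranking of AI Large-Model Technology Systems" on April 18 at the 2025 Global Machine Learning Technology Conference (ML-Summit 2025). The ranking evaluates the contributions and influence of open-source large-model technology systems worldwide, aiming to give the industry a reference point and keep open-source innovation moving forward.

Note: "large model" here mainly refers to model architectures from decoder-only onward, including ...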
Turing Award Winner LeCun: Human Intelligence Is Not General Intelligence, and Next-Generation AI May Be Non-Generative
QbitAI (量子位) · 2025-04-14 09:09
Core Viewpoint
- Human intelligence is not general intelligence; it is specialized, having evolved to solve survival-related problems, which makes the term AGI (Artificial General Intelligence) misleading [2][18].

Group 1: Next Generation AI
- The next breakthrough in AI may come from non-generative models, contrary to the current focus on generative AI [3][14].
- Current AI technologies such as large language models (LLMs) are limited in generalization and reasoning, both of which are essential for human-like intelligence [20][21].
- Reaching human-level intelligence will require inventing new technologies, because current AI is still far from that goal [8][10].

Group 2: AI Capabilities
- Future AI must have several key abilities, including world modeling, reasoning, planning, and long-term memory, none of which relies solely on language [17][22].
- Understanding the physical world and adapting to it is crucial if AI is to function the way biological entities do [21][23].

Group 3: Open Source Strategy
- Meta's decision to open-source the LLaMA series models is driven by ethical considerations and aims to foster innovation and participation from academia and startups [25][27].
- Open-source strategies are seen as essential for accelerating AI breakthroughs, since no single company can monopolize all innovation [28][33].

Group 4: Future Directions
- Smart glasses are identified as an important direction for the practical application of AI technology [29].
- Future AI assistants should focus on multi-sensory interaction, teams of specialized virtual assistants, and the ability to adapt to the user's environment [34].