April gaming revenue up over 20% year-on-year; gaming ETF (516010) gains over 3%
Mei Ri Jing Ji Xin Wen· 2025-06-03 03:01
Group 1
- In April 2025 the Chinese gaming market generated 27.351 billion yuan in revenue, up 21.93% year-on-year, with mobile gaming growing 28.41% and overseas revenue rising 9.62% [1]
- DeepSeek R1 has demonstrated globally leading deep-reasoning capability, surpassing o3 and Gemini 2.5 Pro on the math benchmark AIME2024 and the code benchmark LiveCodeBench, a roughly 15% improvement over the previous version [1]
- Advances in artificial intelligence are expected to lift the gaming sector: the industry is a mature application area for AI, and the integration of large language models may give rise to new gameplay [1]

Group 2
- R1's text understanding and creative writing have improved, with hallucination rates in rewriting, summarizing, and reading comprehension reduced by 45%-50%, and marked gains in long-form writing and role-playing [1]
- Future development may let large language models endow game characters with independent personalities, enabling them to act within virtual worlds and potentially creating new gameplay experiences [1]
Unmasking "pseudo-forgetting" in large models; PolyU-led team: if the structure is unchanged, nothing was forgotten
量子位· 2025-06-01 03:40
Sensitive information exposed during training is often "memorized" by models, drawing widespread concern.

Contributed by the Machine Unlearning team to 量子位 | WeChat account QbitAI

In recent years the capabilities of large language models (LLMs) have advanced rapidly, but the accompanying privacy risks have gradually surfaced. Against this backdrop, machine unlearning has emerged, aiming to selectively erase specific knowledge without degrading overall capability.

A research team from The Hong Kong Polytechnic University, Carnegie Mellon University, and UC Santa Cruz built a suite of representation-space diagnostic tools to systematically distinguish "reversible forgetting" from "catastrophic, irreversible forgetting," and for the first time revealed the pattern of representational-structure change behind forgetting: true forgetting appears only when multiple network layers undergo coordinated, large-magnitude perturbations. By contrast, small updates in highly sensitive regions (such as the output logits) can sharply reduce accuracy or raise perplexity while the model's internal representation structure remains intact.

The researchers packaged this into a unified representation-level analysis toolbox that diagnoses an LLM's internal changes during unlearning, relearning, fine-tuning, and similar processes.

True forgetting is structural erasure, not behavioral suppression. The researchers argue: "If a model 'forgets' only at the token-output level while its internal structure is almost unchanged, ...
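One concrete way to probe the "structure unchanged" claim is a representation-similarity score such as linear CKA, compared layer by layer before and after unlearning. The sketch below is illustrative only and is not necessarily the diagnostic implemented in the team's toolbox:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA similarity between two activation matrices.

    X, Y: (n_samples, n_features) hidden states for the same inputs,
    e.g. one layer's activations before and after unlearning.
    Values near 1 mean the layer's representational geometry is
    essentially unchanged, even if output behavior has shifted.
    """
    X = X - X.mean(axis=0)  # center features
    Y = Y - Y.mean(axis=0)
    cross = np.linalg.norm(Y.T @ X, "fro") ** 2
    return cross / (np.linalg.norm(X.T @ X, "fro") *
                    np.linalg.norm(Y.T @ Y, "fro"))

rng = np.random.default_rng(0)
acts_before = rng.standard_normal((64, 16))
# An identical (or merely rotated) representation scores 1.0, which is
# how "behavioral forgetting with intact structure" would show up
# under this kind of diagnostic.
print(round(linear_cka(acts_before, acts_before), 4))  # 1.0
```

Running this per layer and plotting the scores across depth would distinguish a shallow logit-level patch (high CKA everywhere) from the coordinated multi-layer disruption the article identifies with true forgetting.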
How LLM agents can break through the bottleneck to large-scale application: the key is Agentic ROI
机器之心· 2025-05-30 04:16
Core Viewpoint
- The main barrier to the usability of large language model agents (LLM Agents) is not model capability but "Agentic ROI," which has not yet reached the threshold for widespread practical application [1][3][4]

Group 1: Agentic ROI Concept
- Agentic ROI (Agentic Return on Investment) is a key metric that measures the ratio of "information yield" to "usage cost" for LLM Agents in real-world scenarios [4]
- Usability is achieved only when information quality exceeds a certain threshold and the time and cost saved by the agent are sufficiently large [4][5]

Group 2: Current Application Landscape
- Most LLM Agents are currently deployed in scenarios where the human time cost per task is high, such as research and programming; labor there is intensive, so efficiency gains are significant [7]
- In high-demand everyday applications such as e-commerce and personal assistants, tasks are simpler, so an agent's marginal value is lower; it may even add interaction cost and latency, resulting in low Agentic ROI [7]

Group 3: Development Trajectory
- The development path of LLM Agents follows a "zigzag" pattern: first scaling up to raise information quality, then scaling down to cut time and cost while holding quality steady [9]
- The evolution of foundation models such as the OpenAI series illustrates this zigzag: larger models bring major performance gains, followed by smaller models that preserve performance while reducing inference cost and latency [9]

Group 4: Scaling Up Information Quality
- Pre-training scaling expands model size, data volume, and compute to strengthen foundational language understanding and reasoning [11]
- Post-training scaling, including supervised fine-tuning and reinforcement learning, aligns agent behavior with human needs and values, relying on extensive interaction data for continuous learning [12]
- Test-time scaling focuses on building a world model that supports multimodal interaction and can handle complex tasks while reflecting real-world uncertainty [13]

Group 5: Ensuring Robustness and Security
- Robustness and security are crucial for information quality: agents must be prevented from exploiting reward mechanisms and protected against data contamination and feedback manipulation [16]

Group 6: Scaling Down to Reduce Time and Cost
- Memory mechanisms let agents skip redundant computation by reusing past knowledge, improving processing speed [18]
- Model compression can substantially reduce compute and inference latency without sacrificing performance [18]
- Optimized reasoning strategies and infrastructure further improve efficiency and responsiveness [18]

Group 7: Cost Management
- Reducing interaction time by having agents proactively infer user intent lowers cognitive burden and improves user experience [19]
- Operational costs must be managed in large-scale deployments by optimizing context management and controlling inference complexity [19]
- Agentic ROI provides a framework for evaluating real usability, shifting the focus from raw model performance to practical benefit and overall efficiency [19]
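The Agentic ROI idea above can be made concrete as a simple ratio. The function below is a hypothetical operationalization for intuition only; the argument names and exact form are assumptions, not the paper's definition. Quality-weighted human time saved sits in the numerator, and the user's interaction time plus monetary expense in the denominator:

```python
def agentic_roi(info_quality, human_time, agent_time,
                interaction_time, expense):
    """Hypothetical Agentic ROI: information yield over usage cost.

    A sketch of the metric described above, NOT the paper's exact
    formula. All arguments are in arbitrary but consistent units
    (e.g. hours, with money mapped onto the same scale).
    """
    time_saved = human_time - agent_time  # human effort the agent replaces
    return (info_quality * time_saved) / (interaction_time + expense)

# A research task: hours of human work replaced, little interaction
# needed -> ROI well above break-even.
print(agentic_roi(info_quality=0.9, human_time=8.0, agent_time=0.5,
                  interaction_time=0.2, expense=1.0))
# A quick e-commerce query: seconds of human effort, so even a cheap
# agent's interaction overhead dominates -> ROI below break-even.
print(agentic_roi(info_quality=0.9, human_time=0.05, agent_time=0.01,
                  interaction_time=0.05, expense=0.02))
```

On these toy numbers the research-style task clears break-even (ROI > 1) while the everyday query does not, mirroring the article's contrast between the two application classes.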
After Google, NVIDIA enters diffusion LLMs: Fast-dLLM accelerates inference by 27.6x
机器之心· 2025-05-30 03:28
Core Viewpoint
- The article discusses Fast-dLLM, a training-free acceleration approach that delivers a breakthrough in inference speed for diffusion models, advancing the practical application of large language models (LLMs) [2][20]

Group 1: Core Technology
- Fast-dLLM employs a Block-Wise KV Cache mechanism that achieves over 90% activation reuse, significantly improving computational efficiency for long-sequence inference [6][12]
- Confidence-Aware Parallel Decoding decodes tokens in parallel while maintaining token dependencies, filtering candidates by confidence level to keep generation coherent [9][13]
- A dual-cache strategy caches both prefix and suffix attention activations simultaneously, cutting redundant computation and enhancing performance [12]

Group 2: Performance Breakthrough
- Fast-dLLM achieves a 27.6x end-to-end speedup on long-text generation, reducing single-step latency from 0.26 seconds to 0.09 seconds and overall time from 266 seconds to 12 seconds [18]
- Accuracy loss on mainstream benchmarks stays under 2%, demonstrating that speed is gained while quality is maintained [19]

Group 3: Application Value
- The zero-training-cost design makes Fast-dLLM an ideal inference-optimization tool, allowing quick integration into existing systems without altering model architecture or training processes [20]
- The approach is compatible with existing models such as LLaDA and Dream, achieving significant throughput improvements while maintaining competitive accuracy [21]

Group 4: Summary and Outlook
- Fast-dLLM represents a significant advance in inference efficiency for diffusion models while ensuring stable generation quality, paving the way for broader applications in real-time interaction and long-text generation [23]
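Confidence-aware parallel decoding can be sketched in a few lines: at each step, commit every masked position whose top-1 probability clears a threshold, and leave the rest masked for later steps. This is a toy illustration with assumed names (THRESHOLD, the mock distributions), not Fast-dLLM's actual implementation:

```python
import numpy as np

THRESHOLD = 0.9  # assumed confidence cutoff; the real system tunes this

def parallel_decode_step(probs, tokens, mask):
    """One confidence-aware parallel decoding step (illustrative sketch).

    probs:  (seq_len, vocab) model distribution over every position
    tokens: (seq_len,) current sequence, -1 at masked positions
    mask:   boolean array, True where still masked
    Commits, in parallel, every masked position whose top-1 probability
    clears THRESHOLD; low-confidence slots stay masked for later steps,
    which preserves dependencies between mutually uncertain tokens.
    """
    conf = probs.max(axis=-1)
    top = probs.argmax(axis=-1)
    commit = mask & (conf >= THRESHOLD)
    if mask.any() and not commit.any():
        # Always commit at least the single most confident masked slot
        # so decoding cannot stall.
        commit[np.where(mask)[0][conf[mask].argmax()]] = True
    tokens = tokens.copy()
    tokens[commit] = top[commit]
    return tokens, mask & ~commit

# Toy example: 4 masked slots over a vocabulary of 5; peaky Dirichlet
# samples stand in for the model's per-position distributions.
rng = np.random.default_rng(1)
probs = rng.dirichlet(np.full(5, 0.2), size=4)
tokens = np.full(4, -1)
mask = np.ones(4, dtype=bool)
tokens, mask = parallel_decode_step(probs, tokens, mask)
print(tokens, mask)
```

Iterating this step until no positions remain masked yields the full sequence; the fewer low-confidence slots per step, the closer the process gets to fully parallel, single-pass decoding.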
2025 International Humanoid Robot Skills Competition held; industry calls for rational tolerance of robotics' "growing pains"
The 2025 Zhangjiang Embodied Intelligence Developer Conference and 2025 International Humanoid Robot Skills Competition, themed "Embodied Intelligence, the Future Is Here," was held on May 29 in Zhangjiang, Pudong, Shanghai. The competition featured 5 major tracks covering 28 high-difficulty scenarios, contested by more than 60 top teams and contestants from China and abroad. The event aims to showcase humanoid robots' ability to solve real problems and land in application scenarios, helping the robotics industry advance toward machines that "can see, can speak, and have intelligence."

Jiang Lei, chief scientist of the National and Local Co-built Humanoid Robot Innovation Center, told Securities Times that Shanghai is focused on the practical application of humanoid robots in manufacturing and service scenarios, stressing "solving real problems." The competition set tasks in real scenarios such as supermarkets, pharmacies, and industrial manufacturing, using scenarios to drive the industry back toward practicality. Jiang said the public should view the current state of the robotics industry with tolerance: its rapid development spans only about three years, making it like a "three-year-old child" that needs more time to grow.

Contestants: view the robotics industry's progress rationally

That day, more than 60 top teams and contestants from China and abroad, including Shanghai Kepler's K2 "Bumblebee" team, the Beijing Institute of Technology Jixing Zhixue team, and the Tsinghua University Zijing team, competed intensely across 9 venues simultaneously.

The reporter learned that all events in this competition were drawn from real enterprise needs, each track reproduced a real application scenario, and the overall difficulty was high. As a result, robots failing to complete tasks and ...
Linear-MoE: an open-source practice where linear attention meets mixture-of-experts
机器之心· 2025-05-29 11:38
Core Insights
- The article highlights the rise of the Linear-MoE architecture, which effectively combines linear sequence modeling with Mixture-of-Experts (MoE) for enhanced performance in large language models [1][10]

Group 1: Linear Sequence Modeling
- Linear sequence modeling has advanced significantly over the past two years, characterized by linear time complexity in training and constant memory usage during inference [5]
- Its main categories are Linear Attention, State Space Models (SSM), and Linear RNN, with notable works including Lightning Attention, GLA, Mamba2, and RWKV [5]

Group 2: Mixture-of-Experts (MoE)
- MoE has become an industry standard, adopted by models such as GPT-4 and Gemini as well as domestic models such as DeepSeek and Qwen [8]
- MoE's importance for enhancing model capability is emphasized, though the article does not explore this aspect in depth [8]

Group 3: Linear-MoE Architecture
- Linear-MoE offers a complete system from modeling to training, allowing flexible combinations of linear sequence modeling layers and MoE layers while remaining compatible with traditional Softmax Attention Transformer layers [10]
- Key features include a modular architecture supporting various linear modeling methods and multiple MoE implementations, with stability and scalability ensured through the Megatron-Core framework [10]

Group 4: Performance and Future Prospects
- Large-scale experiments validate Linear-MoE's advantages, demonstrating inference 2-5 times faster than traditional architectures and over 50% reduction in memory usage [12][13]
- The open-source release fills a technical gap and provides reproducible training solutions, with future exploration planned for long-context understanding and Vision-Language model architectures [13]
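The "linear time, constant memory" property of the linear-attention family mentioned above comes from replacing the softmax similarity with a factorizable feature map, so causal attention reduces to a running sum. A generic sketch follows; this is textbook kernelized linear attention for intuition, not Linear-MoE's specific implementation:

```python
import numpy as np

def linear_attention(Q, K, V):
    """Causal linear attention in O(n) time with O(1) state (sketch).

    Instead of the softmax similarity exp(q.k), uses a feature map
    phi(x) = elu(x) + 1 so attention factorizes:
        out_t = phi(q_t) @ S_t / (phi(q_t) @ z_t)
    where S_t = sum_{i<=t} phi(k_i) v_i^T and z_t = sum_{i<=t} phi(k_i)
    are running sums, i.e. a constant-size recurrent state. This is
    the property that keeps memory flat at inference time.
    """
    def phi(x):
        return np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1, always > 0

    Qf, Kf = phi(Q), phi(K)
    d_k, d_v = Qf.shape[1], V.shape[1]
    S = np.zeros((d_k, d_v))  # running sum of phi(k) v^T
    z = np.zeros(d_k)         # running sum of phi(k)
    out = np.empty_like(V)
    for t in range(Q.shape[0]):
        S += np.outer(Kf[t], V[t])
        z += Kf[t]
        out[t] = (Qf[t] @ S) / (Qf[t] @ z + 1e-9)
    return out

rng = np.random.default_rng(0)
n, d = 8, 4
Q = rng.standard_normal((n, d))
K = rng.standard_normal((n, d))
V = rng.standard_normal((n, d))
y = linear_attention(Q, K, V)
print(y.shape)  # (8, 4)
```

Because the state (S, z) has fixed size regardless of sequence length, stacking such layers with MoE feed-forward blocks, as Linear-MoE does, keeps inference memory constant while MoE supplies the parameter capacity.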
Rethinking the boundaries and potential of agents: AI transformation interview series
36Kr · 2025-05-29 10:53
Core Insights
- The year 2025 is referred to as the "Year of Agents," with various AI agents emerging in both enterprise and personal planning tools, yet a unified definition remains elusive [1]
- AI Native companies are redefining the boundaries of agents, moving beyond efficiency tools to explore deeper value in business insights, creative generation, and organizational transformation [1][3]
- Tezan founder Dr. Fan Ling emphasizes the potential of large language models to simulate real user behavior, enabling AI to not only answer questions but also proactively build user profiles and drive decision-making processes [1][3]

Product Innovation
- Atypica.ai distinguishes itself by using large language models to simulate real individuals as study subjects, facilitating large-scale user interviews at low cost [5][6]
- The product employs a divergence-first reasoning model suited to the non-consensus, artistic aspects of business problems, in contrast to traditional convergence-first research methods [5]
- Atypica.ai allows AI to generate non-consensus viewpoints, broadening the scope of thinking, which is particularly useful for issues such as public-opinion surveys [5]

Organizational Transformation
- AI is shifting work dynamics from specialized roles toward more versatile positions, yielding organizations with fewer job titles but more composite skills [5][41]
- The company envisions a future in which every employee can unleash their potential, akin to a "unicorn," rather than being confined to a narrow job description [41]
- The integration of AI is expected to reshape traditional industrial thinking, encouraging a return to the multi-talented roles reminiscent of the Renaissance [41]

Market Research Applications
- Atypica.ai can address four main business problems: market insights, product co-creation, product testing, and content planning [19][20]
- The system can analyze user feedback on products, for example identifying the needs of young families for multi-purpose vehicles in the electric vehicle market [19]
- The platform can generate detailed consumer personas and conduct interviews with simulated users to gather insights efficiently [18][19]

Data Integration and Accuracy
- The company is collaborating with authoritative media to integrate unique data sources, enhancing the accuracy of analyses beyond social-media narratives [22]
- The dual nature of hallucination and accuracy in commercial research is acknowledged, where diverse perspectives are essential for understanding complex business problems [24]

Future of AI Agents
- The relationship between humans and virtual agents is expected to evolve, with agents serving as both tools and mirrors reflecting human society [5][6]
- The potential for AI to simulate real personalities raises questions about the future coexistence of humans and virtual agents, challenging traditional views of AI as mere tools [59][60]
Rethinking the boundaries and potential of agents | AI transformation interview series
腾讯研究院· 2025-05-29 09:28
Core Insights
- The year 2025 is referred to as the "Agent Year," with various AI agents emerging in both enterprise and personal planning tools, yet a unified definition remains elusive [1]
- AI Native companies are redefining the boundaries of agents, moving beyond efficiency tools to explore deeper value in business insights, creative generation, and organizational transformation [1]
- Atypica.ai, developed by the company, simulates real user behavior using large language models, allowing AI to not only answer questions but also proactively build user profiles and drive decision-making processes [3][4]

Product Innovation
- Atypica.ai innovates by simulating real users and conducting large-scale user interviews at low cost through multiple AI assistants [3]
- The model prioritizes divergent thinking, suited to the non-consensus, artistic aspects of business problems, in contrast to traditional convergent research methods [3]
- The concept of "hallucination" is leveraged to let AI generate non-consensus viewpoints, broadening the scope of thinking [3]

Organizational Transformation
- AI is shifting work dynamics from specialized roles toward more versatile positions, leading to organizational structures with fewer roles but more composite skills [3]
- The potential of each employee is emphasized, suggesting that AI will not replace humans but enable them to unleash their full potential [3]
- The relationship between virtual agents and humans is evolving, with AI serving as a mirror of human society that may reshape work and life [3]

Workflow and Use Cases
- Atypica.ai's workflow involves identifying business problems, clarifying user needs through targeted questions, and simulating user personas for analysis [18][19]
- The system can address four main business issues: market insights, product co-creation, product testing, and content planning [20]
- Atypica.ai has been used to analyze user feedback on products, co-create with target user groups, and guide content direction for social-media influencers [21]

Future Perspectives
- The article discusses the potential for AI to redefine personal planning and decision-making, emphasizing the dual nature of commercial research as both science and art [25][26]
- The integration of authoritative data sources is seen as crucial for ensuring the authenticity of analyses, especially for high-stakes inquiries [25]
- The future of work is envisioned as a shift toward more holistic roles, with employees taking on broader responsibilities rather than being confined to narrow job descriptions [45][46]
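The interview workflow described above can be sketched as a loop over simulated personas, each answering the same questions in character. Everything here (Persona, ask_model, the prompt wording) is hypothetical scaffolding for illustration; a real system would call an actual LLM where the stub stands:

```python
from dataclasses import dataclass

@dataclass
class Persona:
    name: str
    profile: str  # demographics and habits a real system would synthesize

def ask_model(prompt: str) -> str:
    """Stub for an LLM call; a real system would query a model here."""
    return f"[simulated answer to: {prompt[:40]}...]"

def simulate_interviews(business_question, personas, questions):
    """Hypothetical sketch of the interview loop described above:
    each simulated persona stays in character and answers every
    question, producing transcripts for downstream analysis."""
    transcripts = {}
    for p in personas:
        transcripts[p.name] = [
            ask_model(f"You are {p.profile}. "
                      f"Regarding '{business_question}': {q}")
            for q in questions
        ]
    return transcripts

personas = [
    Persona("young-family", "a parent of two choosing a family EV"),
    Persona("commuter", "an urban commuter who values range and price"),
]
out = simulate_interviews("multi-purpose EV needs", personas,
                          ["What matters most in a family vehicle?"])
print(len(out), len(out["commuter"]))  # 2 1
```

The appeal of this pattern is cost: once personas are generated, adding another hundred simulated interviewees is a loop iteration rather than a recruiting effort, which is the economics the article attributes to the product.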
XPeng Motors-W (09868): class-leading intelligent driver assistance, with pricing beating expectations
Changjiang Securities· 2025-05-28 23:30
Investment Rating
- The investment rating for the company is "Buy" and is maintained [6]

Core Views
- On May 28, 2025, the company launched the MONA M03 MAX version, comprising two models: the 502 Long Range Max priced at 129,800 yuan and the 600 Ultra Long Range Max priced at 139,800 yuan. Both feature the full-version AI Tianji system and Turing driving assistance, achieving the strongest urban intelligent driving assistance in their class. Sales are expected to accelerate on a strong new-vehicle cycle, channel transformation, and an enhanced marketing system; financial performance is anticipated to improve continuously through scale, platform- and technology-driven cost reduction, expansion of software profitability models, and ongoing international growth [2][4][9]

Summary by Sections

Event Description
- The MONA M03 MAX version was officially launched on May 28, 2025, featuring two models priced at 129,800 and 139,800 yuan, equipped with advanced AI systems and driving assistance technologies [4]

Sales and Financial Projections
- Expected Q2 2025 deliveries are 102,000-108,000 units, up 237.7%-257.5% year-on-year, with projected revenue of 17.5-18.7 billion yuan, up 115.7%-130.5%. Multiple new models are set to launch, supporting a strong new-vehicle cycle and further sales growth [6][9]

Competitive Advantage
- The MONA M03 Max is the first in its class to feature dual Orin-X chips, providing 508 TOPS of computing power, significantly surpassing competitors. Its intelligent driving capabilities adapt to driver style, allowing seamless control transfer between driver and vehicle [9]

Future Outlook
- The company expects a single-quarter profit turnaround by Q4 2025 and positive overall cash flow for the year. Anticipated 2025 revenue is 99.1 billion yuan, corresponding to a price-to-sales ratio of 1.3X, indicating significant financial improvement as the company enters a new vehicle cycle [9]
Jeff Dean: within a year, AI will replace junior engineers. Netizens: "Altman just makes empty promises; when Jeff says it, it's deadly serious"
AI前线· 2025-05-28 05:17
Core Insights
- Jeff Dean, a prominent figure in AI, predicts that within a year, AI systems capable of functioning like junior engineers will be available [1][15][16]
- The conversation highlights the transformative potential of AI in software development and the broader implications for the job market [4][10]

Group 1: AI Development and Trends
- AI has been evolving for over a decade, with significant advances in neural networks and machine learning since 2012 [5][6]
- The mantra "larger models, more data, better results" has held true over the past 12 to 15 years, pointing to increasingly capable AI systems [6][8]
- The emergence of multi-modal AI, capable of processing various input formats, is seen as a crucial industry trend [6][8]

Group 2: AI Capabilities and Applications
- AI agents are expected to perform tasks traditionally requiring human intervention, with a clear path to enhancing their capabilities through reinforcement learning [7][8]
- Developing large models requires significant investment, implying a market in which only a few advanced models will survive [9][10]
- The potential for AI to revolutionize education and other fields is highlighted, with examples of AI generating educational content from video inputs [11][12]

Group 3: Hardware and Infrastructure
- Specialized hardware for machine learning is critical, with Google's TPU project a significant development in this area [17][20]
- The future of computing infrastructure is expected to adapt to the demands of running large-scale neural networks efficiently [22][23]
- The distinction between training and inference workloads is emphasized, suggesting that different solutions may be required for each [23][24]

Group 4: Future of AI Models
- Sparse models, which route work to specialized parts of the model, are viewed as a promising direction for future AI development [26][27]
- Dynamic scaling, allowing the addition of new parameters and efficient resource allocation, is proposed as a more organic approach to AI learning [27][28]