Large Language Models
Amazon Research Awards winners announced: Jindong Wang among 26 Chinese scholars selected
机器之心· 2025-11-28 04:11
Core Insights
- The Amazon Research Awards (ARA) announced 63 recipients from 41 universities across 8 countries, 26 of them Chinese scholars, to fund multidisciplinary research topics [1][2].

AI Information Security
- Eight researchers in AI information security received awards, three of them Chinese scholars [3].
- Zhou Li (University of California, Irvine) focuses on using LLMs for precise, analyst-friendly attack tracing in audit logs [4].
- Yu Meng (University of Virginia) studies weakly supervised RLHF, modeling ambiguity and uncertainty in human preferences [5].
- Ziming Zhao (Northeastern University) specializes in system and software security, network security, and human-centered security research [6].

Amazon Ads
- Both awardees in the Amazon Ads research area are Chinese [8].
- Xiaojing Liao (University of Illinois Urbana-Champaign) investigates attack methods against large language models, focusing on interpretable vulnerability detection and remediation [10][11].
- Tianhao Wang (University of Virginia) works on differential privacy and machine learning privacy, designing practical algorithms [14].

AWS Agentic AI
- Thirty researchers were awarded in the Agentic AI category, including several Chinese scholars [16].
- Cong Chen (Dartmouth College) aims to drive the global energy transition through engineering methods grounded in optimization, economics, and modern machine learning [19].
- Chunyang Chen (Technical University of Munich) works at the intersection of software engineering, human-computer interaction, and AI [21].

Trainium Development
- Twenty awardees are pursuing research related to Amazon's Trainium AI chips, several of them Chinese researchers [49].
- Kuan Fang (University of Minnesota) works on NetGenius, for autonomous configuration and intelligent operation of next-generation wireless networks [50].
- Shizhong Han (Lieber Institute) focuses on revealing the genetic basis of brain diseases and translating genetic discoveries into new treatments [55].

Think Big Initiative
- Three researchers were awarded under the Think Big initiative, which supports transformative ideas in scientific research, including one Chinese scholar [85].
- Tianlong Chen (University of North Carolina at Chapel Hill) uses molecular dynamics to empower protein AI models [88].
NeurIPS 2025 awards announced: Qwen wins Best Paper
具身智能之心· 2025-11-28 00:04
Core Insights
- The NeurIPS 2025 conference presented four Best Paper awards and three Best Paper Runner-up awards, highlighting significant advances across AI research areas [1][2][4].

Group 1: Best Papers
- Paper 1: "Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)" introduces Infinity-Chat, a dataset of 26,000 diverse user queries, to address homogeneity in language-model outputs [6][8][10].
- Paper 2: "Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free" reveals the impact of gated attention mechanisms on model performance, improving training stability and robustness [12][18].
- Paper 3: "1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities" demonstrates that scaling network depth to 1024 layers significantly improves performance on self-supervised reinforcement learning tasks [19][20].
- Paper 4: "Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training" analyzes the training dynamics of diffusion models, identifying mechanisms that prevent memorization and support generalization [21][23].

Group 2: Awards and Recognition
- The Test of Time Award went to "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", recognized for its foundational impact on computer vision since its 2015 publication [38][42].
- The Sejnowski-Hinton Prize was awarded for "Random synaptic feedback weights support error backpropagation for deep learning", which advanced the understanding of biologically plausible learning rules [45][49].
AI Empowering Asset Allocation (26): AI "Adds Wings": Large Models Enhance Portfolio Returns
Guoxin Securities· 2025-11-27 09:19
Core Insights
- The report analyzes three representative AI asset-management products: AIEQ, ProPicks, and QRFT, assessing whether AI can deliver excess returns for investors [2]
- Overall, while overseas AI asset-management products have improved quality and efficiency, they should not be overly "mythologized." AIEQ, a sentiment-driven active ETF, has underperformed SPY due to high market-sentiment volatility and cost erosion from high fees and turnover [2]
- ProPicks, a subscription-based product, has shown strong returns during tech uptrends but is highly sensitive to execution discipline and slippage, making actual replication challenging [2]
- QRFT, an AI-enhanced ETF, has performed close to the S&P 500, with significant variation across periods, delivering narrow enhancements rather than stable high alpha [2]
- The report concludes that AI's more reliable value lies in improving information-processing efficiency and standardizing research processes, not in guaranteeing consistent outperformance of indices [2]

Group 1: AI-Driven Asset Management: Progress and Cases
- The evolution of global financial markets reflects a long contest between computational power and data-processing capability, marking a paradigm shift in investment decision-making mechanisms [3]
- Traditional quantitative investing relies on linear regression and statistical arbitrage, while the new generation of AI-driven strategies uses deep learning, reinforcement learning, and natural language processing to identify nonlinear market patterns [4]

Group 2: Case Study 1: AIEQ ETF Introduction
- AIEQ, launched on October 17, 2017, is the world's first ETF actively managed entirely by AI, building its investment strategy on IBM Watson's cognitive computing platform [5]
- AIEQ's approach involves high-frequency scanning and sentiment interpretation of the entire market information environment, processing millions of unstructured texts daily [5]

Group 3: AIEQ Performance Analysis
- As of November 2025, AIEQ has a cumulative return of 107.34% since inception but has significantly underperformed the S&P 500 over various time frames [8][13]
- AIEQ's annual turnover rate is an astonishing 1159%, reflecting its sensitivity to short-term market sentiment, which has led to significant cost erosion [18]
- The fund's assets under management have stagnated between $114 million and $117 million, indicating investor disappointment with its long-term underperformance [20]

Group 4: Case Study 2: Investing ProPicks
- ProPicks represents a different AI investment path, a subscription model providing users with monthly stock picks based on a vast historical database and AI algorithms [21]
- ProPicks' "Tech Titans" strategy has achieved a cumulative return of 98.7% since launch, significantly outperforming the S&P 500 by 55% [25]

Group 5: Case Study 3: QRFT
- QRFT, launched in May 2019, uses AI to optimize a traditional factor-investing framework covering quality, size, value, momentum, and low volatility [39]
- As of November 2025, QRFT has performed slightly better than the S&P 500, with a five-year annualized return of approximately 14.9% [44]
- QRFT's turnover rate of 267% indicates a high-frequency rebalancing strategy, which poses cost and performance challenges relative to low-cost index funds [48]
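To see why a 1159% turnover rate erodes returns, consider a back-of-the-envelope drag estimate (an illustrative Python sketch; the 10 bp round-trip trading cost and 75 bp expense ratio are assumed figures for illustration, not AIEQ's disclosed costs):

```python
def annual_cost_drag(turnover, roundtrip_cost, expense_ratio):
    """Approximate annual return drag: each unit of turnover pays the
    round-trip trading cost once, plus the fund's fixed expense ratio."""
    return turnover * roundtrip_cost + expense_ratio

drag = annual_cost_drag(turnover=11.59,        # 1159% annual turnover
                        roundtrip_cost=0.0010, # assumed 10 bp per round trip
                        expense_ratio=0.0075)  # assumed 75 bp management fee
print(f"{drag:.2%}")                           # ≈ 1.91% per year before any alpha
```

Even with these modest cost assumptions, the fund must generate nearly 2% of alpha per year just to break even against a low-turnover index fund.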
The "Taobao of the AI world" MuleRun: 210,000 users in 10 days since launch, aiming to be the world's largest labor outsourcing company
36Kr· 2025-11-27 09:15
"Let the AI mules do the repetitive, tedious, draining work, so humans can do the more 'human' things."

On September 16, MuleRun, the world's first AI Agent marketplace, officially launched, open to all users. MuleRun's logo is a pixel-art mule, and the platform gathers many agents of different types. The agents' creators are mostly experienced people who know a specific workflow in their field; they turn their skills into workflows and package them as agents. Users find an agent on the platform that fits their needs and rent it on demand, paying per use: 3D desktop-figure creation, for instance, is simply an agent built on Nano Banana, costing 50 credits (about $0.50) per run. MuleRun, as the third-party platform, handles traffic, transactions, dollar collection, and ongoing revenue sharing, which is why many people call it the Taobao or Xianyu of the AI world. Crucially, creators can make money here. One oft-cited case is the 3D desktop-figure agent just mentioned: as Nano Banana was taking off, its creator deployed the agent on MuleRun; a user uploads one photo and clicks "run", and in two steps gets a figurine image. That earned the creator $1,200 in three days. A small, concrete, trendy need, ...
Research Report Picks | Guotai Haitong: "Overweight" rating on Wangsu Science & Technology maintained, target price 14.02 yuan
Gelonghui APP· 2025-11-27 08:39
Core Viewpoint
- The report from Guotai Junan Securities indicates that Wangsu Technology achieved significant year-on-year and quarter-on-quarter growth in net profit attributable to the parent company in Q3 2025, driven by active expansion into overseas markets and a comprehensive product offering [1]

Market Expansion
- The company is actively exploring overseas markets, focusing its product offerings on Southeast Asia and the Middle East [1]
- Wangsu Technology is taking its entire product line international, providing security services to local clients and to companies venturing abroad [1]

Product and Service Development
- The company has launched a deep security assessment service for large models, offering a one-stop security solution for large language models and AI applications [1]
- The service covers model output security, data security, algorithm security, and application security, helping enterprises build a secure and trustworthy AI application ecosystem [1]

Valuation and Rating
- Based on comparable-company valuations, Wangsu Technology is assigned a 2025 price-to-earnings ratio of 42x, implying a target price of 14.02 yuan, with the "Overweight" rating maintained [1]
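The PE-based target works out as follows (a trivial Python check; the implied 2025 EPS is back-derived from the report's two figures, not independently sourced):

```python
pe_multiple = 42       # 2025E price-to-earnings ratio assigned in the report
target_price = 14.02   # target price in yuan

# Target price = EPS * PE, so the report implicitly forecasts this EPS:
implied_eps = target_price / pe_multiple
print(f"implied 2025E EPS ≈ {implied_eps:.3f} yuan")  # ≈ 0.334 yuan
```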
NeurIPS 2025 Best Papers announced: Kaiming He, Jian Sun and colleagues' decade-old classic takes a prize
36Kr· 2025-11-27 07:27
Core Insights
- NeurIPS 2025 announced its Best Paper awards, with four papers recognized, including a significant contribution from Chinese researchers [1][2]
- The Test of Time Award went to Faster R-CNN, highlighting its lasting impact on the field of computer vision [1][50]

Best Papers
- The first best paper, "Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)", was authored by a team from multiple prestigious institutions, including the University of Washington and Carnegie Mellon University [5][6]
- The second, "Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free", was a collaboration among researchers from Alibaba, the University of Edinburgh, Stanford University, MIT, and Tsinghua University [14][15]
- The third, "1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities", came from Princeton University and the Warsaw University of Technology [21][24]
- The fourth, "Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training", was a collaboration between PSL University and Bocconi University [28][29]

Runners-Up
- Three runner-up papers were also recognized, including "Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?" from Tsinghua University and Shanghai Jiao Tong University [33][34]
- "Optimal Mistake Bounds for Transductive Online Learning" was authored by researchers from Kent State University, Purdue University, Google Research, and MIT [38][39]
- "Superposition Yields Robust Neural Scaling" came from MIT [42][46]

Test of Time Award
- The Test of Time Award went to "Faster R-CNN", which has been cited over 56,700 times and has significantly influenced the computer vision field [50][52]
- The paper introduced a fully learnable two-stage pipeline that replaced traditional region-proposal methods, achieving high detection accuracy at near real-time speeds [50][52]
Moonshot AI unveils reinforcement-learning training acceleration method: training speed up 97%, long-tail latency down 93%
量子位· 2025-11-27 04:34
Core Viewpoint
- The article introduces Seer, a new acceleration engine developed by Moonshot AI and Tsinghua University that significantly speeds up reinforcement learning (RL) training of large language models (LLMs) without altering the core training algorithms [1][8]

Summary by Sections

Performance Improvement
- Seer improves the rollout efficiency of synchronous RL by 74% to 97% and reduces long-tail latency by 75% to 93% [3][23]

Technical Architecture
Seer consists of three main modules:
1. Inference Engine Pool: built on DRAM/SSD, it comprises multiple inference instances and a global KVCache pool for load balancing and data reuse [9]
2. Request Buffer: the unified entry point for all rollout requests, managing metadata and request states for precise resource scheduling [10]
3. Context Manager: maintains context views for all requests and generates scheduling decisions based on context signals [11]

Key Technologies
- Divided Rollout: breaks responses down into independent requests and segments, reducing memory fluctuation and load imbalance [12][13]
- Context-Aware Scheduling: implements a "speculative request" strategy to obtain length features for requests early, alleviating delays on long requests [17]
- Adaptive Grouped Speculative Decoding: exploits similar response patterns within a group to build a dynamic reference library for draft generation, enhancing decoding efficiency [19]

Experimental Validation
- In experiments on models including Moonlight, Qwen2-VL-72B, and Kimi-K2, Seer demonstrated a throughput increase of 74% to 97% over the baseline system veRL, with significantly reduced long-tail latency [21][23]
- In the Moonlight task, for instance, the last 10% of requests took 3984 seconds under veRL versus 364 seconds under Seer, a reduction in long-tail latency of roughly 91% [23]
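The divided-rollout idea described above can be caricatured in a few lines (a toy Python sketch; the segment size, the least-loaded dispatch rule, and all names are illustrative assumptions, not Seer's actual scheduler):

```python
import heapq

def divided_rollout(group_sizes, max_new_tokens, segment=256, num_instances=4):
    """Split each group's rollout into per-response requests, then into
    fixed-size generation segments, dispatching every segment to the
    currently least-loaded inference instance (a min-heap of token loads)."""
    # One independent request per response in each group.
    requests = [(g, r, max_new_tokens)
                for g, size in enumerate(group_sizes) for r in range(size)]
    loads = [(0, i) for i in range(num_instances)]   # (queued tokens, instance id)
    heapq.heapify(loads)
    placements = []
    for g, r, remaining in requests:
        while remaining > 0:
            step = min(segment, remaining)
            load, inst = heapq.heappop(loads)        # pick least-loaded instance
            placements.append((g, r, inst, step))
            heapq.heappush(loads, (load + step, inst))
            remaining -= step
    return placements, sorted(load for load, _ in loads)

# Two groups of 8 responses, each up to 1024 new tokens, in 256-token segments.
placements, final_loads = divided_rollout([8, 8], max_new_tokens=1024)
print(len(placements), final_loads)
```

Because each response is chopped into small segments that are rescheduled independently, no single instance ends up stuck with a handful of very long rollouts, which is the load-imbalance problem the technique targets.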
Financing and Future Plans
- Moonshot AI is reportedly close to completing a new funding round of potentially several hundred million dollars, which could lift its valuation to $4 billion [32][33]
- The company is in discussions with investment firms including IDG Capital and existing shareholder Tencent, aiming to close the round by year-end and initiate an IPO process the following year [36][37]
Sun Yat-sen University's latest Cell sub-journal paper: AI can help doctors overcome technical barriers, but carries dependency risks
生物世界· 2025-11-27 04:11
Core Insights
- The article discusses interdisciplinary research across fields such as biology, chemistry, and computer science, highlighting its role in advancing digital medicine and healthcare services [2]
- Despite the potential of technologies like artificial intelligence (AI) in biomedicine, widespread application is hindered by technical barriers and physicians' limited expertise [2][5]
- A recent study demonstrated that large language models (LLMs) can help physicians overcome these technical challenges, although concerns about dependency and misinformation remain [3][6]

Study Findings
- A randomized controlled trial involving 64 primary-level ophthalmologists showed that using LLMs such as ChatGPT-3.5 significantly improved project completion rates, from 25% to 87.5% [5][7]
- After a two-week washout period, 41.2% of successful intervention participants were able to complete new projects independently, without LLM support [5][7]
- The study identified potential risks of LLM use, including physicians' tendency to rely on AI-generated information without fully understanding it [5][9]

Implications
- The findings suggest that LLMs can democratize medical AI research by helping physicians navigate design, execution, and reporting challenges [9]
- However, the long-term risks of LLM dependency warrant further investigation to ensure safe and effective use in clinical settings [6][9]
NeurIPS 2025 awards announced: Qwen wins Best Paper, Faster R-CNN wins the Test of Time Award
机器之心· 2025-11-27 03:00
Core Insights
- The NeurIPS 2025 conference presented four Best Paper awards and three Best Paper Runner-up awards, highlighting significant advances across AI research areas [1][4]

Group 1: Best Papers
- Paper 1: "Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)" examines the limited diversity of large-language-model outputs and introduces Infinity-Chat, a dataset of 26,000 diverse user queries for studying model diversity [5][6][9]
- Paper 2: "Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free" reveals the impact of gated attention mechanisms on model performance and stability, demonstrating significant improvements in the Qwen3-Next model [11][16]
- Paper 3: "1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities" shows that scaling network depth to 1024 layers can improve performance on self-supervised reinforcement learning tasks by 2x to 50x [17][18]
- Paper 4: "Why Diffusion Models Don't Memorize: The Role of Implicit Dynamical Regularization in Training" identifies mechanisms that keep diffusion models from memorizing training data, establishing a link between training dynamics and generalization [19][21][22]

Group 2: Best Paper Runner-Up
- Paper 1: "Optimal Mistake Bounds for Transductive Online Learning" resolves a 30-year-old problem in learning theory, establishing optimal mistake bounds for transductive online learning [28][30][31]
- Paper 2: "Superposition Yields Robust Neural Scaling" argues that representation superposition is the primary mechanism governing neural scaling laws, supported by multiple experiments [32][34]
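The output-gating idea in the gated-attention paper summarized above can be illustrated in a few lines (a minimal single-head NumPy sketch; the dimensions, and placing a sigmoid gate on the attention output before the final projection, are my reading of the summary, not the paper's exact formulation):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gated_attention(x, Wq, Wk, Wv, Wg, Wo):
    """Single-head attention with an elementwise sigmoid output gate.

    The gate g = sigmoid(x @ Wg) multiplies the attention output before the
    final projection, adding input-dependent non-linearity and sparsity."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1) @ v          # standard scaled dot-product attention
    gate = 1.0 / (1.0 + np.exp(-(x @ Wg)))       # sigmoid gate, same shape as attn
    return (gate * attn) @ Wo                    # gated output, then projection

rng = np.random.default_rng(0)
T, d = 4, 8                                      # sequence length, model width
x = rng.normal(size=(T, d))
Ws = [rng.normal(scale=0.1, size=(d, d)) for _ in range(5)]
y = gated_attention(x, *Ws)
print(y.shape)                                   # (4, 8)
```

Because the gate can drive individual output channels toward zero per token, it gives the model a way to suppress attention output entirely, which is the intuition behind the "attention-sink-free" behavior described in the summary.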
Group 3: Special Awards
- The Test of Time Award went to "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", recognized for its foundational impact on modern object detection frameworks since its 2015 publication [36][40]
- The Sejnowski-Hinton Prize was awarded for "Random synaptic feedback weights support error backpropagation for deep learning", which contributed significantly to understanding biologically plausible learning rules in neural networks [43][46][50]
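The feedback-alignment mechanism from the Sejnowski-Hinton Prize paper can be sketched on a toy problem (a NumPy illustration; the layer sizes, learning rate, and regression task are my own assumptions, chosen only to show that a fixed random feedback matrix can stand in for the transposed weights):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d_in, d_h, d_out = 128, 10, 32, 1
X = rng.normal(size=(n, d_in))
y = X @ rng.normal(size=(d_in, d_out))           # toy linear regression target

W1 = rng.normal(scale=0.1, size=(d_in, d_h))
W2 = rng.normal(scale=0.1, size=(d_h, d_out))
B = rng.normal(scale=0.1, size=(d_out, d_h))     # fixed random feedback weights

def forward(X):
    h = np.tanh(X @ W1)
    return h, h @ W2

_, out0 = forward(X)
loss0 = np.mean((out0 - y) ** 2)
lr = 0.05
for _ in range(200):
    h, out = forward(X)
    e = out - y                                  # output error
    # Feedback alignment: propagate the error through fixed B, not W2.T
    dh = (e @ B) * (1 - h ** 2)
    W2 -= lr * h.T @ e / n
    W1 -= lr * X.T @ dh / n
_, out1 = forward(X)
loss1 = np.mean((out1 - y) ** 2)
print(loss1 < loss0)                             # loss decreases despite random feedback
```

The biologically interesting point is that the backward pathway never needs access to the forward weights: the forward weights instead come into alignment with the fixed feedback matrix during training.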
Former a16z partner's blockbuster tech report: how AI is eating the world
Wallstreetcn· 2025-11-26 12:08
Core Insights
- Generative AI is driving a major platform shift in the tech industry, comparable to the transitions that have occurred every 10 to 15 years, with the 2022 launch of ChatGPT marking a potential starting point [1][4][5]

Investment Trends
- Major tech companies, including Microsoft, AWS, Google, and Meta, are projected to invest $400 billion in AI infrastructure in 2025, surpassing the global telecom industry's annual investment of approximately $300 billion [4][11]
- This projected 2025 investment has nearly doubled within a year, indicating rapidly escalating capital allocation toward AI [14]

Historical Context of Platform Shifts
- The tech industry has repeatedly undergone platform shifts, such as from mainframes to PCs and from the web to smartphones, transitions that often dethroned early leaders like Microsoft and Apple [5][11]
- Early leaders often disappear during these transitions, as evidenced by Microsoft's operating-system market share dropping from nearly 100% to below 20% by 2025 [5]

Current State of AI Development
- Despite significant investment, the exact form of the generative-AI platform shift remains unclear, with various potential user-interface paradigms still being explored [10]
- Data-center construction in the U.S. is outpacing office-building construction, driven by the new AI investment cycle [17]

Market Dynamics and Competition
- The performance gap among leading large language models is narrowing, suggesting these models may be commoditizing, which could reshuffle where value is captured in the market [23]
- Companies must seek new competitive advantages in computational scale, vertical data, product experience, or distribution channels [26]

User Engagement Challenges
- Despite claims of 800 million weekly active users for ChatGPT, actual engagement is low: only about 10% of U.S. users use AI chatbots daily [27][30]
- The report identifies a significant gap between technological capability and practical application, with many enterprises still slow to deploy AI solutions [33][36]

Transformative Potential in Advertising
- AI is expected to transform advertising and recommendation systems by understanding user intent rather than relying on relevance alone, potentially rewriting the mechanics of the trillion-dollar advertising market [37]

Future Outlook
- The future of AI mixes clarity and ambiguity: it is expected to reshape industries, but the final product forms and value-chain leaders remain uncertain [44]
- The shift toward capital-intensive competition is evident, as companies like Microsoft raise capital expenditure relative to sales, reflecting a fundamental change in competitive dynamics [45]