机器之心
Just now, a mystery model topped the video generation leaderboard. Is it another Chinese model?
机器之心· 2025-11-28 08:05
Core Viewpoint
- The article discusses the emergence of a new AI video model named Whisper Thunder (aka David), which has surpassed existing models on the Artificial Analysis video leaderboard, marking a significant advance in AI video generation technology [1].

Group 1: Model Performance
- Whisper Thunder ranks first on the Artificial Analysis leaderboard with an Elo score of 1,247, ahead of Veo 3 (1,226) and Kling 2.5 Turbo (1,225) [2].
- Generated videos have a fixed duration of 8 seconds and show noticeable motion dynamics [3].
- Users report that the model now appears less frequently in the arena, so encountering it may require multiple refreshes [3].

Group 2: Model Origin and Characteristics
- Users speculate that Whisper Thunder may originate from China, based on its generation effects and aesthetic tendencies [4].
- The model has demonstrated impressive capabilities, although some users noted minor generation flaws, particularly in high-motion scenes [11][13].

Group 3: Example Prompts
- Several prompts illustrate the model's versatility, including scenes of construction, emotional anime performances, and serene landscapes, showcasing its ability to create diverse and engaging visual narratives [5][6][7][8][9][10][12].
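The Elo scores cited above come from pairwise human preference votes between models. As a rough illustration of how such a rating moves (the K-factor and update schedule used by Artificial Analysis are not public, so the constants below are assumptions), a standard Elo update looks like:

```python
def elo_expected(r_a: float, r_b: float) -> float:
    """Expected score of A against B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a, r_b, score_a, k=16.0):
    """One Elo update; score_a is 1.0 for an A win, 0.0 for a loss, 0.5 for a tie."""
    e_a = elo_expected(r_a, r_b)
    return r_a + k * (score_a - e_a), r_b - k * (score_a - e_a)

# Hypothetical single vote: Whisper Thunder (1247) beats Veo 3 (1226).
new_wt, new_veo = elo_update(1247.0, 1226.0, 1.0)
```

Because the two ratings are so close, each single win moves them only a few points; a lead like 1,247 vs 1,226 therefore reflects an accumulated margin over many votes, not a handful of matchups.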
Amazon Research Awards winners announced: 26 Chinese scholars selected, including Jindong Wang (王晋东)
机器之心· 2025-11-28 04:11
Core Insights
- The Amazon Research Awards (ARA) announced 63 recipients, including 26 Chinese scholars, from 41 universities across 8 countries, aimed at funding multidisciplinary research topics [1][2].

AI Information Security
- Eight researchers in AI information security received awards, with three being Chinese scholars [3].
- Zhou Li from the University of California, Irvine, focuses on using LLMs for precise and analyst-friendly attack tracing in audit logs [4].
- Yu Meng from the University of Virginia studies weakly supervised RLHF, modeling ambiguity and uncertainty in human preferences [5].
- Ziming Zhao from Northeastern University specializes in system and software security, network security, and human-centered security research [6].

Amazon Ads
- Both awardees in the Amazon Ads research area are Chinese [8].
- Xiaojing Liao from the University of Illinois Urbana-Champaign investigates attack methods on large language models, focusing on interpretable vulnerability detection and remediation [10][11].
- Tianhao Wang from the University of Virginia works on differential privacy and machine learning privacy, designing practical algorithms [14].

AWS Agentic AI
- Thirty researchers were awarded in the Agentic AI category, including several Chinese scholars [16].
- Cong Chen from Dartmouth College aims to drive the global energy transition through engineering methods based on optimization, economics, and modern machine learning [19].
- Chunyang Chen from the Technical University of Munich focuses on the intersection of software engineering, human-computer interaction, and AI [21].

Trainium Development
- Twenty awardees are involved in research related to Amazon's Trainium AI chips, with several being Chinese researchers [49].
- Kuan Fang from the University of Minnesota works on NetGenius for autonomous configuration and intelligent operation of next-generation wireless networks [50].
- Shizhong Han from the Lieber Institute focuses on revealing the genetic basis of brain diseases and translating genetic discoveries into new treatments [55].

Think Big Initiative
- Three researchers, including one Chinese scholar, were awarded under the Think Big initiative, which supports transformative ideas in scientific research [85].
- Tianlong Chen from the University of North Carolina at Chapel Hill utilizes molecular dynamics to empower protein AI models [88].
Nature | AdaptiveNN: modeling human-like adaptive perception to break machine vision's "impossible triangle"
机器之心· 2025-11-28 04:11
Core Insights
- The article discusses the rapid advances in computer vision and the difficulty of deploying high-precision models in resource-constrained environments, such as robotics and autonomous driving, due to increased computational demands and energy consumption [2][3].
- It highlights the limitations of existing global representation learning paradigms, which process all pixels of an image or video simultaneously, leading to inefficient use of energy and computational resources [3].
- The article introduces the AdaptiveNN architecture, which emulates human-like adaptive vision by modeling visual perception as a sequential decision-making process, enabling efficient and flexible machine visual perception [7][11].

Group 1: Challenges in Current Computer Vision Models
- High-precision models require activation of millions of parameters, resulting in increased power consumption, storage needs, and response delays, making them difficult to deploy in real-world applications [2].
- The global parallel computation paradigm creates a significant energy-efficiency bottleneck: computational complexity grows with input size, making it hard to balance high-resolution input, performance, and efficient inference [3].

Group 2: Insights from the Human Visual System
- Human vision operates through selective sampling of key areas rather than processing all visual information at once, which significantly reduces computational overhead and allows efficient functioning even in resource-limited scenarios [5].
- The concept of "active observation" proposed by the researchers emphasizes the need for AI systems to adopt a human-like, task-driven approach to visual perception [5].

Group 3: Introduction of AdaptiveNN
- The AdaptiveNN architecture models visual perception as a multi-step sequential decision process, allowing the model to focus on specific areas of interest and accumulate information progressively [11].
- The architecture combines representation learning with self-rewarding reinforcement learning, enabling the model to optimize its attention and decision-making without additional supervision [15][16].

Group 4: Performance and Efficiency of AdaptiveNN
- In extensive experiments, AdaptiveNN achieved up to a 28-fold reduction in inference cost while maintaining accuracy comparable to traditional static models, demonstrating its potential for efficient visual perception [7][22].
- The model's attention mechanism automatically focuses on discriminative regions, enhancing interpretability and aligning closely with human visual behavior [22][26].

Group 5: Broader Implications and Future Research
- The findings from AdaptiveNN provide insights for cognitive science, particularly in understanding human visual behavior and the mechanisms behind visual decision-making [25].
- The architecture's application in embodied intelligence models shows significant improvements in reasoning and perception efficiency, suggesting a promising direction for future research in AI and cognitive science [29].
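The sequential, fixation-style inference described above can be sketched as a simple glimpse loop. Everything in this sketch (the crop size, the stop threshold, the toy evidence accumulator) is an illustrative assumption, not the paper's actual implementation; it only shows how early stopping on confident predictions saves compute relative to processing the whole input:

```python
import numpy as np

def adaptive_glimpse_inference(image, classify, propose_fixation,
                               max_steps=8, confidence_threshold=0.9):
    """Sequential 'fixation' inference: crop a small region per step, update a
    running belief, and stop early once the prediction is confident enough."""
    state = None                                           # evidence accumulated so far
    for step in range(max_steps):
        y, x, size = propose_fixation(image, state)        # where to look next
        glimpse = image[y:y + size, x:x + size]            # process only a crop
        probs, state = classify(glimpse, state)            # update belief
        if probs.max() >= confidence_threshold:            # early exit saves compute
            return int(probs.argmax()), step + 1
    return int(probs.argmax()), max_steps

# Toy demo: a 32x32 "image" whose bright top-left patch signals class 1.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
image[:8, :8] = 1.0

def propose_fixation(img, state):
    step = 0 if state is None else state["step"]
    return (step * 8) % 32, 0, 8                   # scan 8x8 crops down the left edge

def classify(glimpse, state):
    if state is None:
        state = {"step": 0, "evidence": 0.0}
    state["step"] += 1
    state["evidence"] += glimpse.mean() - 0.5      # bright crops vote for class 1
    p1 = 1.0 / (1.0 + np.exp(-4.0 * state["evidence"]))
    return np.array([1.0 - p1, p1]), state

pred, steps_used = adaptive_glimpse_inference(image, classify, propose_fixation,
                                              confidence_threshold=0.85)
```

Here the loop stops after a single 8x8 glimpse because the first fixation already lands on the discriminative patch, which is the intuition behind the reported inference-cost savings: easy inputs terminate early, hard ones get more fixations.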
Huawei unveils its big move for near-trillion-parameter MoE inference, directly open-sourcing two killer optimization techniques
机器之心· 2025-11-28 04:11
机器之心 report. Editor: 杜伟

2025 is drawing to a close. Over the past year, large models have accelerated their shift from single-point efficiency tools to the underlying infrastructure supporting business systems. In this process, inference efficiency determines whether large models can truly land in production. For ultra-large-scale MoE models, the complex inference pipeline brings challenges in compute, communication, and memory access, and the industry urgently needs an efficient and controllable inference path.

Huawei has laid out a complete technology stack for near-trillion-parameter MoE inference: openPangu-Ultra-MoE-718B-V1.1 demonstrates the model potential of the MoE architecture, while Ascend-affinity acceleration techniques, including the Omni Proxy scheduling feature and the AMLA technique that pushes Ascend hardware compute utilization to 86%, make production-grade deployment of ultra-large-scale MoE models realistically feasible. Open-source implementation: https://gitcode.com/ascend-tribe/ascend-inference-cluster#

If the focus of large-model competition in recent years was on training scale and capability breakthroughs, inference efficiency is now rapidly becoming the key variable in whether a model can be deployed.

Model GitCode address: https://ai.gitcode.com/ascend-tribe/openPangu-Ultra-MoE-718B-V1.1-Int8

In terms of task characteristics, ...
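For context on why inference for such models is hard: an MoE layer routes each token to only a few of its many expert networks, so per-token compute stays small while total parameters are huge, but the routing produces the irregular communication and memory-access patterns mentioned above. The following is a minimal top-k routing sketch, purely illustrative and unrelated to Huawei's actual stack; all names and shapes are assumptions:

```python
import numpy as np

def moe_forward(tokens, experts, gate_weights, top_k=2):
    """Minimal MoE layer: route each token to its top-k experts by gate score
    and mix their outputs with softmax weights renormalized over the chosen k.
    tokens: (n, d); gate_weights: (d, n_experts); experts: list of callables."""
    logits = tokens @ gate_weights                     # (n, n_experts) gate scores
    top = np.argsort(logits, axis=1)[:, -top_k:]       # top-k expert ids per token
    out = np.zeros_like(tokens)
    for i, tok in enumerate(tokens):
        chosen = logits[i, top[i]]
        probs = np.exp(chosen - chosen.max())
        probs /= probs.sum()                           # softmax over the chosen experts
        for p, e in zip(probs, top[i]):
            out[i] += p * experts[e](tok)              # only k experts run per token
    return out

# Degenerate check: a single identity "expert" must reproduce the input exactly.
tokens = np.arange(6, dtype=float).reshape(2, 3)
out = moe_forward(tokens, [lambda x: x], np.ones((3, 1)), top_k=1)
```

At the 718B scale, experts are sharded across many accelerators, so each routed token implies cross-device traffic; that is why scheduling (Omni Proxy) and kernel-level compute utilization (AMLA) dominate the engineering effort rather than the arithmetic itself.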
Academia explodes! ICLR reviewers doxxed: turns out the low scores came from friends
机器之心· 2025-11-28 00:51
机器之心 report

Last night, who knows how many people stayed up until dawn.

On the evening of November 27, Beijing time, China's AI community erupted. On OpenReview, the platform most widely used for academic paper reviewing, a front-end bug exposed database contents, turning what was supposed to be double-blind review into an open hand.

The leak was simple to exploit in the extreme: just type a certain URL into your browser and substitute the paper ID and reviewer number you want to look up, and you can find the identity of any corresponding reviewer. You can learn who reviewed your paper and what score he or she gave it.

Because there was no barrier to entry, once the method spread, everyone instantly switched into investigation mode; after all, who doesn't have some friction with reviewers these days, and at last every grievance and grudge could be settled. This instantly produced countless surprises, shocks, bursts of anger, and wails of despair. In WeChat groups and on Xiaohongshu, victims everywhere were telling their stories, some exposing others and some being exposed. You can never guess who gave your paper the low score.

Reviewers' reasons for low scores varied: some had failed to understand the authors' intent, some acted on personal grudges (for example, labmates in the same group scoring each other down), and, more despicable still, some gave low scores to "clear the way" for papers in the same track that they themselves were writing. One person used the leak to confirm the reviewer behind a score of 1 on their paper: that reviewer submitted another paper five months later, yet refused to cite the author's submission.

The truly open ...
The "preference" dilemma of large models as judges: UDA achieves unsupervised debiased alignment
机器之心· 2025-11-28 00:51
Core Insights
- The article discusses the issue of preference bias in large language models (LLMs) acting as judges, highlighting that even advanced models like GPT-4o and DeepSeek-V3 exhibit systematic favoritism towards their own outputs, leading to significant discrepancies in scoring and ranking [2][4][5].
- The introduction of Unsupervised Debiasing Alignment (UDA) offers a new approach to address this bias by allowing models to autonomously adjust scoring rules through unsupervised learning, thus achieving debiased alignment [2][7].

Summary by Sections

Problem Statement
- Current LLM judging systems, such as Chatbot Arena, face three main challenges: self-preference solidification, heterogeneity bias, and static scoring defects [4][5].
- Self-preference solidification leads to models overestimating their own answers, creating a scenario where "whoever judges wins" [4].
- Heterogeneity bias results in varying directions and intensities of bias among different models, ranging from aggressive self-promotion to excessive humility [4].

UDA Contribution
- UDA transforms the debiasing problem into a sequence learning problem that can be optimized through dynamic calibration, allowing judges to explore optimal scoring strategies autonomously [7][25].
- The method utilizes a consensus-driven training approach, treating the collective agreement of judges as a practical optimization target, which helps reduce overall bias [13][18].

Methodology
- UDA models pairwise evaluations as an instance-level adaptive process, dynamically generating adjustment parameters for each judge model during comparisons [10][11].
- The system extracts multiple features from each comparison, including semantic feature vectors and self-perception features, which are crucial for detecting bias tendencies [11][20].

Experimental Results
- UDA significantly reduces inter-judge variance, lowering the average standard deviation from 158.5 to 64.8, demonstrating its effectiveness in suppressing extreme biases [23].
- The average Pearson correlation with human evaluations improved from 0.651 to 0.812, indicating enhanced alignment with human judgment [23].
- UDA shows robust zero-shot transfer capabilities, achieving a 63.4% variance reduction on unseen datasets, showcasing its domain-agnostic debiasing ability [23].

Conclusion
- UDA represents a shift in how judgment calibration is approached, moving away from prompt engineering toward a learnable problem, enhancing the robustness and reproducibility of evaluations while aligning more closely with human judgment [25].
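The "align judges to a consensus without human labels" idea can be illustrated with a far simpler stand-in than UDA itself: fit each judge a per-judge affine correction (scale and offset) by least squares against the panel mean. UDA's real mechanism generates instance-level parameters from learned features; this toy keeps only the consensus-driven objective, and every name and constant here is an assumption:

```python
import numpy as np

def fit_affine_calibration(scores):
    """scores: (n_judges, n_items) raw scores. For each judge j, fit (a_j, b_j)
    by least squares so a_j * scores[j] + b_j tracks the panel consensus
    (the mean over judges), a label-free stand-in for UDA's consensus target."""
    n_judges, n_items = scores.shape
    consensus = scores.mean(axis=0)                # unsupervised target, no human labels
    a, b = np.empty(n_judges), np.empty(n_judges)
    for j in range(n_judges):
        design = np.vstack([scores[j], np.ones(n_items)]).T
        (a[j], b[j]), *_ = np.linalg.lstsq(design, consensus, rcond=None)
    return a, b

# Toy panel: judge 0 systematically inflates, judge 1 systematically deflates.
truth = np.linspace(1, 10, 20)
scores = np.vstack([truth + 2.0, truth - 2.0, truth])
a, b = fit_affine_calibration(scores)
calibrated = a[:, None] * scores + b[:, None]
spread_before = scores.std(axis=0).mean()      # inter-judge disagreement, raw
spread_after = calibrated.std(axis=0).mean()   # disagreement after calibration
```

In this toy, calibration removes the constant offsets entirely, so inter-judge spread collapses; the reported drop in average standard deviation (158.5 to 64.8) is the same effect achieved with a much richer, learned correction.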
DeepSeek returns in force, open-sourcing an IMO gold-medal-level math model
机器之心· 2025-11-27 12:13
Core Insights
- DeepSeek has released a new mathematical reasoning model, DeepSeek-Math-V2, which surpasses its predecessor, DeepSeek-Math-7b, in performance, achieving gold-medal levels in mathematical competitions [5][21].
- The model addresses limitations in current AI mathematical reasoning by focusing on self-verification and rigorous proof processes rather than merely achieving correct final answers [7][25].

Model Development
- DeepSeek-Math-V2 is based on the DeepSeek-V3.2-Exp-Base architecture and has shown improved performance compared to Gemini DeepThink [5].
- The previous version, DeepSeek-Math-7b, utilized 7 billion parameters and achieved performance comparable to GPT-4 and Gemini-Ultra [3].

Research Limitations
- Current AI models often prioritize the accuracy of final answers, which does not ensure the correctness of the reasoning process [7].
- Many mathematical tasks require detailed step-by-step deductions, making a focus on final answers inadequate [7].

Self-Verification Mechanism
- DeepSeek emphasizes the need for comprehensive and rigorous verification of mathematical reasoning [8].
- The model introduces a proof verification system that allows it to self-check and acknowledge its mistakes, enhancing its reliability [11][17].

System Design
- The system consists of three roles: a proof verifier (teacher), a meta-verifier (supervisor), and a proof generator (student) [12][14][17].
- The proof verifier evaluates the reasoning process, while the meta-verifier checks the validity of the verifier's feedback, improving overall assessment accuracy [14].

Innovative Training Approach
- The proof generator is trained to self-evaluate its solutions, promoting deeper reflection and correction of errors before finalizing answers [18].
- An honest reward mechanism encourages the model to admit mistakes, fostering a culture of self-improvement [18][23].

Automation and Evolution
- DeepSeek has developed an automated process that allows the system to evolve independently, enhancing both the proof generator and verifier over time [20].
- The model's approach shifts from a results-oriented to a process-oriented methodology, focusing on rigorous proof examination [20].

Performance Metrics
- DeepSeek-Math-V2 achieved impressive results in competitions, scoring 83.3% in IMO 2025 and 98.3% in Putnam 2024 [21][22].
- The model demonstrated near-perfect performance on the Basic benchmark of IMO-ProofBench, achieving close to 99% accuracy [22].

Future Directions
- DeepSeek acknowledges that while significant progress has been made, further work is needed to enhance the self-verification framework for mathematical reasoning [25].
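The three-role interaction described above (the generator proposes a proof, the verifier critiques it, the meta-verifier checks whether the critique itself is valid, and accepted critiques feed the next revision) can be sketched as a control loop. The role functions here are placeholders standing in for the paper's trained models, not an implementation of them:

```python
def prove_with_self_verification(problem, generate, verify, meta_verify,
                                 max_rounds=4):
    """Sketch of a generator/verifier/meta-verifier loop: generate a proof,
    have the verifier critique it, accept the critique only if the
    meta-verifier deems it valid, and revise against accepted critiques."""
    feedback = None
    for round_ in range(1, max_rounds + 1):
        proof = generate(problem, feedback)
        ok, critique = verify(problem, proof)
        if ok:
            return proof, round_                    # verifier found no flaw
        if meta_verify(problem, proof, critique):   # is the critique itself valid?
            feedback = critique                     # revise against it next round
        # otherwise discard the bogus critique and keep the previous feedback
    return proof, max_rounds

# Toy roles: the first draft misses a case; the critique fixes the second draft.
def generate(problem, feedback):
    return "full proof" if feedback else "draft proof"

def verify(problem, proof):
    return (True, "") if proof == "full proof" else (False, "missing case")

def meta_verify(problem, proof, critique):
    return critique == "missing case"

proof, rounds = prove_with_self_verification("toy problem", generate, verify,
                                             meta_verify, max_rounds=4)
```

The meta-verifier gate is what makes the honest-reward idea workable: if bad critiques were fed back unchecked, the generator would be trained to "fix" flaws that do not exist.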
Generative AI empowering requirements engineering: a transformation underway
机器之心· 2025-11-27 12:13
Core Insights
- The article presents a systematic literature review on the application of Generative AI (GenAI) in Requirements Engineering (RE), highlighting its transformative potential and the challenges that need to be addressed for effective industrial adoption [4][51].

Research Growth
- Research on GenAI in the RE field has shown exponential growth, with the number of relevant papers increasing from 4 in 2022 to 23 in 2023, and projected to reach 113 in 2024 [10][8].
- A total of 238 papers were reviewed, indicating strong academic interest following the release of ChatGPT [8][10].

Research Focus Imbalance
- The focus of research is heavily skewed towards certain phases of RE, with 30% dedicated to requirements analysis and only 6.8% to requirements management, indicating a lack of attention to complex socio-technical factors [11][9].
- GenAI is currently in a "rapid expansion but immature" phase, with a significant increase in quantity but insufficient depth of research [14].

Technical Landscape
- A significant reliance on the GPT model family is observed, with 67.3% of studies using it, which limits exploration of diverse technological paths [16].
- GPT-4 is primarily used for complex requirement analysis, while open-source alternatives like CodeLlama are underutilized despite their lower hallucination rates [17][16].

Challenges Identified
- The research identifies three core challenges: reproducibility (66.8%), hallucination (63.4%), and interpretability (57.1%), which are interrelated and must be addressed collectively [30][31].
- The lack of reproducibility is particularly problematic due to the stochastic nature of large language models (LLMs) and their opaque APIs [30].

Evaluation Practices
- There is a notable lack of standardized evaluation metrics in the RE field, with only 23.9% of studies releasing tools and 45.8% using non-public datasets [35][37].
- Traditional NLP metrics dominate the evaluation methods, failing to capture the complexity of RE tasks [33].

Industrial Adoption
- The industrial adoption of GenAI in RE is lagging, with 90.3% of studies remaining at the conceptual or prototype stage and only 1.3% achieving production-level integration [39][41].
- The value of GenAI in industry is seen in accelerating requirement documentation and reducing communication costs, but companies are hesitant due to compliance and risk-control concerns [43].

Future Roadmap
- A four-phase strategy is proposed for advancing GenAI in RE: strengthening evaluation infrastructure, governance-aware development, scalable context-aware deployment, and industrial-level standardization [46].
- Key areas for improvement include generalization capabilities, data quality, and evaluation methods [45].

Recommendations for Researchers and Practitioners
- Researchers are encouraged to explore diverse models beyond GPT, develop hybrid architectures specific to RE, and focus on reproducibility [53].
- Practitioners should use GenAI as an auxiliary tool rather than a decision-maker, especially in low-risk tasks [53].
Focus on young AI talent | Registration opens for the AI Industry Talent Forum at the 2025 Pudong International Talent Port Forum
机器之心· 2025-11-27 10:23
Core Viewpoint
- Artificial intelligence (AI) is recognized as a core driver of technological revolution and industrial transformation, becoming a key component of national competitiveness. The emphasis is on the role of young talents in AI, who are seen as crucial for solving technical challenges and promoting the integration of AI across various industries [2][4].

Event Overview
- The "Artificial Intelligence Industry Talent Forum" will be held on December 6, focusing on the theme "Youth Empowerment, Intelligence Gathering in Pudong." The forum will bring together university professors, young scientists, entrepreneurial pioneers, and industry leaders to discuss topics such as AI talent cultivation, the evolution of embodied intelligence, and youth entrepreneurship [2][4].
- Pudong is positioned as the first national pilot zone for AI innovation applications, with the Zhangjiang AI Innovation Town at its core, aiming to build a global AI talent hub through policy and ecological construction [4].

Agenda Highlights
- Opening remarks by the host [5]
- Introduction of the Zhangjiang AI Innovation Town [5]
- Signing ceremony for youth talent entrepreneurial enterprises [5]
- Keynote speech on cultivating top talents in the AI era by Wang Yanfeng, Executive Dean of the AI Institute at Shanghai Jiao Tong University [5]
- Panel discussion on the evolution of embodied intelligence and ecological construction [5][6]
- Dialogue among young talents on bridging theoretical advancements with practical applications [6]
- Release of the AI industry talent development trend report by the Shanghai AI Industry Association and Shanghai Pudong Talent Development Co., Ltd. [6]

Guest Profiles
- Wang Yanfeng, a prominent figure in AI research and education, focusing on the intersection of AI with media and healthcare [11].
- Tan Yinliang, a professor with extensive experience in AI and the digital economy [12].
- Su Yang, co-founder and chief AI architect at Lingxin Qiaoshou, specializing in cutting-edge AI technology [13].
- Wang Hongtao, founder and CEO of Jingzhi Technology, with expertise in robotics and AI [14].
- Li Guanghui, founder and CEO of BraneMatrix AI, with a strong background in the internet and security industries [17].
- Chen Yuanpei, a young AI talent known for his work in reinforcement learning [18].
- Liu Bang, an associate professor focusing on natural language processing and embodied learning [19].
无问芯穹 (Wuwen Xinqiong) completes nearly 500 million RMB A+ round, doubling down on Agentic Infra construction to lead the intelligent-agent industry transformation
机器之心· 2025-11-27 10:23
Core Viewpoint
- The article highlights the recent completion of a nearly 500 million RMB A+ financing round by Wuwen Xinqiong (无问芯穹), emphasizing strong market confidence in its AI infrastructure capabilities and its alignment with national strategic initiatives [1][12].

Group 1: Financing and Investment
- Wuwen Xinqiong has successfully raised nearly 500 million RMB in its A+ round, led by Zhuhai Technology Group and Futen Capital, with participation from various other investors [1].
- The round combines state-owned and market-driven investment, reflecting recognition of Wuwen Xinqiong's commitment to technological innovation and its role in the AI industry [1][12].

Group 2: Company Overview and Offerings
- Founded two and a half years ago, Wuwen Xinqiong focuses on developing high-performance AI infrastructure, optimizing both software and hardware to address computational bottlenecks in AI applications [3][6].
- The company has created the "Wuqiong AI Cloud" and "Wuyin Terminal Intelligent Solutions," serving numerous leading AI enterprises and research institutions [3][9].

Group 3: Future Directions and Strategic Focus
- The new funding will be allocated towards enhancing the company's technological advantages, expanding its AI cloud products and terminal solutions, and increasing investment in intelligent infrastructure development [5][6].
- Wuwen Xinqiong aims to build a first-class intelligent service platform and supporting infrastructure to enable the widespread application of intelligent agents in both the digital and physical worlds [5][6].

Group 4: Technological Innovations
- The company has developed a comprehensive "Intelligent Agent Infrastructure" that integrates AI cloud and terminal intelligence, enabling significant advances in the application of intelligent agents [7][9].
- Recent product launches include Infra Agents for cloud infrastructure and Kernel Mind for terminal reasoning optimization, aimed at making intelligent agents a fundamental resource across industries [11][12].

Group 5: Market Position and Vision
- Wuwen Xinqiong is positioned as a leader in AI infrastructure, with a strong emphasis on integrating the digital and physical worlds through intelligent agents [7][12].
- The company's strategic focus on building a robust ecosystem for intelligent agents aligns with national goals for a more autonomous AI industry [13].