Beyond belief: 21% of ICLR 2026 review comments were AI-generated? The organizers respond
机器之心· 2025-11-17 03:19
Core Insights
- The article discusses the significant presence of AI-generated content in the review process for ICLR 2026, highlighting a trend where a substantial portion of review comments are created by AI [2][11].

Group 1: AI Usage in Paper Reviews
- A systematic analysis of 75,800 review comments revealed that 21% were fully generated by AI, while 4% were heavily edited by AI, 9% moderately edited, and 22% lightly edited, with only 43% being fully human-written [2][11].
- AI-generated reviews tend to be longer by 26% and score higher on average, with fully AI-generated reviews averaging a score of 4.43 compared to 4.13 for fully human-written reviews [11].
- The average confidence level for fully AI-generated reviews is slightly higher, indicating a tendency to provide more confident evaluations [12].

Group 2: Implications and Responses
- The ICLR 2026 organizing committee acknowledged the issue of low-quality reviews generated by AI and is considering appropriate measures, including marking and reporting such reviews [18].
- Suggestions for handling AI-generated reviews include removing poor evaluations and designating the reviewers as having failed their responsibilities, which could lead to automatic rejection of their submissions [18].
- Pangram Labs' analysis indicates that 39% of submitted papers utilized AI in some capacity, with a correlation between higher AI usage and lower average scores [8].
Is arXiv starting to reject survey papers? This NeurIPS paper already discussed the "paper DDoS" problem
机器之心· 2025-11-17 03:19
Core Viewpoint
- The article discusses a significant update from arXiv, requiring all review and position papers in the computer science category to undergo peer review before submission, primarily due to the overwhelming influx of AI-generated content [2][8].

Group 1: The Crisis of AI-Generated Papers
- The term "Survey Paper DDoS attack" is introduced to describe the overwhelming number of low-quality AI-generated survey papers flooding the academic community [5][20].
- The increase in AI-generated content has led to a situation where valuable insights are obscured, akin to a denial-of-service attack, making it difficult for researchers to access meaningful academic contributions [7][21].

Group 2: Quantitative Evidence of the Surge
- A study analyzed 10,063 survey papers from arXiv between 2020 and 2024, revealing a significant spike in submissions post-2022, coinciding with the rise of generative AI tools like ChatGPT [10][12].
- The average AI-generated score has more than doubled, indicating that AI is a primary driver of this growth [13].
- There has been a notable increase in suspicious publishing behavior, with authors publishing multiple papers in a short time frame, suggesting AI-assisted bulk production [14].

Group 3: Detrimental Effects on Academic Integrity
- AI-generated survey papers are not merely noise; they pose a serious threat to the academic ecosystem by introducing low-quality, redundant content [16][19].
- Traditional expert-written surveys provide critical insights, whereas AI-generated ones often lack structure and original classification, and can contain inaccuracies [17][18].
- The phenomenon of "literature poisoning" occurs when new researchers rely on flawed AI-generated surveys, potentially embedding incorrect academic foundations [19].

Group 4: Proposed Solutions
- The article suggests that arXiv's new regulations are a necessary but reactive measure against the crisis [23][25].
- The authors propose a shift towards "Dynamic Live Surveys" (DLS), which would create a community-maintained online knowledge base, allowing for real-time updates and reducing redundancy [24].
- Recommendations include stricter review processes, transparency in AI usage, and incentivizing high-quality reviews to combat the influx of low-quality submissions [26].
NeurIPS 2025 has plenty worth watching! See you in Beijing on November 22
机器之心· 2025-11-16 07:30
Core Insights
- By 2025, AI development is shifting from "capability breakthroughs" to "system construction," focusing on reliability, interpretability, and sustainability [2]
- NeurIPS 2025 will be held from December 2 to 7 in San Diego, USA, with a record 21,575 submissions and an acceptance rate of 24.52%, indicating a growing global AI academic ecosystem [2]
- The event aims to serve the Chinese AI community through various activities, including keynote speeches, paper sharing, roundtable discussions, and poster sessions [3]

Event Details
- The "NeurIPS 2025 Paper Sharing Conference" will take place on November 22, 2025, from 09:00 to 17:30 at the Crowne Plaza Hotel in Zhongguancun, Beijing [5][6]
- The agenda includes keynote speeches, paper presentations, and poster exchanges, providing a platform for academic and industry collaboration [3][10]

Keynote Speakers
- The morning keynote will be delivered by Professor Qiu Xipeng from Fudan University, focusing on "Contextual Intelligence: Completing the Key Puzzle of AGI" [14][16]
- The afternoon keynote speaker is Fan Qi from Nanjing University, with the topic yet to be determined [17]

Paper Presentations
- A variety of papers will be presented, covering topics such as data mixing, multimodal adaptation, and reinforcement learning in large language models [9][11][23]
- Notable presentations include "Data Mixing Can Induce Phase Transitions in Knowledge Acquisition" and "Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model" [9][11]
Lumina-DiMOO: a multimodal diffusion language model reshaping image generation and understanding
机器之心· 2025-11-16 04:01
Core Viewpoint
- Lumina-DiMOO is an innovative multimodal generative language model that utilizes discrete diffusion modeling to bridge the gap between various multimodal tasks, enabling seamless integration of text-to-image, image-to-image, and image-to-text capabilities [2][11].

Group 1: Historical Context
- Traditional autoregressive models, such as Chameleon and Janus-Pro, face significant limitations including slow generation speed, constrained quality in high-resolution image generation, and a lack of seamless task integration [7].

Group 2: Current Innovations
- Lumina-DiMOO employs a pure discrete diffusion framework, addressing the limitations of previous models by enhancing generation speed and quality through parallelized bidirectional attention mechanisms and flexible sampling strategies [9][11].

Group 3: Key Features
- **Discrete Diffusion Architecture**: This architecture allows image generation and understanding tasks to operate efficiently within a single framework, breaking down the traditional boundary between generation and understanding [12].
- **Efficient Generation**: By decoding multiple tokens in parallel, Lumina-DiMOO accelerates inference and improves quality, supporting effective collaboration between tasks [15].
- **Bidirectional Attention Mechanism**: This feature enhances the model's ability to understand contextual relationships in text and capture structural details in images, ensuring high consistency across multimodal tasks [17].
- **Joint Optimization**: The model uses a global optimization strategy during training, improving performance across tasks and enabling seamless transitions between them [18].
- **Max-Logit Caching**: This technique caches tokens whose predictions are already stable, cutting redundant computation and boosting generation efficiency while maintaining high-quality outputs, especially in high-resolution tasks (a minimal sketch follows this summary) [20].

Group 4: Advanced Learning Framework
- **Self-GRPO Framework**: This self-reinforcement framework integrates image generation and multimodal understanding into a single reinforcement learning trajectory, allowing the model to learn from its own outputs and improve iteratively [22][23].

Group 5: Performance and Recognition
- Lumina-DiMOO has achieved top rankings in several authoritative evaluations, demonstrating its superiority in semantic consistency, layout understanding, and reasoning capabilities compared to leading models like GPT-4o and Janus-Pro [29].
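To make the parallel-decoding and caching ideas above concrete, here is a minimal Python sketch of confidence-based parallel unmasking in a discrete diffusion decoder, with a simple "max-logit" cache that freezes high-confidence positions. It is an illustrative assumption of how such a loop can look, not the Lumina-DiMOO implementation; `model`, `MASK_ID`, and the threshold are placeholders.

```python
import torch

MASK_ID = 0             # hypothetical id of the [MASK] token
CACHE_THRESHOLD = 8.0   # hypothetical max-logit above which a position is frozen

@torch.no_grad()
def diffusion_decode(model, seq_len, num_steps=16):
    tokens = torch.full((1, seq_len), MASK_ID)           # start fully masked
    frozen = torch.zeros(1, seq_len, dtype=torch.bool)   # "max-logit" cache of stable positions

    for step in range(num_steps):
        logits = model(tokens)                   # (1, seq_len, vocab)
        max_logit, pred = logits.max(dim=-1)     # per-position confidence and argmax token

        # Unmask a batch of the most confident still-masked positions in parallel.
        still_masked = (tokens == MASK_ID) & ~frozen
        remaining = int(still_masked.sum())
        if remaining == 0:
            break
        k = max(1, remaining // (num_steps - step))
        conf = torch.where(still_masked, max_logit,
                           torch.full_like(max_logit, float("-inf")))
        idx = conf.topk(k, dim=-1).indices
        tokens.scatter_(1, idx, pred.gather(1, idx))

        # Freeze very confident predictions so later steps skip re-deciding them.
        frozen |= (max_logit > CACHE_THRESHOLD) & (tokens != MASK_ID)
    return tokens

# Toy usage with random logits, just to exercise the loop:
toy_model = lambda t: torch.randn(t.shape[0], t.shape[1], 32)
print(diffusion_decode(toy_model, seq_len=24))
```

The point of the sketch is the shape of the loop: many positions are committed per step (parallel decoding), and positions the model is already sure about are excluded from further work (caching).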
First release | Chen Tianqiao's Shanda team launches EverMemOS, a top open-source memory system
机器之心· 2025-11-16 04:01
Core Viewpoint
- EverMind has launched EverMemOS, a world-class long-term memory operating system for AI agents, aiming to provide AI with a persistent, coherent, and evolving "soul" [1][4].

Group 1: Memory Capability
- The limitation of fixed context windows in LLMs leads to frequent "amnesia" in AI during long-term tasks, resulting in memory breaks and factual inconsistencies, which diminishes the application value of AI [4].
- Long-term memory is becoming a core competitive advantage in AI applications, marking a shift from AI as a "tool" to an "agent" capable of proactive evolution [5].

Group 2: Industry Trends
- Major industry players like Claude and ChatGPT have introduced long-term memory as a strategic feature, indicating a clear industry trend towards memory as a critical capability for future AI applications [5].
- Current attempts to address memory issues, such as RAG and emerging memory systems, are often fragmented, highlighting the need for a comprehensive, usable memory system that can support various scenarios [5].

Group 3: Inspiration and Design
- EverMind's design of EverMemOS is inspired by human brain memory mechanisms, aiming to allow AI to think, remember, and grow like humans [7][14].
- The system's architecture is based on a four-layer design that parallels key functions of the human brain, enhancing its memory capabilities [19][22].

Group 4: Technical Performance
- EverMemOS has achieved significant breakthroughs in both scenario coverage and technical performance, being the first memory system to support both one-on-one conversations and complex multi-user collaboration [15].
- The system scored 92.3% and 82% on the LoCoMo and LongMemEval-S benchmarks, respectively, surpassing previous state-of-the-art levels [17].

Group 5: System Features
- EverMemOS is not just a memory "database" but a "memory processor," addressing the core pain point of existing methods that only retrieve information without utilizing it effectively [23].
- The system features innovative "layered memory extraction" and dynamic organization, allowing for structured memory that captures implicit context (see the sketch after this summary) [23][26].
- It introduces the first scalable modular memory framework, adaptable to various memory needs across different scenarios, ensuring optimal memory organization and application strategies [26].

Group 6: Open Source and Future Plans
- EverMind has released an open-source version of EverMemOS on GitHub for developers and AI teams to deploy and test [28].
- A cloud service version is expected to be launched later this year, providing enhanced technical support and data persistence for enterprise users [28].
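As a rough illustration of what "layered memory extraction" can mean in practice, the toy sketch below keeps raw episodic entries, promotes recurring ones to a distilled semantic layer, and retrieves across both. The class and method names are assumptions made for illustration; they are not the EverMemOS API, and a real system would use embeddings and ranking rather than keyword overlap.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryItem:
    text: str
    layer: str        # "episodic" (raw events) or "semantic" (distilled facts)
    score: float = 0.0

@dataclass
class LayeredMemory:
    items: list = field(default_factory=list)

    def write(self, text: str, layer: str = "episodic"):
        self.items.append(MemoryItem(text, layer))

    def distill(self):
        """Toy 'layered extraction': promote recurring episodic entries to semantic facts."""
        counts = {}
        for it in self.items:
            if it.layer == "episodic":
                counts[it.text] = counts.get(it.text, 0) + 1
        for text, count in counts.items():
            if count >= 2:
                self.write(f"fact: {text}", layer="semantic")

    def retrieve(self, query: str, top_k: int = 3):
        """Naive keyword-overlap retrieval, standing in for embedding search."""
        words = set(query.lower().split())
        for it in self.items:
            it.score = len(words & set(it.text.lower().split()))
        return sorted(self.items, key=lambda it: it.score, reverse=True)[:top_k]

mem = LayeredMemory()
mem.write("user prefers concise answers")
mem.write("user prefers concise answers")
mem.distill()
print([it.text for it in mem.retrieve("what style does the user prefer")])
```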
WithAnyone goes open source: possibly the most natural AI group-photo model you have seen
机器之心· 2025-11-16 04:01
Core Viewpoint
- Fudan University, in collaboration with Jieyue Xingchen, has launched a new AI photo generation model called WithAnyone, which allows users to generate natural, seamless AI photos by simply uploading a picture [2][4].

Group 1: WithAnyone Overview
- WithAnyone is a personalized AI photo generation method that can create various angles and expressions of a person from a single photo, or generate a group photo with multiple individuals without any sense of incongruity [4].
- Previous models like InstantID and PuLID faced limitations in generating varied expressions and angles, often resulting in a "copy-paste" effect [5].

Group 2: Breakthrough Features
- WithAnyone breaks the "copy-paste" curse by achieving both ID consistency and controllability [13].
- The model's effectiveness is demonstrated through impressive group photos, showcasing its ability to harmoniously combine multiple individuals in a single image [22].

Group 3: Problem Identification and Solution
- The research team identified that existing AI portrait generation methods often produced images too similar to the reference, leading to a lack of diversity in generated outputs [26].
- To quantify this issue, the team introduced MultiID-Bench and a "copy-paste" metric that measures how close generated results are to the reference inputs (see the sketch after this summary) [27][29].

Group 4: Data and Training Innovations
- The team collected a dataset of 500,000 group photos, each with hundreds of different angles and expressions, along with an additional million unpaired photos for training [31].
- The training process involved traditional reconstruction training, followed by paired-data training and fine-tuning on high-quality data, to produce the WithAnyone model [34].

Group 5: Open Source and Community Engagement
- WithAnyone has been fully open-sourced, providing access to code, model weights, sample datasets, and evaluation benchmarks to facilitate community replication and extension [36].
- The project aims to enhance the emotional and narrative quality of AI-generated photos, encouraging users to create personalized images with the technology [36].
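The "copy-paste" idea can be pictured with face embeddings: a generation that is much more similar to the single reference image than to other photos of the same person is likely copying pose and expression verbatim rather than preserving identity. The sketch below is an assumed formulation for illustration, not the exact metric defined in MultiID-Bench, and the random vectors stand in for a real face-embedding model.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def copy_paste_score(gen_emb, ref_emb, other_embs):
    """Higher value => the generation mimics the reference image itself
    rather than the identity in general."""
    sim_ref = cosine(gen_emb, ref_emb)
    sim_identity = float(np.mean([cosine(gen_emb, e) for e in other_embs]))
    return sim_ref - sim_identity

rng = np.random.default_rng(0)
ref = rng.normal(size=512)                                            # reference photo embedding
others = [ref + rng.normal(scale=0.3, size=512) for _ in range(4)]    # other photos, same identity
gen = ref + rng.normal(scale=0.05, size=512)                          # near-copy of the reference
print(round(copy_paste_score(gen, ref, others), 3))                   # positive => copy-paste tendency
```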
Unbelievable: building a unicorn worth over $1 billion started with humans pretending to be AI
机器之心· 2025-11-16 04:01
Core Insights
- The article discusses the unconventional startup journey of Fireflies.ai, which began with two entrepreneurs manually pretending to be an AI assistant to validate their business idea [7][10].

Group 1: Company Background
- Fireflies.ai was founded by two entrepreneurs who had previously experienced six failed startups before pivoting to create an AI meeting assistant [2][3].
- Initially, the founders had no actual AI technology and instead participated in meetings themselves, taking notes and later presenting them as AI-generated [4][6].

Group 2: Business Model and Growth
- The founders' approach of pretending to be AI allowed them to validate their business model and generate enough revenue to sustain their operations, eventually leading to the automation of their services [6][9].
- Fireflies.ai has achieved a valuation exceeding $1 billion, with a user base of over 20 million across 500,000 organizations, and has been profitable since 2023 [9].

Group 3: Product Features
- The AI assistant now boasts a transcription accuracy of up to 95%, supports 69 languages, and offers features like intelligent summarization and seamless integration with other tools [9].

Group 4: Ethical Concerns
- The initial method of using humans to impersonate AI raised significant ethical concerns, including issues of user privacy, potential data security risks, and the implications of misleading clients [13][14][15].
- Critics have pointed out that this approach could lead to legal repercussions and a culture of deception within the industry [17][18].
In the LLM context, is "continual learning" the optimal solution to the "memory" problem?
机器之心· 2025-11-16 01:30
Group 1
- The article discusses the concept of "Nested Learning" proposed by Google, which aims to address memory management in LLMs (Large Language Models) and the challenge of catastrophic forgetting [5][6][8]
- Nested Learning frames training as a multi-layered optimization problem, where a model is seen as a series of interconnected sub-problems, allowing new skills to be learned while avoiding the loss of previously acquired knowledge [6][7]
- The research introduces the "Continuous Memory System" (CMS), which treats memory as a set of modules that update at different frequencies, enhancing the model's ability to manage memory effectively (see the sketch after this list) [6][7]

Group 2
- The article highlights the importance of improving LLMs' memory capabilities to enable continual learning, allowing AI to retain contextual experiences, semantic knowledge, and procedural skills [8]
- A proposed three-layer memory architecture includes Model Weights for general knowledge, the KV Cache for intermediate results, and Context for relevant background information, allowing the model to respond appropriately at each level [8]
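One way to picture the "modules updating at different frequencies" idea is a set of memory buffers refreshed on different schedules: fast buffers track recent context while slow ones accumulate stable knowledge. The sketch below uses a simple exponential moving average as a stand-in update rule; it is an illustrative assumption, not Google's Nested Learning algorithm, and the module dimensions and periods are arbitrary.

```python
import numpy as np

class FrequencyMemory:
    def __init__(self, dim, period, decay):
        self.state = np.zeros(dim)
        self.period = period   # update every `period` steps (fast vs. slow memory)
        self.decay = decay     # how strongly old content is retained

    def maybe_update(self, step, observation):
        if step % self.period == 0:
            self.state = self.decay * self.state + (1 - self.decay) * observation

# Context-like fast memory, mid-term memory, weight-like slow memory.
modules = [
    FrequencyMemory(dim=8, period=1,   decay=0.5),
    FrequencyMemory(dim=8, period=10,  decay=0.9),
    FrequencyMemory(dim=8, period=100, decay=0.99),
]

rng = np.random.default_rng(0)
for step in range(1, 301):
    obs = rng.normal(size=8)
    for m in modules:
        m.maybe_update(step, obs)

# The three states now reflect very different time scales of the same stream.
print([np.round(m.state[:3], 2) for m in modules])
```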
NeurIPS 2025 Spotlight | NYU proposes QSVD: purely mathematical compression makes models lighter, faster, and more stable
机器之心· 2025-11-15 09:23
Core Insights
- The article discusses the development of QSVD, a novel framework for efficient compression of Vision-Language Models (VLMs) that combines singular value decomposition (SVD) and quantization, aiming to reduce computational cost while maintaining model performance [3][29].

Group 1: Background and Motivation
- Vision-Language Models (VLMs) serve as a crucial engine connecting visual understanding and language generation, enabling applications like image description and visual question answering [2].
- The large parameter counts of these models, often exceeding billions, lead to significant memory and computational demands, making practical deployment challenging [2][6].

Group 2: QSVD Framework
- QSVD applies a Joint SVD over the Query-Key-Value (QKV) matrices, giving them a unified low-rank approximation that reduces storage and computation (see the sketch after this summary) [10][24].
- The framework introduces Cross-layer Rank Allocation, which allocates ranks based on the importance of different layers, optimizing the compression budget [13][14].

Group 3: Technical Innovations
- QSVD integrates low-bit quantization and outlier smoothing techniques to enhance hardware efficiency and maintain high accuracy during quantization [15][18].
- The method significantly reduces memory usage by caching only a shared representation of the K/V values, halving the memory footprint during inference [12][19].

Group 4: Experimental Results
- The research team evaluated various models, including LLaVA-v1.5 and SmolVLM, demonstrating that QSVD achieves over 10% higher accuracy than existing methods like ASVD and SVD-LLM [20][22].
- The results indicate that QSVD not only compresses models but also retains more of their accuracy than competing compression methods, with inference speed improvements of up to 13 times [23][19].

Group 5: Conclusion and Future Directions
- QSVD represents a significant advance in the efficient compression of VLMs, focusing on the self-attention layers to improve inference efficiency while minimizing accuracy loss [29].
- Future research aims to extend the optimizations to cross-module joint compression and adaptive optimization, enhancing the deployability and accessibility of powerful models [29].
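The core "Joint SVD over QKV" idea, factoring the concatenated query/key/value projections so they share one low-rank basis, can be sketched in a few lines of NumPy. Shapes, the fixed rank, and the random weights are illustrative assumptions; the actual QSVD pipeline additionally performs cross-layer rank allocation, quantization, and outlier smoothing, and real attention weights have much faster-decaying spectra than this random toy.

```python
import numpy as np

d_model, rank = 256, 64
rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) / np.sqrt(d_model) for _ in range(3))

# Stack the three projections and factor them together so they share one basis.
W_qkv = np.concatenate([Wq, Wk, Wv], axis=1)        # (d_model, 3*d_model)
U, S, Vt = np.linalg.svd(W_qkv, full_matrices=False)

# Truncate to `rank`: one shared down-projection A and one joint up-projection B.
# Storage drops from 3*d^2 to roughly 4*r*d parameters (d*r for A, r*3d for B).
A = U[:, :rank] * S[:rank]                          # (d_model, rank)
B = Vt[:rank, :]                                    # (rank, 3*d_model)

x = rng.normal(size=(1, d_model))                   # a token activation
q, k, v = np.split((x @ A) @ B, 3, axis=1)          # one shared low-rank pass
print(np.linalg.norm(q - x @ Wq) / np.linalg.norm(x @ Wq))  # relative approximation error
```

The shared down-projection `x @ A` is also what makes the K/V-cache saving plausible: only the rank-r intermediate needs to be stored per token instead of full-width keys and values.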
Toward compute freedom: openEuler releases the world's first supernode operating system, built for AI
机器之心· 2025-11-15 09:23
Core Viewpoint
- The conference on operating systems, themed "Intelligent Leap Without Boundaries, Open Source for a Better Future," successfully gathered industry leaders to promote the development of the openEuler operating system and accelerate the global open-source software ecosystem [2].

Group 1: Development and Growth of openEuler
- The openEuler community has grown significantly over the past six years, with over 2,100 member organizations and more than 23,000 global contributors, serving over 5.5 million users [2].
- The cumulative installation of openEuler is expected to exceed 16 million sets by the end of 2025, establishing it as the preferred operating system for digital transformation in various industries [2].
- The community is set to embark on a new five-year development path, launching an operating system tailored for supernodes by the end of 2025, aiming to lead in the AI era and enhance globalization efforts [2][12].

Group 2: Strategic Importance of Basic Software
- Academician Ni Guangnan emphasized the strategic nature of basic software, advocating for independent innovation, collaborative ecosystem building, and sustained long-term investment [3].
- The transition to supernodes is recognized as a mainstream trend in computing infrastructure, with operating systems playing a crucial role in connecting hardware and applications in the intelligent era [3].

Group 3: Collaboration and Ecosystem Building
- The core of open source is collaboration, and the future of the ecosystem relies on co-creation and sharing among hardware partners, software vendors, and global developers [5].
- Huawei's CEO highlighted the rapid transformation brought by AI technologies and the need for operating systems that can support supernodes, contributing key capabilities to the openEuler community [6][10].

Group 4: Technological Innovations and Solutions
- The openEuler community has introduced the Intelligence BooM full-stack open-source AI solution, enhancing inference efficiency by 10% to 30% through heterogeneous collaboration [16].
- In industrial automation, openEuler has evolved its embedded capabilities, achieving microsecond-level response times and successful deployments in several well-known enterprises [16].

Group 5: Globalization Efforts
- New donors to the openEuler community include major chip manufacturers such as AMD, further strengthening the community's resources [18].
- The community has established deep technical cooperation with 15 global open-source organizations in areas such as AI, cloud computing, and embedded systems, enhancing its global presence [20].