量子位
OpenAI Is Short on GPUs Too! Demand Far Exceeds Supply, and the Company Admits Its Internal Scramble for Chips Has Gotten Frantic
量子位· 2025-10-20 10:29
Core Viewpoint
- OpenAI is facing a significant scarcity of computing power, which is critical for innovation in the AI field [1][2][4]

Resource Allocation Mechanism
- OpenAI has a structured yet contentious mechanism for allocating its limited computing resources [8]
- Resources are divided between the research and application sides, with major decisions made by the executive team [9][10]
- Within the research organization, allocation is determined by the chief scientist and the research director [12]
- A team led by Kevin Park manages the reallocation of idle GPUs to meet the demands of various projects [14][15]

Industry Implications
- The internal competition for computing resources at OpenAI reflects the broader dynamics of the AI industry, where computing power directly determines AI capability [16][17]
- The founder of AI chip company Groq has argued that whoever controls computing power controls AI [18]
- OpenAI spent $7 billion on computing power last year and is now building its own data centers, with compute deals approaching a trillion dollars [19][20]

Competitive Landscape
- The competition for computing resources is not only internal but extends to the entire AI compute market [20]
- Meta CEO Mark Zuckerberg has highlighted per-researcher compute as a key competitive advantage [22]
- Computing power now sits at the forefront of strategic importance for future AI development [23]
GPT-5 ≈ o3.1! OpenAI Details Its Thinking Mechanism for the First Time: RL Plus Pre-training Is the True Path to AGI
量子位· 2025-10-20 03:46
Core Insights
- The article discusses the evolution of OpenAI's models, focusing on GPT-5 as an iteration of the o3 model and arguing that it represents a significant advance in AI capabilities [1][4][23]

Model Evolution
- Jerry Tworek, OpenAI's VP of Research, views GPT-5 as an iteration of o3, emphasizing the need for a model that can think longer and interact autonomously with multiple systems [4][23]
- The transition from o1 to o3 marked a structural change in AI development, with o3 being the first genuinely useful model capable of using tools and contextual information effectively [19][20]

Reasoning Process
- The reasoning process of models like GPT-5 is likened to human thought, involving calculation, information retrieval, and self-learning [11]
- The concept of "chains of thought" has become prominent since the release of the o1 model, allowing models to articulate their reasoning in human language [12]
- Longer reasoning times generally yield better results, but user feedback indicates a preference for quicker responses, leading OpenAI to offer models with varying reasoning times [13][14]

Internal Structure and Research
- OpenAI's internal structure combines top-down and bottom-up approaches, concentrating on a few core projects while allowing researchers freedom within them [31][33]
- The company advanced from o1 to GPT-5 in roughly one year thanks to this operating structure and a strong research team [33]

Reinforcement Learning (RL)
- Reinforcement learning is central to OpenAI's models: pre-training combined with RL is what produces effective AI systems [36][57]
- Jerry explains RL as training a model through rewards and penalties, much like training a dog (see the sketch after this list) [37][38]
- DeepMind's introduction of deep RL significantly advanced the field and led to the first meaningful intelligent agents [39]

Future Directions
- Jerry believes the future of AI lies in agents capable of independent thought on complex tasks, with a focus on aligning model behavior with human values [53][54]
- The path to AGI will require both pre-training and RL, with new components added over time [56][58]
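The dog-training analogy maps onto the standard reward-and-penalty loop of RL. Below is a minimal, hypothetical sketch (a REINFORCE-style policy gradient on a toy 3-armed bandit) purely to illustrate that loop; it is not OpenAI's training setup, and the reward values and learning rate are invented for the example.

```python
# Toy REINFORCE: nudge the policy toward actions that earned higher reward.
import numpy as np

rng = np.random.default_rng(0)
logits = np.zeros(3)                      # policy parameters, one score per action
true_reward = np.array([0.2, 0.5, 0.8])   # hidden mean reward per action (assumed)

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

lr = 0.1
for step in range(2000):
    probs = softmax(logits)
    action = rng.choice(3, p=probs)                 # sample an action from the policy
    reward = rng.normal(true_reward[action], 0.1)   # environment returns a reward/penalty
    # Gradient of log p(action) w.r.t. logits is (one-hot - probs); scale it by the reward.
    grad = -probs
    grad[action] += 1.0
    logits += lr * reward * grad

print("learned action probabilities:", np.round(softmax(logits), 3))  # mass shifts to arm 2
```

The same reward-weighted update idea, scaled up with far richer reward signals, is what the interview describes layering on top of pre-training.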
AI Assistant Cici Quietly Tops the Charts Overseas, and It's ByteDance Again
量子位· 2025-10-20 03:46
Core Viewpoint
- The article discusses the emergence of a new AI assistant application named Cici, developed by ByteDance, which has rapidly gained popularity in several countries, signaling an increasingly competitive AI assistant market.

Group 1: Cici's Rise and Features
- Cici has seen rapid download growth, ranking as the most-downloaded app on Mexico's Google Play Store and among the top 10 free apps on the UK Apple App Store [2]
- The application draws on technology from ByteDance's other products, including image-editing and code-assistance tools, and uses OpenAI's GPT models and Google's Gemini for chat generation [8][9]
- Cici's interface design resembles that of Doubao, another ByteDance product; users can interact via text or voice, and the app supports image generation and analysis [10]

Group 2: Competitive Landscape in AI Assistants
- Doubao maintains a dominant position in the domestic AI assistant market, with cumulative downloads exceeding 100 million, followed by competitors such as Kimi, DeepSeek, and Tencent Yuanbao [16][22]
- The top four AI assistant products, including Doubao, account for roughly 93% of the market's user base, a pronounced "Matthew effect" [17][24]
- By daily active users (DAU), Doubao leads with 33 million, followed by DeepSeek with 25 million and Tencent Yuanbao with 16 million [23]

Group 3: ByteDance's Global Strategy
- Cici's success reflects ByteDance's strategy of expanding its AI capabilities globally, with a focus on specific markets such as the UK, Mexico, and Southeast Asia [12]
- Although Doubao leads across most dimensions, DeepSeek remains strong in the web-based AI assistant segment, posing a competitive challenge for ByteDance [27]
1.58-bit Holds Its Own Against FP16! Microsoft Unveils a New Model Distillation Framework, All Authors Are Chinese
量子位· 2025-10-20 03:46
Core Insights
- Microsoft has introduced a new distillation framework called BitNet Distillation (BitDistill), which quantizes models with minimal performance loss while cutting memory consumption to roughly 1/10 of FP16 [1][6][22]

Group 1: Framework Overview
- BitDistill has been validated on models with 4 billion parameters and below, such as Qwen and Gemma, and is in principle applicable to other Transformer models [2]
- The framework consists of three connected stages: model structure refinement, continued pre-training, and distillation-based fine-tuning [8]

Group 2: Model Structure Optimization
- The goal of the model structure stage is to support training of 1.58-bit models and to address the optimization instability common in low-precision training [9]
- BitDistill inserts a normalization module called SubLN into each Transformer layer to stabilize training by controlling the variance of activations [10][12]

Group 3: Continued Pre-training
- A lightweight continued pre-training phase helps the model gradually adapt its weights from full precision to a distribution suited to 1.58-bit representation [14][15]
- This phase lets the model "learn how to be quantized," reducing information loss during the fine-tuning stage [16]

Group 4: Distillation-based Fine-tuning
- BitDistill uses a dual distillation mechanism, logits distillation plus multi-head attention distillation, to recover the performance of the quantized model [18]
- Logits distillation uses the probability distribution of the full-precision model as "soft labels" to guide the quantized model (a minimal sketch of this loss follows the list) [19]

Group 5: Performance Evaluation
- BitDistill achieves performance close to full-precision models across downstream tasks while significantly reducing memory usage and improving inference speed [22]
- In text classification, the 1.58-bit model matched the accuracy of full-precision fine-tuned models and outperformed directly quantized models [23][24]
- In text summarization, the quality of BitDistill's generated text was nearly identical to that of full-precision models, with slight improvements in BLEU scores [25][27]

Group 6: Generalizability and Compatibility
- BitDistill has been applied to other pre-trained models such as Gemma and Qwen2.5, recovering performance with high fidelity [28]
- The framework is compatible with various quantization strategies, making it useful as a standalone distillation solution for multiple post-quantization optimization scenarios [28]
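To make the "soft labels" idea concrete, here is a minimal sketch of a standard logits-distillation loss: temperature-scaled KL divergence to the full-precision teacher combined with hard-label cross-entropy. The temperature, weighting, and the companion multi-head attention distillation term are assumptions for illustration, not BitDistill's published hyperparameters.

```python
import torch
import torch.nn.functional as F

def logits_distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Combine hard-label cross-entropy with soft-label KL to the full-precision teacher."""
    ce = F.cross_entropy(student_logits, labels)                        # hard labels
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)                # teacher "soft labels"
    log_soft_student = F.log_softmax(student_logits / T, dim=-1)
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (T * T)
    return alpha * ce + (1 - alpha) * kd

# Toy usage: a batch of 4 examples with 10 classes.
student_logits = torch.randn(4, 10, requires_grad=True)   # output of the 1.58-bit student
teacher_logits = torch.randn(4, 10)                        # output of the FP16 teacher
labels = torch.randint(0, 10, (4,))
loss = logits_distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
print(float(loss))
```

The soft targets carry the teacher's relative preferences over classes, which is what lets the low-precision student recover accuracy the raw labels alone would not provide.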
Which AI Is Best at Ordering Takeout? Meituan's LongCat Team Runs a Comprehensive Evaluation
量子位· 2025-10-20 01:16
Contributed by Meituan's LongCat team | QbitAI (WeChat public account: QbitAI)

Meituan's LongCat team has released VitaBench (Versatile Interactive Tasks Benchmark), an evaluation benchmark for large-model agents that targets complex problems in settings closely modeled on real life.

VitaBench takes three high-frequency everyday scenarios, food delivery ordering, dining out, and travel, as its representative settings, builds an interactive evaluation environment containing 66 tools, and designs composite tasks that span scenarios.

For example, a travel-planning task requires the agent to reason, call tools, and interact with the user to complete the full workflow from buying tickets to booking a restaurant.

The team is the first to quantitatively decompose agent tasks along three dimensions, deep reasoning, tool use, and user interaction, enabling controlled construction of complex problems.

Results show that even today's advanced reasoning models reach only about a 30% success rate on the main leaderboard (complex cross-scenario tasks), revealing a significant gap between current agents and the demands of real-life applications.

VitaBench is now fully open-sourced, intended to provide key infrastructure for developing and deploying agents in real-life scenarios.

Research background: a wide gulf between agent evaluation and real-world application

With the rapid progress of large language models in complex reasoning and tool calling, LLM-based agents are being applied ever more widely in real-life scenarios. ...
OpenAI Thought GPT-5 Had Made Big Math News; It Turned Out… Even Hassabis Found It Embarrassing
量子位· 2025-10-20 01:16
Core Viewpoint
- OpenAI's announcement of GPT-5 solving several Erdős mathematical problems was later revealed to be an exaggeration, as the AI merely retrieved existing solutions rather than independently solving the problems [5][13][14]

Group 1: Announcement and Initial Reactions
- OpenAI researcher Mark Sellke claimed that GPT-5 had made significant breakthroughs in mathematics by solving 10 previously unsolved Erdős problems [5][7]
- The announcement led to widespread excitement, with many mistakenly believing that GPT-5 had independently cracked long-standing mathematical challenges [9]
- DeepMind CEO Demis Hassabis and Meta's Yann LeCun publicly criticized the claims, highlighting the embarrassment surrounding the situation [3][4][10][16]

Group 2: Clarification and Reality Check
- Thomas Bloom, the creator of the website referenced by OpenAI, clarified that GPT-5 did not solve the problems but rather found existing solutions through online searches [12][13]
- The "unsolved" status on the website was due to Bloom's lack of awareness of the existing solutions, not because they had not been solved by the mathematical community [13][14]
- Following the backlash, researcher Sebastien Bubeck deleted his earlier tweet and acknowledged the misunderstanding, emphasizing the difficulty of literature retrieval [15]

Group 3: GPT-5's Capabilities and Context
- Despite the controversy, GPT-5 has demonstrated notable mathematical abilities, such as solving complex problems and providing key proofs in a short time [18][19][22]
- Previous successes of GPT-5 in mathematics contributed to the inflated expectations surrounding its capabilities [17][22]
- The incident reflects a growing desensitization to AI advancements, suggesting that without genuine breakthroughs, exaggerated claims may lead to significant misinterpretations [27]
The Value of Open Source for Robotics Goes Far Beyond What the Large-Model Era Imagined | Tang Wenbin in a Deep Conversation with the Hugging Face Founder
量子位· 2025-10-20 01:16
Core Viewpoint
- The article discusses the challenges in current robotics research, particularly the gap between simulation and real-world application, and introduces RoboChallenge.ai as a standardized, open, and reproducible evaluation platform for robotics [1][40][50]

Group 1: Current Challenges in Robotics
- Many models perform well in simulation but fail in real-world scenarios, a significant pain point in robotics research [1][40]
- There is currently no unified, open, and reproducible benchmark system to fairly compare different methods, strategies, and models in the robotics field [42]

Group 2: Introduction of RoboChallenge.ai
- RoboChallenge.ai is launched as an open, standardized platform for evaluating robotics models in real physical environments, allowing researchers to test their models remotely on real robots [5][50]
- The platform lets researchers worldwide submit models and run experiments remotely, bridging the gap between simulation and reality [50][52]

Group 3: Importance of Open Source in Robotics
- Open source is crucial for advances in robotics, as it enables collaboration and the sharing of models that can be adapted to different robots [12][21]
- The article emphasizes that open-source models are essential for running capabilities locally on the robot itself, improving safety and functionality [22][25]

Group 4: Evaluation Mechanisms and Community Involvement
- The article highlights the need for an independent evaluation mechanism in robotics, as current assessments often lack fairness and reproducibility [34][36]
- It also discusses the potential for community involvement in data collection and model testing, which can improve the diversity and robustness of robotic strategies [61][66]

Group 5: Future Directions and Expectations
- The article anticipates that in three to five years, embodied intelligence research will advance to the point where robots can perform longer and more complex tasks [77]
- The goal of RoboChallenge.ai is to provide a fair and open platform for evaluating a wide range of robotic models, contributing to the overall advancement of the field [76][78]
An Economics Nobel Laureate's Privileged Upbringing: Chanel's Karl Lagerfeld Helped with His Homework, and in the AI Era He Opposes Taxing Robots
量子位· 2025-10-19 08:10
Core Viewpoint
- The 2025 Nobel Prize in Economic Sciences was awarded to three scholars who highlighted the critical role of technological and scientific innovation in driving economic growth, emphasizing the importance of continuous investment in basic research for long-term economic advancement [2][5][3]

Group 1: Nobel Prize Winners and Their Contributions
- The prize was shared by Joel Mokyr, Philippe Aghion, and Peter Howitt, who revealed how technology and scientific innovation interact with market competition to foster economic growth [5][7]
- Joel Mokyr's research demonstrated the self-reinforcing relationship between scientific breakthroughs and technological applications, which is essential for sustained economic growth [7][11]
- Aghion and Howitt developed a pioneering mathematical model in the 1990s that explains how firms improve production processes and introduce higher-quality products through R&D investment, ultimately displacing established market leaders [8][30]

Group 2: Historical Context and Economic Growth
- Historically, economic growth was sporadic, with little change in living standards until the Industrial Revolution of the 18th century, which set off a self-reinforcing cycle of innovation and economic growth [21][22]
- Over the past two centuries, many countries have maintained an average economic growth rate of about 2%, which, through compounding, produces large income gains over decades (a quick compounding calculation follows this list) [23][25]
- Joseph Schumpeter's concept of "creative destruction" explains that economic progress is driven by innovation that disrupts existing industries and creates new growth opportunities [26][28]

Group 3: Mechanisms of Innovation and Economic Dynamics
- Mokyr identified two types of "useful knowledge" that drive innovation: propositional knowledge (understanding why nature works as it does) and prescriptive knowledge (practical techniques and instructions) [30][29]
- Aghion and Howitt's model illustrates that the continuous replacement of old firms by new ones is a key engine of economic growth, as new companies strive to innovate and outperform established players [34][36]
- The rise of AI is currently driving another wave of creative destruction, reinforcing the relevance of the laureates' research [40][41]

Group 4: Implications of Innovation
- Innovation creates new winners while potentially sidelining others, raising concerns about job displacement and inequality [41][42]
- A robust policy framework is needed to manage the effects of innovation and prevent market failures, ensuring that the mechanisms behind creative destruction are preserved [43][44]
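A quick back-of-the-envelope check of the compounding claim above (illustrative arithmetic only, not figures from the article):

```python
# Steady 2% annual growth compounds into large cumulative gains.
for years in (35, 100, 200):
    print(f"{years} years at 2%/yr -> {1.02 ** years:.1f}x initial income")
# 35 years: ~2.0x, 100 years: ~7.2x, 200 years: ~52.5x
```

So an economy growing at a "modest" 2% roughly doubles incomes every generation and multiplies them many-fold over the two centuries the laureates study.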
The Father of LSTM Fires at Kaiming He: My Student Is the True Founder of Residual Learning
量子位· 2025-10-19 06:10
Core Viewpoint
- The article discusses the historical context and contributions of Sepp Hochreiter and Jürgen Schmidhuber to the development of residual learning and its impact on deep learning, emphasizing that the concept of residual connections was introduced by Hochreiter in 1991, long before its popularization in ResNet [3][12][26]

Group 1: Historical Contributions
- Sepp Hochreiter systematically analyzed the vanishing gradient problem in his 1991 doctoral thesis and proposed recurrent residual connections to address it [3][12]
- The core idea of the recurrent residual connection is a self-connecting neuron with a fixed weight of 1.0, which keeps the error signal constant during backpropagation (a minimal sketch of the residual idea follows this list) [13][14]
- LSTM, introduced in 1997 by Hochreiter and Schmidhuber, built on this foundation, enabling effective learning of long-term dependencies in tasks such as speech and language processing [18][19]

Group 2: Evolution of Residual Learning
- The Highway network, introduced in 2015, successfully trained deep feedforward networks with hundreds of layers by carrying over the gated residual concept from LSTM [23]
- ResNet, which gained significant attention in the same year, used residual connections to stabilize error propagation in deep networks, allowing networks with hundreds of layers to be trained [24][26]
- Both Highway networks and ResNet share core principles with those Hochreiter established in 1991, demonstrating the enduring relevance of his contributions to deep learning [26]

Group 3: Ongoing Debates and Recognition
- Jürgen Schmidhuber has publicly claimed that various architectures, including AlexNet, VGG Net, GANs, and Transformers, were inspired by his lab's work, although these claims have not been universally accepted [28][31]
- The ongoing dispute over attribution highlights how difficult it is to credit foundational work in a rapidly evolving field [10][32]
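To illustrate the idea shared by the 1991 recurrent residual connection, Highway networks, and ResNet, here is a minimal, hypothetical feedforward sketch: the identity skip path (effectively a fixed weight of 1.0) lets gradients reach early layers without vanishing. It is a generic illustration, not the exact architecture from any of the cited works.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.f(x)   # identity skip path (weight 1.0) plus a learned residual F(x)

# Even with many stacked blocks, gradients still reach the input through the skip paths.
model = nn.Sequential(*[ResidualBlock(16) for _ in range(50)])
x = torch.randn(8, 16, requires_grad=True)
model(x).sum().backward()
print("gradient norm at the input:", float(x.grad.norm()))  # stays well away from zero
```

Dropping the `x +` term turns this back into a plain 50-layer stack, where the input gradient typically shrinks sharply, which is the vanishing-gradient behavior Hochreiter analyzed.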
Making Models "Watch a Video and Write the Webpage": GPT-5 Scores Only 36.35! Shanghai AI Lab Jointly Releases the First video2code Benchmark
量子位· 2025-10-19 04:10
Core Insights
- The article discusses the introduction of IWR-Bench, a new benchmark for evaluating the interactive webpage reconstruction capabilities of large vision-language models (LVLMs) by assessing their ability to generate code from user interaction videos rather than static screenshots [1][2]

Group 1: IWR-Bench Overview
- IWR-Bench shifts the focus from static image-to-code tasks to dynamic video-to-code tasks, requiring models to interpret user interaction videos along with all necessary static resources [2][5]
- The benchmark includes 113 real-world website tasks and 1001 interaction actions, providing a comprehensive evaluation of models' capabilities in generating interactive web code [5][12]
- The evaluation framework employs an automated agent to simulate user interactions, assessing both functional correctness (Interactive Functionality Score, IFS) and visual fidelity (Visual Fidelity Score, VFS) [10][11]

Group 2: Model Performance
- In testing 28 mainstream models, the best-performing model, GPT-5, achieved a total score of 36.35%, with an IFS of 24.39% and a VFS of 64.25%, indicating significant shortcomings in generating interactive logic [5][14][16]
- The results reveal that all models exhibit higher visual fidelity than functional correctness, highlighting a critical gap in their ability to generate event-driven logic [16]
- Specialized video understanding models performed poorly compared to general multimodal models, suggesting that the task's nature differs significantly from traditional video understanding tasks [20]

Group 3: Key Findings
- The primary bottleneck identified is functionality implementation, where models struggle to generate operational logic despite achieving high visual fidelity [16]
- The "thinking" versions of models showed some improvement, but the overall enhancement was limited, indicating that the foundational model capabilities remain crucial [17][19]
- IWR-Bench represents a significant step in advancing AI from understanding static webpages to comprehending dynamic interactions, emphasizing the ongoing challenges in this domain [20]