A historic first! o3 finds a Linux kernel zero-day: it read 12,000 lines of code 100 times to catch it, with no tool calls
QbitAI · 2025-05-25 03:40

Core Viewpoint
- The article reports that OpenAI's o3 model identified a zero-day vulnerability in the Linux kernel, and presents this as evidence of what large models can do in security research and vulnerability detection [1][2][5].

Group 1: Vulnerability Discovery
- The vulnerability, CVE-2025-37899, is a use-after-free in the handler for the SMB "logoff" command [4].
- According to the article, this is the first publicly discussed case of a vulnerability being discovered by a large model [5].
- The discovery needed very little tooling: it used only the o3 API, with no tool calls or complex setup [3][6].

Group 2: Research Methodology
- Sean Heelan, an independent researcher, first tested o3 against a vulnerability he had already found by hand (CVE-2025-37778) to gauge the model's capabilities [12].
- He gave the model the code for a session handler, told it to look for use-after-free vulnerabilities, and ran each experiment 100 times to measure the success rate [13].
- o3 performed notably well in this test, identifying the vulnerability in roughly 3,300 lines of complex code [15].

Group 3: Comparative Analysis
- Heelan ran the same test with Claude 3.7 and Claude 3.5. o3 clearly outperformed both: Claude 3.7 found the vulnerability in 3 of 100 runs, and Claude 3.5 in none [18].
- o3's output was structured and clear, reading like a human-written vulnerability report, while Claude's output was more verbose and less organized [17].

Group 4: New Vulnerability Discovery
- When o3 was given a larger input of about 12,000 lines, its success rate on the original vulnerability dropped to 1%, but it also reported a vulnerability that Heelan had not known about [21].
- The new bug was also a use-after-free, which shows that the model can find previously unknown vulnerabilities [22].

Group 5: Repair Suggestions
- o3's suggested fix was more thorough than the one Heelan had first proposed, which points to a possible role for the model in vulnerability remediation [25].
- Heelan acknowledged that, in theory, using o3 to find and fix the bug could have produced a better result than his manual work, although false positives remain a problem in practice [27][28].

Group 6: Future Implications
- Heelan concluded that large models are approaching human-level ability in program analysis, which may change how code auditing is done [30].
- The article also raises the concern that the same capabilities could be misused for malicious purposes, so the security community needs to stay vigilant [31].
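The repeated-run methodology in Group 2 and the comparison in Group 3 come down to counting how many of N independent runs found the bug. Below is a minimal Python sketch of that bookkeeping, not Heelan's actual harness. The names `success_rate` and `fixed_hits` are hypothetical, and the stub trial only replays the hit counts quoted in the summary; it does not call any model.

```python
from typing import Callable, Dict


def success_rate(run_trial: Callable[[int], bool], n_runs: int = 100) -> float:
    """Call run_trial(i) for i in 0..n_runs-1 and return the fraction that returned True."""
    if n_runs <= 0:
        raise ValueError("n_runs must be positive")
    hits = sum(1 for i in range(n_runs) if run_trial(i))
    return hits / n_runs


def fixed_hits(n_hits: int) -> Callable[[int], bool]:
    """Hypothetical stub: pretends the first n_hits runs found the bug.

    A real trial would send the code and prompt to a model and check
    whether the returned report describes the known vulnerability.
    """
    return lambda run_index: run_index < n_hits


# Hit counts quoted in the summary above (out of 100 runs each).
reported_hits: Dict[str, int] = {"Claude 3.7": 3, "Claude 3.5": 0}

rates = {name: success_rate(fixed_hits(h)) for name, h in reported_hits.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
```

In a real experiment the stub would be replaced by a function that queries the model once and decides whether the returned report matches the known bug. That judgment is the hard part, and it is assumed away here.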