量子位
GPT-5 provides key ideas for quantum computing! A leading researcher raves: it delivered the "decisive blow" in under half an hour
量子位· 2025-09-29 03:46
Core Viewpoint
- GPT-5 is potentially underestimated in its capabilities, particularly in assisting with complex quantum computing problems, as demonstrated by its role in providing critical insights during a recent research collaboration [1][20][26].

Group 1: GPT-5's Role in Quantum Research
- Scott Aaronson, a prominent figure in quantum computing, said the insights provided by GPT-5 were impressive enough to have come from a highly intelligent student [2][3].
- In a recent collaboration, GPT-5 contributed significantly to a paper titled "Limits to black-box amplification in QMA," which explores the limitations of amplification techniques in quantum complexity classes [5][4].
- The research involved analyzing how the maximum eigenvalue of a Hermitian matrix changes with a parameter, a task that GPT-5 helped expedite, leading to a breakthrough in the research (a toy numerical sketch of this kind of analysis follows this summary) [22][25].

Group 2: Quantum Complexity Class (QMA)
- QMA (Quantum Merlin Arthur) is a complexity class describing a verification process in which a verifier (Arthur) checks the validity of a quantum state provided by a prover (Merlin) [9][10].
- A long-standing question about QMA is whether its completeness can be improved from 2/3 to 1, i.e., whether a verifier can be made to accept a correct proof with certainty [10][12].
- Recent findings indicate that any QMA protocol can be amplified to achieve an exponentially small completeness error, showcasing the potential for significant advances in quantum computing [15][19].

Group 3: Industry Reactions and Developments
- The collaboration between the researchers and GPT-5 has sparked discussions about the changing dynamics of research and the role of AI in scientific discovery [27][28].
- There are concerns regarding OpenAI's recent model downgrades, which have led to user dissatisfaction and calls for transparency about which model is being used [30][31].
- OpenAI has responded by stating that the model switching is part of a "safety routing test" aimed at handling sensitive topics more rigorously [31].
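To make the eigenvalue-tracking step concrete, here is a minimal numerical sketch, not the construction used in the paper, of how the largest eigenvalue of a parameterized Hermitian matrix H(t) = H0 + t·V moves as t varies; the matrices are random placeholders chosen purely for illustration.

```python
# Illustrative sketch only: traces the largest eigenvalue of a parameterized
# Hermitian matrix H(t) = H0 + t * V as t varies. The matrices here are random
# placeholders, not the construction from the QMA amplification paper.
import numpy as np

rng = np.random.default_rng(0)


def random_hermitian(n: int) -> np.ndarray:
    """Build a random n x n Hermitian matrix."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2


n = 8
H0 = random_hermitian(n)   # base Hamiltonian (placeholder)
V = random_hermitian(n)    # perturbation direction (placeholder)

for t in np.linspace(0.0, 1.0, 5):
    # eigvalsh returns the real eigenvalues of a Hermitian matrix in ascending order
    lam_max = np.linalg.eigvalsh(H0 + t * V)[-1]
    print(f"t = {t:.2f}  ->  largest eigenvalue = {lam_max:.4f}")
```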
Ten "Genius Youth" who have left Huawei
量子位· 2025-09-28 11:54
Core Viewpoint
- The article traces the paths of participants in Huawei's "Genius Youth" program after leaving the company, highlighting their shift from Huawei to entrepreneurial and academic careers, particularly in the AI sector, and showcasing their contributions and achievements in the industry [1][2][82].

Group 1: Entrepreneurial Paths
- The "Genius Youth" program has produced notable entrepreneurs, with six of the ten profiled participants choosing to start their own companies [82].
- 彭志辉, a prominent figure, left Huawei to co-found 智元机器人, which has secured significant funding and contracts, indicating strong market potential [10][15].
- 季宇 founded 行云集成电路, focusing on AI chip development, and has successfully launched a new product with competitive pricing [34][36].
- 王乃行 established 博思芯宇, targeting AI chip lifecycle management, and has also secured substantial funding [41][43].
- 丁文超 moved from Huawei to academia and then co-founded 它石智航, which focuses on embodied intelligence and has reached significant funding milestones [48][50].
- 黄青虬, known for his work on lidar algorithms, is also venturing into entrepreneurship in embodied intelligence [56].

Group 2: Academic Paths
- Four participants returned to academia, contributing to research and education in their respective fields [82].
- 周满 joined 华中科技大学, focusing on cybersecurity and wireless systems [62][63].
- 任宇翔 became an assistant professor at 南京大学, specializing in graph computing and AI models [70][72].
- 徐科 returned to 南京大学, where he works on data intelligence and visualization research [75][76].
- 邵典 took a position at 西北工业大学, focusing on AI and computer vision [81].

Group 3: Background of the "Genius Youth" Program
- The "Genius Youth" program was initiated by 任正非 in 2019 with the aim of cultivating top talent in key technological fields [85][88].
- Participants were offered highly competitive salaries, with the top tier reaching 2.01 million yuan per year, attracting elite graduates [88][90].
- The program has shaped the careers of its participants, many of whom have gone on to make significant contributions to the tech industry [91][92].
Latest from a Transformer author's startup: a new open-source framework breaks the evolutionary-computation bottleneck, with sample efficiency up by tens of times
量子位· 2025-09-28 11:54
Core Insights
- The article discusses the launch of ShinkaEvolve, an open-source framework developed by Sakana AI that significantly improves sample efficiency across a range of computational tasks, achieving with only 150 samples results that previously required thousands of evaluations [1][3][22].

Group 1: Framework Overview
- ShinkaEvolve lets large language models (LLMs) optimize their own code while staying efficient, likened to fitting evolutionary computation with an "acceleration engine" [3][6].
- The framework demonstrates performance comparable to Google DeepMind's AlphaEvolve but with higher sample efficiency and open-source accessibility [6][22].

Group 2: Key Innovations
- The framework incorporates three architectural innovations that improve its performance on tasks such as mathematical optimization, agent design, and competitive programming [5][11].
- The first is a parent-sampling technique that balances exploration and exploitation through a layered strategy and the integration of multiple methods [11][13].
- The second is a novelty rejection-sampling method that cuts wasted computation by filtering out low-novelty variants with a two-tiered mechanism [14][16].
- The third is a multi-armed-bandit LLM-selection strategy based on the UCB1 algorithm, which dynamically schedules LLMs according to how they perform in different phases of a task (a minimal sketch of UCB1-style selection follows this summary) [17][18].

Group 3: Performance Validation
- In mathematical optimization, ShinkaEvolve needed only 150 evaluations to optimize the placement of 26 circles within a unit square, compared with the thousands required by AlphaEvolve [20][22].
- For agent design, experiments showed that ShinkaEvolve outperformed baseline models on mathematical reasoning problems, reaching peak performance with just seven LLM queries [23][25].
- On competitive-programming benchmarks, ShinkaEvolve improved average scores by 2.3% across ten AtCoder problems without extensive code restructuring [28].
- The framework also excelled at improving load-balancing loss functions for mixture-of-experts models, yielding higher accuracy and lower perplexity across multiple downstream tasks [30][32].
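For readers unfamiliar with UCB1, here is a minimal sketch of bandit-style LLM scheduling in the spirit described above. The model names and the reward signal are hypothetical placeholders, not ShinkaEvolve's actual implementation.

```python
# Minimal UCB1 bandit sketch for choosing among candidate LLMs, in the spirit of
# ShinkaEvolve's model-selection strategy. Model names and the reward function
# are hypothetical placeholders, not the framework's actual code.
import math
import random


class UCB1LLMSelector:
    def __init__(self, models):
        self.models = models
        self.counts = {m: 0 for m in models}      # times each model was chosen
        self.rewards = {m: 0.0 for m in models}   # cumulative reward per model

    def select(self):
        total = sum(self.counts.values())
        # Try every model once before applying the UCB1 formula.
        for m in self.models:
            if self.counts[m] == 0:
                return m
        # UCB1: mean reward plus an exploration bonus that shrinks with use.
        return max(
            self.models,
            key=lambda m: self.rewards[m] / self.counts[m]
            + math.sqrt(2 * math.log(total) / self.counts[m]),
        )

    def update(self, model, reward):
        self.counts[model] += 1
        self.rewards[model] += reward


selector = UCB1LLMSelector(["llm-a", "llm-b", "llm-c"])  # placeholder model ids
for step in range(20):
    chosen = selector.select()
    # Stand-in for "how much did this model's proposed mutation improve fitness".
    reward = random.random()
    selector.update(chosen, reward)
print(selector.counts)
```

In a setting like ShinkaEvolve's, the reward would reflect how much a model's proposed code mutation improved program fitness, so models that are useful in the current phase of the task get sampled more often.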
A major upgrade for robot perception! Lightweight injection of geometric priors lifts success rates by 31%
量子位· 2025-09-28 11:54
Current enhancement schemes based on explicit depth input are effective, but they depend on extra sensors or depth-estimation networks, which brings problems such as deployment difficulty and noisy accuracy.

Contributed by the Evo-0 team
量子位 | 公众号 QbitAI

In robot learning, getting AI to truly "understand" the three-dimensional world has long been a hard problem.

VLA models are usually built on pretrained vision-language models (VLMs) and trained only on 2D image-text data, so they lack the 3D spatial understanding needed for real-world manipulation.

To address this, Shanghai Jiao Tong University and the University of Cambridge propose Evo-0, a lightweight method for strengthening the spatial understanding of vision-language-action (VLA) models by implicitly injecting 3D geometric priors, with no explicit depth input or extra sensors.

The method uses the visual geometry foundation model VGGT to extract 3D structural information from multi-view RGB images and fuses it into the original vision-language model, yielding a marked improvement in spatial perception.

In RLBench simulation experiments covering five tasks that require fine-grained manipulation, Evo-0's average success rate exceeds the pi0 baseline by 15% and openvla-oft by 31%.

Evo-0: fusing 2D and 3D representations

Evo-0 uses VGGT as a spatial encoder, introducing the t^3D tokens that VGGT extracts for 3D-structure tasks during its training. These tokens carry geometric information such as depth context and cross-view spatial correspondences.

The model introduces a cross- ...
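The excerpt above breaks off where Evo-0's fusion module is introduced. As a hedged illustration only, a common way to fuse such geometry tokens into a VLM is a cross-attention layer in which the 2D visual-language tokens attend to the geometry tokens; the sketch below assumes that pattern, and its shapes, names, and random weights are hypothetical rather than Evo-0's actual design.

```python
# Hedged illustration of fusing geometry tokens into VLM visual features via
# single-head cross-attention (queries = VLM tokens, keys/values = geometry tokens).
# This is NOT Evo-0's actual fusion module; shapes and names are hypothetical.
import numpy as np

rng = np.random.default_rng(0)


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(vlm_tokens, geo_tokens, d_model=64):
    """vlm_tokens: (n_q, d_model); geo_tokens: (n_kv, d_model)."""
    # Random projection weights stand in for learned parameters.
    w_q = rng.normal(scale=d_model ** -0.5, size=(d_model, d_model))
    w_k = rng.normal(scale=d_model ** -0.5, size=(d_model, d_model))
    w_v = rng.normal(scale=d_model ** -0.5, size=(d_model, d_model))

    q, k, v = vlm_tokens @ w_q, geo_tokens @ w_k, geo_tokens @ w_v
    attn = softmax(q @ k.T / np.sqrt(d_model))   # (n_q, n_kv) attention weights
    fused = vlm_tokens + attn @ v                # residual add keeps the 2D features
    return fused


vlm_tokens = rng.normal(size=(16, 64))   # placeholder 2D visual-language tokens
geo_tokens = rng.normal(size=(8, 64))    # placeholder 3D-structure tokens from VGGT
print(cross_attention(vlm_tokens, geo_tokens).shape)  # (16, 64)
```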
HLE, "Humanity's Last Exam," broken past 60 points for the first time! Eigen-1, built on DeepSeek V3.1, clearly leads Grok 4 and GPT-5
量子位· 2025-09-28 11:54
Core Insights
- The article highlights a significant breakthrough in AI capabilities: the Eigen-1 multi-agent system achieved 48.3% Pass@1 accuracy and 61.74% Pass@5 accuracy on the HLE Bio/Chem Gold test set, surpassing major competitors such as Google Gemini 2.5 Pro and OpenAI GPT-5 [1][5][39].

Technical Innovations
- Eigen-1's results are attributed to three mechanisms: Monitor-based RAG, Hierarchical Solution Refinement (HSR), and Quality-Aware Iterative Reasoning (QAIR) [3][15][20].
- Monitor-based RAG reduces the "tool tax" associated with traditional retrieval-augmented generation systems, cutting token consumption by 53.5% and workflow iterations by 43.7% while maintaining higher accuracy [11][12][37].
- HSR introduces a hierarchical collaboration pattern in which stronger solutions absorb valuable insights from weaker ones, improving the overall problem-solving process [15][18].
- QAIR adapts the depth of exploration during iterative reasoning to the quality of the current answer, making resource use more efficient (a rough sketch of the idea follows this summary) [20][21].

Performance Metrics
- Eigen-1 leads competitors by a clear margin, with Pass@1 and Pass@5 scores of 48.3% and 61.74% respectively on HLE Bio/Chem Gold, and strong results on SuperGPQA Hard and TRQA as well [27][22].
- The article includes a comparative table of model performance, highlighting Eigen-1's superior results [22].

Insights on Error Patterns
- Analysis shows that 92.78% of errors stem from the reasoning process, indicating that the core challenge is integrating knowledge with reasoning seamlessly rather than retrieving knowledge alone [24][25].
- Execution and understanding errors are comparatively rare, suggesting that models have matured in instruction comprehension [26].

Component Contribution Analysis
- Ablation studies quantify each component's contribution: the baseline system reached only 25.3% accuracy without external knowledge, while the full system reached 48.3% with efficient token usage [29][31].

Implications for AI in Science
- The breakthrough points to a new paradigm for AI-assisted scientific research, suggesting AI can become a powerful ally for scientists tackling complex problems [39][40].
- The research team plans to keep optimizing the architecture and to explore applications in other scientific fields, signaling a commitment to advancing AI in research workflows [42].
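QAIR is described only at a high level here. The following is a rough, hedged sketch of the idea of "iterate until the answer's quality clears a threshold, spending more refinement effort when quality is low"; every function is a placeholder stub, not Eigen-1's code.

```python
# Rough sketch of a quality-aware iterative reasoning loop in the spirit of QAIR.
# Every function below is a placeholder stub; Eigen-1's actual components
# (agents, scorers, retrieval) are not described in this excerpt.
import random


def draft_answer(question: str) -> str:
    return f"draft answer to: {question}"               # stub for an LLM call


def score_quality(answer: str) -> float:
    return random.random()                              # stub for a quality judge


def refine(answer: str, effort: int) -> str:
    return answer + f" [refined with effort {effort}]"  # stub for a refinement step


def qair(question: str, threshold: float = 0.8, max_rounds: int = 5) -> str:
    answer = draft_answer(question)
    for _ in range(max_rounds):
        quality = score_quality(answer)
        if quality >= threshold:
            break                                       # good enough: stop early
        # Lower quality -> allocate more refinement effort this round.
        effort = int((threshold - quality) * 10) + 1
        answer = refine(answer, effort)
    return answer


print(qair("Which reagent quenches the radical intermediate?"))
```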
Jensen Huang: NVIDIA was too poor when OpenAI was raising money; we should have given them everything we had back then
量子位· 2025-09-28 06:19
Wen Le (闻乐), from 凹非寺
量子位 | 公众号 QbitAI

In his latest in-depth conversation, running nearly two hours, Jensen Huang not only laid out the underlying logic of the hundred-billion-dollar partnership with OpenAI but also offered a series of judgments about the AI industry. Overall, the conversation unpacked the logic behind the explosion in AI compute and covered key directions such as corporate strategy, technical roadmaps, and market trends across today's AI landscape. His main points:

Demand for inference compute has surged by a factor of nearly one billion; AI is evolving from recalling answers to thinking through problems.

NVIDIA's 100-billion-dollar investment is aimed at the "infrastructure dividend of the AI era"; OpenAI will become the next trillion-dollar hyperscaler after Meta and Google; and the idea that the investment is a roundabout way of getting OpenAI to buy NVIDIA's own chips is a misunderstanding.

The era of general-purpose computing is over; worldwide, over a trillion dollars' worth of computing infrastructure will shift entirely to accelerated computing and AI.

"AI overcapacity" is a false premise; until the transition from general-purpose to accelerated computing is complete, the compute gap will only keep widening.

...

Here is what this conversation, which netizens called "the best NVIDIA interview yet," actually covered.

NVIDIA and OpenAI: building the "compute railway network" of the AI era

The conversation opened by breaking down the deep collaboration NVIDIA and OpenAI are pursuing in three core areas, of which OpenAI building its own AI infrastructure is the biggest highlight.

OpenAI sought NVIDIA's investment early on, but Huang said: we were too poor back then; we should have given them all our money.

This is what Jensen Huang said in his latest BG2 interview ...
New work from Danqi Chen: a third path for reinforcement learning in large models, with an 8B model surpassing GPT-4o
量子位· 2025-09-28 04:56
Core Viewpoint
- The article presents RLMT (Reinforcement Learning with Model-rewarded Thinking), a method that combines the advantages of RLHF (Reinforcement Learning from Human Feedback) and RLVR (Reinforcement Learning with Verifiable Rewards), enabling an 8-billion-parameter model to outperform GPT-4o and rival Claude-3.7-Sonnet [1][4][11].

Group 1: Methodology and Performance
- RLMT requires the model to generate a chain of thought (CoT) before producing an answer; the answer is then evaluated by a reward model trained on human preferences (a minimal sketch follows this summary) [5][17].
- The method can be applied directly to base models without supervised fine-tuning (SFT), significantly reducing post-training costs [6][22].
- In benchmark tests, the L3.1-8B-RLMT model achieved an average score of 84.3, surpassing larger models such as GPT-4o and Claude-3.7-Sonnet [7].

Group 2: Training Process
- The training process involves generating a reasoning trajectory conditioned on the user prompt, then scoring the final answer with a reward model [14].
- Two training setups are highlighted: warm-start (using SFT data) and zero (direct training without SFT), both of which lead to improved performance [21][19].
- RLMT shifts the model's reasoning style toward something closer to human thought processes, producing higher-quality dialogue and writing [19].

Group 3: Implications and Future Directions
- RLMT sets a new baseline for general-purpose reinforcement learning, underscoring the importance of defining preferences in the post-training era [8].
- The results show that smaller models can outperform much larger ones, suggesting a shift in focus toward training efficiency [22].
- The research team, led by Danqi Chen, plans to further explore natural language understanding and reasoning capabilities in future work [24][25].
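To make the training recipe concrete, here is a minimal, hedged sketch of one RLMT-style data-collection step: the policy emits a chain of thought plus a final answer, a preference-trained reward model scores the answer, and the resulting triples would feed a policy-gradient update. All components are stubs, not the paper's implementation.

```python
# Hedged sketch of one RLMT-style step: the policy emits a chain of thought plus
# a final answer, only the final answer is scored by a preference reward model,
# and the (prompt, CoT, answer, reward) records feed a policy-gradient update.
# Every component below is a stub, not the paper's implementation.
import random


def policy_generate(prompt: str) -> tuple[str, str]:
    """Stub policy: returns (chain_of_thought, final_answer)."""
    cot = f"<think> plan a response to '{prompt}' step by step </think>"
    answer = f"answer to '{prompt}'"
    return cot, answer


def reward_model(prompt: str, answer: str) -> float:
    """Stub preference reward model trained on human comparisons."""
    return random.random()


def rlmt_collect_batch(prompts):
    batch = []
    for prompt in prompts:
        cot, answer = policy_generate(prompt)
        r = reward_model(prompt, answer)   # only the final answer is scored
        batch.append({"prompt": prompt, "cot": cot, "answer": answer, "reward": r})
    return batch  # would be handed to an RL policy-gradient optimizer


print(rlmt_collect_batch(["Write a toast for a retirement party."]))
```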
Altman and a founding father of quantum computing discuss GPT-8
量子位· 2025-09-28 03:39
Core Viewpoint
- The dialogue between Sam Altman and David Deutsch highlights the ongoing debate over whether AI can evolve into a conscious superintelligence, with differing opinions on the definitions and standards of AGI (artificial general intelligence) and ASI (artificial superintelligence) [3][8].

Group 1: Discussion on AI and Consciousness
- Altman believes that future systems such as GPT-8 could potentially understand concepts as complex as quantum gravity and explain their reasoning process, challenging Deutsch's skepticism about AI achieving consciousness [22].
- Deutsch argues that while AI can perform impressive tasks, it lacks the intrinsic qualities of human intelligence, such as intuition and the ability to create genuinely original ideas, which he sees as essential to true AGI [11][12][18].

Group 2: Perspectives on Human Intelligence
- The conversation emphasizes that human intelligence is characterized by the ability to narrate one's own story and actively choose one's motivations, in contrast to the mechanical information processing of current AI systems [19][21].
- The participants also discuss the view that there is no definitive test for AGI, suggesting that existing methods cannot adequately measure the capabilities of a truly general intelligence [15][16].

Group 3: Contributions of David Deutsch
- David Deutsch is recognized as a foundational figure in quantum computing and quantum information theory, having proposed theoretical frameworks that underpin the field [23][24].
- His work includes the Deutsch-Jozsa algorithm, which demonstrated an exponential speedup of a quantum algorithm over classical ones and laid groundwork for later advances in quantum computing (a toy simulation follows this summary) [26].
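As a side note on the algorithm mentioned above, here is a toy numpy simulation of the Deutsch-Jozsa algorithm in its phase-oracle form; it is an illustrative sketch rather than the original circuit presentation. One oracle query suffices to tell a constant Boolean function from a balanced one, whereas a deterministic classical algorithm may need 2^(n-1)+1 queries.

```python
# Toy state-vector simulation of the Deutsch-Jozsa algorithm (phase-oracle form):
# one oracle query distinguishes a constant Boolean function from a balanced one.
import numpy as np


def hadamard_n(n: int) -> np.ndarray:
    """Tensor product of n single-qubit Hadamard gates."""
    h = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    out = np.array([[1.0]])
    for _ in range(n):
        out = np.kron(out, h)
    return out


def deutsch_jozsa(f_values: np.ndarray) -> str:
    """f_values[x] = f(x) in {0,1} for x = 0..2^n-1; f must be constant or balanced."""
    n = int(np.log2(len(f_values)))
    state = np.zeros(2 ** n)
    state[0] = 1.0                          # start in |0...0>
    state = hadamard_n(n) @ state           # uniform superposition over all inputs
    state = (-1.0) ** f_values * state      # phase oracle: |x> -> (-1)^f(x) |x>
    state = hadamard_n(n) @ state
    p_all_zero = abs(state[0]) ** 2         # probability of measuring |0...0>
    return "constant" if p_all_zero > 0.5 else "balanced"


n = 3
constant_f = np.zeros(2 ** n, dtype=int)          # f(x) = 0 everywhere
balanced_f = np.array([0, 1] * (2 ** (n - 1)))    # half zeros, half ones
print(deutsch_jozsa(constant_f))   # -> constant
print(deutsch_jozsa(balanced_f))   # -> balanced
```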
DeepMind is first to propose CoF: video models have their own chain of thought
量子位· 2025-09-28 03:39
Core Viewpoint
- DeepMind introduces the concept of Chain-of-Frames (CoF) for video models, paralleling Chain-of-Thought (CoT) in language models and suggesting a shift toward general-purpose visual understanding in machine vision [1][3][28].

Group 1: Introduction of CoF
- The CoF concept grows out of the question of whether video generation models can achieve general-purpose capabilities similar to large language models (LLMs) without task-specific training [6][7].
- The goal is to validate the hypothesis that video models can perform a wide variety of visual tasks through a single underlying mechanism trained on vast data [7][8].

Group 2: Capabilities of Veo 3
- Veo 3 demonstrates four progressive capabilities:
  1. It can handle many classic visual tasks without specialized training, showcasing perceptual ability [10][11].
  2. It can capture rules of the visual world, indicating modeling ability [13][14].
  3. It can perform creative modifications and simulations, reflecting manipulation ability [16].
  4. It can carry out visual reasoning across time, embodying the CoF concept [18][21].

Group 3: Performance Analysis
- An analysis of 62 qualitative tasks and 7 quantitative tasks showed that Veo 3 can solve many tasks it was never specifically trained for, indicating its general-purpose potential [23].
- Veo 3 shows substantial improvement over its predecessor, Veo 2, suggesting that video-model capabilities are developing rapidly [24][25].

Group 4: Future Outlook
- DeepMind predicts that general-purpose models like Veo 3 will eventually replace specialized models in the video domain, mirroring the evolution seen with LLMs [25][26].
- Video generation currently costs more than specialized models, but the cost is expected to fall over time, paralleling the trend observed with LLMs [25][26].
An AI-native product doesn't mean every feature is AI-driven; keeping traditional features makes the user experience more complete | A conversation with Xiaoka Health
量子位· 2025-09-27 09:58
The following article comes from 量子位智库 (QbitAI Think Tank); author: the AI 100 interview series.

量子位智库: connecting AI innovation, providing industry research.

Analysts: 刘萌媛, 奕然
量子位智库 | 公众号 AI123All

Competition in AI health management is in full swing, with health-management products for everyday life scenarios appearing one after another, each differing in positioning, interface design, feature layout, and business and monetization model.

Overall, traditional health-management products add new AI features in line with their existing positioning, such as AI coaches and AI weight-management assistants, while AI-native products emphasize AI features that improve efficiency and experience, such as estimating calories from photos and personalized interactive assistants.

However, consumer AI health management is a highly mass-market, fragmented space: products are not strongly differentiated, users are hard to segment, and features mostly revolve around "logging + personalized plan customization."

Against this backdrop, the confusions and problems facing the AI health-management space are becoming increasingly apparent.

To discuss these questions, 量子位智库 invited 小卡健康 (Xiaoka Health) and its product lead 李雅 for an in-depth conversation.

In the interview, 李雅 said that Xiaoka Health's core positioning is to be everyone's personal AI nutritionist: it can estimate calories from photos with AI, use semantic recognition to log users' diet and exercise automatically, and have the AI nutritionist provide personalized nutrition plans. Through Xiaoka Health's example, we also see the differentiated strategies AI-native products use to compete in the health-management space, ...