Aristotle
Large Models Begin "Batch-Solving" Hard Math Problems
Hua Er Jie Jian Wen· 2026-01-15 07:08
Core Insights
- AI breakthroughs in mathematics are accelerating: 15 of the more than 1,000 unsolved problems left by mathematician Paul Erdős have been solved since Christmas, 11 of them with AI models involved [1]
- OpenAI's latest GPT-5.2 model has shown significant improvements in mathematical reasoning, capable of providing complete proofs in 15 minutes, surpassing previous versions [1]
- AI models have made substantial autonomous progress on 8 different Erdős problems, indicating a shift in the role of AI from an assistant to an independent problem solver [1][3]

Impact on Mathematical Research and AI Application Market
- The advancements in AI are transforming the academic research workflow, with formal tools like Lean and Aristotle being widely adopted by top mathematicians and computer science professors [2]
- The increase in the number of solved Erdős problems is attributed to the serious engagement of top mathematicians with these AI tools, marking a shift from experimental to mainstream academic application [6]

Systematic Breakthroughs and Discoveries
- The discovery by Neel Somani began with a routine test of ChatGPT, which provided a complete answer to a mathematical problem, demonstrating the model's ability to reference established mathematical principles [3]
- The Erdős problem set, containing over 1,000 conjectures, has become an attractive target for AI-driven mathematical research, with GPT-5.2 outperforming previous models in advanced mathematics [3]

Cautious Evaluation by Leading Mathematicians
- Mathematician Terence Tao suggests that AI systems are better suited for systematically addressing lesser-known Erdős problems, which may now be more likely solved through pure AI methods rather than human or hybrid approaches [4]
- This evaluation indicates a potential reallocation of resources in mathematical research, with AI efficiently handling medium-difficulty problems that have been overlooked due to human limitations [4]

Formalization Tools Driving Application
- The recent shift towards formalization in mathematics is a key driver, making mathematical reasoning easier to verify and extend, with new automation tools significantly reducing the workload [5]
- Tools like Lean and Aristotle promise to automate much of the formalization work, enhancing the efficiency of mathematical research [5]
Is AI About to Upend Mathematics Again? Terence Tao Speaks Out Urgently: Stop the Myth-Making
36Kr · 2026-01-12 01:49
Core Viewpoint
- The article discusses the exaggerated claims regarding AI's ability to solve complex mathematical problems, particularly in relation to Erdős problems, and emphasizes the need for a more nuanced understanding of AI's contributions in mathematics [1][2].

Group 1: AI's Capabilities in Mathematics
- AI's achievements in solving certain mathematical problems are often overstated, leading to misconceptions that AI can independently innovate or replace human mathematicians [2][4].
- The difficulty level of problems solved by AI varies significantly, making direct comparisons misleading; some problems are much easier than others, which can skew perceptions of AI's capabilities [2][3].
- Many problems labeled as "unsolved" may have been previously addressed in literature, leading to potential misattributions of "first solutions" to AI [3][10].

Group 2: Evaluation of AI Contributions
- AI's contributions can be categorized into several types, including generating complete or partial solutions, conducting literature reviews, and formalizing proofs [6][12].
- Specific examples illustrate that AI has successfully provided solutions for certain problems, but these often require validation against existing literature to confirm their novelty [8][10].
- The process of formalizing AI-generated proofs can introduce risks, such as the potential for misinterpretation or the introduction of unverified axioms [4][12].

Group 3: The Role of Human Mathematicians
- Human mathematicians remain essential for formulating deep questions, creating new concepts, and integrating results into the broader knowledge network of mathematics [12].
- The future of mathematics may involve a collaborative relationship where humans guide AI in exploring mathematical landscapes, rather than AI acting as an independent entity [12].
Terence Tao Stunned: A "Hole" a Mathematician Dug in 1975 Was Filled in 48 Hours by AI and People Around the World
36Kr · 2025-12-15 02:26
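In 48 hours, a 50-year-old mathematical puzzle was cracked! In a dream collaboration between AI and mathematicians around the world, a problem left by Erdős was taken apart layer by layer, from a coin-splitting game to square packing, and human-machine collaboration has set off a new paradigm in mathematical research.

AI has just cracked another hard math problem! Erdos#1026 has been solved, with a formal proof. Before this, the problem had troubled the mathematical community for 50 years. Terence Tao announced the news on Mastodon and told the story in detail in a blog post. He stressed that, with AI assistance, the human team solved the problem in just 48 hours. Moreover, what AI contributed in the process was genuinely new understanding, far more than search. With traditional methods, relying only on mathematicians' programming and literature searches, it might have taken weeks or even months. In this process, AI was actually generating new mathematical insights rather than merely retrieving existing literature. Harmonic's website also announced the news; its AI system Aristotle took part in solving the problem.

The Erdos#1026 problem
In 1975, the legendary mathematician Paul Erdős jotted down a problem in the corner of a paper. Half a century later, the problem sat quietly on the "Erdős Problems website" as number 1026. No one expected that in the last month of 2025 a group of mathematicians using AI tools would solve it completely in just 48 hours. Erdős's original problem reads ...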
Tencent Research Institute AI Express 20251215
Tencent Research Institute · 2025-12-14 16:01
Group 1
- OpenAI's GPT-5.2 received negative feedback from users on platforms like X and Reddit, citing issues such as blandness, excessive safety checks, and poor emotional intelligence [1]
- SimpleBench testing revealed GPT-5.2 scored lower than Claude Sonnet 3.7 from a year ago, with errors in simple questions, while LiveBench scores were below Opus 4.5 and Gemini 3.0 [1]
- The strict safety refusal mechanism was criticized for reducing the model's empathy and contextual awareness, leading to mechanical and unrealistic suggestions in emotional support scenarios [1]

Group 2
- Google launched the new Gemini Deep Research Agent just before GPT-5.2, enhancing accuracy and reducing hallucinations through multi-step reinforcement learning [2]
- The new version achieved leading scores of 46.4% in the Humanity's Last Exam test set, 66.1% in DeepSearchQA, and 59.2% in BrowseComp [2]
- Google also introduced an open-source benchmark for web research agents and a new interactive API for server-side state management and long inference loops [2]

Group 3
- Runway released significant updates, including the Gen-4.5 flagship video model and the first general world model, GWM-1, which supports native audio generation and multi-camera editing [3]
- GWM-1 is an autoregressive model that allows frame-by-frame prediction and real-time intervention, featuring variants for exploring environments, dialogue characters, and robotic operations [3]
- NVIDIA's CEO congratulated Runway, indicating a shift from simple video generation to true world simulation, with AI beginning to understand the underlying logic of the physical world [3]

Group 4
- Google integrated Gemini model capabilities into its translation service, launching a real-time voice translation beta that supports over 70 languages while preserving speaker tone and rhythm [4]
- The text translation engine has been restructured to intelligently parse idioms and context rather than relying on literal translations, supporting translations between English and nearly 20 other languages [4]
- The Chrome team introduced an experimental browser called Disco, featuring GenTabs that convert web content into interactive mini-apps [4]

Group 5
- Bambu Lab (TuoZhu Technology) upgraded its 3D model platform MakerWorld by integrating Tencent's Hunyuan 3D 3.0, launching a new figurine generator that allows users to create printable 3D models from a single image [6]
- Hunyuan 3D 3.0 introduced a pioneering 3D-DiT sculpting technology, enhancing modeling precision threefold with a geometric resolution of 1536³ and supporting ultra-high-definition modeling with 3.6 billion voxels [6]
- MakerWorld has attracted over 2 million users with 20 unique modeling tools, significantly shortening design cycles by leveraging advanced generative AI technology [6]

Group 6
- Disney invested $1 billion in OpenAI, acquiring warrants for additional equity, marking a significant content licensing partnership for the Sora platform [7]
- The three-year licensing agreement grants exclusivity in the first year, allowing Sora and ChatGPT Images to use over 200 Disney characters, including those from Marvel and Pixar, excluding live-action likenesses [7]
- Disney plans to utilize OpenAI's API to develop new products for its Disney+ streaming platform and deploy ChatGPT for internal workflows, with selected fan-created videos to be featured on Disney+ [7]

Group 7
- The Erdős 1026 problem, proposed in 1975, was solved with AI assistance in just 48 hours, showcasing AI's potential to provide new mathematical insights rather than merely searching existing literature [8]
- The AI system Aristotle automatically proved a formula in the Lean proof assistant language, while AlphaEvolve helped refine a clean formula from numerical results [8]
- This achievement demonstrates AI's capability to generate new mathematical insights, significantly reducing the time required for traditional problem-solving methods [8]

Group 8
- Unitree (Yushu Technology) launched the first humanoid robot application store, aimed at standardizing and modularizing humanoid robot functionalities to lower the development barrier for complex movements [9]
- The application store includes core modules such as user forums, action libraries, datasets, and developer centers, allowing users to deploy cloud-based motion control algorithms without coding skills [9]
- Initial applications include preset martial arts and dance routines for the G1 series robots, utilizing proprietary dynamics algorithms and high-precision motion capture data [9]

Group 9
- Google DeepMind's chief AGI scientist predicts a 50% chance of achieving minimal AGI by 2028, with complete AGI expected within 3-6 years after that, leading to a phase of superintelligent AI [10]
- AGI is viewed as a continuous spectrum rather than a critical point, with three stages: minimal AGI for typical cognitive tasks, complete AGI for exceptional human tasks, and ASI surpassing all human cognitive domains [10]
- The emergence of AGI is anticipated to cause structural unemployment, primarily affecting high-level cognitive jobs, while lower-level physical jobs may remain temporarily safe [10]

Group 10
- A report by Similarweb indicates that global GenAI platform monthly visits exceeded 7 billion, a 76% year-on-year increase, with mobile app downloads reaching 1.9 billion, more than tripling in a year [12]
- The proportion of users aged 18-34 decreased by approximately 15%, indicating a rapid influx of older users, while ChatGPT has become one of the top five websites globally, with 95% of users still using Google [12]
- AI Mode has become the first generative AI search feature to surpass 100 million visits, marking a shift in the internet from being search-driven to being AI-driven [12]
The American "Liang Wenfeng" Refuses to Back Down
Huxiu APP · 2025-07-31 09:50
Core Viewpoint
- The article discusses the emergence of Harmonic, a startup focused on developing a zero-hallucination AI model named Aristotle, which aims to solve the challenges of AI in mathematical reasoning and formal verification [4][5][6].

Group 1: Company Overview
- Harmonic is a startup founded by Vlad Tenev and Tudor Achim, focusing on creating AI that can perform mathematical reasoning without hallucinations [9][10].
- The company has rapidly gained attention and investment, achieving a valuation close to $900 million within two years of its establishment [25][26].
- Harmonic's product, Aristotle, is designed to provide rigorous mathematical proofs and reasoning, addressing the common issue of hallucinations in AI outputs [20][21].

Group 2: Technology and Innovation
- Aristotle utilizes a formal verification tool called Lean, which ensures that every step in the reasoning process is validated, thus eliminating the possibility of generating false information [36][38].
- The model has demonstrated impressive performance in mathematical competitions, achieving a success rate of 90% in the MiniF2F test, significantly outperforming existing models like OpenAI's GPT-4 [41][42].
- Harmonic's approach emphasizes the importance of rigorous logical constraints in AI, aiming to make AI a reliable assistant in high-stakes fields such as finance and healthcare [21][19].

Group 3: Market Position and Competition
- The AI industry is increasingly recognizing the need for more rigorous reasoning capabilities, creating opportunities for companies like Harmonic [27][28].
- Harmonic faces competition from established players like DeepMind and OpenAI, which have their own advanced models and extensive data resources [50][51].
- The startup's unique selling proposition lies in its focus on zero-hallucination outputs, which is a critical requirement in precision-demanding applications [17][19].
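To make "every step in the reasoning process is validated" concrete, here is a minimal Lean 4 sketch. It is not taken from Harmonic or Aristotle; the theorem names are made up for illustration, and the statements are deliberately trivial. The point is only that Lean's kernel accepts a theorem when the supplied proof actually establishes the stated claim, and rejects it otherwise.

```lean
-- Illustrative only: theorem names are hypothetical, not from Aristotle.

-- Accepted: `n + 0` reduces to `n` by the definition of addition on Nat,
-- so reflexivity closes the goal.
theorem add_zero_example (n : Nat) : n + 0 = n := by
  rfl

-- Accepted: the proof term is a core library lemma whose statement
-- matches the goal exactly.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

If either statement were altered to something false (for example `n + 0 = n + 1`), the same proofs would fail to check, which is the sense in which a formally verified answer cannot silently contain a wrong step.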
The American "Liang Wenfeng" Refuses to Back Down
Hu Xiu· 2025-07-31 06:51
Core Viewpoint
- The article discusses the emergence of Harmonic, a startup focused on developing a zero-hallucination AI model named Aristotle, which aims to excel in mathematical reasoning and formal verification, attracting significant investment and attention in the AI industry [2][5][46].

Group 1: Company Overview
- Harmonic is a two-year-old startup that has rapidly gained attention from top-tier investment firms, achieving a valuation close to $900 million [5][23].
- The company has attracted nearly $200 million in investments from prominent firms such as Sequoia Capital, Kleiner Perkins, and Paradigm [5][29][27].
- Founders Vlad Tenev and Tudor Achim bring unique backgrounds in mathematics and AI, respectively, with Tenev being the CEO of Robinhood and Achim having experience in autonomous driving [11][12][16].

Group 2: Product Development
- Harmonic's flagship product, Aristotle, is designed to perform mathematical reasoning without hallucinations, utilizing a formal verification tool called Lean [18][30].
- Aristotle has demonstrated impressive performance in mathematical problem-solving, achieving a success rate of 90% in the MiniF2F test, significantly outperforming existing models like OpenAI's GPT-4 [37][38].
- The model addresses three main issues: hallucination, unclear reasoning processes, and lack of rigor in traditional AI models [19][20][21].

Group 3: Market Context
- The AI industry is increasingly recognizing the need for rigorous reasoning capabilities, creating opportunities for startups like Harmonic [25][24].
- Competitors in the space include DeepSeek and Google DeepMind, both of which are also developing advanced mathematical AI models [40][45].
- The competitive landscape is intensifying as major players seek to enhance their AI models' reasoning capabilities, particularly in high-stakes applications [26][46].
Express | The Race for "Guaranteed Hallucination-Free" Math AI Heats Up: Olympiad Gold Medal Won, Startup Harmonic Valued at $875 Million
Z Potentials· 2025-07-30 03:37
Core Viewpoint
- Harmonic, an AI startup co-founded by Robinhood CEO Vlad Tenev, has launched a beta version of its AI chatbot application, Aristotle, which aims to provide reliable answers to mathematical reasoning problems without hallucinations [1][2].

Group 1: Company Overview
- Harmonic recently completed a $100 million Series B funding round led by Kleiner Perkins, achieving a valuation of $875 million [1].
- The company is focused on creating "Mathematical Super Intelligence" (MSI) to assist users in fields reliant on mathematics, such as physics, statistics, and computer science [1].

Group 2: Product Features
- Aristotle is claimed to be the first public product capable of reasoning and formally verifying its outputs, ensuring no hallucinations in quantitative reasoning [2].
- The model has reportedly achieved gold medal level in the International Mathematical Olympiad (IMO) through formal testing, contrasting with other AI models that used informal testing methods [2].

Group 3: Technical Approach
- Harmonic utilizes the open-source programming language Lean to generate responses, ensuring high precision by double-verifying solutions through non-AI algorithms before presenting them to users [3].
- The technology employed by Harmonic is similar to that used in high-stakes fields like medical devices and aviation for output verification [3].

Group 4: Industry Context
- Many leading tech companies are focusing on training AI models to solve mathematical problems, as mathematical capability is seen as a unique and verifiable domain requiring core reasoning skills [3].
- Achieving hallucination-free performance in AI models, even in narrow domains, is recognized as a challenging task, with leading models frequently producing hallucinations [4][5].
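The "verify with a non-AI algorithm before presenting" pattern described under Technical Approach can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Harmonic's pipeline: the hard-coded `proposals` list stands in for untrusted model output, and the names `is_root` and `first_verified` are hypothetical. The checker here is plain arithmetic rather than a Lean proof check.

```python
from typing import Callable, Iterable, Optional


def is_root(candidate: int, coeffs: list[int]) -> bool:
    """Deterministic check: does `candidate` satisfy the polynomial equation?

    `coeffs` lists integer coefficients from the highest degree down.
    """
    value = 0
    for c in coeffs:  # Horner's rule evaluation
        value = value * candidate + c
    return value == 0


def first_verified(
    candidates: Iterable[int], check: Callable[[int], bool]
) -> Optional[int]:
    """Return the first candidate that passes the checker, else None.

    Unverified candidates are never returned, however plausible they look.
    """
    for cand in candidates:
        if check(cand):
            return cand
    return None


if __name__ == "__main__":
    # Hypothetical untrusted proposals for x^2 - 5x + 6 = 0.
    proposals = [1, 4, 3, 2]
    answer = first_verified(proposals, lambda x: is_root(x, [1, -5, 6]))
    print(answer)  # prints 3, the first proposal that passes the check
```

The design choice worth noting is that the generator and the checker are independent: the checker does not trust anything about how a candidate was produced, so a wrong proposal can cost time but cannot reach the user.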
Express | Sequoia and Kleiner Perkins Bet on the Math-AI Revolution: Harmonic Raises $100 Million in Series B to Build Mathematical Superintelligence
Z Potentials· 2025-07-12 05:17
Group 1
- Harmonic AI, co-founded by Robinhood Markets CEO Vlad Tenev, has raised $100 million in funding to address challenges in mathematical operations faced by AI models [1][2]
- The recent Series B funding round was led by Kleiner Perkins, with participation from Sequoia Capital, Index Ventures, and Paradigm, bringing the company's valuation to $875 million, just below the $1 billion "unicorn" threshold [1]
- The CEO of Harmonic AI, Tudor Achim, aims to develop an AI system capable of solving complex mathematical problems, referred to as "mathematical superintelligence" [1][2]

Group 2
- Harmonic plans to release its flagship AI model, Aristotle, to researchers and the public later this year, with the goal of creating an AI that surpasses human-level mathematical problem-solving abilities [2]
- The ultimate objective is to tackle significant unsolved problems in mathematics and extend the capabilities to physics and computer science [2]
- Harmonic's math-first strategy is expected to give it an edge over large language models that typically struggle with complex mathematical tasks [2][3]

Group 3
- The company employs formal verification methods to ensure the correctness of its AI system's outputs and reasoning steps, which is a distinct approach to AI model construction [3]
- Tenev emphasizes that maximizing valuation is not always wise, reflecting a strategic mindset in the company's growth and funding approach [3]
The American Liang Wenfeng Has Arrived
QbitAI · 2025-07-11 06:16
Core Viewpoint
- Harmonic AI, co-founded by Vlad Tenev and Tudor Achim, aims to develop an AI system capable of solving complex mathematical problems, striving for Mathematical Superintelligence (MSI) [3][20].

Group 1: Company Overview
- Harmonic AI has successfully raised $100 million in Series B funding, bringing its valuation to approximately $875 million [4][17].
- The company was co-founded by Vlad Tenev, who previously established Robinhood Markets, and Tudor Achim, an expert in AI and large model training [5][15].
- Robinhood, under Tenev's leadership, achieved a market cap of around $22.7 billion and reported a revenue of $927 million with a net profit of $336 million in Q1 2025 [8][12].

Group 2: Funding and Valuation
- Harmonic AI's Series A funding raised $75 million, led by Sequoia Capital, with a post-money valuation of $325 million [15].
- The recent Series B funding was led by Kleiner Perkins, with participation from Paradigm and Ribbit Capital, among others [16].
- The company intentionally set its valuation below the "unicorn" threshold of $1 billion, focusing on long-term growth rather than short-term valuation targets [18][19].

Group 3: Product Development
- Harmonic AI announced its first model, Aristotle, which can formalize natural language problems into formal representations, enhancing collaboration with mathematicians [20].
- The model's performance improved from 83% to 90% on the MiniF2F benchmark, which includes various levels of mathematical problems [23].
- The ultimate goal is to create an AI system with mathematical capabilities surpassing human abilities, addressing challenges like the "hallucination" problem in AI [26][28].
Robinhood CEO's New AI Venture Valued at $900 Million, Building Hallucination-Free Mathematical Superintelligence
Touzi Shixisuo · 2025-07-11 04:21
Core Viewpoint
- Harmonic.fun, co-founded by Vlad Tenev and Tudor Achim, focuses on developing an AI model based on Mathematical Superintelligence (MSI) to address reliability issues in AI applications, particularly in high-stakes fields like finance and healthcare [1][2][3].

Group 1: Company Overview
- Harmonic.fun recently completed a $100 million Series B funding round, led by KP, with participation from Paradigm, Ribbit Capital, Sequoia Capital, and Index Ventures, achieving a valuation of nearly $900 million [1].
- The company previously raised $75 million in a Series A round led by Sequoia, with a valuation of $325 million at that time [1].

Group 2: Technology and Methodology
- The core concept of MSI is rooted in formal mathematical reasoning, which allows for verifiable outputs and eliminates the "hallucination" phenomenon common in traditional AI models [2][3].
- Traditional AI models rely on probabilistic predictions and pattern recognition, which can lead to inaccuracies when faced with unfamiliar situations or complex reasoning tasks [2][3].
- Harmonic's flagship model, Aristotle, is designed to solve complex mathematical problems and is applicable in fields requiring zero-tolerance for errors, such as aerospace, chip design, and healthcare [3][4].

Group 3: Advantages of MSI
- MSI provides verifiable accuracy, ensuring that every logical step in the reasoning process is rigorous and correct, contrasting with the "black box" nature of traditional AI [5].
- The model eliminates hallucinations by adhering strictly to mathematical and logical rules, ensuring the authenticity of its results [5].
- Aristotle can transparently identify and mark errors in the reasoning process, which is crucial for debugging and understanding AI decision-making, especially in high-risk applications [5].

Group 4: Applications and Impact
- In high-security industries like blockchain, financial services, and aerospace, Aristotle can generate formally verified software code, enhancing system safety and reliability [5].
- In finance, Aristotle can handle complex data for rigorous risk assessment and model validation, aiding institutions in making informed investment and risk management decisions [5].
- The model also has potential applications in scientific research and engineering design, accelerating breakthroughs in fields like theoretical physics and materials science [5].
- Although primarily aimed at enterprise applications, the interpretability and accuracy of MSI could enhance mathematics education by helping students understand complex concepts through verifiable reasoning steps [5].

Group 5: Training Methodology
- Harmonic employs a unique approach of using synthetic data generation for training, allowing the system to autonomously create formal problem proofs for recursive self-improvement [8][9].