AI-Assisted Scientific Research
Mining Activation Functions Like Mining Crypto? DeepMind Builds a "Compute Mine" and Brute-Force Searches for the Next-Generation ReLU
机器之心· 2026-02-07 04:09
Core Insights
- The article discusses the evolution of activation functions in neural networks, highlighting the transition from traditional functions like Sigmoid and ReLU to newer ones like GELU and Swish, emphasizing the impact on model performance [1][2].
Group 1: DeepMind's Innovation
- Google DeepMind is rethinking the search for activation functions with AlphaEvolve, a method that explores an open-ended space of Python functions rather than relying on a predefined search space [2][4].
- The research paper, titled "Finding Generalizable Activation Functions", shows how this approach led to the discovery of new activation functions, including GELUSine and GELU-Sinc-Perturbation, which outperform traditional functions on certain tasks [4][30].
Group 2: Methodology
- AlphaEvolve uses a large language model (LLM) to generate and modify code, allowing a more flexible and expansive search for activation functions [8][11].
- The process follows a "micro-laboratory" strategy: candidates are optimized on synthetic data for out-of-distribution (OOD) generalization, which avoids the high cost of searching on large datasets like ImageNet [14][18].
Group 3: Performance of New Functions
- The newly discovered functions performed better on algorithmic reasoning tasks, with GELU-Sinc-Perturbation achieving a score of 0.887 on the CLRS-30 benchmark, surpassing ReLU and GELU [34].
- On vision tasks, GELUSine and GELU-Sinc-Perturbation remained competitive on ImageNet, reaching approximately 74.5% Top-1 accuracy, comparable to GELU [34][35].
Group 4: Insights on Function Design
- The research indicates that the best-performing functions often follow a general formula that combines a standard activation function with a periodic term, suggesting that periodic structure can improve model generalization [25][35].
- The study highlights the importance of understanding the inductive biases introduced by activation functions, suggesting that periodic elements can help capture complex data structures beyond linear relationships [40][42].
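The "standard activation plus periodic term" pattern described in Group 4 can be illustrated with a minimal sketch. The summary does not give the exact formulas for GELUSine or GELU-Sinc-Perturbation, so the function below is only a hypothetical instance of the general shape: the name `gelu_plus_periodic`, the sine term, and the parameters `alpha` and `beta` are assumptions made for illustration, not the functions reported in the paper.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_plus_periodic(x: float, alpha: float = 0.1, beta: float = 1.0) -> float:
    """Hypothetical 'standard activation + periodic term' form.

    alpha (amplitude) and beta (frequency) are illustrative values only;
    they are not taken from the paper.
    """
    return gelu(x) + alpha * math.sin(beta * x)


if __name__ == "__main__":
    # Compare the plain and perturbed activations at a few points.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={x:+.1f}  gelu={gelu(x):+.4f}  "
              f"gelu_plus_periodic={gelu_plus_periodic(x):+.4f}")
```

Setting `alpha=0.0` recovers plain GELU, which makes it easy to check how much of any behavioral difference comes from the periodic term alone.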
The Math Community Overlooked a "30-Year Gap"; GPT-5 Spotted It at a Glance. Terence Tao: The AI Research Revolution Has Begun
36Kr · 2025-11-05 10:52
Core Insights
- The article discusses recent developments surrounding OpenAI's GPT-5, particularly its role in solving mathematical problems, including the Erdős problems, and the subsequent reactions from the academic community [1][6][8].
Group 1: GPT-5's Contributions
- GPT-5 has been credited with accelerating scientific progress by identifying existing solutions to ten Erdős problems, although it was initially misrepresented as having solved them [1][12].
- The 707th Erdős problem, which had been thought unsolved for 30 years, was actually resolved before its proposal, highlighting the importance of literature review in mathematical research [8][10].
- Two mathematicians successfully used GPT-5 to generate formal proofs, demonstrating its potential as a collaborative tool in mathematical research [13][14].
Group 2: Academic Reactions
- Yann LeCun criticized OpenAI, suggesting that the company was harmed by its own overzealous claims regarding GPT-5's capabilities [2].
- Sebastien Bubeck, an OpenAI scientist, faced backlash for his initial claims but later acknowledged the complexity of literature searches in mathematics [12][17].
- Mathematician Terence Tao praised the use of AI in generating verifiable proofs, emphasizing that AI should complement human efforts rather than replace them [14][17].
Group 3: Future Implications
- Collaboration between AI and human researchers could lead to more efficient problem-solving, as demonstrated by the use of GPT-5 to generate a formal proof that still required significant human input for refinement [16][29].
- The exploration of AI's role in mathematics is still in its early stages, with potential for further integration and optimization in research methodologies [16][18].
GPT-5 "Cracked" Century-Old Problems, but the Answers Were Copied from the Web. Hassabis: "This Is Embarrassing"
36Kr · 2025-10-21 02:26
Core Viewpoint
- The incident surrounding GPT-5 has been characterized as a farce: initial claims that the AI had solved ten Erdős problems were misleading, as it merely retrieved existing solutions from the literature rather than solving them independently [1][3][10].
Group 1: Miscommunication and Misunderstanding
- OpenAI scientists initially celebrated GPT-5 for allegedly solving ten long-standing Erdős problems, leading to widespread promotion within the company [1][3].
- In fact, these problems had already been solved in academia, and GPT-5's role was limited to retrieving answers from existing literature [3][10].
- The misunderstanding stemmed from outdated information on the website that tracks the Erdős problems, which gave the false impression that these problems were unsolved [8][9].
Group 2: Reactions from the Community
- Prominent figures in the AI and mathematics community, including Demis Hassabis and Yann LeCun, publicly criticized the situation, highlighting the embarrassment for OpenAI [3][5].
- The developers involved clarified that GPT-5 did not independently solve the problems but efficiently found existing solutions through extensive queries [6][11].
- The incident sparked discussions about the need for caution regarding claims of AI making new scientific or mathematical discoveries, emphasizing the importance of peer review [15][17].
Group 3: Future Implications for AI in Research
- Some experts suggest that the future role of AI in mathematics may not be about tackling the hardest problems but rather assisting with routine tasks in research [19][20].
- Despite the controversy, there is recognition that AI can still be a valuable tool for literature retrieval and for supporting scientific research [18][20].
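Viral MIT Paper Exposed for Data Fabrication! It Claimed AI-Assisted Research Boosted Discovery by 44%, and Even Nobel Laureates Were Fooled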
量子位· 2025-05-20 20:33
Core Viewpoint
- The article discusses the retraction of a highly publicized paper from MIT that claimed significant advancements in scientific discovery and innovation through AI, which has now been discredited due to allegations of data fabrication [1][3][5].
Group 1: Research Findings
- The paper initially reported that AI-assisted research led to a 44% increase in new material discoveries, a 39% rise in patent applications, and a 17% enhancement in downstream product innovation [2].
- It highlighted that AI-generated materials were more unique in chemical structure, with a higher proportion of new technical terms in patents and new product lines in prototypes, indicating a shift towards more radical innovation rather than incremental improvements [13].
- The study also noted that AI automated 57% of "creative generation" tasks, allowing scientists to focus more on evaluating AI suggestions [14].
Group 2: Controversy and Retraction
- The paper was published in November 2024 and was under review for formal publication when it was retracted due to concerns over the authenticity of its data [3][7].
- MIT expressed a lack of confidence in the data's source, reliability, and validity, leading to a formal statement urging the paper's withdrawal from public discussion [5][36].
- The investigation began after a computer scientist raised concerns about the paper's methodology, prompting an internal review by MIT's disciplinary committee [35].
Group 3: Impact and Reactions
- The paper had gained significant attention and was referred to as "the best paper on the impact of AI on scientific discovery" by various scholars [21][29].
- Following the retraction, many in the academic community expressed surprise, including those who had previously reported on the study [32].
- The article notes that the paper's GitHub link is no longer accessible, indicating further issues with the research's credibility [39].