AlphaEvolve
Mining Activation Functions Like Mining Coins? DeepMind Builds a "Compute Mine" and Brute-Force Searches for the Next-Generation ReLU
机器之心· 2026-02-07 04:09
Core Insights
- The article traces the evolution of activation functions in neural networks, from traditional functions such as Sigmoid and ReLU to newer ones such as GELU and Swish, and emphasizes their impact on model performance [1][2].
Group 1: DeepMind's Innovation
- Google DeepMind is changing how activation functions are found by using AlphaEvolve, which explores an infinite space of Python functions rather than relying on a predefined search space [2][4].
- The research paper, "Finding Generalizable Activation Functions," shows how this approach led to the discovery of new activation functions, including GELUSine and GELU-Sinc-Perturbation, which outperform traditional functions on certain tasks [4][30].
Group 2: Methodology
- AlphaEvolve uses a large language model (LLM) to generate and modify code, allowing a more flexible and expansive search for activation functions [8][11].
- The process follows a "micro-laboratory" strategy: candidates are evaluated on synthetic data and optimized for out-of-distribution (OOD) generalization, which avoids the high cost of searching on large datasets such as ImageNet [14][18].
Group 3: Performance of New Functions
- The newly discovered functions performed better on algorithmic reasoning tasks, with GELU-Sinc-Perturbation scoring 0.887 on the CLRS-30 benchmark, ahead of ReLU and GELU [34].
- On vision tasks, GELUSine and GELU-Sinc-Perturbation remained competitive on ImageNet at approximately 74.5% Top-1 accuracy, comparable to GELU [34][35].
Group 4: Insights on Function Design
- The best-performing functions often follow a general formula that combines a standard activation function with a periodic term, suggesting that periodic structure can improve model generalization [25][35].
- The study highlights the importance of understanding the inductive biases that activation functions introduce, and suggests that periodic elements can help capture complex data structures beyond linear relationships [40][42].
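The "standard activation plus periodic term" pattern described in this summary can be illustrated with a minimal sketch. The summary does not give the exact formulas for GELUSine or GELU-Sinc-Perturbation, so the function `gelu_plus_sine` and its parameters `alpha` and `beta` below are hypothetical stand-ins chosen for illustration, not the paper's definitions.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_plus_sine(x: float, alpha: float = 0.1, beta: float = 1.0) -> float:
    """Hypothetical 'standard activation + periodic term' form.

    alpha (amplitude) and beta (frequency) are illustrative assumptions,
    not a formula or values taken from the paper.
    """
    return gelu(x) + alpha * math.sin(beta * x)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(f"x={x:+.1f}  gelu={gelu(x):+.4f}  periodic={gelu_plus_sine(x):+.4f}")
```

Setting `alpha=0.0` recovers plain GELU, which makes it easy to compare the two variants under an otherwise identical training setup.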
GPT-5.2 Cracks a Number Theory Conjecture with Terence Tao's Confirmation; OpenAI VP Reveals a Major Move
36Kr· 2026-01-29 13:24
Core Insights
- OpenAI has launched Prism, a new AI research tool powered by GPT-5.2 that helps scientists write and collaborate on research; it is now free for all ChatGPT personal account users [1].
- The company aims to give scientists AI capabilities that accelerate research, with the vision of reaching by 2030 the scientific advances that would otherwise be expected by 2050 [1][2].
- OpenAI is entering the scientific field after competitors such as Google DeepMind have already established AI-for-science teams and groundbreaking models [2].
Group 1: OpenAI's Strategic Goals
- OpenAI wants to enhance scientists' capabilities so that they can focus on harder open problems instead of issues that have already been solved, thereby accelerating research [2][3].
- The company plans to optimize its models by making them less overconfident in their answers and by adding self-fact-checking mechanisms [3][15].
- OpenAI's mission is to develop artificial general intelligence (AGI) that benefits humanity, with a focus on transforming scientific research through new drugs, materials, and instruments [3][4].
Group 2: Model Performance and Capabilities
- GPT-5 has shown significant improvements, achieving 92% accuracy on the GPQA benchmark and surpassing the performance of 90% of graduate students [5].
- The model has been recognized for helping researchers find connections between existing research and generate new insights, although it still makes errors [10][11].
- OpenAI acknowledges that, while the model can assist in research, it has not yet reached the level of making groundbreaking discoveries [6][8].
Group 3: Industry Context and Competition
- OpenAI's late entry into AI for science is notable because competitors such as Google DeepMind have already made significant advances [2][16].
- The company is aware of the competitive landscape and aims to establish a strong foothold in the scientific research sector [16].
- OpenAI's focus on optimizing model features and collaborating more closely with researchers is part of its strategy to differentiate itself from other AI models on the market [15][16].
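GPT-5.2 Cracks a Number Theory Conjecture with Terence Tao's Confirmation! OpenAI VP Reveals a Major Move: Core Model Design Is Being Reworked; It Beats 90% of Graduate Students but Struggles to Make Disruptive Discoveries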
AI前线· 2026-01-29 10:07
Core Viewpoint
- OpenAI has launched Prism, a new AI research tool powered by GPT-5.2 that aims to make scientific research collaboration more efficient; it is now free for all ChatGPT personal account users [2][3].
Group 1: OpenAI's Strategic Move
- OpenAI's entry into scientific research is seen as a response to the growing importance of AI in academia, with the goal of empowering scientists to conduct advanced research by 2030 [2][3].
- The establishment of the OpenAI for Science team signals a focused effort to explore how large language models (LLMs) can assist researchers and to optimize tools for scientific support [2][3].
Group 2: Model Capabilities and Limitations
- Kevin Weil, an OpenAI vice president, acknowledges that current models can accelerate research by keeping scientists from wasting time on problems that are already solved, but that they are not yet capable of making groundbreaking discoveries [4][5].
- The latest version, GPT-5.2, has improved significantly, achieving 92% accuracy on the GPQA benchmark and surpassing the performance of 90% of graduate students [7][8].
Group 3: Research Applications and Feedback
- Researchers report that GPT-5 can help with brainstorming, summarizing papers, and planning experiments, significantly reducing the time needed for data analysis [13][14].
- Feedback from various scientists indicates that, while GPT-5 can provide valuable insights, it still makes basic errors, and its role is more about integrating existing knowledge than generating entirely new ideas [14][15].
Group 4: Future Directions and Enhancements
- OpenAI is working on two main optimizations for GPT-5: lowering the model's confidence in its own answers so that it is more humble, and enabling it to fact-check its outputs [4][19].
- The goal is a collaborative workflow in which the model serves as its own verifier, making its contributions to scientific research more reliable [19][20].
AI Industry Express: What Is Google Positioning For?
Changjiang Securities· 2026-01-14 15:16
Investment Rating
- The industry rating is "Positive" (maintained) [7].
Core Insights
- Google has established a comprehensive AI ecosystem, including TPU computing infrastructure, the Gemini multimodal model family, AI Studio, and the Vertex AI developer platform, which continuously supports its AI strategy and strengthens its data moat [2][10].
- AI applications are moving faster toward real-world deployment, and the report is positive on the post-IPO performance of large-model companies such as Zhipu and MiniMax. Key marginal factors include (1) improvements in model capability and release-event catalysts; and (2) progress in business models (consumer-side (C-end) traffic-entry logic and business-side (B-end) labor-replacement logic). A paradigm shift in models in 2026 is expected to bring excess-return opportunities, and the report maintains a long-term positive view on AI industry upgrade opportunities [2][10].
Summary by Relevant Sections
AI Applications
- Google is actively expanding its AI strategy across segments; in healthcare it focuses on providing infrastructure and open-source models. Notable developments include the Vertex AI Search for Healthcare tool, which is optimized for medical scenarios, and partnerships such as the one with Color Health on breast cancer screening assistance [10].
AI for Science
- Google has a significant advantage in AI for Science (AI4S) thanks to its extensive experience and capabilities in the field. The company has developed world-class scientific intelligence models and tools and applies AI across scientific domains such as biology, meteorology, and physics [10].
Edge Deployment
- Google has a well-established edge deployment strategy covering embodied intelligence, AI glasses, AI phones, Google TV, and Robotaxi services. The latest data shows significant growth in Robotaxi services, with an 80% increase in service volume compared to earlier months [10].
America's "Manhattan Project" Launches as 24 Giants Including OpenAI and Google Open a "Tech Pearl Harbor" Battle
36Kr· 2025-12-19 07:54
Core Insights
- The United States has officially launched a national AI initiative named the "Genesis Mission," likened to the Manhattan Project, which aims to integrate top AI technologies with the research capabilities of the national laboratories [1][5][25].
- The initiative involves major tech companies such as Microsoft, Google, NVIDIA, OpenAI, DeepMind, and Anthropic, marking a significant collaboration to advance scientific research [1][5][19].
- The goal is to create the first AI-driven research platform to accelerate scientific discovery in areas such as controlled nuclear fusion, energy materials, climate modeling, and quantum computing algorithms [1][14].
Group 1: Initiative Overview
- The Genesis Mission is a historic national strategy initiated by the U.S. government, focused on making AI a default tool for scientific research [1][13].
- The plan aims to unify national laboratories, supercomputers, and data assets into a single AI platform, making the scientific discovery process more efficient [1][13].
- The initiative is led by the U.S. Department of Energy, which oversees critical research areas such as nuclear energy, materials science, and climate modeling [18][19].
Group 2: Collaboration and Participation
- A total of 24 leading tech companies are participating in the Genesis Mission, representing a comprehensive integration of the U.S. AI industry into the national research framework [19][20].
- Notably, OpenAI and Google, traditionally competitors, are collaborating to address national scientific and energy challenges [6][7].
- Companies such as Microsoft and Google will provide cloud computing infrastructure and AI development platforms to support the national laboratories [21][22].
Group 3: Strategic Goals and Implications
- The initiative aims to double U.S. scientific productivity by 2030 by integrating AI and supercomputing capabilities [7].
- The Genesis Mission is expected to transform traditional research processes, replacing lengthy research cycles with AI-driven hypothesis generation and experimentation [25][26].
- This strategic move positions AI as a national capability rather than merely a commercial tool, emphasizing the importance of embedding AI in the scientific research ecosystem [28][30].
Tencent Research Institute AI Express 20251215
腾讯研究院· 2025-12-14 16:01
Group 1
- OpenAI's GPT-5.2 received negative feedback from users on platforms such as X and Reddit, who cited blandness, excessive safety checks, and poor emotional intelligence [1].
- SimpleBench testing showed GPT-5.2 scoring lower than Claude 3.7 Sonnet from a year earlier and making errors on simple questions, while its LiveBench scores were below those of Opus 4.5 and Gemini 3.0 [1].
- The strict safety refusal mechanism was criticized for reducing the model's empathy and contextual awareness, leading to mechanical and unrealistic suggestions in emotional support scenarios [1].
Group 2
- Google launched the new Gemini Deep Research Agent just before GPT-5.2, improving accuracy and reducing hallucinations through multi-step reinforcement learning [2].
- The new version achieved leading scores of 46.4% on the Humanity's Last Exam test set, 66.1% on DeepSearchQA, and 59.2% on BrowseComp [2].
- Google also introduced an open-source benchmark for web research agents and a new interactive API for server-side state management and long-running inference loops [2].
Group 3
- Runway released significant updates, including the flagship Gen-4.5 video model, which supports native audio generation and multi-camera editing, and its first general world model, GWM-1 [3].
- GWM-1 is an autoregressive model that allows frame-by-frame prediction and real-time intervention, with variants for exploring environments, conversational characters, and robotic operations [3].
- NVIDIA's CEO congratulated Runway; the release signals a shift from simple video generation to true world simulation, with AI beginning to understand the underlying logic of the physical world [3].
Group 4
- Google integrated Gemini model capabilities into its translation service, launching a real-time voice translation beta that supports over 70 languages while preserving the speaker's tone and rhythm [4].
- The text translation engine has been rebuilt to parse idioms and context intelligently rather than relying on literal translation, and supports translation between English and nearly 20 other languages [4].
- The Chrome team introduced an experimental browser called Disco, featuring GenTabs that convert web content into interactive mini-apps [4].
Group 5
- Bambu Lab upgraded its 3D model platform MakerWorld by integrating Tencent's Hunyuan 3D 3.0, launching a new figurine generator that lets users create printable 3D models from a single image [6].
- Hunyuan 3D 3.0 introduced a 3D-DiT sculpting technology that improves modeling precision threefold, with a geometric resolution of 1536³ and support for ultra-high-definition modeling with 3.6 billion voxels [6].
- MakerWorld has attracted over 2 million users with 20 unique modeling tools, significantly shortening design cycles by using generative AI [6].
Group 6
- Disney invested $1 billion in OpenAI and acquired warrants for additional equity, marking a significant content licensing partnership for the Sora platform [7].
- The three-year licensing agreement is exclusive in the first year and allows Sora and ChatGPT Images to use over 200 Disney characters, including those from Marvel and Pixar, but excluding live-action likenesses [7].
- Disney plans to use OpenAI's API to develop new products for its Disney+ streaming platform and to deploy ChatGPT in internal workflows; selected fan-created videos will be featured on Disney+ [7].
Group 7
- The Erdős 1026 problem, proposed in 1975, was solved with AI assistance in just 48 hours, showing that AI can contribute new mathematical insights rather than merely search the existing literature [8].
- The AI system Aristotle automatically proved a formula in the Lean proof-assistant language, while AlphaEvolve helped distill a clean formula from numerical results [8].
- The result shows that AI can significantly reduce the time that traditional problem-solving methods require [8].
Group 8
- Unitree Robotics launched the first humanoid robot application store, which aims to standardize and modularize humanoid robot functions and lower the barrier to developing complex movements [9].
- The application store includes core modules such as user forums, action libraries, datasets, and a developer center, allowing users to deploy cloud-based motion control algorithms without coding skills [9].
- Initial applications include preset martial arts and dance routines for the G1 series robots, built on proprietary dynamics algorithms and high-precision motion capture data [9].
Group 9
- Google DeepMind's chief AGI scientist predicts a 50% chance of achieving minimal AGI by 2028, with complete AGI expected 3 to 6 years after that, followed by a phase of superintelligent AI [10].
- AGI is viewed as a continuous spectrum rather than a single threshold, with three stages: minimal AGI for typical cognitive tasks, complete AGI for tasks that require exceptional human ability, and ASI that surpasses humans in all cognitive domains [10].
- The emergence of AGI is expected to cause structural unemployment, primarily affecting high-level cognitive jobs, while lower-level physical jobs may remain safe for a time [10].
Group 10
- A Similarweb report indicates that monthly visits to GenAI platforms worldwide exceeded 7 billion, a 76% year-on-year increase, while mobile app downloads reached 1.9 billion, more than tripling in a year [12].
- The share of users aged 18 to 34 fell by approximately 15%, indicating a rapid influx of older users; ChatGPT has become one of the top five websites globally, although 95% of its users still use Google [12].
- AI Mode has become the first generative AI search feature to surpass 100 million visits, marking a shift from a search-driven internet to an AI-driven one [12].
Half-Century-Old Problem Cracked in 48 Hours! Terence Tao's Team Turns AI Mathematics into a Monster-Slaying Game
量子位· 2025-12-13 04:34
Core Viewpoint
- Collaboration between mathematicians and AI resolved the Erdős 1026 problem, which had remained unsolved for 50 years, in just 48 hours [1][2][3].
Group 1: Problem Overview
- The Erdős 1026 problem was proposed in 1975. It can be framed as a game between two players, Alice and Bob, and asks for the minimum possible value of a function defined on that game [8][10][12].
- The difficulty is captured by the constant c(n): the largest fraction of the coins that Bob can guarantee to take no matter how Alice distributes them [10][13].
Group 2: AI's Role in the Solution
- AI tools were crucial to solving the problem quickly; traditional methods could have taken weeks or months to reach a conclusion [3][5].
- AI systems such as Harmonic's Aristotle and AlphaEvolve let the mathematicians automate the construction and proof of key inequalities, turning the original problem into a computational geometry challenge [16][18][22].
Group 3: Collaborative Efforts
- The solution involved several mathematicians working together, with contributions from Boris Alexeev, Koishi Chan, and Lawrence Wu, showing how effective human-AI collaboration can be [17][28][32].
- Combining human insight with AI capabilities is emerging as a new trend in mathematical problem-solving [46].
Group 4: Historical Context and Future Implications
- The Erdős problems, posed by the renowned mathematician Paul Erdős, have been a significant part of mathematical research, and many remain unsolved [39][41].
- AI's growing success on these problems suggests a shift in how mathematical research may be conducted, with AI becoming a standard tool for researchers [41][42].
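To make the coin game concrete, here is a minimal sketch of how Bob's best haul could be computed for a given arrangement. The summary does not spell out the rules, so this assumes the commonly described formulation in which Alice arranges piles of coins in a row and Bob takes any monotone (increasing or decreasing) subsequence of piles. The function names `best_monotone_sum` and `bob_share` are illustrative, and the dynamic program shown is a standard maximum-weight monotone subsequence routine, not the method used in the proof.

```python
from typing import Sequence


def best_monotone_sum(piles: Sequence[float]) -> float:
    """Largest total of a strictly increasing or strictly decreasing subsequence.

    Assumes positive pile sizes (an assumption of this sketch).
    """

    def best(increasing: bool) -> float:
        # dp[i] = best sum of a monotone subsequence that ends at index i
        dp = list(piles)
        for i in range(len(piles)):
            for j in range(i):
                ok = piles[j] < piles[i] if increasing else piles[j] > piles[i]
                if ok:
                    dp[i] = max(dp[i], dp[j] + piles[i])
        return max(dp, default=0.0)

    return max(best(True), best(False))


def bob_share(piles: Sequence[float]) -> float:
    """Fraction of all coins Bob can take for this particular arrangement."""
    return best_monotone_sum(piles) / sum(piles)


if __name__ == "__main__":
    print(bob_share([1, 3, 2, 4]))  # Bob takes piles 1, 3, 4
```

Under this reading, c(n) is the minimum of `bob_share` over all arrangements Alice could choose with n piles, which is the quantity the problem asks for.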
AI for Science: How Far Has It Come?
36Kr· 2025-12-03 09:15
Core Insights
- Google DeepMind's AlphaFold has significantly advanced protein structure prediction and has driven progress in scientific research over the past five years [1][4].
- AI is reshaping scientific research, particularly in the life sciences and biomedicine, where data is abundant and societal needs are urgent [1][3].
Group 1: AI in Scientific Research
- AI models and tools have achieved breakthroughs in basic research, including protein structure prediction and the discovery of new biological pathways [1][3].
- A paradigm of "foundation models + research agents + autonomous laboratories" is emerging in AI-driven scientific research [3][13].
Group 2: Advancements in Biology
- DeepMind's AlphaFold has solved the protein structure prediction problem, earning its creators a share of the 2024 Nobel Prize in Chemistry and establishing itself as digital infrastructure for modern biology [4].
- The C2S-Scale model, developed by Google and Yale University, has generated new hypotheses about cancer cell behavior, showing AI's potential to formulate original scientific hypotheses [8].
Group 3: AI in Drug Development
- AI-assisted pathology detection has expanded to new disease scenarios, with the DeepGEM model predicting lung cancer gene mutations with an accuracy of 78% to 99% [10].
- The AI-optimized drug MTS-004 has completed Phase III clinical trials, a significant milestone in AI-driven drug discovery [10].
Group 4: AI in Other Scientific Fields
- AI applications in materials science are gaining momentum, with startups such as Periodic Labs and CuspAI focusing on discovering new materials [11].
- DeepMind's WeatherNext 2 model has surpassed traditional physical models in both accuracy and efficiency for weather prediction [5].
Group 5: Future of AI in Science
- Scientific intelligence technologies are expected to evolve faster, with AI foundation models and robotics making research more efficient [19].
- Integrating AI into scientific discovery is expected to lead to significant breakthroughs, with predictions of discoveries approaching the level of relativity by 2028 [19].
Baidu Unveils Its Secret Weapon: A Self-Evolving AI That Finds Optimal Solutions Humans Cannot
机器之心· 2025-11-14 09:30
Core Insights
- The article discusses AI's rapid evolution from executor to inventor and introduces Baidu's FM Agent, a self-evolving intelligent agent that can solve complex problems autonomously [1][6][30].
Group 1: AI Capabilities and Innovations
- FM Agent can autonomously generate and optimize algorithms, significantly reducing the time for tasks that would take human experts days or even weeks [4][8].
- The system combines large language models with evolutionary search algorithms to tackle real-world problems, a leap from executing commands to discovering solutions independently [6][8].
- The agent's performance has been validated on various benchmarks, achieving a medal rate of 43.56% on MLE-Bench, outperforming the human median by 51.56% [13].
Group 2: Technical Features
- FM Agent covers four core capabilities: automated machine learning pipelines, combinatorial optimization, GPU kernel generation, and mathematical problem-solving [13][14].
- The system's workflow includes cold-start initialization, adaptive diversity sampling, and a distributed asynchronous infrastructure based on the Ray framework [12][14].
Group 3: Industry Applications
- FM Agent has proven effective in multiple sectors, including finance, urban traffic optimization, and large-scale engineering projects, providing solutions that are faster and more efficient than traditional methods [25][18].
- The agent can abstract real-world problems into mathematical algorithms and then iterate on and optimize solutions against clear evaluation metrics [18][20].
Group 4: Future Implications
- The emergence of FM Agent signals a new paradigm in which humans define problems and AI executes solutions, potentially transforming productivity across industries [22][30].
- Over 1,000 enterprises have already signed up to test Baidu's FM Agent, indicating strong interest and potential for widespread application in sectors such as transportation, energy, and finance [33][32].
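The combination of a proposal step with evolutionary search that this summary describes can be sketched as a small loop. This is a toy under stated assumptions, not Baidu's implementation: in a system like FM Agent the mutation step would be an LLM proposing code changes, whereas here `toy_mutate` adds random noise to a vector, and every name (`evolve`, `toy_score`, `toy_mutate`) is hypothetical.

```python
import random
from typing import Callable, List

Candidate = List[float]


def evolve(
    score: Callable[[Candidate], float],
    mutate: Callable[[Candidate, random.Random], Candidate],
    seed_solution: Candidate,
    generations: int = 200,
    population_size: int = 8,
    rng_seed: int = 0,
) -> Candidate:
    """Elitist evolutionary search: propose a child, keep the best candidates."""
    rng = random.Random(rng_seed)
    population = [list(seed_solution)]
    for _ in range(generations):
        parent = rng.choice(population)   # sample a parent from the survivors
        child = mutate(parent, rng)       # stand-in for an LLM-proposed edit
        population.append(child)
        population.sort(key=score, reverse=True)
        del population[population_size:]  # drop the weakest candidates
    return population[0]


def toy_score(v: Candidate) -> float:
    """Toy objective with its maximum at (3, 3, ...); higher is better."""
    return -sum((x - 3.0) ** 2 for x in v)


def toy_mutate(v: Candidate, rng: random.Random) -> Candidate:
    """Random perturbation standing in for a model-generated modification."""
    return [x + rng.gauss(0.0, 0.5) for x in v]


if __name__ == "__main__":
    best = evolve(toy_score, toy_mutate, [0.0, 0.0])
    print(best, toy_score(best))
```

Because the best candidate is never discarded, the top score cannot decrease from one generation to the next; a clear evaluation metric, as the summary notes, is what makes this kind of loop workable.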
Terence Tao Champions AlphaEvolve: 67 Different Mathematical Problems Tackled, Surpassing the Best Human Solutions on Several Hard Ones
36Kr· 2025-11-07 07:40
Core Insights
- The article discusses AlphaEvolve as a powerful new tool for mathematical discovery, as presented in a paper co-authored by Bogdan Georgiev and Terence Tao [1][5].
Group 1: AlphaEvolve's Capabilities
- AlphaEvolve was tested on 67 mathematical problems across fields including combinatorics, geometry, mathematical analysis, and number theory [3].
- The system outperformed traditional tools in scalability, robustness, and interpretability, and it can autonomously discover novel mathematical constructions, in some cases surpassing the best existing human results [5][6].
Group 2: Human-AI Collaboration
- On the Nikodym set problem, AlphaEvolve generated initial constructions that were not optimal but gave human researchers valuable insights, leading to improved upper bounds in a subsequent independent paper [6][7].
- Similarly, AlphaEvolve played a crucial role in advancing understanding of the arithmetic Kakeya conjecture [8].
Group 3: Interpretability and Insight Generation
- Because AlphaEvolve produces clear, interpretable program code, human experts can analyze its outputs and extract general mathematical formulas from them [10].
- For the block-stacking problem, the system first produced a correct recursive program and later simplified it into a more efficient explicit program, revealing the connection to harmonic numbers [14].
Group 4: Problem-Solving Techniques
- The system navigated complex problem spaces by adapting its scoring functions to avoid local traps, ultimately converging on known theoretical optima [19].
- AlphaEvolve generalized well, successfully identifying universal constructions for all perfect-square inputs [20][21].
Group 5: Efficiency and Expert Guidance
- AlphaEvolve works efficiently from a small number of high-quality prompts, and expert guidance significantly improves the quality of its outputs [23].
- The system supports parallelization, allowing researchers to explore multiple problem instances simultaneously, which is particularly effective for multi-parameter geometric problems [23].
Group 6: Operational Modes
- AlphaEvolve has two primary modes: "search mode," for efficiently discovering optimal mathematical constructions, and "generalizer mode," for creating universal programs that apply across parameters [24][26].
- In search mode, the system evolves heuristic algorithms that optimize the search process; in generalizer mode, it looks for patterns in observed optimal solutions and develops general formulas from them [25][26].
Conclusion
- Overall, AlphaEvolve shows how AI-driven evolutionary search can complement human intuition, offering a robust new paradigm for mathematical research [28].
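The step from a recursive program to an explicit one, mentioned for the block-stacking problem, can be illustrated with the classic textbook version of that problem. The summary gives neither the exact problem statement nor AlphaEvolve's programs, so the sketch below assumes unit-length blocks stacked one per level, where the maximum overhang with n blocks is half the n-th harmonic number; the function names are illustrative.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def overhang_recursive(n: int) -> Fraction:
    """Recursive form: adding the n-th block extends the overhang by 1/(2n)."""
    if n == 0:
        return Fraction(0)
    return overhang_recursive(n - 1) + Fraction(1, 2 * n)


def overhang_explicit(n: int) -> Fraction:
    """Explicit form: half the n-th harmonic number, H_n / 2."""
    harmonic = sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))
    return harmonic / 2


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, overhang_recursive(n), overhang_explicit(n))
```

Using `Fraction` keeps the comparison exact, so the two forms can be checked for equality term by term without floating-point error.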