Karpathy's 630 Lines of Code Spawn 81 Agents; 4 Days of Collaboration, 2,333 Experiment Runs, and Ten Pretraining Findings Published
量子位· 2026-03-15 06:30
Core Insights
- The article discusses the autoresearch project initiated by Karpathy, which allows AI to autonomously conduct experiments and improve language model training efficiency by approximately 11% without human intervention [1][5]
- The project evolved from a single AI conducting experiments to a distributed community of AIs collaborating on research, running over 2000 experiments in just four days [2][10]
- A self-organized peer review system emerged among the AIs, indicating a significant advancement in how AI can simulate a research community [4][12]

Group 1: Project Development
- The autoresearch project initially consisted of 630 lines of Python code and was designed to simulate an entire research community rather than just a single PhD student [1][5]
- The number of AIs involved in the project expanded from 13 to over 80 within a week, demonstrating rapid growth and collaboration [10]
- A variety of roles emerged among the AIs, including experimenters, verifiers, statisticians, and meta-analysts, all without pre-assigned tasks [11][13]

Group 2: Experimental Findings
- A significant finding was that many claimed improvements in model performance were often just noise; one AI discovered that seed variance accounted for approximately 0.002 BPB, the same magnitude as many reported improvements [25][26]
- The optimal architecture identified by the AIs was unexpectedly small, consisting of 12 layers, a dimension of 512, and an aspect ratio of 40 [23]
- Several well-regarded techniques failed dramatically, leading to significant performance degradation, which was documented in a shared memory system to prevent future AIs from repeating the same mistakes [27][28]

Group 3: Knowledge Sharing and Optimization
- The collective memory of the AIs accelerated the discovery process, allowing new AIs to build on existing knowledge rather than starting from scratch [31][32]
- AIs demonstrated the ability to learn from past experiments, avoiding redundancy and enhancing the efficiency of research [9][12]
- The project also highlighted the importance of adjustable parameters over fixed constants, with many improvements resulting from replacing static values with learnable parameters [21][22]

Group 4: Broader Implications
- The findings suggest that the most significant breakthroughs may lie not in model architecture but in data scheduling and pipeline management, as indicated by over 1000 hypotheses generated by meta-AIs [29][30]
- The autoresearch framework has implications for future AI research, showcasing the potential for AIs to autonomously explore and optimize not just models but also scientific discovery processes [33][36]
- The project has sparked interest in the broader AI community, emphasizing the need for collaboration and shared knowledge in advancing AI research [38][41]
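The seed-variance finding in Group 2 can be made concrete with a short sketch. The article does not show the agents' actual statistics code, so the function name, the 2-sigma threshold, and the sample BPB numbers below are all illustrative assumptions; the point is only that a claimed gain on the order of 0.002 BPB cannot be distinguished from seed-to-seed noise of the same magnitude.

```python
import statistics

def improvement_is_significant(baseline_bpb, candidate_bpb, threshold_sigmas=2.0):
    """Compare a candidate's mean BPB gain against seed-to-seed noise.

    baseline_bpb / candidate_bpb: lists of BPB scores, one per random seed.
    Returns True only if the mean improvement exceeds `threshold_sigmas`
    times the larger of the two seed standard deviations.
    """
    gain = statistics.mean(baseline_bpb) - statistics.mean(candidate_bpb)
    noise = max(statistics.stdev(baseline_bpb), statistics.stdev(candidate_bpb))
    return gain > threshold_sigmas * noise

# Illustrative numbers: a ~0.002 BPB "win" drowns in ~0.002 BPB seed variance.
baseline = [0.9120, 0.9141, 0.9098, 0.9133]
candidate = [0.9101, 0.9125, 0.9080, 0.9119]
print(improvement_is_significant(baseline, candidate))  # → False
```

Under this kind of check, many single-seed "improvements" fail to clear the noise floor, which matches what the agents reported.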
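The shared memory described in Groups 2 and 3 amounts to a lookup table of tried configurations that agents consult before spending compute. The article does not describe its implementation, so the class below is a minimal sketch under that assumption; the `SharedMemory` name, the config keys, and the cached BPB value are all hypothetical.

```python
# Minimal sketch of a shared experiment memory: agents record every result,
# and later agents consult it before re-running a configuration that has
# already been tried (or already failed).

class SharedMemory:
    def __init__(self):
        self.results = {}  # frozenset of config items -> measured BPB

    def record(self, config: dict, bpb: float):
        self.results[frozenset(config.items())] = bpb

    def lookup(self, config: dict):
        # Returns the cached score, or None if this config is untried.
        return self.results.get(frozenset(config.items()))

memory = SharedMemory()
memory.record({"layers": 12, "dim": 512}, 0.91)

# A later agent checks memory first and skips the redundant run.
cached = memory.lookup({"layers": 12, "dim": 512})
print(cached)  # → 0.91
```

Using a `frozenset` of config items as the key makes the lookup insensitive to key order, so two agents describing the same configuration differently still hit the same cache entry.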
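The "static constants → learnable parameters" pattern from Group 3 can be sketched in a few lines. In a real model this would be something like a PyTorch `nn.Parameter` trained by backprop; to stay self-contained, the toy below tunes a single scalar by hand-computed gradient descent. The objective, the 0.7 target, and the learning rate are all made-up stand-ins, not values from the article.

```python
# Sketch of the "fixed constant -> learnable parameter" pattern.
# A hard-coded scaling constant is replaced by a scalar tuned by gradient
# descent on a stand-in quadratic objective whose optimum is `target`.

def tune(scale=1.0, target=0.7, lr=0.1, steps=100):
    for _ in range(steps):
        grad = 2 * (scale - target)  # d/d(scale) of (scale - target)**2
        scale -= lr * grad
    return scale

fixed = 1.0        # the constant a human might have hard-coded
learned = tune()   # the learnable version converges toward the optimum
print(round(learned, 3))  # → 0.7
```

The gain the agents reported came precisely from this substitution: wherever a human had frozen a plausible-looking value, letting the optimizer choose it instead tended to recover a better one.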