OpenAI Five

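AI benchmark scores are becoming increasingly meaningless; Google says it is better to let AIs play games against each other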
36Kr · 2025-08-11 23:25
Group 1
- Google has organized an "AI Chess King Championship" featuring top AI models from the US and China, including OpenAI's o4-mini and Google's Gemini 2.5 Pro, to evaluate and promote advances in AI reasoning and decision-making [1][3]
- The competition uses strategy games as a testing ground to address the limitations of traditional AI benchmarks, which have failed to keep pace with the rapid development of AI models [3][11]
- The Kaggle Game Arena, introduced by Google, is a new public benchmarking platform that lets AI models compete in a more dynamic and realistic environment than conventional tests [3][11]

Group 2
- In the current investment climate, AI startups can easily reach valuations above $1 billion, driven by investors' fear of missing out (FOMO) [4][6]
- "Score manipulation" is a growing trend among AI companies: high benchmark scores are used as a marketing tool to attract investment, raising concerns about the integrity of AI performance evaluations [6][9]
- Various benchmarks exist to evaluate AI models, but their lack of flexibility lets companies artificially inflate their scores, undermining the reliability of these assessments [9][11]

Group 3
- Google chose games as a test scenario for AI models because their structured rules and inherent randomness make them an effective measure of AI intelligence and capability [12][13]
- Gaming and AI are closely linked, as shown by OpenAI's success in defeating human champions at Dota 2, which demonstrated AI's potential in complex environments [13][15]
- The shift to reinforcement learning from human feedback (RLHF) was pivotal in improving AI performance, as seen in OpenAI's development of ChatGPT [15]
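The LLM talent-poaching bloodbath: reinforcement learning drained of its brightest minds, turned overnight into a "no man's land"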
36Kr · 2025-08-04 07:22
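Recently, Joseph Suarez, an AI and CS PhD from Stanford, published a historical retrospective on reinforcement learning. The post went viral and has so far drawn 382,000 views.

The cover image is striking: a curve rises quickly, then climbs gently, and finally plunges, hinting that the research outlook for the RL field is bleak.

From a historical perspective, what happened to reinforcement learning? Why is it only now truly starting to take off? He offers a distinctive personal perspective.

A distinguished pedigree

In 2019, he graduated from Stanford University with a bachelor's degree in computer science, specializing in artificial intelligence. In 2018, he used a leave of absence to complete a six-month internship at OpenAI, during which the first public version of Neural MMO was officially released. Earlier still, he had taken part in research projects in Fei-Fei Li's group and Andrew Ng's lab.

He began working on reinforcement learning around 2017. At the time, he was pursuing a PhD in Phillip Isola's lab at the Massachusetts Institute of Technology and had started building Neural MMO, an open-source computational research platform. His research focuses on extending modern agent-based learning methods to more complex, more cognitively realistic environments. The project later became the subject of his entire doctoral thesis.

At the time, the major labs were also doing from-scratch, non-language-model reinforcement learning. In fact, that was the focus of most work back then: multi-agent research was just emerging, and all the core algorithms had only just been released. AlphaGo made researchers ...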