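GPT-5 closed-beta results leak early: everyday reasoning beats humans for the first time, with strong coding, math, and science problem-solving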
36Kr · 2025-08-07 07:21
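The teaser for what appears to be the GPT-5 launch had only just gone out, and hands-on impressions from closed-beta testers are already circulating. Its reasoning ability reportedly surpasses humans for the first time and crushes every other large model.
That claim comes from one user's hands-on test: he had every model answer 10 questions with reasoning mode enabled, and GPT-5 was the only one to miss just one question, a higher accuracy than humans. It is not an isolated case: another user reported very similar results, with GPT-5 again missing only 1 of the 10 questions.
Beyond the strong reasoning, people with beta access say GPT-5 is also strong at coding, math, and solving scientific problems. Some are already joking that GPT-5 will replace PhDs. It got almost everything right on the first try, and needed two tries at most, whereas other large models needed more attempts.
What is certain is that OpenAI has teased a launch event for tonight, replacing the "s" in "livestream" with a "5". And the ever-cryptic Altman just posted an image... make of it what you will. In short, everything seems ready to go. As for actual performance, let's look at the early leaks first.
Reasoning and coding abilities worth watching
So far, the GPT-5 capabilities worth watching include: reasoning, coding, solving scientific problems, and math.
First, on reasoning: the user @invincibleHunter tried the model in Copilot. The model did not disclose its own version, but since someone found a few days ago that Copilot's upcoming Smart mode integrates GPT-5, it is presumed to be GPT-5. ...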
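GPT-5 closed-beta results leak early: everyday reasoning beats humans for the first time, with strong coding, math, and science problem-solving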
量子位 (QbitAI) · 2025-08-07 04:15
Core Viewpoint
 - The article discusses the anticipated release of GPT-5, highlighting its superior reasoning capabilities compared to previous models and even human performance in certain tasks [1][2][4].

Group 1: Performance Highlights
 - GPT-5 reportedly achieved a high accuracy rate, only making one mistake out of ten reasoning questions, outperforming human accuracy [4][5].
 - Users have noted that GPT-5 excels in programming, mathematics, and solving scientific problems, indicating a significant improvement in these areas [7][30].
 - The model's reasoning ability was tested through complex logic problems, showcasing its advanced thinking process [18][25].

Group 2: Comparison with Previous Models
 - The performance leap from GPT-4 to GPT-5 is noted, although some users feel the improvement is not as pronounced as the transition from GPT-3 to GPT-4 [30].
 - GPT-5's parameter scale is reportedly much larger than that of GPT-4, suggesting a more complex model [33].

Group 3: Development Challenges
 - The development of GPT-5 faced challenges related to data quality and AI infrastructure, with OpenAI reportedly hiring scientists to create high-quality training data [31][32].
 - The pre-training process for such a large model is time-consuming, which has affected the release timeline of GPT-5 [35].

Group 4: Competitive Landscape
 - The competitive environment is intense, with companies like Google and Anthropic releasing new models to challenge OpenAI [36][39].
 - There are indications that Google may release an open-source large model, directly competing with OpenAI's offerings [38].