Group 1
- The article's core point: the performance of AI large models on Humanity's Last Exam (HLE) has been improving rapidly since 2025, according to Professor Tang Jie of Tsinghua University, founder of Zhipu AI [2]
- In 2020, large models could only solve basic problems, such as MMLU-style knowledge questions and simple QA. In 2021-2022, post-training began to give them mathematical reasoning ability, closing foundational reasoning gaps [2]
- From 2023-2024, large models evolved from knowledge retention to complex reasoning, enabling them to tackle graduate-level problems and real programming tasks, a trajectory resembling human growth from elementary school to the workplace [2]

Group 2
- By 2025, models had significantly improved on Humanity's Last Exam, whose questions are so niche that Google cannot retrieve them, demanding strong generalization ability [2]
- The industry has pursued various methods to improve AI generalization, which remains a key limitation [2]
- Around 2020, the industry leveraged the Transformer architecture to strengthen long-term knowledge retention and direct knowledge retrieval, such as answering basic factual questions [3]
- By 2022, the focus shifted to optimizing alignment and reasoning: instruction fine-tuning and reinforcement learning, drawing on extensive human feedback data, improved complex reasoning, intent understanding, and model accuracy [3]
- By 2025, the effort is to build verifiable environments in which machines explore autonomously, gather feedback, and self-improve, addressing the noise and limited coverage of traditional human feedback data [3]
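The "verifiable environment" idea in the last point can be illustrated with a minimal sketch: instead of noisy human labels, a programmatic verifier scores each attempt, and the model updates itself from that reward signal. Everything below (the `ToyPolicy` class, the arithmetic task, the update rule) is a hypothetical toy for illustration, not Zhipu's actual training setup.

```python
import random


def verifier(problem, answer):
    """Programmatic check: reward 1.0 iff the answer equals the true sum.
    No human labeling is involved, so the feedback is noise-free."""
    a, b = problem
    return 1.0 if answer == a + b else 0.0


class ToyPolicy:
    """Stand-in for a model: answers a + b + bias, and nudges its bias
    toward zero whenever an attempt earns no reward (toy learning rule)."""

    def __init__(self):
        self.bias = 3  # starts out systematically wrong

    def propose(self, problem):
        a, b = problem
        return a + b + self.bias

    def update(self, reward):
        if reward == 0.0 and self.bias != 0:
            self.bias -= 1 if self.bias > 0 else -1  # move toward correct


def self_improvement_loop(steps=10, seed=0):
    """Explore, get verified feedback, update: repeated self-improvement."""
    rng = random.Random(seed)
    policy = ToyPolicy()
    rewards = []
    for _ in range(steps):
        problem = (rng.randint(0, 9), rng.randint(0, 9))
        reward = verifier(problem, policy.propose(problem))
        policy.update(reward)
        rewards.append(reward)
    return rewards


print(self_improvement_loop())  # → [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
```

The point of the sketch is the loop's structure: because the verifier is deterministic and scalable, the model can generate its own training signal across arbitrarily many scenarios, which is exactly what human feedback data cannot do.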
Zhipu founder Tang Jie: AI large models' "Humanity's Last Exam" capability is improving rapidly