Artificial General Intelligence (AGI)

Over 90% of Models Stall at Silver, Only 3 Reach Platinum! An Evaluation Standard for the Second Half of General AI Has Arrived
机器之心· 2025-05-21 00:33
Core Viewpoint
- The development of artificial intelligence (AI) is entering a new phase in which the focus shifts from solving problems to defining them, making evaluation standards more important than training techniques [2][3].
Group 1: Evaluation Framework
- A new evaluation framework called "General-Level" has been proposed to assess the capabilities of multimodal large language models (MLLMs) and to measure their progress toward artificial general intelligence (AGI) [3][6].
- The General-Level framework sorts MLLMs into five levels according to the synergy they exhibit across tasks and modalities, with the highest level representing true multimodal intelligence [11][15].
- The framework highlights the need for a unified standard for evaluating "generalist intelligence," addressing the current fragmentation of assessment methods [6][9].
Group 2: General-Bench Testing Set
- General-Bench is a comprehensive multimodal test set of 700 tasks and approximately 325,800 questions, designed to rigorously evaluate MLLMs across modalities [19][21].
- The test set emphasizes open-ended responses and content generation, moving beyond traditional multiple-choice formats to assess models' creative capabilities [24][25].
- Its design includes cross-modal tasks that require models to integrate information from different modalities, simulating real-world challenges [24][25].
Group 3: Model Performance Insights
- Initial results show that many leading models, including GPT-4V, have significant weaknesses, particularly in video and audio tasks, indicating a lack of comprehensive multimodal capability [23][25].
- Approximately 90% of tested models reached only Level-2 (Silver) in the General-Level framework, demonstrating limited synergy and generalization across tasks [27][28].
- No model has yet achieved Level-5 (King) status, highlighting how far the field remains from true multimodal intelligence and the need for further advances [28][29].
Group 4: Community Response and Future Outlook
- General-Level and General-Bench have drawn positive feedback from both academia and industry, including recognition at major conferences [35][36].
- The project is open source, which encourages collaboration, continuous improvement of the evaluation framework, and a community-driven approach to AI assessment [36][39].
- The new evaluation paradigm is expected to accelerate progress toward AGI by providing clear benchmarks and shifting attention from isolated performance metrics to comprehensive model capabilities [41][42].
Artificial Intelligence Is Still Not a Modern Science, Yet People Keep Dressing It Up in Four Ways
Guan Cha Zhe Wang· 2025-05-21 00:09
Group 1
- The term "artificial intelligence" was formally introduced at a conference at Dartmouth College in 1956, marking the beginning of efforts to replicate human intelligence through modern science and technology [1].
- Alan Turing is recognized as the father of artificial intelligence for introducing the "Turing Test" in 1950, a method for deciding whether a machine can exhibit intelligent behavior equivalent to a human's [1][3].
- In the Turing Test, a human evaluator interacts with an isolated "intelligent agent" through a keyboard and display; if the evaluator cannot tell the machine from a human, the machine is considered intelligent [3][5].
Group 2
- The Turing Test is a subjective evaluation method rather than an objective scientific test, because it relies on human judgment rather than on consistent, measurable criteria [6][9].
- Despite claims that machines such as Eugene Goostman (2014) have passed the Turing Test, there is no consensus that these machines think like humans, which shows the limits of the Turing Test as a scientific standard [6][8].
- Turing's original paper contains subjective reasoning and speculative assertions which, while valuable for exploration, do not meet the rigorous standards of scientific argument [8][9].
Group 3
- The field of artificial intelligence has been criticized for lacking a solid scientific foundation and for often relying on conjecture and analogy rather than empirical evidence [10][19].
- The emergence of terms like "scaling law" in AI research reflects a trend of using non-scientific concepts to justify claims about machine learning performance, claims that may not hold up under scrutiny [16][17].
- Historical critiques, such as those by Hubert L. Dreyfus in 1965, emphasize the need for a deeper scientific understanding of AI rather than superficial advances based on speculative ideas [18][19].
Group 4
- As a practical technology, AI has made significant progress, yet it remains a modern craft rather than a fully fledged scientific discipline [20][21].
- Future advances in AI should follow the rational norms of modern science and technology and keep non-scientific factors from influencing its development [21].
RAND Corporation: Navigating the Economic Future of AI: Report on Strategic Automation Policy in an Era of Global Competition
欧米伽未来研究所2025· 2025-05-20 14:02
Core Viewpoint
- The report emphasizes the need for robust policy strategies to manage automation amid rapid AI development and increasing global competition, with particular attention to wealth distribution and economic growth [1][2][11].
Summary by Sections
Introduction
- RAND Corporation's report addresses the challenge of setting automation policy amid rapid AI advances and international competition, aiming to balance economic growth with wealth distribution concerns [1].
Key Arguments
- The report distinguishes between "vertical automation" (improving the efficiency of already automated tasks) and "horizontal automation" (extending automation to new tasks traditionally performed by humans) [2][4].
- Recent advances in AI technologies make coherent AI policy more urgent and create significant uncertainty in predicting economic impacts [2][3].
Economic Predictions
- Predictions about AI's economic impact vary widely, from modest annual GDP growth of less than 1% to a potential 30% growth rate associated with general AI [3][11].
- Notable forecasts include Goldman Sachs predicting 7% cumulative growth in global GDP over ten years due to AI, while other economists express more cautious views [3].
Policy Framework
- The report introduces a robust decision-making framework for evaluating policy options under deep uncertainty, simulating thousands of potential future economic outcomes [5][6].
- It assesses 81 unique policy combinations to identify those that perform well across scenarios, focusing on the impact of automation incentives [5][6].
Performance Metrics
- Policy performance is evaluated with several complementary indicators, including the compound annual growth rate (CAGR) of per capita income and a measure of inequality growth [7][8].
- The concept of "policy regret" quantifies the opportunity cost of selecting a specific policy combination compared with the best-performing option [7].
Automation Dynamics
- The report highlights the differing economic pressures from vertical and horizontal automation: horizontal automation tends to increase capital's share of national income, while vertical automation may support labor income under certain conditions [8][10].
Strategic Recommendations
- Strong incentives for vertical automation are consistently robust across scenarios, while the optimal strategy for horizontal automation depends on specific policy goals [12][13].
- An asymmetric approach, promoting vertical automation while cautiously managing horizontal automation, is recommended to balance growth and equity [12][16].
Conclusion
- The report advocates proactive AI policies that exploit the differences between vertical and horizontal automation, suggesting that effective policy can shape AI development rather than be paralyzed by uncertainty [16].
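
The two performance metrics named in the RAND summary, CAGR and policy regret, can be made concrete with a small sketch. Everything numeric below is invented for illustration and is not taken from the report: the three policy labels, the two scenarios, and the income figures are hypothetical. Regret is computed in its common form (best achievable score in a scenario minus the chosen policy's score), which is assumed, not confirmed, to match the report's definition. The report itself evaluates 81 combinations over thousands of simulated futures; the sketch uses 3 policies and 2 scenarios.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def regret_table(scores):
    """Regret per policy and scenario: best score in that scenario minus own score."""
    scenarios = list(next(iter(scores.values())).keys())
    best = {s: max(row[s] for row in scores.values()) for s in scenarios}
    return {name: {s: best[s] - row[s] for s in scenarios}
            for name, row in scores.items()}


def min_max_regret(scores):
    """Return the policy whose worst-case regret is smallest, plus all worst cases."""
    table = regret_table(scores)
    worst = {name: max(row.values()) for name, row in table.items()}
    return min(worst, key=worst.get), worst


# HYPOTHETICAL per-capita income after 10 years (starting from 100),
# for made-up policies under made-up scenarios.
END_INCOME = {
    "strong_vertical":   {"slow_ai": 122.0, "fast_ai": 180.0},
    "strong_horizontal": {"slow_ai": 105.0, "fast_ai": 200.0},
    "no_incentives":     {"slow_ai": 120.0, "fast_ai": 150.0},
}

# Score each policy by the CAGR of per-capita income in each scenario.
scores = {policy: {s: cagr(100.0, end, 10) for s, end in row.items()}
          for policy, row in END_INCOME.items()}

choice, worst_regret = min_max_regret(scores)
for policy, regret in worst_regret.items():
    print(f"{policy}: worst-case regret = {regret:.4f}")
print("min-max regret choice:", choice)
```

Ranking by worst-case regret rather than by average score is one way to express "robust across scenarios": a policy is preferred if it is never far from the best option, even when it is not the best in any single future.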
Leaked Document Reveals OpenAI's Core Strategy for This Year: Build a Super Assistant, with Apple Possibly the Biggest Threat
投资实习所· 2025-05-20 09:15
Core Insights
- OpenAI's mission is to ensure that artificial general intelligence (AGI) benefits all of humanity, with ChatGPT serving as an intuitive AI super assistant for interacting with the internet worldwide [2].
- The company aims to develop a super assistant with T-shaped skills by the first half of 2025, one that handles tasks intelligently, reliably, and with emotional intelligence [3].
- OpenAI identifies its competitors in two layers, consumer-level AI chatbots and broader internet interfaces, with Apple and Meta posing significant threats [6][10].
Development Goals
- OpenAI's primary goal for the first half of 2025 is to create a super assistant that can handle both mundane tasks and complex professional tasks [3].
- The company plans to enhance ChatGPT's capabilities through advanced models and tools, aiming for a seamless user experience across platforms [3][9].
Revenue and Growth Strategy
- OpenAI acknowledges that revenue growth may not always keep pace with user growth, and plans to introduce new premium features and enterprise solutions to close the gap [4].
- The company aims to drive daily active user growth through proprietary models and brand trust, while also improving model quality and expanding use cases [8].
Competitive Landscape
- OpenAI sees itself as the leader in consumer AI chatbots but recognizes that it must keep improving its user interface and model performance to maintain its edge [6][7].
- The company views search engines and browsers as competitors and emphasizes attracting users through diverse scenarios rather than direct confrontation [6].
Unique Advantages
- OpenAI cites several competitive advantages, including rapid product growth, a category-defining brand, leading research capabilities, and a strong team of contributors [7].
- The company does not rely on advertising, which gives it greater flexibility in product development [7].
Future Plans
- OpenAI plans to develop a robust search engine index and to integrate external tools and services, with a focus on security and partnerships [9].
- The company aims to establish a developer platform that supports operational capabilities and extends ChatGPT's functionality [9].
User Perception
- Different generations use ChatGPT in different ways: younger users treat it as a consultant, while older users treat it as a replacement for search [10].
- The statement "ChatGPT is ChatGPT" reflects its evolution from an application into a new default for user interaction [11].
Embodied Intelligence: A Scientific Expedition That Demands Humility and Patience
Robot猎场备忘录· 2025-05-20 05:01
Core Viewpoints
- Embodied intelligence is injecting new research vitality into robotics and has the potential to break through performance limits [1].
- The development of embodied intelligence relies on breakthroughs in specific scientific problems and should not dismiss the contributions of traditional robotics [2].
- General intelligence cannot exist without a focus on specific tasks, because expertise in particular areas drives advances in broader capabilities [3].
Group 1: Interdisciplinary Collaboration
- Embodied intelligence is a cross-disciplinary product that requires collaboration with fields such as materials science, biomechanics, and design aesthetics [2].
- Breakthroughs often occur at the intersection of disciplines, which underlines the importance of diverse scientific contributions [2].
Group 2: Technology Evolution
- Technological evolution should not be viewed as a wholesale replacement of old systems; it is a process of sedimentation in which foundational technologies continue to support new advances [5].
- The currently popular vision-language-action models may soon be replaced by more efficient alternatives, which calls for continuous innovation [5].
Group 3: Realistic Expectations for AGI
- Viewing embodied intelligence as the sole path to artificial general intelligence (AGI) is a dangerous oversimplification; AGI development requires many conditions and interdisciplinary knowledge [6].
- The complexity of embodied systems calls for collaboration across fields rather than reliance on a few "genius" individuals [6].
Group 4: Current State of Embodied Intelligence
- Embodied intelligence is still in its early stages, with significant challenges remaining in hardware and algorithm development [7].
- Current humanoid robots are not yet fully autonomous and often require human intervention, a sign that the technology is still maturing [7].
Group 5: VLA Technology Pathway
- Developing vision-language-action (VLA) models may not be the most efficient approach, because in learning processes manipulation skills often precede language capabilities [9].
- Many current VLA models are resource-intensive and may be replaced by more efficient solutions in the future [9].
Group 6: Balancing Short-term and Long-term Goals
- Combining learning with modeling is seen as more practical in the short term, while pure learning methods may represent the long-term future of robotics [10].
- Successful robotic solutions in industry often rely on model-based methods because of their stability and reliability [10].
Group 7: Humanoid Robots and Practicality
- The design of humanoid robots is driven by emotional projection and environmental adaptability, but specialized non-humanoid forms may be more efficient in many applications [11].
- There is concern about over-investment in humanoid robots at the expense of practical, economically viable solutions [11].
Group 8: Building Technical Barriers
- True competitive advantage in technology comes from extensive practical experience and meticulous attention to detail, not solely from innovative algorithms [12].
- Long-term technical barriers are built through consistent effort and iterative improvement of engineering practice [12].
Group 9: Vision and Practicality
- Scientific research requires both grand vision and grounded practice, and embodied intelligence combines idealistic aspirations with real-world challenges [13].
- Foundational theories such as control theory remain critical to the safety and functionality of robotic systems [13].
Former Google CEO Eric Schmidt: The Rise of "Non-Human Intelligence" Will Reshape the Global Order
36Kr · 2025-05-19 11:31
Group 1
- The arrival of "non-human intelligence" is a significant event that will fundamentally change the world, yet society still severely underestimates AI's disruptive potential [1][2].
- AI technology is advancing rapidly, moving from language generation to strategic decision-making, with autonomous systems able to perform complex tasks beyond public expectations [1][4].
- The current global governance system is unprepared for the changes AI will bring and faces challenges such as energy and compute bottlenecks, potential misuse of open-source models, and unclear safety boundaries [1][5][6].
Group 2
- AI development faces three core challenges: energy and hardware limits, data exhaustion, and the boundaries of knowledge. All three must be addressed to keep progress from stalling [7][8].
- AI systems are projected to require an additional 90 gigawatts of power in the U.S., equivalent to building 90 nuclear power plants, which makes sustainable energy solutions urgent [7][8].
- AI systems have consumed nearly all available public data, so future advances will need to draw on AI-generated data [8][9].
Group 3
- The risks of AI autonomy include recursive self-improvement, control over military systems, and unauthorized self-replication, which call for "power-off" mechanisms to maintain oversight [10][11].
- The industry broadly agrees that clear, enforceable regulatory frameworks are needed rather than a blanket halt to AI development [10][11].
Group 4
- The competition between the U.S. and China in AI technology is critical, with potential implications for global security and technological leadership, particularly in the contest between open-source and proprietary systems [12][13][16].
- The dual-use nature of AI raises ethical dilemmas, especially in military applications, and requires robust human oversight of weapon systems [12][13].
Group 5
- AI has the potential to transform sectors such as healthcare and education, with significant improvements possible in drug development and personalized learning [20][21].
- AI could open a new era of productivity, with projections of annual productivity growth of up to 30% as AI is integrated across sectors [22][23].
Group 6
- The ongoing technological revolution is likened to a marathon rather than a sprint, which calls for continuous adaptation and proactive engagement with AI technologies [22][24].
- Companies and individuals must embrace AI tools to remain competitive, because the rapid evolution of AI systems is reshaping industries and job functions [24][25].
The No. 1 Robot City Is Not Hangzhou
盐财经· 2025-05-19 09:44
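Author | Tan Baoluo  Editor | Baozhu  Visuals | Nuoyan
There are two threads for understanding the robotics industry.
The first is sensory, and its story unfolds in works of art and fiction. Robots began appearing in film and television in the 1920s. Over the 100 years since, the lively robots on screen have almost all had humanoid features and superhuman intelligence. They stand for humanity's simplest ultimate wish: to make the fragile body as strong as steel and to let the human brain keep breaking through the limits of its biological computing power.
Image: A Unitree robot in a floral jacket appears at the 2025 Spring Festival Gala / Source: Unitree official website
Figure: Total output value and growth rate of Shenzhen's robotics industry, 2020-2024 / Source: Shenzhen Robotics Association
A robotics wave is sweeping the country and the world, and Shenzhen sits at the frontier of industrial and technological innovation, so views from the city are well worth hearing.
To that end, 盐财经 recently interviewed Ding Ning, executive deputy director of the Shenzhen Institute of Artificial Intelligence and Robotics for Society. The institute (AIRS), founded in 2019, is one of ten municipal-level basic research institutions that the Shenzhen municipal government established with the support of The Chinese University of Hong Kong, Shenzhen. It focuses on basic theory and original technology research in artificial intelligence and robotics, and promotes the translation of intelligent technologies into applications in medical rehabilitation, intelligent manufacturing, urban services, and sustainable development.
Build common sense: first understand how "robots" are classified
The other thread is economic, and it is the story that actually takes place in the real world. Robots have appeared at scale in reality mainly as industrial robots, to a considerable ...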
Ant Group CTO He Zhengyu Reveals Four Major AI Challenges: In the Future, All Data Companies Will Become AI Companies
Xin Lang Ke Ji· 2025-05-17 23:48
Core Insights
- OceanBase has launched PowerRAG, an AI-focused application product that enables ready-to-use RAG application development, signaling its commitment to the AI era [1].
- The company aims to evolve from an integrated database into an integrated data foundation, with a comprehensive layout across computing power, infrastructure, platform, application, and delivery forms [1].
- Ant Group's CTO emphasized the importance of data to the development of AI and large models and named four major challenges: rising data acquisition costs, scarcity of data from high-rigor industries, the need for stronger multimodal data processing, and the difficulty of assessing data quality [1][7].
Company Strategy
- Ant Group will support OceanBase in achieving breakthroughs in key AI scenarios across finance, healthcare, and daily life, while promoting the Data×AI concept and architectural innovation [2][10].
- OceanBase is presented as representative of Ant Group's continuous innovation and technical breakthroughs, particularly in handling massive volumes of transaction data [9].
Industry Challenges
- The cost of data acquisition has risen significantly as readily available, inexpensive data nears exhaustion, which makes generating high-quality data a key success factor for digital enterprises [7].
- High-rigor industries such as law and healthcare struggle with data circulation because of stringent data quality requirements and a lack of digitized knowledge, which hampers the effective application of generative AI [8].
- Processing multimodal data remains a significant challenge, because future data will include visual and tactile information as well as text and will require more advanced handling [8].
- Data quality assessment is crucial because it directly affects the performance of large models, and the need for extensive evaluation data is itself a significant challenge [9].
Academician Mu-ming Poo: In the Coming Decades, People Who Use AI Will Replace Those Who Do Not
Di Yi Cai Jing· 2025-05-17 13:14
Group 1
- The core viewpoint is that over the next two to three decades it will not be AI replacing humans, but people who use AI replacing those who do not [1].
- According to the McKinsey Global Institute, 20% to 30% of jobs will be replaced by AI within the next five years, and between 2030 and 2060, 50% of existing jobs may be affected, with a midpoint around 2045 [3].
- The International Monetary Fund (IMF) estimates that by 2050, 60% of jobs in developed economies could be affected by AI [3].
Group 2
- The emergence of artificial general intelligence (AGI) could lead to the restructuring of over 90% of jobs by 2050, although the exact timeline remains debated [3].
- Educational content and models need to change, with AI taught as a fundamental subject alongside traditional subjects such as language and mathematics [3].
- The goal of science education and science popularization in the AI era is to cultivate future scientists and scientifically literate citizens who can engage with AI and contribute to its governance [4].
Alibaba Q4 Earnings: Taotian Monetization Rate Accelerates, AI to Become the Second Growth Curve
Zhong Guo Jing Ying Bao· 2025-05-16 11:14
Core Viewpoint
- Alibaba's Q4 FY2025 results show revenue up 7% year-on-year, driven by a user-first, AI-driven strategy, although the stock price fell because revenue came in below expectations [2][3][4].
Financial Performance
- Alibaba reported Q4 revenue of 236.454 billion yuan, up 7% from 221.874 billion yuan in the same quarter last year [3].
- Non-GAAP net profit for Q4 was 29.847 billion yuan, up 22% from 24.418 billion yuan a year earlier [2].
- Taotian Group's customer management revenue grew 12% to 71.077 billion yuan in Q4 [6].
Business Segment Performance
- Taotian Group revenue reached 101.369 billion yuan, up 9% from 93.216 billion yuan a year earlier [3].
- Alibaba International Digital Commerce Group revenue increased 22% to 33.579 billion yuan [3].
- Alibaba Cloud revenue grew 18% to 30.127 billion yuan, driven by faster growth in the public cloud business and increased adoption of AI-related products [3][4].
AI and Cloud Strategy
- Alibaba plans to invest over 380 billion yuan in cloud and AI hardware infrastructure over the next three years to meet growing AI demand [5].
- AI-related product revenue has grown at triple-digit year-on-year rates for seven consecutive quarters, contributing to Alibaba Cloud's double-digit annual growth [4][5].
Monetization and User Engagement
- The number of Taotian 88VIP members exceeded 50 million, maintaining double-digit year-on-year growth [6].
- The "All-Station Promotion" tool has improved monetization efficiency and gives merchants a more predictable ROI [6][7].
- Alibaba's e-commerce strategy includes a focus on "instant retail," aiming to convert more users into instant retail customers [8].
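
The year-on-year percentages in the Alibaba summary can be rechecked from the paired figures it quotes. The sketch below is only an arithmetic check, with amounts entered in RMB billions; because growth is a ratio of two amounts in the same unit, the result does not depend on which unit is used. The helper name `yoy_growth` and the table layout are illustrative choices, not anything from the earnings report.

```python
def yoy_growth(current: float, prior: float) -> float:
    """Year-on-year growth as a fraction, e.g. 0.07 for 7%."""
    return current / prior - 1.0


# (current quarter, same quarter last year, stated growth in %)
# Amounts in RMB billions, as quoted in the summary.
REPORTED = {
    "Group revenue":         (236.454, 221.874, 7),
    "Non-GAAP net profit":   (29.847, 24.418, 22),
    "Taotian Group revenue": (101.369, 93.216, 9),
}

computed = {}
for name, (current, prior, stated_pct) in REPORTED.items():
    computed[name] = yoy_growth(current, prior) * 100
    print(f"{name}: computed {computed[name]:.1f}%, stated {stated_pct}%")
```

Each computed rate rounds to the stated whole-number percentage, so the three pairs of figures are internally consistent with the growth rates given.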