General Artificial Intelligence
EvaLearn: A New Evaluation Paradigm for the Second Half of AI!
Jiqizhixin (机器之心) · 2025-07-28 10:45
Core Viewpoint
- The article discusses the shift in AI research from "can it be done" to "is it effective," emphasizing the need for new evaluation methods that assess models' long-term adaptability and learning capabilities, particularly on the path toward general artificial intelligence [1][4].
Group 1: New Evaluation Paradigm
- A new evaluation paradigm called EvaLearn has been proposed to assess the learning ability and learning efficiency of large language models (LLMs), offering a fresh perspective on their human-like learning potential [5][6].
- EvaLearn focuses on "sequential problem-solving," redefining the evaluation logic for LLMs, and has drawn significant attention since its open-source release [6][8].
Group 2: Limitations of Traditional Benchmarks
- Traditional benchmarks treat problems as isolated samples, so they cannot measure a model's learning efficiency or adaptability, both of which are crucial for understanding its performance [8][9].
- EvaLearn comprises 648 challenging problems organized into 182 sequences. Models must solve each sequence in order, which allows a systematic assessment of whether they learn from earlier problems [9][11].
Group 3: Key Findings from EvaLearn
- Models show different learning abilities across task types: most make better use of prior experience on mathematical and logical reasoning tasks, while tasks such as summarization rely more on pre-trained knowledge [14].
- Thinking models (those based on chain-of-thought reasoning) generally outperform non-thinking models, showing better stability and a greater ability to solve several related problems in a row [15].
- Feedback learning, which incorporates evaluations from a verifier, improves models' learning ability and efficiency significantly more than example-based learning does [16].
- The learning ability and efficiency metrics give a fuller picture of a model's learning potential and show that high static performance does not guarantee superior learning capability [17].
Group 4: Evaluation Metrics
- EvaLearn characterizes models' dynamic learning abilities across six task types: summarization, classification, information extraction, logical reasoning, mathematical reasoning, and sequential reasoning [20].
- The key indicators are overall accuracy, learning speed, position of the first correct answer, number of consecutive correct answers, and post-warm-up accuracy [21].
Group 5: Learning Efficiency and Methods
- Learning efficiency differs significantly across models and task types: non-thinking models often progress faster as they accumulate experience, while thinking models show more stable gains [44].
- The problem-solving method, such as example learning or feedback learning, significantly affects model performance, with feedback learning generally yielding higher accuracy and learning efficiency [46][48].
- The average position of the first correct answer varies across models and tasks, reflecting differences in learning potential and the importance of feedback in improving learning outcomes [51][53].
Group 6: Conclusion
- EvaLearn is a novel benchmark framework that evaluates models' learning ability and efficiency sequentially across a range of tasks, revealing significant performance differences among leading models [55][56].
- The findings position learning ability and learning efficiency as a new lens for evaluating models and for narrowing the gap between current models and human capabilities [57].
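The indicators named under Evaluation Metrics can be illustrated with a small sketch for a single problem sequence. This is not EvaLearn's implementation: the function name `sequence_metrics`, the `warmup` parameter, and in particular the definition of learning speed as a least-squares slope of correctness against position are assumptions made here for illustration only.

```python
from statistics import mean


def sequence_metrics(results, warmup=2):
    """Summarize one problem sequence (hypothetical helper, not EvaLearn's code).

    results: list of booleans, True where the model solved the problem
             at that position in the sequence.
    warmup:  assumed number of leading problems treated as warm-up.
    """
    if not results:
        raise ValueError("results must contain at least one entry")

    scores = [1.0 if r else 0.0 for r in results]
    positions = list(range(1, len(scores) + 1))

    # Overall accuracy: fraction of problems solved in the sequence.
    overall_accuracy = mean(scores)

    # Learning speed (assumed definition): slope of a least-squares line
    # fitted to correctness versus position. Positive means improvement.
    p_bar, s_bar = mean(positions), mean(scores)
    denom = sum((p - p_bar) ** 2 for p in positions)
    learning_speed = (
        sum((p - p_bar) * (s - s_bar) for p, s in zip(positions, scores)) / denom
        if denom
        else 0.0
    )

    # Position (1-based) of the first correct answer, or None if never correct.
    first_correct = next((i for i, r in enumerate(results, start=1) if r), None)

    # Longest run of consecutive correct answers.
    longest = current = 0
    for r in results:
        current = current + 1 if r else 0
        longest = max(longest, current)

    # Accuracy after skipping the assumed warm-up problems.
    post = scores[warmup:]
    post_warmup_accuracy = mean(post) if post else None

    return {
        "overall_accuracy": overall_accuracy,
        "learning_speed": learning_speed,
        "first_correct_position": first_correct,
        "longest_correct_streak": longest,
        "post_warmup_accuracy": post_warmup_accuracy,
    }


if __name__ == "__main__":
    # A made-up sequence of seven outcomes, purely for demonstration.
    print(sequence_metrics([False, False, True, True, False, True, True]))
```

A model that starts badly but improves shows a positive slope and a post-warm-up accuracy above its overall accuracy, which is the kind of difference between static performance and learning ability that the summary describes.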
The "King of Multimodal Involution" Fires Three Arrows in a Row!
Zhong Guo Ji Jin Bao· 2025-07-26 08:44
Core Insights
- StepFun (阶跃星辰) announced three significant developments: the launch of its new-generation foundation model Step 3, a strategic partnership with Shanghai State-owned Capital Investment Co., and the establishment of the Model Ecological Innovation Alliance with nearly ten chip manufacturers and computing power platforms [2][3][6].
Group 1: New Model Launch
- Step 3 is designed to balance intelligence and efficiency, with the aim of being the model best suited to the inference era. It will be open-sourced to enterprises and developers worldwide on July 31 [3].
- On domestic chips, Step 3's decoding efficiency can reach up to 300% of DeepSeek-R1's, and the model is compatible with all chip types [3].
Group 2: Strategic Partnership
- The collaboration with Shanghai State-owned Capital Investment Co. marks a significant step in StepFun's commercialization, focusing on capital linkage, ecosystem building, business synergy, and application empowerment [6][9].
- Shanghai State-owned Capital Investment Co. has a registered capital of 10 billion yuan and engages in strategic equity management and market-oriented investment projects [9].
Group 3: Ecosystem Alliance
- The Model Ecological Innovation Alliance aims to improve model adaptability and computing efficiency through collaborative innovation among foundational technology vendors [11].
- Initial members include Huawei Ascend, Muxi, and others, with the goal of providing efficient and user-friendly large model solutions [11][13].
The "King of Multimodal Involution" Fires Three Arrows in a Row!
Zhong Guo Ji Jin Bao (中国基金报) · 2025-07-26 08:31
Core Viewpoint
- StepFun (阶跃星辰) announced three significant developments: the launch of its new-generation foundation model Step 3, a strategic partnership with Shanghai State-owned Capital Investment Co., and the establishment of the Model Ecological Innovation Alliance with nearly ten chip manufacturers and computing power platforms [1][7][14].
Group 1: New Generation Foundational Model
- Step 3 is designed to balance intelligence and efficiency, with the aim of being the model best suited to the inference era and of contributing a powerful multimodal inference model to the open-source community [1][2].
- On domestic chips, Step 3 achieves a decoding efficiency up to 300% higher than DeepSeek-R1, which the company attributes to system and architecture innovation [2].
- In distributed inference on NVIDIA Hopper architecture chips, Step 3 improves throughput by over 70% compared to DeepSeek-R1 [4].
Group 2: Strategic Partnership
- The partnership with Shanghai State-owned Capital Investment Co. marks a significant step in StepFun's commercialization efforts, focusing on capital linkage, ecosystem building, business collaboration, and application empowerment [7].
- Shanghai State-owned Capital Investment Co. is a large state-owned capital investment platform with a registered capital of 10 billion yuan, engaged in strategic equity management and market-oriented investment [8].
Group 3: Commercialization Progress
- StepFun has made commercial progress: more than half of domestic smartphone manufacturers work with the company, and it partners with Geely Auto on smart cockpit solutions [10].
- The company is targeting annual revenue of 1 billion yuan for 2025, supported by rapid growth in the first half of the year [10].
Group 4: Model Ecological Innovation Alliance
- The Model Ecological Innovation Alliance, initiated by StepFun together with nearly ten chip and infrastructure manufacturers, aims to improve model adaptability and computing efficiency through collaborative innovation [14][15].
- Initial members include Huawei Ascend, Muxi, and several other technology firms, with the goal of providing efficient and user-friendly large model solutions for enterprises and developers [14][15].
Zuckerberg Appoints Tsinghua Alumnus as Meta AI Chief Scientist
Hu Xiu· 2025-07-26 02:03
Group 1
- Meta has appointed Shengjia Zhao, a Tsinghua University alumnus, as Chief Scientist of Meta Superintelligence Labs (MSL) [1][2].
- Mark Zuckerberg expressed excitement about Zhao's leadership and his groundbreaking contributions in several fields, emphasizing the goal of building a talent-dense team for long-term development [4][28].
- Zhao has a strong academic background: he graduated from Tsinghua University and earned a PhD from Stanford University, focusing on large model architectures and multimodal reasoning [12][21].
Group 2
- Zhao was a core member at OpenAI, contributing significantly to the design of GPT-4 and other models, and was involved in key technical directions such as model reasoning and safety mechanisms [15][17].
- He is a primary author of the "GPT-4 Technical Report," which has over 17,000 citations and is among the most referenced documents in contemporary AI [18][19].
- The MSL team includes several researchers from OpenAI, with notable representation of Chinese talent, indicating a strong focus on advanced AI research [24][27].
Group 3
- Despite Zhao's appointment, Turing Award winner Yann LeCun will continue as Chief Scientist of FAIR, which focuses on long-term AI research [10][11].
- MSL aims to push the frontiers of superintelligence research, with a commitment to aligning AI with human needs [8][28].
- The high proportion of Chinese members on the MSL team has prompted discussion about the efficiency of communication within the team [25][27].
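2025 Conference on Key Technologies of Intelligent Robots Held
Jiqiren Quan (机器人圈) · 2025-07-25 12:53
From July 22 to 24, 2025, the 2025 Conference on Key Technologies of Intelligent Robots was held in Qiqihar under the theme "Integration and Breakthroughs in Embodied Intelligence and Multimodal Interaction Technology." The conference was hosted by the journal 《机器人技术与应用》 (Robot Technique and Application) and co-organized by Shandong University, University of Science and Technology Beijing, Zhejiang University of Technology, Tianjin University of Technology, Qiqihar University, and the School of Mechatronics Engineering of Harbin Institute of Technology. It received strong support from the Robotics Committee of the Chinese Association of Automation, the Intelligent Robot Committee of the Chinese Association for Artificial Intelligence, the Intelligent Vehicle and Robot Committee of the China Instrument and Control Society, and the Robot and Intelligent Welding Committee of the China Engineering Construction Welding Association. It brought together leading experts, scholars, and industry figures in robotics to discuss frontier technologies and future trends in embodied intelligence. Alongside the conference, companies including 蓝点触控(北京)科技有限公司, 北京诺亦腾科技有限公司, 上海念通智能科技有限公司, and 烟台睿感物联技术有限公司 exhibited and demonstrated their flagship products on site, drawing many visitors.
Han Zhiqiang, Deputy Director of 北方科技信息研究所; Cheng Zhenxuan, Director at 北方华安工业集团有限公司; Wang Zhigang, Vice President of Qiqihar University; Liu Jinchang, Level-2 Researcher at the Ministry of Science and Technology; Professor Li Yibin of Shandong University; Dong Kai, Head of the Science and Technology Division at the CCID Research Institute of the Ministry of Industry and Information Technology; Professor Fu Yili of Harbin Institute of Technology; Professor Liu Xinjun of Tsinghua University; Xidian University ...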
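Unisound's Market Value Surges by More Than HKD 17 Billion: Pangu Ventures (磐谷创投) Leads with a 118x Return; State-Owned Investors That Joined After Series D Average a 282% Return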
Xin Lang Zheng Quan· 2025-07-25 07:06
Core Viewpoint
- Unisound (云知声), a general artificial intelligence "unicorn," listed on the Hong Kong Stock Exchange on June 30 after a turbulent path to the capital market that included a failed IPO attempt on the Shanghai Stock Exchange and two unsuccessful applications for a Hong Kong listing [1].
Group 1: Company Overview
- Unisound issued 1,560,980 shares globally. The Hong Kong public offering was oversubscribed by 91.66 times, leading to an adjusted sale of 624,400 shares [1].
- The company set its IPO price at HKD 205 per share, raising approximately HKD 320 million, which ranks 31st among the 43 companies newly listed on the Hong Kong Stock Exchange in 2025 [1].
- The stock surged on its debut, peaking at HKD 319.80, a 56% increase, and closing at HKD 296.40, 44.6% above the IPO price [2].
Group 2: Investment and Financing
- Major cornerstone investors included SenseTime, Runjian Co., and Zhenyi Asset Management, which together subscribed to 462,860 shares, nearly 30% of the offering [2].
- Before the IPO, Unisound completed 10 rounds of equity financing, raising approximately CNY 2.436 billion from various institutions [2].
- The company's valuation reached approximately CNY 8.333 billion (USD 1.929 billion) after the D3 round of financing was completed in May 2023 [2].
Group 3: Financial Performance
- The company reported cumulative losses of CNY 1.205 billion from 2022 to 2024, including a loss of CNY 454 million in 2024 alone, a 21.4% year-on-year increase [8].
- Operating cash flow has been negative for three consecutive years, trade receivables are high, and receivable turnover days exceed industry averages [8].
- The customer retention rate in the medical services segment fell from 70.4% in 2022 to 53.3% in 2024, indicating difficulty in maintaining client relationships [8].
Group 4: Market Position
- As of 2024, Unisound ranked fourth among AI solution providers in China, with a market share of only 0.6%, a significant gap compared to the top three competitors [8].
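The offering figures quoted in this summary can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the summary; the variable and function names are illustrative, and the calculation ignores fees, over-allotment, and any other adjustments.

```python
# Figures as quoted in the summary above.
shares_offered = 1_560_980       # shares issued globally
ipo_price_hkd = 205.0            # IPO price per share
debut_peak_hkd = 319.80          # intraday peak on the first trading day
debut_close_hkd = 296.40         # closing price on the first trading day
cornerstone_shares = 462_860     # shares taken by cornerstone investors


def pct_change(new, base):
    """Percentage change of `new` relative to `base`."""
    return (new - base) / base * 100


# Gross proceeds before any fees: shares times price.
gross_proceeds_hkd = shares_offered * ipo_price_hkd

peak_gain_pct = pct_change(debut_peak_hkd, ipo_price_hkd)
close_gain_pct = pct_change(debut_close_hkd, ipo_price_hkd)
cornerstone_pct = cornerstone_shares / shares_offered * 100

print(f"Gross proceeds: HKD {gross_proceeds_hkd:,.0f}")
print(f"Peak gain: {peak_gain_pct:.1f}%")
print(f"Closing gain: {close_gain_pct:.1f}%")
print(f"Cornerstone share of offering: {cornerstone_pct:.1f}%")
```

The results line up with the summary's "approximately HKD 320 million," "56%," "44.6%," and "nearly 30%."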
Transcript: Nobel Laureate on Humanity's Doomsday Risk, AI's "Move 37," and a Kardashev Type I Civilization
36Kr · 2025-07-25 04:21
Core Insights
- The discussion centers on whether AI is approaching a transformative moment akin to AlphaGo's "move 37," which would signal a critical technological shift [1][30].
- Demis Hassabis warns of the risks that come with AI advances and argues for cautious optimism [1][30].
Group 1: AI and Natural Systems
- Hassabis believes that all natural systems can be efficiently modeled by classical learning algorithms, particularly in fields such as biology, chemistry, and physics [4][5].
- He puts the probability of achieving Artificial General Intelligence (AGI) by 2030 at around 50%, with benchmarks that include the ability to propose new scientific hypotheses [4][30].
- AI systems such as AlphaGo and AlphaFold demonstrate that complex problems can be solved through intelligently guided search [4][5].
Group 2: AI's Understanding of Reality
- The Veo 3 model generates impressively realistic videos and displays a form of "intuitive physics" understanding [7][8].
- Hassabis expresses surprise that Veo 3 can learn from watching video without physical interaction, which challenges earlier assumptions that AI needs embodiment to understand the physical world [9][10].
Group 3: Future of Gaming with AI
- AI may transform gaming by generating stories dynamically from player decisions, creating a more immersive experience [12][13].
- Hassabis envisions truly open-world games that respond in real time to player choices [12][13].
Group 4: Evolutionary Algorithms and AI Innovation
- The recently released AlphaEvolve system combines large language models with evolutionary computation to explore new solution spaces [18][19].
- Hassabis believes that understanding a system's underlying dynamics is crucial for discovering new solutions and that evolutionary computation can lead to significant breakthroughs [18][19].
Group 5: AI's Role in Scientific Research
- Hassabis discusses the concept of "research taste," suggesting that while AI can solve complex problems, it currently cannot propose profound scientific hypotheses [22][23].
- The challenge lies in discerning the right questions and hypotheses, which is a critical part of scientific research [23][24].
Group 6: Future Energy Sources
- Hassabis predicts that nuclear fusion and solar energy will become the primary energy sources of the future, addressing energy challenges and potentially leading to a Kardashev Type I civilization [43][44].
- Efficient solar materials and nuclear reactors could allow humanity to harness abundant, clean energy [43][44].
Group 7: Competition in AI Development
- Hassabis emphasizes the importance of collaboration in AI research, stating that the goal is to bring the technology to the world safely for the benefit of humanity [47][48].
- Competition for AI talent is intensifying, with companies such as Meta using aggressive strategies to attract top researchers [51].
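iFlytek Medical Attends the 2025 Yangtze River Delta Physician Alliance High-Quality Development Forum to Explore New Paths for AI-Enabled Regional Healthcare Collaboration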
Jiang Nan Shi Bao· 2025-07-23 11:09
Group 1
- The "2025 Yangtze River Delta Physician Alliance High-Quality Development Forum" was held in Nanjing, with a focus on deepening academic exchange and promoting regional cooperation in healthcare development [1].
- Approximately 300 physician representatives from Jiangsu, Shanghai, Zhejiang, and Anhui attended the forum, reflecting collaboration across the region's healthcare sector [1].
Group 2
- Lu Xiaoliang, Executive President of iFlytek Medical, delivered a keynote report on advances in general artificial intelligence and its applications in healthcare, emphasizing the role of AI technology in solving clinical problems [2].
- iFlytek's Spark (星火) Medical Model, launched in 2023, has made significant progress in six core medical capabilities, including knowledge Q&A and diagnostic recommendations, and the company has established partnerships with over 500 hospitals [2].
- The AI health assistant iFlytek Xiaoyi has surpassed 24 million downloads and completed over 140 million AI consultations, with a user satisfaction rate of 98% [2].
Group 3
- iFlytek Medical aims to use its leading position in the AI industry to drive technological breakthroughs and innovative applications in healthcare, contributing to the integrated development of the Yangtze River Delta region [3].
Shanghai Composite Returns to 3,600 Points; Brokerage Stocks Shine
Chang Sha Wan Bao· 2025-07-23 04:58
Market Overview
- The market trended strongly upward, with all three major indices reaching new highs for the year. Total trading volume in the Shanghai and Shenzhen markets rose to 1.89 trillion yuan, up 193.1 billion yuan from the previous trading day [1].
- The market focused on large infrastructure projects. Over 2,700 stocks declined, while more than a hundred stocks hit the daily limit up for a second consecutive day [1].
Sector Performance
- Strong sectors included super hydropower, engineering machinery, coal, and cement, while sectors such as AI, components, software development, and gaming declined [1].
- The coal sector strengthened significantly in the afternoon, driven by rising coking coal futures prices, which continued to set new highs for the current rebound [3].
Financial Institutions and Policies
- The People's Bank of China reported that at the end of Q2 2025, the balance of renminbi real estate loans stood at 53.33 trillion yuan, up 0.4% year on year and up 416.6 billion yuan in the first half of the year [1].
- The Ministry of Industry and Information Technology announced that a work plan for stabilizing growth in ten key industries, including steel, non-ferrous metals, and petrochemicals, will be released soon [3].
Investment Recommendations
- CITIC Securities continues to recommend the computing power sector, citing sustained high earnings growth and historically low valuations for core companies in the North American computing chain [2].
- The report suggests focusing on companies that can benefit from external demand and achieve breakthroughs in customer acquisition or market share [2].
- It also draws attention to the AI sector, particularly AI edge chips and modules, as the industry evolves towards general artificial intelligence [2].
AI Industry Cooperation Strengthens; Institutions Recommend Watching Chips and Related Industry Chains
Mei Ri Jing Ji Xin Wen· 2025-07-22 06:15
Group 1
- The Shanghai Stock Exchange's Sci-Tech Innovation Board semiconductor materials and equipment index rose by 0.86%, with notable gains in constituent stocks such as Huahai Chengke (+3.56%) and Tianyue Advanced (+2.31%) [1].
- The Sci-Tech Semiconductor ETF (588170) rose by 0.69% to a latest price of 1.03 yuan, with a trading volume of 46.51 million yuan and a turnover rate of 17.59% [1].
- The ETF's latest scale reached 262 million yuan, a one-month high, and its number of shares also hit a one-month high at 257 million [1].
Group 2
- China Unicom's chairman said at the 2025 China Unicom Partner Conference that the company will work with AI technology firms to improve model architectures and learning mechanisms, focusing on vertical fields and scenario innovation [2].
- CITIC Securities believes that as large models continue to evolve, the industry will move towards general artificial intelligence, and that AI applications remain promising, particularly AI edge chips and modules [2].
- It continues to recommend sectors such as telecommunications operators and military communications [2].