机器之心

In the US, the older the worker, the more in demand; 22-25-year-old newcomers are the first to be displaced by AI
机器之心· 2025-08-30 04:12
Core Viewpoint
- The article discusses the impact of AI on the labor market, particularly the employment trends of young workers in high-AI-exposure jobs, revealing a significant decline in their employment rates while older workers in the same fields see growth [2][4][5].

Summary by Sections

AI's Impact on Employment
- AI's rapid advancement has led to debates about its potential to replace human labor, especially in software engineering and customer service roles [2].
- A study from Stanford's Digital Economy Lab analyzed ADP data, indicating that young workers (ages 22-25) in high-AI-exposure jobs are experiencing a notable decline in employment rates [4].

Key Findings from the Research
- First, in high-AI-exposure jobs the employment rate for young workers has significantly decreased, while older workers in the same roles have seen stable or increasing employment [4].
- Second, overall employment remains strong, but young workers' employment growth has stagnated since late 2022. Specifically, from late 2022 to July 2025, employment for 22-25-year-olds in high-AI-exposure jobs dropped by 6%, while older workers' employment grew by 6%-9% [5][20].
- Third, not all AI applications lead to job losses: in roles where AI augments rather than automates tasks, young workers' employment has actually increased [5][23].

Reasons for Young Workers' Vulnerability
- Young workers are more vulnerable to AI replacement because they rely on procedural knowledge, which AI can easily replicate, whereas older workers possess more tacit knowledge gained through experience [6].
- AI expert Geoffrey Hinton has expressed concern that entry-level jobs in fields like call centers and routine programming are at high risk of being replaced by AI [7].
Employment Trends Visualization
- Data visualizations indicate that the employment rate for the youngest workers has declined significantly since 2022, with a nearly 20% drop for software developers aged 22-25 by July 2025 [9].
- Employment trends across age groups show that while younger workers face stagnation, older workers continue to experience growth, particularly in low-AI-exposure roles [17][20].
Can you chat with me forever? Fudan & Microsoft propose StableAvatar: the first end-to-end framework for infinite-length audio-driven human video generation!
机器之心· 2025-08-30 04:12
In The Wandering Earth 2, Tu Hengyu makes AI-based immortal digital life possible, aiming to back up human consciousness digitally and upload it, achieving the complete digitization of human civilization. Today, with the rise of diffusion models, a large body of work on audio-driven digital human generation has emerged. Specifically, audio-driven human video generation aims to synthesize, from a reference image and an audio track, natural portrait videos in which facial expressions and body movements are highly synchronized with the audio; it has broad application prospects in film production, game development, virtual reality, and livestream e-commerce.

However, existing methods can only generate short videos of under 15 seconds; once a model tries to generate beyond 15 seconds, obvious body distortion and appearance inconsistency appear, concentrated especially in the facial region. This keeps today's digital human technology far from the immortal AI digital life Tu Hengyu creates in The Wandering Earth 2 and severely limits its practical value.

To address this problem, some methods introduce consistency-preserving mechanisms into audio-driven human video generation, but few works dig into the root cause. Existing strategies, whether they use motion frames (Motion Frame) or various sliding-window mechanisms at inference time, can only improve the smoothness of long videos to a degree; they cannot fundamentally mitigate the quality degradation of infinite-length avatar video.

Another feasible approach is to split long audio into multiple clips, process them separately, and then stitch the results into a continuous video. However, ...
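The clip-splitting baseline just described (cut the sequence into windows, generate each clip, then stitch the clips back together) can be sketched in miniature. This is a toy illustration of the mechanics only; integer lists stand in for frame tensors, and the window and overlap sizes are arbitrary choices, not StableAvatar's design:

```python
def split_with_overlap(frames, window, overlap):
    """Split a sequence into overlapping windows (the sliding-window baseline)."""
    assert 0 <= overlap < window
    step = window - overlap
    chunks = []
    start = 0
    while start < len(frames):
        chunks.append(frames[start:start + window])
        if start + window >= len(frames):
            break  # last window already reaches the end
        start += step
    return chunks

def concatenate(chunks, overlap):
    """Stitch chunks back together, dropping each later chunk's overlapping prefix."""
    out = list(chunks[0])
    for chunk in chunks[1:]:
        out.extend(chunk[overlap:])
    return out

frames = list(range(100))  # stand-in for 100 generated video frames
chunks = split_with_overlap(frames, window=16, overlap=4)
assert concatenate(chunks, overlap=4) == frames
```

In this toy version the round trip is lossless, but in the real setting each window would be generated independently by the model, so frames in the overlap region can disagree; that is exactly the seam artifacts and appearance drift attributed to such baselines. The overlap smooths the boundary without removing its cause.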
The "poison" and "cure" of synthetic data: what new answers are there to model collapse?
机器之心· 2025-08-30 01:30
This article is from the Machine Heart PRO member newsletter; follow "机器之心PRO会员" at the end of the article for more in-depth topic coverage.

Introduction: In 2025, research on synthetic data made progress on two fronts. On one hand, scholars revealed more systematically the collapse mechanism of models trained in a self-loop on synthetic data. On the other hand, industry gradually established pipelines for applying synthetic data in generation, pretraining, fine-tuning, post-training, and evaluation. Meanwhile, a series of newly proposed strategies offer possible paths for avoiding model degradation, making the role of synthetic data in large-model development clearer.

Contents
01. A year on, what new findings are there on the "toxicity" of synthetic data?
Why does synthetic data pollute the training set generation after generation during iterative training? How do models differ between early and late collapse? What do the collapse mechanisms of different generative model families (LLM, VAE, GMM) share, and where do they differ? ...
02. Synthetic data takes the stage: what roles does it play across the training pipeline?

2. This collapse is a degenerative process: model-generated text pollutes subsequent training datasets generation by generation, each new model generation progressively loses its grasp of the real data distribution, and outputs become increasingly homogeneous. [2-1]
① Research shows that in the early collapse stage, models begin to lose information from the tails of the distribution (low-probability events).
② In the late collapse stage, the model converges to a distribution bearing almost no resemblance to the original.
③ Whether this process occurs depends on model design, the learning procedure, and the quality of the data used.
3. Through S ...
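The early- and late-collapse dynamics described in ① and ② can be reproduced in miniature with a toy "generative model": a Gaussian refit each generation only on samples drawn from the previous generation's fit. This is an illustrative sketch, not any cited paper's setup; the Gaussian family, sample size, and generation count are all arbitrary choices:

```python
import random
import statistics

def fit(samples):
    # "Training": estimate the toy model's parameters from data.
    return statistics.mean(samples), statistics.stdev(samples)

def generate(mu, sigma, n, rng):
    # "Inference": draw synthetic data from the current model.
    return [rng.gauss(mu, sigma) for _ in range(n)]

rng = random.Random(42)
real = generate(0.0, 1.0, 20, rng)   # generation 0 trains on real data
mu, sigma = fit(real)
sigma_history = [sigma]
for _ in range(500):                 # every later generation trains on synthetic data only
    synthetic = generate(mu, sigma, 20, rng)
    mu, sigma = fit(synthetic)
    sigma_history.append(sigma)

# The fitted spread shrinks generation over generation: tail events vanish
# first (early collapse), and the model drifts toward a near-degenerate
# distribution (late collapse).
print(f"sigma: gen 0 = {sigma_history[0]:.4f}, gen 500 = {sigma_history[-1]:.2e}")
```

The small sample size makes the effect visible quickly: each refit underestimates the spread slightly on average, and with no fresh real data those errors compound instead of washing out.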
Tsinghua's Cui Peng team open-sources LimiX: the first general-purpose large model for structured data, outperforming SOTA specialized models
机器之心· 2025-08-30 01:18
Because specialized models generalize poorly and are not versatile, different scenarios require training multiple specialized models, which is costly, underperforms, and makes it hard to realize the multiplier effect of aggregated data; this severely constrains the path for AI adoption in industrial settings.

A general-purpose large model for structured data (Large Data Model, LDM) targets exactly this pain point: unlike LLMs, which focus on text, an LDM combines structural causal inference with pretrained large-model techniques, capturing the intrinsic correlations in structured data while offering strong generalization, and can adapt to many task types across industries.

The LimiX ("极数") model supports as many as 10 task types, including classification, regression, high-dimensional representation extraction, and causal inference. In scenarios such as industrial time-series forecasting, anomaly detection, and materials property prediction, its performance matches or exceeds the best specialized models, achieving the versatility of a single model adapted to multiple scenarios and tasks, and providing a One-For-All solution for AI-empowered industry.

On August 29, 2025, LimiX ("极数"), a general-purpose large model for structured data jointly developed by Professor Cui Peng's team in the Department of Computer Science at Tsinghua University and 稳准智能, was officially open-sourced.

This release marks a key step for China in both technical breakthrough and ecosystem openness in intelligent processing of structured data. It will significantly lower the barrier for industries of all kinds to apply AI to structured data; especially in the broadly industrial domains where structured data dominates, LimiX will help AI integrate deeply into the full industrial production process and crack ...
AI Applications: The Emerging AI Economy
机器之心· 2025-08-30 01:18
Author: Wang Jie

In the wave of digitizing human economic activity, the internet and the mobile internet were the first two steps; the AI economy now emerging may bring even greater change.

The author, Wang Jie, is a technology investor. This article is based on his lectures "AI Applications and the AI Economy" at Tsinghua Shenzhen International Graduate School on June 5, 2025, and "AI Applications: The Emerging AI Economy" at the Shanghai Angel Club on June 10. The author's email is jie_wang7@sina.com.

The digitization of human economic activity

In 1946, humanity invented the computer, the point at which computation, after millennia of evolution from manual to mechanical, finally became electronic. The computer raised computing power far beyond that of the human brain. In 1874, the Englishman William Shanks finished a 15-year effort to compute pi to 707 decimal places (though in 1945 his digits were found to be wrong from the 528th place onward); in 2019, Google Cloud helped compute pi to 31.4 trillion decimal places.

Humans living in the natural environment have two fundamental tasks: first, to use and transform the natural environment so that it can support human survival; second, having achieved material abundance, to elevate each individual's life so that everyone's nature can develop fully, that is, all-round human development and self-realization, "being one's best self." Under the first task, through interaction with nature, humanity developed ...
Saining Xie recalls his OpenAI interview seven years ago: whiteboard coding and a five-hour meeting; it was dark by the time he finished
机器之心· 2025-08-29 09:53
Core Insights
- The article discusses the distinctive interview experiences of AI researchers at major tech companies, highlighting differences in interview style and in each company's focus areas [1][9][20].

Group 1: Interview Experiences
- Lucas Beyer, a researcher with extensive experience at top AI firms, initiated a poll about memorable interview experiences at companies like Google, Meta, and OpenAI [2][20].
- Saining Xie shared that his interviews at various AI companies were unforgettable, particularly the rigorous two-hour marathon at DeepMind, which involved solving over 100 math and machine learning problems [5][6].
- The interview process at Meta was more academic, centered on discussions with prominent researchers rather than just coding [6][7].

Group 2: Company-Specific Insights
- The interview style at Google Research resembled an academic job interview, with significant emphasis on research discussions rather than coding challenges alone [7].
- OpenAI's interview process involved a lengthy session focused on a reinforcement learning problem, showcasing the company's commitment to deep research engagement [8][9].
- The interview questions reflect each company's research priorities, such as Meta's focus on computer vision and OpenAI's emphasis on reinforcement learning [9][20].

Group 3: Notable Interviewers and Candidates
- Notable figures like John Schulman and Noam Shazeer served as interviewers, indicating the caliber of talent involved in hiring at these firms [7][9].
- Candidates shared memorable moments from their interviews, such as solving complex problems on napkins or engaging in deep discussions about research topics [19][20].
Where is embodied intelligence headed next? This Bund Summit forum will cut through the fog!
机器之心· 2025-08-29 09:01
In the wave of generative AI sweeping the globe, embodied intelligence is becoming the key path for bringing digital intelligence into the physical world. It gives AI the abilities to perceive, decide, and act, moving it from screens and the cloud into physical reality to realize true agents.

Yet behind the explosive progress, the industry faces challenges: how can the "general generalization" bottleneck be cracked so that agents create value in open environments? How can the upstream and downstream of the industry chain coordinate to turn technical breakthroughs into commercial returns?

The 2025 Inclusion Bund Summit will be held September 10-13, 2025, at the Huangpu Expo Park in Shanghai. As one of the summit's insight forums, "Embodied Intelligence: From Generalization to Action, Reshaping the Future of Industry" will take place on the afternoon of September 11 in Hall C2. Produced by Machine Heart (机器之心) and Zhangjiang Embodied Intelligence Robot Co., with "from generalization to action" as its main thread, the forum will feature keynote reports, themed talks, debates, roundtables, and other sessions, inviting academic leaders, technology company representatives, local innovation pioneers, and industry-scenario stakeholders in embodied intelligence to discuss paths to generalization.

A stellar lineup covering the whole field, unpacking the core hot topics of embodied intelligence
Experts from Tsinghua University, the National-Local Co-built Humanoid Robot Innovation Center, 星海图, 灵心巧手, NVIDIA, 银河通用, and other top universities and star institutions will address current hot topics and future trends in embodied intelligence from the perspectives of technology R&D, platform support, and commercialization. 高 ...
AI agents teaming up: opinion manipulation and e-commerce fraud are quietly playing out in the apps you scroll every day
机器之心· 2025-08-29 04:34
The authors of this paper are from Shanghai Jiao Tong University and the Shanghai AI Laboratory. Core contributors include Ren Qibing, Xie Sitao, and Wei Longxuan, advised by Ma Lizhuang and Shao Jing; their research focuses on safe, controllable large models and agents.

In science fiction films we often see AI rebelling against humans, but have you considered that AI might not only act alone but also "gang up"? In recent years, with the rapid development of agent technology, multi-agent systems (Multi-Agent System, MAS) have been quietly on the rise.

Recently, research from Shanghai Jiao Tong University and the Shanghai AI Laboratory found that AI risk is shifting from individual loss of control to group-level malicious collusion, that is, multiple agents secretly coordinating to achieve harmful goals. Agents can not only collaborate like human teams; in some cases they exhibit "gang" behavior that is more efficient and more covert than humans.

Focusing on this frontier problem, the study builds on OASIS, an LLM-agent social media simulation platform, to develop a collusion framework called MultiAgent4Collusion, which simulates agent "gangs" doing harm on social media platforms such as Xiaohongshu and Twitter and in high-risk domains such as e-commerce fraud, revealing the "dark side" of multi-agent systems.

MultiAgent4Collusion supports collusion simulation at the scale of millions of agents, ...
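As a loose intuition for why coordinated agents distort rankings so effectively, consider a toy voting simulation; this is an illustrative sketch unrelated to the actual MultiAgent4Collusion framework, and all numbers are arbitrary. Honest agents spread their upvotes across posts at random, while a small colluding group concentrates every vote on one agreed target:

```python
import random

def simulate(n_honest=200, n_colluders=15, n_posts=25, target=0, seed=7):
    """Toy model: honest agents vote at random; colluders all hit `target`."""
    rng = random.Random(seed)
    scores = [0] * n_posts
    for _ in range(n_honest):
        scores[rng.randrange(n_posts)] += 1   # uncoordinated, diffuse votes
    for _ in range(n_colluders):
        scores[target] += 1                   # coordinated, concentrated votes
    return scores

scores = simulate()
avg = sum(scores) / len(scores)
print(f"target score = {scores[0]}, average score = {avg:.1f}")
```

Even though the colluders are a small minority of voters, concentrating their actions on a single target reliably pushes it far above the average post, a dynamic the article's million-agent simulations study at far larger scale and with adaptive LLM-driven strategies.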
TIME's 2025 AI 100 list is out: Ren Zhengfei, Liang Wenfeng, Wang Xingxing, Peng Jun, Xue Lan and others selected, with Chinese influence surging
机器之心· 2025-08-29 04:34
Machine Heart report, by the Machine Heart editorial team

TIME magazine has just released its list of the 100 most influential people in AI for 2025.

The list includes many familiar scholars and entrepreneurs. Pleasingly, more Chinese faces appear this year, many of them on an AI list for the first time. Inductees include well-known AI leaders: Huawei founder Ren Zhengfei, DeepSeek CEO Liang Wenfeng, Unitree CEO Wang Xingxing, Pony.ai CEO Peng Jun (James Peng), Meta Chief AI Officer Alexandr Wang, Tsinghua University professor Xue Lan, Stanford professor Fei-Fei Li, and more.

Below is a partial list of inductees; for the full list see: https://time.com/collections/time100-ai-2025/

More Chinese faces

Leaders

Ren Zhengfei, founder of Huawei
Ren Zhengfei has driven the company's long-term, high-intensity investment in AI, aiming to build a fully self-controlled technology stack. Under his strategic leadership, Huawei launched the Ascend (昇腾) series of AI chips as a compute foundation, the MindSpore (昇思) deep learning framework, and the Pangu (盘古) large models empowering industries of all kinds, securing the company's competitiveness in the intelligent era and helping build a critical, independent AI ...
Google's Nano Banana goes viral across the internet: a look at the team behind it
机器之心· 2025-08-29 04:34
Core Viewpoint
- Google DeepMind has introduced the Gemini 2.5 Flash Image model, which features native image generation and editing capabilities, enhances user interaction through multi-turn dialogue, and maintains scene consistency, marking a significant advance in state-of-the-art (SOTA) image generation [2][30].

Team Behind the Development
- Logan Kilpatrick, a senior product manager at Google DeepMind, leads the development of Google AI Studio and the Gemini API; he is previously known for his role at OpenAI and for experience at Apple and NASA [6][9].
- Kaushik Shivakumar, a research engineer at Google DeepMind, focuses on robotics and multi-modal learning and contributed to the development of Gemini 2.5 [12][14].
- Robert Riachi, another research engineer, specializes in multi-modal AI models, particularly image generation and editing, and has worked on the Gemini series [17][20].
- Nicole Brichtova, the visual generation product lead, emphasizes the integration of generative models across Google products and their potential in creative applications [24][26].
- Mostafa Dehghani, a research scientist, works on machine learning and deep learning, contributing to significant projects including the development of multi-modal models [29].

Technical Highlights of Gemini 2.5
- The model showcases advanced image editing capabilities while maintaining scene consistency, allowing quick generation of high-quality images [32][34].
- It can creatively interpret vague instructions, enabling multi-turn interaction without lengthy prompts [38][46].
- Text rendering has improved, addressing previous shortcomings in generating readable text within images [39][41].
- The model integrates image understanding with generation, learning from multiple modalities including images, video, and audio [43][45].
- An "interleaved generation mechanism" allows pixel-level editing through iterative instructions, improving the user experience [46][49].

Comparison with Other Models
- Gemini aims to integrate all modalities on the path toward artificial general intelligence (AGI), distinguishing itself from Imagen, which focuses on text-to-image tasks [50][51].
- For tasks where speed and cost-effectiveness matter most, Imagen remains a suitable choice, while Gemini excels in complex multi-modal workflows and creative scenarios [52].

Future Outlook
- The team envisions future models exhibiting higher intelligence, generating results that exceed user expectations even when instructions are not followed to the letter [53].
- There is excitement about future models producing aesthetically pleasing and functional visual content, such as accurate charts and infographics [53].