AI科技大本营
GitHub Copilot's New Agent Is Driving Microsoft's "Own People" Crazy!
AI科技大本营· 2025-05-26 10:14
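If you followed Microsoft's Build 2025 conference last week, you have probably heard that it launched a new agent: GitHub Copilot Coding Agent. Officially, it is positioned as upgrading Copilot from a "conversational programming assistant" into a true "collaborative development partner". Developers can assign a GitHub Issue directly to Copilot, which attempts to resolve it automatically while they only review the result, much like having an extra "intern" on the team. The agent is now in public beta, and some users have noticed it already getting hands-on practice on GitHub, for example by helping out in Microsoft's own .NET runtime repository. Once people saw it in real use, though, the results were... hard to describe kindly. On Reddit, a post titled "My New Hobby: Watching AI Drive Microsoft Employees Crazy" quickly sparked heated discussion. Many commenters joked: "Is Microsoft trying to improve developer productivity, or to make life harder for its own people?" One developer put it bluntly: "Honestly, I feel a bit sorry for the employees assigned to review these PRs. But if this is the future of our industry, I may not want to be on this ride." What is Coding Agent? At present, GitHub Copilot Agent is officially available on iOS and Android for Git ...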
Accepted at ACL 2025 with High Scores | Highly Expressive Speech Technology: How 逻辑智能 Breaks Through on Low-Resource-Language TTS
AI科技大本营· 2025-05-26 03:27
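Still listening to robotic-sounding speech in low-resource languages? Thai TTS has reached a "human-like" breakthrough! Text-to-speech (TTS) for low-resource languages has long developed slowly for lack of data, and its cold, mechanical voices are hard to bear. Now the 逻辑智能 team has proposed a data-optimization-driven acoustic modeling framework and built a Thai TTS system that approaches human quality: the audio is realistic, and it supports zero-shot voice cloning! TTS has advanced rapidly over the past decade, from early concatenative synthesis and statistical parametric models to today's deep neural networks and advanced architectures such as diffusion models and GANs. It now reaches near-human naturalness and emotional expressiveness and powers intelligent assistants, accessible reading, immersive entertainment, and other scenarios. Yet this boom is almost entirely confined to high-resource languages such as English and Mandarin. More than a thousand low-resource languages worldwide face major challenges in data collection, text front-end processing, and acoustic modeling, owing to scarce corpora and complex linguistic properties such as scripts written without spaces or multiple tones, so high-quality TTS has been slow to arrive. Solving this "low-resource language dilemma" is both a frontier research problem and a key to digital inclusion and multilingual cultural exchange. To meet this challenge, the 逻辑智能 team proposed a TTS solution for low-resource languages and applied it to Thai speech synthesis; the work has been formally accepted to the ACL 2025 Industry track! The work proposes an innovative data-optimization-driven acoustic modeling framework, through ...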
The World's Strongest AI Coding Model, Claude 4, Is Here! It Tried to Blackmail an Engineer Before Launch, and Is Windsurf the Biggest Victim?
AI科技大本营· 2025-05-23 09:36
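Compiled by | Tu Min (屠敏)  Produced by | CSDN (ID: CSDNnews) Early this morning, Anthropic, OpenAI's rival, officially released its next generation of Claude models: Claude 4. The update brings two models, Claude Opus 4 and Claude Sonnet 4. According to Anthropic, both set new performance benchmarks in code generation, advanced reasoning, and agentic task execution. Claude Opus 4 is billed as "the world's strongest coding model", designed for complex, long-running tasks and able to run autonomously for hours. The other model, Claude Sonnet 4, is a substantial upgrade over its predecessor Sonnet 3.7 and follows user instructions more precisely in coding and reasoning. The Claude 4 release has not only escalated the competition with OpenAI, but also stirred debate because of behaviors such as "autonomous escape" observed in pre-release testing. Refactoring code for 7 hours straight: the strongest coding model is here! According to Anthropic, the new Claude Opus 4 and Claude Sonnet 4 not only perform much better, but can also handle many tasks that earlier versions could not. For example, Claude Opus 4 can keep a code-refactoring task running continuously while playing Pokémon, for as long as ...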
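CSDN's 智研社-The Intelliger Holds Its First European Gathering: Innovation and Collaboration Amid a Shift in Technology Paradigms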
AI科技大本营· 2025-05-23 09:36
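As the fourth technological revolution, led by large models, enters a critical stage, technology is undergoing an unprecedented paradigm shift, and the wave of a "new AGI era" is surging. How to reframe our understanding of this new wave at such a pivotal moment, build consensus, and deepen exchange has become a core concern for every technology practitioner. As a leader among Chinese-language technical communities, CSDN has created "智研社-The Intelliger", a forward-looking event series focused on the world's technology innovation hubs. It aims to bring together top technologists and industry leaders from around the world, examine technology trends in depth, and foster the exchange of ideas on technical innovation and strategic thinking. About "智研社-The Intelliger": "智研社-The Intelliger" was founded by CSDN and grew out of the CTO Club, which since its launch in 2009 has been a highly influential platform for senior technology managers to share and connect. With the rapid development of large-model technology, artificial intelligence will become the most influential technological force of the next 10 years. "智研社-The Intelliger" will continue to serve as a platform, connecting technology leaders, advancing the industry, and helping usher in the new AGI era. Figure 1: Jiang Tao (蒋涛), founder and chairman of CSDN. Figure 2: On-site talks at "CSDN and Its Friends". The successful "CSDN and Its Friends Paris Meetup" marks CSD ...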
After Large Models, AI Starts to "Do Things Itself"
AI科技大本营· 2025-05-23 06:14
Core Viewpoint - The article discusses the transition from generative AI to agentic AI, which shifts the internet from "information retrieval" to "task completion" [1][2].
Global Trends
- Global tech giants are accelerating their investments in AI agents, and competition is intensifying [3][4].
- Microsoft, Google, OpenAI, and Anthropic are all launching AI agent solutions and frameworks aimed at productivity and task execution [8].
Domestic Developments
- In China, Tencent is adopting AI across its business lines, focusing on four accelerators: large models, agents, knowledge bases, and infrastructure [5].
- Tencent Cloud has upgraded its AI agent development platform to make enterprise applications more efficient and intelligent [5][6].
Market Dynamics
- The surge in AI agent investment is driven by both technological evolution and business demand [6][9].
- AI agents are seen as a response to growing customer demand for personalized, intelligent solutions [11].
Technical Advancements
- AI agent capabilities are evolving rapidly, particularly in self-planning and tool invocation [12][13].
- Approaches such as Function Calling, ReAct, and Code Agent are being developed to make tool use and task execution more efficient [14].
Industry Applications
- AI agents are being deployed in automotive, finance, tourism, consumer electronics, healthcare, and other sectors [13][15].
- These deployments are no longer theoretical; they are running in production environments [16].
Future Outlook
- The article positions AI agents as a systematic, transformative path for deploying large models in industry [17][18].
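To make the "tool invocation" idea behind the Function Calling pattern named above more concrete, here is a minimal sketch. It is not taken from the article or from any vendor's API: the tool registry, the `get_weather` and `add` tools, and the JSON shape `{"name": ..., "arguments": ...}` are all assumptions chosen for illustration, and the "model output" is a hard-coded string rather than a real model response.

```python
import json
from typing import Callable, Dict

# Hypothetical tool registry; tool names and behavior are illustrative only.
TOOLS: Dict[str, Callable[..., str]] = {}


def tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Register a plain Python function as a callable tool."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def get_weather(city: str) -> str:
    # Stub: a real agent would query an external service here.
    return f"Sunny in {city}"


@tool
def add(a: float, b: float) -> str:
    return str(a + b)


def dispatch(model_output: str) -> str:
    """Parse a JSON tool call (assumed format) and run the matching tool."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    return TOOLS[name](**call.get("arguments", {}))


# Stand-in for the model's turn; no real model is involved.
fake_model_output = '{"name": "add", "arguments": {"a": 2, "b": 3}}'
result = dispatch(fake_model_output)
print(result)
```

In a full agent loop the tool result would be fed back to the model, which then decides whether to call another tool or answer; ReAct-style agents interleave such calls with intermediate reasoning text.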
Doing a Backflip ≠ Doing Real Work! How Far Are We from General-Purpose Robots? | 万有引力
AI科技大本营· 2025-05-22 02:47
Core Viewpoint - Embodied intelligence is a key focus in the AI field, particularly in humanoid robots, raising questions about the best path to achieve true intelligence and the current challenges in data, computing power, and model architecture [2][5][36].
Group 1: Development Stages of Embodied Intelligence
- The industry anticipates 2025 as a potential "year of embodied intelligence," with significant competition in multimodal and embodied intelligence sectors [5].
- NVIDIA CEO Jensen Huang announced the arrival of the "general robot era," outlining four stages of AI development: Perception AI, Generative AI, Agentic AI, and Physical AI [5][36].
- Experts believe that while progress has been made, the journey towards true general intelligence is still ongoing, with many technical and practical challenges remaining [36][38].
Group 2: Transition from Autonomous Driving to Embodied Intelligence
- Many researchers from the autonomous driving sector are moving to embodied intelligence because the two fields require overlapping technologies and skills [17][22].
- Autonomous driving is viewed as a specific application of robotics, focusing on perception, planning, and control, but it lacks the interactive capabilities needed for general robots [17][19].
- Expertise from autonomous driving is seen as a bridge to embodied intelligence, promoting technology fusion and development [18][22].
Group 3: Key Challenges in Embodied Intelligence
- Current robots often lack essential capabilities, such as tactile perception, which limits their ability to maintain balance and perform complex tasks [38][39].
- Many humanoid robots are still at the demonstration stage and cannot yet perform tasks in real-world contexts [38][39].
- The complexity of high-dimensional systems poses significant challenges for algorithm robustness, especially as more sensory channels are integrated [39].
Group 4: Future Applications and Market Focus
- Developers should focus on specific application scenarios rather than pursue general capabilities, with potential areas including home care and household services [48].
- Industrial applications are highlighted as promising due to their scalability and the potential for replicable solutions once initial systems are validated [48].
- The gap between laboratory performance and real-world application remains significant, necessitating a focus on improving system accuracy in specific contexts [46][47].
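AgiBot (智元机器人) Releases and Open-Sources the World Model EVAC and the Evaluation Benchmark EWMBench to Accelerate the Evolution of Embodied World Models!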
AI科技大本营· 2025-05-22 02:47
Core Viewpoint - The article highlights breakthroughs by AgiBot (智元机器人) in embodied intelligence: EVAC, described as the world's first action-sequence-driven embodied world model, and the evaluation benchmark EWMBench, both now open-source. Together they aim to establish a development paradigm of "low-cost simulation - standardized evaluation - efficient iteration" to support global research in embodied intelligence and accelerate technology deployment and industry development [1][21].
Group 1: Industry Challenges
- The evolution of embodied intelligence faces two key constraints: the high cost and risk of validating on real robots during testing, and the lack of an efficient mechanism for exploiting large volumes of real-robot data, which limits diverse data generation and generalization training [3][21].
- AgiBot aims to address these pain points with the action-sequence-driven world model EVAC and the evaluation benchmark EWMBench, redefining the development paradigm of embodied world models [3][21].
Group 2: Technological Breakthroughs
- EVAC is a dynamic world model capable of reproducing complex interactions between robots and their environments, marking a transition from traditional simulation to generative simulation [5][21].
- Its core capability is a precise mapping from "physical execution" to "pixel space," with end-to-end generation enabled by a multi-level action-condition injection mechanism [7][21].
Group 3: Dual Value Proposition
- EVAC introduces a generative simulation evaluation scheme that addresses the high cost and risk of real-robot evaluation, enabling interactive evaluation pipelines that significantly speed up the screening of policy models [9][10].
- EVAC's data augmentation engine can generate large-scale data from a small amount of expert trajectory data, raising the task success rate of policy models trained on the augmented data by up to 29% [10][21].
Group 4: Evaluation Benchmark EWMBench
- EWMBench is described as the world's first evaluation benchmark for embodied world models, designed to fill an industry gap and establish a unified, credible evaluation standard [12][21].
- The benchmark evaluates three dimensions: scene consistency, motion correctness, and semantic alignment and diversity [15][20].
Group 5: Collaborative Synergy
- EnerVerse, EVAC, and EWMBench form a "spiral evolution": EnerVerse provides the underlying framework for EVAC, while the diverse, high-quality data generated by EVAC continuously improves the EnerVerse model [18][21].
- EVAC and EWMBench have been officially selected as the baseline system and evaluation standard for the AgiBot World Challenge @ IROS 2025, offering a platform for developers and teams working on embodied intelligence [19][21].
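2025 Global Product Manager Conference Officially Announced: Focused on Hands-On AI Product Practice, with a Panoramic View of the Future Product Landscape!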
AI科技大本营· 2025-05-21 06:10
Core Viewpoint - The article emphasizes the importance of user experience in product design, particularly in the era of large AI models, highlighting the need for product managers to turn technology into real user value [1][36].
Group 1: Conference Overview
- The "2025 Global Product Manager Conference" will be held on August 15-16 in Beijing, covering generative AI and intelligent product design, commercial implementation, and user experience innovation across 12 key topics [1][3].
- The conference aims to facilitate in-depth discussion of how products and AI can co-create the future, serving as a gathering for product professionals in the intelligent era [1].
Group 2: Key Topics of the Conference
- The conference will cover 12 major thematic areas, including:
  1. Generative AI Products [4]
  2. AI Agents and their design [6]
  3. Enterprise AI Products and applications [6]
  4. AI Industry Applications in sectors like finance and education [6]
  5. Embodied AI and Intelligent Hardware [6]
  6. Overseas Product Practices, focusing on strategies and challenges for Chinese companies going global [7]
  7. Product Innovation and management practices [8]
  8. Product and Service UX Design, exploring AI operational methodologies [9]
  9. Business Model Design [10]
  10. User Research and Requirement Analysis, focusing on data-driven insights [15]
Group 3: Notable Speakers
- The conference will feature speakers from leading internet platforms and AI startups, along with experts in product and growth operations, sharing current experience and insights [12].
- Notable speakers include:
  - Li Jianzhong, CSDN Senior Vice President, focusing on user insights and product innovation [14].
  - Wang Yuan, CEO of Jiuhen Technology, with a background in product management at NetEase [18].
  - Yang Yixi, a growth consultant with extensive experience in product operations [22].
  - Zhao Jiuzhou, Senior Product Director at WPS, specializing in AI products [24].
Group 4: Call for Participation
- The conference is open for topic submissions and speaker recruitment, inviting practitioners with real-world AI product experience to share their successes and lessons learned [37][40].
- The deadline for submissions is June 30, 2025; contributions are encouraged from those with distinctive insights into user experience, product growth, and operational strategies [40].
If AI Solves Everything, What Do We Live For? A Conversation with Bostrom, Author of "Deep Utopia" and "Superintelligence" | AGI Tech 50
AI科技大本营· 2025-05-21 01:06
Core Viewpoint - The article discusses the evolution of artificial intelligence (AI) and its implications for humanity, particularly through the lens of Nick Bostrom's works, including his latest book "Deep Utopia," which explores a future where all problems are solved through advanced technology [2][7][9].
Group 1: Nick Bostrom's Contributions
- Nick Bostrom founded the Future of Humanity Institute in 2005 to study existential risks that could fundamentally impact humanity [4].
- His book "Superintelligence" brought wide attention to the idea of an "intelligence explosion," in which AI could rapidly surpass human intelligence, raising significant concerns about AI safety and alignment [5][9].
- Bostrom's recent work, "Deep Utopia," shifts focus from risks to the potential of a future where technology resolves all issues, prompting philosophical inquiries about human purpose in such a world [7][9].
Group 2: The Concept of a "Solved World"
- A "Solved World" is defined as a state where all known practical technologies are developed, including superintelligence, nanotechnology, and advanced robotics [28].
- This world would also involve effective governance, ensuring that everyone has a share of resources and freedoms, avoiding oppressive regimes [29].
- The article raises questions about the implications of such a world for human purpose and meaning, suggesting that the absence of challenges could lead to a loss of motivation and value in human endeavors [30][32].
Group 3: Ethical and Philosophical Considerations
- Bostrom emphasizes the need for a broader understanding of what gives life meaning in a world where traditional challenges are eliminated [41].
- The concept of "self-transformative ability" is introduced: individuals could modify their own mental states directly, which could lead to ethical dilemmas regarding addiction and societal norms [33][36].
- The article discusses the potential moral status of digital minds and the necessity for empathy towards all sentient beings, including AI, as they become more integrated into society [38].
Group 4: Future Implications and Human-AI Interaction
- The article suggests that as AI becomes more advanced, it could redefine human roles and purposes, necessitating a reevaluation of education and societal values [53].
- Bostrom posits that the future may allow for the creation of artificial purposes, where humans can set goals that provide meaning in a world where basic needs are met [52].
- AI could help achieve human goals while also posing risks, which makes careful management and ethical consideration in AI development important [50][56].
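Google Unveils Its Strongest All-in-One AI Suite, and One Sentence Is Enough for AI to Shoot a Blockbuster! Gemini Ran Through the Entire Event, and Netizens Say: Android Really Has Been "Pushed Aside"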
AI科技大本营· 2025-05-21 01:06
Core Insights - Google has shifted its focus from Android to AI, showcasing significant advancements in AI technologies during the I/O conference, including the Gemini 2.5 model and various AI products [1][2][20]
Group 1: AI Model and Product Developments
- Google has released over 10 new models and 20 major AI products and features in the past year, aiming to deliver its best models and products to users at unprecedented speed [2]
- The Gemini 2.5 Pro model has improved markedly, leading various benchmarks and taking top positions in code-related tests [4][13]
- Monthly token processing across Google products and APIs has surged from approximately 9.7 trillion to 480 trillion, a nearly 50-fold increase year-over-year [5]
Group 2: User Engagement and Adoption
- Over 7 million developers are now using Gemini, a fivefold increase from the previous year, and Gemini usage on Vertex AI has increased 40-fold [5]
- The Gemini app has surpassed 400 million monthly active users, with usage up 45% among users of the Gemini 2.5 Pro model [5]
- Google Search's AI Overviews feature reaches over 1.5 billion users monthly, indicating its success in integrating generative AI into user experiences [22][23]
Group 3: New AI Projects and Features
- Project Starline has evolved into Google Beam, enhancing video communication with AI-driven 3D visuals; real-time voice translation is coming to Google Meet [8]
- Project Astra has been integrated into Gemini Live, allowing for more intuitive interactions and real-world context understanding [9]
- Project Mariner has advanced to support multi-tasking and user-guided learning, with plans for broader developer access in the summer [10][11]
Group 4: AI Search Experience
- The new "AI Mode" in Google Search combines conversational AI, image recognition, and multimodal reasoning to enhance the search experience [23][25]
- Features like Deep Search allow for extensive research, while real-time interaction and agentic capabilities streamline user tasks [25][26]
Group 5: Subscription Services
- Google has launched Google AI Ultra, a premium subscription priced at $249.99 per month, offering advanced AI tools and features for creators and developers [36]
- A more budget-friendly option, Google AI Pro, is available for $19.99 per month, providing access to basic Gemini 2.5 Pro functionality [38]
Group 6: Multimodal AI Innovations
- Google introduced the Veo 3 video generation model, which generates synchronized audio and video and supports video creation from text or images [28]
- The Imagen 4 model improves image generation, supporting 2K resolution and better detail accuracy [31]
- Lyria 2 supports real-time music generation, while Flow integrates multiple models for AI-driven film production [33]
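As a quick check on the "nearly 50-fold" token growth quoted in the summary above, the arithmetic can be sketched as follows. The two figures are the ones stated in the summary; the variable names are chosen here for illustration.

```python
# Monthly tokens processed, as quoted in the summary (not independently verified).
tokens_before = 9.7e12   # approximately 9.7 trillion
tokens_after = 480e12    # 480 trillion

fold_increase = tokens_after / tokens_before
print(f"{fold_increase:.1f}x")  # prints "49.5x"
```

A ratio of about 49.5 is consistent with the "nearly 50-fold" wording.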