量子位
China's latest Agent product trends: multi-agent collaboration, vertical tracks, and core industry business | 量子位智库 AI 100
量子位· 2025-10-19 04:10
Core Insights
- The article discusses the rapid evolution and application of Agent products in various industries, highlighting their transition from general tools to specialized "intelligent partners" that address specific pain points in sectors like research and investment [3][4].
Group 1: Agent Product Development
- Agent technology is maturing, evolving from single-point intelligence to systematic intelligent collaboration, aiming for more efficient and stable task processing capabilities [3].
- The integration of cloud services with local operating systems allows for seamless user workflows and personalized services [3].
Group 2: Market Trends
- There is a clear trend of Agent products embedding into various business processes across industries, enhancing automation and providing tailored solutions [3][4].
- The latest AI100 list features seven Agent products, indicating a growing market presence and competition [5].
Group 3: Notable Agent Products
- Kimi, a tool for enhancing professional and learner capabilities, recorded nearly 30 million web visits in September [8][9].
- MiniMax combines chat and Agent functionalities, offering end-to-end solutions across various fields [10].
- The "扣子空间" from ByteDance serves as a professional AI work assistant, supporting deep writing and data analysis tasks [11].
- AutoGLM provides a cloud-based Agent platform for seamless task execution across applications [14].
- Bobby, an investment trading AI Agent, generates personalized trading strategies based on user preferences and market data [42].
Musk proposes a human-vs-AI coding duel! Karpathy says no
量子位· 2025-10-19 04:10
Core Viewpoint
- The article discusses the interaction between Elon Musk and Andrej Karpathy, highlighting Karpathy's refusal to compete with Musk's AI model, Grok 5, and the implications of their relationship in the context of AI development and collaboration [2][12][39].
Group 1: Interaction Dynamics
- Musk invited Karpathy to a programming duel with Grok 5, reminiscent of the famous chess match between Kasparov and Deep Blue [1][11].
- Karpathy declined the challenge, stating that competing would diminish his value, as he sees more merit in collaboration than competition [2][12].
- The online community expressed eagerness to see a showdown between Karpathy and Grok 5, speculating on the potential outcomes and implications for AI development [16][20].
Group 2: Historical Context
- Karpathy was a key figure at Tesla, where he significantly expanded the AI and Autopilot team and contributed to the development of Tesla's autonomous driving capabilities [33].
- After leaving Tesla in July 2022, he briefly joined OpenAI before founding his own AI education company, Eureka Labs [34][39].
- Despite their professional separations, Musk and Karpathy have maintained a positive relationship, with Musk frequently expressing admiration for Karpathy's skills and contributions [37][39].
Group 3: Future Speculations
- There is speculation about whether Karpathy will return to work with Musk, especially given Musk's ongoing interest in Karpathy's expertise and the potential for collaboration in AI [28][30].
- The article suggests that the future of their relationship could involve either continued independent pursuits by Karpathy or a possible reunion with Musk's ventures [39].
Karpathy: Reinforcement learning is terrible, but everything else is even worse
量子位· 2025-10-18 09:30
Group 1
- The core viewpoint of the article is that achieving Artificial General Intelligence (AGI) will take at least another decade, as current AI systems need significant improvements to reach their full potential [5][10][28].
- Karpathy emphasizes that existing AI systems lack maturity, multi-modal capabilities, and the ability to learn continuously, which are essential for them to function effectively in collaboration with humans [8][9][10].
- He critiques the current state of Large Language Models (LLMs), stating that they have cognitive deficiencies and overestimate their capabilities, requiring substantial enhancements [16][18].
Group 2
- Karpathy argues that reinforcement learning is more flawed than commonly perceived, as it reinforces all steps taken in reaching a correct answer, regardless of their validity, leading to inefficient learning [20][21][23].
- He believes that AGI will not lead to a sudden leap in productivity but will follow a gradual growth pattern, similar to the historical 2% GDP growth trend observed with the internet [25][29].
- The lengthy development of autonomous driving technology is attributed to the high stakes involved, where even minor errors can have severe consequences, necessitating extensive reliability improvements [30][32][33].
Group 3
- As a full-time educator, Karpathy aims to establish a leading-edge educational institution that offers a unique mentorship experience, focusing on personalized learning and advanced AI education [34][36].
- He highlights the importance of tailored teaching methods, which current LLMs cannot replicate, emphasizing the need for human instructors to provide appropriate challenges to students [36][38].
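The reinforcement-learning criticism summarized above (every step on the way to a correct answer gets reinforced, valid or not) can be illustrated with a toy sketch. This is only an illustration under stated assumptions, not code from the interview or from any RL library: it assumes an outcome-only reward that is copied to every step of a trajectory, and the function name `outcome_credit` and the example trajectories are hypothetical.

```python
from typing import List, Tuple

Step = str
Trajectory = Tuple[List[Step], bool]  # (reasoning steps, whether the final answer was correct)


def outcome_credit(trajectories: List[Trajectory]) -> List[List[Tuple[Step, float]]]:
    """Give every step the reward of its trajectory's final outcome.

    Assumption for illustration: reward is 1.0 for a correct final answer
    and 0.0 otherwise, with no per-step judgment of validity.
    """
    credited = []
    for steps, final_correct in trajectories:
        reward = 1.0 if final_correct else 0.0
        credited.append([(step, reward) for step in steps])
    return credited


# Hypothetical trajectories: the first reaches the right answer despite a bad step,
# the second is mostly sound but ends wrong.
trajectories = [
    (["set up equation", "arithmetic slip", "lucky cancellation", "final answer"], True),
    (["set up equation", "correct algebra", "typo in last line"], False),
]

for credited_steps in outcome_credit(trajectories):
    print(credited_steps)
```

In this toy setup the "arithmetic slip" step receives the same positive reward as the valid steps around it, while the "correct algebra" step in the failed attempt receives none, which is the inefficiency the summary describes.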
量子位 internship recruitment | AI academic editor intern, on-site or remote
量子位· 2025-10-18 09:30
Core Viewpoint
- The article emphasizes the rapid updates in the AI academic field and the recruitment of an intern to assist in managing the latest AI research papers and findings [1].
Group 1: Recruitment Information
- The company is looking for a quick and attentive academic editor intern focused on AI to help with content organization and submission of recent AI research papers [1][2].
- The intern will be involved in editing AI and computer science academic papers, assisting in content selection, abstract extraction, and media dissemination [5].
- The internship requires a minimum commitment of three months, with flexible working arrangements, and offers a stipend and recommendation opportunities [5][6].
Group 2: Company Overview
- Quantum Bit (量子位) has over 2.3 million subscribers on WeChat and more than 7 million users across the internet, with a daily reading volume exceeding 2 million [3].
- The company is recognized as a top media outlet in the AI and technology sector, receiving accolades from various mainstream media platforms [4][8].
- Quantum Bit is a strategic partner in major industry conferences and is part of influential organizations in the AI field, enhancing its credibility and reach [8].
Group 3: Company Culture and Values
- The company promotes a culture driven by curiosity, encouraging individuals to explore and share new technological advancements [10][11].
- It values diverse educational backgrounds, focusing on curiosity and the ability to act on it rather than specific academic qualifications [9][10].
AI bridges first-person and third-person vision, setting a new SOTA in cross-view visual understanding | ICCV 2025 Highlight
量子位· 2025-10-18 09:30
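Contributed by the ObjectRelator team
量子位 | WeChat official account QbitAI
Embodied intelligence has taken a key step toward real-world deployment: AI now has a kind of "synesthesia" between first-person and third-person views!
INSAIT, Fudan University, and other institutions have jointly proposed the ObjectRelator framework, which lets AI accurately match the same object across different viewpoints and achieve unified cross-view representation and understanding.
In experiments, ObjectRelator significantly outperformed all baseline models on both the Ego (first-person view) to Exo (third-person view) task and the Exo to Ego task, achieving SOTA.
[Figures: Ego→Exo results; Exo→Ego alignment results]
The work has been accepted to ICCV 2025 as a Highlight paper, and the code is open source.
The gap between Ego and Exo
Acquiring a skill requires humans to switch fluidly between the two viewpoints.
When we watch someone else's demonstration, we try to imagine ourselves performing the same operations. For computers and robots, however, this kind of cross-view understanding is a major challenge, and it holds back key areas such as robot learning and VR interaction.
The first-person view is highly immersive and captures interaction detail well, so it can precisely depict the dynamic interaction between the subject and the environment. However, its field of view is limited and its footage is less stable, which makes it hard to reflect the full scene.
...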
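A first-of-its-kind "AI + human" dual-assurance model! Baidu Health just launched a 7x24 AI health manager that "can chat, knows its stuff, and takes care of things"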
量子位· 2025-10-18 07:33
Core Viewpoint
- Baidu Health has launched a 24/7 AI health manager that integrates AI with human verification to enhance the medical consultation experience, marking a significant shift from traditional medical services to a more interactive and reliable health management system [3][6][68].
Group 1: AI Health Manager Features
- The AI health manager offers a comprehensive service that includes health education, diagnosis, treatment recommendations, and health record management, all within the Baidu app [6][12].
- It features a unique "AI + human" dual verification model, where AI-generated medical advice is confirmed by real doctors, ensuring higher safety and reliability for users [6][26].
- The AI can conduct multi-turn conversations and accurately identify 127 types of skin issues, achieving a 98% accuracy rate in interpreting various medical documents [21][22].
Group 2: User Experience and Accessibility
- Users can access the AI health manager directly through the Baidu app without needing to download a separate application, making it more user-friendly [10][11].
- The AI serves as a "smart health partner," capable of managing all aspects of medical consultations, including purchasing medication and booking appointments [12][27].
- The system integrates over 300,000 quality doctor resources and authoritative hospital rankings to assist users in selecting the right medical services [29].
Group 3: Data and Model Architecture
- Baidu Health has established a robust data pipeline supported by 360,000 doctors, ensuring the quality and reliability of the medical data used by the AI [40][42].
- The model architecture includes multi-modal capabilities and online reinforcement learning, allowing the AI to continuously evolve and improve its decision-making processes [52][57].
- The AI's knowledge base includes over 2 million medical journal articles and 14 million authoritative health resources, enabling it to provide up-to-date medical information [54].
Group 4: Industry Impact and Future Vision
- The introduction of the AI health manager signifies a transition in the healthcare industry from traditional doctor-patient interactions to a model where services proactively engage with users [70].
- Baidu Health aims to create a comprehensive health ecosystem that connects users with medical professionals and resources, enhancing the overall healthcare experience [64][75].
- The company envisions becoming the preferred health content and decision-making platform for the public, leveraging AI technology to make healthcare more accessible and trustworthy [73][75].
Robot folds clothes for 120 minutes straight! Five SOTA results with only 0.9B parameters | Open-sourced by Tsinghua AIR & Shanghai AI Lab
量子位· 2025-10-18 07:33
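Contributed by the X-VLA team
量子位 | WeChat official account QbitAI
Tsinghua University's Institute for AI Industry Research (AIR) and Shanghai AI Laboratory have jointly released X-VLA, a general-purpose cross-embodiment embodied foundation model. Through a novel Soft-Prompt mechanism, an efficient framework design, and a customized training paradigm, it significantly improves pre-training efficiency and model performance.
The competition among robots has gotten fierce!
This one not only folds clothes, it keeps at it for two hours, with no assistance at any point.
More importantly, X-VLA is the first fully open-source model (data, code, and parameters all public) to complete a 120-minute unassisted autonomous clothes-folding task, and with only 0.9B parameters it sets new performance records across five authoritative simulation benchmarks.
| Methods | Size | | Simpler | | | | LIBERO | | | Calvin | | RoboTwin-2.0 | VLABench | NAVSIM |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | | VM | VA | WidowX | Spatial | Object | Goal | Long | Avg | ABC -> ...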
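Why do AI-drawn hands always have six fingers? University of Adelaide, Meituan, and SJTU are the first to systematically quantify counting hallucinations in diffusion models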
量子位· 2025-10-18 07:33
Core Viewpoint
- The article discusses the challenges of hallucination samples generated by diffusion probabilistic models (DPMs) in image generation tasks, focusing on a specific type of hallucination called "counting hallucination" [1][2].
Group 1: Research Background
- Despite the prevalence of hallucination issues in DPMs, there has been a lack of systematic methods to quantify these factual errors, hindering the development of high-reliability generative models [2].
- A research team from the University of Adelaide, Meituan, and Shanghai Jiao Tong University has conducted a systematic study on counting hallucinations in diffusion models [2][3].
Group 2: Key Questions and Dataset
- The research team posed several key questions regarding the quantification of counting hallucinations and the effectiveness of common optimization techniques [3][7].
- They constructed the CountHalluSet dataset suite, which includes three datasets with increasing complexity of countable objects: ToyShape, SimObject, and RealHand [10].
Group 3: Findings and Experiments
- The study revealed that increasing sampling steps can reduce counting hallucination rates in synthetic datasets but may increase them in real datasets due to overfitting [19].
- The research found that higher-order ODE solvers can lower overall failure rates but may increase counting hallucination rates, indicating a trade-off in model sensitivity to counting constraints [20][21].
- The study identified that the complexity of object shapes correlates with the severity of counting hallucinations, with more complex structures leading to higher error rates [26].
Group 4: Correlation Analysis
- The correlation between counting hallucination rates and FID scores varies depending on the dataset and solver type, suggesting that FID may not reliably reflect factual consistency [30][32].
- Non-counting failure rates showed a stable and significant correlation with FID across conditions, indicating that FID is more effective at assessing overall visual consistency than specific factual features [32].
Group 5: Proposed Solution
- The research team proposed a Joint-Diffusion Model (JDM) that incorporates structural constraints during the diffusion process to guide the model in generating the correct number of objects [33][35].
- This approach enhances the semantic consistency and visual credibility of generated results, effectively mitigating counting hallucination issues [35].
Group 6: Future Directions
- The work opens avenues for exploring higher-order factual consistency in generative models, extending the analysis to more complex hallucination types and integrating abstract knowledge into the diffusion process [37].
- The ultimate goal is to transform generative models from mere creative tools into reliable world models applicable in critical fields requiring high accuracy [37].
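The distinction the summary draws between counting hallucination rates, non-counting failure rates, and overall failure rates can be made concrete with a small sketch. The definitions here are assumptions for illustration and may differ from the paper's: each generated sample is assumed to come with a detected object count, or `None` when the sample fails for a non-counting reason, and the helper name `failure_rates` and the sample data are hypothetical.

```python
from typing import Dict, Optional, Sequence


def failure_rates(counts: Sequence[Optional[int]], expected: int) -> Dict[str, float]:
    """Split failures into counting hallucinations and non-counting failures.

    Assumed convention (not taken from the paper): counts[i] is the number of
    objects detected in sample i, or None if the sample is unusable for a
    reason unrelated to counting (for example, no recognizable object).
    """
    n = len(counts)
    if n == 0:
        raise ValueError("need at least one sample")
    non_counting = sum(1 for c in counts if c is None)
    counting = sum(1 for c in counts if c is not None and c != expected)
    return {
        "counting_hallucination_rate": counting / n,
        "non_counting_failure_rate": non_counting / n,
        "total_failure_rate": (counting + non_counting) / n,
    }


# Made-up finger counts for eight generated hands; a correct hand has five fingers.
finger_counts = [5, 6, 5, None, 4, 5, 5, 6]
rates = failure_rates(finger_counts, expected=5)
print(rates)
```

Keeping the two failure types separate is what allows each one to be compared against FID on its own, as in the correlation analysis summarized above.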
Quarterly AI video generation products: multimodal input becomes standard, with competition over one-stop generation | 量子位智库 AI 100
量子位· 2025-10-18 07:33
Core Insights
- The article highlights the rapid growth and competition in the AI video generation sector, with significant advancements in technology and user engagement metrics [3][6][7].
Group 1: Market Trends
- Sora 2 achieved over 1 million downloads in just five days, indicating a surge of interest in AI video generation [3].
- Major companies like Google are launching competitive products such as Veo 3.1, focusing on audio generation, which is expected to further intensify market competition [4].
- The integration of visual models with world models is enhancing the realism of AI-generated videos, allowing for the creation of intricate 3D physical scenes [6].
Group 2: Technological Advancements
- The latest AI 100 list from Quantum Bit Think Tank shows a diverse technological evolution in AI video generation, with multi-modal input becoming standard [7].
- Output quality has significantly improved, with video lengths extending from seconds to minutes, resolutions reaching 2K and 4K, and frame rates up to 60fps [7].
- User data reflects this trend, with five AI video generation products exceeding 200,000 visits, showcasing the growing demand [8].
Group 3: Product Highlights
- The article details several leading AI video generation products, including:
  - **Jimeng AI**: Over 11 million downloads, with a 27% increase in visits, reaching approximately 9.5 million [9].
  - **Kling AI** (rendered "Keling AI" in the source): Web version monthly visits surpassing 1 million, indicating strong user engagement [9].
  - **RoboNeo**: A product from Meitu, focusing on image and video generation with a comprehensive workflow [10].
Group 4: Competitive Landscape
- The competitive landscape features various companies, each with unique offerings:
  - **Jimeng AI**: A one-stop AI creation platform with advanced video generation capabilities [15].
  - **Tencent Hunyuan 3D** (rendered "Mixed Yuan 3D" in the source): A platform for creating immersive 3D content [18].
  - **Kling AI**: A creative productivity platform with robust video generation features [20].
- Other notable products include **Sea Cucumber AI**, **Drawing Ideas**, and **Medeo**, each contributing to the diverse capabilities in the AI video generation market [24][56].
After retiring at 61, the founding president of Huawei HiSilicon became a teacher at Fudan, Peking University, and Tsinghua
量子位· 2025-10-18 07:33
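Jay, reporting from Aofeisi
量子位 | WeChat official account QbitAI
It turns out that Xu Wenwei (徐文伟), the founding president of Huawei HiSilicon who retired quietly, now has a new role: university teacher.
Recently, in coverage of the opening of the first AI cohort at Tsinghua Wudaokou, Xu appeared as a professor and gave the entrepreneur students a lecture titled "Enterprise Innovation in the AI Era".
Reportedly, Professor Xu drew on the story of Huawei's breakthrough into the European market to explain the relationship between innovation and business, and shared a substantive innovation methodology with the entrepreneurs.
This is hard-won experience, distilled step by step over a packed career, from a former Huawei board director, chair of the Scientist Advisory Committee, president of the Institute of Strategic Research, president of Strategy Marketing, president of the Enterprise Business, IRB director, president of the European region, president of HiSilicon Semiconductor, and more.
Xu Wenwei was born in Changzhou, Jiangsu, in September 1963. He graduated from Southeast University in 1990 and joined Huawei a year later, beginning a career that ran for more than thirty years.
His achievements during that time include, among others: leading development of the company's first central-office program-controlled switch, first chip, first GSM system, and first cloud data center core switch; proposing the Innovation 2.0 strategy; publishing ten challenge problems for mathematics in 2020; and laying out research on frontier technologies such as photonic computing and glasses-free 3D display...
He retired quietly in 2024, at the age of 61.
A new life after a quiet retirement
Starting in April 2023, Huawei launched a new round of senior-management ...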