量子位

Unicorn in 8 Months: "Europe's Cursor" Valued at $1.8 Billion
量子位· 2025-07-18 08:36
Shiling from Aofeisi | QbitAI official account QbitAI

Founded just 8 months ago, it is already the newest unicorn, with a valuation that has soared to $1.8 billion. It now counts over 2.3 million free active users and 180,000 paying subscribers, and its first-month retention for paying users already exceeds ChatGPT's. This is not a Silicon Valley fairy tale but a rising AI star from Sweden: Lovable, which is reshaping how software gets built through natural language. The company recently closed the largest Series A round in Swedish history, raising $200 million.

In the months since launch, Lovable has drawn steady praise. One user said it amazed him: after several development platforms (Bolt, V0, Replit) failed at the task, Lovable generated a complete product website in just a few hours. Another used it to build a new game.

Europe's Cursor

Like Cursor, Lovable uses large models to help users build applications, but it targets a group with even greater potential: people who cannot code. One person even plans to use it to build a startup in 30 days, with the entire process carried out in public. In a press release, the company said such users and their experiments likely account for the bulk of the 10 million projects created on the platform to date. Co-founder and CEO Osika said: "Our mission is to let anyone build. With large ...
7B Model's "EQ" Rivals GPT-4o: Tencent Cracks the Open-Domain RL Problem, Score Jumps 5x
量子位· 2025-07-18 06:16
Core Insights
- The article discusses the challenges and solutions in optimizing large models for emotional intelligence in multi-turn dialogues using Reinforcement Learning (RL) [2][4][5]
- The proposed RLVER framework integrates a user simulator that acts as both the interaction environment and the reward source, addressing the three main challenges of RL in this context [2][5][11]

Group 1: Challenges in RL for Emotional Intelligence
- The three main challenges identified are:
  1. Environmental challenge: creating a realistic and diverse interaction environment for the model [2][4]
  2. Reward challenge: converting subjective user satisfaction into stable, long-term rewards [2][11]
  3. Training challenge: achieving stable and efficient multi-turn online RL training on large language models (LLMs) [2][4]

Group 2: RLVER Framework
- The RLVER framework utilizes a user simulator that embodies diverse user profiles and interaction scenarios, providing a rich and dynamic learning environment [7][8]
- The simulator updates its emotional state based on the model's responses, providing personalized feedback that enhances the model's learning experience [9][10]

Group 3: Performance Outcomes
- The Qwen2.5-7B model, trained with RLVER, achieved a score of 79.2 on the Sentient-Benchmark, up from 13.3, positioning it alongside top commercial models like GPT-4o and Gemini 2.5 Pro [16][17]
- The model maintained its general capabilities in areas like mathematics and coding, avoiding "catastrophic forgetting" [17]

Group 4: Insights from Training
- Introducing explicit "think-then-say" prompts improved the model's ability to understand and respond empathetically, revealing two distinct paths toward empathy: "thinking models" and "reactive models" [20][21]
- The choice of optimization algorithm (PPO vs. GRPO) revealed that focusing on specific dimensions of emotional intelligence can yield better overall performance [23][27]

Group 5: User Simulator Insights
- The RLVER team created two types of user simulators, finding that a more forgiving environment (the Vanilla simulator) benefits early-stage model growth more than a challenging one [29][30]
- Models with explicit thinking structures demonstrated greater robustness in challenging environments, suggesting that reasoning capabilities can mitigate training instability [33]
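The simulator-as-environment loop described above can be sketched in miniature as follows. All class and function names here are hypothetical illustrations, and the real RLVER simulator scores replies with an LLM conditioned on a user profile, not with keyword matching:

```python
class UserSimulator:
    """Simulated user whose hidden emotional state reacts to each reply."""

    def __init__(self, profile: str):
        self.profile = profile
        self.emotion = 0.0  # -1.0 (upset) .. +1.0 (satisfied)

    def react(self, reply: str) -> float:
        # Stand-in update rule; the real RLVER simulator would judge the
        # reply with an LLM conditioned on the profile and dialogue history.
        empathetic = "sorry" in reply.lower() or "understand" in reply.lower()
        delta = 0.25 if empathetic else -0.125
        self.emotion = max(-1.0, min(1.0, self.emotion + delta))
        return self.emotion


def rollout(policy, simulator: UserSimulator, turns: int = 3) -> float:
    """Run a multi-turn dialogue; the final emotion is the episode reward."""
    reward = 0.0
    for _ in range(turns):
        reply = policy(simulator.profile)
        reward = simulator.react(reply)  # simulator doubles as reward source
    return reward


# Toy policy that always empathizes earns a positive episode reward.
always_empathize = lambda profile: "I understand, and I'm sorry you feel this way."
print(rollout(always_empathize, UserSimulator("stressed commuter")))  # 0.75
```

Because the reward is derived from the simulator's internal state rather than a human label, it is cheap, consistent, and available at every turn, which is what makes stable multi-turn online RL feasible.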
IMO 2025 Math Competition Results for Large Models Are Out
量子位· 2025-07-18 06:16
Core Viewpoint
- The article presents the results of MathArena's evaluation of large models on the IMO 2025 problems: Gemini 2.5 Pro significantly outperformed its competitors with a total score over 30%, while the second-place model, o3, scored 89% lower than Gemini [1][2]

Group 1: Evaluation Process
- The evaluation was organized by MathArena, which selected models based on their past performance in MathArena competitions: Gemini 2.5 Pro, o3, o4-mini, Grok 4, and DeepSeek-R1 [4]
- A unified prompt template was used for all models to ensure fairness, aligned with the Open Proof Corpus evaluation [5]
- Each model was run with its recommended hyperparameters and a maximum token limit of 64,000 [6]

Group 2: Scoring and Judging
- Four experienced human judges with IMO-level mathematics expertise were hired to assess the models, with each problem scored out of 7 points [10][11]
- Each model generated 32 initial answers, from which it selected its best four for final scoring [8]

Group 3: Performance Insights
- Many models scored 3-4 points out of 7, a pattern less common in human testing, indicating a gap between human and model capabilities [12]
- Models over-optimized the final answer format noticeably less than before, suggesting progress in handling open-ended mathematical reasoning tasks [13]
- Gemini improved at avoiding the fabrication of non-existent "theorems" compared to previous evaluations [14]

Group 4: Problem-Solving Performance
- The models struggled with geometry: the second and sixth problems yielded the lowest scores, and on the second problem only Grok 4 scored (4%) [26][27]
- On the fourth problem, most models used methods similar to humans' but made logical errors; on the fifth, they identified the correct strategies but failed to provide proofs [29]
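The sampling-and-judging pipeline above (32 samples per problem, model-side selection of the best 4, human judges grading out of 7) can be sketched as below. The helper names are illustrative assumptions, and averaging judge scores per problem is only one plausible aggregation, not MathArena's confirmed method:

```python
def select_best(answers: list[str], self_score) -> list[str]:
    """Model-side selection: keep the four answers the model rates highest."""
    return sorted(answers, key=self_score, reverse=True)[:4]


def judge_total(judged: dict[str, list[float]]) -> float:
    """One plausible aggregation: average each problem's judge scores
    (0-7 scale), then sum across problems."""
    return sum(sum(scores) / len(scores) for scores in judged.values())


# Toy run: 32 sampled answers reduced to 4 by a stand-in self-scorer.
samples = [f"attempt {i}" * (i % 5 + 1) for i in range(32)]
kept = select_best(samples, self_score=len)
assert len(kept) == 4

# Hypothetical judge scores (out of 7) for the kept answers on two problems.
print(judge_total({"P1": [7, 7, 6, 7], "P2": [3, 4, 3, 3]}))  # 10.0
```

The best-of-n selection step matters: it measures whether a model can recognize its own strongest proof, not just produce one by chance.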
Meta's New AI Org Structure Revealed, and It Looks a Bit Like ByteDance
量子位· 2025-07-18 06:16
Core Viewpoint
- Meta is undergoing significant organizational restructuring, particularly in its AI division, centered on a "Super Intelligence Lab" that aims to attract top talent and enhance its AI capabilities [2][10][11]

Group 1: Organizational Changes
- Meta has consolidated over 3,400 employees into a new AI organization, led by Alexandr Wang as Chief AI Officer, with Nat Friedman as his deputy [2][17]
- The new structure consists of four main groups: AGI foundational research, AI product development, a fundamental AI lab led by Yann LeCun, and a new team focused on Llama 5 [5][12][20]
- The organization is characterized by high salaries, with reported packages exceeding $100 million, creating a competitive atmosphere in Silicon Valley [10][11]

Group 2: Talent Acquisition
- Meta has aggressively recruited from companies such as OpenAI, Apple, and Google, raising concerns about the impact on company culture [10][27]
- Recent hires include prominent figures from Apple, such as Tom Gunter and Mark Lee, who have close ties to the new leadership of Meta's AI division [30][32]
- The recruitment strategy appears to mirror ByteDance's approach, signaling a shift toward a more aggressive talent-acquisition model [37][44]

Group 3: AI Development Focus
- The "Super Intelligence Lab" prioritizes foundational AGI research while also developing practical AI applications across Meta's product lines [11][21]
- The lab is expected to work on both open-source and closed-source models, with a potential dual-track approach for Llama 5 and Llama 4.1 [7][25]
- The integration of various AI capabilities aims to bring advanced models seamlessly into Meta's existing products, such as the Meta AI assistant [22][48]
Solving Scale Drift in Outdoor RGB-only SLAM: Accurate Localization + High-Fidelity Reconstruction | ICCV'25, Open Source
量子位· 2025-07-18 06:16
Contributed by the S3PO-GS team | QbitAI official account QbitAI

The scale-drift problem in outdoor SLAM finally has a new solution!

The latest work from the Hong Kong University of Science and Technology (Guangzhou): S3PO-GS, a 3D Gaussian framework built specifically for outdoor monocular SLAM, has been accepted to ICCV 2025.

The highlight of this work is that it achieves, for the first time, global scale consistency for RGB-only monocular SLAM. On the three outdoor benchmarks Waymo, KITTI, and DL3DV, S3PO-GS not only sets a new SOTA for novel view synthesis but also reduces tracking error by 77.3% on DL3DV scenes.

What does this paper do?

In frontier areas such as autonomous driving, robot navigation, and AR/VR, the robustness of SLAM directly determines system performance. Current 3D Gaussian Splatting (3DGS) based SLAM methods excel in indoor scenes but still face serious challenges in unbounded outdoor environments with RGB-only input: a monocular system's inherent lack of depth priors leaves it short of geometric information, while introducing monocular depth estimation or end-to-end pointmap models (such as MASt3R) as geometric priors triggers system-level scale drift due to inter-frame scale inconsistency, a problem that is especially pronounced in complex outdoor scenes.

To address this dual bottleneck, the HKUST (Guangzhou) team proposes the novel framework S3PO-GS, the first to achieve global scale consistency for RGB-only monocular SLAM. The framework does this through three core ...
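The inter-frame scale inconsistency described above is commonly countered by rescaling each frame's predicted depth against the reconstructed map. Below is a minimal median-ratio sketch of that generic idea, for illustration only; it is not S3PO-GS's actual pointmap-based mechanism, which the truncated text does not fully describe:

```python
import numpy as np


def align_scale(pred_depth: np.ndarray, map_depth: np.ndarray,
                valid: np.ndarray) -> np.ndarray:
    """Rescale a monocular depth prediction to match the map's scale.

    A single median-ratio factor per frame keeps the alignment robust to
    outliers in either depth source, preventing per-frame scale estimates
    from drifting apart over the trajectory.
    """
    ratio = map_depth[valid] / pred_depth[valid]
    return pred_depth * np.median(ratio)


# Toy frame: the network predicts depth at half the map's scale.
map_d = np.array([[2.0, 4.0], [6.0, 8.0]])
pred_d = map_d / 2.0
mask = np.ones_like(map_d, dtype=bool)
print(align_scale(pred_d, map_d, mask))  # recovers the map-scale depths
```

Tying every frame's depth to one shared reconstruction, rather than to the previous frame, is what turns per-frame consistency into the global scale consistency the paper targets.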
Past $10 Million in a Year: An Overseas AI Creative Engine Takes Off
量子位· 2025-07-18 06:16
Core Viewpoint
- Creati, an AI-driven creative engine, has rapidly gained traction in the advertising sector, amassing 10 million users and generating millions in annual revenue within just one year of launch [5][6]

Group 1: AI Creative Engine
- Creati focuses on automating the creative process in advertising, differentiating itself from competitors by leveraging influencer appeal for customized creative content [6][8]
- The platform lets businesses turn popular influencer videos into tailored templates, significantly reducing the time and effort needed to produce marketing materials [9][12]
- Creati's in-house AI model produces high-quality videos that rival traditional advertising output, attracting major brands such as Shein and Cider [10][11]

Group 2: Market Disruption
- The platform addresses the pain points of both influencers and small businesses, providing a stable income stream for influencers and simplifying the creative process for businesses [11][12]
- Creati's content generation is designed specifically for e-commerce, recognizing that online retailers' needs differ from those served by general video-generation tools [18][20]
- The platform's ability to keep product representation consistent is a key advantage for e-commerce businesses [20]

Group 3: Data-Driven Innovation
- Creati employs a data feedback loop to refine its AI creative model, allowing continuous improvement based on user-engagement metrics [21][22]
- Generating customized content from brand characteristics and audience feedback enhances its effectiveness in driving marketing success [21][22]
- Creati's vision includes a creative agent that autonomously generates and optimizes advertising content, potentially reshaping the marketing landscape [24][25]

Group 4: Future Aspirations
- The company aims to evolve into a comprehensive creative engine that assists users across content creation, beyond advertising alone [29]
- Its long-term goal is to integrate advanced technologies, such as brain-computer interfaces, to further extend its creative capabilities [29][30]
AI Is Red-Hot! miHoYo Founds a New Company with RMB 500 Million
量子位· 2025-07-18 00:30
Shiling from Aofeisi | QbitAI official account QbitAI

In the West, Musk is building AI girlfriends; in the East, miHoYo is building "Wudinggu."

Just recently, miHoYo founded a wholly owned new company, Shanghai miHoYo Wudinggu Technology Co., Ltd. (上海米哈游无定谷科技有限公司), with registered capital of RMB 500 million. The company's business scope covers not only software development and anime/game development but also extends to AI application software and related fields.

Earlier, Whispers from the Star, the AI game from Cai Haoyu's startup, opened a playable demo on Steam. The official account also pulled a stunt riding Musk's publicity, having Grok's Ani play the game: one AI girl talking to another.

RMB 500 million in registered capital: miHoYo has never spent this big. Looking back, miHoYo has been positioning itself in AI for years, and moves to set up AI-related companies are nothing new. Yesterday's founding of Wudinggu Technology with RMB 500 million is its largest AI investment to date, a clear sign of its ambition in artificial intelligence.

Open miHoYo's website and the first thing you see is a vision statement from CEO Cai Haoyu, who once said miHoYo's goal is "to build, within the next 10 to 30 years, the kind of virtual world depicted in films like The Matrix and Ready Player One." To that end, miHoYo entered the AI field as early as 2018, founding its "Anti-Entropy" research division, which built the in-house large model Glossa. The Anti-Entropy team's signature work is the digital human Lumi (鹿鸣), who once ...
ChatGPT Agent Officially Launches; Multiple Startup Verticals Had a Sleepless Night
量子位· 2025-07-18 00:30
Bai Jiao and Lei Gang from Aofeisi | QbitAI official account QbitAI

Practical, extremely practical: this is what an OpenAI agent should look like. Just now, OpenAI's latest release arrived: ChatGPT Agent made its official debut.

It is an agent that unifies "thinking" and "doing": the reasoning and analysis of Deep Research and the operational execution of Operator come together in ChatGPT Agent. It can even take over your entire computer, making it almost a brand-new operating system.

What can it do? At work: scheduling and rescheduling meetings, generating slide decks, planning business trips and outing agendas, submitting expense reports automatically; essentially the core duties of the assistants that only big-company executives used to have. In daily life: planning your personal travel itineraries, arranging major events such as wedding banquets, renewing the certifications that need periodic manual updates; roughly what a chairman's or CEO's personal secretary would handle.

Now, overnight, anyone can have one. OpenAI also built a dedicated model for it, setting a new SOTA and a new record for model capability. General-purpose agents used to dare call themselves only "interns," but backed by its in-house foundation models, OpenAI has all but turned the "intern" into a "chief secretary." Previously, one startup vertical ...
o1 Core Contributor Speaks Out for the First Time Since Leaving: AI Is the Strongest Lever in History, Beyond Labor, Capital, and Code
量子位· 2025-07-17 09:03
Mengchen from Aofeisi | QbitAI official account QbitAI

Another core researcher who left OpenAI has spoken out! Hyung Won Chung, just reported to have joined Meta, shared his deep thinking on AI's future: artificial intelligence is becoming the most powerful leverage mechanism in history.

Hyung Won Chung and Jason Wei, who left OpenAI with him, are long-time collaborators; their partnership dates back to Google Brain, where the two were co-first authors of the influential fine-tuning paper "Scaling Instruction-Finetuned Language Models." Jason Wei once praised him: "Hyung Won Chung's ability to identify a new paradigm and completely discard any sunk cost left a deep impression on me. At the end of 2022 he recognized the power of reinforcement learning and has been championing it ever since." During his time at OpenAI, Hyung Won Chung contributed to core projects including o1, o1-preview, and Deep Research.

AI as a multiplier of individual ability

Hyung Won Chung began with the image of a flower about to bloom. He pointed out that humans are naturally bad at noticing slow changes that unfold over years, and that this "flaw" may lead us to severely underestimate the magnitude of the transformation AI brings. Artificial intel ...
Transformer in Danger! Google Releases the MoR Architecture: Half the Memory, Double the Inference Speed
量子位· 2025-07-17 09:03
Core Viewpoint
- Google has introduced a new underlying architecture called Mixture-of-Recursions (MoR), which enhances reasoning speed by 2x while halving KV memory usage and allows dynamic resource allocation across different tasks within a single framework [1][2][3]

Group 1: MoR Innovations
- MoR integrates unified parameter sharing and adaptive recursion depth, addressing the high computational and memory demands of traditional Transformers while maintaining model performance [7][9]
- The architecture employs a recursive Transformer that divides the model into recursive blocks reusing a shared pool of parameters, which reduces the number of unique parameters and improves distributed-training efficiency [10][13]
- MoR uses a dynamic routing mechanism to assign a different recursion depth to each token, concentrating computation on complex tokens, and incorporates KV-caching strategies to improve memory efficiency [15][19]

Group 2: Performance Comparison
- Experiments comparing MoR with vanilla Transformers and recursive baselines across parameter scales from 135M to 1.7B show that MoR uses nearly 50% fewer parameters while achieving lower validation loss and a higher few-shot accuracy of 43.1% [16][19]
- When training on a fixed 20B tokens, MoR reduces training FLOPs by 25%, training time by 19%, and peak memory usage by 25% [21]
- Analysis of routing strategies indicates that expert-choice routing outperforms token-choice routing, highlighting the impact of routing granularity on performance [22]

Group 3: Architectural Evolution
- Google has a history of rethinking underlying architectures, aiming to reconstruct computational paradigms through innovations like the Mixture-of-Experts (MoE) model, which enables efficient training of large models by activating only a subset of expert networks [27][30]
- The introduction of MoR is seen as a potential game-changer in the AI landscape, with expectations that it may surpass the capabilities of Transformers in the future [32]
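The two core ideas summarized above, a shared parameter pool applied recursively and a router that assigns each token its own recursion depth, can be sketched in toy form. The router design, dimensions, and depth choices below are assumptions for illustration, not the paper's exact architecture:

```python
import numpy as np


def recursive_mor(x, w_block, w_router, max_depth=3):
    """Toy Mixture-of-Recursions step: one shared weight pool is applied up
    to max_depth times, and a per-token router decides how many recursion
    steps each token receives, so compute concentrates on 'hard' tokens."""
    depth = (x @ w_router).argmax(axis=-1) + 1    # (seq,) chosen depth, 1..max_depth
    h = x.copy()
    for step in range(1, max_depth + 1):
        active = depth >= step                    # tokens that keep recursing
        h[active] = np.tanh(h[active] @ w_block)  # reuse the shared block
    return h, depth


rng = np.random.default_rng(0)
seq, dim = 6, 8
x = rng.standard_normal((seq, dim))
w_block = 0.5 * rng.standard_normal((dim, dim))   # the shared parameter pool
w_router = rng.standard_normal((dim, 3))          # router scores 3 possible depths

h, depth = recursive_mor(x, w_block, w_router)
print(h.shape, depth.min() >= 1, depth.max() <= 3)  # (6, 8) True True
```

Because tokens routed to shallow depths exit early, later recursion steps process fewer tokens, which is also what allows KV entries for exited tokens to be reused rather than recomputed.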