A Unicorn in 8 Months: "Europe's Cursor" Valued at $1.8 Billion
量子位· 2025-07-18 08:36
时令, reporting from Aofeisi. 量子位 | WeChat official account QbitAI

Founded just 8 months ago, the company has already become the latest unicorn, with its valuation soaring to $1.8 billion. It now counts more than 2.3 million free active users and 180,000 paying subscribers, and its first-month retention rate for paying users already exceeds ChatGPT's.

This is not a Silicon Valley legend but a Swedish AI newcomer, Lovable, which is reshaping how software gets built with natural language. The company recently closed the largest Series A round in Swedish history, raising $200 million.

In the months since launch, Lovable has drawn consistently glowing reviews. One user said it amazed him: after several development platforms (Bolt, V0, Replit) failed at the task, Lovable generated a complete product website in just a few hours. Another used it to build a new game.

Europe's Cursor

Like Cursor, Lovable uses large models to help users build applications, but it targets a group with far greater potential: people who cannot code. One person even plans to use it to build a startup in 30 days, documenting the whole process in public.

In a press release, the company said such users and their experiments most likely account for the bulk of the 10 million projects created on the platform so far. Co-founder and CEO Osika said: our mission is to let anyone build. With the help of large ...
A 7B Model Matches GPT-4o in "EQ": Tencent Cracks an Open-Domain RL Problem, Score Jumps 5x
量子位· 2025-07-18 06:16
Core Insights
- The article discusses the challenges and solutions in optimizing large models for emotional intelligence in multi-turn dialogues using Reinforcement Learning (RL) [2][4][5]
- The proposed RLVER framework integrates a user simulator that acts as both the interaction environment and the reward source, addressing the three main challenges of RL in this setting [2][5][11]

Group 1: Challenges in RL for Emotional Intelligence
- The three main challenges identified are:
  1. Environmental challenge: creating a realistic and diverse interaction environment for the model [2][4]
  2. Reward challenge: converting subjective user satisfaction into stable, long-term rewards [2][11]
  3. Training challenge: achieving stable and efficient multi-turn online RL training on large language models (LLMs) [2][4]

Group 2: RLVER Framework
- The RLVER framework uses a user simulator that embodies diverse user profiles and interaction scenarios, providing a rich and dynamic learning environment (a minimal sketch of this loop follows this summary) [7][8]
- The simulator updates its emotional state based on the model's responses, providing personalized feedback that enhances the model's learning [9][10]

Group 3: Performance Outcomes
- The Qwen2.5-7B model trained with RLVER achieved a score of 79.2 on the Sentient-Benchmark, up from 13.3, placing it alongside top commercial models such as GPT-4o and Gemini 2.5 Pro [16][17]
- The model retained its general capabilities in areas like mathematics and coding, avoiding "catastrophic forgetting" [17]

Group 4: Insights from Training
- Introducing explicit "think-then-say" prompts improved the model's ability to understand and respond empathetically, leading to two distinct paths toward empathy: "thinking models" and "reactive models" [20][21]
- The choice of optimization algorithm (PPO vs. GRPO) revealed that focusing on specific dimensions of emotional intelligence can yield better overall performance [23][27]

Group 5: User Simulator Insights
- The RLVER team built two types of user simulators, finding that a more forgiving environment (the Vanilla simulator) is more beneficial for early-stage model growth than a more challenging one [29][30]
- Models with explicit thinking structures showed greater robustness in challenging environments, suggesting that reasoning ability can mitigate training instability [33]
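To make the simulator-as-environment-and-reward idea above concrete, here is a minimal Python sketch of one training rollout. Everything specific in it (the persona field, the keyword-based emotion update, the `policy_respond` and `policy_update` placeholders) is an assumption made for illustration; RLVER's actual simulator, reward shaping, and PPO/GRPO step are only described at a high level in the summary.

```python
# Minimal sketch of the simulator-as-environment-and-reward loop. The persona field,
# keyword-based emotion update, and the policy_respond / policy_update placeholders are
# assumptions for illustration; RLVER's real simulator and reward design are richer.
import random
from dataclasses import dataclass, field

@dataclass
class SimulatedUser:
    persona: str                          # e.g. "stressed student venting about exams"
    emotion: float = 0.0                  # latent satisfaction in [-1, 1]
    history: list = field(default_factory=list)

    def react(self, assistant_reply: str) -> tuple:
        """Update the emotional state from the reply; return (next user turn, reward)."""
        empathic = any(w in assistant_reply.lower() for w in ("understand", "sounds", "feel"))
        delta = 0.2 if empathic else -0.1          # toy stand-in for an LLM-based appraisal
        self.emotion = max(-1.0, min(1.0, self.emotion + delta))
        self.history.append(assistant_reply)
        next_turn = f"[{self.persona}] keeps talking (mood {self.emotion:+.1f})"
        return next_turn, self.emotion             # the tracked emotion doubles as the reward

def policy_respond(user_turn: str) -> str:
    """Placeholder for the LLM policy being trained."""
    return random.choice(["That sounds really hard.", "Try restarting it.", "I understand how you feel."])

def policy_update(trajectory: list) -> None:
    """Placeholder for the PPO/GRPO update on the collected multi-turn trajectory."""
    pass

def rollout(max_turns: int = 5) -> None:
    user = SimulatedUser(persona="stressed student venting about exams")
    user_turn, trajectory = "I bombed my exam and I can't stop thinking about it.", []
    for _ in range(max_turns):
        reply = policy_respond(user_turn)
        user_turn, reward = user.react(reply)
        trajectory.append((user_turn, reply, reward))
    policy_update(trajectory)              # rewards come from the simulator, not a fixed dataset

if __name__ == "__main__":
    random.seed(0)
    rollout()
```

The design point the sketch mirrors is that a single object produces both the next user turn and the reward, so the dialogue policy is optimized against a simulator-derived signal rather than a static preference dataset.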
Large Models' IMO 2025 Math Competition Results Are Out
量子位· 2025-07-18 06:16
Core Viewpoint
- The article reports the results of a model evaluation organized by MathArena, highlighting that Gemini 2.5 Pro clearly outperformed its competitors on the IMO 2025 problems, finishing with a total score just over 30%, roughly 89% higher than the second-place model, o3 [1][2]

Group 1: Evaluation Process
- The evaluation was organized by MathArena, which selected models based on their past performance in MathArena competitions: Gemini 2.5 Pro, o3, o4-mini, Grok 4, and DeepSeek-R1 [4]
- A unified prompt template was used for all models to ensure fairness, consistent with the Open Proof Corpus evaluation [5]
- Each model was run with its recommended hyperparameters and a maximum token limit of 64,000 [6]

Group 2: Scoring and Judging
- Four experienced human judges with IMO-level mathematics expertise were hired to assess the models, with each problem scored out of 7 points [10][11]
- Each model generated 32 initial answers, from which its best four were selected for final scoring (a selection-stage sketch follows this summary) [8]

Group 3: Performance Insights
- Many models scored between 3 and 4 points out of 7, a pattern less common among human contestants, indicating a gap between human and model capabilities [12]
- Models were noticeably less prone to over-optimizing the final answer format, suggesting progress in handling open-ended mathematical reasoning tasks [13]
- Gemini showed improvement in avoiding the fabrication of non-existent "theorems" compared with previous evaluations [14]

Group 4: Problem-Solving Performance
- The models struggled with geometry: the second and sixth problems yielded the lowest scores, and on the second problem only Grok 4 scored at all (4%) [26][27]
- On the fourth problem most models used approaches similar to human contestants' but made logical errors, while on the fifth problem they identified correct strategies yet failed to produce proofs [29]
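The best-of-32, keep-four step in Group 2 is the most mechanical part of the protocol, so a small Python sketch of how such a selection stage could be wired up follows. The pairwise-tournament rule and the `generate_answer` / `score_preference` helpers (including their signatures) are assumptions made for illustration; the article only states that each model produced 32 answers and forwarded its best four to the human judges, not how the selection was performed.

```python
# Hedged sketch of a best-of-32 -> best-of-4 selection stage. `generate_answer` and
# `score_preference` are hypothetical stand-ins for the model under evaluation; the
# pairwise tournament is one plausible way to pick finalists, not MathArena's harness.
import itertools
import random
from dataclasses import dataclass

@dataclass
class Candidate:
    answer: str
    wins: int = 0

def generate_answer(problem: str, sample_idx: int) -> str:
    """Placeholder for one sampled solution attempt (the article caps output at 64,000 tokens)."""
    return f"solution draft #{sample_idx} for: {problem[:40]}"

def score_preference(problem: str, a: str, b: str) -> int:
    """Placeholder pairwise self-judgment: 0 means `a` looks stronger, 1 means `b` does."""
    return random.randint(0, 1)

def select_finalists(problem: str, n_samples: int = 32, keep: int = 4) -> list:
    candidates = [Candidate(generate_answer(problem, i)) for i in range(n_samples)]
    # Round-robin pairwise comparisons; the answers that win most often advance.
    for a, b in itertools.combinations(candidates, 2):
        winner = a if score_preference(problem, a.answer, b.answer) == 0 else b
        winner.wins += 1
    candidates.sort(key=lambda c: c.wins, reverse=True)
    return [c.answer for c in candidates[:keep]]   # these four go to the human judges (0-7 points each)

if __name__ == "__main__":
    random.seed(0)
    finalists = select_finalists("IMO 2025 Problem 2 (geometry)")
    print(f"{len(finalists)} answers forwarded for grading")
```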
Meta's Brand-New AI Org Structure Revealed, and the Vibe Is a Bit ByteDance
量子位· 2025-07-18 06:16
Core Viewpoint
- Meta is undergoing significant organizational restructuring, particularly in its AI division, with a focus on creating a "Super Intelligence Lab" that aims to attract top talent and enhance its AI capabilities [2][10][11]

Group 1: Organizational Changes
- Meta has integrated over 3,400 employees into a new AI organization, led by Alexandr Wang as Chief AI Officer, with Nat Friedman as his deputy [2][17]
- The new structure consists of four main groups: AGI foundational research, AI product development, a basic AI research lab led by Yann LeCun, and a new team focused on Llama 5 [5][12][20]
- The organization is characterized by high salaries, with reports of packages exceeding $100 million, which has created a competitive atmosphere in Silicon Valley [10][11]

Group 2: Talent Acquisition
- Meta has aggressively recruited talent from companies such as OpenAI, Apple, and Google, leading to concerns about the impact on company culture [10][27]
- Recent hires include prominent figures from Apple, such as Tom Gunter and Mark Lee, who have close ties to the new leadership of Meta's AI division [30][32]
- The recruitment strategy appears to mirror ByteDance's approach, indicating a shift in Meta's operational philosophy toward a more aggressive talent-acquisition model [37][44]

Group 3: AI Development Focus
- The primary goal of the "Super Intelligence Lab" is to prioritize foundational AGI research while also developing practical AI applications across Meta's product lines [11][21]
- The lab is expected to work on both open-source and closed-source models, with a potential dual-track approach for Llama 5 and Llama 4.1 [7][25]
- The integration of various AI capabilities aims to fold advanced models seamlessly into Meta's existing products, such as the Meta AI assistant [22][48]
Cracking the Scale-Drift Problem in Outdoor RGB-only SLAM: Precise Localization + High-Fidelity Reconstruction | Open-Sourced at ICCV'25
量子位· 2025-07-18 06:16
Contributed by the S3PO-GS team. 量子位 | WeChat official account QbitAI

The scale-drift problem in outdoor SLAM finally has a new solution.

The latest result from researchers at the Hong Kong University of Science and Technology (Guangzhou): S3PO-GS, a 3D Gaussian framework designed specifically for outdoor monocular SLAM, has been accepted to ICCV 2025.

The highlight of this work is that it achieves global scale consistency for RGB-only monocular SLAM for the first time. Across the three major outdoor benchmarks Waymo, KITTI, and DL3DV, S3PO-GS not only sets new SOTA records for novel view synthesis but also cuts tracking error by 77.3% on DL3DV scenes.

What does this paper do?

In frontier applications such as autonomous driving, robot navigation, and AR/VR, the robustness of SLAM directly determines system performance. Current 3D Gaussian Splatting (3DGS) based SLAM methods excel indoors, but still face severe challenges in unbounded outdoor environments with RGB-only input: monocular systems inherently lack depth priors and therefore geometric information, while introducing monocular depth estimation or end-to-end point-cloud models (such as MASt3R) as geometric priors triggers system-level scale drift due to scale inconsistency across frames, a problem that is especially pronounced in complex outdoor scenes (a toy illustration of this drift follows below).

To tackle this dual bottleneck, the HKUST (Guangzhou) team proposes the new framework S3PO-GS, achieving global scale consistency for RGB monocular SLAM for the first time. The approach relies on three core techniques ...
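The paragraph above attributes the drift to per-frame scale inconsistency in monocular geometric priors. The toy Python snippet below is not the S3PO-GS method — the depth model, noise level, and median-ratio alignment are made-up assumptions — but it shows how chaining per-frame scale alignments lets small errors compound into a drifting global scale.

```python
# Toy illustration (not the S3PO-GS method) of scale drift from per-frame scale
# ambiguity in monocular depth priors. The depth model, noise level, and the
# median-ratio alignment rule are all assumptions invented for this sketch.
import numpy as np

rng = np.random.default_rng(0)
true_depth = rng.uniform(5.0, 50.0, size=100)        # ground-truth depths of tracked points

def predicted_depth() -> np.ndarray:
    """Monocular prediction: correct only up to an unknown, frame-varying scale factor."""
    unknown_scale = rng.uniform(0.8, 1.25)            # changes from frame to frame
    noise = rng.normal(1.0, 0.05, size=true_depth.shape)
    return unknown_scale * true_depth * noise

# Naive chaining: align each new frame's prediction to the previous (already scaled)
# estimate via a median ratio, then use it as the reference for the next frame.
reference = predicted_depth()
initial_scale = np.median(reference / true_depth)     # arbitrary, fixed by the first frame
for _ in range(200):
    pred = predicted_depth()
    s = np.median(reference / pred)                   # per-frame relative scale estimate
    reference = s * pred
final_scale = np.median(reference / true_depth)

# Each alignment carries a small error, so the recovered scale random-walks away from
# its (already arbitrary) starting value -- the system-level drift described above.
print(f"scale vs. frame 0 after 200 frames: {final_scale / initial_scale:.3f}")
```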
Past $10 Million in a Year: An Overseas AI Creative Engine Takes Off
量子位· 2025-07-18 06:16
Core Viewpoint
- Creati, an AI-driven creative engine, has rapidly gained traction in the advertising sector, amassing 10 million users and generating millions in annual revenue within just one year of its launch [5][6]

Group 1: AI Creative Engine
- Creati focuses on automating the creative process in advertising, differentiating itself from competitors by leveraging influencer power for customized creative content [6][8]
- The platform allows businesses to transform popular influencer videos into tailored templates, significantly reducing the time and effort required to generate marketing materials [9][12]
- Creati's unique AI model enables the production of high-quality videos that rival traditional advertising efforts, attracting major brands like Shein and Cider [10][11]

Group 2: Market Disruption
- The platform addresses the pain points of both influencers and small businesses by providing a stable income stream for influencers and simplifying the creative process for businesses [11][12]
- Creati's approach to content generation is designed specifically for e-commerce, recognizing the unique needs of online retailers compared to general video-generation tools [18][20]
- The platform's ability to maintain consistency in product representation is a key advantage, particularly for e-commerce businesses [20]

Group 3: Data-Driven Innovation
- Creati employs a data feedback loop to refine its AI creative model, allowing for continuous improvement based on user engagement metrics [21][22]
- The platform's ability to generate customized content based on brand characteristics and audience feedback enhances its effectiveness in driving marketing success [21][22]
- Creati's vision includes developing a creative agent that autonomously generates and optimizes advertising content, potentially revolutionizing the marketing landscape [24][25]

Group 4: Future Aspirations
- The company aims to evolve into a comprehensive creative engine that can assist users in various aspects of content creation, beyond just advertising [29]
- Creati's long-term goal is to integrate advanced technologies, such as brain-computer interfaces, to further enhance its creative capabilities [29][30]
AI Really Is Hot: MiHoYo Founds a New Company with RMB 500 Million
量子位· 2025-07-18 00:30
Core Viewpoint
- The article discusses MiHoYo's ambitious AI initiatives, particularly the establishment of a new company, Shanghai MiHoYo Wudinggu Technology Co., Ltd., with a registered capital of 500 million yuan, marking a significant investment in the AI sector [2][8]

Group 1: Company Developments
- MiHoYo has established a new company with a registered capital of 500 million yuan, signaling its strong commitment to AI development [2][8]
- The new company will focus on software development, animation and game development, and artificial intelligence application software [3]
- MiHoYo co-founder Cai Haoyu has expressed a vision to create virtual worlds akin to those depicted in films like "The Matrix" and "Ready Player One" within the next 10 to 30 years [9]

Group 2: AI Initiatives and Projects
- MiHoYo has been investing in AI since 2018, with the establishment of the "Reverse Entropy" research division and the development of its own AI model, Glossa [10]
- The "Reverse Entropy" team has gained recognition, with its digital persona Luming attracting 660,000 viewers during its first live stream on Bilibili [10]
- Cai Haoyu has also ventured into AI by founding Anuttacon, which has a strong team including former Microsoft and Bilibili executives [13]

Group 3: Competitive Landscape
- The article contrasts MiHoYo's AI game "Whispers from the Star" with Elon Musk's AI girlfriend Ani, highlighting different focuses: storytelling and an RPG experience for MiHoYo versus emotional engagement for Musk's product [26]
- The release of a demo for "Whispers from the Star" coincided with Musk's announcement of Ani, underscoring the competitive nature of AI-driven entertainment [14][20]
ChatGPT Agent Officially Launches, and Several Startup Verticals Had a Sleepless Night
量子位· 2025-07-18 00:30
Core Viewpoint
- OpenAI has launched ChatGPT Agent, a unified intelligent agent that combines thinking and execution capabilities, transforming how users interact with technology and manage tasks [2][5][8]

Group 1: Features and Capabilities
- ChatGPT Agent can take over entire computer operations, functioning almost like a new operating system [3]
- In work scenarios it can perform tasks such as scheduling meetings, generating presentations, and submitting expense reports, akin to a high-level executive assistant [4]
- In personal scenarios it can plan travel itineraries and manage significant events, much like a personal secretary to a CEO [4]
- The agent integrates multiple capabilities, including website interaction, high-quality information synthesis, and conversational ability, into a single system [10][12]
- Users can schedule tasks to run at fixed times, such as generating weekly reports [19]

Group 2: User Access and Model Training
- Pro, Plus, and Team users can access the new capabilities, with Pro users able to execute a nearly unlimited number of tasks per month [22][23]
- The model is not entirely new but a specialized version of OpenAI's existing models, trained to dynamically learn and optimize how it executes tasks [26][27]
- ChatGPT Agent achieved state-of-the-art (SOTA) performance on several benchmarks, including a score of 41.6 on the challenging "Humanity's Last Exam" benchmark [30][31]

Group 3: Industry Impact and Future Trends
- The introduction of ChatGPT Agent signals a major shift in the AI landscape, potentially reshaping how tasks are performed across sectors [41]
- The concept of AI agents is evolving, with applications extending beyond simple tasks to more complex, human-like interactions [47][50]
- The rise of AI agents is expected to redefine the internet landscape, moving from website-centric models to agent-centric applications [52][55]
o1 Core Contributor Speaks for the First Time Since Leaving: AI Is the Strongest Lever in History, Surpassing Labor, Capital, and Code
量子位· 2025-07-17 09:03
梦晨, reporting from Aofeisi. 量子位 | WeChat official account QbitAI

Another core researcher who left OpenAI has spoken out. Hyung Won Chung, just reported to have joined Meta, shared his thinking on AI's future: artificial intelligence is becoming the most powerful leverage mechanism in history.

Hyung Won Chung and Jason Wei, who left OpenAI at the same time, are long-time collaborators. Their partnership dates back to the Google Brain era, where they were co-first authors of the influential fine-tuning paper "Scaling Instruction-Finetuned Language Models."

Jason Wei once praised him: "Hyung Won Chung's ability to identify new paradigms and completely discard any sunk costs left a deep impression on me. At the end of 2022 he recognized the power of reinforcement learning and has been championing it ever since."

During his time at OpenAI, Hyung Won Chung contributed to core projects including o1, o1-preview, and Deep Research.

AI as a multiplier of individual capability

Hyung Won Chung began with the image of a flower about to bloom. He pointed out that humans are naturally poor at noticing slow changes that unfold over years, and this "flaw" may lead us to severely underestimate the scale of the transformation AI will bring. Artificial intel ...
Transformer in Danger! Google Unveils the MoR Architecture: Memory Halved, Inference Speed Doubled
量子位· 2025-07-17 09:03
Core Viewpoint
- Google has introduced a new underlying architecture called Mixture-of-Recursions (MoR), which roughly doubles reasoning speed while halving KV-cache memory usage and allows dynamic resource allocation across different tasks within a single framework [1][2][3]

Group 1: MoR Innovations
- MoR integrates unified parameter sharing with adaptive recursion depth, addressing the high computational and memory demands of traditional Transformers while maintaining model performance [7][9]
- The architecture employs a recursive Transformer that divides the model into recursion blocks reusing a shared pool of parameters, which reduces the number of unique parameters and improves distributed-training efficiency [10][13]
- MoR uses a dynamic routing mechanism to assign a different recursion depth to each token, concentrating computation on complex tokens, and incorporates KV-caching strategies to improve memory efficiency (a minimal sketch of this routing follows this summary) [15][19]

Group 2: Performance Comparison
- Experiments comparing MoR with vanilla Transformers and recursive baselines across parameter scales from 135M to 1.7B show that MoR uses nearly 50% fewer parameters while achieving lower validation loss and a higher few-shot accuracy of 43.1% [16][19]
- When trained on a fixed 20B tokens, MoR reduces training FLOPs by 25%, training time by 19%, and peak memory usage by 25% [21]
- The routing-strategy analysis indicates that expert-choice routing outperforms token-choice routing, highlighting the impact of routing granularity on performance [22]

Group 3: Architectural Evolution
- Google has a history of rethinking underlying architectures, aiming to reshape computational paradigms through innovations like the Mixture-of-Experts (MoE) model, which enables efficient training of large models by activating only a subset of expert networks [27][30]
- The introduction of MoR is seen as a potential game-changer in the AI landscape, with expectations that it may eventually surpass the Transformer [32]
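To make the "shared recursion block plus per-token routing" idea in Group 1 concrete, here is a minimal PyTorch sketch. It is an illustration under stated assumptions (layer sizes, a linear router, a geometrically shrinking keep-ratio), not Google's implementation: it masks updates instead of gathering active tokens, so it demonstrates the routing logic without the real compute savings, and KV-cache sharing is omitted entirely.

```python
# Hedged PyTorch sketch of the two MoR ideas above: a single shared recursion block
# reused at every depth, plus expert-choice routing that lets only the highest-scoring
# tokens take another pass. Sizes, the linear router, and the capacity schedule are
# illustrative assumptions; a real implementation would gather active tokens to
# actually save compute and would share/cache KV states across recursion steps.
import torch
import torch.nn as nn

class MixtureOfRecursions(nn.Module):
    def __init__(self, d_model: int = 256, n_heads: int = 4,
                 max_depth: int = 3, keep_ratio: float = 0.5):
        super().__init__()
        # One shared Transformer layer: the same weights serve every recursion step.
        self.shared = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.router = nn.Linear(d_model, 1)   # scores how much more compute a token needs
        self.max_depth = max_depth
        self.keep_ratio = keep_ratio

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); every token gets at least one pass through the block.
        batch, seq, _ = x.shape
        active = torch.ones(batch, seq, dtype=torch.bool, device=x.device)
        for step in range(self.max_depth):
            out = self.shared(x)
            # Only tokens still marked active are updated; exited tokens keep their state.
            x = torch.where(active.unsqueeze(-1), out, x)
            # Expert-choice routing: capacity shrinks each step, keeping only top tokens.
            scores = self.router(x).squeeze(-1).masked_fill(~active, float("-inf"))
            k = max(1, int(seq * self.keep_ratio ** (step + 1)))
            topk = scores.topk(k, dim=-1).indices                    # (batch, k)
            keep = torch.zeros_like(active)
            keep[torch.arange(batch, device=x.device).unsqueeze(1), topk] = True
            active = active & keep
        return x

if __name__ == "__main__":
    model = MixtureOfRecursions()
    tokens = torch.randn(2, 16, 256)
    print(model(tokens).shape)   # torch.Size([2, 16, 256])
```

With the assumed schedule, all 16 tokens get one pass, roughly half get a second, and a quarter get a third, which is the adaptive-depth behavior the summary describes.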