量子位
DeepSeek open-sources a brand-new OCR model! CLIP is dropped for a lightweight Qwen model, with performance rivaling Gemini-3 Pro
量子位· 2026-01-27 08:32
By henry, from Aofei Temple | QbitAI (WeChat official account QbitAI)

DeepSeek has just open-sourced a new OCR model, DeepSeek-OCR 2, built for accurately converting PDF documents into Markdown.

Compared with the first-generation model released on October 20 last year, the core breakthrough of DeepSeek-OCR 2 is that it breaks with the rigid "raster scan" logic of traditional models and instead dynamically reorders visual tokens according to image semantics. To do this, DeepSeek-OCR 2 drops the CLIP component used in its predecessor and builds DeepEncoder V2 on a lightweight language model (Qwen2-0.5B), introducing "causal reasoning" capability at the visual-encoding stage. This change mimics the causal visual flow of a human reading a document, letting the model intelligently reorder visual tokens before the LLM interprets the content.

On performance, DeepSeek-OCR 2 matches Gemini-3 Pro while remaining a lightweight model. On the OmniDocBench v1.5 benchmark it improves by 3.73% and makes notable progress in visual reading order. [Table: per-model OmniDocBench results — max visual tokens, Overall ↑, Formula ↑, Table TEDS ↑; truncated in source] ...
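The difference between raster-scan order and semantic reordering can be sketched with a toy example. This is not DeepSeek's code; the "score" field simply stands in for whatever reading-order priority a causal encoder such as DeepEncoder V2 might assign to each region:

```python
# Toy contrast between raster-scan token order and semantic reordering.
# Assumption: a 2x2 grid of page regions laid out as two text columns;
# "score" is a hypothetical reading-order priority from a causal encoder.

def raster_order(patches):
    """Row-major order: how a conventional ViT/CLIP-style encoder emits tokens."""
    return sorted(patches, key=lambda p: (p["row"], p["col"]))

def semantic_order(patches):
    """Reorder by a predicted reading-order score instead of grid position."""
    return sorted(patches, key=lambda p: p["score"])

# A two-column page: the left column should be read top-to-bottom first.
page = [
    {"id": "L-top",    "row": 0, "col": 0, "score": 0},
    {"id": "R-top",    "row": 0, "col": 1, "score": 2},
    {"id": "L-bottom", "row": 1, "col": 0, "score": 1},
    {"id": "R-bottom", "row": 1, "col": 1, "score": 3},
]

print([p["id"] for p in raster_order(page)])    # interleaves the two columns
print([p["id"] for p in semantic_order(page)])  # follows the human reading flow
```

On a multi-column page, raster order mixes text from unrelated columns into one stream, which is exactly the failure mode semantic reordering is meant to fix.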
Robots couldn't see clearly; Ant just fixed it
量子位· 2026-01-27 06:57
By Jin Lei, from Hangzhou | QbitAI (WeChat official account QbitAI)

Robots have long suffered from being unable to see transparent and reflective objects. After all, even small animals, and sometimes people, comically walk straight into spotless glass doors. Worse, when a robot tries to pick up a transparent glass cup or a reflective stainless-steel object, it frequently goes "suddenly blind".

The root of the problem is the robot's eyes: the depth camera. Whether based on structured light or binocular stereo vision, depth cameras rely on stable reflection of light from object surfaces. Transparent materials let light pass straight through, while highly reflective materials bounce it away in unpredictable directions, so the sensor receives no valid return signal and produces large numbers of missing or erroneous depth values. Compare what a human sees with what the robot sees in the same scene and the problem is obvious.

It is no exaggeration to say that this kind of blindness has been a major obstacle keeping robots from safely entering homes, shopping malls, and hospitals. But now, with a newly proposed technique, the robots' eye trouble has finally been cured: 蚂蚁灵波 (RobbyAnt), Ant Group's embodied-intelligence company, has open-sourced LingBot-Depth, which it bills as the world's clearest-seeing depth vision model. For the same two scenes above, let's look directly at the results with LingBot-Depth ...
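The failure mode described above, where pixels with no valid return show up as holes in the depth map, can be made concrete with a small sketch. This is not LingBot-Depth itself; the nearest-valid fill is a deliberately naive stand-in for what a learned model does with semantic completion:

```python
# Toy sketch of depth-map holes from transparent/reflective surfaces.
# Assumption: missing returns are encoded as 0, a common convention.

def find_holes(depth):
    """Return (row, col) of pixels with no valid sensor return (value 0)."""
    return [(r, c) for r, row in enumerate(depth)
            for c, d in enumerate(row) if d == 0]

def fill_row_nearest(depth):
    """Naive repair: fill each hole with the nearest valid value in its row.
    A learned depth model replaces this heuristic with semantic completion."""
    out = [row[:] for row in depth]
    for r, row in enumerate(out):
        valid = [(c, d) for c, d in enumerate(row) if d != 0]
        for c, d in enumerate(row):
            if d == 0 and valid:
                out[r][c] = min(valid, key=lambda v: abs(v[0] - c))[1]
    return out

# Depth in mm; a glass cup leaves a hole in the middle of each scanline.
depth = [
    [800, 810, 0, 0, 830],
    [805, 815, 0, 0, 835],
]
print(find_holes(depth))
print(fill_row_nearest(depth))
```

A heuristic like this only smears neighboring values into the gap; recovering the true geometry of a glass cup requires the kind of learned prior the article attributes to LingBot-Depth.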
Altman admits OpenAI's course went astray, and says "writing code will no longer be important"
量子位· 2026-01-27 05:37
Core Viewpoint - AI is redefining work, technology, and education, leading to an increase in demand for software engineers rather than a decrease [4][6][7]. Group 1: AI and Software Engineering - AI will enable engineers to capture more work value, reducing time spent on coding and debugging, allowing them to focus on making systems work effectively [4][6]. - The number of software engineering jobs is expected to significantly increase, with a larger portion of global GDP being created through AI-driven methods [6][7]. - Custom software tailored for individuals or small groups will become prevalent, enhancing personal productivity [5][6]. Group 2: AI Model Development - OpenAI acknowledges past mistakes in developing the ChatGPT-5 series, which focused too much on specific capabilities at the expense of others [18][19]. - The future direction aims to return to a more balanced, general-purpose model that excels across various dimensions, including communication and expression [21][22]. - There is confidence that future models can integrate multiple strong capabilities into a single framework [23][28]. Group 3: Economic Implications of AI - AI is expected to have a deflationary effect, empowering individuals to accomplish tasks previously reserved for larger organizations, potentially reducing long-standing economic disparities [34][36]. - However, there is a cautionary note that AI could also concentrate power and wealth in the hands of a few, depending on how it is deployed and regulated [37][38]. Group 4: AI in Education - AI's role in early childhood education is questioned, with a belief that technology should not be introduced at such a formative stage [14][16]. - The long-term impacts of technology on youth development remain unclear, necessitating careful consideration before integrating AI into educational settings [15][16]. 
Group 5: AI and Attention Economy - Despite advancements in AI making software development easier, the challenge of capturing human attention and creating meaningful connections with products remains significant [43][45]. - The scarcity of human attention in a world of abundant software capabilities means that creating exceptional value is still essential for entrepreneurial success [46].
A 3D Nano Banana is here! AI model retouching becomes reality as 3D generation enters the editable era
量子位· 2026-01-27 03:53
Core Viewpoint - The article highlights the emergence of 3D generation technology as a critical area in AI, with significant advancements led by the Chinese team Hyper3D, particularly through their product Rodin Gen-2 Edit, which integrates 3D generation and editing capabilities [1][3][27]. Group 1: 3D Generation and Editing Technology - Hyper3D has launched Rodin Gen-2 Edit, the first commercial product that combines "3D generation" and "3D editing" into a complete workflow, marking the entry of 3D generation into the editable era [3][11]. - The editing functionality allows users to select specific areas of a model and input text commands for modifications, such as changing a robot's arms to cannons, demonstrating a user-friendly approach to 3D model editing [4][5][20]. - The platform supports importing any existing models, including third-party AI-generated models, for editing, establishing Hyper3D's editing capabilities as a foundational infrastructure rather than a standalone feature [9][11]. Group 2: Technological Advancements and User Experience - Hyper3D Rodin showcases cutting-edge technology, enabling users to modify, add, or remove model components through natural language without affecting the overall structure, thus revolutionizing 3D modeling [13][21]. - The transition from "generation" to "editing" fills a crucial gap in the AI workflow, allowing for iterative design processes rather than random generation, which has been common in the past [14][19]. - The platform's capabilities are enhanced by the introduction of 3D ControlNet, which allows precise control over geometric structures during the generation phase, and the BANG technology, which facilitates recursive disassembly of complex models for localized editing [17][25]. 
Group 3: Market Position and Future Directions - Hyper3D's advancements have been recognized by the market, with the team completing two rounds of funding from top-tier VC and strategic industry players in 2025, indicating strong investor confidence in their technology [27]. - The company aims to extend beyond single-object editing, with future developments targeting the creation of complete 3D scenes that include objects, relationships, and physical constraints, laying the groundwork for future "world models" and embodied intelligence infrastructure [26]. - The launch of Rodin Gen-2 Edit represents a significant step in making 3D generation not just feasible but practically usable, providing a valuable reference point for the industry [27].
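The select-a-region-and-type-an-instruction workflow described above can be sketched as a data model. This is a purely hypothetical interface, not Rodin Gen-2 Edit's actual API; the point is only that an edit is scoped to one part while the rest of the model is preserved:

```python
# Hypothetical sketch of a mask-scoped 3D edit request. All names here
# (EditRequest, Model3D, apply_edit) are illustrative, not Rodin's API.

from dataclasses import dataclass, field

@dataclass
class EditRequest:
    target_part: str   # the user-selected region of the mesh
    instruction: str   # natural-language command, e.g. "turn these into cannons"

@dataclass
class Model3D:
    parts: dict = field(default_factory=dict)  # part name -> description

    def apply_edit(self, req: EditRequest) -> "Model3D":
        """Return a new model in which only the targeted part changes."""
        if req.target_part not in self.parts:
            raise KeyError(f"unknown part: {req.target_part}")
        new_parts = dict(self.parts)
        new_parts[req.target_part] = f"{req.instruction} (edited)"
        return Model3D(new_parts)

robot = Model3D({"head": "dome head", "arms": "gripper arms", "legs": "treads"})
edited = robot.apply_edit(EditRequest("arms", "cannons"))
print(edited.parts["arms"])   # only the selected part changes
print(edited.parts["head"])   # everything else is preserved
```

Returning a new model rather than mutating in place mirrors the iterative design loop the article describes: each edit is a revision the user can keep or discard.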
Stanford and NVIDIA unveil test-time reinforcement learning: a fine-tuned open-source model beats top closed-source models for only a few hundred dollars
量子位· 2026-01-27 02:33
Core Insights - The article discusses a new approach called Test-Time Training to Discover (TTT-Discover), which aims to solve open scientific problems by incorporating reinforcement learning during the testing phase of model evaluation [1][2]. Group 1: Methodology - TTT-Discover is based on the open-source model gpt-oss-120b and achieves state-of-the-art (SOTA) performance across multiple domains, outperforming human experts and closed-source models [3]. - Unlike traditional methods that rely on "Test-time Scaling" through prompt scheduling, TTT-Discover updates model weights during the testing phase to learn from specific problems [4][5]. - This "test-time training" allows the model to gain real-time experience from failed attempts, leading to a directed evolution of its capabilities [6]. Group 2: Learning Objectives - TTT-Discover employs an Entropic Objective, which focuses on maximizing the reward for the best actions rather than average rewards across all tasks, aiming for a single optimal solution instead of multiple mediocre ones [9][10][11]. - The method introduces a reuse mechanism inspired by PUCT, maintaining historical attempts in a buffer to prioritize the most promising states while balancing exploration [12]. Group 3: Implementation and Results - The model generates a "private dataset" through continuous action generation and feedback reception, addressing the out-of-distribution (OOD) problem by creating data specific to the problem at hand [13][14]. - TTT-Discover's approach contrasts with traditional test-time search methods, which do not update model weights and thus do not enhance the model's capabilities [15][16]. - The algorithm involves a cycle of selecting potential solutions, generating new attempts, and evaluating results, with the model's weights updated after each iteration to improve performance [17][18][27]. 
Group 4: Performance Metrics - In experimental settings, TTT-Discover demonstrated a speed improvement of approximately 2 times compared to the best human implementations in kernel engineering tasks [27]. - The testing cost for a single problem is estimated to be several hundred dollars, showcasing the efficiency of the approach [27]. Group 5: Future Directions - TTT-Discover is primarily applicable to continuous reward scenarios, with future work needed to extend its capabilities to sparse, binary, and unverifiable reward problems [29].
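The contrast between a mean-reward objective and an entropic one can be shown numerically. This is not the paper's code, and the exact formulation in TTT-Discover may differ; the sketch uses one common risk-seeking form, (1/β)·log E[exp(β·r)], which approaches max(r) as β grows, matching the article's point that the goal is one best solution rather than many mediocre ones:

```python
import math

# Toy contrast between a mean-reward objective and an entropic objective
# that emphasizes the best attempts. Assumption: the log-mean-exp form
# below is one standard entropic objective, not necessarily the paper's.

def mean_objective(rewards):
    return sum(rewards) / len(rewards)

def entropic_objective(rewards, beta=5.0):
    m = max(rewards)  # subtract the max for numerical stability
    avg = sum(math.exp(beta * (r - m)) for r in rewards) / len(rewards)
    return m + math.log(avg) / beta

attempts_a = [0.2, 0.2, 0.9]  # mostly weak, but one strong attempt
attempts_b = [0.5, 0.5, 0.5]  # uniformly mediocre

print(mean_objective(attempts_a) < mean_objective(attempts_b))          # mean prefers B
print(entropic_objective(attempts_a) > entropic_objective(attempts_b))  # entropic prefers A
```

Under the mean objective, the uniformly mediocre policy looks better; under the entropic objective, the policy that occasionally finds a great solution wins, which is the right incentive for open-ended discovery.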
QbitAI is hiring editors and writers
量子位· 2026-01-27 02:33
By the editorial team, from Aofei Temple | QbitAI (WeChat official account QbitAI)

We are a content platform centered on tracking new advances in AI. After eight years, we have built top-tier influence, broad and widely recognized industry resources, and one of the best vantage points for observing and learning at this moment of the era. The AI wave is still surging; if you don't yet know how to take part in it, why not join 量子位 (QbitAI)?

We are currently hiring in three directions, and we hope you are (or can become) a content expert in one of them:

- AI industry: infrastructure-layer innovation, including chips, AI Infra, and cloud computing;
- AI finance: venture capital and earnings reports in AI, tracking capital flows along the industry chain;
- AI products: AI progress in applications and hardware devices.

All positions are full-time, based in Zhongguancun, Beijing. Positions at every level are open; apply according to your experience:

- Experienced hires: editor, staff writer, and managing editor, matched to ability;
- Campus hires: new graduates, including interns with conversion to full-time.

By joining us, you will:

- Stand at the crest of the AI wave: be among the first to encounter the newest AI technologies and products, and build a complete AI knowledge system;
- Master new AI tools: apply new AI technologies and tools in your work to boost efficiency and creativity;
- Build personal influence: by writing exclusive original content ...
The attention mechanism in multimodal LLMs hides a "trap" that takes one formula to fix | Shanghai University × Nankai University
量子位· 2026-01-27 02:33
Core Insights - The article discusses the reliability of attention mechanisms in Vision-Language Models (VLMs), highlighting that attention may not be a trustworthy indicator of semantic importance due to structural biases [2][12] Group 1: Attention Mechanism Issues - Attention is influenced by structural biases, such as position bias, which favors later tokens in a sequence, leading to potential misinterpretation during visual token pruning [3][5] - The phenomenon of "padding attention sink" is identified, where padding areas receive disproportionately high attention, misleading pruning strategies [5][6] Group 2: Proposed Solutions - The research team from Shanghai University suggests a debiasing approach to correct attention biases without introducing new pruning methods or additional training processes [6][12] - By modeling the overall trends of attention biases, the team effectively reduces irrelevant positional factors, enhancing the semantic relevance of attention [6][12] Group 3: Experimental Results - The debiasing strategy was integrated as a plug-and-play module into various mainstream attention-based visual token pruning methods, showing consistent performance improvements across multiple tasks [7][10] - Experimental results indicate that the pruning models with the debiasing correction achieved stable performance enhancements, particularly under aggressive token compression conditions [10][12] Group 4: Conclusion - The findings emphasize that attention is not inherently equivalent to semantic importance, and ignoring inherent structural biases can mislead pruning strategies, affecting overall model performance [12]
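The debiasing idea above can be illustrated with made-up numbers. This is not the paper's implementation; the sketch models the bias as the mean attention per position across many queries, subtracts that trend, and prunes by the residual:

```python
# Toy illustration of debiasing attention before visual-token pruning.
# Assumption: position bias is estimated as the per-position mean over a
# pool of attention rows; the real method may model the trend differently.

def positional_bias(attn_rows):
    """Mean attention per position across queries: the structural trend."""
    n = len(attn_rows[0])
    return [sum(row[i] for row in attn_rows) / len(attn_rows) for i in range(n)]

def top_k(scores, k):
    """Indices of the k highest-scoring tokens (the ones pruning keeps)."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

# Each row = attention over 4 visual tokens; a built-in bias makes later
# positions score higher regardless of content.
pool = [
    [0.25, 0.20, 0.30, 0.40],
    [0.10, 0.35, 0.30, 0.40],
    [0.10, 0.20, 0.45, 0.40],
    [0.10, 0.20, 0.30, 0.55],
]
query = [0.25, 0.20, 0.30, 0.40]  # its semantic bump is at position 0
bias = positional_bias(pool)
debiased = [s - b for s, b in zip(query, bias)]

print(top_k(query, 1))     # raw attention keeps the position-biased token
print(top_k(debiased, 1))  # debiased attention keeps the informative one
```

Raw attention ranks the last token highest purely because of its position; after subtracting the trend, the token that actually carries extra attention for this query rises to the top, which is the behavior a pruning strategy needs.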
1.177 billion yuan in capital bets on the No.1 new trucking player, as the progressive L2-to-higher-autonomy route is proven out first in commercial vehicles!
量子位· 2026-01-27 02:33
By Jia Haonan, from Aofei Temple | QbitAI (WeChat official account QbitAI)

Even on a hardcore track widely regarded as having high technical barriers and daunting commercialization challenges, some players manage to grow against the cycle.

The investors in this round include 普华资本, ABC Impact (a Temasek-affiliated investment firm), 欣旺达, 前海淏天, 瀚棠置业, 临沂国科, 长兴创强基金, 山东国控资本, 联想创投, 大湾区基金, 光跃投资, and 红山基金. With state, foreign, and industrial capital all on board, they seized the window to catch the last ride before DeepWay深向 goes public.

An ever-expanding circle of backers is the norm at DeepWay深向: over the past five years it raised five tranches in its Series A and three in its Series B. Publicly disclosed funding totals 1.98 billion yuan; adding this round's 1.177 billion brings the total past 3 billion. That is one measure of how sought-after this autonomous-trucking company is.

In five years, DeepWay深向 has reached annual revenue of several billion yuan by selling new-energy heavy trucks, and it is still accelerating: a single quarter of deliveries in 2025 exceeded all of 2024. But what investors are betting on is more than vehicle sales, since that logic alone cannot support DeepWay深向's valuation and potential as the "first autonomous-trucking stock" operating on open public roads. Defining and building new-energy heavy trucks from the ground up, once the only commercial-vehicle player authorized to use Baidu Apollo technology, with the full three-electric stack developed in-house... Seeing through Dee ...
The "extra-large" Qwen3 reasoning model that swept SOTA charts while still half-finished is now officially live
量子位· 2026-01-26 15:30
Core Viewpoint - The article highlights the launch of Qwen3-Max-Thinking by Alibaba Qwen, which has achieved state-of-the-art (SOTA) performance in various benchmark tests, surpassing leading models like GPT-5.2-Thinking and Claude-Opus-4.5 in multiple categories [1][2]. Group 1: Model Performance - Qwen3-Max-Thinking has demonstrated superior performance in 19 authoritative benchmark tests, achieving scores that match or exceed those of top closed-source models [1]. - In the MMLU-Pro benchmark, Qwen3-Max-Thinking scored 85.7, while GPT-5.2-Thinking scored 87.4, and Claude-Opus-4.5 scored 89.5 [2]. - The model's reasoning capabilities were highlighted, achieving a score of 91.5 in the IMO-AnswerBench, the highest among competitors [31]. Group 2: Technical Innovations - Qwen3-Max-Thinking incorporates two key innovations: adaptive tool invocation and test-time scaling, which significantly enhance its reasoning performance and native agent capabilities [3][19]. - The adaptive tool invocation allows the model to autonomously select and utilize built-in functions such as search and code interpreters during interactions, improving efficiency [22][24]. - Test-time scaling allocates additional computational resources during the reasoning phase, leading to improved performance without unnecessary redundancy [27][30]. Group 3: Market Impact and Adoption - The article notes that Chinese open-source AI models have gained significant traction, with a 17.1% adoption rate in global model downloads, surpassing the U.S. at 15.8% [36]. - Alibaba's Qwen series has achieved over 10 billion downloads, averaging 1.1 million downloads per day, establishing itself as a new benchmark in the global AI open-source community [39]. - The integration of Qwen models into Alibaba's ecosystem, including platforms like Taobao and Alipay, indicates a strategic focus on combining top-tier model capabilities with practical applications [42][43].
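The adaptive tool-invocation loop described above can be sketched generically. This is not Qwen's actual mechanism or API; the "model" below is a scripted stand-in that decides at each step whether to answer directly or call a built-in tool, with the loop feeding tool results back:

```python
# Generic agent-loop sketch for adaptive tool invocation. Assumptions:
# fake_model, TOOLS, and the message format are all hypothetical.

def fake_model(history):
    """Scripted policy: compute 6 * 7 via the code tool, then answer."""
    if not any(msg.startswith("tool:") for msg in history):
        return {"type": "tool_call", "tool": "python", "arg": "6 * 7"}
    result = history[-1].split(":", 1)[1]
    return {"type": "answer", "text": f"The result is {result}"}

TOOLS = {"python": lambda arg: str(eval(arg))}  # toy code interpreter

def agent_loop(model, max_steps=5):
    history = ["user: what is 6 * 7?"]
    for _ in range(max_steps):
        step = model(history)
        if step["type"] == "answer":
            return step["text"]
        history.append("tool:" + TOOLS[step["tool"]](step["arg"]))
    raise RuntimeError("no answer within budget")

print(agent_loop(fake_model))  # -> "The result is 42"
```

The efficiency gain the article attributes to adaptive invocation comes from the model choosing when a tool call is worth the extra step, rather than calling tools on every turn.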
The chips behind Luckin Coffee can no longer stay hidden
量子位· 2026-01-26 10:14
Core Viewpoint - The article discusses the significant role of edge AI and the importance of chips in the operations of Luckin Coffee, revealing the partnership with a newly listed domestic GPU company, TianShu ZhiXin [8][35]. Group 1: Edge AI and Chip Importance - Luckin Coffee utilizes edge AI to monitor various operational aspects such as order recognition, material status, and equipment performance, ensuring real-time data synchronization for quality control and decision-making [3][4]. - The chips are crucial for deploying edge AI, requiring proximity for computation, quick response times, strong stability, and cost control [6][7]. Group 2: TianShu ZhiXin and Product Launch - TianShu ZhiXin recently launched four edge computing products under the Tongyang series, which are already in use by Luckin Coffee [9][10]. - The Tongyang series includes four products: TY1000, TY1100, TY1100_NX, and TY1200, designed to cater to various computational needs and deployment scenarios [16][29]. Group 3: Product Specifications and Performance - The TY1000 model is compact yet powerful, offering nearly 200T of dense computing power and outperforming NVIDIA's AGX Orin in several benchmarks [18][20]. - The TY1100 features a 12-core ARM v9 architecture, suitable for complex scenarios requiring high general computing and AI inference [22][24]. - The TY1100_NX is designed for users sensitive to memory capacity and cost, while the TY1200 targets end-users looking to integrate AI capabilities directly into devices [26][28]. Group 4: Market Position and Ambitions - TianShu ZhiXin aims to surpass NVIDIA, with a roadmap indicating plans to release architectures that outperform NVIDIA's offerings by 2025 and beyond [36][39]. - The company has already delivered over 52,000 chips and serves over 300 clients, demonstrating significant commercial traction and application in various industries [49][51]. 
Group 5: Broader Implications - The integration of domestic computing power into various sectors signifies a shift in the industry, where chips are becoming essential components of business operations rather than mere specifications [54].