量子位
Zero products, yet backed by AMD's Lisa Su! Former Tencent AI heavyweight Wei Liu's video startup raises another $80 million
量子位· 2026-03-18 00:21
Core Viewpoint
- Video Rebirth, an AI video startup founded by former Tencent AI expert Wei Liu, has successfully raised $80 million (approximately 550 million RMB) in funding, indicating strong investor confidence in its technology and market potential [2][5][12].

Group 1: Company Overview
- Video Rebirth was established over a year ago and has quickly attracted significant investment from top venture capital firms and strategic giants like AMD Ventures and Hyundai [3][5].
- The company is headquartered in Singapore and currently focuses on overseas markets, targeting enterprise and professional individual users [7].

Group 2: Funding Details
- The recent $80 million round will fund the commercialization of Video Rebirth's advanced video generation product line, the Bach series models, and global market expansion [6][8].
- The round began in November 2025, with multiple institutions expressing investment interest while the company's products were still in development [8].

Group 3: Technology and Product Development
- Wei Liu is leading the development of an industrial-grade video generation model named Bach, which has not yet been publicly released [4][22].
- The Bach engine employs a physics-native attention mechanism (PNA) to ensure continuity in character and physical logic throughout long shots, addressing common issues in existing AI models [22].
- The engine's Dual Diffusion Transformer (DDiT) architecture allows for precise instruction adherence, alleviating the time-consuming generation process faced by professional creators [23].

Group 4: Market Position and Future Prospects
- The AI video generation sector is transitioning from a technology validation phase to a critical stage of commercialization and scaling, with increasing interest from various industries [12][13].
- Video Rebirth is positioned at the forefront of this technological wave, with its products expected to compete with established models like OpenAI's Sora [24].
- The company is planning another significant funding round by the end of the month, indicating ongoing investor interest and potential for further growth [25].
NVIDIA's most powerful B200 wastes 60% of its compute! A Princeton team steps in and lifts utilization to 71%
量子位· 2026-03-18 00:21
Core Viewpoint
- The article discusses the inefficiencies of NVIDIA's Blackwell B200 GPU: due to hardware and software compatibility issues, 60% of its computational resources are wasted [1][15].

Group 1: Performance Issues
- The Blackwell B200 GPU has a tensor core computing power of 2.25 PFLOPS, double that of the previous-generation Hopper H100 [7].
- Despite the increase in tensor core computing power, the supporting computational units have not improved, leading to a performance bottleneck [12].
- Memory read/write operations and exponential calculations now take 25%-60% longer than matrix multiplication, causing significant resource idling [13][14].

Group 2: FlashAttention-4 Solution
- FlashAttention-4, developed by a team including Tri Dao and Meta, aims to address the performance bottlenecks of the Blackwell GPU [4][5].
- The algorithm raises utilization from the industry standard of 20%-30% to 71% [4][32].
- It employs three main optimization strategies:
  1. Software emulation of exponential functions to increase throughput, plus conditional softmax rescaling to skip unnecessary computations [18][19].
  2. A restructured computation pipeline that maximizes parallelism by overlapping softmax calculations with matrix multiplications [23][24].
  3. Design headroom for future hardware upgrades to ensure ongoing compatibility and optimization [27][28].

Group 3: Development Efficiency
- FlashAttention-4 is written entirely in Python using the CuTe-DSL framework, eliminating C++ code and significantly improving compilation efficiency [29].
- Compilation times for forward and backward passes are up to 30 times faster than FlashAttention-3, with forward-pass compile time dropping from 55 seconds to 2.5 seconds [30][32].

Group 4: Competitive Advantage
- FlashAttention-4 outperforms NVIDIA's cuDNN 9.13 by 1.1-1.3x and the Triton framework by 2.1-2.7x [34].
- The algorithm is particularly strong in core scenarios such as long sequences and causal masking during model training and inference [37].
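The conditional-rescaling idea above can be illustrated with a streaming softmax: the running accumulator only needs rescaling when the running maximum actually grows. The sketch below is a NumPy stand-in for the concept, not FlashAttention-4's CUDA kernel, which operates on tiles rather than single keys.

```python
import numpy as np

def online_softmax_attention(q, K, V):
    """Single-query attention computed one key at a time with an online
    (streaming) softmax. The accumulator is rescaled only when the running
    max grows -- a simplified model of conditional softmax rescaling."""
    d = q.shape[0]
    m = -np.inf                   # running max of attention logits
    l = 0.0                       # running sum of exp(logit - m)
    acc = np.zeros(V.shape[1])    # running weighted sum of value rows
    rescales = 0
    for k, v in zip(K, V):
        s = float(q @ k) / np.sqrt(d)   # logit for this key
        if s > m:                       # conditional rescale: max grew
            scale = np.exp(m - s)       # exp(-inf) == 0.0 on the first step
            l *= scale
            acc *= scale
            m = s
            rescales += 1
        p = np.exp(s - m)
        l += p
        acc += p * v
    return acc / l, rescales

# Sanity check against a direct softmax over all logits at once.
rng = np.random.default_rng(0)
q, K, V = rng.normal(size=8), rng.normal(size=(16, 8)), rng.normal(size=(16, 4))
out, n_rescales = online_softmax_attention(q, K, V)
s = K @ q / np.sqrt(8)
w = np.exp(s - s.max()); w /= w.sum()
print(np.allclose(out, w @ V))  # True
```

With keys in random order, the max only grows a handful of times, so most iterations skip the rescale entirely; that is the work the conditional variant saves.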
Baidu's "lobster family bucket" hits the table at speed! Its opening move: the world's largest search Skill
量子位· 2026-03-17 11:59
Core Viewpoint
- Baidu has advanced beyond competitors in the AI industry by launching a comprehensive suite of AI applications, referred to as the "lobster family bucket," which includes various tools and capabilities designed for ease of use and deployment [1][8][80].

Group 1: Product Launches and Innovations
- Baidu introduced multiple AI products, including the DuMate desktop AI assistant and the world's first "home lobster" application, enhancing its ecosystem [4][33].
- The Baidu search Skill has passed 45,000 downloads, making it the most downloaded official Skill on the global search engine platform [2][44].
- Baidu's new DuClaw service offers zero-deployment use, eliminating the need for technical knowledge and simplifying access to AI tools [26][27].

Group 2: Ecosystem Development
- Baidu has established a complete ecosystem that integrates cloud deployment, mobile applications, and zero-deployment capabilities, making AI tools accessible and manageable [7][81].
- The company has focused on transforming AI from a mere installation process into a fully functional and scalable productivity system [15][80].
- Baidu's AI capabilities are packaged into standardized Skills, addressing various user needs and enhancing the overall user experience [38][42].

Group 3: Market Position and Strategy
- Baidu's strategy emphasizes not just enabling users to deploy AI but ensuring they can effectively use, manage, and expand their AI capabilities [15][80].
- The company has positioned itself as a leader in the AI space by leveraging its full-stack AI layout, which spans proprietary chips, cloud services, and model capabilities [65][82].
- As the industry shifts focus from deployment to practical application, Baidu is ahead in providing solutions that let users integrate AI into their daily operations [78][81].
Beijing "lobster farmers"! Tomorrow at 19:00, 9+ talks of practical lobster-raising tips. See you at 创业大街
量子位· 2026-03-17 11:59
Core Insights
- The article discusses the increasing adoption of "lobster" technology, with many users expressing dissatisfaction with its performance and seeking practical usage guidance [1][2].
- A "Lobster Experience Sharing Salon" is scheduled to provide hands-on insights from experienced users, aiming to enhance the practical application of the technology in daily life and work [2][8].

Event Details
- The salon will take place on March 18, from 19:00 to 21:00, at a specified location in Beijing [3].
- Attendees can receive a "Lobster Farmer Identity Certification" sticker and engage with like-minded individuals [9].

Talks Overview
- The event features multiple talks covering various aspects of using the "lobster" technology, including:
  - Development of a new lobster model [5]
  - Memory management challenges and solutions for AI agents [5]
  - Personal experiences using OpenClaw as a personal assistant [5]
  - Insights from non-technical backgrounds on agent capabilities [6]
  - Best practices and insights on OpenClaw [6]
  - Innovative applications of OpenClaw in legal practice [6]
  - Daily-life enhancements through integration with health monitoring and delivery APIs [6]
  - Personal journeys in building a highly responsive AI assistant [7]

Participation
- The salon is open for audience registration, encouraging participation from anyone interested in deepening their understanding and application of the technology [8]
Kimi's new architecture impresses Musk! A 17-year-old high schooler makes his name as co-author
量子位· 2026-03-17 06:10
Core Insights
- The article discusses Attention Residuals, a new technique from the Kimi team that applies attention mechanisms to residual connections in deep learning models, enhancing their efficiency and performance [1][6][26].

Group 1: Attention Residuals Technique
- The Kimi team reworked the traditional residual network by applying attention, allowing the model to selectively recall information from previous layers and focus on the most relevant data [2][12].
- The method was validated on the Kimi Linear 48B model, yielding a 25% increase in training efficiency with less than a 2% increase in inference latency [6][22].
- Attention Residuals is a drop-in replacement for existing residual connections, requiring no modifications to other parts of the network [26].

Group 2: Performance Metrics
- The Kimi Linear model outperformed baselines across various tasks, achieving better results in mathematical reasoning and code generation [24][25].
- Specific improvements include MMLU rising from 73.5 to 74.6 and GSM8K from 81.7 to 82.4 [25].

Group 3: Challenges and Solutions
- The article highlights the "PreNorm dilution problem": when every layer contributes with equal weight, the signal from earlier layers is diluted, making relevant early information difficult to retrieve [9][10].
- To tame the computational complexity of full attention residuals, the team introduced Block AttnRes, which compresses the outputs of multiple layers into a single vector, reducing complexity from O(L²) to O(L·B) [15][20].

Group 4: Team and Collaboration
- The paper features a notable collaboration, including a 17-year-old co-author, Nathan Chen, who has drawn attention from prominent figures such as Elon Musk and Andreessen Horowitz [3][31][34].
- Nathan's journey from high-school hackathon participant to contributor on advanced AI research exemplifies the potential for young talent in the field [36][53].
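The mechanism described above, replacing the identity skip with attention over compressed summaries of earlier layer blocks, can be sketched as follows. The weights Wq/Wk, the tanh sub-layer, and the mean-pooled block summaries are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

rng = np.random.default_rng(0)
d, L, B = 16, 8, 4      # hidden size, number of layers, block size

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def sublayer(x, W):     # stand-in for a transformer sub-layer f(x)
    return np.tanh(W @ x)

# Attention-residual sketch: instead of x_{l+1} = f(x_l) + x_l, the skip
# path attends over block summaries (Block AttnRes), so later layers can
# selectively recall earlier representations.
Ws = [rng.normal(scale=0.3, size=(d, d)) for _ in range(L)]
Wq = rng.normal(scale=0.3, size=(d, d))
Wk = rng.normal(scale=0.3, size=(d, d))

x = rng.normal(size=d)
history, summaries = [x], []
for W in Ws:
    h = sublayer(x, W)
    if summaries:                         # attend over O(L/B) block summaries
        H = np.stack(summaries)
        w = softmax((Wq @ x) @ (H @ Wk.T).T)
        skip = w @ H                      # recalled earlier representation
    else:
        skip = x                          # plain residual until a block closes
    x = h + skip
    history.append(x)
    if len(history) % B == 0:             # compress the last B states
        summaries.append(np.mean(history[-B:], axis=0))

print(x.shape, len(summaries))  # (16,) 2
```

Because each layer attends over at most L/B summaries rather than all L previous states, the per-layer cost scales with the number of blocks, which is the complexity reduction the article attributes to Block AttnRes.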
Karpathy praises a computer built into the Transformer! 30,000-token-per-second throughput, cracking the world's hardest Sudoku
量子位· 2026-03-17 06:10
Core Viewpoint
- The article discusses a new approach to enhance the efficiency of large language models (LLMs): embedding a native computer within the Transformer architecture, allowing direct execution of programs without relying on external tools [1][2][3].

Group 1: Current Challenges and Solutions
- Current advanced LLMs struggle with multi-step, long-context precise computation tasks, often relying on external tools for execution [9][10].
- The two mainstream industry workarounds are tool invocation, where models generate scripts executed externally, and intelligent agent scheduling, which breaks down tasks for the model to process [11][12][13].

Group 2: Innovative Approach
- The Percepta team's research embeds a modern RAM computer and a WebAssembly interpreter directly within the Transformer weights, enabling the model to execute code natively [15][16].
- Any standardized program code can be compiled into a sequence of tokens that the model recognizes and executes, enhancing the transparency and verifiability of the computation process [20][18].

Group 3: Efficiency Improvements
- The team introduced a novel 2D attention head design that reduces computational complexity from O(n) to O(log n) by maintaining a convex hull of historical keys during token generation [21][24].
- The HullKVCache built on this principle achieved a throughput of 31,037 tokens per second on standard CPUs, completing approximately 9,000 instruction sequences in just 1.3 seconds, a nearly 200-fold efficiency improvement over traditional KV caches [26][28].

Group 4: Practical Applications
- The method was validated on two complex tasks: a 10x10 minimum-cost perfect matching and the world's hardest Sudoku puzzle, achieving 100% accuracy in both cases [29][35].
- In the perfect matching task, the model executed the Hungarian algorithm at a token generation speed of 33,583 tokens per second [33].
- For the Sudoku puzzle, the model utilized a compiled solver, solving the puzzle in 3 minutes while generating a readable log of the process [38].
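The convex-hull trick rests on a simple geometric fact: the attention score q·k is linear in k, so its maximum over a set of cached keys is always attained at a vertex of their convex hull. The 2D toy below illustrates the pruning; it is not the HullKVCache implementation, and a binary search over the ordered hull (not shown) is what would deliver the O(log n) query bound.

```python
import numpy as np

def convex_hull(points):
    """Monotone-chain convex hull of 2D points."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

# Hard-max attention over 500 cached keys needs only the hull vertices:
# the best score over the full cache equals the best score over the hull.
rng = np.random.default_rng(1)
keys = rng.normal(size=(500, 2))
hull = convex_hull(keys)
q = rng.normal(size=2)
best_full = max(float(np.dot(q, k)) for k in keys)
best_hull = max(float(np.dot(q, k)) for k in hull)
print(len(hull), abs(best_full - best_hull) < 1e-9)
```

For Gaussian-distributed keys the hull has only a handful of vertices, so almost the entire cache can be skipped at query time.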
A reliable, enterprise-grade "lobster" upgrade that refuses to lose control
量子位· 2026-03-17 04:13
Core Insights
- The article discusses the rapid rise and subsequent decline of AI solutions like OpenClaw, highlighting the need for AI employees that can effectively assist in complex business environments rather than just lightweight tasks [2][3][48].
- It emphasizes the importance of comprehensive underlying infrastructure, specifically enterprise-level models, to support the effective deployment of AI digital employees [5][6][10].

Group 1: Company Overview
- Dipu Technology has recently upgraded its Deepexi enterprise model, which differs from general models in its focus on business-specific applications and its integration with platforms such as FastAGI and FastData Foil [6][10][14].
- The company has reported narrowing losses, attributed to significant revenue growth and improved gross margins, indicating a positive cycle of revenue growth leading to profitability [10][11].
- Dipu Technology serves numerous leading enterprises across retail, manufacturing, healthcare, and transportation, showcasing its broad client base [12].

Group 2: AI Model Capabilities
- The upgraded Deepexi enterprise model is designed to accurately understand business processes and logic, enabling it to perform tasks such as coding, data querying, and process execution [24][25][39].
- The model's training is supported by a proprietary Deepology dataset, developed through extensive collaboration with over 300 industry-leading clients to ensure high-quality data for AI training [26][28].
- Integration with FastData Foil enhances the model's ability to process multimodal data, allowing dynamic updates and improved data governance [33][35].

Group 3: Market Position and Future Directions
- Dipu Technology distinguishes itself in the enterprise AI market by focusing on practical applications and the seamless integration of AI into business processes, moving away from the hype surrounding general models [42][46].
- The company plans to develop next-generation models and explore the integration of AI with physical robotics, aiming to create embodied AI employees that can operate in real-world environments [50].
- The ongoing evolution of the Deepexi model and its associated platforms positions Dipu Technology as a leader in the enterprise AI space, capable of delivering substantial value to businesses [42][49].
Long videos drift because earlier frames are "too clean"! Research shows a shared noise level is the key to stable long video
量子位· 2026-03-17 04:13
Core Insights
- The article discusses the challenges of autoregressive (AR) video generation, particularly error accumulation that leads to drift and quality degradation over long sequences, and introduces HiAR, a new model designed to address these issues [3][4][12].

Group 1: Problem Identification
- The main challenge in AR video generation is the inconsistency between training and inference, which results in accumulated errors and significant drift in longer video sequences [3][4].
- Existing methods to mitigate drift, such as simulating prediction errors and using first-frame sinks, have limitations that hinder their effectiveness [3][18].

Group 2: HiAR Model Introduction
- HiAR is a collaborative effort from researchers at multiple institutions aimed at explaining the causes of drift and providing an efficient solution [5].
- The model re-evaluates the necessity of completely denoised previous frames, proposing a hierarchical denoising framework that allows causal generation without waiting for prior frames to be fully denoised [9][10].

Group 3: Technical Innovations
- HiAR maintains a shared noise level across all video blocks during denoising, significantly reducing error propagation between blocks and enabling pipeline-parallel inference [9][16].
- Forward KL regularization during training helps maintain dynamic diversity in generated videos, preventing the model from producing static, low-motion outputs [10][11].

Group 4: Performance Evaluation
- HiAR demonstrated superior performance on the VBench long-video benchmark, achieving a drift score of 0.257, significantly lower than baseline methods, while maintaining high visual quality and semantic stability [13][14].
- The model can generate high-quality continuous video for extended durations, achieving 3 hours of video generation from just 5 seconds of training data, although some semantic continuity issues remain due to the absence of external memory modules [15].

Group 5: Engineering Advantages
- HiAR's hierarchical denoising architecture allows approximately 1.8x faster inference without compromising video quality, achieving a throughput of 30 frames per second at a low latency of 0.30 seconds per chunk [16].
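The pipeline-parallel speedup claimed above follows from simple scheduling arithmetic: if each chunk needs T denoising steps but the next chunk may start as soon as its predecessor has advanced one step, rather than finishing all T, total wall-clock steps fall from num_chunks·T to num_chunks + T - 1. The toy schedule below is an illustration of that arithmetic under assumed numbers; HiAR's actual shared-noise mechanics are more involved.

```python
# Each chunk c runs its denoising step t in time slot c + t, one step
# behind its predecessor, instead of waiting for it to finish entirely.
T, num_chunks = 4, 6
schedule = {}                                # time slot -> (chunk, step) pairs
for c in range(num_chunks):
    for t in range(T):
        schedule.setdefault(c + t, []).append((c, t))

sequential_steps = num_chunks * T            # fully sequential denoising
pipelined_steps = max(schedule) + 1          # overlapped, chunk-pipelined
print(sequential_steps, pipelined_steps)     # 24 9
```

With these toy numbers the pipeline is 24/9 ≈ 2.7x faster; the article's measured 1.8x reflects real per-step costs and dependencies rather than this idealized schedule.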
量子位 (Quantum Bit) is hiring editors and writers
量子位· 2026-03-17 04:13
Core Viewpoint
- The article emphasizes the ongoing AI boom and invites individuals to join "Quantum Bit," which focuses on tracking AI advancements and has established itself as a leading content platform in the industry [1].

Group 1: Job Opportunities
- The company is hiring in three main directions: AI Industry, AI Finance, and AI Product, with positions available for both experienced professionals and fresh graduates [2][4].
- Roles span multiple levels, including editors, lead writers, and chief editors, with a focus on matching roles to individual capabilities [6].

Group 2: Job Responsibilities
- AI Industry: tracking innovations in infrastructure, such as chips, AI infrastructure, and cloud computing, and producing accessible reports on technical conferences and papers [6][7].
- AI Finance: covering venture capital and financial reports, analyzing capital movements within the AI industry, and interviewing investors and entrepreneurs [11].
- AI Product: monitoring AI applications and hardware developments, writing in-depth product evaluations, and engaging with product experts [11].

Group 3: Benefits and Work Environment
- Employees can expect a vibrant team atmosphere, opportunities for personal influence through original content creation, and professional mentorship from senior editors [6][11].
- The company offers competitive salaries and comprehensive benefits, including social insurance, meal allowances, and performance bonuses [6].

Group 4: Company Growth and Reach
- By 2025, Quantum Bit aims to have over 2.4 million subscribers on WeChat and more than 7 million users across platforms, with a daily reading volume exceeding 2 million [12].
- The company is recognized as the top new-media outlet in the AI and frontier technology sectors according to third-party data platforms [12].
GPT-5.4 rakes in $1 billion ARR in one week! A single "hi" burns $80, yet efficiency soars 32x
量子位· 2026-03-17 04:13
Core Insights
- GPT-5.4 has achieved record-breaking performance, processing approximately 5 trillion tokens daily and generating an annualized net new revenue of $1 billion within just one week of its launch [1][2].

Group 1: Performance Metrics
- GPT-5.4's daily traffic has surpassed OpenAI's total API usage from a year ago [2].
- For scale, GPT-5.4 processes the equivalent of over 45 million complete books daily, measured by the word count of "Dream of the Red Chamber" [3].
- Running the full intelligence index test cost approximately $2,951, about 28% higher than GPT-5.2 [13].

Group 2: Cost and Efficiency
- GPT-5.4 is significantly more expensive to run due to its increased token consumption: around 120 million tokens on the test, 55% more than GPT-5.3 [15].
- GPT-5.4 is priced at $2.5 per million input tokens and $15 per million output tokens, versus GPT-5.2's $1.75 and $14 respectively [17].
- Despite the higher costs, GPT-5.4 has shown a 32-fold increase in efficiency over the past three months, achieving a 90% accuracy rate at a cost of only $0.37 per task [26][27].

Group 3: Technological Advancements
- GPT-5.4 is the first "unified model" integrating reasoning, programming, computer-native interaction, and deep web search, with support for a million-token context [30].
- It has demonstrated superior performance on key benchmarks, outperforming previous models and showing an 83% probability of surpassing human performance across 44 job roles [33].

Group 4: Practical Applications
- The model can autonomously perform tasks such as sending emails and scheduling appointments, showcasing its advanced computer-operation capabilities [39].
- Users have tested its ability to create interactive scripts and even draw images in Microsoft Paint, demonstrating its versatility on complex tasks [41][44].

Group 5: Market Impact
- OpenAI's strategy appears to focus on rapidly deploying tokens to the market, framed as a best practice in capitalism and innovation [49].
- The company has secured $110 billion in new financing while facing challenges with its data center infrastructure, indicating a dual approach to growth and operational hurdles [51].
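The per-token prices quoted above make run costs easy to estimate. The helper below uses only those published prices; the even input/output split in the example is an assumption for illustration, since the article reports only total token counts.

```python
# USD per million tokens: (input, output), from the article's pricing.
PRICES = {
    "GPT-5.4": (2.50, 15.00),
    "GPT-5.2": (1.75, 14.00),
}

def run_cost(model, input_tokens, output_tokens):
    """Total USD cost of a job, given token counts for each direction."""
    p_in, p_out = PRICES[model]
    return (input_tokens * p_in + output_tokens * p_out) / 1e6

# A 120M-token job split evenly between input and output (assumed split):
print(run_cost("GPT-5.4", 60e6, 60e6))  # 1050.0
print(run_cost("GPT-5.2", 60e6, 60e6))  # 945.0
```

Because output tokens cost roughly six times more than input tokens at these rates, the input/output mix, not just the total count, dominates the bill.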