OpenClaw's strongest plugin has arrived: the little lobster that couldn't scrape data is saved!
量子位· 2026-03-08 04:26
克雷西 (Cressy), from Aofei Temple
QbitAI | WeChat official account QbitAI

The annoying bugs that kept derailing web scraping while running OpenClaw unattended finally have a fix. A data-collection gem called Scrapling has become OpenClaw's "strongest plugin" almost overnight. It can not only punch through all kinds of anti-scraping shields, but also rip messy page source straight off the web and clean it into tidy structured data.

After its makeover into a go-to lobster tool, this project, released more than a year ago, saw its popularity explode, racking up 23,000 stars and shooting to No. 1 on GitHub's daily trending chart. Now that the tool has gone viral, the original author has said plainly that Scrapling is being turned into an OpenClaw Skill, sending expectations through the roof.

Scrapling's built-in StealthyFetcher stealth fetcher exists precisely to deal with these obnoxious blocks. It closely mimics the fingerprint and behavior of the latest browsers, helping OpenClaw bypass them out of the box (a hedged code sketch of this fetch-and-clean workflow appears below). Beyond dodging blocks, a scraper also has to cope with the site redesigns that owners roll out on a whim.

Web scrapers become an AI unattended-operation power tool

When you send an agent online to grab data, the most annoying part is the human-verification challenges that pop up and ask you to pick images; one careless moment and you get locked out. The old scraping tools were far too rigid: they typically clung to a few fixed paths, and the moment a page's layout changed slightly ...
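To make the workflow described above concrete, here is a minimal sketch of fetching a protected page with Scrapling's StealthyFetcher and cleaning the raw HTML into structured records. The import path, method names, parameters, URL, and selectors are assumptions based on the article's description, not a verified reference for the library's actual API.

```python
# Hedged sketch: stealth-fetch a page, then extract structured data.
from scrapling.fetchers import StealthyFetcher  # assumed import path

# Assumed call: fetch while impersonating a recent real-browser fingerprint,
# which is the capability the article attributes to StealthyFetcher.
page = StealthyFetcher.fetch(
    "https://example.com/products",  # hypothetical target URL
    headless=True,                   # assumed option
)

# Clean the messy page source into structured records via CSS selectors
# (selector strings are illustrative and entirely site-specific).
records = [
    {
        "title": item.css_first(".title::text"),
        "price": item.css_first(".price::text"),
    }
    for item in page.css(".product")
]
print(records)
```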
The annual AI list most worth watching is here! Applications open today
量子位· 2026-03-08 04:26
Core Insights
- The article discusses the evolution of generative AI in China, highlighting its transition from a "new technology" to an essential tool for businesses, impacting content production, R&D efficiency, marketing methods, team collaboration, and decision-making processes [1]
- The fourth China AIGC Industry Summit will evaluate generative AI companies and products based on their performance and feedback over the past year, with results to be announced in May 2026 [1][2]

Evaluation Criteria for AIGC Companies
- Companies must be based in China or have their main business operations in China [7]
- The primary business should focus on generative AI or have widely applied AI in its core operations [7]
- Companies should have demonstrated outstanding performance in technology/products and commercialization over the past year [7]

Evaluation Dimensions for AIGC Companies
- **Technical Dimension**: Focus on the company's technical strength, R&D capabilities, and innovation, including technological achievements, R&D investment, and talent reserves [12]
- **Product Dimension**: Emphasizes the innovation, market adaptability, and user experience of core products, including product innovation, user scale, and user experience [12]
- **Market Dimension**: Evaluates the company's market performance and growth opportunities, including business models, market size, revenue situation, and cooperative ecosystem [12]
- **Potential Dimension**: Assesses the strength of the core team and brand potential, including core team capabilities, financing progress, and brand influence [12]

Evaluation Criteria for AIGC Products
- Products must be based on generative AI capabilities [13]
- Products should have mature technology, be market-released, and possess a certain user scale [13]
- Significant technological innovations or functional iterations should have occurred in the past year, promoting the application of AI technology and impacting the industry [13]

Evaluation Dimensions for AIGC Products
- **Product Technical Strength**: Focus on the product's technological advancement, maturity, and efficiency, including technical architecture and outcomes [13]
- **Product Innovation**: Emphasizes the uniqueness and innovation in functionality, experience, and application scenarios [13]
- **Product Performance**: Evaluates user feedback and market performance, including user scale, retention rates, and product influence [13]
- **Product Potential**: Assesses future development and market expansion potential, including product ecosystem and strategic planning [13]

Registration Information
- Registration for the evaluation starts immediately and ends on April 27, with final results to be announced at the May China AIGC Industry Summit [14]
- Companies can register through a provided link or contact Quantum Bit staff for inquiries [14]

About the China AIGC Industry Summit
- The summit will take place in Beijing in May 2026, themed "Everyone, Let's AI Now," focusing on how to effectively utilize AI [17]
- The event aims to engage AI entrepreneurs, developers, and experienced players to clarify and implement AI technologies [17]
A billion-parameter 3D model fits on a phone for the first time! 4-bit quantization: 2.5× faster, 3.7× less memory, 98% accuracy | ICLR'26
量子位· 2026-03-08 04:26
Core Insights
- The article discusses the development of QuantVGGT, a quantization framework designed to effectively compress and accelerate the Visual Geometry Grounded Transformers (VGGT) model, which has over 1 billion parameters, while maintaining high accuracy and performance [2][5][58]

Group 1: Quantization Framework
- QuantVGGT utilizes 4-bit quantization, achieving a speed increase of 2.5 times and a memory reduction of 3.7 times, while preserving 98% of the reconstruction accuracy compared to the full-precision model [2][5][7]
- The framework introduces two main technical contributions: Dual-Smoothed Fine-Grained Quantization (DSFQ) and Noise-Filtered Diverse Sampling (NFDS) [5][9]

Group 2: Challenges in Quantization
- VGGT's unique properties, such as the presence of data-independent special tokens and the inherent complexity of 3D data, pose significant challenges for quantization [11][12]
- The data-independent tokens lead to a heavy-tailed activation distribution, complicating the quantization process and increasing the risk of information loss [11][12]

Group 3: Technical Contributions
- DSFQ combines pre-global Hadamard rotation and post-local channel smoothing to mitigate the heavy-tailed distribution and inter-channel variance issues (a toy sketch of the rotate-then-quantize idea follows after this summary) [5][9][30]
- NFDS employs deep statistical information to filter out noise and create frame-aware diverse calibration clusters, ensuring the stability of the quantization range [5][9][40]

Group 4: Experimental Results
- Extensive experiments demonstrate that QuantVGGT outperforms existing quantization methods across various benchmark datasets and bit widths, achieving optimal performance [5][13][59]
- In camera pose estimation tasks, QuantVGGT maintains 99.9% performance at 8-bit quantization and achieves an AUC@30 of 88.2 at 4-bit quantization, significantly outperforming other methods [47][50]

Group 5: Efficiency and Deployment
- The proposed quantization framework shows minimal additional latency, with only a 0.2% increase in delay while significantly retaining model performance [56][58]
- The results indicate that QuantVGGT is well suited for deployment in resource-constrained environments, demonstrating its practical advantages [5][58]
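The summary describes DSFQ only at a high level. As a concrete illustration of the general rotate-then-quantize idea (an orthogonal Hadamard rotation spreads heavy-tailed outliers across channels before low-bit quantization), here is a hedged PyTorch toy; it is not the authors' released QuantVGGT code, and the group size, shapes, and function names are illustrative assumptions.

```python
import torch

def hadamard(n: int) -> torch.Tensor:
    """Build an n x n normalized Hadamard matrix (n must be a power of two)."""
    H = torch.ones(1, 1)
    while H.shape[0] < n:
        H = torch.cat([torch.cat([H, H], dim=1),
                       torch.cat([H, -H], dim=1)], dim=0)
    return H / (n ** 0.5)  # orthonormal: H @ H.T == I

def fake_quant_4bit(x: torch.Tensor, group_size: int = 128) -> torch.Tensor:
    """Symmetric 4-bit fake quantization over contiguous groups of values."""
    orig_shape = x.shape
    x = x.reshape(-1, group_size)
    scale = (x.abs().amax(dim=1, keepdim=True) / 7.0).clamp(min=1e-8)
    q = torch.clamp(torch.round(x / scale), -8, 7)   # int4 range [-8, 7]
    return (q * scale).reshape(orig_shape)

# Example: a [tokens, channels] activation with one heavy-tailed outlier token,
# mimicking the "data-independent special token" problem described above.
act = torch.randn(16, 256)
act[0] *= 50

H = hadamard(act.shape[-1])
rotated = act @ H                  # rotation spreads the outlier energy across channels
deq = fake_quant_4bit(rotated)     # quantize in the rotated space
recovered = deq @ H.T              # rotate back (H is orthogonal)

err = (recovered - act).pow(2).mean()
print(f"MSE after rotate-quantize-derotate: {err.item():.4f}")
```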
Tsinghua releases graduate destinations: only 8.5% go abroad, and Huawei and ByteDance are the biggest winners
量子位· 2026-03-08 04:26
Group 1
- The proportion of Tsinghua graduates pursuing further studies abroad is 8.5%, which is lower than the average of the past decade [2][24]
- The employment rate in key domestic sectors exceeds 86%, maintaining above 80% for 16 consecutive years, with major employers including Huawei, BYD, ByteDance, Tencent, and others [4]
- The employment rate of Tsinghua graduates outside Beijing is 56.3%, consistently above 50% for 11 years, indicating a trend of graduates not solely remaining in Beijing or coastal developed areas [5][17]

Group 2
- Major employment destinations for Tsinghua graduates include leading digital technology companies such as Huawei, Tencent, and Alibaba, which actively recruit on campus [7][8]
- There is a significant increase in graduates entering advanced manufacturing and energy/defense sectors, with employment numbers in these areas growing by 11% year-on-year [11]
- Tsinghua University plays a crucial role in guiding graduates towards key industries, organizing specialized recruitment events and activities focused on national defense and manufacturing [14][15]

Group 3
- The number of graduates choosing to study abroad is declining, with only 8.5% of the 2025 cohort opting for this path, indicating a higher retention rate among those entering research and industry [24][27]
- Technical positions are in high demand, particularly in AI, with major companies significantly increasing their recruitment for these roles [28][31]
- Tsinghua graduates are increasingly entering the manufacturing, energy, and national defense sectors, reflecting a broader trend of top talent moving to areas of high demand [36][37]
The annual AI list most worth watching is here! Applications open today
量子位· 2026-03-07 02:24
Organizing Committee, from Aofei Temple
QbitAI | WeChat official account QbitAI

China's generative AI is entering the deep waters of industrialization. Over the past two years, AI has gone from a "new technology" to a "new tool," and then from a "new tool" into a reality that companies must confront. It is changing not only content production, but also R&D efficiency, marketing methods, team collaboration, and even decision-making.

On the occasion of the fourth China AIGC Industry Summit, QbitAI will draw on the performance of and feedback on generative AI companies and products over the past year, together with its observations and forecasts on technologies and scenarios for 2026, to select:

- AIGC Companies Worth Watching in 2026
- AIGC Products Worth Watching in 2026

QbitAI will combine in-depth research on the companies with the opinions of dozens of well-known industry experts, and the results will be announced at the China AIGC Industry Summit in May 2026. At that time, QbitAI will also invite millions of industry practitioners to witness the honors awarded to these outstanding companies.

AIGC Companies Worth Watching in 2026 will recognize the AI companies that are the most innovative, the most forward-looking, or hold the greatest potential for large-scale deployment.

[Eligibility]
1. The company entity is in China, or its main business is in China;
2. Its main business is generative AI and related fields, or AI has been widely applied to its main business;
3. The company has performed outstandingly in technology/products and commercialization over the past year.

[Evaluation Dimensions]

AIGC Products Worth Watching in 2026 will recognize the AI products that are the most innovative, most practical, most popular, or have the greatest application potential. ...
The "father of the lobster" takes on his first OpenAI project: open-source contributors, free ChatGPT tokens are up for grabs
量子位· 2026-03-07 02:24
Core Viewpoint
- OpenAI is launching a new initiative called "Codex for Open Source," aimed at supporting open-source project maintainers by providing free ChatGPT tokens and other resources [3][4]

Group 1: Project Overview
- The "Codex for Open Source" project allows open-source contributors to apply for API credits and access to ChatGPT Pro for six months, along with Codex Security services for coding and issue management [3]
- The application criteria are relatively accessible, targeting core maintainers of open-source projects or operators of widely used public projects [4]

Group 2: Engagement and Support
- OpenAI encourages projects that may not meet the standard criteria but play a significant role in the ecosystem to apply and explain their importance [5]
- Peter Steinberger, who is involved in the project, is actively engaging with the community and responding to inquiries about the initiative [6][7]

Group 3: Personal Insights
- Peter Steinberger humorously refers to himself as "OpenAI Troublemaker," indicating a light-hearted approach to his new role [10]
- He acknowledges that joining OpenAI has changed his work dynamics, as he now manages two significant responsibilities instead of one [12]
Free "lobster" installs at Tencent's doorstep draw a queue of hundreds! Every generation has its own free eggs to collect
量子位· 2026-03-07 02:24
Core Viewpoint
- The article discusses the recent event at Tencent where hundreds of people gathered to receive free installations of OpenClaw, referred to as "lobster," highlighting the growing interest in and accessibility of AI technology across demographics [6][19][31]

Group 1: Event Overview
- The event featured dozens of programmers assisting hundreds of attendees in installing OpenClaw, creating a lively atmosphere [2][10]
- Participants ranged from young children to seniors, showcasing a diverse demographic interested in AI technology [5][12]
- Tencent prepared 800 deployment slots and had 30 engineers on-site, indicating a high level of organization and demand [15][16]

Group 2: OpenClaw Description
- OpenClaw is an AI framework that automates various computer tasks, likened to raising a digital employee [22][25]
- The framework has gained significant popularity, with over 100,000 users and a GitHub star count exceeding 270,000 [20][29]
- OpenClaw's distinguishing features include using Docker for security and the ability to directly access local data, setting it apart from traditional agent tools [27][28]

Group 3: Industry Impact
- The surge in interest in OpenClaw has prompted various companies to optimize their services to integrate with the framework, enhancing their API and cloud offerings [32][35]
- Major tech companies like ByteDance, Alibaba, and NetEase are quickly adapting to the trend, developing products that leverage OpenClaw's capabilities [34][36]
- The event not only increased user engagement but also benefited Tencent's cloud services, indicating a strategic move to capitalize on the growing AI trend [36][37]
Video bloggers, don't try to build a channel against the lobster: it runs fully automated 24/7, and carbon-based life simply can't keep up
量子位· 2026-03-06 10:12
Core Viewpoint
- The article discusses the launch of AIVideo Agent, an AI-driven video creation tool that automates the entire video production process, allowing users to create and publish videos effortlessly while they sleep [1][4][39]

Group 1: Features and Functionality
- AIVideo Agent operates 24/7, autonomously completing the video production workflow without requiring technical skills or API keys [2][14]
- Users can input natural language requests, and the tool can add music, transitions, and effects, and send notifications via email or publish to social media platforms [3][6]
- The platform integrates with Google Drive, Notion, Discord, and Gmail, streamlining the video creation and distribution process [5][10]

Group 2: User Experience
- The tool simplifies the traditional video production workflow, which typically involves topic selection, script writing, sourcing materials, editing, voiceover, and publishing [9]
- AIVideo Agent can automatically check tasks, prioritize projects, and generate drafts, significantly enhancing productivity [10][11]
- The interface resembles common video editing software, making it user-friendly for non-technical individuals [28][35]

Group 3: Pricing and Market Potential
- The service is currently in testing and requires a subscription costing $74 per month, which covers approximately 1,100 video clips and 22,000 images [15][17]
- There is potential interest from professional content creators, such as YouTubers and social media influencers, who may find value in automating their video production [17][39]

Group 4: Industry Impact
- The introduction of AIVideo Agent could revolutionize video production, similar to how coding has evolved, with creators taking on more of a director's role while AI handles execution [39]
- The article raises questions about the future of video editors and content creators in light of such automation, indicating a significant shift in the industry landscape [39]
Another key piece of the foundation-model puzzle: Tencent Hunyuan unveils new training paradigm "Wuxiang" (HY-WU), introducing functional memory to break the shackles of static weights
量子位· 2026-03-06 10:12
Core Insights
- The article discusses the HY-WU paradigm proposed by Tencent's Hunyuan team, which addresses the "catastrophic forgetting" problem in large models by allowing real-time generation of personalized parameters without rewriting existing ones [2][5][13]

Group 1: Challenges in Foundation Models
- Foundation models face two main challenges: "catastrophic forgetting" when learning new tasks and the need for personalization [5][6]
- Traditional fine-tuning methods often overwrite existing knowledge, leading to conflicts and the loss of previously learned capabilities [5][6]
- The "impossible triangle" of the parameter space limits the ability to meet diverse user needs without compromising performance [6][7]

Group 2: Limitations of Existing Solutions
- Current solutions like PEFT (e.g., LoRA) and context memory still operate within static parameter frameworks, leading to conflicts and overfitting [9][10]
- MoE (Mixture-of-Experts) models improve fitting for diverse distributions but do not fundamentally resolve catastrophic forgetting [10]

Group 3: HY-WU Paradigm
- The HY-WU paradigm introduces a functional memory framework that allows dynamic routing of parameters based on input conditions, avoiding the pitfalls of static parameter memory [13][16]
- It utilizes a parameter generator that synthesizes specific operators in real time, enhancing adaptability without compromising the foundational model's capabilities (a toy sketch of this idea follows after this summary) [16][27]

Group 4: Practical Applications and Performance
- HY-WU has been tested in text-guided image editing, demonstrating superior performance in content understanding and instruction adherence compared to traditional methods [3][28]
- The model has shown excellent results in applications including social media, gaming, and advertising, outperforming other models on personalization tasks [30][39]

Group 5: Evaluation and Benchmarking
- The HY-WU model was evaluated against leading models in comprehensive tests covering over 60 editing tasks, achieving high scores in both human evaluations and automated benchmarks [41][45]
- In GEdit-Bench, HY-WU ranked first among open-source models in semantic consistency and overall quality, showcasing its competitive edge [45]

Group 6: Future Directions
- The article outlines future explorations for the HY-WU framework, including online continual-learning protocols and cross-modal universality, aiming to enhance the model's adaptability and efficiency [54][56]
- The potential for "memory separation" and "functional modularization" in AI architectures is emphasized as a key area for further research [52]
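To make the "generate parameters on the fly instead of rewriting static weights" idea concrete, here is a hedged hypernetwork-style toy in PyTorch: a frozen base layer is never overwritten, while a small generator synthesizes a low-rank weight delta from a per-request condition embedding. This illustrates the general mechanism the summary describes, not Tencent Hunyuan's HY-WU implementation; the module layout, low-rank parameterization, and all names are assumptions.

```python
import torch
import torch.nn as nn

class ConditionalLinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, d_cond: int, rank: int = 8):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)       # static, pretrained weights
        self.base.weight.requires_grad_(False)   # never rewritten -> no forgetting
        self.base.bias.requires_grad_(False)
        # Small generator that synthesizes a low-rank weight delta per condition.
        self.gen_a = nn.Linear(d_cond, d_in * rank)
        self.gen_b = nn.Linear(d_cond, rank * d_out)
        self.rank = rank

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: [batch, d_in]; cond: [batch, d_cond] (e.g. a user/task embedding).
        a = self.gen_a(cond).view(-1, x.shape[-1], self.rank)            # [B, d_in, rank]
        b = self.gen_b(cond).view(-1, self.rank, self.base.out_features) # [B, rank, d_out]
        delta = torch.bmm(x.unsqueeze(1), a)     # [B, 1, rank]
        delta = torch.bmm(delta, b).squeeze(1)   # [B, d_out]
        return self.base(x) + delta              # frozen base knowledge + generated operator

layer = ConditionalLinear(d_in=64, d_out=64, d_cond=16)
x, cond = torch.randn(4, 64), torch.randn(4, 16)
print(layer(x, cond).shape)  # torch.Size([4, 64])
```

Only the small generator would be trained here, so different conditions yield different effective weights at inference time without touching the base parameters.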
A clean break from VE and VAE! SenseTime radically rebuilds multimodal models by cutting out all intermediate encoders
量子位· 2026-03-06 06:33
Core Viewpoint
- The development paradigm of multimodal large models is being fundamentally restructured, with the introduction of NEO-unify by SenseTime and Nanyang Technological University marking a significant breakthrough toward a truly "native, unified, end-to-end" multimodal model architecture [1][2][3]

Group 1: Technical Breakthroughs
- NEO-unify eliminates the reliance on traditional visual encoders (VE) and variational autoencoders (VAE), moving away from component-based approaches to a model that directly processes near-lossless pixel and text inputs [3][10]
- The innovative Mixture-of-Transformer (MoT) architecture enables seamless integration of visual and language understanding and generation capabilities within the same framework (see the hedged sketch after this summary) [4][13]
- This architecture signifies a shift from "modal connection" to "native unified intelligence," laying the groundwork for future integrated cross-modal cognitive and generative systems [5][6]

Group 2: Current Challenges in Multimodal Intelligence
- Existing multimodal architectures have created a divide between perception and generation, which has been a persistent challenge in the field [7]
- Recent attempts to create "shared encoders" have often led to new structural design trade-offs, highlighting the need for a more cohesive approach [8][9]

Group 3: Model Performance and Efficiency
- NEO-unify demonstrates superior performance metrics compared to other models, achieving notable results across benchmarks, such as a score of 86.71 on the WISE benchmark and 0.914 on LongText-en [19]
- The model's design allows for high data and computational efficiency, achieving better performance with fewer training tokens than models like Bagel [49]

Group 4: Future Implications
- The introduction of NEO-unify represents not just an innovation in model architecture but also a clear pathway toward the next generation of intelligent systems, where multimodal AI evolves from "component stacking" to "essential unity" [51][54]
- Research and development efforts are in a critical phase of scaling and iteration, with upcoming model results and open-source contributions expected to be released soon [55]
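As a rough illustration of what a Mixture-of-Transformer style block can look like (one shared self-attention over the joint token sequence, with modality-specific feed-forward experts), here is a hedged PyTorch toy. It reflects one common reading of the MoT idea, not SenseTime's NEO-unify implementation; all dimensions, names, and the two-modality routing scheme are assumptions.

```python
import torch
import torch.nn as nn

class MoTBlock(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        # Separate expert FFNs per modality (0 = text tokens, 1 = pixel tokens).
        self.ffn = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(2)
        ])

    def forward(self, x: torch.Tensor, modality: torch.Tensor) -> torch.Tensor:
        # x: [batch, seq, dim]; modality: [seq] with 0 for text, 1 for pixel tokens.
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h)     # shared attention over the joint sequence
        x = x + attn_out
        h = self.norm2(x)
        out = torch.zeros_like(x)
        for m, ffn in enumerate(self.ffn):   # modality-specific feed-forward paths
            mask = modality == m
            out[:, mask] = ffn(h[:, mask])
        return x + out

block = MoTBlock()
tokens = torch.randn(2, 10, 256)
modality = torch.tensor([0] * 4 + [1] * 6)   # 4 text tokens followed by 6 pixel tokens
print(block(tokens, modality).shape)         # torch.Size([2, 10, 256])
```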