量子位
ChatGPT's 100 Billion Tokens Take Out 5,000 McKinsey Consultants
量子位· 2025-10-21 03:38
Core Insights
- McKinsey has received an award from OpenAI for being a major client in token consumption, raising questions about the traditional consulting model as it relies on AI-generated content [1][3][4]
- The consulting industry is undergoing a significant transformation as firms like McKinsey and BCG embrace AI technologies to enhance operational efficiency and redefine their service offerings [5][19]
AI Integration in Consulting Firms
- McKinsey has been proactive in AI adoption, having acquired QuantumBlack in 2015, which has since evolved into its AI-native consulting division [7][10][13]
- The launch of McKinsey's internal AI, Lilli, has allowed consultants to automate PPT generation and streamline research processes, with over 70% of employees using it [14][18]
- BCG has developed multiple internal AI tools, with nearly 90% of its employees utilizing AI in their daily work, indicating a competitive push in AI integration [21][25]
Workforce Changes and Challenges
- McKinsey has laid off over 5,000 employees, approximately 10% of its workforce, attributed to overexpansion during the pandemic and the impact of AI on job roles [27][28][30]
- The rise of AI has led to increased productivity, with AI handling about 30% of information gathering tasks, raising concerns about the future of entry-level positions [32][33][56]
- The consulting industry is witnessing a decline in entry-level hiring, with a 54% drop in recruitment for junior consultants, as firms prioritize experienced hires [60][63]
Emergence of AI-Driven Startups
- New AI-driven companies are emerging, offering alternatives to traditional consulting services, targeting small to medium-sized enterprises that cannot afford established firms like McKinsey [49][52]
- These startups are leveraging AI to automate consulting processes, posing a competitive threat to traditional firms by providing cost-effective and immediate solutions [41][53]
The Future of Consulting
- The consulting industry is undergoing a fundamental transformation, with AI replacing traditional roles and altering the career trajectory for new consultants [55][72]
- Despite the challenges posed by AI, there remains a belief that human consultants will still be needed for complex problem-solving and insights that AI cannot replicate [69][70]
I Used AI to Make an MV for the Viral Hit 《八方来财》, and It's Seriously Addictive!
量子位· 2025-10-21 03:38
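Jin Lei, reporting from Aofeisi
量子位 | WeChat official account QbitAI
Making an MV for a song is now something a single AI can handle.
Here is the Eastern cyberpunk MV made with AI for the viral hit 《八方来财》:
The full minute of footage really was generated with a single AI.
This AI does not come from what we usually think of as a big tech company. It comes from a central state-owned enterprise: TeleStudio, the AI creation platform that China Telecom (中国电信) has opened to the public.
On the video side, TeleStudio supports resolutions up to 2K and up to 20 seconds per generation, and it can render complex motion in a single take.
Most importantly, TeleStudio is currently free for a limited time, so anyone can be an AI video director.
(PS: it currently works only on PC; the mobile version is rolling out.)
So how do you actually get the most out of TeleStudio?
We will use the video above as the example and walk you through it in depth.
A hands-on guide to making an MV with AI
TeleStudio offers three main ways to create: image generation, video generation, and sound-effect generation.
To produce a minute-long short-drama episode, we need to chain these three features together.
By default TeleStudio generates 4 images at a time, and we can choose the aspect ratio (five options) and the resolution (three options ...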
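DeepSeek's New Model Is Getting Rave Reviews in Silicon Valley! It Compresses 1D Text with 2D Vision and Runs on a Single GPU: "Google's Core Secret Has Been Open-Sourced"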
量子位· 2025-10-20 23:34
Core Insights
- DeepSeek has released a groundbreaking open-source model named DeepSeek-OCR, which is gaining significant attention in Silicon Valley for its innovative approach to processing long texts with high efficiency [1][3][7].
Model Overview
- The DeepSeek-OCR model addresses the computational challenges associated with large models handling long texts by utilizing a method that compresses textual information into visual tokens, thereby reducing the number of tokens needed for processing [5][12][13].
- The model achieves high accuracy rates, with a decoding accuracy of 97% when the compression ratio is less than 10 times and around 60% even at a 20 times compression ratio [6].
Performance Metrics
- DeepSeek-OCR has demonstrated superior performance on the OmniDocBench benchmark, achieving state-of-the-art (SOTA) results with significantly fewer visual tokens compared to existing models [14][15].
- For instance, using only 100 visual tokens, DeepSeek-OCR outperforms the GOT-OCR2.0 model, which uses 256 tokens, and matches the performance of other models while using far fewer tokens [17].
Technical Components
- The architecture of DeepSeek-OCR consists of two main components: the DeepEncoder, which converts high-resolution images into highly compressed visual tokens, and the DeepSeek3B-MoE-A570M decoder, which reconstructs text from these tokens [20][22].
- The model supports various input modes, allowing it to adapt its compression strength based on the specific task requirements [24].
Innovative Concepts
- The research introduces the concept of "Contextual Optical Compression," which simulates human memory mechanisms by dynamically allocating computational resources based on the temporal context of the information being processed [36][38].
- This approach aims to enhance the model's ability to handle long conversations or documents, potentially leading to a more human-like memory structure in AI systems [39][41].
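
The compression ratio quoted above can be read as the number of text tokens divided by the number of visual tokens that stand in for them. The sketch below only illustrates that arithmetic. The helper names (`compression_ratio`, `reported_accuracy_note`) and the token counts are hypothetical, and the accuracy notes simply restate the two figures from the summary rather than modeling how DeepSeek-OCR behaves.

```python
def compression_ratio(text_tokens: int, vision_tokens: int) -> float:
    """Text tokens represented per visual token (illustrative definition)."""
    if vision_tokens <= 0:
        raise ValueError("vision_tokens must be positive")
    return text_tokens / vision_tokens


def reported_accuracy_note(ratio: float) -> str:
    """Restate the two accuracy figures from the summary for a given ratio."""
    if ratio < 10:
        return "under 10x: about 97% decoding accuracy reported"
    if ratio <= 20:
        return "10x to 20x: accuracy falls toward the roughly 60% reported at 20x"
    return "above 20x: outside the range quoted in the summary"


# Hypothetical page: 900 text tokens rendered into 100 visual tokens.
ratio = compression_ratio(text_tokens=900, vision_tokens=100)
note = reported_accuracy_note(ratio)
print(f"{ratio:.1f}x -> {note}")
```

With these assumed counts the page sits just inside the under-10x band, which is where the summary reports the highest decoding accuracy.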
Musk Wants Grok to Take Over X Entirely and Scrap the Human-Written Rule-Based Recommendation Algorithm
量子位· 2025-10-20 23:34
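henry, reporting from Aofeisi
量子位 | WeChat official account QbitAI
The AI assistant Grok is about to take over X entirely!
Musk announced that X (Twitter) will completely remove its heuristic recommendation algorithm within the next few weeks and hand the job to Grok, which will read and watch all content to match it to users' interests fully automatically.
Musk then shared the latest on X's feed algorithm in a follow-up repost: the heuristic algorithm is being upgraded across the board, with Grok taking over AI-driven recommendations.
If the plan goes through, X will become the first large social platform to abandon heuristic algorithms entirely.
This update promotes Grok from a merciless summarizing robot and X's in-house Wikipedia to the general manager of X.
Grok is already everywhere on X, there will be even more Grok in the future, and this update is Grok yet again.
X's feed mechanism update
It all started when DogeDesigner, an X influencer with a million followers (some say the account is Musk's alt), shared how X's recommendations work.
He said X's recommendation algorithm does not boost or throttle posts arbitrarily. Instead, like a human, it judges from a post's content whether it will appeal to other people:
Users can also make requests to Grok to adjust content recommendations dynamically and make them more personalized.
If you post only a link with no text, the algorithm has almost nothing to judge, so it will not show the post to many people.
But if you add a compelling title, images, or more background, more people will see the post.
The goal of this update is clear ...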
AI Is Rewriting Map Apps! This Time It's Google's Turn
量子位· 2025-10-20 11:45
Core Insights
- Google has added a Google Maps tool to the Gemini API, allowing developers to integrate Google Maps data into their applications for enhanced location awareness [1][5]
- The Gemini API connects to a vast geographical database of 250 million locations, enabling real-time responses for various applications such as restaurant recommendations and travel planning [2][3]
- The API charges based on query volume, with a current rate of $25 per 1,000 fact-based prompts [5]
Group 1: Functionality and Use Cases
- Developers can utilize the Gemini API for applications related to food delivery, travel, and real estate, providing accurate geographic information and interactive travel planning tools [25][41]
- The integration allows for personalized and visual experiences, as demonstrated by a Google AI Studio leader who used voice commands to find restaurant recommendations [8][10]
- Users can inquire about real-time data such as restaurant hours and traffic conditions, leveraging Google Maps' extensive real-time data [15][17]
Group 2: Industry Context and Comparisons
- The introduction of AI in mapping applications is not new in the industry, with domestic players like Gaode already implementing similar technologies focused on spatial intelligence [30][33]
- Gaode's AI capabilities allow for real-time responses to complex travel and lifestyle needs, showcasing the evolution of maps from mere navigation tools to intelligent spatial agents [41][44]
- Both Google and Gaode are transforming maps into dynamic, intelligent spaces, enhancing user experience and interaction with geographic data [44][45]
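
The per-query pricing above lends itself to a quick back-of-the-envelope estimate. The sketch below assumes strictly linear billing at the quoted $25 per 1,000 prompts, with no free tier, volume discount, or separate model-token charges. Those assumptions, the function name `estimated_cost_usd`, and the traffic figure are illustrative only and are not taken from Google's price list.

```python
PRICE_PER_1000_PROMPTS_USD = 25.0  # rate quoted in the summary above


def estimated_cost_usd(num_prompts: int) -> float:
    """Estimate spend under assumed linear per-prompt billing."""
    if num_prompts < 0:
        raise ValueError("num_prompts must be non-negative")
    return num_prompts / 1000 * PRICE_PER_1000_PROMPTS_USD


# Hypothetical app sending 40,000 location-grounded prompts per month.
monthly_cost = estimated_cost_usd(40_000)
print(f"${monthly_cost:,.2f} per month")  # $1,000.00 per month
```

At this rate each prompt costs 2.5 cents, so caching repeated location lookups could matter for high-traffic apps.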
Snap a Photo to Check Your Balding Level? I Tried Ant Group's AI Healthcare App
量子位· 2025-10-20 11:45
Core Viewpoint
- Ant Group has entered the AI healthcare sector with its product AQ, which integrates various healthcare services into a seamless experience, addressing the demand for medical consultations and related services [1][2].
Group 1: Product Features
- AQ utilizes AI capabilities to create a closed-loop system for healthcare, including medical insurance, payment, and local delivery services [2][11].
- The product offers a user-friendly consultation process that mimics traditional hospital visits, providing preliminary assessments and diagnostic suggestions based on user input and image analysis [13][10].
- AQ can analyze skin conditions, heart rate abnormalities, and even traditional Chinese medicine diagnostics, showcasing its versatility [6][30][25].
Group 2: User Experience
- Users report that the diagnostic results from AQ are generally accurate, often aligning with conclusions from top-tier hospitals [17][10].
- The system includes a knowledge base called AQ Intelligence, which breaks down diagnostic keywords into categories like causes, symptoms, and treatment options, enhancing user understanding [18][20].
- While the product has many strengths, some functionalities are similar to existing AI agents, raising questions about its uniqueness [11][12].
Group 3: Limitations and Concerns
- Certain diagnostic results appear overly generalized, lacking personalization, which may affect user trust [22][24].
- The AI struggles with complex imaging, such as CT scans, indicating limitations in its diagnostic capabilities [36][12].
- Privacy concerns have been raised regarding the integration of personal health data within the platform [43][44].
Group 4: Overall Assessment
- The integration of various healthcare functions into a single app enhances user convenience, allowing for easy appointment scheduling, medication purchases, and insurance inquiries [41][26].
- The overall user experience is reported to be smooth, with a well-structured process from diagnosis to treatment [42][40].
- Users are advised to utilize AQ for minor health issues and routine inquiries, while still recommending professional medical consultations for serious conditions [44][46].
Registration for the AI Annual List Is Now Open! Five Awards, Seeking the Pioneers of the AI+ Era
量子位· 2025-10-20 10:29
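Let us witness the stars of the year together and light the way forward.
Organizing Committee, reporting from Aofeisi
量子位 | WeChat official account QbitAI
To let more practitioners feel the leap of the intelligence wave, and to give more peers and fellow travelers applause and encouragement, we are officially opening registration for the 2025 AI Annual List (「2025人工智能年度榜单」).
This is the 8th year of the 量子位 AI Annual List. Over these eight years we have witnessed technical breakthroughs and real-world deployment, the convergence and reshaping of industries, and wave after wave of companies, people, and products that have pushed the era forward.
In an era in which AI is redefining everything, intelligent technology is no longer a single tool but a driving force behind the joint evolution of industry and society. Through this annual selection we hope to discover and honor the explorers and practitioners who are truly leading change and pushing boundaries.
The selection covers three dimensions (companies, products, and people) with five awards in total. Companies are welcome to apply!
Company list
- 2025 AI Leading Company of the Year
- 2025 AI Promising Startup of the Year
Product list
- 2025 AI Outstanding Product of the Year
- 2025 AI Outstanding Solution of the Year
People list
- 2025 AI Person in Focus of the Year
Detailed selection criteria and registration instructions follow.
2025 AI Leading Company of the Year
This award covers China's AI field and selects the companies with the strongest overall capabilities.
Eligibility:
Selection criteria:
2025 AI Promising Startup of the Year
Focuses on China's ...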
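Vidu Q2 Arrives with a Trump Card! Its Killer "Reference Generation" (「参考生」) Feature Launches Globally, and the App Experience Gets a Complete Overhaul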
量子位· 2025-10-20 10:29
Core Viewpoint
- The article highlights the rapid advancements in the AI video generation field, particularly focusing on the new features and upgrades of the Vidu platform, which aims to enhance user experience and creativity in content creation.
Group 1: New Features of Vidu
- The long-awaited Vidu Q2 reference generation feature is officially launched, allowing for high consistency, faster processing, and more affordable pricing without the need for an invitation code [2][13].
- Vidu's video extension feature allows users to extend videos up to five minutes, with free users able to generate videos up to 30 seconds [20].
- The Vidu app has undergone a comprehensive redesign, transforming from an AI creation platform to a one-stop AI content social platform, enabling users to easily create and share videos [4][12].
Group 2: User Experience Enhancements
- Users can create engaging duet videos by simply tagging a subject and providing a brief prompt, significantly lowering the creative barrier [7].
- The app includes a vast library of subjects, including characters and effects, allowing users to generate fun videos anytime and anywhere [8].
- The platform now supports browsing various AI-generated video content, enhancing the social aspect of video sharing [9].
Group 3: Performance Improvements
- Vidu Q2 shows a threefold increase in generation speed compared to the previous version, allowing creators to transform ideas into videos more efficiently [40].
- The platform maintains high video quality, ensuring that even demanding scenarios like animation and advertising are well-handled [25].
- The combination of high consistency, video extension capabilities, and 1080P resolution meets the needs of content creators and companies for quality AI video generation [24].
Group 4: Commercial Applications
- The advancements in Vidu's technology significantly lower the production costs and barriers for marketing videos, making it accessible for small and medium-sized businesses [47].
- A typical application scenario in the e-commerce sector allows merchants to create dynamic product showcase videos quickly by providing static images and simple prompts [43][46].
- The democratization of technology is expected to unleash creativity among users, enabling anyone to generate high-quality videos with minimal effort [47].
LLM Memory Management No Longer Needs Hand-Holding: A New Framework Lets Agents Manage Their Memory Systems Autonomously
量子位· 2025-10-20 10:29
Core Insights
- The article introduces Mem-α, an innovative reinforcement learning framework designed to enable large language models (LLMs) to autonomously manage complex memory systems, moving away from reliance on manual design and predefined instructions [2][4][14].
Memory Management Challenges
- Traditional memory-enhanced agents often depend on predefined instructions and tools for memory updates, which can lead to suboptimal memory construction and information loss, particularly in long-term interactions [7][9][8].
- LLMs face limitations due to finite context windows, making external memory systems crucial for understanding long-term information [5][6].
Mem-α Framework
- Mem-α transforms the memory construction problem into a sequential decision-making problem that can be optimized through reinforcement learning, allowing agents to explore optimal memory management strategies during information processing [14][16].
- The framework incorporates a complex memory system inspired by cognitive science, consisting of core memory, episodic memory, and semantic memory, each supporting various memory operations [22][20].
Training and Evaluation
- Mem-α utilizes a multi-dimensional reward function to optimize memory construction, focusing on accurate retrieval, test-time learning, long-range understanding, and conflict resolution [18][28].
- Experimental results demonstrate that Mem-α significantly outperforms existing methods, achieving higher accuracy and efficient memory usage while maintaining performance [35][36].
Key Findings
- Mem-α shows superior performance across all tasks, particularly in accurate retrieval and long-range understanding, indicating strong generalization capabilities [35].
- The framework reduces memory usage by approximately 50% compared to traditional methods while enhancing performance, validating the effectiveness of semantic compression mechanisms [35].
- The structured architecture of Mem-α proves essential for processing complex information, highlighting the limitations of flat memory representations [35].
- Mem-α exhibits robust generalization to document lengths exceeding 400K tokens, despite being trained on documents averaging less than 30K tokens [35].
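
The three-part memory described above (core, episodic, semantic) can be pictured as a small data structure that an agent edits one operation at a time. The sketch below is a toy illustration, not Mem-α's implementation. The class name `MemoryStore` and the operation names `insert`, `update`, and `delete` are hypothetical, since the summary only says that each memory type supports "various memory operations". In the framing above, a learned policy would choose these operations, whereas here they are written out by hand.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MemoryStore:
    """Toy three-part memory; field roles are illustrative assumptions."""
    core: str = ""                                      # one short, always-visible summary
    episodic: List[str] = field(default_factory=list)   # time-ordered events
    semantic: List[str] = field(default_factory=list)   # standalone facts

    def apply(self, op: str, memory_type: str, content: str = "",
              index: Optional[int] = None) -> None:
        """Apply one hypothetical memory operation chosen by the agent."""
        if memory_type == "core":
            if op != "update":
                raise ValueError("core memory only supports 'update' in this sketch")
            self.core = content
            return
        if memory_type not in ("episodic", "semantic"):
            raise ValueError(f"unknown memory type: {memory_type}")
        entries = getattr(self, memory_type)
        if op == "insert":
            entries.append(content)
            return
        if op not in ("update", "delete"):
            raise ValueError(f"unknown operation: {op}")
        if index is None:
            raise ValueError(f"'{op}' needs an index")
        if op == "update":
            entries[index] = content
        else:
            del entries[index]


# Hand-written operation sequence standing in for a learned policy.
memory = MemoryStore()
memory.apply("update", "core", "User is planning a trip to Kyoto.")
memory.apply("insert", "episodic", "Turn 3: user booked a flight for May 2.")
memory.apply("insert", "semantic", "User prefers window seats.")
# A later turn contradicts the earlier event, so the entry is overwritten.
memory.apply("update", "episodic", "Turn 3: user rebooked the flight for May 4.", index=0)
```

Overwriting a stale entry rather than appending a second one is one simple way to picture the conflict resolution and compact memory usage that the summary attributes to the framework.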
Unitree Unveils Its Latest Robot: 1.8 Meters Tall, Can Dance and Do Kung Fu, but Its Looks Are Hard to Sum Up
量子位· 2025-10-20 10:29
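Heng Yu, reporting from Aofeisi
量子位 | WeChat official account QbitAI
Unitree's fourth humanoid robot, the Unitree H2, has twirled onto the scene!
The new model stands 180 cm tall and weighs 70 kg, a full 23 kg heavier than the H1 robot of the same height.
Broad shoulders and a narrow waist mean the battery and control boards all have to fit inside its small chest.
Compared with its predecessor, the biggest change in the Unitree H2 is the addition of a bionic human face.
From its looks to its height and weight, the H2's overall form is closer to a real person.
Netizens, however, do not seem sold on the aesthetics of that face...
It does bear a strong resemblance to the NS-5 robot from the 2004 Will Smith sci-fi film I, Robot (《我,机器人》, also translated as 《机械公敌》).
But across the comment sections of every major platform, people find it a bit creepy.
Perhaps the colored contact lenses are too large in diameter and have triggered everyone's uncanny valley response.
@0xnullcline: fire whoever designed the face please
As of publication, Unitree's official website has not yet posted detailed information on the H2.
There is only a short piece of copy that accompanied the promo video on the official account:
(The H2 is positioned as) a bionic humanoid robot, born to serve everyone safely and in a friendly way.
| Model | H1 | G1 ...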