量子位
OpenAI releases new model GPT-5.1: no benchmark scores, no leaderboard chasing, just talking like a human
量子位· 2025-11-13 00:49
Core Insights - The article discusses the recent upgrade of ChatGPT to version GPT-5.1, which emphasizes improved intelligence and conversational abilities [1][2]. Model Features - GPT-5.1 includes two sub-models: GPT-5.1 Instant for everyday conversations and quick responses, and GPT-5.1 Thinking for complex reasoning and in-depth problem-solving [2][19]. - The new model allows users to customize the tone and style of interactions, making it more personable and engaging [4][27]. Performance Improvements - Early tests indicate that GPT-5.1 Instant provides more enjoyable and light-hearted responses while maintaining practicality [5][30]. - The model's adherence to instructions has improved significantly, showcasing a better ability to follow specific guidelines [12][15]. Adaptive Reasoning - GPT-5.1 Instant employs adaptive reasoning technology, allowing it to decide when to think critically before responding, thus enhancing the quality of answers [17][18]. - GPT-5.1 Thinking is designed to be twice as fast as its predecessor in typical tasks, while also providing clearer explanations for specialized topics [20][24]. User Customization - Users can select from eight predefined personality traits for the AI, including professional, friendly, and sarcastic, among others [27]. - OpenAI is testing a feature that allows the model to proactively ask users about their preferred tone or style during conversations [28]. User Experience - Initial user experiences suggest that the more personalized GPT-5.1 offers a unique and entertaining interaction, with examples of humorous exchanges [30][32].
Xiaohongshu proposes DeepEyesV2: from "thinking with images" to "tool collaboration", exploring a new dimension of multimodal intelligence
量子位· 2025-11-13 00:49
Core Insights - DeepEyesV2 is a significant upgrade from its predecessor, DeepEyes, enhancing its capabilities from merely recognizing details to actively solving complex problems through multi-tool collaboration [3][12]. Multi-Tool Collaboration - Traditional multimodal models are limited in their ability to actively utilize external tools, often functioning as passive information interpreters [4]. - DeepEyesV2 addresses two main pain points: weak tool invocation capabilities and lack of collaborative abilities among different functions [5][8]. - The model can now perform complex tasks by integrating image search, text search, and code execution in a cohesive manner [12][18]. Problem-Solving Process - DeepEyesV2's problem-solving process involves three steps: image search for additional information, text search for stock price data, and code execution to retrieve and calculate financial data [15][16][17]. - The model demonstrates advanced reasoning capabilities, allowing it to tackle intricate queries effectively [14]. Model Features - DeepEyesV2 incorporates programmatic code execution and web retrieval as external tools, enabling dynamic interaction during reasoning [22]. - The model generates executable Python code or web search queries as needed, enhancing its analytical capabilities [23][27]. - This integration results in improved flexibility in tool invocation and a more robust multimodal reasoning framework [28]. Training and Development - The development of DeepEyesV2 involved a two-phase training strategy: a cold start to establish foundational tool usage and reinforcement learning for optimization [37][38]. - The team created a new benchmark, RealX-Bench, to evaluate the model's performance in real-world scenarios requiring multi-capability integration [40][41]. Performance Evaluation - DeepEyesV2 outperforms existing models in accuracy, particularly in tasks requiring the integration of multiple capabilities [45]. 
- The model's performance metrics indicate a significant improvement over open-source models, especially in complex problem-solving scenarios [46]. Tool Usage Analysis - The model exhibits a preference for specific tools based on task requirements, demonstrating adaptive reasoning capabilities [62]. - After reinforcement learning, the model shows a reduction in unnecessary tool calls, indicating improved efficiency in reasoning [67][72]. Conclusion - The advancements in DeepEyesV2 highlight the importance of integrating tool invocation with reasoning processes, showcasing its superior problem-solving abilities in various domains [73][75].
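The three-step process described above (image search, then text search, then code execution) can be pictured as a tool-calling loop. What follows is a minimal Python sketch under stated assumptions, not DeepEyesV2's actual implementation: the `Action` type, the `solve` loop, the scripted policy that stands in for the model, the stub tools, and the company name and prices are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Action:
    tool: str     # "image_search", "text_search", "run_code", or "answer"
    payload: str  # query text, Python source, or the final answer


@dataclass
class Trace:
    calls: List[str] = field(default_factory=list)         # tools invoked so far
    observations: List[str] = field(default_factory=list)  # what each tool returned


def run_code(source: str) -> str:
    """Run a snippet and return whatever it binds to `result`.

    Emptying __builtins__ only keeps this toy example small;
    it is NOT a real sandbox.
    """
    scope: dict = {}
    exec(source, {"__builtins__": {}}, scope)
    return str(scope.get("result"))


# Hypothetical stub tools with canned results.
TOOLS: Dict[str, Callable[[str], str]] = {
    "image_search": lambda query: "ExampleCo",  # who is in the picture
    "text_search": lambda query: "100 104",     # open and close price
    "run_code": run_code,
}


def scripted_policy(trace: Trace) -> Action:
    """Stand-in for the model: pick the next action from the trace so far."""
    step = len(trace.calls)
    if step == 0:
        return Action("image_search", "logo.png")
    if step == 1:
        return Action("text_search", f"{trace.observations[0]} open and close price")
    if step == 2:
        open_px, close_px = trace.observations[1].split()
        return Action("run_code", f"result = ({close_px} - {open_px}) * 100 / {open_px}")
    return Action("answer", trace.observations[-1])


def solve(policy: Callable[[Trace], Action],
          tools: Dict[str, Callable[[str], str]],
          max_steps: int = 8) -> Tuple[str, Trace]:
    """Alternate between policy decisions and tool executions until an answer."""
    trace = Trace()
    for _ in range(max_steps):
        action = policy(trace)
        if action.tool == "answer":
            return action.payload, trace
        trace.calls.append(action.tool)
        trace.observations.append(tools[action.tool](action.payload))
    raise RuntimeError("no answer within the step budget")


answer, trace = solve(scripted_policy, TOOLS)
print(answer, trace.calls)
```

Each tool result is appended to the trace before the next decision, so later steps can build on earlier observations. That is the property the summary calls tool collaboration.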
Zhihui Jun's latest "188" robot: burn after reading
量子位· 2025-11-13 00:49
Core Viewpoint - The article discusses the rapid rise of the company "Shangwei New Materials" in the context of its acquisition by "Zhiyuan Robotics," highlighting a significant stock price increase and the implications of entering the embodied intelligence robotics sector [3][26][45]. Group 1: Company Overview - Shangwei New Materials, established in 2020 and listed on the STAR Market, specializes in environmentally friendly high-performance corrosion-resistant materials and new composite materials [33]. - Zhiyuan Robotics, founded in February 2023, is led by former Huawei executive Deng Taihua and focuses on various commercial applications of robotics [31]. Group 2: Acquisition Details - Zhiyuan Robotics completed its acquisition of Shangwei New Materials through a combination of agreement transfer and tender offer, marking a significant shift in control [34][39]. - The acquisition process began with a public announcement on July 8, leading to a stock price surge of 1083.42% from July 9 to July 30, making it one of the first tenfold stocks in the A-share market for 2025 [35]. Group 3: Market Reaction - Following the announcement of new products by Zhiyuan Robotics, Shangwei New Materials' stock experienced a strong surge, reaching a limit-up on November 11, driven by market excitement despite the lack of substantial product demonstrations [12][20]. - The article notes that the stock price rose from 7 yuan in July to 130 yuan by November 11, reflecting the market's speculative interest in the embodied intelligence sector [25]. Group 4: Business Implications - Despite the stock price increase, the robotics business is still in the development stage and has not yet generated revenue or profit, with limited expected impact on financial performance until 2025 [27][44]. - Shangwei New Materials maintains its primary focus on its original materials business, emphasizing that the robotics venture is independent and still under development [43][42].
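The percentage gain and the price multiple quoted above are easy to confuse, so a short Python sketch of the conversion may help. It uses only the figures from the summary (a 1083.42% gain, and a move from about 7 yuan to about 130 yuan); the function names are illustrative, and the two prices are the article's rounded values.

```python
def gain_to_multiple(pct_gain: float) -> float:
    """Convert a percentage gain into a price multiple (a 100% gain is 2x)."""
    return 1 + pct_gain / 100


def prices_to_gain(start: float, end: float) -> float:
    """Percentage gain implied by a start and end price."""
    return (end / start - 1) * 100


july_multiple = gain_to_multiple(1083.42)  # July 9 to July 30 run
full_run_gain = prices_to_gain(7, 130)     # July to November 11, rounded prices

print(f"{july_multiple:.2f}x")  # 11.83x
print(f"{full_run_gain:.0f}%")  # 1757%
```

A gain of 1083.42% therefore corresponds to a price of roughly 11.8 times the starting level, which is why the stock qualifies as a "tenfold" stock.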
More important than the 0.99-yuan bargain is the fun of haggling with AI
量子位· 2025-11-12 12:07
Core Viewpoint - The article discusses the creative and humorous ways users are leveraging AI, specifically the Kimi Agent, to negotiate lower subscription prices during the Double Eleven shopping festival, highlighting the effectiveness of emotional appeals and playful tactics in bargaining [1][18]. Group 1: User Strategies for Bargaining - Users are employing various tactics to negotiate the Kimi Agent subscription price down to 0.99 yuan, including emotional appeals and playful language [1][2]. - Strategies include pretending to be in dire situations, using poetic language, and even adopting historical personas to persuade the AI [4][8][11]. - The article emphasizes that patience and creativity are key to negotiating successfully with the AI [15][16]. Group 2: Promotional Details - The promotional offer allows new users to subscribe for 0.99 yuan for the first month, with the regular price resuming in the following month [18]. - Existing users can extend their subscription by sharing the discount link with new users, creating a community-driven promotional strategy [18][20]. - The promotion runs from Double Eleven until the 25th of the month, making it a limited-time offer [18]. Group 3: User Experience and Feedback - Users express a sense of achievement and enjoyment from interacting with the Kimi Agent, noting its advanced capabilities and engaging personality [22][23]. - Feedback from users suggests that the AI feels lifelike and responsive, enhancing the overall experience of negotiating [24]. - The article concludes with a call for readers to share their own experiences and strategies, fostering community engagement [25].
Masayoshi Son sells off his entire Nvidia stake again! The last lesson was "worth $250 billion"
量子位· 2025-11-12 08:01
Core Viewpoint - Masayoshi Son's decision to liquidate his entire stake in Nvidia raises questions about his investment strategy, particularly in the context of the AI boom and his shift towards OpenAI [2][6][31]. Group 1: Nvidia Stake Sale - SoftBank sold 32.1 million shares of Nvidia for $5.83 billion (approximately 41.5 billion RMB) after the end of Q2 2025 [3]. - Nvidia's market capitalization recently surpassed $5 trillion, indicating its significant value in the AI sector [5]. - This is not the first time Son has sold Nvidia shares; he previously liquidated his stake in 2019, which he later regretted as it cost him an estimated $250 billion in potential returns [28][25]. Group 2: Shift to OpenAI - The proceeds from the Nvidia sale are intended to fund SoftBank's substantial investment in OpenAI, with a commitment of up to $40 billion, of which $30 billion is expected to be invested [11][9]. - SoftBank's CFO confirmed that the liquidation aligns with the company's collaboration with OpenAI [8]. - The first tranche of $10 billion was completed in April 2025, with plans for further investments as OpenAI prepares for an IPO [11][20]. Group 3: Strategic Shift in AI Investment - Analysts suggest that Son's move to sell Nvidia is not an exit from AI but a strategic repositioning towards software and application layers, moving away from hardware [14][16]. - SoftBank's upcoming investments include acquiring Ampere for $6.5 billion and ABB's robotics business for $5.4 billion, indicating a focus on software and AI applications [17]. - The completion of OpenAI's restructuring paves the way for its IPO, which could yield significant returns for SoftBank [20][21].
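The sale figures above imply an average price per share and an exchange rate. The Python sketch below derives both from the three numbers given in the summary; the variable names are illustrative, and the results are only as precise as the article's rounded inputs.

```python
shares_sold = 32.1e6   # Nvidia shares sold by SoftBank
proceeds_usd = 5.83e9  # sale proceeds in US dollars
proceeds_rmb = 41.5e9  # the same proceeds as quoted in RMB

# Average price per share implied by the sale.
price_per_share = proceeds_usd / shares_sold

# Exchange rate implied by the two proceeds figures.
implied_rmb_per_usd = proceeds_rmb / proceeds_usd

print(f"${price_per_share:.2f} per share")      # $181.62 per share
print(f"{implied_rmb_per_usd:.2f} RMB per USD")  # 7.12 RMB per USD
```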
Silicon Valley is buzzing: the fastest speech-to-text model
量子位· 2025-11-12 08:01
Core Insights - The article discusses the launch of Scribe v2 Realtime, a cutting-edge speech-to-text model by ElevenLabs, which has garnered significant attention in Silicon Valley for its impressive performance metrics [3][4][16]. - The model boasts a latency of just 150 milliseconds and an accuracy rate of 93.5%, supporting over 90 languages, marking a significant advancement in the field of real-time speech transcription [4][10][15]. Company Overview - ElevenLabs, founded in 2022, focuses on AI voice technology and has quickly established itself in the industry, achieving over $200 million in revenue within 20 months of operation [18][21]. - The company’s founding team includes former Google machine learning engineers and Palantir strategists, emphasizing a strong technical background [19][23]. - ElevenLabs operates with a unique team structure, consisting of small, agile teams without formal titles, allowing for efficient decision-making and operations [23]. Product Features - Scribe v2 Realtime is designed to handle various audio formats and includes features like voice activity detection and customizable audio stream processing, enhancing its usability for diverse applications [10][12]. - The model has shown remarkable adaptability, accurately transcribing speech even in noisy environments and with complex terminology, which is a significant improvement over previous models [9][13]. Industry Context - The real-time speech-to-text sector has evolved through multiple technological iterations, with earlier models struggling with accuracy and latency issues, often exceeding 30% error rates in noisy conditions [13][14]. - The introduction of the Transformer architecture has alleviated the long-standing trade-off between speed and accuracy, enabling models like Scribe v2 Realtime to achieve both high accuracy and low latency [14][15].
Luo Fuli takes center stage at Xiaomi in her first official announcement since leaving DeepSeek
量子位· 2025-11-12 08:01
Core Insights - Luo Fuli has officially announced her position at Xiaomi, leading the MiMo team to advance the development of multi-modal spatial intelligence, a key step towards achieving Artificial General Intelligence (AGI) [1][3][7] Group 1: Background and Context - Rumors about Luo Fuli joining Xiaomi surfaced at the end of last year, with reports indicating that she was recruited by Lei Jun with a salary of tens of millions [4][10] - Significant events include the launch of DeepSeek-V3 on December 25, followed by media reports of Xiaomi assembling a GPU cluster [5][6] - Luo Fuli's name appeared in Xiaomi's AI team papers as an independent researcher prior to her official announcement [11][20] Group 2: Luo Fuli's Profile - Luo Fuli holds a Bachelor's degree in Computer Science from Beijing Normal University and a Master's degree in Computational Linguistics from Peking University, with numerous publications in top NLP conferences [15][17] - She has over 11,000 citations for her academic papers, with approximately 8,000 citations added in the current year alone [18] - Luo previously worked at Alibaba's DAMO Academy and DeepSeek, contributing to the development of various deep learning models [17] Group 3: Xiaomi's AI Ambitions - Xiaomi aims to enter the deep waters of AI following the establishment of its automotive business, with a focus on spatial intelligence [9][24] - The concept of spatial intelligence, as articulated by Luo Fuli, involves bridging the gap between information AI and physical AI, which aligns with Xiaomi's ecosystem of people, vehicles, and homes [23][25]
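Medical AI reaches a turning point! A Chinese medical AI breaks through first, with clinical diagnosis and treatment capabilities that lead the world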
量子位· 2025-11-12 04:08
Core Viewpoint - The article discusses the gap between the real capabilities of medical AI and clinical expectations, highlighting the need for a new evaluation standard for medical AI that reflects its clinical applicability and safety [1][10][17]. Group 1: Medical AI's Current Challenges - Many AI models that perform well in standardized exams often reveal issues such as reasoning errors, misdiagnosis, and inappropriate treatment plans in real clinical settings [2][9]. - The recent update from OpenAI prohibits ChatGPT from assisting in medical diagnostics, indicating a cautious approach towards AI's involvement in serious medical fields [2][10]. Group 2: New Evaluation Standards - A new evaluation standard, the Clinical Safety-Effectiveness Dual-Track Benchmark (CSEDB), has been developed by top Chinese clinical experts to assess the clinical applicability of medical AI [5][10]. - This new standard introduces a dual evaluation system focusing on both safety and effectiveness, moving beyond traditional exam score metrics [11][12]. Group 3: MedGPT's Performance - MedGPT, a model developed by a Chinese company, achieved the highest score of 0.895 in the CSEDB evaluation, outperforming other models by over 15 percentage points [19][22]. - MedGPT is the only model to score higher in safety than in effectiveness, demonstrating a cautious approach in clinical scenarios [24][26]. Group 4: Future of Medical AI - The future of medical AI lies in its ability to replicate the expertise of top clinicians, creating new medical resources and serving as a reliable assistant for healthcare professionals [34][46]. - The "Future Doctor" platform aims to scale the clinical experience and decision-making capabilities of expert doctors through AI, ensuring that all patient interactions are handled by real doctors [41][45]. 
Group 5: Industry Impact - The establishment of the CSEDB standard represents a significant step towards a more mature medical AI industry, allowing for better evaluation and optimization of AI models [54][55]. - The evolution of medical AI from merely simulating doctor responses to actively participating in clinical reasoning marks a pivotal moment in the industry [56][57].
Final week! Applications for the annual AI rankings are about to close.
量子位· 2025-11-12 04:08
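Let us witness the stars of the year together and light the way to the future. Organizing Committee, reporting from Aofeisi. 量子位 | WeChat official account QbitAI
Applications for the "2025 AI Annual Rankings" have entered the countdown stage. This is the 8th year of 量子位's annual AI rankings. Over these eight years, we have witnessed technological breakthroughs and real-world deployment, the integration and reshaping of industries, and wave after wave of companies, people, and products that have pushed the era forward.
This year's selection sets up five award categories across three dimensions: companies, products, and people. Companies are welcome to use the remaining time and apply as soon as possible! Company rankings. Product rankings. People rankings: 2025 AI Focus Person of the Year.
How to apply: Applications close on November 17, 2025. The results will be officially announced at the MEET2026 Intelligent Future Conference, hosted by 量子位. Scan the QR code to apply. Web link: https://wj.qq.com/s2/23740133/iso8/
For other questions about this selection, please contact 量子位 staff: add WeChat 18801103170, or email linyu@qbitai.com with the note "评选-企业-姓名" (Selection-Company-Name).
The detailed selection criteria and application method follow. 2025 AI Leading Enterprise of the Year: this award covers China's AI sector and selects the companies with the strongest overall capabilities. Eligibility: Selection criteria: 2025 ...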
Alibaba releases an AI resume-parsing tool with only 0.6B parameters
量子位· 2025-11-12 04:08
Core Viewpoint - Alibaba Group's research team has developed a layout-aware resume parsing framework that significantly improves the efficiency and accuracy of automated resume screening, addressing key pain points in the recruitment process [2][4][6]. Group 1: Technology and Innovation - The new framework achieves an accuracy rate close to top industry models like Claude-4, processing entire resumes in just 1-2 seconds [3][4]. - This innovation directly addresses three major challenges in automated resume parsing: diverse formatting, high costs of large models, and slow response times [4][8]. - The framework's paper titled "Layout-Aware Parsing Meets Efficient LLMs: A Unified, Scalable Framework for Resume Information Extraction and Evaluation" has been published [4]. Group 2: Model Efficiency - Instead of using large models with billions of parameters, the research team fine-tuned a smaller model with only 0.6 billion parameters (Qwen3-0.6B) [15]. - The model was trained on a specially constructed dataset containing thousands of resumes, enabling it to extract key information accurately [16]. - The system employs a "parallel task decomposition" and "index pointer" mechanism, allowing for simultaneous processing of extraction tasks, which significantly reduces response time [17][18]. Group 3: Performance Metrics - The fine-tuned 0.6B model achieved an F1-score of 0.964 on the RealResume dataset, with an average processing time of 1.54 seconds per resume, outperforming Claude-4's 4.62 seconds [20]. - The system can handle a throughput of 240-300 resumes per minute, with an average response delay of under 2 seconds and a 100% success rate in parsing within 10 seconds [22]. Group 4: Deployment and Impact - The technology framework has been fully deployed in Alibaba Group's internal HR systems, demonstrating its practical application and effectiveness in real-time processing [21]. 
- This research illustrates that innovative system design and model optimization can significantly lower the barriers and costs associated with using large model technologies without sacrificing accuracy [23].
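The summary names "parallel task decomposition" and an "index pointer" mechanism without describing them. One plausible reading, sketched below in Python, is that each extraction sub-task runs concurrently and returns line-index spans rather than regenerated text, which the system then resolves against the original resume. This is an illustrative assumption, not the paper's implementation: the sample resume, the `Span` type, and the heuristic extractors (which stand in for the fine-tuned Qwen3-0.6B model) are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int]  # inclusive (start_line, end_line) pointer into the resume

# Hypothetical resume, already split into indexed lines by a layout step.
RESUME: List[str] = [
    "Jane Doe",
    "jane@example.com",
    "EXPERIENCE",
    "Acme Corp, Engineer, 2020-2024",
    "Built data pipelines.",
    "EDUCATION",
    "Example University, BSc, 2016-2020",
]


def basic_info(lines: List[str]) -> Span:
    """Toy extractor: everything before the first all-caps section header."""
    first_header = next(i for i, line in enumerate(lines) if line.isupper())
    return (0, first_header - 1)


def section_span(lines: List[str], header: str) -> Span:
    """Toy extractor: the lines between `header` and the next all-caps header."""
    start = lines.index(header) + 1
    end = start
    while end + 1 < len(lines) and not lines[end + 1].isupper():
        end += 1
    return (start, end)


# One independent sub-task per field group (the "task decomposition").
TASKS: Dict[str, Callable[[List[str]], Span]] = {
    "basic_info": basic_info,
    "experience": lambda lines: section_span(lines, "EXPERIENCE"),
    "education": lambda lines: section_span(lines, "EDUCATION"),
}


def parse(lines: List[str],
          tasks: Dict[str, Callable[[List[str]], Span]]) -> Dict[str, List[str]]:
    """Run all sub-tasks in parallel, then resolve index pointers to text."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(fn, lines) for name, fn in tasks.items()}
        spans = {name: future.result() for name, future in futures.items()}
    # Copying the pointed-to lines verbatim means the extractor never has to
    # regenerate (and possibly alter) the original text.
    return {name: lines[start:end + 1] for name, (start, end) in spans.items()}


parsed = parse(RESUME, TASKS)
print(parsed["experience"])
```

Under this reading, returning short index spans keeps each sub-task's output small, and running the sub-tasks side by side keeps total latency close to that of the slowest one. Both would be consistent with the 1 to 2 second response times reported above.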