量子位 Is Hiring Editors and Writers
量子位· 2026-02-12 07:52
Editorial team, from 凹非寺. 量子位 | WeChat official account QbitAI

The AI wave is still surging, but if you don't yet know how to take part... why not join 量子位? We are a content platform centered on tracking the latest AI developments. After eight years of accumulation, we have top-tier influence, broad and well-recognized industry resources, and the best vantage point for observing and learning at the frontier of the era.

We are currently hiring in three directions, and hope you are (or can become) a content expert in one of them:
- AI Industry: infrastructure-layer innovation, including chips, AI Infra, and cloud computing;
- AI Finance: AI-sector venture capital and earnings, tracking capital flows across the industry chain;
- AI Product: AI progress in applications and hardware devices.

All positions are full-time, based in Zhongguancun, Beijing.

Who we are hiring:
- Experienced hires: editor, lead writer, and editor-in-chief levels, matched to ability;
- Campus hires: new graduates; internships with conversion to full-time are accepted.

Responsibilities include taking part in core interviews, talking with industry experts and leading engineers, and writing AI cloud-deployment case studies. Positions at every seniority level are open; you are welcome to apply based on your own background and experience.

Join us and you can:
- Stand at the crest of the AI wave: be among the first to encounter the newest AI technologies and products, and build a complete AI knowledge system.
- Master new AI tools: put all kinds of new AI technologies and tools ...
GLM-5 Is Seriously Impressive: Over 24 Hours of Autonomous Coding, 700 Tool Calls, 800 Context Switches!
量子位· 2026-02-12 07:52
Core Insights
- The release of GLM-5 marks a significant advancement in open-source AI, bringing it into the era of long-task capabilities [2][25]
- GLM-5 demonstrates exceptional programming abilities, successfully creating a Game Boy Advance emulator from scratch and showcasing its stability and reliability on complex tasks [3][9][12]
- The model has achieved competitive performance, ranking alongside Claude Opus 4.5 in various assessments, indicating strong programming capability and operational stability [15][17]

Group 1: Performance and Capabilities
- GLM-5 executed over 700 tool calls and 800 context switches while maintaining consistent syntax and accuracy [12]
- It has been recognized for its ability to generate complex applications, such as a 3D Monopoly game and an interactive version of Minecraft, demonstrating its versatility [26][35]
- The model's performance on the Vending Bench 2 test has positioned it as the leading open-source model in terms of operational capability [23]

Group 2: Industry Impact
- The emergence of GLM-5 signals a transformative shift in the SaaS industry, as it allows developers to create sophisticated applications without relying on traditional software subscriptions [38][40]
- The release triggered market reactions, with significant declines in SaaS-related stocks reflecting investor concerns about the implications of AI for software sales [39]
- GLM-5's capabilities challenge the previous dominance of closed-source models, giving developers tools that were once exclusive to major corporations [40]

Group 3: Community and Developer Engagement
- The open-source nature of GLM-5 has generated significant interest and demand among developers, with many eager to use its capabilities [41]
- The model's development has become a focal point for the community, with its headquarters attracting attention as a notable location [42]
- The ongoing advancements in AI programming, initiated with earlier versions, have positioned GLM-5 as a leading choice for coding tasks in both domestic and international markets [41]
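The long-horizon behavior described above — hundreds of tool calls interleaved with context switches — follows the standard agentic loop. A minimal, model-agnostic sketch (the model interface, tool dispatch, and context-trimming policy here are illustrative assumptions, not GLM-5's actual API):

```python
# Minimal agentic tool-call loop (illustrative sketch; the action format and
# the crude context-switch policy are assumptions, not GLM-5's real protocol).

def run_agent(model, tools, task, max_steps=10, context_limit=6):
    context = [{"role": "user", "content": task}]
    tool_calls = 0
    for _ in range(max_steps):
        action = model(context)                   # model decides the next step
        if action["type"] == "final":             # task finished
            return action["content"], tool_calls
        tool = tools[action["tool"]]              # dispatch the tool call
        result = tool(*action.get("args", ()))
        tool_calls += 1
        context.append({"role": "tool", "content": str(result)})
        if len(context) > context_limit:          # crude "context switch":
            context = context[:1] + context[-3:]  # keep task + recent turns
    return None, tool_calls
```

A real agent would carry richer state across switches; the point is only that tool calls and context management are an outer loop around the model.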
An LLM Board-Game Playtester Arrives: Five Personas Simulate Diverse Players, with Rating Accuracy Surpassing GPT-5.1
量子位· 2026-02-12 07:52
Core Insights
- The article introduces MeepleLM, a virtual playtester that simulates diverse player experiences and provides constructive feedback based on dynamic gameplay [1][4]
- MeepleLM significantly outperforms general models like GPT-5.1 and Gemini3-Pro in accurately reflecting player reviews and ratings [2]

Group 1: MeepleLM Overview
- MeepleLM is developed by a collaborative research team from Shanda Tokyo Research Institute, Shanghai Chuangzhi Academy, Nankai University, and Shanghai AI Lab [1]
- The model uses a dataset of 1,727 structured board-game rulebooks and 150,000 real player reviews to build a mapping from objective rules to subjective experiences [1][9]
- The MDA (Mechanics-Dynamics-Aesthetics) framework is employed to deepen the model's understanding of gameplay interactions and emotional experience [12]

Group 2: Challenges in Board Game Design
- The board-game industry is growing rapidly, yet the design process faces significant challenges because it depends on social interaction and emergent gameplay [3]
- Traditional playtesting is time-consuming and often fails to capture the preferences of diverse player types [3]

Group 3: Data and Methodology
- A high-quality dataset was constructed through a layered sampling strategy, converting unstructured PDF rulebooks into structured documents [9]
- The team filtered 1.8 million reviews to extract roughly 8% high-quality data that deeply connects game mechanics with dynamic experiences [9]

Group 4: Player Personas
- Five distinct player personas were identified through clustering analysis, each representing different preferences and reactions to game mechanics [13][14][15][16][17]
- MeepleLM can role-play these personas to provide varied feedback based on specific player preferences [18]

Group 5: Performance Evaluation
- Extensive testing on 207 games demonstrated MeepleLM's superior performance in community alignment, review quality, and utility compared to general models [21][22]
- MeepleLM effectively captures the polarized nature of player reviews, identifying both strengths and critical flaws in games [22]

Group 6: Practical Applications
- MeepleLM's reviews combine factual accuracy with diverse viewpoints, making it a valuable tool for players and designers alike [25][27]
- Over 70% of users prefer MeepleLM for purchase decisions, citing its effectiveness in identifying potential design flaws [27]

Group 7: Future Implications
- MeepleLM establishes a new paradigm for automated virtual testing of interactive systems, paving the way for empathetic human-machine collaboration [28]
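The persona role-play idea can be pictured as conditioning a rating on a preference vector over game-mechanic features. A toy, pure-Python sketch (the persona names, weights, and scoring rule are invented for illustration; MeepleLM's actual personas come from clustering real reviews):

```python
# Toy persona-conditioned ratings: each persona weights mechanic features
# differently. Personas and weights are invented, not MeepleLM's.

PERSONAS = {
    "strategist":   {"depth": 0.7,  "luck": -0.4, "theme": 0.1},
    "social_gamer": {"depth": 0.1,  "luck": 0.2,  "theme": 0.5},
    "casual":       {"depth": -0.3, "luck": 0.4,  "theme": 0.4},
}

def persona_rating(features, persona, base=6.0):
    """Score a game (feature values in [0, 1]) from one persona, clamped to 1-10."""
    weights = PERSONAS[persona]
    score = base + sum(w * features.get(k, 0.0) for k, w in weights.items()) * 5
    return max(1.0, min(10.0, round(score, 1)))

def all_persona_ratings(features):
    return {p: persona_rating(features, p) for p in PERSONAS}
```

The same game thus receives divergent scores from different personas, mirroring the polarized reviews the article describes.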
Another Clash of the Titans! iFLYTEK Spark X2 Makes a Hardcore Debut, with Comprehensive Upgrades in Industry Depth
量子位· 2026-02-11 14:57
Core Viewpoint
- The article highlights the significant advancements of iFLYTEK's Spark X2 model, showcasing a 50% improvement in reasoning performance over its predecessor, Spark X1.5, achieved within just three months, and emphasizes that its capabilities are built on domestic computing power [2][10][50]

Group 1: Model Performance and Capabilities
- Spark X2 demonstrates outstanding general capabilities, ranking among the top in the industry and competing closely with international models like GPT-5.2 and Gemini-3-Pro [2][12]
- The model excels particularly in mathematical calculation and logical reasoning, and maintains proficiency in over 130 languages [3][12]
- Spark X2's reasoning training efficiency has improved by 50%, a notable achievement given the diminishing returns in performance enhancement for large models [8][9]

Group 2: Technical Innovations
- The model incorporates several technical innovations, including weight quantization, low-precision KVCache, and a two-stage reasoning sampling method, which together raise training efficiency by 10% [24][27][30]
- A new adaptive calibration algorithm keeps the model logically consistent during training and inference, addressing common issues in large-model architectures [24]
- A recursive high-difficulty data synthesis method produces high-quality training data, improving the model's reasoning accuracy [25][26]

Group 3: Industry Applications and Market Position
- Spark X2 is positioned to drive significant advancements across industries, with a focus on practical applications and user experience [31][54]
- In healthcare, Spark X2 outperforms competitors on key metrics, establishing a new benchmark for medical AI models [34][35]
- Its deployment in education enhances personalized learning, enabling detailed feedback and analysis of student performance [37][41]

Group 4: Strategic Direction and Future Outlook
- iFLYTEK's strategy pairs a general-purpose model with specialized industry models, aiming for rapid deployment and practical application [54][55]
- The company has established itself as a leader in the domestic AI landscape, leveraging its approach to overcome challenges around computing power and model performance [50][56]
- The advancements of Spark X2 signal a shift toward applying domestic AI models, marking the beginning of a new phase of growth and opportunity in the industry [56]
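Of the techniques listed above, weight quantization is the simplest to illustrate: weights are mapped to low-bit integers plus a scale factor, trading a little precision for memory and bandwidth. A minimal symmetric int8 round-trip sketch (a generic textbook scheme, not iFLYTEK's actual implementation):

```python
# Symmetric per-tensor int8 weight quantization round-trip (generic sketch,
# not iFLYTEK's implementation; shown only to illustrate the technique).

def quantize_int8(weights):
    """Map float weights to int8 values plus one scale factor."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid scale == 0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return [v * scale for v in q]
```

The round-trip error is bounded by half the scale step, which is why the largest-magnitude weight in a tensor determines its quantization quality.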
This Stock-Trading AI Posts a 27.75% Annualized Return! A Self-Evolving Agent Mines Quant Factors That Survive Bull and Bear Markets
量子位· 2026-02-11 12:49
Contributed by a team from Shanghai University of Finance and Economics. 量子位 | WeChat official account QbitAI

At the foundation of quantitative finance, an alpha factor is essentially a piece of executable code logic that tries to map noisy market data to precise trading signals. Yet automated factor mining has long been stuck in a dilemma. Traditional genetic programming (GP) excels at evolutionary search over vast spaces, but it is in essence blind random mutation: the factors it finds overfit historical noise in backtests and are nearly impossible to interpret, like a black box full of coincidences. Emerging large language models (LLMs), by contrast, have strong semantic understanding and can read financial theory the way a human does, but when handling high-precision quantitative logic they tend to fall into semantic drift and hallucination: the generated code is often syntactically perfect yet economically incoherent, hard even to reproduce, let alone to survive live trading.

This raises a fundamental question: can we build an agent that combines a machine's tireless evolutionary exploration with a human programmer's rigorous logical control?

Against this backdrop, the QuantaAlpha team, together with the AIFin Lab at Shanghai University of Finance and Economics and other institutions, proposed the QuantaAlpha framework: a new self-evolving agent framework for alpha-factor mining.

Core framework: a trajectory-based self-evolution paradigm

Quant ...
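As the article notes, an alpha factor is just executable logic over market data. A classic momentum-style factor makes the idea concrete (a generic textbook factor for illustration, not one mined by QuantaAlpha):

```python
# A simple momentum alpha factor: the past-window return as a trading signal.
# Generic textbook example for illustration, not a QuantaAlpha output.

def momentum_factor(prices, window=3):
    """Return the fractional price change over `window` periods, per date.

    The first `window` entries are None because there is not enough history.
    """
    signal = [None] * window
    for t in range(window, len(prices)):
        signal.append(prices[t] / prices[t - window] - 1.0)
    return signal
```

A mining framework like the one described searches over a space of such programs, scoring each candidate by backtest performance and logical validity.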
量子位 Is Hiring Editors and Writers
量子位· 2026-02-11 12:49
Core Viewpoint
- The article emphasizes the ongoing AI boom and invites individuals to join 量子位 (QbitAI), which focuses on tracking AI advancements and has established itself as a leading content platform in the industry [1]

Group 1: Job Opportunities
- The company is hiring in three main directions: AI Industry, AI Finance, and AI Product, with positions for both experienced professionals and fresh graduates [2][4]
- Positions are open at various levels, including editor, lead writer, and editor-in-chief, matched to individual capability [6]

Group 2: Job Responsibilities
- AI Industry: tracking infrastructure-layer innovation such as chips, AI infrastructure, and cloud computing, and interpreting technical reports from conferences [6][7]
- AI Finance: covering venture capital, financial reports, and capital movements in the AI industry, requiring strong analytical skills and a passion for interviews [11]
- AI Product: monitoring AI applications and hardware developments, requiring a keen understanding of product experience and market trends [11]

Group 3: Benefits and Growth
- Employees engage with industry leaders, build personal influence through original content, and receive professional mentorship from senior editors [6]
- The company offers competitive salaries and comprehensive benefits, including social insurance, meal allowances, and performance bonuses [6]

Group 4: Company Growth Metrics
- As of 2025, 量子位 has over 2.4 million WeChat subscribers and more than 7 million users across platforms, with daily reading volume exceeding 2 million [12]
Zhipu (智谱) Open-Sources an OCR Model! After Testing It, I Uninstalled Every Scanner App on My Phone......
量子位· 2026-02-11 12:49
Core Insights
- The article discusses the capabilities and performance of the GLM-OCR model, highlighting its competitive edge in the OCR landscape, particularly in complex scenarios like handwriting and table recognition [1][39]

Performance Comparison
- GLM-OCR outperforms several competitors across OCR tasks, achieving a document parsing accuracy of 94.6% on OmniDocBench V1.5, surpassing PaddleOCR and others [2]
- In text recognition, GLM-OCR achieves 94.0% accuracy, far above some competitors such as Deepseek-OCR2, which reaches only 34.7% [2]
- For formula recognition, GLM-OCR scores 96.5%, indicating strong performance on mathematical expressions [2]
- The model also excels at table recognition, with an accuracy of 85.2% on PubTabNet, outperforming many alternatives [2]

Practical Applications
- GLM-OCR is particularly effective for structured documents such as Word files, PPTs, and academic papers, as well as for clear handwriting, receipts, and scanned contracts [3][4]
- The model handles handwritten forms well, achieving an accuracy of 86.1% [4]
- It can accurately extract information from documents including meeting minutes and whiteboard notes, making it suitable for everyday work scenarios [3][4]

User Experience
- Users report a generally positive experience with GLM-OCR on standard document parsing tasks, although challenges remain with unclear handwriting and complex layouts [4][12]
- The model's handling of low-quality inputs is commendable, with recognition accuracy around 96% for mixed content, although some errors were noted in specific cases [13][29]

Structural Extraction
- GLM-OCR supports structured information extraction, producing outputs in standard JSON format from various documents, which benefits applications like invoicing and identification [36][38]
- The model's structured-extraction performance improves significantly when clear prompts are provided, indicating adaptability to user requirements [38]

Industry Trends
- The OCR market is evolving rapidly, with new models like GLM-OCR emerging to meet rising demands for efficiency and accuracy [39][40]
- The trend toward smaller model parameter counts (0.07B to 0.9B) is making deployment easier and more cost-effective for users [51]
- Higher output quality and shorter processing times are becoming standard expectations in the OCR industry, benefiting users across sectors [51]
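A structured-extraction workflow like the one above typically ends with parsing and validating the model's JSON output. A minimal sketch of that last step (the markdown-fence convention and field names are illustrative assumptions, not GLM-OCR's documented output format):

```python
import json

# Parse a model's structured-extraction output into a validated dict.
# The ```json fence handling and required field names are illustrative
# assumptions, not GLM-OCR's documented format.

def parse_extraction(raw, required=("title", "date", "total")):
    text = raw.strip()
    if text.startswith("```"):                      # strip a markdown code fence
        text = text.split("\n", 1)[1].rsplit("```", 1)[0]
    data = json.loads(text)                         # raises on malformed JSON
    missing = [k for k in required if k not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data
```

Validating required fields at the boundary is what makes such output safe to feed into downstream systems like invoicing pipelines.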
A 9B On-Device Open-Source Model Handles Million-Token Context, Thanks to 面壁's New Sparse-Linear Hybrid Attention Architecture SALA!
量子位· 2026-02-11 12:49
henry, from 凹非寺. 量子位 | WeChat official account QbitAI

The strongest large models have pushed scaling into a new dimension: million-token context.

A few days ago, the release of Claude Opus 4.6 gave people their first tangible taste of the emergent abilities that million-token context enables: ingesting 500,000 Chinese characters in a single pass, cross-document legal analysis, multi-turn agent planning...

Users promptly voted with their feet, and Wall Street answered directly with its candlestick charts.

Meanwhile, MiniCPM-SALA, a model built on the SALA attention architecture, will be open-sourced alongside it.

In addition, 面壁 (ModelBest), under the banner of its OpenBMB community, has joined SGLang and NVIDIA to launch the 2026 Sparse Operator Acceleration Race (SOAR), putting this scaling capability directly into developers' hands and pushing performance breakthroughs for on-device agent deployment.

A Linear-Sparse hybrid attention architecture

Too long, didn't read? Straight to the point: how exactly does 面壁's new hybrid architecture SALA (Sparse Attention-Linear Attention) mix the two? In short, it combines 75% linear attention (Lightning Attention) with 25% sparse attention (InfLLM v2), tied together by the hybrid position encoding HyPE ...
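The 75%/25% linear-sparse mix described above can be pictured as a per-layer assignment of attention types. A toy sketch of such a layer schedule (the exact interleaving pattern is an illustrative assumption, not 面壁's published MiniCPM-SALA layout):

```python
# Assign attention types across transformer layers for a 3:1 linear:sparse mix.
# The interleaving pattern is an illustrative assumption, not the published
# MiniCPM-SALA layer layout.

def sala_layer_schedule(num_layers, sparse_every=4):
    """Make every `sparse_every`-th layer sparse and the rest linear."""
    return [
        "sparse" if (i + 1) % sparse_every == 0 else "linear"
        for i in range(num_layers)
    ]
```

The appeal of such hybrids is cost: linear-attention layers run in O(n) over sequence length, while the periodic sparse layers retain selective long-range retrieval.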
Avalanche at Musk's xAI! Two Co-Founders Quit Within 24 Hours, Three Chinese Founders Lost in a Month, and Only Half of the 12-Person Dream Team Remains
量子位· 2026-02-11 04:10
梦晨, from 凹非寺. 量子位 | WeChat official account QbitAI

Within 24 hours, Musk's xAI lost two Chinese co-founders.

xAI co-founders Tony Wu (吴宇怀) and Jimmy Ba announced their departures on social media one after the other. And just a month earlier, another Chinese co-founder, Greg Yang (杨格), had stepped away due to illness.

Three core Chinese scientists, all gone within a single month. Counting the three who had already left, 6 of the original 12-person founding team have now departed in the less than three years since xAI was founded. At the same time, several core members who joined later have also announced their exits.

What is happening to Musk's AI team?

Three farewells in one month: from an illness-driven exit to mentor and protégé leaving back to back

The first to go in this wave was Greg Yang. In January 2026, the core Grok architect announced that he had been diagnosed with Lyme disease and had to withdraw from day-to-day work, becoming an informal advisor to the company. Yang holds a mathematics degree from Harvard, studied under the renowned mathematician Shing-Tung Yau (丘成桐), and was previously a researcher at Microsoft Research. In his statement he explained that he had probably carried the infection for a long time, but that the "sustained high-intensity work" and "pushing myself too hard" during xAI's founding had weakened his immune system, allowing the disease to surface and worsen.

Next was Tony Wu. On February 10, he posted his resignation statement on X: it is time to begin my next chapter; this is an era full of possibility: one ...
Surpassing CLIP! Peking University Open-Sources a Fine-Grained Visual Recognition Model That Needs Only 4 Training Images per Class
量子位· 2026-02-11 01:55
Core Viewpoint
- The article discusses the limitations of current multimodal large models in fine-grained visual recognition and introduces the Fine-R1 model developed by Professor Peng Yuxin's team at Peking University, which significantly improves recognition accuracy with minimal training data [1][2][5]

Group 1: Fine-Grained Visual Recognition Challenges
- Current multimodal large models excel at complex tasks but lag behind their own visual encoders, such as CLIP, in fine-grained visual recognition [1]
- Real-world objects exhibit fine-grained structure with numerous subclasses, such as over 500 types of fixed-wing aircraft, highlighting the importance of fine-grained recognition in practical applications [3]

Group 2: Fine-R1 Model Overview
- Fine-R1 leverages rich knowledge of fine-grained subclasses and a generative decoding paradigm to overcome the limitations of traditional recognition methods, enabling fine-grained recognition of arbitrary visual objects in an open domain [5]
- Fine-R1 learns to reason about unseen subclasses from a small number of training images (only 4 per subclass), outperforming models like OpenAI's CLIP and Google DeepMind's SigLIP [5][15]

Group 3: Model Development Process
- The development of Fine-R1 involves two main steps:
1. Chain-of-thought supervised fine-tuning, which simulates human reasoning to build inference capability [7]
2. Triplet enhancement strategy optimization, which improves robustness to intra-class variation and sharpens inter-class distinctions by using positive and negative samples [8][10]

Group 4: Experimental Results
- Fine-R1 was evaluated on six authoritative fine-grained image classification datasets, demonstrating superior accuracy on both seen and unseen categories compared to other models [15][17]
- The model's effective use of fine-grained subclass knowledge was identified as the primary driver of its improved recognition accuracy, rather than enhancements in visual representation or knowledge storage [19]

Group 5: Conclusion and Future Work
- The article concludes with Fine-R1's potential in fine-grained visual recognition tasks, emphasizing its innovative approach to reasoning and knowledge application [21]
- The research has been accepted to ICLR 2026, and the code is open-sourced for further exploration [2][22]
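The triplet enhancement strategy mentioned in the development process rests on the standard triplet objective: pull an anchor toward a positive sample of the same subclass and push it away from a negative one. A generic margin-based sketch over plain embedding vectors (a textbook formulation, not the team's released training code):

```python
# Generic triplet margin loss over embedding vectors (textbook formulation,
# shown for illustration; not Fine-R1's released training code).

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Penalize triplets where the positive is not closer than the negative
    by at least `margin`."""
    return max(0.0, l2_distance(anchor, positive)
               - l2_distance(anchor, negative) + margin)
```

The loss is zero once same-subclass pairs are closer than cross-subclass pairs by the margin, which is exactly the intra-class/inter-class robustness the strategy targets.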