QbitAI (量子位) - filings, earnings calls, financial reports, news

QbitAI · 2025-10-14 04:08

Core Viewpoint - The article discusses SEAL (Self-Adapting LLMs), a new reinforcement learning framework developed by MIT that enables large models to update their own weights and learn new knowledge without human intervention [1][4][6].

Group 1: SEAL Framework Overview
- SEAL employs a nested learning mechanism: an outer loop driven by reinforcement learning and an inner loop that performs parameter updates [4][26].
- The framework lets models generate their own fine-tuning data and self-update instructions, overcoming the limitation of relying solely on external supervised data [6][25].

Group 2: Knowledge Incorporation Experiment
- In the knowledge incorporation experiment, the Qwen2.5-7B model was tested on the SQuAD dataset, where it generated training data from new paragraphs without seeing the corresponding questions [9][10].
- The model's accuracy improved from 32.7% to 47.0% when fine-tuned with SEAL, outperforming both the original data and GPT-4.1-generated data [14][15].
- SEAL reached 58.2% accuracy when tested with longer paragraphs, indicating that it generalizes to larger-scale data organization tasks [16].

Group 3: Few-Shot Learning Experiment
- In the few-shot learning experiment, the LLaMA-3.2-1B-Instruct model was evaluated on a subset of tasks from the ARC-AGI dataset [17][18].
- SEAL achieved a success rate of 72.5%, far above the 0% of fixed few-shot prompts and the 20% of random sampling strategies [22][23].
- Although SEAL did not match the optimal strategy (Oracle TTT) at 100%, it showed strong task adaptability through self-discovered learning paths [22].

Group 4: Mechanism of SEAL
- SEAL's process involves reading new information, rewriting it in the model's own words, and performing gradient updates to learn autonomously [25].
- The model generates self-edit instructions that describe how to update itself based on the current input, including information extraction and training parameters [28][29].
- The framework uses a non-traditional reinforcement learning method called ReSTEM, which relies on behavior cloning and filtered sampling to optimize self-edit strategies [33][36].
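The nested loop summarized above can be illustrated with a toy program. This is a minimal sketch under stated assumptions, not MIT's implementation: the "model" is a single float, a "self-edit" is just a proposed step, and `evaluate`, `inner_update`, and `outer_loop` are hypothetical names. The outer loop mimics the filtered-sampling idea attributed to ReSTEM: sample self-edits, keep only those whose inner-loop update improves a downstream score, then imitate the kept ones.

```python
import random

TARGET = 3.0  # hypothetical "new knowledge" the toy model should absorb


def evaluate(weight: float) -> float:
    """Downstream score; higher is better (negative distance to the target)."""
    return -abs(TARGET - weight)


def inner_update(weight: float, self_edit: float) -> float:
    """Inner loop: apply one self-edit as a parameter update."""
    return weight + self_edit


def outer_loop(rounds: int = 20, samples: int = 8, seed: int = 0) -> float:
    """Outer loop: filtered sampling plus imitation of the edits that helped."""
    rng = random.Random(seed)
    weight = 0.0     # toy model parameter
    edit_mean = 0.0  # toy policy that proposes self-edits
    for _ in range(rounds):
        base = evaluate(weight)
        # Sample candidate self-edits from the current policy.
        candidates = [rng.gauss(edit_mean, 1.0) for _ in range(samples)]
        # Filter: keep only edits whose update improves the downstream score.
        kept = [e for e in candidates if evaluate(inner_update(weight, e)) > base]
        if not kept:
            continue
        # Behavior-cloning step: move the policy toward the kept edits.
        edit_mean = sum(kept) / len(kept)
        # Commit the best kept edit to the model weights.
        best = max(kept, key=lambda e: evaluate(inner_update(weight, e)))
        weight = inner_update(weight, best)
    return weight


final_weight = outer_loop()
```

Because an edit is committed only when it improves the score, the toy model's score never decreases across rounds; the real framework replaces the float with LLM weights and the step with generated fine-tuning data.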

Artificial Intelligence

Reinforcement Learning

Model Self-Updating

SEAL (Self-Adapting LLMs)

ReSTEM

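2025 AI Annual Awards open for entries! 3 dimensions, 5 award categories, in search of the leaders of the AI+ era

QbitAI · 2025-10-14 04:08

From the organizing committee at Aofeisi | QbitAI (WeChat official account). To let more practitioners feel the leap of the intelligence wave, and to applaud and encourage more peers and fellow travelers, we are officially opening entries for the "2025 AI Annual List". This is the eighth year of QbitAI's annual AI list. Over eight years we have witnessed technical breakthroughs and deployments, the convergence and reshaping of industries, and wave after wave of companies, people, and products that push the era forward. In an era where AI is redefining everything, intelligent technology is no longer a single tool but a driving force behind the co-evolution of industry and society. Through this annual selection we hope to discover and honor the explorers and practitioners who truly lead change and push boundaries. The selection spans three dimensions (companies, products, and people) and sets up five award categories: 2025 AI Annual Leading Enterprise, 2025 AI Annual Promising Startup, 2025 AI Annual Outstanding Product, 2025 AI Annual Outstanding Solution, and 2025 AI Annual Focus Person. Companies are warmly invited to apply! Let us witness the stars of the year together and light the way forward. Detailed selection criteria and application instructions follow. The 2025 AI Annual Leading Enterprise award covers China's AI field and selects the companies with the strongest overall capabilities. The 2025 AI Annual Promising Startup award focuses on China's ...

Artificial Intelligence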

Forget Claude Code: one free domestic command-line tool is enough

QbitAI · 2025-10-14 04:08

Core Viewpoint - The article discusses the launch of iFlow CLI by Alibaba's Heartflow research team as a strong domestic alternative to Claude Code, emphasizing its performance, user-friendliness, and free access for individual users [2][58].

Performance Comparison
- iFlow CLI outperforms Claude Code and Codex in four benchmark tests: GAIA (general search Q&A), SWE-bench (GitHub code repair), Terminal-Bench (diverse CLI usage scenarios), and BrowseComp-ZH (Chinese general search) [2].
- Its performance is enhanced by integrating top domestic open-source models such as Qwen3-Coder, DeepSeek-V3.1-Terminus, Kimi-K2-0905, and GLM-4.5 [4][6].

Features and Advantages
- iFlow CLI offers zero-cost access to advanced models such as Qwen3 MAX, Kimi K2, DeepSeek V3.2, and GLM4.6, with no usage limits [7].
- The tool accepts natural language commands for task execution, enabling full automation of workflows [9].
- It includes custom commands, task tools, and a built-in open market, which make it more usable for developers [10][11].

User Experience
- Installation is straightforward and requires only a single command in the terminal [14].
- Users can perform complex tasks, such as data analysis and code reviews, with simple commands, which significantly flattens the learning curve [20][29].
- The tool supports creating sub-agents tailored to specific tasks, adding to its versatility [43][45].

Market Position and Implications
- iFlow CLI represents a significant advance in the domestic AI ecosystem, particularly in light of changes in usage policies by overseas tools like Claude [56][58].
- Its free access and supportive community platform help AI applications spread among domestic developers [58].

Command Line Interface

Software and Internet

iFlow CLI

Claude Code

Codex

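Truly hand the grunt work of research to AI! Shanghai AI Lab launches the deep research agent FlowSearch

QbitAI · 2025-10-14 04:08

Contributed by the InternAgent team | QbitAI (WeChat official account). To automate complex scientific research in practice, Shanghai Artificial Intelligence Laboratory has launched FlowSearch! On research benchmarks including GAIA, HLE, GPQA, and TRQA, FlowSearch not only leads across the board in performance but also demonstrates AI's capacity for dynamic collaboration and deep reasoning in complex research tasks. In more detail: as AI excels on Q&A benchmarks and standardized tests, its ability to conduct scientific research is drawing growing attention. Scientific research differs from problem solving or information retrieval. It is an open-ended, long-horizon, complex cognitive process in which researchers must pose original questions, design experiments, collect and integrate evidence from multiple sources, and form systematic conclusions through continual iteration. Such a process goes far beyond computing power itself; it demands innovative thinking, dynamic reasoning, and a precise command of complex knowledge relationships. FlowSearch is a deep research agent driven by a dynamic structured knowledge flow. It uses this knowledge flow to build a multi-level dependency graph of research tasks and, within a multi-agent framework, achieves parallel exploration of tasks, recursive integration of knowledge, and adaptive optimization of the workflow. Unlike traditional closed "input, compute, output" AI, FlowSearch behaves more like a partner who understands your line of research: when it finds new information, it proactively adjusts the plan; when the evidence chain is incomplete, it steers further exploration; when the reasoning drifts from the ...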
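The idea of a task dependency graph with parallel exploration can be illustrated with Python's standard `graphlib` module. This is a rough sketch under stated assumptions, not FlowSearch's actual design: the task names and the `parallel_batches` helper are hypothetical, and the graph here is static, whereas the article describes a flow that is revised as new information arrives.

```python
from graphlib import TopologicalSorter

# Hypothetical research sub-tasks; each key maps to the tasks it depends on.
dependencies = {
    "pose_question": set(),
    "search_literature": {"pose_question"},
    "design_experiment": {"pose_question"},
    "collect_evidence": {"search_literature", "design_experiment"},
    "write_conclusion": {"collect_evidence"},
}


def parallel_batches(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into batches whose members have no unmet dependencies."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # Every task returned here has all of its dependencies finished,
        # so the members of one batch could be explored in parallel.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = parallel_batches(dependencies)
```

In this toy graph the literature search and the experiment design land in the same batch, since both depend only on the initial question. A dynamic variant would rebuild or extend the graph whenever an agent reports new evidence.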

Hunyuan3D open-sources an end-to-end panoramic depth estimator: code and curated panoramic data are live, with an online demo

QbitAI · 2025-10-14 04:08

Core Insights - The article discusses DA, a novel end-to-end panoramic depth estimator from Tencent's Hunyuan3D team, which tackles the scarcity of panoramic data and the need for zero-shot generalization [2][8].

Group 1: Background and Challenges
- Panoramic images provide a 360°×180° immersive view, essential for advanced applications like AR/VR and 3D scene reconstruction [5][6].
- Traditional depth estimation methods for panoramic images are limited by the scarcity of panoramic depth data and by the spherical distortion inherent in panoramic images [10][12].
- The team aims to expand panoramic data and build a robust data foundation for DA [8].

Group 2: Data Augmentation Engine
- The team developed a data curation engine that converts high-quality perspective depth data into panoramic data, significantly increasing the quantity and diversity of panoramic samples [11][14].
- Approximately 543K panoramic samples were created, expanding the total from about 63K to approximately 607K samples and easing the data scarcity problem [14].

Group 3: Model Architecture and Training
- The SphereViT architecture was introduced to mitigate the effects of spherical distortion, allowing the model to focus on the spherical geometry of panoramic images [16][17].
- Training combines a distance loss for global accuracy with a normal loss for local surface smoothness [18].

Group 4: Experimental Results
- DA achieved state-of-the-art (SOTA) performance, with an average 38% improvement in AbsRel over the strongest zero-shot methods [23][24].
- DA was trained on approximately 21 times more panoramic data than UniK3D, and qualitative comparisons showed more accurate geometric predictions [27].

Group 5: Application Scenarios
- DA's zero-shot generalization enables a wide range of 3D reconstruction applications, such as panoramic multi-view reconstruction [28].
- The model can reconstruct globally aligned 3D point clouds from panoramic images of different rooms in a house or apartment, keeping multiple panoramic views spatially consistent [29].
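For readers unfamiliar with the AbsRel figure quoted in the experimental results, the metric is the mean absolute relative error between predicted and ground-truth depth, where lower is better. The sketch below is illustrative only: the depth values and the two AbsRel scores passed to `relative_improvement` are made-up numbers, and reading "38% improvement" as a relative reduction in AbsRel is an assumption, not something the summary states.

```python
def abs_rel(pred: list[float], gt: list[float]) -> float:
    """Mean absolute relative error: mean(|pred - gt| / gt) over valid pixels."""
    # Ground-truth depths that are zero or negative are treated as invalid.
    valid = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not valid:
        raise ValueError("no valid ground-truth depths")
    return sum(abs(p - g) / g for p, g in valid) / len(valid)


def relative_improvement(baseline: float, ours: float) -> float:
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline - ours) / baseline


# Made-up depths in meters, for illustration only.
example_error = abs_rel(pred=[1.1, 2.0, 3.3], gt=[1.0, 2.0, 3.0])

# Hypothetical AbsRel scores: dropping from 0.100 to 0.062 is a 38% reduction.
example_gain = relative_improvement(baseline=0.100, ours=0.062)
```

Under this reading, a 38% improvement means the error shrinks to 62% of the baseline's value, not that accuracy rises by 38 percentage points.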

From 4,399 yuan: vivo's 200-megapixel imaging flagship duo debuts! No need to bring a camera to trips or concerts

QbitAI · 2025-10-14 02:19

Core Viewpoint - The article highlights the launch of vivo's new flagship smartphone series, the X300, emphasizing its advanced imaging capabilities and AI-driven features and positioning it as a competitive option in the high-end smartphone market.

Group 1: Imaging Capabilities
- The X300 Pro features a 200-megapixel Zeiss APO super telephoto lens, while the standard version also gets a significant upgrade in the form of a 200-megapixel Zeiss main camera [2][12].
- The camera system includes a custom Samsung HPB 1/1.4-inch large sensor and supports CIPA 4.5 professional-grade stabilization, keeping images sharp even at high pixel counts [12].
- The X300 series supports full-focus motion portrait capture for clear shots of moving subjects, and its night portrait algorithms handle challenging lighting conditions [19][24].

Group 2: Pricing and Availability
- The standard version starts at 4399 yuan, which is cheaper than the previous-generation Pro mini, while the Pro version is priced at 5299 yuan, the same as its predecessor [8][10].

Group 3: AI and Software Features
- The X300 series introduces the OriginOS 6 AI operating system, which comprehensively upgrades multi-modal interaction [5][38].
- New AI features include automatic summarization of documents and emails, intelligent file naming, and proactive assistance with customer service calls [42][47].

Group 4: Competitive Landscape
- The launch coincides with Apple's announcement of the iPhone Air, and both go on sale on the same day, a sign of how competitive the market is [53][56].

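Karpathy hand-builds ChatGPT in 8,000 lines of code for just $100; after 12 hours of training it beats GPT-2 on CORE, and a step-by-step tutorial is here

QbitAI · 2025-10-14 02:19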

Core Insights - The article discusses the launch of "nanochat," a simplified version of ChatGPT created by Andrej Karpathy, which can be built with minimal cost and code [1][2][4].

Project Overview
- "nanochat" is a full-stack training and inference pipeline that lets users create a basic ChatGPT-like model with approximately 8000 lines of code [2][4].
- The entire project can be run on a cloud GPU server for about $100, taking as little as 4 hours to set up and run [3][4][16].

Technical Specifications
- The pipeline includes a tokenizer implemented in Rust, a pre-trained Transformer architecture, and various training datasets [5].
- It supports efficient inference with features like KV caching and a lightweight Python interpreter for tool usage [5][43].

Performance Metrics
- After about 12 hours of training, the model's performance on the CORE metric surpasses that of GPT-2 [8].
- In one example, a model trained for 24 hours scores over 40 on the MMLU dataset and over 70 on the ARC-Easy dataset [10].

Development Goals
- Karpathy aims to create a unified, simple, and modifiable codebase that can serve as a strong baseline for future development [11][13].
- The project is intended to be the capstone of the upcoming LLM101n course, which focuses on building large language models [12].

Community Engagement
- The project reached 4.8k GitHub stars shortly after its release, indicating strong community interest [14].
- Users are encouraged to optimize and modify the codebase, so improvements can be made collaboratively [59].

Training Process
- Training proceeds in several stages: pre-training, mid-training, supervised fine-tuning (SFT), and reinforcement learning (RL) [45][48][51].
- Excluding RL, the whole training process takes approximately 3 hours and 51 minutes and costs about $92.4 [57].

Final Remarks
- The article emphasizes the potential of "nanochat" as a research tool and a benchmarking framework, similar to earlier projects like nanoGPT [13].
- The project is still in its early stages, with many opportunities for further optimization and enhancement [13][50].
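The reported training time and cost imply an hourly rate for the GPU server, which can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch: the function names are hypothetical, and the 12-hour projection assumes the same hourly rate applies to a longer run, which the summary does not state.

```python
def implied_hourly_rate(total_cost_usd: float, hours: int, minutes: int) -> float:
    """Hourly price implied by a total cost and a wall-clock duration."""
    duration_hours = hours + minutes / 60
    return total_cost_usd / duration_hours


def projected_cost(hourly_rate_usd: float, hours: float) -> float:
    """Cost of a run of the given length, assuming a constant hourly rate."""
    return hourly_rate_usd * hours


# Figures from the summary: about $92.4 for 3 hours 51 minutes (RL excluded).
rate = implied_hourly_rate(92.4, hours=3, minutes=51)

# Assumption: the 12-hour run mentioned above is billed at the same rate.
twelve_hour_cost = projected_cost(rate, 12)
```

The two figures work out to roughly $24 per hour, so a 12-hour run at that rate would cost roughly $288, which suggests the "about $100" headline price refers to the shorter run.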

Large Language Models

Reinforcement Learning

Supervised Fine-Tuning

Artificial Intelligence

nanochat

ChatGPT

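A forgotten solution to a hard problem has been rediscovered by GPT-5

QbitAI · 2025-10-13 10:00

Xifeng, from Aofeisi | QbitAI (WeChat official account). A solution to a hard problem that humans had forgotten has been found again by GPT-5 Pro! The story centers on Erdős Problem #339, one of nearly a thousand problems posed or relayed by the famous mathematician Paul Erdős and collected on the erdosproblems.com website. The site records the current status of every problem: about one third are solved, and most remain open. Especially notable is that GPT-5 Pro located the key paper directly from nothing more than an image of Erdős Problem #339. The problem had previously been marked "unsolved", an open mathematical problem that many people were still studying and discussing. Only recently, when someone searched with GPT-5 Pro, did it turn out that the problem had in fact been solved back in 2003. After OpenAI researcher Sebastien Bubeck shared the finding, it immediately drew wide attention online. By the way, one of Terence Tao's famous results was using tools from ergodic theory to crack the Erdős discrepancy problem, a conjecture that had puzzled mathematicians for decades. Problem details: Erdős Problem #339 is a classic problem on additive bases in number theory, stated as follows: let A ⊆ N be a basis of order r (that is, every sufficiently large integer can be expressed as r ele ...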

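Front-end developers in danger! Gemini 3 internal test results win unanimous praise online: "the strongest front-end development model ever"

QbitAI · 2025-10-13 10:00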

Core Viewpoint - Google's next-generation flagship model, Gemini 3, has gained significant attention even before its official release due to its impressive capabilities and performance in various tasks [1][8].

Group 1: Performance and Features
- Gemini 3 excels at front-end development and SVG vector graphics generation, showcasing enhanced multimodal capabilities [3][19].
- The model can generate a personal introduction webpage and visualize complex concepts like black holes with minimal input [4][8].
- It has also composed original piano music that received high praise from users [8].

Group 2: Technical Specifications
- Gemini 3.0 Pro reportedly uses a MoE architecture with trillions of parameters, activating only 15-20 billion parameters per query, and expands the context window from 1 million to several million tokens [13].
- On the challenging ARC-AGI-2 general intelligence test, Gemini 3.0 achieved an accuracy of nearly 35%, outperforming other models [15].
- It scored 32.4% on the Humanity's Last Exam (HLE) benchmark, surpassing GPT-5 and Grok 4 [16].

Group 3: User Experience and Applications
- Users report that Gemini 3.0 is particularly adept at programming and interface design, producing visually appealing results for projects like an ancient art museum website [20][21].
- The model generated a demonstration website based on Level 3 of the Kardashev Scale [23][24].
- It has shown proficiency in rendering complex images, including high-quality game backgrounds and intricate SVG graphics [31][35].

Group 4: Anticipated Release
- The release date of Gemini 3.0 is still a matter of speculation; rumors suggest it may launch on October 22, after earlier predictions proved wrong [42][45].

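2025 AI Annual Awards open for entries! 3 dimensions, 5 award categories, in search of the leaders of the AI+ era

QbitAI · 2025-10-13 08:47

From the organizing committee at Aofeisi | QbitAI (WeChat official account). To let more practitioners feel the leap of the intelligence wave, and to applaud and encourage more peers and fellow travelers, we are officially opening entries for the "2025 AI Annual List". This is the eighth year of QbitAI's annual AI list. Over eight years we have witnessed technical breakthroughs and deployments, the convergence and reshaping of industries, and wave after wave of companies, people, and products that push the era forward. In an era where AI is redefining everything, intelligent technology is no longer a single tool but a driving force behind the co-evolution of industry and society. Through this annual selection we hope to discover and honor the explorers and practitioners who truly lead change and push boundaries. The selection spans three dimensions (companies, products, and people) and sets up five award categories. Companies are warmly invited to apply! Let us witness the stars of the year together and light the way forward. Company list, product list, people list: 2025 AI Annual Focus Person, 2025 AI Annual Leading Enterprise, 2025 AI Annual Promising Startup. Detailed selection criteria and application instructions follow. 2025 AI Annual Leading Enterprise: 1. Business capability | market share and revenue scale, business model and profitability, number of customers and industry coverage, growth potential and sustainability, etc.; 2. Technical capability | research strength and technical achievements, share of R&D investment, core technical competitiveness, innovation cases and technology deployment, etc.; ...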