Abstract Reasoning - filings, earnings calls, financial reports, news

Abstract Reasoning

机器之心 (Synced) · 2026-02-12 10:08

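When we solve a complex math problem or study an abstract pattern, the brain usually has to think repeatedly and work through the problem step by step. Today's mainstream deep learning models, however, take a "single pass" route: data goes in, passes through a fixed number of network layers, and an answer comes straight out. This feed-forward architecture performs well on perception tasks such as image classification, but it struggles with abstract problems that require multi-step reasoning. The most typical example is the ARC-AGI benchmark, widely regarded as a touchstone for measuring AI's abstract reasoning ability. Recently, a research team from the Hong Kong University of Science and Technology, the Institute of Automation of the Chinese Academy of Sciences, and UC Santa Cruz proposed Loop-ViT, introducing looped Transformers to visual reasoning for the first time. The model has only 18M parameters, yet it reaches 65.8% accuracy on the ARC-AGI-1 benchmark, surpassing the VARC ensemble model, which has 73M parameters. More surprisingly, its small 3.8M-parameter version also reaches 60.1% accuracy, nearly matching the average human level (60.2%). What is ARC-AGI, and why is it so hard? ARC-AGI (Abstraction and Reasoning Corpus) is an abstract reasoning benchmark proposed by François Chollet, the creator of Keras. Unlike Image ...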

AI cracks an 18th-century "indecipherable" ledger in seconds: Google's new model goes viral in blind tests

36Kr · 2025-11-12 10:44

Core Insights
- Google has potentially solved two longstanding challenges in AI with a mysterious model that successfully recognized and corrected a 200-year-old merchant's handwritten ledger, showcasing advanced reasoning capabilities that astonished historians [1][3][15].

Group 1: Model Performance
- The mysterious model achieved near-perfect performance in handwritten text recognition (HTR) and corrected a formatting error in the original ledger, indicating its ability to understand the logic and context behind the text [3][15].
- The model's performance in HTR reached human expert-level accuracy, with a strict Character Error Rate (CER) of 1.7% and a Word Error Rate (WER) of 6.5% on a challenging test set [13][15].
- Compared to previous models, the new Gemini model demonstrated significant improvements, with Gemini-2.5-Pro showing a 50-70% enhancement over earlier versions [11][15].

Group 2: Historical Context and Challenges
- Recognizing historical handwriting requires not only visual recognition but also an understanding of the historical context, making it a complex task for AI models [5][8].
- The model's ability to accurately transcribe difficult historical documents, including those with ambiguous numbers and inconsistent styles, marks a significant advancement in AI capabilities [19][23].
- The model successfully interpreted complex historical currency and measurement systems, showcasing its potential for abstract reasoning and contextual understanding [23][24].

Group 3: Expert Validation
- Historian Mark Humphries utilized the model to test its capabilities, emphasizing that the final accuracy in historical text recognition is crucial for practical use [8][9].
- The model's performance in transcribing a ledger with non-standard formats and mixed languages was particularly impressive, as it corrected errors and inferred missing context [20][23].
- Humphries noted that the model's ability to perform multi-step reasoning and contextual inference suggests a shift towards genuine understanding in AI, beyond mere pattern recognition [24].
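
The CER and WER figures quoted under Model Performance are edit-distance metrics: the number of character-level or word-level insertions, deletions, and substitutions needed to turn the model's transcription into the reference, divided by the reference length. The following is a minimal sketch of how such scores are commonly computed, not the evaluation code behind the reported numbers. The function names (`edit_distance`, `cer`, `wer`) and the sample ledger line are hypothetical, and the sketch assumes no normalization of case, punctuation, or whitespace.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words).

    Each insertion, deletion, or substitution costs 1.
    """
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # reference item missing from the hypothesis
                curr[j - 1] + 1,     # extra item in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character edits / reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word edits / reference words (whitespace-split)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Hypothetical ledger line: the transcription misreads a single digit.
reference = "To 1 loaf sugar 14 lb at 1/4"
hypothesis = "To 1 loaf sugar 14 lb at 1/6"

print(f"CER = {cer(reference, hypothesis):.3f}")
print(f"WER = {wer(reference, hypothesis):.3f}")
```

In this toy case a single misread digit changes one character out of 28 but one word out of 8, which illustrates why a WER is typically several times higher than the CER measured on the same text.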