Lossless compression even at a million tokens? The C3 model redefines the long-context challenge with "cascade compression"
AI科技大本营 · 2025-11-28 06:32
Core Insights
- The article discusses the challenges of handling million-token inputs in large language models (LLMs) and introduces "Context Cascade Compression" (C3), a technique inspired by DeepSeekOCR's roughly 10x token compression [1][2].

Group 1: Compression Technology
- DeepSeekOCR's success fostered the misconception that visual encoding is the key to compression. The research team argues instead that high compression rates come from latent tokens, which carry information more efficiently than discrete text tokens [1][2].
- C3 takes a new approach: it compresses text directly, with no visual intermediary, using a dual-LLM architecture in which one LLM encodes text into latent tokens and another decodes them back [6][9].

Group 2: Performance Metrics
- At a 20x compression ratio, C3 reaches 98% decoding accuracy, versus 60% for DeepSeekOCR [4][14].
- Even at a 40x compression ratio, C3 maintains over 93% reconstruction accuracy, demonstrating its effectiveness at context compression [4][14].

Group 3: Unique Features
- C3 exhibits a distinctive "forgetting pattern": information loss concentrates at the end of the text, resembling the gradual forgetting of human memory, in contrast to the global blurring seen in optical compression methods [12][13].
- Because this loss pattern is predictable, critical information can be prioritized at the beginning of the text [13].

Group 4: Applications
- C3 can serve as a front-end compressor for existing LLMs, letting them process very large inputs such as entire books or large codebases while reducing computational cost [16].
- The architecture can carry over to next-generation models, converting variable-length text into fixed-length latent representations [18].
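The dual-LLM encode/decode idea described above can be illustrated with a toy sketch. Everything here is a stand-in, not the paper's code: the "encoder LLM" is replaced by chunked mean-pooling of fake embeddings, and the names (`embed`, `encode`, `EMBED_DIM`) are assumptions for illustration only.

```python
# Toy sketch of the C3 idea: an encoder LLM maps a long token sequence
# to a much shorter sequence of continuous latent vectors, which a
# decoder LLM would later reconstruct into text. The models here are
# stand-ins; "encoding" is chunked mean-pooling of fake embeddings.
from typing import List

EMBED_DIM = 4           # toy embedding width (assumption)
COMPRESSION_RATIO = 20  # 20 text tokens -> 1 latent token, as in the article

def embed(token: str) -> List[float]:
    """Deterministic fake embedding so the sketch is self-contained."""
    h = sum(ord(c) for c in token)
    return [((h >> i) & 0xFF) / 255.0 for i in range(0, EMBED_DIM * 8, 8)]

def encode(tokens: List[str], ratio: int = COMPRESSION_RATIO) -> List[List[float]]:
    """Encoder-LLM stand-in: emit one mean-pooled latent per `ratio` tokens."""
    latents = []
    for start in range(0, len(tokens), ratio):
        chunk = [embed(t) for t in tokens[start:start + ratio]]
        latents.append([sum(col) / len(chunk) for col in zip(*chunk)])
    return latents

tokens = [f"tok{i}" for i in range(1000)]
latents = encode(tokens)
print(len(tokens), "text tokens ->", len(latents), "latent tokens")
# 1000 text tokens -> 50 latent tokens
```

The point of the sketch is the shape of the pipeline: variable-length token input in, a 20x-shorter sequence of continuous vectors out, which is what makes latent tokens denser carriers of information than discrete text tokens.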
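The reported operating points (20x at 98% accuracy, 40x at over 93%) translate directly into context-budget arithmetic. A quick back-of-the-envelope calculation, using only the ratios stated in the article:

```python
# Accuracy figures at the two compression ratios reported in the article.
REPORTED = {20: 0.98, 40: 0.93}  # ratio -> approx. reconstruction accuracy

def latent_budget(n_text_tokens: int, ratio: int) -> int:
    """Latent tokens needed to carry `n_text_tokens` at a given ratio."""
    return -(-n_text_tokens // ratio)  # ceiling division

for ratio, acc in REPORTED.items():
    print(f"1M text tokens at {ratio}x -> {latent_budget(1_000_000, ratio):,} "
          f"latent tokens, ~{acc:.0%} reconstruction accuracy")
```

So a million-token input collapses to 50,000 latents at 20x, or 25,000 at 40x, which is what makes attention over book-length inputs tractable for a downstream model.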
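Because C3's information loss concentrates at the end of the sequence, a caller can exploit the forgetting pattern by placing important material first. A hypothetical helper (the function name and scoring scheme are assumptions, not part of C3):

```python
from typing import List

def order_for_compression(segments: List[str], importance: List[float]) -> List[str]:
    """Place high-importance segments first, where C3 loses the least.

    `importance` scores are supplied by the caller, e.g. from a retrieval
    ranker; this helper only reorders, it does not compress.
    """
    ranked = sorted(zip(importance, segments), key=lambda pair: -pair[0])
    return [segment for _, segment in ranked]

docs = ["boilerplate footer", "key API contract", "changelog"]
scores = [0.1, 0.9, 0.4]
print(order_for_compression(docs, scores))
# ['key API contract', 'changelog', 'boilerplate footer']
```

This is the practical upside of a predictable, end-weighted forgetting pattern over the global blurring of optical methods: degradation can be steered away from the content that matters.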