Lossless compression even at a million tokens? The C3 model redefines the long-context challenge with "cascade compression"
AI科技大本营 · 2025-11-28 06:32
Core Insights
- The article discusses the challenges of handling million-token inputs in large language models (LLMs) and introduces "Context Cascade Compression" (C3), a technique inspired by DeepSeekOCR's roughly 10x token compression [1][2].

Group 1: Compression Technology
- DeepSeekOCR's success fostered the misconception that visual encoding is the key to compression. The research team argues instead that high compression rates come from latent tokens, which carry information more efficiently than discrete text tokens [1][2].
- C3 takes a new approach: it compresses text directly, with no visual intermediary, using a dual-LLM architecture in which one LLM encodes text into latent tokens and another decodes them back [6][9].

Group 2: Performance Metrics
- At a 20x compression ratio, C3 reaches 98% decoding accuracy, versus 60% for DeepSeekOCR [4][14].
- Even at a 40x compression ratio, C3 maintains over 93% reconstruction accuracy, demonstrating its effectiveness at context compression [4][14].

Group 3: Unique Features
- C3 exhibits a distinctive "forgetting pattern": information loss concentrates at the end of the text, resembling the gradual forgetting of human memory, in contrast to the global blurring seen in optical compression methods [12][13].
- Because this loss pattern is predictable, critical information can be prioritized at the beginning of the text [13].

Group 4: Applications
- C3 can serve as a front-end compressor for existing LLMs, letting them process very large inputs such as entire books or large codebases while reducing computational cost [16].
- The architecture can carry over to next-generation models, converting variable-length text into fixed-length latent representations [18].
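The dual-LLM encode/decode idea described above can be illustrated with a toy sketch. Everything here is a stand-in, not the paper's code: the "encoder LLM" is replaced by chunked mean-pooling of fake embeddings, and the names (`embed`, `encode`, `EMBED_DIM`) are assumptions for illustration only.

```python
# Toy sketch of the C3 idea: an encoder LLM maps a long token sequence
# to a much shorter sequence of continuous latent vectors, which a
# decoder LLM would later reconstruct into text. The models here are
# stand-ins; "encoding" is chunked mean-pooling of fake embeddings.
from typing import List

EMBED_DIM = 4           # toy embedding width (assumption)
COMPRESSION_RATIO = 20  # 20 text tokens -> 1 latent token, as in the article

def embed(token: str) -> List[float]:
    """Deterministic fake embedding so the sketch is self-contained."""
    h = sum(ord(c) for c in token)
    return [((h >> i) & 0xFF) / 255.0 for i in range(0, EMBED_DIM * 8, 8)]

def encode(tokens: List[str], ratio: int = COMPRESSION_RATIO) -> List[List[float]]:
    """Encoder-LLM stand-in: emit one mean-pooled latent per `ratio` tokens."""
    latents = []
    for start in range(0, len(tokens), ratio):
        chunk = [embed(t) for t in tokens[start:start + ratio]]
        latents.append([sum(col) / len(chunk) for col in zip(*chunk)])
    return latents

tokens = [f"tok{i}" for i in range(1000)]
latents = encode(tokens)
print(len(tokens), "text tokens ->", len(latents), "latent tokens")
# 1000 text tokens -> 50 latent tokens
```

The point of the sketch is the shape of the pipeline: variable-length token input in, a 20x-shorter sequence of continuous vectors out, which is what makes latent tokens denser carriers of information than discrete text tokens.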
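The reported operating points (20x at 98% accuracy, 40x at over 93%) translate directly into context-budget arithmetic. A quick back-of-the-envelope calculation, using only the ratios stated in the article:

```python
# Accuracy figures at the two compression ratios reported in the article.
REPORTED = {20: 0.98, 40: 0.93}  # ratio -> approx. reconstruction accuracy

def latent_budget(n_text_tokens: int, ratio: int) -> int:
    """Latent tokens needed to carry `n_text_tokens` at a given ratio."""
    return -(-n_text_tokens // ratio)  # ceiling division

for ratio, acc in REPORTED.items():
    print(f"1M text tokens at {ratio}x -> {latent_budget(1_000_000, ratio):,} "
          f"latent tokens, ~{acc:.0%} reconstruction accuracy")
```

So a million-token input collapses to 50,000 latents at 20x, or 25,000 at 40x, which is what makes attention over book-length inputs tractable for a downstream model.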
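Because C3's information loss concentrates at the end of the sequence, a caller can exploit the forgetting pattern by placing important material first. A hypothetical helper (the function name and scoring scheme are assumptions, not part of C3):

```python
from typing import List

def order_for_compression(segments: List[str], importance: List[float]) -> List[str]:
    """Place high-importance segments first, where C3 loses the least.

    `importance` scores are supplied by the caller, e.g. from a retrieval
    ranker; this helper only reorders, it does not compress.
    """
    ranked = sorted(zip(importance, segments), key=lambda pair: -pair[0])
    return [segment for _, segment in ranked]

docs = ["boilerplate footer", "key API contract", "changelog"]
scores = [0.1, 0.9, 0.4]
print(order_for_compression(docs, scores))
# ['key API contract', 'changelog', 'boilerplate footer']
```

This is the practical upside of a predictable, end-weighted forgetting pattern over the global blurring of optical methods: degradation can be steered away from the content that matters.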