"Scan-to-Text" Gets 200 Times Cheaper: DeepSeek Upends Adobe and Its Peers
Guan Cha Zhe Wang · 2026-01-28 09:46
Core Viewpoint
- The release of DeepSeek-OCR2 disrupts the OCR (optical character recognition) market, which is valued in the hundreds of billions, with a more efficient and far cheaper solution that challenges traditional OCR providers [5][11][18].

Group 1: Product Innovation
- DeepSeek-OCR2 introduces a new encoder structure, DeepEncoder-V2, which dynamically adjusts the order in which visual information is processed based on semantic understanding, improving the model's text recognition accuracy [6][9].
- The model incorporates the concept of "visual causal flow," which lets it process an image according to its content rather than in a fixed, mechanical order, improving performance on complex layouts and distorted documents [6][9].
- On the OmniDocBench v1.5 benchmark, DeepSeek-OCR2 achieved an overall score of 91.09%, 3.73 percentage points higher than its predecessor, with a notable reduction in reading-order error [7].

Group 2: Cost Efficiency
- DeepSeek's pricing is dramatically lower than that of traditional OCR services: the cost of processing 1,000 pages of complex financial documents drops from approximately $65 to $0.28, a difference of more than 200 times [12][11].
- Token-based billing allows for even lower costs, potentially as low as $0.028 per document if cached [12].

Group 3: Market Impact
- DeepSeek-OCR2 threatens established OCR companies such as Intsig (合合信息), Hanvon (汉王科技), and ABBYY by undermining their claims of specialized expertise and high-value services [13][14].
- Traditional OCR providers, which have relied on proprietary algorithms and extensive template libraries, face a significant challenge as DeepSeek demonstrates that general models can outperform specialized ones without extensive training [14][13].
- The shift toward open-source solutions, exemplified by DeepSeek-OCR2, is expected to democratize access to OCR technology, enabling small businesses and a range of sectors to adopt automated document processing [15][16].

Group 4: Future Implications
- The release of DeepSeek-OCR2 marks the transition of OCR from a high-cost service to basic infrastructure, akin to utilities such as water and electricity, making it accessible to a much broader audience [16][18].
- As the cost of machine reading falls, new opportunities arise in fields including small-business credit services, automated grading, and intelligent document review [15][17].
- The development of a unified multimodal encoder through open-source collaboration is expected to accelerate technological progress and reduce costs across the industry [16].
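
The cost comparison in Group 2 can be checked with a few lines of arithmetic. The sketch below uses only the two per-1,000-page prices quoted there. The constant and function names are illustrative assumptions, not part of any DeepSeek or vendor API, and the cached price is left out because the summary does not say clearly which unit it applies to.

```python
# Prices quoted in Group 2, in USD per 1,000 pages of complex financial
# documents. All names here are hypothetical and for illustration only.
TRADITIONAL_USD_PER_1000_PAGES = 65.00
DEEPSEEK_USD_PER_1000_PAGES = 0.28


def cost_ratio(baseline: float, alternative: float) -> float:
    """Return how many times more expensive the baseline is."""
    if alternative <= 0:
        raise ValueError("alternative cost must be positive")
    return baseline / alternative


def per_page(cost_per_1000_pages: float) -> float:
    """Convert a per-1,000-page price into a per-page price."""
    return cost_per_1000_pages / 1000


ratio = cost_ratio(TRADITIONAL_USD_PER_1000_PAGES, DEEPSEEK_USD_PER_1000_PAGES)

if __name__ == "__main__":
    print(f"Cost ratio: about {ratio:.0f}x")
    print(f"Traditional: ${per_page(TRADITIONAL_USD_PER_1000_PAGES):.5f} per page")
    print(f"DeepSeek-OCR2: ${per_page(DEEPSEEK_USD_PER_1000_PAGES):.5f} per page")
```

With the quoted prices the ratio comes to roughly 232, which is consistent with the "over 200 times" figure in the headline and in Group 2.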