DeepSeek Quietly Launches a New Model

Core Insights
- DeepSeek has released a new multimodal model called DeepSeek-OCR, which has sparked significant industry discussion about its potential applications in optical and quantum computing [1]
- The model's visual encoder enables efficient decoding, providing a clear technical pathway for integrating optical computing into large language models (LLMs) [1]

Group 1: Contextual Optical Compression
- DeepSeek has introduced "Contextual Optical Compression" technology, which processes text as images to compress information efficiently, in theory allowing for unlimited context [3]
- This technology can compress tokens by 7 to 20 times; for instance, a page of text that typically requires 2000-5000 tokens can be reduced to just 200-400 visual tokens [3][4]
- The model maintains 97% decoding accuracy at 10x compression, and 60% accuracy is still achievable at 20x compression. This graceful degradation is crucial for implementing a forgetting mechanism in LLM memory [4]

Group 2: Optical Computing Integration
- By transforming text problems into image problems, DeepSeek's OCR technology may pave the way for integrating optical computing chips into large language models [5]
- Optical computing chips are seen as a potential technology for the "post-Moore era," leveraging light-speed transmission, high parallelism, and low power consumption for AI and other computation-intensive tasks [5]
- The DeepEncoder component of DeepSeek-OCR is particularly suited to execution on optical co-processors, while text decoding will still be handled by electronic chips [5]

Group 3: Challenges and Industry Landscape
- Current challenges for optical computing include advanced optoelectronic integration and the immaturity of the software ecosystem, both of which hinder large-scale development and optimization [6]
- Key players in the domestic market include Xizhi Technology and Turing Quantum, while international competitors include Lightmatter and Cerebras Systems [6][7]
- Turing Quantum has made significant progress in the mass production of thin-film lithium niobate (TFLN) products, but it may take 3 to 5 years to compete with GPUs in data centers due to engineering, cost, and ecosystem challenges [7]
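
The compression figures quoted in Group 1 can be checked with simple arithmetic. The sketch below is illustrative only: the function names and the 128,000-token window are assumptions made for this example, not part of DeepSeek-OCR or any published API. It shows that pairing the extremes of the quoted page sizes (2000-5000 text tokens, 200-400 visual tokens) gives ratios from 5x to 25x, a range that brackets the reported 7x to 20x.

```python
def compression_ratio(text_tokens: int, visual_tokens: int) -> float:
    """Number of text tokens represented by each visual token."""
    if visual_tokens <= 0:
        raise ValueError("visual_tokens must be positive")
    return text_tokens / visual_tokens


def effective_context(window_tokens: int, ratio: float) -> int:
    """Text tokens that fit in a window if every slot holds a visual token.

    Idealized: ignores decoding accuracy, which the article reports
    dropping as the compression ratio rises.
    """
    return int(window_tokens * ratio)


# Extremes of the page sizes quoted in the article.
lowest = compression_ratio(2000, 400)    # smallest page, most visual tokens
highest = compression_ratio(5000, 200)   # largest page, fewest visual tokens
print(f"ratio range: {lowest:.1f}x to {highest:.1f}x")

# Hypothetical 128,000-token window (an assumption, not from the article).
print(effective_context(128_000, 10.0))
```

At a 10x ratio, the hypothetical window would hold the equivalent of 1,280,000 text tokens, which gives a sense of scale for the "unlimited context" claim while leaving aside the accuracy loss at higher ratios.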