The AI race is heating up again! DeepSeek open-sources a new model and OpenAI launches an AI browser! The Sci-Tech Innovation AI ETF pulls back with the market: is it time to buy the dip?
Xin Lang Ji Jin· 2025-10-22 03:32
Group 1
- DeepSeek, a domestic AI company, has open-sourced its latest model, DeepSeek-OCR, which uses a visual-text compression paradigm to reduce computational costs by representing content with fewer visual tokens [1]
- DeepSeek-OCR can compress a 1000-word article into just 100 visual tokens, achieving a recognition accuracy of 96.5% at tenfold compression [1]
- OpenAI launched the AI browser Atlas to compete directly with Google Chrome, allowing users to invoke ChatGPT on any webpage for summarization, questioning, or task execution [1]

Group 2
- The Ministry of Industry and Information Technology is soliciting opinions on the "Computing Power Standard System Construction Guide (2025 Edition)," which aims to revise more than 50 standards by 2027 to advance the construction of a computing power standard system [2]
- The AI industry is experiencing a three-way resonance of policy, technology, and demand, with potential funding support from the "AI+" initiative increasing the earnings certainty of domestic chip and cloud computing leaders [2]
- Analysts expect technology-led market trends to continue in the fourth quarter, with the AI sector remaining a key driver of investment [2]

Group 3
- In the stock market, companies such as Stone Technology and Optoelectronics led gains of over 2%, while others such as Zhongke Star Map and Haotian Ruisheng fell more than 2% [3]

Group 4
- The case for the Sci-Tech Innovation AI ETF (589520) rests on three points:
  1. Policy support is fueling AI growth, with the core trend of edge-cloud integration benefiting leading companies in the sector [4]
  2. The emphasis on information and industrial security underscores the need for self-controllable AI technologies; the ETF focuses on domestic AI industry chains [5]
  3. The ETF offers high elasticity and strong offensive potential, with a 20% daily price fluctuation limit allowing efficient participation during market surges [5]

Group 5
- The top ten weighted stocks in the Sci-Tech Innovation AI ETF as of September 30, 2025, are concentrated in semiconductor companies, with the largest holding, Cambricon, at 16.623% [6]
The new model DeepSeek open-sourced yesterday is a little uncanny
36Kr· 2025-10-22 01:00
Core Insights
- DeepSeek has introduced a new model called DeepSeek-OCR, which can compress text information into images, achieving a significant reduction in token usage while maintaining high accuracy [5][31][39]

Group 1: Model Capabilities
- DeepSeek-OCR can store large amounts of text as images, allowing a more efficient representation of information than traditional text-based models [9][10]
- The model reportedly uses only 100 visual tokens to outperform previous models that required 256 tokens, and achieves results with fewer than 800 visual tokens where other models use more than 6000 tokens (a back-of-envelope comparison of these ratios follows below) [14][31]
- DeepSeek-OCR supports various resolutions and compression modes, adapting to different document complexities, with modes ranging from Tiny to Gundam that allow dynamic adjustment based on content [17][18]

Group 2: Data Utilization
- The model can capture previously unutilized data from documents, such as graphs and images, which traditional models could not interpret effectively [24][26]
- DeepSeek-OCR can generate over 200,000 pages of training data per day on a single A100 GPU, indicating its potential to enhance training datasets for future models [29]
- By using image-based memory, the model significantly reduces computational load, allowing longer conversations to be processed without a proportional increase in resource consumption [31]

Group 3: Open Source Collaboration
- The development of DeepSeek-OCR is a collaborative effort that integrates various open-source resources, including Huawei's Wukong dataset and Meta's SAM for image feature extraction [38][39]
- The model's architecture reflects a collective achievement of the open-source community, showcasing the potential of collaborative innovation in AI development [39]
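To make the token savings quoted above concrete, here is a minimal back-of-envelope sketch. The baseline and visual-token counts are the figures reported in the summary; the comparison itself is illustrative arithmetic, not a benchmark of the model.

```python
# Rough token-saving ratios implied by the figures quoted above.
# All inputs are the article's reported numbers; nothing here is measured.

comparisons = {
    "vs. 256-token baseline": (256, 100),    # DeepSeek-OCR reportedly matches it with 100 visual tokens
    "vs. 6000+-token models": (6000, 800),   # and beats them with fewer than 800 visual tokens
}

for name, (baseline_tokens, visual_tokens) in comparisons.items():
    ratio = baseline_tokens / visual_tokens
    print(f"{name}: ~{ratio:.1f}x fewer tokens per document")
```

Running this prints roughly 2.6x and 7.5x, which is where the article's claim of an order-of-magnitude reduction in token usage comes from when combined with the 10x figures cited elsewhere in this digest.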
10x compression, 97% decoding accuracy! Why DeepSeek's newly open-sourced model is drawing attention at home and abroad
Xin Lang Cai Jing· 2025-10-21 23:26
Core Insights
- DeepSeek has open-sourced a new model called DeepSeek-OCR, which uses visual patterns for context compression, aiming to reduce the computational costs associated with large models [1][3][6]

Model Architecture
- DeepSeek-OCR consists of two main components: DeepEncoder, a visual encoder designed for high compression and high-resolution document processing, and DeepSeek3B-MoE, a lightweight language decoder [3][4]
- The DeepEncoder integrates two established visual model architectures: SAM (Segment Anything Model) for local detail processing and CLIP (Contrastive Language-Image Pre-training) for capturing global knowledge (a hedged structural sketch follows below) [4][6]

Performance and Capabilities
- The model demonstrates strong "deep parsing" abilities, recognizing complex visual elements such as charts and chemical formulas, which broadens its applications in fields like finance, research, and education [6][7]
- Experimental results indicate that when the number of text tokens is within ten times the number of visual tokens (compression ratio < 10x), the model achieves 97% OCR accuracy, and it maintains around 60% accuracy even at a 20x compression ratio [6][7][8]

Industry Reception
- The model has received widespread acclaim from tech media and industry experts, with notable figures such as Andrej Karpathy praising its innovative approach of using pixels as input for large language models [3][4]
- Elon Musk commented on the long-term potential of AI models primarily using photon-based inputs, pointing to a shift in how data may be processed in the future [4]

Practical Applications
- DeepSeek-OCR is positioned as a highly practical model capable of generating large-scale pre-training data, with a single A100-40G GPU able to produce over 200,000 pages of training data per day [7][8]
- The model's approach allows it to compress a 1000-word article into just 100 visual tokens, showcasing its efficiency in processing and recognizing text [8]
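To visualize how the components described above might fit together, here is a minimal structural sketch in PyTorch. The class names, layer sizes, the pooling-based token compression, and the plain linear "decoder" are stand-ins invented for this sketch; they are assumptions, not DeepSeek's released DeepEncoder or DeepSeek3B-MoE code. The one property the sketch does try to mirror is that full global attention only ever runs over the short compressed sequence, never over the thousands of raw patch tokens.

```python
# Minimal sketch of a "local detail -> compress -> global attention -> decode" pipeline,
# loosely following the architecture summary above. All names and sizes are illustrative.
import torch
import torch.nn as nn


class LocalDetailEncoder(nn.Module):
    """Stand-in for the SAM-style branch: cheap, purely local processing of the full page."""
    def __init__(self, patch: int = 16, dim: int = 256):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.local_mix = nn.Conv2d(dim, dim, kernel_size=3, padding=1)  # mixes only neighbors

    def forward(self, image: torch.Tensor) -> torch.Tensor:    # (B, 3, H, W)
        x = torch.relu(self.patch_embed(image))                 # (B, dim, H/16, W/16)
        x = torch.relu(self.local_mix(x))
        return x.flatten(2).transpose(1, 2)                     # (B, n_patches, dim)


class TokenCompressor(nn.Module):
    """Stand-in for the compression module: shrink many patch tokens into few visual tokens."""
    def __init__(self, dim: int = 256, n_visual_tokens: int = 100):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool1d(n_visual_tokens)
        self.proj = nn.Linear(dim, dim)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:   # (B, n_patches, dim)
        x = self.pool(tokens.transpose(1, 2)).transpose(1, 2)   # (B, n_visual_tokens, dim)
        return self.proj(x)


class GlobalEncoder(nn.Module):
    """Stand-in for the CLIP-style branch: full attention, but only over the compressed sequence."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.block(tokens)


class OpticalCompressionOCR(nn.Module):
    """Encoder emits ~100 visual tokens; a placeholder head stands in for the language decoder."""
    def __init__(self, dim: int = 256, vocab: int = 32000, n_visual_tokens: int = 100):
        super().__init__()
        self.local = LocalDetailEncoder(dim=dim)
        self.compress = TokenCompressor(dim=dim, n_visual_tokens=n_visual_tokens)
        self.global_enc = GlobalEncoder(dim=dim)
        self.decode_head = nn.Linear(dim, vocab)                 # placeholder for DeepSeek3B-MoE

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        x = self.local(image)
        x = self.compress(x)
        x = self.global_enc(x)
        return self.decode_head(x)                               # (B, n_visual_tokens, vocab)


page = torch.randn(1, 3, 640, 640)                               # one synthetic "page"
print(OpticalCompressionOCR()(page).shape)                       # torch.Size([1, 100, 32000])
```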
DeepSeek's ultimate ambition: remaking the basic language of large language models into images
36Kr· 2025-10-21 12:52
Core Insights
- DeepSeek has open-sourced DeepSeek-OCR, an OCR model that achieves state-of-the-art results on benchmarks such as OmniDocBench [1]
- The motivation for entering the OCR field is to address the computational bottleneck of long-context processing in large language models (LLMs) [4][6]
- The paper proposes that text information can be efficiently compressed through optical 2D mapping, allowing vision-language models (VLMs) to decompress the original information from images [4][6]

Group 1: Long Context Processing
- The pursuit of longer context in LLMs has become an arms race, with token windows expanding from thousands to millions [7]
- The core limitation arises from the attention mechanism of the Transformer architecture, whose computational complexity and memory usage grow quadratically with sequence length (see the cost sketch below) [7]
- DeepSeek-AI's engineers pose a more fundamental question: instead of only optimizing the attention calculation, can the number of tokens itself be compressed? [7][10]

Group 2: Visual Tokens vs. Text Tokens
- Visual tokens are the basic units of information processed by visual models, while text tokens are the units used by LLMs [8]
- A 1024x1024 image can be divided into 4096 visual tokens, significantly reducing the number of tokens needed compared with a text representation of the same content [9]
- The insight that visual modalities can serve as an efficient compression medium for text information led to the creation of DeepSeek-OCR [9]

Group 3: DeepEncoder and Compression Techniques
- DeepSeek-OCR is essentially a proof of concept for an "optical compression-decompression" system [10]
- The DeepEncoder, the key innovation, is designed to handle high-resolution inputs while producing a minimal number of visual tokens [11][12]
- The architecture consists of three stages: a local detail processor, a compression module, and a global attention layer [14][16]

Group 4: Performance Metrics
- Experimental results show a 10.5x compression rate, with 64 visual tokens decoding 600-700 text tokens at an OCR accuracy of 96.5% [17][18]
- At a 20x compression rate, the model maintains around 60% accuracy while decoding over 1200 text tokens [17][18]
- DeepSeek-OCR outperforms existing models such as GOT-OCR2.0 and MinerU2.0 in both performance and token efficiency [19][20]

Group 5: Future Vision and Memory Simulation
- The team aims to simulate the forgetting mechanism of human memory, which naturally prioritizes relevant information while compressing less important details [25][27]
- The multi-resolution design of DeepSeek-OCR provides a technical foundation for managing memory in a way that mimics human cognitive processes [29][30]
- The ultimate goal is a system that balances information retention and computational efficiency, potentially leading to a new paradigm in AI memory and input systems [32][35]
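The quadratic-attention argument referenced in Group 1 can be made concrete with a small calculation. The 10x token reduction is the article's figure; treating full self-attention cost as proportional to the square of sequence length is a textbook simplification, not a measurement of DeepSeek's system.

```python
# Why compressing tokens helps more than linearly: full self-attention cost scales roughly
# with the square of sequence length, so a 10x token reduction gives ~100x less attention work.
# The 10x figure is from the article; the quadratic model is an illustrative simplification.

def attention_cost(seq_len: int) -> int:
    """Proxy for full self-attention work: every token attends to every other token."""
    return seq_len * seq_len

text_tokens = 100_000               # a long document kept as ordinary text tokens
visual_tokens = text_tokens // 10   # the same document after ~10x optical compression

print(f"token reduction:          {text_tokens / visual_tokens:.0f}x")
print(f"attention-cost reduction: {attention_cost(text_tokens) / attention_cost(visual_tokens):.0f}x")
```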
Whose AI made a killing with $10,000? DeepSeek comes first, GPT 5 last
Di Yi Cai Jing· 2025-10-21 12:33
Core Insights
- The article discusses a live investment competition called "Alpha Arena," initiated by the startup Nof1, in which six AI models trade real cryptocurrencies with starting capital of $10,000 each [3][4]
- The competition began on October 18 and runs for two weeks, concluding on November 3, with performance and trading strategies tracked in real time [4][6]
- The participating models are DeepSeek chat v3.1, Claude Sonnet 4.5, Grok 4, Qwen3 Max, Gemini 2.5 pro, and GPT 5, with notable differences in performance and trading style [4][6]

Performance Summary
- As of the fourth day, DeepSeek has been the most stable performer, initially reaching a return close to 40% before settling around 10% after market fluctuations [4][6]
- Grok 4 traded aggressively but faced volatility, while Claude improved from third to second place, closely trailing DeepSeek [6][8]
- Gemini 2.5 and GPT 5 posted significant losses, with Gemini 2.5 down over 30% and GPT 5 down over 40% [6][8]

Trading Styles
- DeepSeek's strategy is characterized by stability and a diversified portfolio, using a straightforward approach without frequent trading [8][10]
- By contrast, Gemini 2.5's erratic trading style has been likened to that of retail investors, leading to higher trading costs and losses [10][12]
- Grok 4 is noted for its aggressive trading style, while Claude is recognized for its analytical capabilities but struggles with decisiveness [12][13]

AI's Role in Investment
- The competition highlights the potential of AI in trading, with some users already adopting DeepSeek's strategies [12][13]
- However, industry experts caution that AI does not understand individual investors' circumstances and cannot predict future market movements [12][13]
- The consensus is that while AI can provide logical investment strategies, combining rational tools with human insight may yield the best results [13]
In depth | DeepSeek-OCR ignites the "language vs. pixels" debate, with Karpathy and Musk backing "everything ends up as pixels" on the eve of the vision camp's breakout
Sou Hu Cai Jing· 2025-10-21 12:25
Core Insights
- DeepSeek-OCR introduces a novel approach to visual encoding, emphasizing high information-compression efficiency through multi-resolution mechanisms [2][3]
- The model's design follows a "coarse-to-fine" path: entire pages are covered at lower resolution while key areas are processed at higher resolution, increasing both structural coverage and detail density [2][4]

Technical Mechanisms
- The model compresses documents dramatically, reducing 100,000 tokens to a few hundred visual tokens, which yields substantial improvements in latency, memory usage, and cost [4][14]
- DeepSeek-OCR's approach aligns with the "pyramid" paradigm of multi-scale generation and understanding, achieving near-lossless compression at a 10x reduction and maintaining about 60% accuracy at a 20x reduction [5][11]

Memory and Context Management
- The model incorporates a "forgetting" mechanism: recent information is stored at high resolution while older information is retained at lower resolution, mimicking the decay of human memory (a toy sketch of this tiering appears below) [7][18]
- This creates a three-dimensional temporal structure for context, allowing the model to retain information in layers rather than as a flat sequence of tokens [7][18]

Industry Implications
- The shift toward visual input is seen as a parallel track to traditional text tokens, with specific advantages for complex layouts, cross-language tasks, and security concerns [16][17]
- Integrating visual tokens could bring significant advances in long-context processing and whole-system optimization, as suggested by community estimates of processing capability [14][16]

Future Directions
- The ultimate goal is to unify visual input with semantic memory, allowing efficient context management in which older contexts exist in a "blurred" state yet remain available for detailed review when needed [18][20]
- Developing a robust evaluation framework that measures not just accuracy but also layout, semantic, and logical consistency will be crucial for the adoption of this new paradigm [19][20]
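Here is a toy sketch of the tiered "forgetting" idea described above: recent context is kept at full resolution (many tokens), and older context is re-encoded at progressively lower resolution (fewer tokens). The tier budgets and segment sizes below are invented for illustration; they are not DeepSeek's published parameters.

```python
# Toy model of resolution-decayed context memory: each older segment of a conversation
# is kept with fewer tokens, so details fade while the gist is retained.
# Tier budgets below are illustrative only.

TIER_BUDGETS = [400, 200, 100, 50, 25]   # visual-token budget per segment, newest first

def compress_history(segments):
    """segments: list of (label, original_token_count), newest first.
    Returns the token budget actually spent on each segment under tiered decay."""
    kept = []
    for i, (label, original) in enumerate(segments):
        budget = TIER_BUDGETS[min(i, len(TIER_BUDGETS) - 1)]
        kept.append((label, min(original, budget)))
    return kept

history = [("turn 5 (latest)", 380), ("turn 4", 410), ("turn 3", 390),
           ("turn 2", 420), ("turn 1 (oldest)", 400)]

for label, tokens in compress_history(history):
    print(f"{label:>16}: keep ~{tokens} tokens")

total_naive = sum(t for _, t in history)
total_tiered = sum(t for _, t in compress_history(history))
print(f"context size: {total_naive} -> {total_tiered} tokens")
```

In this toy run the context shrinks from 2000 to 755 tokens while the newest turn stays intact, which is the qualitative behavior the "blurred but reviewable" memory idea aims for.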
DeepSeek-OCR bursts onto the scene, opening a new "field of vision" for OCR with just 3B parameters! ChinaAMC Sci-Tech Innovation AI ETF (589010) active in morning trading as the AI theme stays hot
Mei Ri Jing Ji Xin Wen· 2025-10-21 07:36
Group 1
- The core viewpoint of the news highlights ongoing interest and investment in the AI sector, driven in particular by innovations from companies like DeepSeek that are reshaping the landscape of AI technology and investment perceptions in China [1][3]
- The Sci-Tech Innovation AI ETF (589010) has shown a positive market response, rising 0.94% to 1.389 yuan, indicating strong trading activity and investor interest in AI-related assets [1]
- DeepSeek-AI has introduced a new method for compressing long text contexts using visual modalities, demonstrating high OCR accuracy even at significant compression ratios and showcasing practical applications such as historical document processing [2]

Group 2
- Guosheng Securities emphasizes that the current AI wave is driven by technological innovation from companies like DeepSeek, whose core advantages of high performance and low cost position them competitively on a global scale [3]
- The Sci-Tech Innovation AI ETF closely tracks the Shanghai Stock Exchange's AI index, covering high-quality enterprises across the entire industry chain that benefit from heavy R&D investment and supportive policies [3]
- The ETF has seen continuous net inflows over the past five days, reflecting sustained market interest in the AI theme [1]
Text is dead, long live vision: Karpathy raves about DeepSeek's new model and the end of the tokenizer era
36Kr· 2025-10-21 07:22
Core Insights
- DeepSeek has made a significant breakthrough with its new model, DeepSeek-OCR, which shifts the input paradigm from text to visual data, suggesting that visual inputs may become mainstream in AI applications [1][14][17]

Performance Metrics
- DeepSeek-OCR achieves approximately 2500 tokens per second on a single A100-40G card while maintaining 97% OCR accuracy; it can compress visual context to 1/20 of its original size, with typical usage achieving a compression ratio of under 1/10 (a rough consistency check on these throughput figures follows below) [3][5]
- The model can compress an entire page of dense text into just 100 visual tokens, achieving up to 60x compression on the OmniDocBench benchmark [5][11]

Technical Advantages
- DeepSeek-OCR combines a small parameter count, high compression rates, fast processing, and support for 100 languages, making it both theoretically interesting and highly practical [7][11]
- The model suggests that physical pages (such as microfilm and books) are richer data sources for training AI models than low-quality internet text [11]

Industry Implications
- The shift from text to visual inputs could redefine how large language models process information, potentially eliminating the need for traditional tokenizers, which have long been criticized for their inefficiencies [16][19]
- Karpathy, a prominent figure in AI, argues that the future may see all inputs to AI models being images, improving efficiency and information flow [15][25]

Community Response
- The open-source project gained significant traction, receiving 4.4k stars on GitHub overnight, indicating strong community interest and support [10][46]
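As a sanity check, the throughput figure quoted here lines up roughly with the "200,000+ pages of training data per day" claim cited in the other articles in this digest, if one assumes a dense page decodes to on the order of 1,000 text tokens. That per-page token count is an assumption made for illustration, not a number from DeepSeek's paper.

```python
# Consistency check between two throughput claims in this digest:
# ~2500 output tokens/s on one A100-40G vs. 200,000+ pages of training data per day.
# The ~1000 text tokens per page figure is an illustrative assumption.

tokens_per_second = 2500
seconds_per_day = 24 * 60 * 60
assumed_tokens_per_page = 1000

tokens_per_day = tokens_per_second * seconds_per_day
pages_per_day = tokens_per_day / assumed_tokens_per_page

print(f"~{tokens_per_day:,.0f} tokens/day -> ~{pages_per_day:,.0f} pages/day")
# ~216,000,000 tokens/day -> ~216,000 pages/day, consistent with the 200k+ figure
```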