DeepSeek
AI Daily | DeepSeek releases DeepSeek-OCR 2; Alibaba's strongest Qwen model debuts, with performance rivaling GPT-5.2
美股研究社· 2026-01-27 10:44
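[DeepSeek releases DeepSeek-OCR 2] DeepSeek has released the new DeepSeek-OCR 2 model, which uses the novel DeepEncoder V2 method to let the AI dynamically reorder the parts of an image according to their meaning, rather than mechanically scanning from left to right. This approach mimics the logical flow a human follows when viewing a scene. As a result, the model outperforms conventional vision-language models on images with complex layouts (such as documents or charts), delivering smarter visual understanding with stronger causal reasoning.
[Kimi assistant releases version K2.5] Moonshot AI launched version K2.5 of the Kimi assistant on January 27, 2026, through a silent rollout; in the chat interface on the official website, the K2 model has been switched automatically to K2.5. This version natively supports visual understanding: users can upload images directly for the AI to analyze and create from, for example generating a 3D model from a floor plan. Tool calling has been strengthened at the same time, so the model can carry out complex tasks such as mathematical computation and programming through step-by-step reasoning, and related tests show strong results across multiple tasks.
[Alibaba's strongest Qwen model debuts, with performance rivaling GPT-5.2] On January 26, Alibaba officially released its flagship Qwen reasoning model, Qwen3-Max-Th ...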
Beyond Pony Ma's speech, what else did Tencent's annual meeting reveal? | 电厂
Sou Hu Cai Jing· 2026-01-27 10:42
Core Insights
- Tencent has acknowledged its slow progress in AI, admitting that it lags competitors such as ByteDance and Alibaba by approximately 9 to 12 months [1][2]
- The company is focusing on integrating AI with its various platforms and is making organizational adjustments to strengthen its AI capabilities [1][9]
Group 1: AI Strategy and Competitors
- Tencent's AI strategy is seen as a response to the competitive landscape, with ByteDance excelling in algorithms and application scenarios, DeepSeek focusing on AI infrastructure, and Alibaba leveraging its research capabilities through DAMO Academy [2][3]
- Liu Chiping, Tencent's president, emphasizes that AI serves as an accelerator for existing business, with notable growth in AI-supported advertising revenue from 3% in 2024 to 10% in 2025, contributing nearly 15 billion to overall revenue [8]
Group 2: Internal Challenges and Reflections
- Tencent's internal structure has been criticized for lacking a dedicated AI research team, which has hindered its ability to develop effective AI products [4][5]
- The company has identified shortcomings in its foundational AI models and infrastructure, which have limited its ability to scale and innovate [4][7]
Group 3: Future Directions and Innovations
- Tencent is pursuing comprehensive collaboration between its Hunyuan model and Yuanbao, aiming to create a robust AI technology infrastructure that supports all its products [9][11]
- The company is particularly optimistic about integrating AI into WeChat, with plans for a native AI agent that could significantly improve user experience and operational efficiency [12][13]
Kimi releases new model; Moonshot AI completes Series C round, with cash reserves exceeding 10 billion
21世纪经济报道· 2026-01-27 10:41
Core Viewpoint
- Kimi has launched its new multimodal model K2.5, which demonstrates state-of-the-art performance in several core areas, including agent collaboration and code generation, marking a significant advance in AI capabilities [1][3].
Group 1: Model Features and Capabilities
- K2.5 is designed with a native multimodal architecture, supporting both visual and textual inputs, and can perform tasks in thinking and non-thinking modes, excelling in agent, code, image, video, and general intelligence tasks [1][3].
- The model significantly lowers the AI interaction threshold by allowing users to submit requests via photos, screenshots, or screen recordings, thus overcoming the limitations of text-based communication [5].
- K2.5 introduces a new "Agent Cluster" capability, enabling the creation of multiple intelligent agents that can work in parallel, improving task efficiency by reducing key steps by 3 to 4.5 times and cutting actual runtime by up to 4.5 times [5][6].
Group 2: Financial and Strategic Developments
- Kimi's valuation has risen to $4.3 billion (approximately 29.9 billion RMB) following a $500 million Series C funding round completed on December 31, 2023, which was significantly oversubscribed [1][10].
- The company is currently finalizing a new round of financing with a pre-money valuation of $4.8 billion [1].
- Kimi's strategic shift from a user acquisition strategy to focusing on foundational algorithms and model development has been influenced by the competitive landscape, particularly the rise of DeepSeek [7][8].
Group 3: Future Plans and Goals
- Kimi aims to surpass leading companies like Anthropic and become a world leader in AGI by enhancing its K3 model and focusing on unique capabilities that have not been defined by other models [12].
- The company plans to integrate model training and agent product development, aiming for significant revenue growth while not prioritizing absolute user numbers [12].
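氪星晚报 (36Kr evening briefing) | German defense giant to build a homegrown "Starlink" for the German military; OpenAI Chief Information Security Officer Knight to step down; gold jewelry price rises to 1,585 yuan per gram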
36Kr· 2026-01-27 10:16
Group 1
- Adidas has become the official strategic partner of the 2026 Jiangsu Province City Football League, with attendance exceeding 2.43 million and an average of 28,000 spectators per match since its inception in 2025 [1]
- Samsung and SK Hynix have reportedly decided to significantly increase the price of LPDDR used in iPhones, nearly doubling the price compared to the previous quarter [2]
- Vietnamese automaker Kim Long Motor will collaborate with China's BYD to build a $130 million electric vehicle battery factory in northern central Vietnam, with funding provided by Kim Long and technical support from BYD [3]
Group 2
- Rheinmetall and a Bremen-based satellite manufacturer are planning to bid for a contract to provide the German military with a satellite internet service similar to the US Starlink, with the contract potentially worth several billion euros [4]
- OpenAI's Chief Information Security Officer, Knight, is set to step down from his position [5]
- The Beijing Stock Exchange has denied rumors regarding a delayed announcement for new stock subscriptions, stating that the circulated notice was false [6]
Group 3
- Beijing Yonghui Supermarket has issued a statement regarding the suspension of operations at its Hongkun Plaza store due to property management issues, including water and heating disruptions [7]
- Nike is laying off 775 employees to accelerate automation in its U.S. distribution centers, following a previous announcement to cut 1,000 positions [8]
- DeepSeek has released the DeepSeek-OCR 2 model, which uses an innovative DeepEncoder V2 method to dynamically rearrange image content based on its meaning [9][10]
Group 4
- Alibaba Health's medical AI application "Hydrogen Ion" has launched a new "dynamic evidence positioning" feature, which locates the specific statements in original texts that support a given viewpoint [11]
- AI medical innovation company "Virtual Reality" has completed an A+ round of financing exceeding 50 million yuan, with plans for further development in AI algorithms and hardware [12]
- The State Administration for Market Regulation has reported 1,169 cases related to power bank safety violations, emphasizing the importance of product quality and safety [13]
Group 5
- Domestic gold jewelry prices have increased, with several brands reporting prices for pure gold jewelry ranging from 1,575 to 1,585 yuan per gram [14]
- China's Ministry of Human Resources and Social Security plans to implement measures to support employment in response to the impact of artificial intelligence, including actions to stabilize and expand job opportunities [15]
DeepSeek-OCR 2 makes a major debut: AI learns "human visual logic" and interprets images through causal flow
华尔街见闻· 2026-01-27 09:56
Core Viewpoint
- DeepSeek has launched the DeepSeek-OCR 2 system, which uses the DeepEncoder V2 method to enable AI to understand images in a human-like logical sequence, potentially transforming document processing and complex visual understanding applications [1][12].
Group 1: Technical Innovations
- The DeepEncoder V2 method allows AI to dynamically rearrange image segments based on their meaning, rather than following a rigid left-to-right scanning approach, mimicking human visual perception [1][5].
- DeepSeek-OCR 2 achieved a score of 91.09% on the OmniDocBench v1.5 benchmark, a 3.73% improvement over its predecessor [1][10].
- The model maintains high accuracy while controlling computational costs, with visual token counts limited to between 256 and 1120, aligning with Google's Gemini-3 Pro [2][8].
Group 2: Performance Metrics
- In practical applications, the model reduced repetition rates from 6.25% to 4.17% for online user logs and from 3.69% to 2.88% for PDF data processing, indicating high practical maturity [2][10].
- The reading order edit distance metric improved significantly from 0.085 to 0.057, validating the effectiveness of the logical reordering capabilities of DeepEncoder V2 [10].
Group 3: Architectural Changes
- DeepEncoder V2 replaces the original CLIP components with a compact LLM-style architecture (Qwen2-0.5B) and introduces learnable query vectors known as "causal flow tokens" [6][8].
- The design retains a bidirectional attention mechanism for visual tokens while employing a causal attention mechanism for causal flow tokens, allowing for intelligent reordering of visual information [7][8].
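The mixed attention pattern described under Group 3 can be pictured as a single boolean mask. The sketch below is illustrative only and is not taken from the DeepSeek release: it assumes the sequence is laid out as visual tokens followed by causal flow tokens, and that each flow token may look at every visual token plus the flow tokens up to and including itself. The function names `build_attention_mask` and `render` are hypothetical.

```python
def build_attention_mask(num_visual: int, num_flow: int) -> list[list[bool]]:
    """Return mask[i][j] == True when position i may attend to position j.

    Assumed layout (not confirmed by the source): visual tokens occupy
    positions 0..num_visual-1, causal flow tokens come after them.
    """
    total = num_visual + num_flow
    mask = [[False] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if i < num_visual:
                # Visual token: bidirectional attention over all visual tokens.
                mask[i][j] = j < num_visual
            else:
                # Flow token: all visual tokens, plus flow tokens up to itself
                # (causal, so no attention to later flow tokens).
                mask[i][j] = j < num_visual or j <= i
    return mask


def render(mask: list[list[bool]]) -> str:
    """Draw the mask as rows of '1' (may attend) and '.' (blocked)."""
    return "\n".join("".join("1" if cell else "." for cell in row) for row in mask)


if __name__ == "__main__":
    # Three visual tokens followed by two causal flow tokens.
    print(render(build_attention_mask(num_visual=3, num_flow=2)))
```

In this layout the upper-left block is fully open (bidirectional) and the lower-right block is lower-triangular (causal), which is the contrast the summary draws between the two token types.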
Group 4: Future Implications
- The release of DeepSeek-OCR 2 signifies not only an upgrade in OCR performance but also a significant exploration of architecture, suggesting a promising path towards unified multimodal encoders capable of feature extraction across images, audio, and text [12].
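For readers unfamiliar with the reading order edit distance cited above (0.085 to 0.057, lower is better), the following is a minimal sketch of one common way to compute such a score: a Levenshtein distance between the predicted and reference block orders, divided by the longer sequence length. The exact OmniDocBench procedure may differ; the block labels and function names here are made up for illustration.

```python
def edit_distance(pred: list[str], ref: list[str]) -> int:
    """Levenshtein distance between two sequences of block labels."""
    prev = list(range(len(ref) + 1))
    for i, p in enumerate(pred, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if p == r else 1
            curr.append(min(prev[j] + 1,          # delete p
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(pred: list[str], ref: list[str]) -> float:
    """Edit distance scaled to [0, 1] by the longer sequence length."""
    longest = max(len(pred), len(ref))
    return edit_distance(pred, ref) / longest if longest else 0.0


if __name__ == "__main__":
    # Hypothetical page: the reference order reads both columns before the table.
    reference = ["title", "col1", "col2", "table", "footer"]
    predicted = ["title", "col1", "table", "col2", "footer"]
    print(normalized_edit_distance(predicted, reference))
```

A score of 0 means the predicted order matches the reference exactly, so a drop in this metric indicates that blocks are being emitted closer to the intended reading order.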
Nvidia’s Rally Shows DeepSeek Fears Were Unfounded a Year Later
Insurance Journal· 2026-01-27 08:57
A year ago, the Chinese startup DeepSeek freaked out the stock market with the idea that developing artificial intelligence was much easier and cheaper than everyone imagined. But 12 months later, that’s turned out to be largely a mirage so far. DeepSeek erased a record $589 billion from Nvidia Corp.’s market value in one day after the company revealed an AI model thought to be comparable to those of OpenAI and Meta Platforms Inc. and developed at a fraction of the cost. Nvidia’s double-digit plunge led the ...
DeepSeek open-sources a new OCR model! It drops CLIP for a lightweight Qwen model, with performance rivaling Gemini-3 Pro
量子位· 2026-01-27 08:32
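By henry, from Aofeisi
量子位 | WeChat official account QbitAI
DeepSeek has just open-sourced a new OCR model, DeepSeek-OCR 2, whose headline feature is accurate conversion of PDF documents to Markdown.
Compared with the first-generation model released on October 20 last year, the core breakthrough of DeepSeek-OCR 2 is that it breaks away from the rigid "raster scan" logic of conventional models and dynamically reorders visual tokens according to image semantics.
To do this, DeepSeek-OCR 2 drops the CLIP component used in its predecessor and instead builds DeepEncoder V2 on a lightweight language model (Qwen2-0.5B), introducing "causal reasoning" capability at the visual encoding stage.
This change mimics the causal visual flow of a human reading a document, so that the visual tokens are intelligently reordered before the LLM interprets the content.
In terms of performance, DeepSeek-OCR 2 achieves results comparable to Gemini-3 Pro while using only a lightweight model.
On the OmniDocBench v1.5 benchmark, DeepSeek-OCR 2 improved by 3.73% and made notable progress in visual reading logic.
| Model | | | | V-token max Overall ↑ Formula OM ↑ TableTEDs ↑ ...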
X @AscendEX
AscendEX· 2026-01-27 08:15
📰 #AscendEX Daily Updates
🔷 60% of the top 25 banks in the U.S. are developing BTC-related products.
🔷 DeepSeek releases OCR2, capable of interpreting images with human-like logic sequencing.
🔷 Ethereum network fees have dropped to the lowest level since May 2017.
#AscendEX #Crypto #CryptoNews ...
Breaking: DeepSeek releases and open-sources a new model
Mei Ri Jing Ji Xin Wen· 2026-01-27 08:12
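每经 editor | Cheng Peng
On January 27, the DeepSeek team released and open-sourced the new DeepSeek-OCR 2 model. It uses the novel DeepEncoder V2 method to let the AI dynamically reorder the parts of an image according to their meaning, rather than mechanically scanning from left to right. This approach is closer to the logic of human visual encoding. As a result, the model outperforms conventional vision-language models on images with complex layouts, delivering smarter visual understanding with stronger causal reasoning.
Editors | Cheng Peng, Du Bo. Proofreader | Xu Shaohang. Cover image source: Visual China (file photo)
Compiled by 每日经济新闻 from 每经 AI Express ...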
DeepSeek open-sources the OCR2 model
Cai Jing Wang· 2026-01-27 08:05
Core Viewpoint
- The DeepSeek team has released a paper titled "DeepSeek-OCR2: Visual Causal Flow" and has open-sourced the DeepSeek-OCR2 model, which uses an innovative DeepEncoder V2 method to enable AI to dynamically rearrange parts of an image based on its meaning, aligning more closely with human visual encoding logic [1].
Group 1
- The DeepSeek-OCR2 model represents a significant advance in AI's ability to interpret and manipulate visual information [1].
- The innovative DeepEncoder V2 method is a key feature that enhances the model's performance in visual tasks [1].
- Open-sourcing the model allows for broader access and potential collaboration within the AI research community [1].