Context Window
DeepSeek Has Turned Cold
36Kr · 2026-02-13 10:20
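The following article is from The Economic Observer, by Chen Yueqin. The Economic Observer, founded in 2001, is a national general-interest financial media outlet focused on financial news and economic analysis. It covers business principles, business techniques and business opportunities, and has built authoritative credibility and influence through reporting marked by sharpness, readability, breadth and depth.
Many users have spontaneously called on others to write to DeepSeek's official email address with feedback: they hope DeepSeek will not give up deep thinking for the sake of ultra-long text, and will not weaken its support for written expression, empathetic understanding and similar abilities in order to improve math, coding and other science and engineering capabilities. Some users have gone to Wandoujia (an app distribution platform) to download an older version, or use DeepSeek inside Tencent Yuanbao.
By Chen Yueqin | Source: The Economic Observer (ID: eeo-com-cn) | Cover image: Unsplash
The upgraded 1M-token window means DeepSeek can take in roughly 750,000 to 900,000 English letters at once, or process roughly 80,000 to 150,000 lines of code. DeepSeek says it can read in and accurately understand the full text of the Three-Body trilogy (about 900,000 Chinese characters) in one pass, and complete a macro-level analysis or detail retrieval of the whole work within a few minutes. Beyond the improved context capacity, DeepSeek's knowledge base has been updated from a mid-2024 version to May 2025.
However, this gray-release version still has not launched visual understanding or multimodal input, and remains focused on ...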
DeepSeek Has Turned Cold
Jing Ji Guan Cha Wang · 2026-02-12 04:57
Core Insights
- DeepSeek has run a gray-release (limited rollout) test of its flagship model, raising the context window from 128K tokens to 1M tokens, a nearly 8-fold increase in capacity [1]
- The upgraded model can process approximately 750,000 to 900,000 English letters or around 80,000 to 150,000 lines of code in a single interaction [1]
- DeepSeek claims it can read and understand the entire "Three-Body" trilogy (approximately 900,000 Chinese characters) and perform macro analysis or detail retrieval within minutes [1]

Model Features
- The gray-release version does not yet support visual understanding or multimodal input, focusing solely on text and voice interactions [2]
- DeepSeek accepts file uploads in formats such as PDF and TXT, but currently processes them by converting them to text tokens rather than through native multimodal understanding [2]
- Compared with models like Gemini 3 Pro, which can handle long texts exceeding 2M tokens and complex media tasks, DeepSeek offers 1M-token text context processing at about one-tenth the price [2]

User Experience
- Users have noticed a change in the model's writing style after the update, describing it as more formal and less personal, which has left some users dissatisfied [2][3]
- User feedback asks DeepSeek to preserve its depth of thought and emotional understanding rather than sacrifice them for stronger technical capabilities [3]
- Users report that they cannot get the previous writing style back, and some say they feel they have lost a "close friend" [3]

Company Response
- As of February 12, DeepSeek had not responded to inquiries about the gray-release test [4]
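The "nearly 8-fold" figure and the lines-of-code range above can be checked with simple arithmetic. The sketch below is illustrative only: the article does not say whether "128K" and "1M" are decimal (1K = 1,000) or binary (1K = 1,024) units, so both readings are computed as assumptions, and the function name `window_growth` is hypothetical.

```python
# Assumption: the article does not specify decimal or binary token units,
# so both interpretations are shown.

def window_growth(old_tokens: int, new_tokens: int) -> float:
    """Return how many times larger the new context window is."""
    if old_tokens <= 0:
        raise ValueError("old_tokens must be positive")
    return new_tokens / old_tokens

decimal_ratio = window_growth(128_000, 1_000_000)      # 128K -> 1M, decimal units
binary_ratio = window_growth(128 * 1024, 1024 * 1024)  # 128K -> 1M, binary units

# Tokens per line of code implied by the quoted range of
# 80,000 to 150,000 lines per 1M tokens.
tokens_per_line_low = 1_000_000 / 150_000
tokens_per_line_high = 1_000_000 / 80_000

print(f"decimal: {decimal_ratio:.2f}x, binary: {binary_ratio:.2f}x")
print(f"implied tokens per code line: {tokens_per_line_low:.1f} to {tokens_per_line_high:.1f}")
```

Under the decimal reading the increase is about 7.8x, and under the binary reading it is exactly 8x, which is consistent with the "nearly 8-fold" wording. The quoted code range implies roughly 7 to 12.5 tokens per line.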
Qwen's New Model Closes In on Claude 4! Context Window Extensible to One Million Tokens, Runs Locally with 33GB
QbitAI · 2025-08-01 00:46
Core Viewpoint
- Qwen3-Coder is positioned as a groundbreaking open-source programming model that challenges existing models with its high performance and local usability [1][2][3].

Group 1: Model Performance
- Qwen3-Coder-Flash has been released as a lightweight version whose performance is comparable to GPT-4.1 [2][3].
- It surpasses top open-source models on multiple programming tasks, lagging only slightly behind proprietary models such as Claude Sonnet-4 and GPT-4.1 [5].
- The model supports a native context window of 256k tokens, extensible to 1 million tokens, making it suitable for large codebases and complex multi-file projects [16].

Group 2: Technical Specifications
- Qwen3-Coder-Flash uses a mixture-of-experts (MoE) architecture with about 30 billion total parameters, of which about 3.3 billion are activated [16].
- It is optimized for platforms including Qwen Code, Cline, Roo Code, and Kilo Code, and supports seamless function calls and agent workflows [16].

Group 3: User Experience and Applications
- Users have reported successful game-coding tasks, showing that the model can generate working code from minimal prompts [12][14].
- The model has been tested on devices with limited memory, such as the M2 MacBook Pro, showing its versatility and efficiency [12][18].
- Qwen3-Coder-Flash is highlighted as an excellent, user-friendly choice for local programming [10].

Group 4: Community and Ecosystem
- The rapid pace of updates and open-source releases from Qwen has created a competitive environment among domestic models [18].
- Several platforms and community resources let users try Qwen3-Coder, including QwenChat and ModelScope [19].
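To make the 256k native and 1 million extended context figures concrete, the sketch below estimates which window a codebase of a given size would need. It is a rough illustration under stated assumptions, not a measurement: the default of 10 tokens per line is a placeholder guess (a real tokenizer count should replace it), the window sizes are read as decimal units, and the names `estimate_tokens` and `smallest_fitting_window` are hypothetical.

```python
import math

NATIVE_WINDOW = 256_000      # "256k tokens" as quoted; decimal units assumed
EXTENDED_WINDOW = 1_000_000  # "1 million tokens" as quoted

def estimate_tokens(lines_of_code: int, tokens_per_line: float = 10.0) -> int:
    """Rough token estimate; tokens_per_line is an assumed placeholder value."""
    return math.ceil(lines_of_code * tokens_per_line)

def smallest_fitting_window(lines_of_code: int, tokens_per_line: float = 10.0) -> str:
    """Return which context window the estimated token count fits into."""
    tokens = estimate_tokens(lines_of_code, tokens_per_line)
    if tokens <= NATIVE_WINDOW:
        return "native"
    if tokens <= EXTENDED_WINDOW:
        return "extended"
    return "too large"

for loc in (20_000, 60_000, 150_000):
    print(loc, smallest_fitting_window(loc))
```

With the assumed 10 tokens per line, a 20,000-line project fits the native window, a 60,000-line project needs the extended window, and a 150,000-line project exceeds both. The thresholds shift directly with the tokens-per-line assumption.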
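OpenAI's Latest Release!
Yicai · 2025-04-15 00:06
OpenAI has launched three models in the GPT-4.1 series: GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano. The series is available only through the API. GPT-4.1 is regarded as a comprehensive upgrade of GPT-4o, with stronger multimodal processing, a larger context window (all three models can handle 1 million tokens), and a 26% reduction in cost. ...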