Ads Showing Up in Generated Code: Are Tencent's Codebuddy and Friends Taking the Blame? DeepSeek's "极你太美" Incident, and Other Models Aren't Immune Either
36Kr· 2025-08-27 07:44
Core Viewpoint - The recent issues with Tencent's Codebuddy and ByteDance's Trae are attributed to a bug in the DeepSeek V3.1 model, which produces unexpected outputs in code generation, most notably the insertion of the character "极" [1][4][12].

Group 1: Bug Discovery and Impact
- Users reported that while using Tencent's Codebuddy, unexpected advertisement-like text was inserted into generated code, prompting some users to uninstall the tool [1].
- The bug was traced to the DeepSeek V3.1 model, with users noting that it could emit the character "极" in unexpected places [4][12].
- A developer on Reddit confirmed similar behavior with DeepSeek V3.1, reporting that the model produced unexpected tokens during testing [4].

Group 2: User Experiences and Variability
- Some users did not encounter the bug when using DeepSeek's official API, while third-party platforms showed a higher incidence of the issue [6].
- Users have humorously dubbed the bug the "极你太美" incident, reflecting the community's engagement with the issue [7].
- User feedback indicated that the problem is not isolated to DeepSeek; other models such as Gemini and Grok have exhibited similar issues [12].

Group 3: Theories on Bug Origin
- Hypotheses about the cause include token-continuity issues, data contamination during training, and problems with the multi-token prediction framework [14][16].
- One researcher suggested the bug may be linked to self-supervised synthetic data used during the model's fine-tuning phase [16].
- The persistence of the "极" issue across different model versions suggests a deeper problem in the training data and model architecture [18].

Group 4: Community Response and Future Considerations
- The community has actively engaged in identifying and discussing the bug, with developers calling for better monitoring and data-cleaning mechanisms throughout the model training process (a simple client-side detection sketch follows this summary) [18].
- The incident highlights the value of collaborative problem-solving in the open-source community, with users expressing optimism about collectively resolving the issue [18].
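On the consumer side, one minimal mitigation is to scan model-generated patches for anomalous characters before they are applied. The sketch below is a generic illustration only, not any vendor's tooling; the character list and the `generated_code` input are assumptions.

```python
# Minimal sketch: flag suspicious characters in model-generated code before applying it.
SUSPICIOUS_CHARS = {"极", "極"}  # tokens reported in the DeepSeek V3.1 incident; extend as needed


def find_anomalies(generated_code: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that contain suspicious characters."""
    hits = []
    for lineno, line in enumerate(generated_code.splitlines(), start=1):
        if any(ch in line for ch in SUSPICIOUS_CHARS):
            hits.append((lineno, line))
    return hits


if __name__ == "__main__":
    sample = "timeout = 10\nretries = 极3  # corrupted line\n"
    for lineno, line in find_anomalies(sample):
        print(f"line {lineno}: suspicious character in {line!r}")
```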
DeepSeek Model Upgrade; Continuing to Watch "AI+" Sector Opportunities - Computer Industry "Weekly Decode" | Investment Research Report
 Zhong Guo Neng Yuan Wang· 2025-08-27 06:27
Group 1: AI Development and Innovations
- DeepSeek-V3.1 has been released, marking a significant step towards the Agent era, with upgrades including a hybrid thinking mode, improved efficiency, and enhanced Agent capabilities (an API sketch follows this summary) [3]
- Recent updates to domestic and international large models are expected to provide strong support for AI applications, particularly the development of Agents [2][3]
- The Shanghai government has launched an implementation plan to accelerate "AI + Manufacturing," aiming to raise the level of intelligent development in the manufacturing sector over three years [4]

Group 2: Market Trends and Company Performance
- Baidu's AI new-business revenue grew rapidly in Q2 2025, exceeding 10 billion yuan, a 34% year-on-year increase, indicating a successful transition from an "advertising-dependent" to a "technology-driven" model [5]
- Baidu's total revenue for Q2 2025 was 32.7 billion yuan, with a net profit of 7.4 billion yuan, up 35% [5]
- An IDC report indicates that Baidu Intelligent Cloud has held the top position in the AI public cloud service market for six consecutive years, attributed to its comprehensive AI infrastructure [5]

Group 3: Industry Challenges and Solutions
- The implementation of "AI + Manufacturing" faces challenges such as low data quality, high compliance risks, and limited data circulation, which hinder traditional enterprises from adopting industrial models [4]
- Establishing a public service platform for industrial data and exploring diverse benefit-sharing mechanisms are proposed as solutions to data silos [4]
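As a minimal illustration of the hybrid thinking mode mentioned above, the sketch below calls an OpenAI-compatible chat endpoint twice, once with the standard chat model and once with the reasoning-oriented model. The base URL and the model identifiers ("deepseek-chat", "deepseek-reasoner") follow DeepSeek's published API conventions but should be verified against current documentation before use.

```python
# pip install openai
import os

from openai import OpenAI

# Base URL and model names per DeepSeek's public API docs; verify before relying on them.
client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

prompt = "Summarize the trade-offs of 8-bit floating-point training in two sentences."

for model in ("deepseek-chat", "deepseek-reasoner"):  # non-thinking vs. thinking mode
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    print(model, "->", resp.choices[0].message.content)
```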
DeepSeek V3.1 Suddenly Hit by an Absurd Bug: the Character "极" Popping Up All Over the Screen, Leaving Developers Baffled
36Kr· 2025-08-26 09:53
Core Viewpoint - The article discusses recent bugs encountered by AI models, focusing on DeepSeek and Gemini, and the impact of these issues on automated coding and testing processes.

Group 1: DeepSeek Issues
- DeepSeek has been reported to insert stray characters into code, disrupting the coding process and potentially leading to system failures [2][3][10].
- The bug is not isolated to third-party platforms; it also appears in official full-precision deployments, indicating a broader issue within the model [2][10].
- Previous DeepSeek updates have also surfaced bugs, including language mixing in writing tasks and overfitting in coding tasks [2].

Group 2: Gemini Issues
- Gemini has faced its own problems, including a self-referential loop that produced nonsensical output, drawing both humor and frustration from users [5][8].
- The Gemini issue has been characterized as a cyclical bug arising from interactions between its safety, alignment, and decoding layers [8].

Group 3: General Stability Concerns
- The stability of large models such as DeepSeek and Gemini has been a persistent issue, with users reporting problems such as loss of historical context [12].
- Bugs can arise from minor maintenance activities, such as system-prompt changes or tokenizer updates, which can disrupt previously stable behavior [18].
- The increasing complexity of AI systems, particularly those involving multiple agents and toolchains, adds to their fragility, often causing failures not directly attributable to the models themselves [20].

Group 4: Importance of Stability
- The DeepSeek and Gemini incidents underscore the necessity of engineering for stability: predictability and controlled error handling are crucial for AI systems (a minimal guard-rail sketch follows this summary) [21].
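One common guard rail for the automated coding pipelines described above is to validate generated code and retry a bounded number of times before failing loudly. The sketch below is a generic illustration under assumed names (`generate_patch` stands in for whatever model call a pipeline uses); it is not any vendor's documented interface.

```python
from typing import Callable


def generate_with_validation(
    generate_patch: Callable[[str], str],  # hypothetical model-call wrapper
    prompt: str,
    max_attempts: int = 3,
) -> str:
    """Ask the model for Python code and keep only output that actually parses."""
    last_error: Exception | None = None
    for attempt in range(1, max_attempts + 1):
        code = generate_patch(prompt)
        try:
            compile(code, "<generated>", "exec")  # syntax check only, no execution
            return code
        except SyntaxError as exc:
            last_error = exc  # e.g. a stray "极" breaking the syntax tree
    raise RuntimeError(f"no valid code after {max_attempts} attempts: {last_error}")
```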
DeepSeek V3.1 Suddenly Hit by an Absurd Bug: the Character "极" Popping Up All Over the Screen, Leaving Developers Baffled
 Hu Xiu· 2025-08-26 07:25
Group 1
- The latest version of DeepSeek (V3.1) has been found to insert unexpected tokens such as "极/極/extreme" in various coding scenarios, significantly disrupting the coding process [1][3][4]
- The issue is not limited to third-party deployments; it also occurs in the official full-precision version, indicating a broader problem within the system [3][5][18]
- Previous DeepSeek bugs included language mixing in writing tasks and overfitting in coding tasks, suggesting a pattern of instability in the model [6][8]

Group 2
- Insertion of the token "极" can break the syntax tree or cause an agent process to freeze, posing significant challenges for teams that rely on automated coding or testing pipelines [8][20]
- Other AI models, such as Gemini, have also faced stability issues, including a self-referential loop in coding scenarios, highlighting a problem common across AI systems [10][15]
- The underlying cause may relate to the model's mechanical and probabilistic nature: a slight disturbance in the decoding process can produce unexpected outputs [21][32]
DeepSeek Rolls the FP8 Dice
 Di Yi Cai Jing Zi Xun· 2025-08-26 06:45
Core Viewpoint - The recent rise in the chip and AI computing indices is driven by growing demand for AI capability and the acceleration of domestic chip substitution, highlighted by DeepSeek's release of DeepSeek-V3.1, which uses the UE8M0 FP8 scale-parameter precision [2][5].

Group 1: Industry Trends
- The chip index (884160.WI) has risen 19.5% over the past month, while the AI computing index (8841678.WI) has risen 22.47% [2].
- The introduction of FP8 is driving a significant trend toward low-precision computing, which is essential for meeting the industry's urgent need for efficient, low-power computation [2][5].
- Major companies including Meta, Microsoft, Google, and Alibaba have promoted the MX specification through the Open Compute Project (OCP), packaging FP8 for large-scale deployment [6].

Group 2: Technical Developments
- FP8, an 8-bit floating-point format, is gaining traction because it offers advantages in memory usage and computational efficiency over earlier formats such as FP32 and FP16 (a block-scaling sketch follows this summary) [5][8].
- The transition to low-precision computing is expected to improve training efficiency and reduce hardware requirements, particularly in AI inference scenarios [10][13].
- DeepSeek's successful use of FP8 in model training is expected to drive broader adoption of the technology across the industry [14].

Group 3: Market Dynamics
- By Q2 2025, the market share of domestic chips is projected to rise to 38.7%, reflecting a shift toward local alternatives in the AI chip sector [9].
- The domestic share of China's AI accelerator card market is expected to grow from under 15% in 2023 to over 40% by mid-2025, indicating a significant move toward self-sufficiency in the domestic chip industry [14].
- The industry is seeing a positive cycle of financing, research and development, and practical application, establishing a sustainable path independent of overseas ecosystems [14].
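To make the UE8M0 idea concrete: in MX-style block formats, a group of values shares one scale that is itself stored as an 8-bit exponent (E8M0, i.e. a power of two), while each element is an 8-bit float. The sketch below is a simplified NumPy illustration of that block-scaling scheme, not DeepSeek's actual kernel; the block size and the e4m3 element clamp are assumptions.

```python
import numpy as np

FP8_E4M3_MAX = 448.0   # largest finite magnitude representable in e4m3
BLOCK = 32             # MX-style block size (assumed for illustration)


def fake_fp8_e4m3(v: np.ndarray) -> np.ndarray:
    """Round to roughly 3 mantissa bits to mimic e4m3 precision (ignores subnormals/NaN)."""
    out = np.zeros_like(v)
    nz = v != 0
    e = np.floor(np.log2(np.abs(v[nz])))
    step = 2.0 ** (e - 3)                      # quantization step at each magnitude
    out[nz] = np.round(v[nz] / step) * step
    return np.clip(out, -FP8_E4M3_MAX, FP8_E4M3_MAX)


def quantize_block_ue8m0(x: np.ndarray) -> tuple[np.ndarray, int]:
    """One block: a shared power-of-two scale (E8M0 exponent) plus FP8-like elements."""
    amax = float(np.max(np.abs(x))) or 1.0
    exp = int(np.ceil(np.log2(amax / FP8_E4M3_MAX)))  # choose scale so amax fits in range
    q = fake_fp8_e4m3(x / 2.0 ** exp)
    return q, exp


def dequantize_block(q: np.ndarray, exp: int) -> np.ndarray:
    return q * 2.0 ** exp


if __name__ == "__main__":
    w = np.random.randn(BLOCK) * 3.0
    q, exp = quantize_block_ue8m0(w)
    err = np.max(np.abs(w - dequantize_block(q, exp)))
    print(f"shared scale = 2**{exp}, max abs quantization error = {err:.3e}")
```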
DeepSeek Rolls the FP8 Dice: A Computing-Power Contest over Efficiency, Cost, and Self-Reliance
 Di Yi Cai Jing· 2025-08-26 05:47
Core Viewpoint - The domestic computing power industry chain is steadily emerging along a sustainable path independent of overseas ecosystems [1]

Group 1: Market Trends
- On August 26, the chip index (884160.WI) rebounded, rising 0.02% at midday and 19.5% over the past month; the AI computing power index (8841678.WI) continued to gain traction, rising 1.45% at midday and 22.47% over the past month [2]
- The recent rise in the chip and AI computing power indices is driven by surging AI and large-model computing demand, alongside accelerated domestic substitution and maturing supply-chain diversification [2][9]
- The introduction of DeepSeek-V3.1 marks a significant step towards the era of intelligent agents, using UE8M0 FP8 scale parameters designed for the next generation of domestic chips [2][6]

Group 2: Technological Developments
- FP8, an 8-bit floating-point format, is gaining attention as a more efficient alternative to larger, less efficient formats such as FP32 and FP16 [5][6]
- The industry has begun to shift its focus from merely acquiring GPUs to optimizing computing efficiency, with FP8 expected to play a crucial role in reducing cost, power consumption, and memory usage (a rough memory estimate follows this summary) [7][10]
- The MXFP8 standard, developed by major companies such as Meta and Microsoft, enables large-scale deployment of FP8 while preserving stability during AI training tasks [6][9]

Group 3: Industry Dynamics
- By Q2 2025, the market share of domestic chips is projected to rise to 38.7%, driven by both technological advances and the competitive landscape of the AI chip industry [9]
- The domestic share of China's AI accelerator card market is expected to grow from under 15% in 2023 to over 40% by mid-2025, with projections that it will surpass 50% by year end [13]
- The domestic computing power industry has established a positive cycle of financing, research and development, and practical application, moving toward a sustainable path independent of foreign ecosystems [13]
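As a rough worked example of the memory argument, the arithmetic below uses the 671B total-parameter figure cited elsewhere in this digest for DeepSeek-V3.1; it counts raw weight storage only and ignores activations, KV cache, optimizer state, and per-block scale overhead, so real deployments will differ.

```python
# Rough weight-storage arithmetic; illustration only, not a deployment estimate.
PARAMS_TOTAL = 671e9  # total parameters (DeepSeek-V3.1 figure cited in this digest)

BYTES_PER_WEIGHT = {"FP32": 4, "FP16/BF16": 2, "FP8": 1}

for fmt, nbytes in BYTES_PER_WEIGHT.items():
    gib = PARAMS_TOTAL * nbytes / 2**30
    print(f"{fmt:>10}: ~{gib:,.0f} GiB just for the weights")
```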
The BMW X Goes "Dark" and Integrates DeepSeek, Fully Unlocking a New Form of Intelligent Driving Pleasure
 Zhong Guo Jing Ji Wang· 2025-08-26 05:29
Group 1
- The BMW X family, particularly the new long-wheelbase BMW X3, showcases the brand's innovative spirit and its commitment to luxury and performance since the line's inception in 1999 [1][3]
- The introduction of the "曜夜套装" (Night Package) enhances the visual appeal of the BMW X1, long-wheelbase X3, and X5 models, emphasizing a sporty, personalized aesthetic in line with customer preferences for luxury and style [3][5]
- The new long-wheelbase BMW X3 keeps its price while adding a new "personalized matte pure gray" paint option, and its 2,975-millimeter wheelbase is comparable to that of the standard-wheelbase BMW X5 [5]

Group 2
- The aerodynamics of the new BMW X3 have been optimized, cutting the drag coefficient by 7% compared with the previous generation and improving driving efficiency [5]
- Upcoming enhancements include integrating DeepSeek functionality into the BMW Intelligent Personal Assistant, expanding the vehicle's digital capabilities [5]
- The 9th-generation BMW operating system will unlock new applications and features, providing a seamless digital experience, including lane-level navigation and 3D mapping for urban driving [5]
SiliconFlow (硅基流动) Launches DeepSeek-V3.1, Context Extended to 160K
 Di Yi Cai Jing· 2025-08-25 13:09
According to SiliconFlow, the SiliconFlow large-model service platform has launched DeepSeek-V3.1, the latest open-source model from the DeepSeek (深度求索) team, with support for an ultra-long 160K context. (Source: Di Yi Cai Jing) ...
SiliconFlow (硅基流动): DeepSeek-V3.1 Now Live, Context Raised to 160K
 Xin Lang Cai Jing· 2025-08-25 12:32
According to SiliconFlow, on August 25 the SiliconFlow large-model service platform launched DeepSeek-V3.1, the latest open-source model from the DeepSeek (深度求索) team. DeepSeek-V3.1 has 671B total parameters with 37B activated parameters and adopts a hybrid reasoning architecture (supporting both thinking and non-thinking modes). In addition, DeepSeek-V3.1 is among the first to support an ultra-long 160K context, letting developers efficiently handle long documents, multi-turn dialogue, coding, and agent scenarios. ...
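For reference, third-party platforms such as SiliconFlow typically expose such models through an OpenAI-compatible endpoint. The sketch below assumes SiliconFlow's documented base URL and a model identifier of the form "deepseek-ai/DeepSeek-V3.1"; both are assumptions to verify against the platform's current documentation.

```python
# pip install openai
import os

from openai import OpenAI

# Base URL and model name are assumptions based on SiliconFlow's OpenAI-compatible API;
# confirm both in the platform documentation before relying on them.
client = OpenAI(
    api_key=os.environ["SILICONFLOW_API_KEY"],
    base_url="https://api.siliconflow.cn/v1",
)

resp = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3.1",
    messages=[{"role": "user", "content": "In one sentence, what is a 160K context window useful for?"}],
    max_tokens=128,
)
print(resp.choices[0].message.content)
```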
How the Major Players View DeepSeek-V3
 2025-08-25 09:13
Summary of DeepSeek and the AI Chip Industry Conference Call

Industry and Company Overview
- The conference call focuses on the AI chip industry, specifically DeepSeek's new UE8M0 FP8 format and its implications for domestic AI chip development and training efficiency.

Key Points and Arguments

Introduction of the UE8M0 FP8 Format
- DeepSeek has defined the UE8M0 FP8 format to establish a new standard for domestic chips, aiming to reduce training memory usage by 20%-30% and improve training efficiency by 30%-40% [1][2]
- The new format is expected to guide the design of the next generation of domestic chips and may be extended into an FP8 protocol standard through OCP [1][2]

Training and Inference Efficiency
- The UE8M0 FP8 format reduces memory usage and computational overhead by splitting weight data into smaller blocks, improving training and inference efficiency while maintaining high precision [4]
- The FP8 data format is expected to significantly improve the training efficiency of domestic large models, helping close the gap with international leaders [6][7]

Current Challenges in Domestic AI Chips
- Domestic AI chips face challenges such as insufficient operator coverage (roughly 50%), gradient quantization errors, and immature tensor expansion [8][9]
- Full-scale application of these technologies is not expected until Q2 or Q3 of next year [8]

Future Developments and Market Impact
- Using FP8 for inference will lower costs and is expected to be deployed rapidly on domestic chips within the next six months to a year [8]
- However, no domestic manufacturer can yet independently complete training tasks, and significant technical hurdles remain [8][10]

Mixed Precision Strategy
- DeepSeek employs a mixed-precision strategy to balance performance and precision, retaining high precision for sensitive parameters while using the new UE8M0 FP8 format for less sensitive ones (a selection sketch follows this summary) [5]

Competitive Landscape
- The DeepSeek V3.1 release introduces hybrid inference capabilities and stronger agent abilities, with the dataset size increased significantly to 840 billion tokens, improving understanding of long texts and code [3][25]
- Compared with international models such as GPT-5 and Claude 4, DeepSeek V3.1 ranks among the top six globally, indicating strong competitiveness [26][27]

Multi-Modal Transition
- By Q1 2026, leading domestic AI models are expected to enter the multi-modal era, which will require high-performance computing resources [30]
- Integrating different modalities will require re-training and will increase demand for training equipment [30]

Long-Term Outlook
- Adoption of new data formats and standards is a gradual process, with significant changes expected over the next year, particularly in hardware support for FP8 [10][11]
- The industry is moving toward a more standardized approach to avoid fragmentation, with major manufacturers leading the effort [10]

Additional Important Insights
- The current strategy is to maximize the potential of existing hardware while preparing for the transition to new formats [19]
- New formats will require substantial adjustments to model training methods and a phased implementation approach [15][16]
- FP8 has limitations in high-precision fields such as finance and medicine, indicating a need for careful application [23][24]

This summary encapsulates the critical insights from the conference call, highlighting the advancements and challenges within the domestic AI chip industry and the strategic direction of DeepSeek.
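As a generic illustration of the mixed-precision idea described above (not DeepSeek's actual recipe), a quantization pass often decides which tensors stay in high precision by name or by sensitivity. The layer-name patterns and dtype labels below are assumptions made for the sketch.

```python
import re

# Layers commonly kept in higher precision in mixed-precision recipes; this list is
# an assumption for illustration, not DeepSeek's published configuration.
KEEP_HIGH_PRECISION = re.compile(r"(embed|lm_head|norm|router)", re.IGNORECASE)


def choose_dtype(param_name: str) -> str:
    """Assign 'bf16' to sensitivity-critical tensors and 'fp8_ue8m0' to the rest."""
    return "bf16" if KEEP_HIGH_PRECISION.search(param_name) else "fp8_ue8m0"


if __name__ == "__main__":
    for name in [
        "model.embed_tokens.weight",
        "model.layers.0.mlp.gate_proj.weight",
        "model.layers.0.input_layernorm.weight",
        "lm_head.weight",
    ]:
        print(f"{name:45s} -> {choose_dtype(name)}")
```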







