DeepSeek Launches the DeepSeekMath-V2 Model
Mei Ri Jing Ji Xin Wen· 2025-11-27 13:50
Core Viewpoint
- DeepSeek has launched a new mathematical reasoning model, DeepSeekMath-V2, which features a self-verifying training framework, marking a significant advancement in the development of reliable mathematical intelligence systems [1]

Group 1: Model Development
- DeepSeekMath-V2 is built on the foundation of DeepSeek-V3.2-Exp-Base and utilizes an LLM verifier to automatically review generated mathematical proofs (see the sketch after this list) [1]
- The model continuously optimizes its performance using high-difficulty samples [1]

Group 2: Performance Achievements
- The model has achieved gold-medal level in both the IMO 2025 and CMO 2024 competitions [1]
- In the Putnam 2024 competition, the model scored 118 out of 120 [1]

Group 3: Open Source Initiative
- The model's code and weights have been open-sourced and are available on the Hugging Face and GitHub platforms [1]
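The self-verifying framework described above pairs a proof generator with an LLM verifier in a feedback loop. Below is a minimal Python sketch of that generate-then-verify cycle; the `prover`, `verifier`, and `train_on` interfaces and the difficulty threshold are hypothetical stand-ins, not DeepSeek's published training code.

```python
# Hypothetical sketch of one self-verifying training round: generate proofs,
# have an LLM verifier review them automatically, and keep hard verified
# samples for the next optimization step. All interfaces are illustrative.

def self_verifying_round(problems, prover, verifier, difficulty_threshold=0.3):
    kept = []
    for problem in problems:
        proof = prover.generate(problem)          # candidate proof from the model
        review = verifier.review(problem, proof)  # automatic review by the LLM verifier
        # Keep proofs judged correct on problems the model rarely solves,
        # so subsequent training focuses on high-difficulty samples.
        if review.is_valid and review.solve_rate < difficulty_threshold:
            kept.append((problem, proof, review.feedback))
    prover.train_on(kept)                         # continue optimizing on hard samples
    return kept
```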
DeepSeek-OCR Achieves Optical Compression; Optical Computing Can "Lighten the Load" for Large Models
36Kr· 2025-11-27 08:49
Group 1
- The core idea of the article revolves around optical compression of context using visual tokens to address the computational challenges faced by large language models as context window sizes increase [2][3]
- DeepSeek's research demonstrates that visual compression can maintain high accuracy, achieving a compression rate of 10 times while retaining 96.5% precision [3][4]
- The DeepEncoder module is identified as the key engine for achieving optical compression, utilizing components such as the SAM module, convolutional blocks, and CLIP to compress roughly 1000 text tokens down to 100 visual tokens (see the sketch after this list) [5][7]

Group 2
- Optical computing is highlighted as a more suitable solution for context compression due to its ability to handle the information aggregation processes inherent in ViT and CNN structures more efficiently than traditional electronic chips [7][9]
- The advantages of optical computing include simplified computation processes and scalability, allowing for enhanced parallelism and dynamic programmability, which are crucial for long-text reasoning tasks [9][11]
- Future plans involve exploring algorithms based on human memory mechanisms and developing specialized hardware for context compression and AI tasks, aiming to connect optical computing with large models [13][15]

Group 3
- The article emphasizes the need for optical computing to overcome the limitations of traditional GPUs, particularly memory constraints and power density, as large models become more prevalent [15]
- The company aims to build a next-generation disruptive platform system for large-scale AI computing, providing comprehensive optical computing solutions across various scenarios [15]
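To make the 1000-token-to-100-token figure concrete, here is a toy Python sketch of that kind of visual compression pipeline: render a page, aggregate local patch features, downsample, and keep a fixed budget of visual tokens. The stage names loosely mirror the article's description of DeepEncoder (SAM-style local encoding, convolutional downsampling, CLIP-style global encoding), but the shapes and the simple pooling used here are illustrative assumptions, not the real architecture.

```python
# Toy sketch of compressing a rendered text page into a small budget of
# "visual tokens". Each stage is a crude stand-in (mean pooling) for the
# corresponding DeepEncoder component described in the article.

import numpy as np

def deepencoder_sketch(page_image: np.ndarray, n_visual_tokens: int = 100) -> np.ndarray:
    """Compress one rendered page (~1000 text tokens of content) into
    `n_visual_tokens` values, illustrating the token-budget arithmetic."""
    h, w, _ = page_image.shape
    # 1. SAM-style stage: dense local features over 16x16 patches.
    patches = page_image.reshape(h // 16, 16, w // 16, 16, 3).mean(axis=(1, 3, 4))
    # 2. Convolution-style stage: aggressive spatial downsampling (4x4 pooling).
    p = patches.shape
    pooled = patches.reshape(p[0] // 4, 4, p[1] // 4, 4).mean(axis=(1, 3))
    # 3. CLIP-style stage: flatten to a fixed budget of visual tokens.
    return pooled.flatten()[:n_visual_tokens]

page = np.random.rand(1024, 1024, 3)   # stand-in for a rendered text page
tokens = deepencoder_sketch(page)
print(len(tokens), "visual tokens vs ~1000 text tokens -> ~10x compression")
```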
Zero-Code Deployment! DeepSeek + ChatWiki for Building Enterprise-Exclusive Intelligent Customer Service
Sou Hu Cai Jing· 2025-11-27 02:51
Core Insights
- The article highlights the challenges faced by customer service teams, including high workload and inefficiencies in handling inquiries [2]
- It introduces DeepSeek and ChatWiki as a solution for building an efficient AI customer service system without the need for complex development [2]

Group 1: AI Customer Service Solution
- DeepSeek's strong semantic understanding captures customer intent accurately, while ChatWiki builds a private knowledge base using RAG technology, ensuring responses are both warm and professional (see the sketch after this list) [2]
- The entire process is zero-code, allowing deployment within one day and significantly lowering technical barriers and time costs for businesses [2]

Group 2: Integration and Setup
- ChatWiki is compatible with over 20 mainstream AI models, enabling easy integration for businesses without requiring specialized developers [3]
- The knowledge base can be built by uploading various document formats, with ChatWiki handling text cleaning and conversion automatically [4]

Group 3: AI Bot Creation
- After setting up the knowledge base, businesses can create a personalized AI bot by configuring its name and welcome message, linking it to the knowledge base for immediate deployment [6]
- DeepSeek extracts relevant information from the knowledge base to provide coherent responses, improving the quality of customer interactions [6]

Group 4: Multi-Channel Support
- The AI bot can be integrated across multiple platforms, including H5 links, company websites, and messaging apps, ensuring a consistent service experience for customers [8]
- An education platform reported a 100% response rate for nighttime inquiries and doubled conversion rates after integration, demonstrating the commercial value of the solution [8]

Group 5: Role Management
- ChatWiki offers detailed permission management, allowing administrators to assign roles and control access to knowledge base editing and bot configuration, enhancing data security and team collaboration [10]
- This feature supports complex organizational structures while ensuring the safety of core business data [10]
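The flow described above is a standard retrieval-augmented generation (RAG) loop: retrieve relevant passages from the private knowledge base, then let the model compose a grounded reply. The minimal sketch below assumes hypothetical `knowledge_base.search` and `deepseek_chat` helpers; it is not ChatWiki's or DeepSeek's actual API.

```python
# Minimal RAG-style sketch of the DeepSeek + ChatWiki flow described above.
# `knowledge_base.search` and `deepseek_chat` are hypothetical stand-ins;
# neither product's real interface is shown here.

def answer_customer(question: str, knowledge_base, deepseek_chat, top_k: int = 3) -> str:
    # 1. Retrieve the most relevant passages from the private knowledge base.
    passages = knowledge_base.search(question, top_k=top_k)
    context = "\n\n".join(p.text for p in passages)
    # 2. Let the model compose a reply grounded in the retrieved context.
    prompt = (
        "Answer the customer using only the knowledge base excerpts below.\n"
        f"Excerpts:\n{context}\n\nCustomer question: {question}"
    )
    return deepseek_chat(prompt)
```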
Inflection Point Signal Emerging? Domestic AI Welcomes Another DeepSeek Moment! Technical Breakthroughs Plus Earnings Validation: STAR Market AI ETF (589520) Climbs as Much as 3.6% Intraday!
Xin Lang Ji Jin· 2025-11-25 11:49
Core Insights
- AI concept stocks are performing actively, with the domestic AI industry chain-focused ETF (589520) rising 3.61% intraday and closing up 2.17% on November 25, on total trading volume of 35.94 million yuan, indicating a shift from a weak to a strong short-term trend [1][3]

Group 1: ETF Performance
- Over 80% of the 30 constituent stocks of the ETF closed in the green, with 40% of the stocks rising over 2%, led by Lingyun Technology with a gain of over 10% [3][4]
- The top-performing stocks include [4]:
  - Mikeling: 10.18% increase, total market value of 18.9 billion yuan, trading volume of 1.34 billion yuan
  - Haitai Ruisheng: 9.29% increase, total market value of 7.2 billion yuan, trading volume of 854 million yuan
  - Hengxuan Technology: 6.91% increase, total market value of 36.34 billion yuan, trading volume of 1.2014 billion yuan

Group 2: Market Dynamics
- The launch of Ant Group's AI assistant "Lingguang" has garnered significant attention, achieving over 2 million downloads within six days, reflecting a rapid acceleration in domestic AI applications [5]
- The AI computing power sector faced a downturn earlier this year due to concerns over low-cost models, but this has now become a pivotal point for domestic AI advancements, leading to a rebound in the market [5]

Group 3: Strategic Opportunities
- The current period is identified as a "golden window" for the domestic AI sector, driven by:
  1. Policy support from the new five-year plan emphasizing technological self-reliance [5]
  2. Strong earnings performance, with 20 of the 30 ETF constituent companies reporting profits and 22 showing year-on-year net profit growth [5]
  3. External pressures necessitating self-sufficiency in AI technology amid geopolitical tensions [5][7]

Group 4: Industry Focus
- The ETF and its associated funds are heavily invested in the domestic AI industry chain, with over 70% of the top ten holdings concentrated in semiconductor and AI-related sectors, indicating a strong offensive strategy [7]
Qianwen App Tops 10 Million Downloads in a Week, Surpassing DeepSeek as the Fastest-Growing AI App
Guan Cha Zhe Wang· 2025-11-24 05:17
Core Insights
- Alibaba's "Qianwen" project has officially launched, marking its entry into the consumer-facing AI market, and has quickly become the fastest-growing AI application in history, surpassing competitors like ChatGPT and DeepSeek [4][5][9]

Group 1: Market Performance
- Following the announcement of Qianwen, Alibaba's stock surged by 4.13% by midday [3]
- The Qianwen app reached the fourth position on the Apple App Store's free applications chart within a day of its public beta launch, causing server congestion due to high traffic [5][6]
- By November 19, just two days after its launch, Qianwen climbed to the third position on the App Store [6]

Group 2: Competitive Landscape
- Qianwen's download growth has significantly outpaced other popular AI applications, reaching over 10 million downloads faster than ChatGPT and DeepSeek [7][8]
- The Qwen model, which powers Qianwen, has become a leading open-source model globally, with over 600 million downloads, and is recognized for its superior performance compared to competitors like Llama and DeepSeek [9]

Group 3: Strategic Vision
- Alibaba views Qianwen as a critical component in the "AI era future battle," aiming to establish a consumer-facing AI entry point [10]
- Analysts suggest that Qianwen's initial success is just the beginning, with potential for further growth through subscription models and integration with Alibaba's other services [10]
- The app is positioned as an "Agentic AI" capable of understanding and executing complex tasks, indicating a shift from passive AI tools to proactive AI agents [11]
Two Government Departments Issue Draft Rules: DeepSeek, Kimi, Doubao and Others May Be Covered
Core Points
- The National Internet Information Office and the Ministry of Public Security released a draft regulation on personal information protection for large internet platforms, establishing criteria for identifying such platforms and their obligations for personal information protection [1][3]
- The draft regulation aligns with previous regulations and emphasizes the principle that greater capabilities entail greater responsibilities in digital economy regulation [1][3]

Group 1: Identification Criteria for Large Platforms
- Large platforms are identified based on having over 50 million registered users or over 10 million monthly active users, providing significant network services, and handling data that could impact national security and economic operations if compromised [5][6]
- Traditional internet platforms like Tencent, Alibaba, and ByteDance, as well as emerging AI companies and smart device manufacturers, are likely to fall under this regulation [3][6]

Group 2: Compliance and Reporting Requirements
- Large platforms must appoint a personal information protection officer and establish a dedicated team to manage personal information protection, including creating internal management systems and emergency response plans [9][10]
- The draft regulation requires platforms to publish annual social responsibility reports on personal information protection, addressing previous shortcomings in compliance and transparency [9][10]

Group 3: Independent Supervision Mechanism
- The draft regulation proposes the establishment of independent supervisory committees composed mainly of external members to oversee personal information protection compliance [12][13]
- These committees will have specific responsibilities, including monitoring compliance systems, evaluating the impact of personal information protection measures, and maintaining regular communication with users [13][14]
Deng Haiqing: DeepSeek Provides a Very Strong Foundation for China's Shift to Innovation-Driven Growth
Sou Hu Cai Jing· 2025-11-23 10:15
Group 1
- The core viewpoint is that the stock market in 2025 will experience a recovery in confidence similar to previous bull markets with significant gains, driven primarily by a shift towards innovation, particularly in AI [1][3]
- The current market is characterized as an idealistic market rather than a utilitarian one, indicating that investor confidence is not primarily based on tracking financial reports or order volumes [3]

Group 2
- The emergence of DeepSeek's results has marked a transition for China from a factor-driven growth model to an innovation-driven growth model, highlighting the potential for internationally competitive products [3]
- The upcoming bull market is referred to as a "mental bull market," emphasizing the importance of future industries and innovation as the main themes driving investor sentiment [3]
DeepSeek Brings a Sense of Urgency; Ant Group Launches "Lingguang" to Race onto the AGI Battlefield
Di Yi Cai Jing· 2025-11-21 10:40
Core Insights
- Ant Group is actively entering the AI assistant market with its newly launched multimodal AI assistant "Lingguang," which has already surpassed 500,000 downloads, indicating a strong strategic push in the AGI space [1][2][3]
- The excitement and urgency within Ant Group were significantly influenced by the early success of DeepSeek, prompting the company to rapidly assemble resources and make strategic decisions regarding AI development [2][3]
- Ant Group aims to create a national-level application in the AGI era, emphasizing the importance of exploring various strategic avenues rather than solely focusing on direct competition with existing products [3][6]

Company Strategy
- Ant Group has established a relatively independent AGI organization with over 200 members, focused on AI development since March [1]
- The company is prioritizing the enhancement of natural language interaction and reducing the barriers for users to engage with AI technologies through Lingguang [5][6]
- Ant Group believes that the commercial viability of AI applications will emerge naturally as user value and engagement increase, rather than being a primary focus at this stage [6]

Market Context
- The AI market in China is still in its early stages, with no product achieving over 100 million daily active users, despite significant investments from major internet companies [2][6]
- The current landscape is characterized by a lack of clear market leaders, and companies are exploring various opportunities to identify potential breakthroughs in AI applications [2][5]
- Ant Group's approach to AI development is not solely about competing with specific products but rather about understanding the evolving capabilities of models and user needs over time [6][7]
DeepSeek Quietly Open-Sources LPLB: Using Linear Programming to Solve MoE Load Imbalance
36Kr· 2025-11-20 23:53
Core Insights
- DeepSeek has launched a new code repository called LPLB on GitHub, which aims to address the bottlenecks of correctness and throughput in model training [1][4]
- The project currently has limited visibility, with fewer than 200 stars on GitHub, indicating a need for more attention [1]

Project Overview
- LPLB stands for Linear-Programming-Based Load Balancer, designed to optimize load balancing in machine learning models [3]
- The project is still in the early research phase, with performance improvements under evaluation [7]

Mechanism of LPLB
- LPLB implements dynamic load balancing through three main steps: dynamic reordering of experts, constructing replicas, and solving optimal token allocation for each batch (a toy formulation of this allocation step appears after this list) [4]
- The mechanism utilizes a built-in linear programming solver and NVIDIA's cuSolverDx and cuBLASDx libraries for efficient linear algebra operations [4][10]

Comparison with EPLB
- LPLB extends the capabilities of EPLB (Expert Parallel Load Balancer) by focusing on dynamic fluctuations in load, while EPLB primarily addresses static imbalances [8]

Key Features
- LPLB introduces redundant experts and edge capacity definitions to facilitate token redistribution and minimize load imbalance among experts [9]
- The communication optimization leverages NVLink and NVSHMEM to reduce overhead compared to traditional methods [10]

Limitations
- Current limitations include ignoring nonlinear computation costs and potential delays in solving optimization problems, particularly for small batch sizes [11][12]
- In extreme load imbalance scenarios, LPLB may not perform as well as EPLB due to its allocation strategy [12]

Typical Topologies
- LPLB allows for various topological configurations, such as Cube, Hypercube, and Torus, to define the distribution of expert replicas [13]

Conclusion
- The LPLB library aims to solve the "bottleneck effect" in large model training, where training speed is limited by the slowest GPU [14]
- It innovatively employs linear programming for real-time optimal allocation and utilizes NVSHMEM technology to overcome communication bottlenecks, making it a valuable resource for developers working on MoE architecture training acceleration [14]
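At its core, the per-batch allocation step described above is a small linear program: split each expert's token load across its replicas so that the most-loaded GPU is as light as possible. The sketch below solves that balancing problem with SciPy's `linprog` on made-up token counts and a made-up replica placement; LPLB itself formulates and solves this on the GPU with cuSolverDx/cuBLASDx, so treat this only as an illustration of the idea, not the library's implementation.

```python
# Toy min-max load balancing LP in the spirit of LPLB: route each expert's
# tokens across its replicas so the busiest GPU carries as little as possible.
# Token counts and replica placement below are invented for illustration.

import numpy as np
from scipy.optimize import linprog

tokens = {"e0": 700, "e1": 300, "e2": 500, "e3": 100}   # tokens per expert this batch
replicas = {"e0": [0, 1], "e1": [0], "e2": [1, 2], "e3": [2]}  # expert -> GPUs with a replica
n_gpus = 3

# Decision variables: one per (expert, gpu) replica, plus t = max GPU load.
var_index = [(e, g) for e, gpus in replicas.items() for g in gpus]
n_vars = len(var_index) + 1                              # last variable is t

c = np.zeros(n_vars)
c[-1] = 1.0                                              # objective: minimize t

# Equality constraints: each expert's tokens are fully assigned to its replicas.
A_eq = np.zeros((len(tokens), n_vars))
b_eq = np.zeros(len(tokens))
for row, (e, load) in enumerate(tokens.items()):
    for col, (e2, _) in enumerate(var_index):
        if e2 == e:
            A_eq[row, col] = 1.0
    b_eq[row] = load

# Inequality constraints: each GPU's assigned load minus t must be <= 0.
A_ub = np.zeros((n_gpus, n_vars))
for g in range(n_gpus):
    for col, (_, g2) in enumerate(var_index):
        if g2 == g:
            A_ub[g, col] = 1.0
    A_ub[g, -1] = -1.0
b_ub = np.zeros(n_gpus)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * n_vars)
assignment = dict(zip(var_index, np.round(res.x[:-1])))
print("max GPU load:", res.x[-1], "token routing:", assignment)
```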