DeepSeek
X @The Wall Street Journal
The Wall Street Journal· 2025-09-30 16:43
Chinese AI developer DeepSeek has released an experimental large language model that it says has much better training and reasoning, and which can be operated at a lower cost https://t.co/cIvUC7UsUJ ...
Trump fails to reach a deal to avert a shutdown, gold and silver power to fresh highs
Youtube· 2025-09-30 13:37
[Music] Welcome to Morning Brief Market Sunrise. I'm Raman Karamali live from Yahoo Finance Studios in London. It's Tuesday, 30th September. Coming up on the show, the vice president signals that a government shutdown looks like it will kick in from tomorrow. We'll look into whether China can threaten the US's dominance in the world of artificial intelligence. And I'll tell you all about a stock that surged over 1,700% in less than 24 hours. So, grab your coffee and let's own the morning. [Music] I think we'r ...
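AI Daily | Another $40 million-plus cashed out! Jensen Huang keeps trimming his Nvidia stake, is bullish on OpenAI and says it could become the next trillion-dollar giant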
美股研究社· 2025-09-30 12:06
Core Insights - The article discusses the rapid advancements in artificial intelligence (AI) technology and their implications for investment opportunities in AI-related companies and market trends [3]. Group 1: AI Model Developments - Zhipu has launched its latest model, GLM-4.6, which shows a 27% improvement in coding capabilities over its predecessor GLM-4.5 and excels in real programming tasks [5]. - DeepSeek introduced a "sparse attention" mechanism in its experimental AI model, DeepSeek-V3.2-Exp, aimed at enhancing training and inference efficiency in long contexts [5]. - Anthropic released its new AI model, Claude Sonnet 4.5, claiming it to be the "best coding model globally," with significant improvements in reliability and performance across various professional fields [6]. Group 2: Market Trends and Predictions - OpenAI has launched an "Instant Checkout" feature in ChatGPT, allowing users to purchase items directly through the platform, initially supporting single-item purchases [7]. - NVIDIA's CEO Jensen Huang sold 225,000 shares of NVIDIA stock for over $40 million, while expressing confidence in AI's future, particularly in OpenAI's potential to become a trillion-dollar company [7][8]. - Huang predicts that OpenAI could achieve unprecedented growth, similar to other tech giants like Meta and Google, by offering both consumer and enterprise services [8]. Group 3: Copyright and Content Usage - OpenAI's Sora AI video generator will use copyrighted content by default, with an option for studios to opt out, indicating a shift in content usage policies [12]. - The company has been in discussions with talent agencies and studios regarding the opt-out mechanism, under which the copyrighted characters of those who opt out do not appear in its AI tools [13].
DeepSeek-V3.2 goes live on the National Supercomputing Internet; developers can download it for free
Sou Hu Cai Jing· 2025-09-30 11:58
Core Insights - DeepSeek has launched the experimental version DeepSeek-V3.2-Exp, which introduces the DeepSeekSparseAttention mechanism to enhance training and inference efficiency for long texts [1] - The AI community now hosts over 700 high-quality open-source models, providing developers with various services including API calls and distributed training [2] Group 1 - DeepSeek-V3.2-Exp is available for free download in the national supercomputing internet AI community, allowing enterprises and developers to quickly develop applications [1] - The new model is a step towards a next-generation architecture, building on the previous version V3.1-Terminus [1] - DeepSeekSparseAttention achieves significant improvements in long text training and inference efficiency with minimal impact on model output [1] Group 2 - The supercomputing internet AI community features a collection of over 700 models, including various versions of the DeepSeek series [2] - Developers can utilize the community for a range of services, including online inference dialogue and model fine-tuning [2] - The community supports a comprehensive MaaS (Model as a Service) offering for developers [2]
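Internet Industry Investment Strategy for October 2025: Valuation gap between Hong Kong- and US-listed giants narrows rapidly as domestic giants step up AI investment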
Guoxin Securities· 2025-09-30 11:32
Market Overview - The Hang Seng Technology Index rose by 9.2% in September, outperforming the Nasdaq Index which increased by 4.8% [11][12] - Key companies in the internet sector, such as Baidu, Alibaba, and Meituan, showed significant stock performance, with Baidu and Alibaba gaining 44.4% and 43.9% respectively, outperforming the Hang Seng Technology Index by 35.2 percentage points and 34.7 percentage points [14] AI Developments - Major advancements in artificial intelligence were reported, including Google's release of the Nano Banana Prompt template and the AP2 protocol, which enhances AI-driven payment systems [19][20] - OpenAI announced the opening of five new data centers in the U.S. as part of a $400 billion investment to enhance its AI capabilities [23] - Meta launched the Code World Model (CWM) and the AI video generation platform Vibes, showcasing significant improvements in AI-driven content creation [25][26] Industry Dynamics - The gaming sector saw the approval of new domestic game licenses, including titles from MiHoYo and Tencent, indicating a recovery in the gaming market [46][47] - In fintech, payment institutions reported a 6% year-on-year increase in reserve funds, reflecting growth in the financial technology sector [48] - The short video industry is facing increased scrutiny, with the National Copyright Administration focusing on combating copyright infringement [51] E-commerce Trends - Douyin's e-commerce platform reported a 49% year-on-year growth in GMV, highlighting the rapid expansion of social commerce [56] - Alibaba's Lazada has integrated with Tmall, allowing brands to easily enter Southeast Asian markets, indicating a strategic move towards regional expansion [57] Company-Specific Insights - Tencent, Alibaba, and Kuaishou are identified as key players aggressively investing in AI, with expectations of short-term profit impacts but long-term stock price growth driven by AI advancements [2] - Baidu's AI search platform has regained the top position in monthly active users in China, reflecting its strong market presence [38] - Kuaishou launched its AI digital human feature, enabling users to create videos with AI-generated characters, further enhancing its content creation capabilities [40]
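Late-night bombshell! Claude Sonnet 4.5 launches with 30 hours of autonomous coding; user test: a single call refactored a codebase and added 3,000 lines of code, but the result failed to run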
AI科技大本营· 2025-09-30 10:24
Core Viewpoint - The article discusses the release of Claude Sonnet 4.5 by Anthropic, highlighting its advancements in coding capabilities and safety features, positioning it as a leading AI model in the market [1][3][10]. Group 1: Model Performance - Claude Sonnet 4.5 has shown significant improvements in coding tasks, achieving over 30 hours of sustained focus in complex multi-step tasks, compared to approximately 7 hours for Opus 4 [3]. - In the OSWorld evaluation, Sonnet 4.5 scored 61.4%, a notable increase from Sonnet 4's 42.2% [6]. - The model outperformed competitors like GPT-5 and Gemini 2.5 Pro in various tests, including Agentic coding and terminal coding [7]. Group 2: Safety and Alignment - Claude Sonnet 4.5 is touted as the most "aligned" model to date, having undergone extensive safety training to mitigate risks associated with AI-generated code [10]. - The model received a low score in automated behavior audits, indicating a lower risk of misalignment behaviors such as deception and power-seeking [11]. - It adheres to AI Safety Level 3 (ASL-3) standards, incorporating classifiers to filter dangerous inputs and outputs, particularly in sensitive areas like CBRN [13]. Group 3: Developer Tools and Features - Anthropic has introduced several updates to Claude Code, including a native VS Code plugin for real-time code modification tracking [15]. - The new checkpoint feature allows developers to automatically save code states before modifications, enabling easy rollback to previous versions [21]. - The Claude Agent SDK has been launched, allowing developers to create custom agent experiences and manage long tasks effectively [19]. Group 4: Market Context and Competition - The article notes a competitive landscape with other AI models like DeepSeek V3.2 also making significant advancements, including a 50% reduction in API costs [36]. 
- There is an ongoing trend of rapid innovation in AI tools, with companies like OpenAI planning new product releases to stay competitive [34].
Huawei Ascend and Cambricon announce support for DeepSeek's latest model
21 Shi Ji Jing Ji Bao Dao· 2025-09-30 10:19
Core Insights - DeepSeek officially launched the DeepSeek-V3.2-Exp model on September 29, introducing the self-developed DeepSeek Sparse Attention (DSA) mechanism, which optimizes training and inference efficiency for long texts [1][7] - The release of the new model has led to a significant reduction in service costs, with DeepSeek API prices dropping by over 50% [2][10] - The open-sourcing of the TileLang version of its operators has garnered considerable attention within the industry [3] Technical Innovations - The DSA mechanism is an optimization technique for the Transformer architecture, addressing the computational complexity of traditional dense attention mechanisms, which grows quadratically with text length [6][7] - The V3.2-Exp model has achieved substantial improvements in training and inference efficiency for long texts while maintaining performance levels comparable to the previous V3.1-Terminus model [7] Market Impact - DeepSeek has made the V3.2-Exp model fully open-source on platforms like HuggingFace and ModelScope, with related research papers also published [5] - The collaboration with domestic hardware providers such as Huawei, Cambricon, and Hygon (Haiguang) demonstrates the synergy between AI software and hardware ecosystems in China [11][12] - The adoption of TileLang, a programming language designed to simplify GPU operator development, is expected to enhance the efficiency of AI operator development significantly [12]
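To make the scaling argument concrete: dense attention scores every query token against every key token, so the number of scores grows with the square of the text length, whereas a sparse scheme that caps how many keys each query attends to grows only linearly. The sketch below is a rough illustration, not DeepSeek's DSA implementation. The function names and the cap of 2,048 keys per query are assumptions chosen for the example, and the count ignores whatever work is needed to pick which keys to keep.

```python
def dense_attention_pairs(seq_len: int) -> int:
    """Query-key scores computed when every token attends to every token."""
    return seq_len * seq_len


def sparse_attention_pairs(seq_len: int, keys_per_query: int) -> int:
    """Query-key scores when each query attends to at most `keys_per_query` keys."""
    return seq_len * min(seq_len, keys_per_query)


if __name__ == "__main__":
    KEYS_PER_QUERY = 2_048  # hypothetical cap, chosen only for illustration
    for n in (1_000, 10_000, 100_000):
        dense = dense_attention_pairs(n)
        sparse = sparse_attention_pairs(n, KEYS_PER_QUERY)
        print(f"{n:>7} tokens: dense={dense:.2e}  sparse={sparse:.2e}  "
              f"ratio={dense / sparse:.1f}x")
```

For short inputs the two counts coincide because the cap is never reached. The gap only opens up once the text is longer than the cap, which matches the articles' emphasis on long-text efficiency.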
More good news for domestic chips! Zhipu releases a new-generation large model fully adapted to Cambricon and Moore Threads chips!
Zheng Quan Shi Bao· 2025-09-30 09:24
Core Insights - The release of the new generation large model GLM-4.6 by the domestic unicorn company Zhipu marks a significant advancement in programming capabilities, surpassing the latest model DeepSeek-V3.2-Exp and aligning with global leader Claude Sonnet 4 in various benchmarks [2][3][5] Model Performance - GLM-4.6 has achieved substantial improvements in core capabilities such as Agentic Coding, long context processing, reasoning ability, information retrieval, text generation, and intelligent agent applications [3][4] - The context window has been increased from 128K to 200K, allowing for better handling of longer code and agent tasks [3] - The model's reasoning ability has been enhanced, supporting tool invocation during reasoning processes [3] - In practical programming tasks, GLM-4.6 outperformed Claude Sonnet 4 in 74 real-world scenarios [3] Token Efficiency and User Experience - GLM-4.6 has improved token efficiency, consuming over 30% fewer tokens compared to GLM-4.5 for similar tasks [4] - The model enhances the usability of presentations and the aesthetic quality of front-end code, improving layout design [4] Open Source and Ecosystem Development - GLM-4.6 is set to be open-sourced on platforms like Hugging Face and ModelScope under a permissive MIT license, positioning it as one of the strongest general-purpose models in the global open-source ecosystem [5] - The model has been adapted for use on domestic AI chips from companies like Cambricon and Moore Threads, facilitating a collaborative ecosystem between domestic large models and chips [5][6] Industry Collaboration and Future Prospects - The rapid adaptation of GLM-4.6 by domestic chip manufacturers indicates a deepening collaboration within China's AI industry, moving towards a unified ecosystem of software and hardware [6] - Zhipu has initiated A-share listing guidance, aiming to become the first publicly listed company focused on domestic AI large models, signaling a shift from technological competition to commercialization and capital operation [6]
Big news before National Day! DeepSeek's new model triples in speed and API prices are halved! Netizens joke: so much for the holiday
程序员的那些事· 2025-09-30 08:45
Core Viewpoint - DeepSeek has released its experimental version DeepSeek-V3.2-Exp, which significantly improves long text training and inference efficiency while maintaining output quality compared to its predecessor V3.1-Terminus [5][6]. Group 1: Model Performance - DeepSeek-V3.2-Exp introduces DeepSeek Sparse Attention (DSA), achieving a 2-3 times increase in long text inference speed and a 30%-40% reduction in memory usage, along with a 50% improvement in training efficiency [5]. - In benchmark tests, DeepSeek-V3.2-Exp performs comparably to V3.1-Terminus, with scores of 85.0 in MMLU-Pro and a slight improvement in AIME 2025, scoring 89.3 compared to 88.4 [5][6]. Group 2: Pricing Adjustments - Due to the reduced service costs associated with the new model, DeepSeek has lowered its API pricing by over 50%, with input prices dropping from 0.5 yuan to 0.2 yuan per million tokens for cache hits, and from 4 yuan to 2 yuan for cache misses. Output prices have decreased from 12 yuan to 3 yuan per million tokens [7].
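As a rough worked example of what the price change means in practice, the sketch below applies the per-million-token prices quoted above to a made-up workload. The `PriceTable` class, the `cost_yuan` helper, and the token counts are all hypothetical, introduced only for illustration. Only the six prices come from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceTable:
    """Prices in yuan per million tokens, as quoted in the article."""
    input_cache_hit: float
    input_cache_miss: float
    output: float


OLD_PRICES = PriceTable(input_cache_hit=0.5, input_cache_miss=4.0, output=12.0)
NEW_PRICES = PriceTable(input_cache_hit=0.2, input_cache_miss=2.0, output=3.0)


def cost_yuan(prices: PriceTable, hit_tokens: int, miss_tokens: int,
              output_tokens: int) -> float:
    """Total cost in yuan for the given token counts under one price table."""
    total = (hit_tokens * prices.input_cache_hit
             + miss_tokens * prices.input_cache_miss
             + output_tokens * prices.output)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M cached input, 1M uncached input, 0.5M output tokens.
    workload = dict(hit_tokens=2_000_000, miss_tokens=1_000_000, output_tokens=500_000)
    old = cost_yuan(OLD_PRICES, **workload)
    new = cost_yuan(NEW_PRICES, **workload)
    print(f"old: {old:.2f} yuan, new: {new:.2f} yuan, saving: {1 - new / old:.1%}")
```

For this particular mix the bill falls from 11.0 yuan to 3.9 yuan, roughly 65% lower. The exact saving depends on the split between cached input, uncached input, and output, since the three prices were cut by different proportions.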
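Late-night bombshell: Claude Sonnet 4.5 launches with 30 hours of autonomous coding; user test: a single call refactored a codebase and added 3,000 lines of code, but the result failed to run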
36Kr· 2025-09-30 08:43
Core Insights - Anthropic has launched the Claude Sonnet 4.5, claiming it to be the "best coding model in the world" with significant improvements over its predecessor, Opus 4 [1][2]. Performance Enhancements - Claude Sonnet 4.5 can autonomously run for over 30 hours on complex multi-step tasks, a substantial increase from the 7 hours of Opus 4 [2]. - In the OSWorld evaluation, Sonnet 4.5 achieved a score of 61.4%, up from 42.2% of Sonnet 4, indicating a marked improvement in computer operation capabilities [4]. - The model outperformed competitors like GPT-5 and Gemini 2.5 Pro in various tests, including Agentic Coding and Agentic Tool Use [6][7]. Safety and Alignment - Claude Sonnet 4.5 is touted as the most "aligned" model to date, having undergone extensive safety training to mitigate issues like "hallucination" and "deception" [9][10]. - It has received an AI Safety Level 3 (ASL-3) rating, equipped with protective measures against dangerous inputs and outputs, particularly in sensitive areas like CBRN [12]. Developer Tools and Features - The update includes a native VS Code plugin for Claude Code, allowing real-time code modification tracking and inline diffs [13]. - A new checkpoint feature enables developers to save code states automatically, facilitating easier exploration and iteration during complex tasks [18]. - Claude API has been enhanced with context editing and memory tools, enabling the handling of longer and more complex tasks [20]. Market Response and Competition - Developers have expressed surprise at the capabilities of Claude Sonnet 4.5, with reports of it autonomously generating complete projects [21][22]. - The competitive landscape is intensifying, with other companies like DeepSeek also releasing new models that significantly reduce inference costs [29][32].