DeepSeek
DeepSeek Open-Sources TileLang and CUDA Operators: A Key Attempt at Domestic Substitution in the AI Foundation Layer
小熊跑的快· 2025-09-30 01:11
Core Viewpoint - DeepSeek's release of its operators in both TileLang and CUDA versions is a significant step toward "independence and control" in AI foundational technology, particularly in GPU operator development. It addresses technical autonomy, domestic hardware compatibility, ecosystem collaboration, and innovation efficiency [2][11].

Group 1: Breaking CUDA Monopoly
- CUDA is a closed-source platform controlled by NVIDIA. Its dominance exposes domestic developers to technological dependency and limits their ability to customize operators for new model research [2][3].
- Domestic GPUs are improving in computational power but still face high migration costs, because they lack CUDA-compatible operator libraries and development tools [3][5].

Group 2: Lowering Barriers for Domestic Hardware
- The TileLang version that DeepSeek open-sourced lets developers quickly validate operator logic without relying on CUDA, reducing dependency on NVIDIA [4][6].
- The dual-version approach gives domestic platforms a precision baseline against which to verify their own operator implementations, lowering debugging costs [4][6].

Group 3: Activating Open Source Community Collaboration
- The success of domestic alternatives depends on ecosystem collaboration. DeepSeek's open-source initiative encourages the community to take part in developing new operators [7][8].
- Researchers can quickly develop and share new operator prototypes in TileLang, which domestic hardware manufacturers can then adapt [8].

Group 4: Accelerating Domestic Research Pathways
- Reliance on CUDA and its tools can hinder innovation in cutting-edge fields such as large models and multi-modal research, creating an "optimization black box" [9][10].
- DeepSeek's dual-version operators give domestic teams a way to innovate without the constraints of CUDA compatibility and licensing issues [10][11].

Group 5: From Single Point Replacement to Ecological Breakthrough
- DeepSeek's actions mark a shift from passive following to active construction of the domestic AI foundational technology stack, addressing the high barriers, long cycles, and adaptation difficulties of GPU operator development [11].
- Using open source to break monopolies, abstraction to hide complexity, and collaboration to build the ecosystem may become a key paradigm for domestic alternatives in AI foundational technology [11].
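The "precision baseline" idea in Group 2 can be illustrated with a small check. This is a minimal sketch, not DeepSeek's verification procedure: `softmax_reference` stands in for a trusted operator version, `softmax_candidate` stands in for a port to another platform, and `matches_baseline` with its tolerance values is a hypothetical helper chosen for illustration.

```python
import math


def softmax_reference(xs):
    """Trusted baseline (assumption): numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_candidate(xs):
    """Port under test (assumption): naive softmax without max-subtraction."""
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matches_baseline(candidate, reference, inputs, rel_tol=1e-6, abs_tol=1e-9):
    """Return True if both operators agree elementwise within the tolerances."""
    got = candidate(inputs)
    want = reference(inputs)
    return len(got) == len(want) and all(
        math.isclose(g, w, rel_tol=rel_tol, abs_tol=abs_tol)
        for g, w in zip(got, want)
    )


if __name__ == "__main__":
    print(matches_baseline(softmax_candidate, softmax_reference, [0.1, 1.5, -2.0]))
```

In practice the tolerances would depend on the data type and the operator; the point is only that a shared reference turns "does my port work?" into a mechanical comparison.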
Sept. 30 International Morning Brief | Spot Gold Tops $3,840 to Set a New High; Key US Economic Data May Be Delayed
Sou Hu Cai Jing· 2025-09-30 01:09
Market Review
- Spot gold prices have surpassed $3840 per ounce, reaching a new high [6]
- On September 29, major US stock indices posted slight gains, with the Dow Jones up 0.15% at 46316.07 points, S&P 500 up 0.26% at 6661.21 points, and Nasdaq up 0.48% at 22591.15 points [6]
- European stock indices also saw minor increases, with Germany's DAX up 0.02% at 23745.06 points, France's CAC40 up 0.13% at 7880.87 points, and the UK's FTSE 100 up 0.16% at 9299.84 points [6]

International Macro
- US President Trump met with congressional leaders on September 29 to discuss avoiding a government shutdown, with significant disagreements noted by Senate Democratic leader Chuck Schumer [7]
- The US federal government is set to run out of funding at midnight on September 30, risking a government shutdown if no agreement is reached on funding legislation [7]
- The Bureau of Labor Statistics (BLS) has announced that it will halt data collection and not release planned reports, including the monthly non-farm payroll report, if funding is interrupted [7]

Corporate News
- On September 29, the DeepSeek-V3.2-Exp model was officially released and open-sourced on the Hugging Face platform, introducing a sparse attention mechanism for more efficient training and inference on long texts [8]
- OpenAI plans to release the new Sora 2 video generator as a standalone application, which may generate videos containing copyrighted content unless rights holders opt out [8]

Institutional Views
- Goldman Sachs analysts predict that global stock markets are likely to continue rising until the end of the year, supported by strong US economic performance and favorable stock valuations [9]
- The team led by Christian Mueller-Glissmann has upgraded its global stock allocation rating to "overweight" for the next three months, suggesting that stocks typically perform well in the later stages of an economic slowdown when policy support is strong [9]
DeepSeek Suddenly Embraces a Domestic GPU Language! TileLang Targets CUDA and Replaces Triton; Huawei Ascend Announces Day-0 Support
量子位· 2025-09-30 00:57
Core Viewpoint - The article highlights the significance of TileLang, a domain-specific language for GPU kernel development, which has been adopted by DeepSeek in its v3.2 update, showcasing its performance advantages over traditional methods like Flash Attention 2 [1][6][26].

Group 1: TileLang Overview
- TileLang is designed to simplify the development of high-performance GPU/CPU kernels, comparable to NVIDIA's CUDA, and is recommended by DeepSeek for experiments due to its debugging and rapid iteration advantages [6][10].
- The language allows developers to write efficient code in significantly fewer lines while achieving performance parity with existing implementations [5][8].
- TileLang's development is led by a team from Peking University, including key figures such as Wang Lei and Dong Yuqi [15][19].

Group 2: DeepSeek's Adoption of TileLang
- DeepSeek's choice to use TileLang was first showcased at the Beijing Zhiyuan Conference in June, where its potential for faster operator implementation was discussed [10][11].
- The integration of TileLang has been recognized by industry leaders, including Huawei, which announced support for the language [7][4].
- DeepSeek's v3.2 release demonstrates that TileLang can effectively be used for model training, validating its capabilities in real-world applications [34][26].

Group 3: Performance and Technical Aspects
- TileLang provides three programming interfaces catering to different developer expertise levels, from beginners to performance-focused experts [20][21][23].
- The language's architecture decouples the scheduling space from the data flow, enabling more efficient optimization by the compiler [19].
- DeepSeek's implementation of TileLang has resulted in significant performance improvements, with claims of a 30% speed increase over traditional methods [5][27].
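To make the "tile" idea behind such languages concrete, here is a minimal sketch in plain Python. It is not TileLang syntax and does not use any TileLang API; `matmul_tiled` and its `tile` parameter are hypothetical names used only to show how a matrix multiply decomposes into block-sized pieces, which is the unit of work a tile-level kernel language lets a developer describe while the compiler handles scheduling.

```python
def matmul_tiled(a, b, tile=2):
    """Multiply a (n x k) by b (k x m) one tile-sized block at a time."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Outer loops walk over tiles; on a GPU each (i0, j0) output tile
    # would typically map to one thread block.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                # Inner loops compute one tile's partial products.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(matmul_tiled(a, b))
```

The result is identical to an untiled multiply; only the iteration order changes, which is what makes tile size a tunable scheduling decision rather than part of the algorithm.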
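Sept. 30 Investment Morning Brief | Lingyi iTech Plans H-Share Issuance and Listing on the Hong Kong Stock Exchange; Yinglian Co. Expects Net Profit for the First Three Quarters to Rise 1531.13% to 1672.97% Year on Year; Two New Stocks List Today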
Xin Lang Cai Jing· 2025-09-30 00:43
Market Overview
- On September 29, 2025, all three major A-share indices rose, with the Shanghai Composite Index closing at 3862.53 points, up 0.9% [1]
- The Shenzhen Component Index closed at 13479.43 points, up 2.05%, and the ChiNext Index closed at 3238.01 points, up 2.74% [1]
- The total trading volume in the Shanghai and Shenzhen markets was 2.16 trillion yuan, an increase of 100 billion yuan from the previous trading day [1]
- Hong Kong stocks also gained, with the Hang Seng Index rising 1.89% to 26622.88 points and a total trading volume of 3090.96 million HKD [1]
- In the US market, all three major indices closed higher, with the Dow Jones up 0.15% to 46316.07 points and the Nasdaq Composite up 0.48% to 22591.15 points [1]

New IPOs
- Two new stocks list on September 30, 2025, and there are no new stock subscriptions [1]
- Yunhan Xincheng, listing on the ChiNext under stock code 301563, has an issue price of 27 yuan per share and a price-to-earnings ratio of 20.91 [1]
- The company focuses on electronic component distribution and industrial internet integration, providing a full range of services from product selection to technical support [1]

Policy Developments
- The National Development and Reform Commission (NDRC) is actively promoting a new type of policy financial tool with a scale of 500 billion yuan to support effective investment in specific projects [3]
- The NDRC aims to ensure that funds from this tool are allocated to projects quickly to promote stable economic development [3]

Industry Licensing
- The Ministry of Industry and Information Technology has issued a satellite mobile communication business license to China Mobile, allowing it to offer satellite communication services [4]
- This follows similar licensing for China Telecom and China Unicom, enhancing emergency communication and services in remote areas [4]

Smart Technology Promotion
- The NDRC is working to expand the market for smart terminals and intelligent agents by focusing on policy guidance, technological innovation, and market demand [6]
- The initiative aims to support the application of artificial intelligence in key sectors such as education, healthcare, and transportation, promoting new products and applications [6]
A-Share Pre-Market Briefing | NDRC Speaks! On the 500 Billion Yuan New Policy-Based Financial Tool and Macro Policy
智通财经网· 2025-09-30 00:40
Macro
- The National Development and Reform Commission (NDRC) is promoting new policy financial tools with a total scale of 500 billion yuan, aimed at supplementing project capital [1]
- The NDRC will continue to implement macro policies to support effective investment [1]

Company
- DeepSeek officially released and open-sourced DeepSeek-V3.2-Exp, which analysts say has significantly improved usability [2]
- Cambricon has achieved compatibility with DeepSeek-V3.2-Exp, indicating deep collaboration among leading companies in China's AI industry [2]

Industry
- Gold prices reached a new historical high, with COMEX gold futures rising 1.42% to $3862.9 per ounce. Analysts predict further long-term price increases, supported by factors such as potential Federal Reserve rate cuts [3]
- The Ministry of Industry and Information Technology and five other departments released a plan for the mechanical industry, targeting average annual revenue growth of about 3.5% over 2025-2026 and revenue exceeding 10 trillion yuan [4]
- The National Social Security Fund reported an investment return rate of 8.1% for 2024, with total assets reaching 33,224.62 billion yuan [5]
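China Can't Make AI Chips? Jensen Huang: Only "Nanoseconds" Behind the US; DeepSeek Makes a Big Move; Xiaomi Denies Cutting Orders; OPPO to Build a Gimbal to Rival DJI | Bang Morning News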
创业邦· 2025-09-30 00:09
Group 1
- DeepSeek has released version V3.2-Exp of its model, significantly reducing service costs, with API prices dropping by over 50% [2][29]
- The new pricing structure for the DeepSeek API includes costs of 0.5 yuan for cache hits and 4 yuan for cache misses, effective from September 29, 2025 [3]
- Jaguar Land Rover is set to resume production after a month-long halt caused by a cyber attack, with operations restarting gradually [5]

Group 2
- Apple CEO Tim Cook confirmed his personal investments in cryptocurrencies such as Bitcoin and Ethereum but stated that Apple will not accept crypto payments for products [5]
- Xiaomi's public relations manager said there are no plans to cut orders for the Xiaomi 17 series, which is expected to see an increase in overall product orders [12]
- The total import and export value of automotive goods in August 2025 was reported at $25.81 billion, a month-on-month increase of 3.3% [30]

Group 3
- AstraZeneca plans to list its shares on the New York Stock Exchange while maintaining its headquarters in the UK [22]
- Linghou Robotics has completed over 100 million yuan in Series A financing, focusing on core components for industrial automation and general robotics [22]
- The new ZEEKR 9X luxury SUV has been launched with a starting price of 465,900 yuan, featuring advanced electric and AI technologies [23]
Sept. 30 Breakfast | DeepSeek Releases New Model; OpenAI to Release New Version of Sora
Xuan Gu Bao· 2025-09-30 00:01
Market Overview
- US stocks gained, with the Dow Jones up 0.15%, Nasdaq up 0.48%, and S&P 500 up 0.26% [1]
- Notable stock movements include Nvidia rising 2.05%, Amazon up 1.09%, and Tesla and Microsoft increasing by up to 0.64% [1]

Storage Sector Performance
- US storage stocks rose sharply, with SanDisk up nearly 17%, Western Digital up over 9%, Seagate Technology up over 5%, and Micron Technology up over 4% [2]

Labor Market Data
- The US Labor Department announced that the Bureau of Labor Statistics will pause operations during a government shutdown, affecting the collection and release of non-farm employment data [3]

Trade Policy
- US President Trump proposed a 100% tariff on all films produced outside the US [4]

Gold Reserves
- The value of US gold reserves has reached $1 trillion, exceeding the official book value by over 90 times [5]

AI Developments
- OpenAI announced that its third annual developer conference will be held on October 6 in San Francisco. It expects 1,500 developers to attend and will unveil a new version of the Sora video generation model [6]
- Anthropic launched its latest AI model, Claude Sonnet 4.5, claiming it to be the "best coding model globally" [7]

Commodity Market Trends
- Spot gold rose 1.9% to surpass $3,820, a historical high, while silver increased over 1.7%, briefly breaking the $47 mark [7]

Securities Strategy Insights
- Pacific Securities indicated that trading volume in the TMT sector is at an extreme, suggesting that further investment in tech stocks may not be cost-effective [9]
- The report recommends reallocating toward high-dividend, anti-involution, and commodity resource sectors, as tech stocks may face a temporary slowdown [9]

Industry Developments
- DeepSeek announced a significant update to its services, reducing API costs by over 50%, which may boost the development of custom agents in AI applications [10]
- The Ministry of Industry and Information Technology aims for the mechanical industry to achieve average annual revenue growth of around 3.5% from 2025 to 2026, targeting revenue above 10 trillion yuan [11]
- Tesla and Apple are exploring the use of glass substrates to enhance semiconductor chip and data center performance, indicating potential growth in the glass substrate market [12]

Superconducting Technology
- A new world record was set when a fully superconducting magnet achieved a steady-state magnetic field of 35.1 Tesla, which could drive advances in various high-tech applications [14]

Corporate Announcements
- Companies such as Gelaun Electronics and Seres have made significant acquisitions, indicating active M&A activity in the market [15][18]
- Yinglian Co. expects a net profit of 34.5 million to 37.5 million yuan for the first three quarters, a substantial year-on-year increase [16]
Xinhua Finance Morning Report: September 30
Xin Hua Cai Jing· 2025-09-29 23:50
Group 1
- The Ministry of Commerce of China expressed strong opposition to the U.S. Department of Commerce's new export control rules, which extend sanctions to subsidiaries in which companies on the U.S. "Entity List" hold more than 50% ownership [1][8]
- The National Development and Reform Commission (NDRC) announced support for various enterprises, including private companies, to participate deeply in the "Artificial Intelligence +" initiative, with plans to issue "AI vouchers" that subsidize companies' use of computing power services [1][8]
- The Ministry of Industry and Information Technology, along with other departments, released a "Mechanical Industry Stabilization Growth Work Plan (2025-2026)" focusing on expanding effective demand through domestic and international efforts [1][8]

Group 2
- The international spot gold price surpassed $3,800 per ounce for the first time, marking a year-to-date increase of over 40% [3][8]
- Huayou Cobalt signed a significant sales contract with LG Energy Solution, agreeing to supply approximately 76,000 tons of ternary precursor products from 2026 to 2030 [6][8]
- Seres announced that it has completed payment for acquiring a 10% stake in a subsidiary held by Huawei, and proposed a cash dividend of 3.10 yuan per 10 shares for the first half of 2025 [6][8]
DeepSeek's Latest Model Goes Live; New Attention Mechanism Builds on Peking University's ACL Best Paper
36Kr· 2025-09-29 23:39
Core Insights - DeepSeek has launched its latest experimental model, DeepSeek-V3.2-Exp, featuring a new attention mechanism called DeepSeek Sparse Attention (DSA), which improves training and inference efficiency while reducing API costs by over 50% [1][19].

Model Features
- The V3.2-Exp model builds on DeepSeek-V3.1-Terminus and introduces DSA, achieving faster and more efficient training and inference for long contexts [3][5].
- DSA is the first key technology branded under "DeepSeek" and is an improvement over the Native Sparse Attention (NSA) from a previous collaboration with Peking University [3][5].
- The DSA mechanism lets the model attend to a small subset of important tokens rather than all tokens, reducing computational complexity from O(L²) to O(Lk), where L is the sequence length, k is the number of selected tokens, and k is much smaller than L [8][10].

Performance Evaluation
- Evaluation results indicate that DeepSeek-V3.2-Exp performs comparably to its predecessor, with no significant decline on either short or long text tasks [14][15].
- Specific benchmark results show that some metrics decreased slightly while others improved, indicating balanced performance across tasks [15].

Cost Efficiency
- The introduction of DSA has led to substantial reductions in operational costs, with the API price lowered by over 50% for developers [19].
- The model's deployment has demonstrated significant end-to-end acceleration and cost savings in inference [18].

Future Implications
- Although still an experimental model, DeepSeek-V3.2-Exp presents a promising engineering pathway for overcoming long text processing challenges without sacrificing performance [18].
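The O(L²) to O(Lk) claim can be made concrete with a toy top-k attention for a single query. This is a minimal sketch, not DSA itself: the summary above does not describe how DSA selects tokens, so the sketch simply ranks keys by their query-key dot product (an assumption), and `sparse_attention` is a hypothetical function name. The point it shows is that the softmax and the weighted sum over values touch only k tokens per query instead of all L.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def sparse_attention(query, keys, values, k):
    """Attend only to the k highest-scoring keys for one query.

    Returns (output_vector, sorted_selected_indices).
    Selection by raw dot product is an illustrative assumption.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    scores = [dot(query, key) for key in keys]
    # Keep the indices of the k best-scoring tokens.
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax restricted to the selected tokens (max-subtracted for stability).
    m = max(scores[i] for i in top)
    weights = [math.exp(scores[i] - m) for i in top]
    total = sum(weights)
    # Weighted sum over k value vectors instead of all L of them.
    out = [0.0] * len(values[0])
    for w, i in zip(weights, top):
        for d in range(len(out)):
            out[d] += (w / total) * values[i][d]
    return out, sorted(top)


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
    print(sparse_attention([1.0, 0.0], keys, values, k=2))
```

Setting k equal to the number of keys recovers ordinary dense attention for that query, which is a convenient way to check a sparse implementation against a dense one.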
Costs Down More Than 50%! DeepSeek Sharply Cuts API Prices for Its New Model; Domestic AI Chips Adapt Immediately
Xuan Gu Bao· 2025-09-29 23:28
Group 1
- DeepSeek has announced the update of its official App, web version, and mini-program to DeepSeek-V3.2-Exp, along with a reduction in API costs of over 50% for developers [1]
- The cost of AI inference computing power has been decreasing due to advances in large AI models and improvements in the performance and cost-effectiveness of inference chips, with hardware costs dropping approximately 30% annually and energy efficiency improving by about 40% [1]
- The continuous decline in the cost of large models, represented by DeepSeek, supports the commercialization of AI applications and enhances the efficiency of distilled models [1]

Group 2
- The rapid iteration of large models and enhanced inference capabilities are creating opportunities for customized Agent applications, allowing users to tailor agents to personal data and needs [2]
- Companies like Cambricon and Huawei Ascend have announced their compatibility with DeepSeek-V3.2-Exp and have open-sourced the vLLM-MLU inference engine [2]
- Companies such as Fanwei Network, Kingsoft Office, and Dingjie Smart are involved in the development of Agent and AI applications [3]

Group 3
- Huawei Ascend has achieved software and hardware adaptation with companies like Softcom and Changshan Beiming [4]
- Jiuqi Software plans to upgrade its Nüwa GPT in early 2025, integrating deeply with mainstream large models and launching various intelligent applications [4]
- Jiuqi Software's AI distillation technology is similar to that of DeepSeek, indicating an industry trend toward efficient model optimization [4]
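The roughly 30% annual decline in hardware cost quoted in Group 1 compounds quickly. The sketch below is plain arithmetic under the assumption that the rate stays constant from year to year, which the summary does not claim; `projected_cost` and `years_to_reach` are hypothetical helper names.

```python
import math


def projected_cost(initial_cost, annual_decline, years):
    """Cost after `years` if it falls by `annual_decline` (e.g. 0.30) each year."""
    return initial_cost * (1.0 - annual_decline) ** years


def years_to_reach(fraction, annual_decline):
    """Years until cost falls to `fraction` of its starting value."""
    return math.log(fraction) / math.log(1.0 - annual_decline)


if __name__ == "__main__":
    # Starting from an index value of 100, three years at -30% per year.
    print(round(projected_cost(100.0, 0.30, 3), 1))
    # Time for the cost to halve at the same rate.
    print(round(years_to_reach(0.5, 0.30), 2))
```

At a constant 30% decline, cost falls to about a third of its starting level in three years and halves in a little under two.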