DeepSeek
Big news! DeepSeek paper by Liang Wenfeng lands on the cover of Nature and directly addresses the distillation doubts
程序员的那些事· 2025-09-20 01:10
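On September 18, the research paper on the DeepSeek-R1 reasoning model, completed jointly by the DeepSeek team with Liang Wenfeng as corresponding author, appeared on the cover of the leading international journal Nature. Compared with the initial DeepSeek-R1 paper released this January, this version discloses more details of the model's training and directly addresses the distillation doubts raised when the model was first released. DeepSeek-R1 is the world's first mainstream large language model to undergo peer review. Almost none of today's mainstream large models has gone through independent peer review, a gap that has "finally been broken by DeepSeek". The recommendation accompanying the Nature cover reads: "Large models tend to solve problems better if they are trained to plan the steps needed to solve them. This kind of 'reasoning' resembles the way humans handle more complex problems, but it poses a major challenge for artificial intelligence and requires human intervention to add labels and annotations. In this week's issue, researchers at DeepSeek reveal how they were able to train a model to reason with minimal human input. The DeepSeek-R1 model is trained using reinforcement learning. In this kind of learning, the model receives a high-score reward when it solves a math problem correctly and is penalized when it gets the answer wrong. As a result, it learned to reason, working through problems step by step and revealing those steps, which makes it more likely to reach the correct answer. This allows DeepSeek ...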
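The reward scheme described in the cover blurb above (a high reward for a correct math answer, a penalty for a wrong one) can be illustrated with a small rule-based reward function. This is only a sketch under stated assumptions, not DeepSeek's implementation: the `\boxed{...}` answer convention, the function names, and the reward values `1.0` and `-1.0` are all hypothetical choices made for illustration.

```python
import re
from typing import Optional


def extract_final_answer(response: str) -> Optional[str]:
    """Return the content of the last boxed{...} span in the response, or None.

    Assumption: the model is asked to put its final answer in a LaTeX-style
    boxed span. The source does not specify the answer format.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", response)
    return matches[-1].strip() if matches else None


def accuracy_reward(response: str, reference: str,
                    correct: float = 1.0, wrong: float = -1.0) -> float:
    """Hypothetical rule-based reward: `correct` on an exact match, else `wrong`.

    A missing final answer is treated the same as a wrong answer.
    """
    answer = extract_final_answer(response)
    return correct if answer == reference.strip() else wrong


if __name__ == "__main__":
    good = r"First add 40 and 2. The result is \boxed{42}"
    bad = r"First add 40 and 3. The result is \boxed{43}"
    print(accuracy_reward(good, "42"))
    print(accuracy_reward(bad, "42"))
```

Because the reward depends only on the final answer, nothing in this sketch labels or scores the intermediate steps, which is consistent with the blurb's point about needing very little human input.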
X @Decrypt
Decrypt· 2025-09-19 18:25
An Oxford–Vela study finds that GPT-4o and DeepSeek-V3 beat Y Combinator and top VCs at predicting startup success. https://t.co/0HJ5KxzGfH ...
The rise of the new economy is reshaping the capital market ecosystem
Zheng Quan Ri Bao· 2025-09-19 16:11
Group 1
- The articles highlight a significant transformation in China's capital market driven by the rise of new economy forces, particularly through technological innovation and industrial upgrades [1]
- The trading volume on the STAR Market reached a record high of 361.047 billion yuan on September 18, indicating robust market activity and investor interest in technology-driven companies [1]
- The proportion of high-tech companies among new listings is expected to exceed 90% in 2024, with strategic emerging industries accounting for over 40% of the total market capitalization [2]
Group 2
- Technological breakthroughs are fundamentally reshaping the valuation of companies, with firms like DeepSeek and Yushu Technology achieving significant advancements that enhance China's position in global tech competition [2]
- The emergence of unicorns in AI and other tech sectors is increasing the weight of the new economy in the capital market, reflecting a shift in investor focus towards long-term technological accumulation rather than short-term profits [2]
- The transition of traditional industries through intelligent upgrades and the cluster development of strategic emerging industries like new energy and biomedicine are creating new value growth areas [3]
Group 3
- The rise of the new economy is leading to a "double premium" for Chinese assets, with AI advancements expected to increase the earnings per share (EPS) of Chinese companies by 2.5% annually [4]
- Global capital is recognizing the potential of China's new economy, as evidenced by rating upgrades from institutions like Goldman Sachs and UBS [4]
- The transformation driven by innovation is expected to cultivate globally competitive new economy giants and provide sustained momentum for economic transition and upgrading [4]
DeepSeek discloses for the first time that training the R1 model cost only $294,000; "US peers are starting to question their own strategy"
Xin Lang Cai Jing· 2025-09-19 13:25
Core Insights
- DeepSeek has achieved a significant breakthrough in AI model training costs, with the DeepSeek-R1 model costing only $294,000 to train, which is substantially lower than the costs reported by American competitors [1][2][4]
- The model's training utilized 512 NVIDIA H800 chips, and the total training time was 80 hours, marking it as the first mainstream large language model to undergo peer review [2][4]
- The cost efficiency of DeepSeek's model has sparked discussions about China's position in the global AI landscape, challenging the notion that only countries with the most advanced chips can dominate the AI race [1][2]
Cost Efficiency
- The training cost of DeepSeek-R1 is reported at $294,000, while OpenAI's CEO indicated that their foundational model training costs exceed $100 million [2]
- DeepSeek's approach emphasizes using a large amount of free data for pre-training and fine-tuning with self-generated data, which has been recognized as a cost-effective strategy [5][6]
Response to Criticism
- DeepSeek addressed accusations from U.S. officials regarding the alleged illegal acquisition of advanced chips, clarifying that they used legally procured H800 chips and acknowledging prior use of A100 chips for smaller model experiments [4][5]
- The company defended its use of "distillation" technology, which is a common practice in AI, asserting that it enhances model performance while reducing costs [5][6]
Competitive Landscape
- The success of DeepSeek-R1 demonstrates that AI competition is shifting from merely having the most GPUs to achieving more with fewer resources, thus altering the competitive dynamics in the industry [6][7]
- Other AI models, such as OpenAI's GPT-4 and Google's Gemini, still hold advantages in certain areas, but DeepSeek's model has set a new standard for cost-effective high-performance AI [6][7]
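The figures quoted in the summary above ($294,000, 512 H800 chips, 80 hours, and a competitor cost above $100 million) allow a quick back-of-the-envelope check. The sketch below takes those numbers at face value. The function and constant names are illustrative, and the implied hourly rate is a derived quantity that the summary does not state, so it is only as reliable as the inputs.

```python
def gpu_hours(num_gpus: int, wall_clock_hours: float) -> float:
    """Total GPU-hours, assuming every GPU runs for the whole period."""
    return num_gpus * wall_clock_hours


def implied_rate_usd(total_cost_usd: float, num_gpus: int,
                     wall_clock_hours: float) -> float:
    """Cost per GPU-hour implied by a total cost and a cluster size."""
    return total_cost_usd / gpu_hours(num_gpus, wall_clock_hours)


# Figures as quoted in the summary (not independently verified here).
R1_COST_USD = 294_000
NUM_H800 = 512
TRAINING_HOURS = 80
COMPETITOR_COST_USD = 100_000_000  # "exceed $100 million", so a lower bound

total_gpu_hours = gpu_hours(NUM_H800, TRAINING_HOURS)
rate = implied_rate_usd(R1_COST_USD, NUM_H800, TRAINING_HOURS)
cost_ratio = COMPETITOR_COST_USD / R1_COST_USD

print(f"GPU-hours: {total_gpu_hours:,.0f}")
print(f"Implied cost per GPU-hour: ${rate:.2f}")
print(f"Competitor cost is at least {cost_ratio:.0f}x the R1 figure")
```

Since the $100 million figure is stated as a floor, the resulting ratio of roughly 340 is itself a lower bound.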
"Training cost that little? US peers sink into self-doubt"
Guan Cha Zhe Wang· 2025-09-19 11:28
Core Insights
- DeepSeek has achieved a significant breakthrough in AI model training costs, with the DeepSeek-R1 model's training cost reported at only $294,000, which is substantially lower than the costs disclosed by American competitors [1][2][4]
- The model utilizes 512 NVIDIA H800 chips and has been recognized as the first mainstream large language model to undergo peer review, marking a notable advancement in the field [2][4]
- The cost efficiency of DeepSeek's model challenges the notion that only countries with the most advanced chips can dominate the AI race, as highlighted by various media outlets [1][2][6]
Cost and Performance
- The training cost of DeepSeek-R1 is significantly lower than that of OpenAI's models, which have been reported to exceed $100 million [2][4]
- DeepSeek's approach emphasizes the use of open-source data and efficient training methods, allowing for high performance at a fraction of the cost compared to traditional models [5][6]
Industry Impact
- The success of DeepSeek-R1 is seen as a potential game-changer in the AI landscape, suggesting that AI competition is shifting from resource quantity to resource efficiency [6][7]
- The model's development has sparked discussions regarding China's position in the global AI sector, particularly in light of U.S. export restrictions on advanced chips [1][4]
Technical Details
- The latest research paper provides more detailed insights into the training process and acknowledges the use of A100 chips in earlier stages, although the final model was trained exclusively on H800 chips [4][5]
- DeepSeek has defended its use of "distillation" techniques, which are common in the industry, to enhance model performance while reducing costs [5][6]
Behind DeepSeek's viral paper: besides Liang Wenfeng, there is an 18-year-old Chinese high school student who once wrote a legendary prompt
3 6 Ke· 2025-09-19 03:32
Core Insights
- DeepSeek has published a paper in Nature, showcasing advancements in reasoning within large language models (LLMs) through reinforcement learning, which includes richer implementation details and experimental analysis compared to earlier versions [2][4][38]
- The paper highlights the contributions of notable researchers, including Liang Wenfeng, Tu Jinhao, and Luo Fuli, indicating a strong presence of Chinese AI talent in global academic circles [4][38]
Group 1
- The Nature publication represents a significant achievement for DeepSeek, marking a historical moment for Chinese AI development on a global stage [38]
- The paper emphasizes the importance of the reasoning process in AI models, suggesting that a comprehensive thinking approach is crucial for improving the quality of AI responses [30][38]
- The research team includes young talents, such as Tu Jinhao, who has gained recognition for innovative approaches in AI competitions and model enhancements [6][30]
Group 2
- Luo Fuli, another key contributor, has a strong academic background and has been involved in significant projects, including leading the development of multilingual pre-trained models at Alibaba [34][36]
- The publication reflects a broader trend of increasing representation of Chinese AI researchers in top-tier academic publications, enhancing the visibility of China's contributions to the AI field [38]
- The collaborative nature of the research team underscores the importance of teamwork in achieving significant milestones in AI research [38]
Global Markets React to Central Bank Decisions, Tech Innovations, and Geopolitical Shifts
Stock Market News· 2025-09-19 02:38
Corporate News
- Xiaomi (1810.HK) announced a software fix for 30,931 of its SU7 electric vehicles in China to address safety concerns related to the assisted driving system following a fatal crash involving the model earlier this year [5][10]
- DeepSeek, a Chinese AI firm, revealed that its top AI model cost just $294,000 to train, significantly lower than the tens or hundreds of millions estimated for similar models by U.S. competitors, potentially reshaping the AI development cost debate [6][10]
Market Developments
- The Nikkei 225 index in Japan surged to a new all-time high of 45,296.21 points, marking a year-to-date gain of nearly 15% despite ongoing U.S. tariff pressures [2][10]
- The Indonesian Rupiah declined by 0.3% to 16,550 per U.S. dollar, reaching its lowest level since May 15, coinciding with a downturn in the Indonesia Stock Index, which opened around 7,990 points [3][10]
- Malaysia set its October Crude Palm Oil reference price at 4,268.68 RGT/Ton, maintaining a 10% export duty [7]
Why can't burning money save China's AI?
3 6 Ke· 2025-09-19 01:36
Group 1
- In 2020, the ratio of capital expenditure by major Chinese tech companies to that of their US counterparts was approximately 1:6, which is expected to widen to 1:10 by 2024, with US companies spending a total of 5.36 trillion yuan compared to only 630 billion yuan from Chinese internet firms [1]
- By 2025, Chinese internet companies are projected to significantly increase their capital expenditure to 500 billion yuan, yet this amount is still only one-fifth of the AI-related capital expenditure of the four major US companies this year [3]
- The US has three structural advantages in AI competition: a large consumer market, a mature capital market, and a top-tier talent cultivation system [4][7][8]
Group 2
- The Nasdaq index has risen from approximately 8,970 points in early 2020 to 22,200 points by September 2025, indicating the strong performance of tech stocks, particularly the "Big Seven" US tech companies [5]
- The US has a robust talent pipeline for AI, with top universities continuously supplying high-level talent, which fosters innovation and accelerates technology transfer [7][8]
- China's unique advantages in AI lie in its efficiency and scene-driven innovation, with historical examples showing that capital is not the sole determinant of success [9][10]
Group 3
- China's core competitive advantage in AI is its application scenarios, supported by a complete manufacturing supply chain and a large user base that allows for rapid validation and iteration of new technologies [11][13]
- The scale of China's STEM graduates is significantly larger than that of the US, providing a stable and high-quality talent base for the AI industry [14]
- The trend of high-end talent returning to China from overseas is enhancing local companies' R&D capabilities and innovation quality [15][18]
Group 4
- The competition in AI is a long-term marathon rather than a sprint, and maintaining open communication and collaboration with the global innovation ecosystem is crucial for China to sustain its competitive edge [18]
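The capital expenditure and index figures quoted in the summary above can be sanity-checked with a few lines of arithmetic. The sketch below uses only numbers that appear in the summary, and the variable names are illustrative. The implied US total in the last step is derived from the "one-fifth" statement and is not a figure the summary reports directly.

```python
# Figures as quoted in the summary, in yuan unless noted.
US_CAPEX = 5.36e12          # 5.36 trillion yuan
CN_CAPEX = 630e9            # 630 billion yuan
CN_CAPEX_2025 = 500e9       # projected 500 billion yuan
CN_SHARE_OF_US_AI = 1 / 5   # "one-fifth" of the four major US companies

NASDAQ_EARLY_2020 = 8_970   # index points
NASDAQ_SEP_2025 = 22_200    # index points

# How many yuan US firms spent per yuan spent by Chinese firms.
us_to_cn_ratio = US_CAPEX / CN_CAPEX

# US AI-related capital expenditure implied by the one-fifth statement.
implied_us_ai_capex = CN_CAPEX_2025 / CN_SHARE_OF_US_AI

# Percentage gain of the Nasdaq index over the quoted period.
nasdaq_gain_pct = (NASDAQ_SEP_2025 / NASDAQ_EARLY_2020 - 1) * 100

print(f"US:China capex ratio from quoted totals: {us_to_cn_ratio:.2f}:1")
print(f"Implied US AI capex: {implied_us_ai_capex / 1e12:.1f} trillion yuan")
print(f"Nasdaq gain: {nasdaq_gain_pct:.1f}%")
```

The quoted totals give a ratio of about 8.5:1, somewhat below the 1:10 headline figure. The summary does not say whether the totals and the ratio cover the same set of companies or the same period, so the difference may simply reflect that.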
Is the "DeepSeek moment" for AI medicine coming soon?
Di Yi Cai Jing· 2025-09-19 00:32
Core Insights
- The article highlights the emergence of AI technologies in the pharmaceutical and medical fields, particularly focusing on the advancements made by Chinese AI company DeepSeek and its large model R1, which has gained recognition in the scientific community [2]
- The integration of AI in drug discovery and clinical applications is accelerating, with significant investments from major pharmaceutical companies aiming to revolutionize the drug development process [4][5]
Group 1: AI in Drug Discovery
- Major pharmaceutical companies, including Bristol-Myers Squibb and Sanofi, are investing billions in AI drug discovery, hoping to achieve breakthroughs that will transform the drug development process [4]
- Medidata's data indicates that the proportion of clinical trials initiated by Chinese companies has surged from approximately 3% to 30% by 2024, positioning China as the second-largest clinical trial market globally [4]
- AI is expected to drive a new wave of drug development, becoming a crucial force in the transformation of new drug research [4]
Group 2: AI in Medical Applications
- The "Meta-Medical" laboratory, launched by Zhongshan Hospital affiliated with Fudan University, aims to develop AI agents and apply large model technologies to enhance medical knowledge digitization and productization of diagnostic capabilities [6]
- AI is changing the paradigm of diagnosis and treatment, with significant advancements in areas such as heart disease risk prediction and real-time monitoring through wearable devices [6]
- The successful application of AI in specific medical fields has reached clinical levels, exemplified by the monitoring of intermittent atrial fibrillation using wearable technology [6]
Group 3: Challenges and Ethical Considerations
- Despite the potential of AI in drug discovery, challenges remain, including a 90% failure rate in clinical trials and the need to address complex biological issues and regulatory hurdles [5]
- Ethical considerations are paramount, with the responsibility for medical decisions still resting with physicians, who must ensure that AI technologies are used safely and effectively in clinical settings [7]
China's Top 500 Service Industry Enterprises list released; Huawei announces AI chip development roadmap | Daily Finance Review
吴晓波频道· 2025-09-19 00:30
Group 1: Federal Reserve and Economic Policy
- The Federal Reserve announced a 25 basis point rate cut, lowering the target range from 4.25%-4.5% to 4.00%-4.25%. It is the first rate cut of the year and brings the total reduction since last September to 125 basis points [2][3]
- The Fed's statement highlighted a slowdown in job growth and a slight increase in the unemployment rate, indicating a cautious approach to future rate cuts amid rising inflation [2][3]
- Fed Chair Powell faces a challenging decision between maintaining higher rates to curb inflation or cutting rates to support the job market, with the current economic indicators suggesting a need for preventive measures [2][3]
Group 2: Immigration and Service Industry Growth
- From January to August, the number of visa-free foreign entrants to China increased by 52.1% year-on-year, with a total of 15.89 million foreign visitors [4][5]
- The Chinese government is optimizing visa policies to attract more foreign visitors, which is expected to stimulate consumption and boost the service industry [4][5]
- The 2025 China Service Industry Top 500 report revealed a total revenue of 51.1 trillion yuan, with an average revenue per company exceeding 100 billion yuan, indicating strong growth in the service sector [6][7]
Group 3: AI Chip Development
- Huawei announced a three-year roadmap for its Ascend AI chip series, with plans to release four new chips between 2026 and 2028, emphasizing the use of self-developed high-bandwidth memory [8][9]
- The development of AI chips is seen as a strategic move to reduce reliance on foreign technology, with other Chinese companies like Alibaba and Baidu also accelerating their AI chip research [8][9]
- The DeepSeek team's research on a new language model was published in Nature, showcasing advancements in AI training methodologies and contributing to the global AI landscape [10][11]
Group 4: International Market Expansion
- Didi and Meituan are investing heavily in the Brazilian food delivery market, with Didi planning to invest 2 billion reais and Meituan committing 1 billion USD over five years [12][13]
- The competitive landscape in Brazil's food delivery market is intensifying, with both companies facing challenges from local giants like iFood [12][13]
- The entry of Chinese companies into the Brazilian market reflects a broader strategy to capture opportunities in Latin America, despite the challenges of local competition [12][13]
Group 5: Digital Asset Regulation
- The SEC has simplified the approval process for digital asset ETFs, reducing the timeline from 240 days to a maximum of 75 days, signaling a shift towards a more favorable regulatory environment for digital assets [14][15]
- This regulatory change aims to promote innovation while maintaining oversight, as the U.S. seeks to catch up with other financial hubs that have embraced digital currencies [14][15]
- The SEC's decision reflects a broader trend of increasing acceptance of digital assets within the U.S. financial system, potentially reshaping the competitive landscape for digital asset products [14][15]