DeepSeek-R1 Paper Makes the Cover of Nature; Communication ETF Closes Up 1.92%
Sou Hu Cai Jing· 2025-09-18 07:50
Market Performance - Major indices experienced a rapid pullback after an initial rise, with the Shanghai Composite Index down 1.15%, Shenzhen Component Index down 1.06%, and ChiNext Index down 1.64% [2] - Sectors such as tourism, CPO, and the chip industry chain saw significant gains, while sectors like non-ferrous metals, large finance, and rare earth permanent magnets faced notable declines [2] ETF Highlights - The Guotai CSI All-Share Communication Equipment ETF (515880) rose by 1.92%, with constituent stocks like Guangku Technology (300620.SZ) increasing by 15%, and Fenghuo Communication (600498.SH), Changfei Optical Fiber (601869.SH), and Hengtong Optic-Electric (600487.SH) hitting the daily limit [2] AI and Computing Power Forecast - Huawei predicts that the total computing power in society will increase by 100,000 times by 2035, with AI storage capacity demand expected to grow by 500 times compared to 2025 [3] - Huawei's rotating chairman Xu Zhijun emphasized that computing power is crucial for artificial intelligence, sharing plans for the Ascend chip series, with the Ascend 950PR chip expected in Q1 2026 and the Ascend 970 chip in Q4 2028 [3] Industry Insights - Guosheng Securities noted significant volatility in the optical communication sector, but strong demand and large orders in the overseas AI computing power market indicate a solid fundamental outlook for the optical module industry [3] - Dongxing Securities highlighted that the current phase of artificial intelligence is characterized by a three-dimensional resonance of policy, technology, and demand, with domestic chip and cloud computing leaders gradually validating their performance [3]
DeepSeek Makes the Cover of Nature; Liang Wenfeng Leads the Team in First Response to Controversy
Feng Huang Wang· 2025-09-18 07:48
Core Insights - The DeepSeek-AI team has published research on the open-source model DeepSeek-R1, demonstrating significant improvements in reasoning capabilities through pure reinforcement learning and reducing reliance on human annotations [1][4] - The cost of training DeepSeek-R1 is remarkably low at $294,000, far below the estimated $100 million spent by OpenAI on GPT-4 [3][4] - The methodology employed by DeepSeek-R1, including pure reinforcement learning and the GRPO algorithm, allows the model to develop advanced behaviors such as self-reflection and self-verification without human reasoning demonstrations [4][5] Cost Efficiency - Training DeepSeek-R1's reasoning capability cost only $294,000; even with the roughly $6 million spent on base model training included, the total remains well below the spending of major players like OpenAI and Google [3][4] - The model's cost efficiency is attributed to a focus on algorithmic innovation rather than extensive financial resources [8] Methodological Innovation - The research highlights a shift from traditional training methods to a framework that rewards correct answers rather than mimicking human reasoning paths, leading to the emergence of complex thinking patterns [4][9] - DeepSeek-R1 raised its accuracy on the AIME 2024 math competition from 15.6% to 77.9%, and further to 86.7% with self-consistency decoding, surpassing average human performance [4][5] Industry Impact - The success of DeepSeek-R1 represents a pivotal moment in AI, indicating a potential shift from competition based on data and computational power to competition based on algorithmic innovation [9] - The model's development is seen as a "methodological manifesto," showcasing a sustainable path for AI evolution that does not rely on vast amounts of labeled data [8][9]
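The "self-consistency decoding" mentioned above is commonly understood as sampling several reasoning chains for the same question and keeping the final answer that appears most often. The sketch below illustrates only that voting step; the function name and the sample answers are hypothetical and are not taken from the paper or the article.

```python
from collections import Counter


def self_consistency_vote(final_answers):
    """Return the most frequent final answer among sampled completions.

    `final_answers` holds the final answers already extracted from several
    independently sampled reasoning chains for the same question.
    """
    if not final_answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(final_answers)
    # most_common(1) returns a list with one (answer, count) pair.
    answer, _count = counts.most_common(1)[0]
    return answer


# Hypothetical final answers from five sampled chains (illustrative only).
samples = ["204", "204", "116", "204", "97"]
print(self_consistency_vote(samples))
```

A single sampled chain may end on a wrong answer, but if the correct answer is the most common one across samples, the vote recovers it, which is one plausible reading of why the reported accuracy rises with this decoding method.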
DeepSeek Makes History! Chinese AI's "Nature Moment"
Zheng Quan Shi Bao· 2025-09-18 07:29
Core Insights - The DeepSeek-R1 reasoning model research paper has made history as the first Chinese large model research to be published in the prestigious journal Nature, marking a significant recognition of China's AI technology on the global scientific stage [1][2] - Nature's editorial highlighted that DeepSeek has filled the gap left by the absence of independent peer review for mainstream large models [2] Group 1: Research and Development - The DeepSeek-R1 model's research paper underwent a rigorous peer review process involving eight external experts over six months, emphasizing the importance of transparency and reproducibility in AI model development [2] - The paper disclosed significant details about the training costs and methodologies, including a total training cost of $294,000 (approximately 2.09 million RMB) for R1, achieved using 512 H800 GPUs [3] Group 2: Model Performance and Criticism - DeepSeek addressed initial criticisms regarding the "distillation" method used in R1, clarifying that all training data was sourced from the internet without intentional use of outputs from proprietary models like OpenAI's [3] - Training took 198 hours for R1-Zero and 80 hours for R1, a cost-effective approach compared to other models whose training costs often exceed tens of millions of dollars [3] Group 3: Future Developments - There is significant anticipation regarding the release of the R2 model, with speculation that delays may be due to computational limitations [4] - The recent release of DeepSeek-V3.1 indicates advancements towards the "Agent" era, featuring a hybrid reasoning architecture and improved efficiency, which has sparked interest in the upcoming R2 model [4][5] Group 4: Industry Impact - DeepSeek's adoption of UE8M0 FP8 Scale parameter precision in V3.1 suggests a shift towards utilizing domestic AI chips, potentially accelerating the development of China's computing ecosystem [5] - The collaboration between software and hardware in DeepSeek's models is seen as a new paradigm in the AI wave, with expectations for significant performance improvements in domestic computing chips [5]
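The reported figures can be cross-checked with simple arithmetic. The sketch below uses the 512 GPUs, the 198-hour and 80-hour run lengths, and the $294,000 total from the summary above; the rental rate of $2 per GPU-hour is an assumption introduced here for illustration and is not stated in the summary.

```python
GPUS = 512                       # H800 GPUs, from the summary above
HOURS_R1_ZERO = 198              # R1-Zero training hours, from the summary
HOURS_R1 = 80                    # R1 training hours, from the summary
REPORTED_TOTAL_USD = 294_000     # reported total training cost
ASSUMED_USD_PER_GPU_HOUR = 2.0   # ASSUMPTION: illustrative rental rate


def run_cost(gpus, hours, usd_per_gpu_hour):
    """Cost of one training run: GPU count x wall-clock hours x hourly rate."""
    return gpus * hours * usd_per_gpu_hour


cost_r1_zero = run_cost(GPUS, HOURS_R1_ZERO, ASSUMED_USD_PER_GPU_HOUR)
cost_r1 = run_cost(GPUS, HOURS_R1, ASSUMED_USD_PER_GPU_HOUR)
# Whatever is left of the reported total after the two training runs.
remainder = REPORTED_TOTAL_USD - cost_r1_zero - cost_r1

print(f"R1-Zero run: ${cost_r1_zero:,.0f}")
print(f"R1 run:      ${cost_r1:,.0f}")
print(f"Remainder:   ${remainder:,.0f}")
```

Under that assumed rate, the two runs account for most of the reported total, and the small remainder would have to come from other items that this summary does not break down.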
A Chinese Large Model Makes the Cover of Nature for the First Time: How Far Off Is the DeepSeek Moment for AI in Medicine?
Di Yi Cai Jing· 2025-09-18 07:02
Group 1: AI in Drug Development - AI has become a significant focus for multinational pharmaceutical companies, with substantial investments aimed at transforming the drug discovery process and generating breakthroughs in understanding biological data [3][4] - The global proportion of clinical trials initiated by Chinese companies has increased from approximately 3% to 30% by 2024, positioning China as the second-largest clinical trial market [3] - AI is expected to drive a new wave of drug development, becoming a crucial force in the transformation of new drug research and development [3][4] Group 2: AI Applications in Medical Diagnosis - Major medical institutions in China are actively promoting the integration of large models and AI agents in clinical applications, exemplified by the launch of the "Meta-Medical Simulation Laboratory" by Fudan University and technology companies [5] - AI is changing the paradigm of diagnosis and treatment, with significant advancements in areas such as heart rate screening, imaging analysis, and risk assessment [6] - The application of AI in medicine involves three key aspects: data quality, computational power, and algorithm optimization, which are essential for effective clinical application [6] Group 3: Challenges and Considerations - Despite the potential of AI in drug discovery, there are significant challenges, including a 90% failure rate in clinical trials and the need to address complex biological and regulatory issues [4] - Ethical considerations are paramount, with the understanding that physicians remain the primary decision-makers in clinical settings, and the responsibility for medical actions lies with them [6]
DeepSeek Statement: Beware of Fraud Committed Under the Name of DeepSeek (深度求索)
Mei Ri Jing Ji Xin Wen· 2025-09-18 06:56
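1. DeepSeek (深度求索) has never asked users to pay into personal or unofficial accounts; any request for a private transfer is fraud. 2. Any use of our company's name to conduct "computing power leasing", "financing", or similar activities is illegal, and we will pursue legal liability in accordance with the law. Mei Jing AI News Flash, September 17: DeepSeek (深度求索) issued an official statement: Recently, criminals have impersonated DeepSeek officials or current employees, forged work badges, business licenses, and other materials, and collected fees from users on multiple platforms under pretexts such as "computing power leasing" and "equity financing" in order to defraud them. This conduct seriously infringes on users' rights and damages our company's reputation. ...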
DeepSeek Makes the Cover of Nature; Liang Wenfeng Leads the Team in First Response to the "Distillation" Controversy
Feng Huang Wang· 2025-09-18 06:17
Core Insights - The article highlights a significant achievement in China's AI sector with the publication of the DeepSeek-R1 model, which demonstrates a breakthrough in reducing the cost of training large language models while enhancing their reasoning capabilities [1][10]. Cost Efficiency - The cost of training DeepSeek-R1's reasoning capability is remarkably low at $294,000, far below the estimated $100 million spent by OpenAI on GPT-4 and the tens of millions spent by other tech giants [6]. - Even when including the approximately $6 million for the foundational model training, the total cost remains substantially lower than that of international competitors [6]. Methodological Innovation - The research team employed a pure reinforcement learning framework and introduced the Group Relative Policy Optimization (GRPO) algorithm, rewarding the model based solely on the correctness of final answers rather than on how closely it mimics human reasoning paths [6][10]. - This unconventional training approach led to the emergence of advanced behaviors such as self-reflection and self-verification, allowing the model to generate extensive reasoning chains [7]. Performance Metrics - DeepSeek-R1-Zero achieved an accuracy of 77.9% on the American Invitational Mathematics Examination (AIME 2024), which further improved to 86.7% with self-consistency decoding, surpassing the human average [7]. - The model's performance extends beyond mathematics and programming tasks, demonstrating fluency and consistency in writing and question-answering tasks [7]. Leadership and Vision - The success of DeepSeek-R1 is attributed to the leadership of Liang Wenfeng, who has a background in machine learning and a vision for AI's transformative potential [8]. - Liang's approach to team building emphasizes capability over experience, focusing on nurturing young talent to drive innovation [9]. 
Industry Implications - The research represents a methodological declaration that emphasizes a sustainable path for AI evolution, moving away from reliance on vast labeled datasets and high funding barriers [10]. - The competition in AI is expected to shift from a focus on data and computational power to one centered on algorithmic and intellectual innovation, with DeepSeek-R1 setting the stage for this new era [11].
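The idea of rewarding only final-answer correctness and comparing outputs within a group can be sketched as follows. This is a minimal illustration of the group-relative scoring step usually associated with GRPO, not the team's implementation: it omits the policy update, clipping, and any regularization, and the function names, the exact-match reward, and the use of the population standard deviation are assumptions made here.

```python
from statistics import mean, pstdev


def correctness_reward(predicted, reference):
    """Hypothetical rule-based reward: 1.0 if the final answer matches, else 0.0."""
    return 1.0 if predicted.strip() == reference.strip() else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """Score each sampled output relative to its own group.

    Each reward is centered on the group mean and scaled by the group
    standard deviation; eps avoids division by zero when all rewards match.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical final answers from four sampled outputs for one question.
reference_answer = "42"
sampled_answers = ["42", "41", "42", "7"]

rewards = [correctness_reward(a, reference_answer) for a in sampled_answers]
advantages = group_relative_advantages(rewards)
print(rewards)
print([round(a, 3) for a in advantages])
```

Outputs with correct final answers receive positive scores and incorrect ones negative scores, without any reference to how the reasoning chain was written. When every output in a group earns the same reward, all scores are zero, so that group contributes no learning signal in this sketch.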
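DeepSeek Issues Anti-Fraud Statement: Criminals Are Misusing the Company's Name for "Computing Power Leasing" and "Financing"; Legal Action Will Be Pursued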
Xin Lang Ke Ji· 2025-09-18 05:53
Core Points - DeepSeek has issued a statement regarding fraudulent activities where individuals impersonate the company or its employees, using forged identification and business licenses to scam users under the guise of "computing power leasing" and "equity financing" [1][2][3] - The fraudulent actions have severely harmed user rights and damaged the company's reputation [1][2] Company Policy - DeepSeek has never requested users to make payments to personal or unofficial accounts; any such requests for private transfers are considered scams [3] - Any activities that misuse the company's name for "computing power leasing" or financing are illegal, and the company will pursue legal action against such actions [3] User Advisory - Users are advised to obtain official information and updates through the official website (deepseek.com) and verified accounts [1] - The company's official webpage and app products are currently free; for API services, users should recharge through the official platform, with the official payment account name being "Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd." [1] - In case of suspicious situations, users should verify through the official email or report to law enforcement [1]
DeepSeek Makes History! Chinese AI's "Nature Moment"
Zheng Quan Shi Bao· 2025-09-18 05:24
Core Insights - The DeepSeek-R1 reasoning model research paper has made history as the first Chinese large model research to be published on the cover of the prestigious journal Nature, marking a significant recognition of China's AI technology in the international scientific community [1][2] - Nature's editorial highlighted that DeepSeek has filled the gap left by the absence of independent peer review for mainstream large models [2] Group 1: Research and Development - The DeepSeek-R1 model's research paper underwent a rigorous peer review process involving eight external experts over six months, emphasizing the importance of transparency and reproducibility in AI model development [2] - The paper disclosed significant details about the training costs and methodologies, including a total training cost of $294,000 (approximately 2.09 million RMB) for R1, achieved using 512 H800 GPUs over 198 hours [3] Group 2: Model Performance and Criticism - DeepSeek addressed initial criticisms regarding the "distillation" method used in R1, clarifying that all training data was sourced from the internet without intentional use of outputs from proprietary models like OpenAI's [3] - The R1 model has been recognized for its cost-effectiveness compared to other reasoning models, which often incur training costs in the tens of millions of dollars [3] Group 3: Future Developments - There is significant anticipation regarding the release of the R2 model, with speculation that delays may be due to computational limitations [4] - The recent release of DeepSeek-V3.1 has introduced a hybrid reasoning architecture and improved efficiency, indicating a step towards the "Agent" era in AI [4][5] - DeepSeek's emphasis on using UE8M0 FP8 Scale parameter precision in V3.1 suggests a strategic alignment with domestic AI chip development, potentially enhancing the performance of future models [5]
DeepSeek Responds for the First Time to Allegations of "Distilling OpenAI"
Di Yi Cai Jing· 2025-09-18 04:34
Core Insights - DeepSeek's research paper, "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning," has been published in the prestigious journal Nature, highlighting significant advancements in AI reasoning capabilities [1][11]. Group 1: Research and Development - The initial version of DeepSeek's paper was released on arXiv in January, and the Nature publication includes more detailed model specifications and reduced anthropomorphism in descriptions [5]. - DeepSeek-R1's training cost was reported to be $294,000, with costs for individual components itemized: $202,000 for DeepSeek-R1-Zero training, $10,000 for SFT data creation, and $82,000 for DeepSeek-R1 training [9]. - The training utilized A100 GPUs for smaller models and expanded to 660 billion parameters for the R1 model, demonstrating a scalable approach to model development [8][10]. Group 2: Model Performance and Validation - DeepSeek-R1 has become the most popular open-source reasoning model globally, with over 10.9 million downloads on Hugging Face, and it is the first mainstream large language model to undergo peer review [11]. - The research emphasizes that significant reasoning capabilities can be achieved through reinforcement learning without relying heavily on supervised fine-tuning, which is a departure from traditional methods [13]. - The model's training involved a reward mechanism that encourages correct reasoning, allowing it to self-validate and improve its performance on complex tasks [13]. Group 3: Industry Implications - The findings from DeepSeek's research could set a precedent for future AI model development, particularly in enhancing reasoning capabilities without extensive data requirements [11][13]. - The independent peer review process adds credibility to the model's performance claims, addressing concerns about potential manipulation in AI benchmarking [11].
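DeepSeek Makes the Cover of the Leading International Journal Nature; Huawei Predicts AI Storage Capacity Demand in 2035 Will Grow 500-Fold Compared with 2025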
Mei Ri Jing Ji Xin Wen· 2025-09-18 03:02
Market Performance - As of September 17, the Shanghai Composite Index rose by 0.37% to close at 3876.34 points, the Shenzhen Component Index increased by 1.16% to 13215.46 points, and the ChiNext Index gained 1.95% to 3147.35 points [1] - The STAR Market Semiconductor ETF (588170) increased by 3.64%, while the Semiconductor Materials ETF (562590) rose by 3.32% [1] - In the overnight U.S. market, the Dow Jones Industrial Average increased by 0.57%, while the Nasdaq Composite Index fell by 0.33% and the S&P 500 Index decreased by 0.10% [1] Industry Insights - The DeepSeek-R1 reasoning model research paper, led by Liang Wenfeng, was published in the prestigious journal Nature, making DeepSeek-R1 the first mainstream large language model to undergo peer review [2] - Huawei released the "Smart World 2035" series of reports, predicting a significant increase in total computing power by 2035, with a 500-fold increase in AI storage capacity demand compared to 2025 [2] - The Zhangjiang Artificial Intelligence Innovation Town in Shanghai aims to gather over 500 AI companies by 2027 and achieve a scale of 100 billion yuan by 2030, supported by a 2 billion yuan fund initiated by Hillhouse Capital and local state-owned enterprises [3] - Tianfeng Securities anticipates structural prosperity in the global semiconductor industry, driven by rapid growth in AI computing demand, accelerated terminal intelligence, and the recovery of automotive electronics [3] Related ETFs - The STAR Market Semiconductor ETF (588170) tracks the Shanghai Stock Exchange's semiconductor materials and equipment index, focusing on semiconductor equipment (59%) and materials (25%) [4] - The Semiconductor Materials ETF (562590) also emphasizes semiconductor equipment (59%) and materials (24%), benefiting from the expansion of semiconductor demand driven by the AI revolution [4]