Chinese scholars publish 11 Nature papers in a single day
Sheng Wu Shi Jie· 2025-09-18 10:05
Core Insights
- On September 17, 2025, a total of 24 papers were published in the prestigious journal Nature, with 10 of them authored by Chinese scholars, highlighting the significant contribution of Chinese researchers to global scientific advancements [2][5][7][9][12][14][16][18][21].

Group 1: Research Contributions
- A paper titled "Toughened self-assembled monolayers for durable perovskite solar cells" was co-authored by scholars from City University of Hong Kong and the Chinese Academy of Sciences, focusing on enhancing the durability of perovskite solar cells [2].
- Another significant paper, "A movable long-term implantable soft microfibre for dynamic bioelectronics," was published by researchers from the Chinese Academy of Sciences and Donghua University, contributing to the field of bioelectronics [5].
- The paper "Atomic-scale imaging of frequency-dependent phonon anisotropy" was authored by researchers from the University of California, Irvine, providing insights into phonon behavior at the atomic level [7].
- A study titled "Covariation mass spectrometry uncovers a protein that controls cysteine catabolism" was led by a researcher from the Dana-Farber Cancer Institute, revealing important findings in protein metabolism [9].
- The research "A room temperature rechargeable all-solid-state hydride ion battery" was published by scholars from the Dalian Institute of Chemical Physics, focusing on advancements in battery technology [12].
- A paper on "High-density soft bioelectronic fibres for multimodal sensing and stimulation" was authored by researchers from Stanford University, contributing to the development of bioelectronic devices [14].
- The study "DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning" was published by DeepSeek, exploring advancements in large language models [16].
- A paper titled "Structural basis for mTORC1 activation on the lysosomal membrane" was authored by researchers from the University of California, Berkeley, providing insights into cellular signaling mechanisms [17].
- The research "Peroxisomal metabolism of branched fatty acids regulates energy homeostasis" was published by scholars from Washington University in St. Louis, contributing to the understanding of metabolic processes [18].
- A study on "Delta-type glutamate receptors are ligand-gated ion channels" was published by Johns Hopkins University, enhancing knowledge in neurobiology [21].
DeepSeek makes the cover of Nature for the first time: a Chinese large model makes history by doing what OpenAI dared not do
36Ke· 2025-09-18 09:56
Core Insights
- DeepSeek's AI model, R1, has gained significant recognition by being featured on the cover of Nature, a prestigious scientific journal, highlighting its impact on the AI industry [2][10][12]
- The training cost for R1 was notably low at $294,000, which contrasts sharply with the multi-million-dollar investments typical for models from companies like OpenAI [7][48]
- The model's development process involved rigorous peer review, setting a new standard for transparency and scientific validation in AI [11][15][16]

Group 1: Model Development and Training
- DeepSeek R1's training process was detailed in a paper published on arXiv, which was later expanded upon in the Nature article, showcasing a comprehensive methodology [6][7]
- The model was trained using a pure reinforcement learning framework, allowing it to develop reasoning capabilities without relying on human-annotated data [19][41]
- R1 achieved an impressive accuracy of 77.9% on the AIME 2024 math competition, surpassing the human average and even outperforming GPT-4 on certain tasks [23][31]

Group 2: Peer Review and Industry Impact
- The peer review process for R1 involved independent experts scrutinizing the model, a departure from the typical practice of major AI companies, which often do not submit their models for academic evaluation [10][11][15]
- Nature's editorial team has called on other companies to submit their models for peer review, emphasizing the importance of transparency and accountability in AI development [15][16]
- The recognition from Nature not only validates R1's scientific contributions but also positions DeepSeek as a leader in the push for more rigorous standards in AI research [12][50]

Group 3: Technical Innovations
- R1's architecture is based on a mixture-of-experts (MoE) model with 671 billion parameters, pre-trained on a vast dataset of web pages and e-books [25]
- The model's training rewarded it solely on the correctness of its answers, fostering self-reflection and dynamic adjustment during problem-solving; a minimal reward sketch follows this summary [29][38]
- The final version of R1 was developed through a multi-stage training process that combined reinforcement learning with supervised fine-tuning, enhancing both reasoning and general capabilities [39][47]
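The correctness-only reward described in Group 3 is easy to picture in code. Below is a minimal, hypothetical Python sketch assuming a verifiable math task with a known reference answer; the specific weights, the `<think>` tag format check, and the `\boxed{}` answer extraction are illustrative assumptions, not DeepSeek's actual implementation.

```python
import re

def rule_based_reward(response: str, reference_answer: str) -> float:
    """Hypothetical rule-based reward in the spirit of R1-Zero training:
    the model is scored only on outcome, never on its reasoning path."""
    reward = 0.0
    # Format check: reasoning wrapped in <think>...</think> tags.
    if re.search(r"<think>.*</think>", response, flags=re.DOTALL):
        reward += 0.5  # assumed format bonus; weights are illustrative
    # Accuracy check: compare the extracted final answer to the reference.
    match = re.search(r"\\boxed\{([^}]*)\}", response)
    if match and match.group(1).strip() == reference_answer.strip():
        reward += 1.0  # a correct final answer earns the main reward
    return reward
```

Because nothing in this signal depends on how the answer was reached, the model is free to explore its own reasoning strategies, which is the behavior the article describes.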
Trained for $294,000: DeepSeek-R1 makes the Nature cover, earning praise as the first mainstream large model to pass peer review at an authoritative journal
36Ke· 2025-09-18 07:55
Core Insights
- DeepSeek-R1's research results have been published in Nature, marking it as the first mainstream large model to undergo peer review at a reputable journal, which has sparked significant discussion in the academic community [1][14][17]
- The training cost of DeepSeek-R1 is reported to be only $294,000, significantly lower than the tens of millions typical for leading models, on top of an investment of approximately $6 million in the foundational LLM [1][2][17]

Training Costs
- The training costs for DeepSeek-R1 break down as follows:
  - DeepSeek-R1-Zero: $202,000
  - SFT data creation: $10,000
  - DeepSeek-R1: $82,000
  - Total: $294,000
- The training utilized 512 H800 GPUs (64 nodes × 8 GPUs) over approximately 198 hours for DeepSeek-R1-Zero and around 80 hours for DeepSeek-R1 [2]

Reinforcement Learning and Reasoning Capabilities
- The model employs Group Relative Policy Optimization (GRPO) to enhance reasoning capabilities without traditional supervised fine-tuning, allowing for more exploratory learning; a sketch of the group-relative baseline follows this summary [3][4]
- DeepSeek-R1-Zero demonstrates complex reasoning behaviors, generating longer responses that incorporate verification and exploration of alternative solutions [4][6]

Performance Metrics
- DeepSeek-R1-Zero achieved a pass@1 score of 77.9% on the AIME 2024 math competition, improving to 86.7% with self-consistent decoding strategies and surpassing average human performance [6][8]
- The model also excelled in programming competitions and on graduate-level questions in biology, physics, and chemistry, validating the effectiveness of reinforcement learning in enhancing reasoning capabilities [6]

Development Pipeline
- The development of DeepSeek-R1 involved multiple stages, starting from data collection based on human-like dialogue, through reinforcement learning and sampling, ultimately enhancing the model's utility and safety [9][11]
- Experimental results indicate significant improvements in instruction execution across development stages, with DeepSeek-R1 outperforming its predecessors in benchmark tests [11][13]

Industry Impact
- The peer review of DeepSeek-R1 is seen as a positive trend for AI research, promoting transparency and standardization in a field where both have been lacking for many mainstream AI models [14][16][17]
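A minimal sketch of the group-relative baseline at the heart of GRPO, assuming one scalar reward per sampled answer: each answer's advantage is its reward standardized against its own group, so no learned value network (critic) is needed. The full GRPO objective also includes a clipped policy ratio and a KL penalty, omitted here.

```python
import numpy as np

def grpo_advantages(group_rewards: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Standardize each sampled answer's reward against its own group,
    replacing the critic used in classic PPO-style RLHF."""
    return (group_rewards - group_rewards.mean()) / (group_rewards.std() + eps)

# Hypothetical example: 8 answers sampled for one math prompt,
# reward 1.0 for a correct final answer, 0.0 otherwise.
rewards = np.array([1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0])
print(grpo_advantages(rewards))  # correct answers receive positive advantage
```

Scoring each answer only relative to its own group is what makes the method cheap enough to run at the scale the cost breakdown above implies.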
DeepSeek on the cover of Nature: team led by Liang Wenfeng responds to controversy for the first time
Feng Huang Wang· 2025-09-18 07:48
Core Insights
- The DeepSeek-AI team has published research on the open-source model DeepSeek-R1, demonstrating significant improvements in reasoning capabilities through pure reinforcement learning and reducing reliance on human annotations [1][4]
- The cost of training DeepSeek-R1 is remarkably low at $294,000, significantly less than the estimated $100 million OpenAI reportedly spent on GPT-4 [3][4]
- The methodology employed by DeepSeek-R1, including pure reinforcement learning and the GRPO algorithm, allows the model to develop advanced behaviors such as self-reflection and self-verification without human reasoning demonstrations [4][5]

Cost Efficiency
- DeepSeek-R1's training cost is only $294,000; even counting the roughly $6 million spent training the base model, the total remains highly competitive against major players like OpenAI and Google [3][4]
- The model's cost efficiency is attributed to a focus on algorithmic innovation rather than extensive financial resources [8]

Methodological Innovation
- The research highlights a shift from traditional training methods to a framework that rewards correct answers rather than mimicking human reasoning paths, leading to the emergence of complex thinking patterns [4][9]
- DeepSeek-R1 achieved a significant accuracy increase on the AIME 2024 math competition, from 15.6% to 77.9%, and further to 86.7% with self-consistency decoding, surpassing average human performance; a minimal majority-vote sketch follows this summary [4][5]

Industry Impact
- The success of DeepSeek-R1 represents a pivotal moment in AI, indicating a potential shift from competition based on data and computational power to one focused on algorithmic innovation [9]
- The model's development is seen as a "methodological manifesto," showcasing a sustainable path for AI evolution that does not rely on vast amounts of labeled data [8][9]
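The jump from 77.9% (pass@1) to 86.7% reflects self-consistency decoding: sample several reasoning paths and keep the most common final answer. A minimal sketch with a hypothetical example; the number of sampled paths and the answers shown are assumptions, and only the majority-vote step is illustrated.

```python
from collections import Counter

def majority_vote(final_answers: list[str]) -> str:
    """Self-consistency decoding: among several independently sampled
    reasoning paths, return the most frequent final answer."""
    return Counter(final_answers).most_common(1)[0][0]

# Hypothetical example: 5 sampled solutions to one AIME-style problem.
samples = ["336", "112", "336", "48", "336"]
print(majority_vote(samples))  # -> "336"
```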
DeepSeek makes history! Chinese AI's "Nature moment"
Zheng Quan Shi Bao· 2025-09-18 07:29
Core Insights
- The DeepSeek-R1 inference model research paper has made history as the first Chinese large-model research to be published in the prestigious journal Nature, marking significant recognition of China's AI technology on the global scientific stage [1][2]
- Nature's editorial highlighted that DeepSeek has closed the gap in independent peer review for mainstream large models, which has been lacking in the industry [2]

Group 1: Research and Development
- The DeepSeek-R1 research paper underwent a rigorous peer review process involving eight external experts over six months, emphasizing the importance of transparency and reproducibility in AI model development [2]
- The paper disclosed significant details about training costs and methodologies, including a total training cost of $294,000 (approximately 2.09 million RMB) for R1, achieved using 512 H800 GPUs [3]

Group 2: Model Performance and Criticism
- DeepSeek addressed initial criticisms regarding the "distillation" method allegedly used in R1, clarifying that all training data was sourced from the internet without intentional use of outputs from proprietary models such as OpenAI's [3]
- The R1 training runs took 198 hours for R1-Zero and 80 hours for R1, showcasing a cost-effective approach compared with other models whose training often exceeds tens of millions of dollars [3]

Group 3: Future Developments
- There is significant anticipation for the release of the R2 model, with speculation that delays may be due to computational limitations [4]
- The recent release of DeepSeek-V3.1 indicates advancements toward the "Agent" era, featuring a hybrid inference architecture and improved efficiency, which has sparked interest in the upcoming R2 model [4][5]

Group 4: Industry Impact
- DeepSeek's adoption of UE8M0 FP8 scale parameter precision in V3.1 suggests a shift toward domestic AI chips, potentially accelerating the development of China's computing ecosystem; a sketch of the UE8M0 scale format follows this summary [5]
- The collaboration between software and hardware in DeepSeek's models is seen as a new paradigm in the AI wave, with expectations of significant performance improvements in domestic computing chips [5]
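UE8M0 names an FP8 scale format with 8 exponent bits and no mantissa: an unsigned power-of-two scale factor stored in one byte. Below is a minimal sketch of what encoding and decoding such a scale could look like; the bias of 127 and the rounding rule are assumptions for illustration, not a published specification.

```python
import math

def encode_ue8m0(scale: float, bias: int = 127) -> int:
    """Round a (positive) per-block scale to the nearest power of two and
    store its biased exponent in one byte: 8 exponent bits, 0 mantissa bits."""
    assert scale > 0, "UE8M0 is unsigned; scale must be positive"
    exp = round(math.log2(scale))
    return max(0, min(255, exp + bias))  # clamp into the 8-bit range

def decode_ue8m0(byte: int, bias: int = 127) -> float:
    return 2.0 ** (byte - bias)

b = encode_ue8m0(0.0478)   # e.g. a per-block quantization scale
print(b, decode_ue8m0(b))  # -> 123 0.0625 (nearest power of two)
```

A power-of-two-only scale can be applied with exponent arithmetic rather than a multiplier, which is one plausible reason the format suits hardware-efficient FP8 pipelines.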
A Chinese large model makes the cover of Nature for the first time: how far off is AI medicine's DeepSeek moment?
Di Yi Cai Jing· 2025-09-18 07:02
Group 1: AI in Drug Development
- AI has become a significant focus for multinational pharmaceutical companies, with substantial investments aimed at transforming the drug discovery process and generating breakthroughs in understanding biological data [3][4]
- The global proportion of clinical trials initiated by Chinese companies has increased from approximately 3% to 30% by 2024, positioning China as the second-largest clinical trial market [3]
- AI is expected to drive a new wave of drug development, becoming a crucial force in the transformation of new drug research and development [3][4]

Group 2: AI Applications in Medical Diagnosis
- Major medical institutions in China are actively promoting the integration of large models and AI agents in clinical applications, exemplified by the launch of the "Meta-Medical Simulation Laboratory" by Fudan University and technology companies [5]
- AI is changing the paradigm of diagnosis and treatment, with significant advancements in areas such as heart-rate screening, imaging analysis, and risk assessment [6]
- The application of AI in medicine hinges on three key factors: data quality, computational power, and algorithm optimization, all essential for effective clinical use [6]

Group 3: Challenges and Considerations
- Despite the potential of AI in drug discovery, significant challenges remain, including a roughly 90% failure rate in clinical trials and the need to address complex biological and regulatory issues [4]
- Ethical considerations are paramount: physicians remain the primary decision-makers in clinical settings, and responsibility for medical actions lies with them [6]
Omdia: 74.6% of China's Fortune 500 companies are deploying or already using GenAI
Zhi Tong Cai Jing Wang· 2025-09-18 06:59
Group 1
- The adoption rate of GenAI technology among China's Fortune 500 companies has reached 74.6%, driven by full-stack solutions from GenAI cloud giants and the rise of open-source models and tools [1]
- Leading GenAI providers in China include Alibaba Cloud and DeepSeek, serving 40% and 38% of Fortune 500 companies respectively, with a trend toward multi-vendor strategies: companies use an average of 2.1 GenAI suppliers [1]
- Open-source models play a crucial role in the rise of GenAI in China, providing openness, transparency, customization, and flexibility for rapid deployment of large models [1]

Group 2
- Adoption rates of GenAI vary significantly across industries: 100% in telecommunications, automotive, and IT; 90% in financial services; and 80% in manufacturing, influenced by digital-infrastructure maturity and regulatory environments [2]
- Companies are actively applying GenAI across scenarios including employee productivity, customer service, sales and marketing, and process optimization, with notable examples such as NIO generating 30% of its software code through GenAI [2]
- In customer service, companies like FAW Group improved query resolution rates from 37% to 84% using GenAI, while Ctrip saved 10,000 work hours daily through virtual assistants [2]

Group 3
- By 2025, the largest verticals for GenAI software revenue in China will be IT, healthcare, retail, consumer, and professional services, with continued growth expected through 2029 [3]
- Conversational tools are anticipated to be the most popular use case in the coming years, owing to the availability of language and text data and the maturity of language processing [3]
- Companies are encouraged to ensure GenAI deployments deliver a return on investment while prioritizing trustworthy, secure, and robust solutions; many are beginning to embrace the benefits of agent-based AI [3]
DeepSeek on the cover of Nature: team led by Liang Wenfeng responds to the "distillation" controversy for the first time
Feng Huang Wang· 2025-09-18 06:17
Core Insights
- The article highlights a significant achievement for China's AI sector: the publication of the DeepSeek-R1 model, which demonstrates a breakthrough in reducing the cost of training large language models while enhancing their reasoning capabilities [1][10]

Cost Efficiency
- DeepSeek-R1's training cost is remarkably low at $294,000, significantly less than the estimated $100 million OpenAI spent on GPT-4 and the tens of millions spent by other tech giants [6]
- Even including the approximately $6 million for foundational model training, the total cost remains substantially lower than that of international competitors [6]

Methodological Innovation
- The research team employed a pure reinforcement learning framework and introduced the Group Relative Policy Optimization (GRPO) algorithm, rewarding the model based solely on the correctness of final answers rather than on mimicking human reasoning paths [6][10]
- This unconventional training approach led to the emergence of advanced behaviors such as self-reflection and self-verification, allowing the model to generate extensive reasoning chains [7]

Performance Metrics
- DeepSeek-R1-Zero achieved an impressive accuracy of 77.9% on the American Invitational Mathematics Examination (AIME 2024), improving further to 86.7% with self-consistency decoding and surpassing the human average [7]
- The model's performance extends beyond mathematics and programming, demonstrating fluency and consistency in writing and question-answering tasks [7]

Leadership and Vision
- The success of DeepSeek-R1 is attributed to the leadership of Liang Wenfeng, who has a background in machine learning and a vision for AI's transformative potential [8]
- Liang's approach to team building emphasizes capability over experience, focusing on nurturing young talent to drive innovation [9]

Industry Implications
- The research represents a methodological declaration emphasizing a sustainable path for AI evolution, moving away from reliance on vast labeled datasets and high funding barriers [10]
- Competition in AI is expected to shift from a focus on data and computational power to one centered on algorithmic and intellectual innovation, with DeepSeek-R1 setting the stage for this new era [11]
DeepSeek responds for the first time to the "distilled OpenAI" accusations
Di Yi Cai Jing· 2025-09-18 05:34
Core Viewpoint
- DeepSeek's R1 model has gained significant attention after being published in the prestigious journal Nature, showcasing its ability to enhance reasoning capabilities through reinforcement learning without relying heavily on supervised data [3][11]

Group 1: Model Development and Training
- The training cost for the DeepSeek-R1 model was approximately $294,000, broken down as follows: R1-Zero training, $202,000; SFT dataset creation, $10,000; R1 training, $82,000 [10]
- DeepSeek-R1 utilized 512 (64 × 8) H800 GPUs for training, taking about 198 hours for R1-Zero and around 80 hours for R1; a back-of-the-envelope cost check follows this summary [10]
- The total training cost, including the earlier V3 model, remains significantly lower than competitors': around $6 million for V3 plus $294,000 for R1 [10]

Group 2: Model Performance and Validation
- DeepSeek's approach achieves significant improvements in reasoning capabilities through large-scale reinforcement learning, even without supervised fine-tuning [13]
- The model's ability to self-validate and reflect on its answers enhances its performance on complex programming and scientific problems [13]
- The R1 model has become the most popular open-source reasoning model globally, with over 10.9 million downloads on Hugging Face [10]

Group 3: Industry Impact and Peer Review
- The publication of the R1 model in Nature sets a precedent for transparency in AI research, addressing concerns about the reliability of benchmark tests and the potential for manipulation [11]
- The research emphasizes the importance of independent peer review in validating the capabilities of AI systems, crucial in an industry facing scrutiny over performance claims [11]
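The per-stage figures are consistent with a simple GPU-hour calculation. A back-of-the-envelope check, assuming a rental rate of about $2 per H800 GPU-hour (the rate is an assumption here, not a figure from this summary):

```python
gpus = 64 * 8                 # 512 H800 GPUs
rate = 2.0                    # assumed USD per GPU-hour
r1_zero = gpus * 198 * rate   # 512 * 198 * 2 = $202,752 ~ the $202,000 figure
r1 = gpus * 80 * rate         # 512 * 80  * 2 = $81,920  ~ the $82,000 figure
print(f"R1-Zero: ${r1_zero:,.0f}  R1: ${r1:,.0f}")
```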