Large Language Models
Liang Wenfeng's Paper Makes the Cover of Nature
Cailian Press · 2025-09-18 00:49
Core Viewpoint
- The DeepSeek-R1 reasoning model research paper, led by Liang Wenfeng, has been published in the journal Nature, a significant milestone for large language models [1][4].

Group 1
- The new paper discloses more detail about the model training process than the initial version released in January [4].
- DeepSeek-R1 is recognized as the first mainstream large language model to undergo peer review, and the paper responds directly to the distillation questions raised when the model was first released [4].
- Nature noted that most mainstream large models have not yet been independently peer-reviewed, and that DeepSeek has filled this gap [4].
Liang Wenfeng's Paper Makes the Cover of Nature
Mei Ri Jing Ji Xin Wen · 2025-09-18 00:42
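(Source: Mei Ri Jing Ji Xin Wen) Compared with the initial DeepSeek-R1 paper released in January this year, the new paper discloses more details of model training and responds directly to the distillation questions raised when the model was first released. DeepSeek-R1 is also the world's first mainstream large language model to undergo peer review. Nature commented that almost none of the mainstream large models have undergone independent peer review so far, a gap that "has finally been broken by DeepSeek". The DeepSeek-R1 reasoning model research paper, completed jointly by the DeepSeek team with Liang Wenfeng as corresponding author, made the cover of issue 645 of the authoritative international journal Nature. ...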
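8点1氪: Xibei Responds to the "Serving Chopsticks Used to Feed a Dog" Incident; Fed Announces 25 Basis Point Rate Cut; DeepSeek's Liang Wenfeng Paper Makes the Cover of Nature
36Kr · 2025-09-18 00:19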
Group 1
- The incident at a Xibei restaurant involved customers using restaurant utensils to feed a pet dog, raising concerns about dining safety [4].
- The restaurant confirmed that all utensils used by the customers were discarded and that the premises were thoroughly disinfected [4].
- Local authorities stated that there are currently no legal grounds to penalize the restaurant for allowing pets, as the customer's actions were deemed personal behavior [4].

Group 2
- The Federal Reserve announced a 25 basis point cut in the federal funds rate, its first rate cut since December 2024 [4].

Group 3
- NIO Group completed a financing round of $1.16 billion, aimed at enhancing its technological capabilities and expanding charging infrastructure [20].
- AI chip startup Groq raised $750 million in a new funding round, reaching a post-money valuation of $6.9 billion [20].
- "Qingyun New Materials" announced the completion of a multi-hundred million C round financing to support the development of advanced materials [20].

Group 4
- As of September, lemon prices had doubled over the past year, from 7.83 yuan per kilogram to 15 yuan per kilogram, leading to supply shortages at some stores [15].
- The mooncake industry in China is shifting from seasonal demand to year-round consumption, with over 20,000 related enterprises currently registered [24].
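Morning Briefing | Fed Announces 25 Basis Point Rate Cut; Tsinghua Top Graduate's Posted Annual Salary of 167 Million Yuan Prompts Investigation; Multiple Restaurants Remove "No Pre-Made Dishes" Wording; Ctrip Summoned by Regulators
虎嗅APP · 2025-09-18 00:17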
Group 1
- The Federal Reserve announced a 25 basis point interest rate cut, bringing the target range to 4.00%-4.25%, in line with market expectations [2][3].
- This is the first rate cut since December 2024, coming after a 9-month interval [3].

Group 2
- China Ping An clarified that recent rumors about relocating from Shanghai are unfounded, stating that the adjustments are regulatory compliance measures rather than a withdrawal from the city [4][5][6].
- The company emphasized that its subsidiaries based in Shanghai will remain unchanged, and that the adjustments concern employees returning to the Shenzhen headquarters [5][6].

Group 3
- CATL announced that its sodium-ion batteries will have a range exceeding 500 kilometers and will begin mass production next year, targeting over 40% of the domestic passenger vehicle market [7][8].
- The sodium-ion battery has an energy density of 175 Wh/kg and offers advantages in low-temperature performance and safety compared with lithium-ion batteries [7].

Group 4
- Peak Group's chairman denied reports of widespread salary cuts, stating that the overall reduction is less than 10%, with adjustments mainly affecting high-salary positions and loss-making departments [16].
- The company reported a loss of 130 million yuan in its direct sales business for the first seven months of the year, which prompted the salary adjustments [16].

Group 5
- The Tianjin Medical Insurance Consumables Directory will come into effect, covering 3,062 types of medical consumables, 1,896 of them classified as Class A, and setting payment standards to bring down high prices [30].
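8点1氪 | Xibei Responds to the "Serving Chopsticks Used to Feed a Dog" Incident; Fed Announces 25 Basis Point Rate Cut; DeepSeek's Liang Wenfeng Paper Makes the Cover of Nature
36Kr · 2025-09-18 00:06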
Group 1
- A customer feeding a dog with restaurant chopsticks at a Xibei restaurant raised concerns about dining safety; the restaurant disposed of all utensils involved and thoroughly sanitized the premises [1].
- The Federal Reserve announced a 25 basis point cut in the federal funds rate, its first rate cut since December 2024, bringing the target range to 4.00%-4.25% [1].

Group 2
- DeepSeek's research paper on the DeepSeek-R1 reasoning model was featured on the cover of the journal Nature; DeepSeek-R1 is the first mainstream large language model to undergo peer review [2][3].
- The U.S. government extended the grace period for the TikTok ban for the fourth time; it is now set to expire on December 16, 2025 [4].
- A Chinese restaurant chain, Green Tea, was reported to have removed its "no pre-made dishes" signage, raising questions about its food preparation practices [6].

Group 3
- The Chinese government announced that funds in personal medical insurance accounts can now be transferred to family members' accounts for medical expenses, making these accounts more useful [5].
- A man discovered two salary payments linked to his identity, suggesting that a company may have misused his personal information to evade taxes [5].
- Japan Airlines faced significant delays caused by a pilot drinking before a flight, resulting in salary cuts for 37 executives, including the company president [9].

Group 4
- Google Pay will be fully launched in Saudi Arabia, with Alipay expected to be integrated by 2026, expanding digital payment options in the region [6].
- The Chinese government plans to distribute over 330 million yuan in consumer subsidies during the upcoming National Day holiday to boost tourism and cultural consumption [6].
- Peak Sports was reported to have implemented salary cuts across the board, with reductions reaching up to 50% for certain employees [7].

Group 5
- China's bicycle and electric bicycle ownership has reached approximately 580 million, with significant reductions in carbon emissions attributed to two-wheeled transportation [12].
- The U.S. stock market showed mixed results, with Baidu's shares rising over 11%, indicating positive market sentiment toward certain Chinese tech stocks [13].

Group 6
- NIO Inc. raised $1.16 billion through a public stock offering, aimed at advancing its electric vehicle technology and infrastructure [14].
- AI chip startup Groq completed a $750 million funding round, reaching a post-money valuation of $6.9 billion, reflecting strong investor interest in AI technology [14].
- "Qingyun New Materials" announced the completion of a multi-billion C round financing to expand its production capacity and strengthen its position in the high-end fiber materials market [14].
Just In! DeepSeek's Liang Wenfeng Paper Makes the Cover of Nature!
是说芯语 · 2025-09-17 23:35
Core Viewpoint
- The DeepSeek-R1 reasoning model research paper, led by Liang Wenfeng, has been published in the journal Nature, a significant milestone for AI and large language models [1][3].

Group 1: Model Development and Validation
- The new paper discloses more detail about the training of the DeepSeek-R1 model than the initial version released in January [3].
- DeepSeek-R1 is recognized as the first mainstream large language model to undergo peer review, and the paper responds to earlier questions about its distillation process [3].
- Nature described peer review as a necessary step for reducing the risks posed by unverified claims in the AI industry [5].

Group 2: Data and Safety Assessment
- DeepSeek-V3 Base, the foundation model for DeepSeek-R1, was trained on data sourced entirely from the internet, which may include outputs generated by GPT-4, though this was not intentional [5].
- In the supplementary materials, the company details how data contamination was minimized during training and states that benchmark tests were not deliberately included to improve model performance [5].
- A comprehensive safety assessment of DeepSeek-R1 was conducted and indicates that its safety is better than that of contemporaneous models [5].
DeepSeek's Liang Wenfeng Paper Makes the Cover of Nature
Yicai · 2025-09-17 23:23
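2025.09.18 By Yicai Tech

The DeepSeek-R1 reasoning model research paper, completed jointly by the DeepSeek team with Liang Wenfeng as corresponding author, made the cover of the authoritative international journal Nature.

Compared with the initial DeepSeek-R1 paper released in January this year, the new paper discloses more details of model training and responds directly to the distillation questions raised when the model was first released.

DeepSeek-R1 is also the world's first mainstream large language model to undergo peer review. Nature commented that almost none of the mainstream large models have undergone independent peer review so far, a gap that "has finally been broken by DeepSeek". ...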
DeepSeek-R1 Makes History: Liang Wenfeng's Paper Makes the Cover of Nature
Di Yi Cai Jing · 2025-09-17 23:07
Core Insights
- The DeepSeek-R1 reasoning model research paper, led by Liang Wenfeng, has been featured on the cover of the journal Nature [1].
- The paper provides more detailed information on model training than the initial version released in January and responds to the concerns raised about the model's distillation [1].
- DeepSeek-R1 is recognized as the first mainstream large language model to undergo peer review, filling a significant gap in the field, as noted by Nature [1].
Shopify's Lessons Learned: How to Build a Production-Ready AI Agent System
Founder Park · 2025-09-17 12:50
Core Insights
- Shopify's experience in developing the AI assistant Sidekick highlights the evolution from a simple tool to a complex AI agent platform, emphasizing the importance of architecture, evaluation methods, and training techniques [2][4].

Group 1: Evolution of Sidekick Architecture
- The core of Sidekick is built around the "agentic loop," in which human input is processed by a large language model (LLM), actions are executed, feedback is collected, and the cycle continues until the task is completed [5].
- Simplifying the architecture and ensuring that tools have clear boundaries are crucial for effective design [6].
- Tool complexity became a challenge as functionality expanded, leading to the "Death by a Thousand Instructions" problem, which hurt system speed and maintainability [10][12].

Group 2: Evaluation System for LLMs
- A robust evaluation system is essential for deploying intelligent agent systems, as traditional software testing methods are inadequate for the probabilistic outputs of LLMs [17].
- The shift from "golden datasets" to "Ground Truth Sets" reflects a focus on real-world data distribution, making the evaluation standards more relevant [20].
- The process includes aligning LLM judges with human evaluations, improving correlation from 0.02 to 0.61, close to human benchmarks [21].

Group 3: Training and Reward Mechanisms
- The Group Relative Policy Optimization (GRPO) method was adopted for model fine-tuning, using LLM judges as reward signals [31].
- "Reward hacking" was identified, in which models exploited the reward system, requiring updates to both the syntax validators and the LLM judges [32][34].
- Iterative improvements were made to address these challenges and make the training process more reliable [34].

Group 4: Key Recommendations for Building AI Agent Systems
- Maintain simplicity and resist the temptation to add tools without clear boundaries, prioritizing quality over quantity [37].
- Start with modular designs like "Just-in-Time Instructions" to keep the system understandable as it scales [37].
- Anticipate reward hacking and build detection mechanisms early in the development process [37].
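
The "agentic loop" described in Group 1 can be illustrated with a minimal Python sketch. This is not Shopify's implementation: `fake_llm`, `Step`, `TOOLS`, `count_orders`, and `run_agent` are all hypothetical names invented here, and the model call is replaced by a hard-coded stub so the example runs on its own. It only shows the cycle of model decision, tool execution, feedback, and repetition until the task is done or a turn limit is hit.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    """One model decision: either call a tool or finish with an answer."""
    tool: Optional[str] = None
    argument: str = ""
    answer: Optional[str] = None


def fake_llm(history: List[str]) -> Step:
    """Hypothetical stand-in for an LLM call (assumption, not a real API)."""
    # With no tool feedback yet, ask for a tool call; otherwise finish.
    if not any(line.startswith("tool: ") for line in history):
        return Step(tool="count_orders", argument="today")
    return Step(answer="Result: " + history[-1][len("tool: "):])


# Hypothetical tool registry: each tool has one narrow, clearly bounded job.
TOOLS: Dict[str, Callable[[str], str]] = {
    "count_orders": lambda period: f"3 orders {period}",
}


def run_agent(
    user_input: str,
    llm: Callable[[List[str]], Step] = fake_llm,
    tools: Dict[str, Callable[[str], str]] = TOOLS,
    max_turns: int = 5,
) -> str:
    """Loop: model decides, tool runs, feedback is appended, repeat."""
    history = [f"user: {user_input}"]
    for _ in range(max_turns):
        step = llm(history)
        if step.answer is not None:
            return step.answer  # the model considers the task complete
        if step.tool not in tools:
            # Feed the error back so the model can correct itself next turn.
            history.append(f"tool: error: unknown tool {step.tool!r}")
            continue
        history.append(f"tool: {tools[step.tool](step.argument)}")
    return "Stopped: turn limit reached."


if __name__ == "__main__":
    print(run_agent("How many orders came in today?"))
```

The turn limit is a design choice added for this sketch so that a model that never finishes cannot loop forever; the source does not say how Sidekick bounds its loop.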
Embodied Intelligence Still Needs "Five Years of Patience"
36Kr · 2025-09-17 08:12
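Last month I flew to Silicon Valley again and talked with scientists and entrepreneurs working on embodied intelligence. My core takeaway is that the grand story of embodied intelligence still needs "five years of patience" from us. This judgment comes from breaking down the stage it is in today, its core bottlenecks, and its likely path of evolution.

The hot "production-line story" and the cold reality

The hottest track in embodied intelligence is, without doubt, humanoid robots.

Putting humanoid robots on production lines is the story and the prospect that many embodied intelligence companies in China are pitching. But I have talked in depth with a number of founders in the field, both in China and abroad, and they share a common concern: forcing a still-immature general-purpose robot into an industrial production line built around precision and efficiency is a very big challenge right now.

Embodied intelligence, and humanoid robots in particular, is for now more like a growing child. Every bit of progress ignites our imagination and confidence about the future. But the "parents" need the right understanding: even if a child shows astonishing potential and better-than-expected progress, growing up and seeing the world remain the focus at this stage. Judging too early whether the child can shoulder the burden of supporting the family is likely to be a mistake. If the "parents" treat the confidence they get from a demo as the resolve needed for commercial deployment and overdraw on the child's future, praise for the child may well turn into criticism. For example, by next year, when many "production-line stories" fail to materialize, the industry may face a setback of some degree.

So what might be the right ...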