Large Language Models
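8点1氪 | Xibei responds to the "serving chopsticks used to feed a dog" incident; Federal Reserve announces a 25-basis-point rate cut; DeepSeek's Liang Wenfeng paper makes the cover of Nature
36Kr · 2025-09-18 00:06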
Group 1
- A customer feeding a dog with restaurant chopsticks at a Xibei (西贝) restaurant raised concerns about dining safety; the restaurant disposed of all utensils involved and thoroughly sanitized the premises [1]
- The Federal Reserve announced a 25 basis point cut in the federal funds rate, its first cut since December 2024, bringing the target range to 4.00%-4.25% [1]

Group 2
- DeepSeek's research paper on the DeepSeek-R1 reasoning model was featured on the cover of Nature; DeepSeek-R1 is described as the first mainstream large language model to undergo peer review [2][3]
- The U.S. government extended the grace period for TikTok's ban for the fourth time; it is now set to expire on December 16, 2025 [4]
- The Chinese restaurant chain Green Tea was reported to have removed its "no pre-made dishes" signage, raising questions about its food preparation practices [6]

Group 3
- The Chinese government announced that funds in personal medical insurance accounts can now be transferred to family members' accounts to cover medical expenses [5]
- A man discovered two salary payments linked to his identity, suggesting that a company may have misused his personal information to evade taxes [5]
- Japan Airlines faced significant delays because a pilot drank before a flight, resulting in salary cuts for 37 executives, including the company president [9]

Group 4
- Google Pay will be fully launched in Saudi Arabia, with Alipay expected to be integrated by 2026 [6]
- The Chinese government plans to distribute over 330 million yuan in consumer subsidies during the upcoming National Day holiday to boost tourism and cultural consumption [6]
- Peak Sports was reported to have cut salaries across the board, with reductions of up to 50% for certain employees [7]

Group 5
- China's bicycle and electric bicycle ownership has reached approximately 580 million, with significant reductions in carbon emissions attributed to two-wheeled transportation [12]
- The U.S. stock market showed mixed results; Baidu's shares rose over 11%, indicating positive sentiment toward certain Chinese tech stocks [13]

Group 6
- NIO Inc. raised $1.16 billion through a public stock offering to advance its electric vehicle technology and infrastructure [14]
- AI chip startup Groq completed a $750 million funding round at a post-money valuation of $6.9 billion [14]
- "Qingyun New Materials" announced the completion of a multi-billion C round of financing to expand production capacity and strengthen its position in the high-end fiber materials market [14]
Just in: DeepSeek's Liang Wenfeng paper makes the cover of Nature!
是说芯语· 2025-09-17 23:35
Core Viewpoint
- The DeepSeek-R1 reasoning model research paper, with Liang Wenfeng as corresponding author, has been published on the cover of Nature, a significant milestone for AI and large language models [1][3].

Group 1: Model Development and Validation
- The paper gives more detail on the training of DeepSeek-R1 than the initial version released in January [3].
- DeepSeek-R1 is recognized as the first mainstream large language model to undergo peer review, and the paper addresses earlier questions about distillation [3].
- Nature describes peer review as a necessary step to mitigate the risk of unverified claims in the AI industry [5].

Group 2: Data and Safety Assessment
- DeepSeek-V3 Base, the foundation model for DeepSeek-R1, was trained on data sourced entirely from the internet, which may include outputs generated by GPT-4, though this was not intentional [5].
- The supplementary materials describe in detail how data contamination was minimized during training, to show that benchmark tests were not deliberately included to inflate model performance [5].
- A comprehensive safety assessment of DeepSeek-R1 was conducted; it reports that the model's safety is superior to that of contemporaneous models [5].
DeepSeek's Liang Wenfeng paper makes the cover of Nature
第一财经· 2025-09-17 23:23
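2025.09.18 Author | 一财科技 (Yicai Tech) WeChat editor | 七三

The DeepSeek-R1 reasoning model research paper, completed jointly by the DeepSeek team with Liang Wenfeng as corresponding author, has made the cover of the leading international journal Nature.

Compared with the initial DeepSeek-R1 paper released in January this year, this paper discloses more details of model training and responds directly to the distillation doubts raised when the model was first released. DeepSeek-R1 is also the world's first mainstream large language model to undergo peer review. Nature commented that almost no mainstream large models have yet gone through independent peer review, a gap that was "finally broken by DeepSeek". ...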
DeepSeek-R1 makes history: Liang Wenfeng's paper lands on the cover of Nature
Di Yi Cai Jing· 2025-09-17 23:07
Core Insights
- The DeepSeek-R1 reasoning model research paper, with Liang Wenfeng as corresponding author, has been published on the cover of the journal Nature [1]
- The paper provides more detail on model training than the initial version released in January and addresses concerns raised about distillation [1]
- DeepSeek-R1 is recognized as the first mainstream large language model to undergo peer review; Nature notes that this fills a gap, since almost no mainstream large models had been independently peer reviewed [1]
Shopify's lessons learned: how do you build a production-ready AI agent system?
Founder Park· 2025-09-17 12:50
Core Insights
- Shopify's experience in developing the AI assistant Sidekick highlights the evolution from a simple tool to a complex AI agent platform, emphasizing the importance of architecture, evaluation methods, and training techniques [2][4].

Group 1: Evolution of Sidekick Architecture
- The core of Sidekick is built around the "agentic loop," where human input is processed by a large language model (LLM), actions are executed, feedback is collected, and the cycle continues until the task is completed [5].
- Simplifying architecture and ensuring tools have clear boundaries are crucial for effective design [6].
- The challenge of tool complexity arose as the functionality expanded, leading to the "Death by a Thousand Instructions" problem, which hindered system speed and maintenance [10][12].

Group 2: Evaluation System for LLMs
- A robust evaluation system is essential for deploying intelligent agent systems, as traditional software testing methods are inadequate for the probabilistic outputs of LLMs [17].
- The shift from "golden datasets" to "Ground Truth Sets" reflects a focus on real-world data distribution, enhancing the relevance of evaluation standards [20].
- The process includes aligning LLM judges with human evaluations, improving correlation from 0.02 to 0.61, close to human benchmarks [21].

Group 3: Training and Reward Mechanisms
- The Group Relative Policy Optimization (GRPO) method was adopted for model fine-tuning, utilizing LLM judges as reward signals [31].
- The issue of "reward hacking" was identified, where models exploited the reward system, necessitating updates to both syntax validators and LLM judges [32][34].
- Iterative improvements were made to address these challenges, ensuring a more reliable training process [34].

Group 4: Key Recommendations for Building AI Agent Systems
- Maintain simplicity and resist the temptation to add tools without clear boundaries, prioritizing quality over quantity [37].
- Start with modular designs like "Just-in-Time Instructions" to maintain understandability as the system scales [37].
- Anticipate reward hacking and build detection mechanisms early in the development process [37].
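
The agentic loop described under Group 1 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Shopify's implementation: `fake_llm`, the `count_orders` tool, and the `max_steps` cap are hypothetical names invented for the example, and a real system would replace `fake_llm` with a call to an actual model.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry: each tool has one narrow, clearly bounded job.
TOOLS: Dict[str, Callable[[str], str]] = {
    "count_orders": lambda arg: "42 orders match '" + arg + "'",
}

PREFIX = "tool_result:"


def fake_llm(history: List[str]) -> Tuple[str, str]:
    """Stand-in for a real LLM call (an assumption of this sketch).

    Returns ("tool", "<name>:<arg>") to request an action, or
    ("final", "<answer>") once some tool feedback is in the history.
    """
    if not any(line.startswith(PREFIX) for line in history):
        return ("tool", "count_orders:last week")
    feedback = history[-1][len(PREFIX):].strip()
    return ("final", "Done: " + feedback)


def run_agent(user_input: str, max_steps: int = 5) -> str:
    """Loop: model decides, tool executes, feedback is appended, repeat."""
    history = ["user: " + user_input]
    for _ in range(max_steps):
        kind, payload = fake_llm(history)
        if kind == "final":
            return payload
        name, _, arg = payload.partition(":")
        tool = TOOLS.get(name)
        result = tool(arg) if tool else "unknown tool '" + name + "'"
        history.append(PREFIX + " " + result)
    # Hard cap so a confused model cannot loop forever.
    return "stopped: step limit reached"


if __name__ == "__main__":
    print(run_agent("How many orders came in last week?"))
```

The step cap and the small tool registry reflect the article's advice to keep the architecture simple and give every tool a clear boundary; both are design choices of this sketch rather than details confirmed by the source.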
Embodied intelligence still needs "five years of patience"
36Kr · 2025-09-17 08:12
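Last month I flew to Silicon Valley again and talked with scientists and founders working on embodied intelligence. My core takeaway: the grand story of embodied intelligence still requires "five years of patience" from us. That judgment comes from breaking down the stage the field is in today, its core bottlenecks, and its likely path of evolution.

The red-hot "production line story" and the cold reality

The hottest segment of embodied intelligence is, without question, humanoid robots.

Putting humanoid robots on production lines is the story and the prospect that many Chinese embodied-intelligence companies are pitching. But in in-depth conversations with a number of founders in the field, in China and abroad, the common worry was this: forcing a still-immature general-purpose robot into an industrial production line built around precision and efficiency is, for now, a very hard problem.

Embodied intelligence, and humanoid robots in particular, is for now more like a growing child. Every bit of progress fuels our imagination and confidence about the future. But the "parents" need the right expectations: even if a child shows astonishing potential and better-than-expected progress, growing up and seeing the world remain the priorities at this stage. Judging this early whether the child can shoulder the burden of supporting the family is likely a mistake. If the "parents" mistake the confidence inspired by a demo for a commitment to commercial deployment and overdraw the child's future, praise may well turn into criticism. For example, when many "production line stories" fail to materialize next year, the industry may suffer a setback.

So what might be the right ...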
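Morning Brief | Liu Qiangdong: recently asked Wang Xing to meet again; university responds to a male international student sharing housing with female students; "car-roof protest" owner wins her first case against Tesla; Tai Er responds to controversy over killing live fish to order in its stores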
虎嗅APP· 2025-09-17 00:20
Group 1
- Microsoft plans to invest over $30 billion in the UK over the next four years, including an additional $15.5 billion for capital expansion on top of the previously announced $3.2 billion for data center infrastructure [2][3]
- The investment also includes $15.1 billion for business operations in the UK, such as an AI lab in London and gaming projects [4]

Group 2
- Two eVTOL aircraft from XPeng AeroHT (小鹏汇天) collided at the Changchun Airshow, injuring at least one passenger; the company is verifying the details [5][6]

Group 3
- JD.com founder Liu Qiangdong announced plans for a new hotel development strategy, emphasizing the need to avoid price wars that could harm service quality and profitability [7][19]

Group 4
- The U.S. National Highway Traffic Safety Administration is investigating approximately 174,000 Tesla Model Y vehicles over door handles that may malfunction when battery power is low [8]
- A Beijing court ruled that Tesla must provide complete driving data from the 30 minutes prior to an accident, affirming consumers' right to information [15][16]

Group 5
- Google's Gemini AI app has surpassed OpenAI's ChatGPT in the Apple App Store's free app rankings, indicating significant user adoption of and interest in Google's generative AI [9][10]

Group 6
- Anta Group reported dismissing 74 employees for serious misconduct as part of its anti-corruption efforts, with 46 individuals referred to judicial authorities for criminal activities [13][14]
- The company aims to enhance its internal auditing and oversight mechanisms by 2025 [14]

Group 7
- Kering Group confirmed a data breach affecting millions of customers of brands such as Gucci and Balenciaga, although financial information was not compromised [20]
- The company has notified affected customers and reported the incident to the relevant data protection authorities [20]

Group 8
- The Global Public Safety Cooperation Forum will open in Lianyungang, with over 800 guests from 176 countries and international organizations attending; it aims to release a comprehensive public safety index report [29]
- Meta's Connect 2025 conference will focus on the integration of AI glasses and the metaverse, with the anticipated launch of a new smart eyewear product [30]

Group 9
- The Chinese government plans to introduce consumption stimulus measures, including appliance trade-in programs and tourism subsidies, to stimulate a trillion-dollar market [31]

Group 10
- Ant Group's CEO predicts that large language models may eliminate the need for traditional apps as intelligent agents take over various tasks, reshaping human-computer interaction [32][33]
Starting at 235,900 yuan, the AUDI E5 Sportback goes on sale
Bei Jing Shang Bao· 2025-09-16 14:26
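Beijing Business Today (reporter Liu Xiaomeng): On September 16, the AUDI E5 Sportback, the first strategic model of the AUDI brand, officially went on sale in four trims: Pioneer, Pioneer plus, Pioneer quattro, and Flagship quattro, with official guide prices of 235,900 to 319,900 yuan.

The AUDI E5 Sportback reportedly runs the new AUDI OS operating system on a Qualcomm Snapdragon 8295 digital cockpit chip, forming what the company calls an immersive, interactive smart cockpit. The Audi Assistant at the center of the cockpit, built on a deeply customized version of Volcano Engine's large language model "Doubao" (豆包), offers strong semantic understanding, multi-turn dialogue, and vehicle-control interaction. The "Audi Smart Island" in the center armrest area integrates more than 50 customizable shortcut functions that can be operated without looking. In assisted driving, Audi has partnered closely with Momenta to jointly develop a "German Driving DNA + end-to-end flywheel large model" solution covering urban, highway, and parking scenarios. Alongside the launch, SAIC Audi is accelerating its channel buildout: according to the plan, by the end of this year it expects to have more than 240 full-function user centers, combining sales and experience, in over 100 cities nationwide. ...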
IPO Research | China's insurance AI technology total addressable market is expected to reach 1.35 trillion yuan by 2029
Sou Hu Cai Jing· 2025-09-16 10:32
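China's insurance market has been growing rapidly: premiums rose from RMB 4.5 trillion in 2020 to RMB 5.7 trillion in 2024, a compound annual growth rate (CAGR) of 5.9%. The market is expected to grow further, reaching RMB 9.8 trillion in 2029, a CAGR of 11.5% from 2024 to 2029. Within this, the health insurance market grew from RMB 0.8 trillion in 2020 to RMB 1.0 trillion in 2024, a CAGR of 4.6%. Driven by rising public health awareness and continuous innovation in products, technology, and services, China's health insurance premiums are expected to reach RMB 1.7 trillion in 2029, a CAGR of 11.6% from 2024 to 2029.

Although China ranked as the world's second-largest insurance market by premiums in 2023, its insurance penetration was only 3.9% and its insurance density was USD 516, well below the global levels of 7.0% penetration and USD 889 density. China's insurance industry therefore has both strong growth momentum and substantial room to develop.

The insurance industry is undergoing a major shift driven by technological development and data integration. The spread of AI, especially agents built on large language models, is improving operating efficiency across the insurance value chain, supporting product design, user operations, underwriting, claims review and investigation, and health management services. AI-driven solutions enable insurers to build more precise and efficient ...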
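
The growth figures quoted above are compound annual growth rates, which follow the standard formula CAGR = (end / start)^(1 / years) - 1. The short sketch below recomputes two of them from the rounded premium figures in the text. Because the inputs are rounded to one decimal place, the recomputed rates (roughly 6.1% and 11.4%) differ slightly from the reported 5.9% and 11.5%; the function and variable names are illustrative only.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return (end / start) ** (1 / years) - 1


# Premiums in RMB trillion, as rounded in the text above.
total_2020_2024 = cagr(4.5, 5.7, 4)   # reported as 5.9%
total_2024_2029 = cagr(5.7, 9.8, 5)   # reported as 11.5%

if __name__ == "__main__":
    print(f"2020-2024 total premiums: {total_2020_2024:.1%}")
    print(f"2024-2029 total premiums: {total_2024_2029:.1%}")
```

The health insurance figures (0.8 to 1.0 and 1.0 to 1.7) are rounded even more coarsely relative to their size, so recomputing them this way gives a looser match to the reported rates.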
If a scientific task can be scored, AI can achieve SOTA results | Google's latest paper
量子位· 2025-09-15 05:57
Core Viewpoint
- The article discusses a new AI system developed by Google that assists scientists in creating expert-level empirical software, achieving state-of-the-art (SOTA) results across various scientific fields [10][12][30].

Group 1: AI System Development
- The AI system utilizes a combination of Large Language Models (LLMs) and tree search algorithms to systematically improve software quality metrics [10][17].
- It addresses the slow and labor-intensive process of developing empirical software, which often takes years to complete [14][15].
- The system can automatically create empirical software for quantifiable tasks, significantly enhancing the efficiency of scientific research [17][24].

Group 2: Performance and Achievements
- In bioinformatics, the system discovered 40 novel methods for single-cell data analysis, outperforming top human-developed methods on public leaderboards [25][30].
- In epidemiology, it generated 14 models that surpassed the CDC ensemble and all other individual models for forecasting COVID-19 hospitalizations [10][30].
- The system also produced state-of-the-art software for geospatial analysis, neural activity prediction in zebrafish, time series forecasting, and numerical solutions of integrals [10][30].

Group 3: Methodology and Innovation
- The AI system enhances code mutation capabilities by injecting research ideas from highly cited papers, textbooks, and search engine results [21][24].
- It generates numerous candidate software solutions and employs tree search algorithms to filter and optimize these candidates [17][24].
- The integration of complex research ideas allows the system to explore a vast solution space, leading to the discovery of high-quality solutions [24][30].

Group 4: Community Response and Implications
- The article notes that the introduction of AI in scientific research has sparked discussions about the appropriateness of delegating research authority to AI [32].
- There are concerns regarding the reliability of AI-generated results and the need for human oversight in the verification process [32][40].
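
The "generate candidates, score them, search the tree" idea summarized under Groups 1 and 3 can be illustrated with a toy best-first search. This is a sketch under loud assumptions and not the method from the Google paper, whose exact search algorithm is not given in the summary: here a candidate is a single number rather than a program, `mutate` is a random perturbation standing in for an LLM rewriting code, and `score` is a made-up quality metric.

```python
import heapq
import random
from typing import Callable, List, Tuple


def tree_search(
    root: float,
    score: Callable[[float], float],
    mutate: Callable[[float, random.Random], float],
    iterations: int = 200,
    children_per_node: int = 3,
    seed: int = 0,
) -> Tuple[float, float]:
    """Best-first search: always expand the highest-scoring unexpanded node."""
    rng = random.Random(seed)
    best, best_score = root, score(root)
    # heapq is a min-heap, so scores are negated; the counter breaks ties
    # and keeps candidates themselves from ever being compared.
    frontier: List[Tuple[float, int, float]] = [(-best_score, 0, root)]
    counter = 1
    for _ in range(iterations):
        if not frontier:
            break
        _, _, node = heapq.heappop(frontier)
        for _ in range(children_per_node):
            child = mutate(node, rng)       # stand-in for an LLM rewrite
            child_score = score(child)      # the task's quality metric
            heapq.heappush(frontier, (-child_score, counter, child))
            counter += 1
            if child_score > best_score:
                best, best_score = child, child_score
    return best, best_score


def toy_score(x: float) -> float:
    """Hypothetical metric: higher is better, maximum at x = 3."""
    return -(x - 3.0) ** 2


def toy_mutate(x: float, rng: random.Random) -> float:
    """Hypothetical mutation: small random perturbation of the candidate."""
    return x + rng.gauss(0.0, 0.5)


if __name__ == "__main__":
    print(tree_search(0.0, toy_score, toy_mutate))
```

The only requirement the search places on the task is that `score` returns a number, which mirrors the article's point that the approach applies wherever a scientific task can be scored.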