Large Language Models

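AI Healthcare Enters the Precision "Deep Water Zone": OpenAI's Medical Evaluation Benchmark Lands, Large Models Accelerate Change | AI Healthcare Wave ㉑
21 Shi Ji Jing Ji Bao Dao · 2025-05-17 05:05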
Core Insights
- OpenAI has launched HealthBench, an open-source benchmark for evaluating the performance and safety of large language models in the healthcare sector, which has sparked widespread discussion in the industry [1][3]
- The benchmark was developed with the participation of 262 practicing doctors from 60 countries. It integrates 5,000 real medical dialogues and uses 48,562 unique scoring criteria written by doctors to support meaningful open-ended assessment [1][3]
- The introduction of HealthBench is expected to make the evaluation of AI medical models more scientific and comprehensive, accelerating the application of AI technology in healthcare and providing new development opportunities for related companies [1][3]

Group 1: HealthBench Overview
- HealthBench consists of 7 themes and 5 evaluation dimensions, focusing on areas such as emergency referrals and professional communication, with dimensions including accuracy and contextual understanding [3][4]
- OpenAI has also introduced two special versions of HealthBench: HealthBench Consensus, which includes 34 critical evaluation dimensions verified by doctors, and HealthBench Hard, which presents more challenging assessment scenarios [4]
- The credibility of HealthBench is supported by a meta-evaluation comparing model scores with human doctor scores, which showed high consistency in 6 out of 7 evaluation areas [4]

Group 2: Trends in AI Healthcare Applications
- The AI healthcare market is projected to grow at an annual rate of 43% from 2024 to 2032, potentially reaching a market size of $491 billion [6]
- AI is expected to enhance healthcare accessibility and efficiency, addressing issues like personnel shortages in hospitals and improving diagnostic accuracy [6]
- AI in healthcare has moved from rule-driven to data-driven approaches and is now entering a multi-modal integration phase, allowing for better understanding and modeling of diverse medical data [6][7]

Group 3: Future Directions in AI Models
- The focus of competition among large models has shifted from merely increasing parameter size to optimizing model efficiency and performance under limited computational resources [7]
- Key trends in AI applications within the pharmaceutical industry include the emergence of models as products, local and edge deployment, and rapid expansion of AI applications in research and development [7][8]
- The pharmaceutical industry is expected to see a rise in specialized models tailored for specific scenarios, enhancing the adaptability and effectiveness of AI solutions [7][8]
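As a rough sanity check on the market projection in Group 2, the short sketch below backs out the starting market size implied by a 43% annual growth rate and a $491 billion end value. It assumes 2024 is the base year and that growth compounds once per year for eight years; neither assumption is stated in the summary, and the function name is illustrative.

```python
def implied_base(final_value: float, annual_rate: float, years: int) -> float:
    """Starting value that compounds to final_value at annual_rate over `years` periods."""
    return final_value / (1.0 + annual_rate) ** years


# Assumption: 2024 is the base year, so 2024 -> 2032 is 8 compounding periods.
periods = 2032 - 2024
base_2024 = implied_base(491.0, 0.43, periods)  # in billions of USD

print(f"Implied 2024 market size: about ${base_2024:.1f} billion")
```

Under these assumptions the implied 2024 market is on the order of $28 billion. A different base year or compounding convention would shift that figure.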
Anthropic Co-founder Clark's Latest Interview: AI May Possess a Kind of "Alien Consciousness"
36 Ke · 2025-05-15 09:30
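May 15: Anthropic co-founder Jack Clark recently appeared on the podcast of Tyler Cowen, professor of economics at George Mason University, and shared his distinctive views on the future of AI. They discussed AGI's potential impact on the economy, the competitive landscape of large models, and the challenges of regulation and governance.

Clark believes that jobs in highly skilled craft fields such as gardening and electrical work will be the last to be replaced by AGI, because people pay not only for the skill but also for the craftsperson's taste and reputation.

On AI competition between countries, Clark believes most countries will eventually embrace strong AI. A few countries may reject large AI systems, but under globalization most will eventually join this system and will find it hard to remain independent of AI development.

Clark said that reaching a comprehensive global AI governance agreement is "very difficult", but that China and the United States may form a limited consensus on certain dangerous technologies, similar to a "nuclear non-proliferation" agreement. He does not think this would be "cooperation"; it is more likely a realist calculation driven by a shared interest in guarding against risk.

Highlights of the Clark interview follow:

01 Craft or creative work will be the last to be replaced by AGI

Q: Which jobs do you think will be affected by AGI last?

Clark: I think jobs that rely on manual skill, experienced judgment, and personal style may be the last ones AGI replaces. Skilled trades such as electricians, plumbers, or gardeners have many ...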
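ByteDance's Latest Large-Model Recipe: Train Only on Data with Reasoning Potential! A 1.3B Model Selects It Automatically, No Labels Needed
QbitAI (量子位) · 2025-05-15 06:26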
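By Xifeng, from Aofeisi | QbitAI (WeChat official account QbitAI)

Say goodbye to manually labeled data: the attention mechanism in a pretrained language model is enough to select training data that can elicit reasoning ability!

ByteDance's Seed team has just announced an important result: AttentionInfluence.

No training and no labels are needed. Simply using a 1.3B model to select data for a 7B model improves the model's reasoning ability, and even its code generation ability.

Previously, data filtering methods typically relied on supervised classifiers that required annotation by humans or large language models, which inevitably introduced domain-specific bias.

The ByteDance Seed team noticed that retrieval heads in pretrained models are closely tied to retrieval and in-context reasoning.

Retrieval heads emerge early in training, strengthen gradually, and become firmly established in the middle and late stages of training, playing a crucial role in model performance.

[Figure: evolution of retrieval heads in a 1.3B-parameter dense model]

But what happens if they are simply switched off?

The team uses a small pretrained language model, through a simple attention-head masking operation, as a data selector for a stronger model.

Specifically, they identify the important retrieval heads, mask those heads to create a degraded "weak" model, compute the loss difference between the "weak" model and the original "strong" model, and rank the data by how much the loss increases, which yields an influence score.

Unexpectedly, the experiments produced a striking result.

Applying Attent ...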
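The ranking step described above can be sketched in a few lines. This is a minimal illustration, not the Seed team's implementation: it assumes per-document losses from the original "strong" model and the head-masked "weak" model have already been computed, and it uses the plain loss difference as the influence score. The function names, the `keep_fraction` parameter, and the loss values are all hypothetical.

```python
from typing import Dict, List


def influence_scores(strong_loss: Dict[str, float],
                     weak_loss: Dict[str, float]) -> Dict[str, float]:
    """Score each document by how much its loss rises once retrieval heads are masked."""
    return {doc_id: weak_loss[doc_id] - strong_loss[doc_id] for doc_id in strong_loss}


def select_top(scores: Dict[str, float], keep_fraction: float) -> List[str]:
    """Return the highest-scoring documents, keeping at least one."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    keep = max(1, int(len(ranked) * keep_fraction))
    return [doc_id for doc_id, _ in ranked[:keep]]


# Hypothetical per-document losses (made-up numbers, for illustration only).
strong = {"doc_a": 2.10, "doc_b": 1.80, "doc_c": 2.50, "doc_d": 1.95}
weak = {"doc_a": 2.15, "doc_b": 2.60, "doc_c": 2.55, "doc_d": 2.40}

scores = influence_scores(strong, weak)
selected = select_top(scores, keep_fraction=0.5)
print(selected)
```

With these made-up numbers, the two documents whose loss rises most under masking are selected. Normalizing the difference by the strong model's loss is another plausible scoring choice, but the excerpt does not say which variant is used.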
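Technical Support Center of the East China Air Traffic Management Bureau Launches AI Agent System; ATC Communication and Navigation Services Enter the AI Era
Zhong Guo Min Hang Wang · 2025-05-15 04:28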
Core Insights
- The East China Air Traffic Management Bureau has successfully launched an intelligent system for air traffic control, marking a significant step in the transformation and upgrade of air traffic control services [1][5]
- The intelligent system integrates professional knowledge and business processes in air traffic control, enabling intelligent data analysis, fault simulation, and operational decision support [1][2]

Group 1: System Features
- The intelligent system is built on advanced large language models and reasoning technologies, allowing it to deeply understand air traffic control knowledge and processes, thereby enhancing operational efficiency [2][5]
- It serves as a "multi-faceted expert," breaking down information barriers and converting complex business processes into efficient digital solutions [2][5]

Group 2: Application Scenarios
- Four major intelligent application scenarios have been customized to enhance operational efficiency, including a qualification inspection application that significantly improves employee learning efficiency by reducing information retrieval time from over 10 minutes to around 1 minute, achieving a 90% efficiency increase [3]
- Other applications include automated log analysis for quick historical record retrieval, a rapid Q&A function for technical documents, and an emergency troubleshooting assistant that provides decision-making references based on historical fault cases [3]

Group 3: Technical Infrastructure
- The system development utilized the Dify platform, which supports low-code development and allows for flexible integration of various models, significantly simplifying the AI application construction process [4]
- The vLLM framework was employed for virtualization deployment, achieving high concurrency and low latency, which is crucial for the real-time and safety-critical nature of air traffic control [4]

Group 4: Future Directions
- The launch of the intelligent system demonstrates the feasibility of large model technology in high-safety industries, with plans for further integration of multimodal capabilities, including voice, image, and video recognition [5]
- The technical team aims to enhance the intelligent system's multi-source perception capabilities and promote the comprehensive implementation of various AI application scenarios [5]
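The "90% efficiency increase" quoted in Group 2 is consistent with reading it as a reduction in retrieval time. The sketch below works through that arithmetic using the rounded figures from the summary (10 minutes before, 1 minute after), which are approximations since the source says "over 10 minutes" and "around 1 minute". The function names are illustrative.

```python
def time_reduction(before_min: float, after_min: float) -> float:
    """Fraction of the original time that was saved."""
    return (before_min - after_min) / before_min


def speedup(before_min: float, after_min: float) -> float:
    """How many times faster the task completes."""
    return before_min / after_min


# Rounded figures from the summary; the real values are only approximate.
reduction = time_reduction(10.0, 1.0)
factor = speedup(10.0, 1.0)

print(f"Time saved: {reduction:.0%}, speedup: {factor:.0f}x")
```

Read this way, the same change can be described as either a 90% cut in retrieval time or a tenfold speedup, so the two descriptions should not be added together.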
A Small Model Trained for $100,000 Beats GPT-4o on Specific Tasks, with 99x Lower Latency
36 Ke · 2025-05-14 09:45
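Today's SOTA-level large language models are certainly highly capable, matching or exceeding human performance on some tasks, but their parameter counts often reach hundreds of billions or even trillions, which makes training, deployment, and inference all expensive. For enterprises and developers, these SOTA models are not necessarily the best choice in overall cost and performance for tasks that are relatively simple but demand large scale and high concurrency.

An early-stage startup called Fastino saw this pain point. Using low-end gaming GPUs, at an average cost of under $100,000, it trained a series of small models called "Task-Specific Language Models" (TLMs) that can match large language models on specific tasks while running inference 99 times faster.

Fastino recently raised a $17.5 million seed round led by Khosla Ventures, with participation from Insight Partners, Valor Equity Partners, and well-known angel investors including former Docker CEO Scott Johnston and Weights & Biases CEO Lukas Biewald. In November 2024, Fastino raised a $7 million pre-seed round led by M12 (part of Microsoft) and Insight Partners, for cumulative funding of nearly 25 million US ...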
Former Microsoft WizardLM Project Team Joins Tencent Hunyuan
news flash · 2025-05-14 06:27
Core Insights
- The creator of the WizardLM project, Xu Can, announced the team's departure from Microsoft to join Tencent's AI development organization, Hunyuan, with a focus on advancing LLM training technology and building better AI models [1]

Company Developments
- The WizardLM team consists of six key members, most of whom have left Microsoft to pursue their mission under Tencent [1]
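This Mysterious Chinese AI Team at Microsoft Joins Tencent Hunyuan; Move Reportedly Unrelated to Layoffs | Exclusive
AI前线 (AI Frontline) · 2025-05-14 05:47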
Core Viewpoint
- The WizardLM team, creators of advanced large language models, has transitioned from Microsoft to Tencent's AI development organization, Hunyuan, aiming to enhance LLM training technology and develop superior AI models [1][3][31].

Group 1: Team Transition and Background
- The WizardLM team, consisting of six key members, has left Microsoft amid speculation regarding layoffs affecting 3% of the workforce, although their departure is reportedly unrelated to these layoffs [4][6].
- The team was established in early 2023, focusing on the development of advanced large language models, with notable members including Qingfeng Sun and Can Xu, both of whom have significant experience in AI research [7][9][10].
- The team has previously contributed to the development of models such as WizardLM, WizardCoder, and WizardMath, and has published over 40 papers in top international conferences [10][13].

Group 2: Model Development and Achievements
- WizardLM has released models that outperform Google's Gemma 3 series and have ranked among the top four global large language models in competitions [3][16].
- The core algorithm, Evol-Instruct, allows for the efficient generation of complex instruction data, leading to superior performance in human evaluations compared to traditional methods [13][14][17].
- The WizardLM-30B model achieved a 97.8% score compared to ChatGPT in specific tests, showcasing its advanced capabilities [14].

Group 3: Tencent's AI Strategy
- Tencent has restructured its AI development framework, focusing on "computing power, algorithms, and data," and plans to invest approximately 124.9 billion USD in AI development [28][30].
- The company has established new technical departments dedicated to large language models and multimodal models, aiming to enhance AI capabilities in natural language processing and data integration [28][29].
- Following the acquisition of the WizardLM team, Tencent's ambition in the AI sector is expected to grow, with the team continuing to develop and release AI models [31].
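Beijing Guodian Tong Files Patent for Human Resource Management Based on Generative Adversarial Networks and Large Language Models, Diversifying Generated Virtual HR Data
Jin Rong Jie · 2025-05-14 03:56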
Group 1
- Beijing Guodian Tong Network Technology Co., Ltd. applied for a patent titled "A Human Resource Management Method Based on Generative Adversarial Networks and Large Language Models" [1]
- The patent aims to utilize generative adversarial networks to learn existing human resource management data and generate diverse virtual human resource management data [1]
- The method involves training a human resource management model using both real and virtual data to optimize human resource decision-making [1]

Group 2
- Beijing Guodian Tong Network Technology Co., Ltd. was established in 2000 with a registered capital of 73 million RMB and has invested in 4 companies [2]
- State Grid Information Communication Industry Group Co., Ltd. was founded in 2015 with a registered capital of approximately 1.5 billion RMB and has invested in 40 companies [2]
- The two companies have significant involvement in various projects, with Guodian Tong participating in 2019 bidding projects and State Grid participating in 5000 bidding projects [2]
The Mysterious "Catcher" Behind Tesla, Meituan, and NIO: I See No Durable Competitive Edge in Large Language Models
36 Ke · 2025-05-13 08:31
Group 1
- Baillie Gifford is a century-old investment firm based in Edinburgh, known for its value investment philosophy and long-term global growth strategy, focusing on identifying and investing in a select few high-quality companies with competitive advantages and innovation [1][2]
- The firm has made early investments in major tech companies, including Amazon in 2004, Illumina in 2011, Tesla in 2013, and Alibaba in 2014, demonstrating a strong track record in identifying growth opportunities [1]
- Baillie Gifford's significant investment in Tesla began with an $89 million stake in 2013, which grew to 14 million shares by 2017, resulting in a profit of approximately $17 billion after a seven-year holding period [1]

Group 2
- In 2016, Baillie Gifford participated in Meituan's first round of financing and held a 12.08% stake during its IPO in 2018, maintaining its position through market fluctuations [2]
- Peter Singlehurst, the firm's growth investment head, expressed confidence in ByteDance as a top investment opportunity, predicting a fivefold return despite current geopolitical tensions [2][5]
- The firm has developed a framework of ten core due diligence questions to assess a company's growth potential, focusing on long-term growth opportunities, competitive advantages, organizational culture, and financial analysis [3][4]

Group 3
- Baillie Gifford is cautious about investing in AI companies, particularly large language models, due to unclear competitive advantages at that level, despite recognizing the potential in foundational AI infrastructure [4][25]
- The firm emphasizes the importance of maintaining strategic focus and avoiding "force-fed" investments (rendered literally as "fill-duck" in the source), which can lead to overvaluation and misallocation of resources [4][20]
- The investment philosophy includes a focus on companies with strong return on equity (ROE) and sustainable business models, avoiding excessive capital influx that could distort long-term value [20][21]

Group 4
- Baillie Gifford's investment in Amazon and Tesla exemplifies its strategy of identifying companies with scalable business models and long-term growth potential, even when they are initially unprofitable [24][50]
- The firm believes that the current market conditions present unique opportunities for growth investments, particularly in companies that have demonstrated strong management and innovative business models [61][62]
- The firm continues to actively seek investment opportunities in China, despite geopolitical risks, as it believes the risk-reward ratio remains favorable [46][44]
A Conversation with PingCAP's Huang Dongxu: How Can Software Companies Ride the AI Wave?
Tai Mei Ti APP · 2025-05-13 06:04
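If one were to mark a dividing point on the timeline of the software industry, Wu Haiyan, managing partner of China Growth Capital (华创资本), would put it at 2021, because that year was both the valuation peak for the software industry and the year it was most sought after by investors. She therefore divides software companies into two groups: those that raised a lot of money in 2021 and those that did not. Since then, both groups have inevitably run into challenges, but the degree of difficulty and the paths chosen have been very different.

PingCAP, a China Growth Capital portfolio company, belongs to the camp that raised successfully in 2021. At the time, it had anticipated the macro environment ahead and seized the opportunity to accelerate its global expansion. As an enterprise-grade open-source distributed database vendor, PingCAP now serves customers in more than 20 countries and regions, and TiDB, the distributed relational database it created, continues to help enterprises get the most value from their data.

With the arrival of the AI wave, the value of data has risen to an unprecedented level. But the wave's impact goes well beyond that. How will AI profoundly change the interaction model and product form of enterprise software? What self-renewal and evolution should infrastructure software undergo in the AI era? Recently, PingCAP co-founder and CTO Huang Dongxu joined 「牛白丁」 to discuss with Wu Haiyan how software companies can ride the AI wave and deliver their own distinctive value to the industry.

Guest introduction: Huang Dongxu, PingCAP co-found ...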