Language Models

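ICML 2025 Spotlight | Multimodal Large Models Exposed? The EMMA Benchmark Takes a Deep Look at Multimodal Reasoning
Ji Qi Zhi Xin (机器之心) · 2025-05-20 04:58
"Three point charges, +Q, -2Q and +3Q, are placed at equal distances from one another. Which vector best describes the direction of the net electric force on the +Q charge?" We can solve this problem easily by sketching a force diagram. Yet even advanced multimodal large language models such as GPT-4o can misjudge the direction of a repulsive force when applying the basic principle that like charges repel (for example, placing the repulsion that +3Q exerts on +Q toward the lower right instead of the correct upper left). This seemingly simple physics problem exposes a "fatal flaw" in multimodal large models: current MLLMs still cannot perform complex multimodal reasoning that requires deep fusion of vision and text. The EMMA benchmark, introduced in a recent study, acts like a revealing mirror, showing that even top MLLMs fall significantly short on this key capability. The study has been accepted to ICML 2025 as a spotlight, and its code and data are fully open source. Several models and methods have already been evaluated on EMMA. The study finds that even the most advanced model, Gemini-2.5-pro-exp-03-25, and the o3/o4-mini models, which can call visual tools, still trail human experts on EMMA by more than 20%. Title: Can MLLMs Reason in Multi ...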
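Large Language Models Out-Persuade Humans in Online Debates
news flash · 2025-05-19 22:01
An artificial intelligence (AI) study published on the 19th in Nature Human Behaviour found that in online debates, large language models (LLMs) such as GPT-4 are 64% more persuasive than human debaters when they can tailor their arguments to personalized information about their opponents. The results demonstrate GPT-4's ability to generate targeted, persuasive arguments and reveal the potential of AI tools to influence human opinion. They also point to the need for further research on reducing the risks that arise when AI persuades people. ...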
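Zhanpeng Technology Accelerates Integration of Its Two Core Businesses; Sharply Intensified Competition in the Elevator Parts Industry Pushed Last Year's Revenue Down Year on Year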
Zheng Quan Ri Bao· 2025-05-19 16:11
Core Viewpoint - In 2024, the company's revenue and net profit both declined, which it attributes to intensified competition in the elevator parts industry and challenges in the military simulation sector [2][3] Financial Performance - The company reported total revenue of 469 million yuan in 2024, a year-on-year decrease of 6.80% - The net profit attributable to shareholders was 9.96 million yuan, down 87.80% year-on-year - In Q1 2025, revenue fell by 25.86% to 54.24 million yuan, with a net profit of -15.13 million yuan, a shift from profit to loss [2][3] Business Segments - The company has established a dual business model focused on elevator control systems and military simulation products, with the latter contributing significantly to profits in 2024 - Excluding the military simulation segment, the elevator control systems business reported a net loss of 6.96 million yuan [2][3] Industry Challenges - The elevator parts industry faces unprecedented challenges, including fierce competition and seasonal downturns, which weigh on overall revenue and profit [2][3] - The military simulation business has distinctive industry characteristics, which led to fewer contracts being verified and less revenue being generated [3] Strategic Developments - The company acquired a controlling stake in Beijing Lingwei Junrong Technology Co., Ltd., strengthening its dual business structure [2] - The military simulation segment develops products for aviation combat training; a key product is the portable general digital air combat simulation system [3] Integration and Collaboration - The company is integrating resources between its existing operations and the newly acquired military simulation business, aiming for efficient resource allocation [3][4] - A new facility for the military simulation segment has been established, facilitating collaborative R&D in several technical areas [3] Future Focus - The company plans to enhance its elevator control systems by developing new products and exploring IoT-based intelligent monitoring solutions - The military simulation segment aims to upgrade its product platform by incorporating large language models to improve performance and usability [5]
Paratera Technology (并行科技, 839493) - Investor Relations Activity Record
2025-05-19 12:05
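Stock code: 839493 Stock abbreviation: 并行科技 Announcement No.: 2025-053
Beijing Paratera Technology Co., Ltd. (北京并行科技股份有限公司) Investor Relations Activity Record
The company and all members of its board of directors guarantee that the content of this announcement is true, accurate and complete, contains no false records, misleading statements or material omissions, and accept individual and joint legal liability for its truthfulness, accuracy and completeness.
I. Type of investor relations activity
□ Targeted-investor research √ Earnings briefing □ Media interview □ Site visit □ Press conference □ Analyst meeting □ Roadshow □ Other
II. Details of the investor relations activity
Time: May 16, 2025, 15:00-17:00
Venue: the "Investor Relations Interactive Platform" of 全景网 (https://ir.p5w.net)
Participants: investors who joined the company's 2024 annual report earnings briefing online.
Company representatives: Chairman and General Manager Mr. Chen Jian (陈健); Director and Deputy General Manager Mr. Qiao Nan (乔楠); Chief Financial Officer Ms. Yang Aihong (杨爱红); Board Secretary Mr. Shi Jianwei (师健伟); Mr. Ni Jiawei (倪佳伟), Deputy General Manager of the Growth Enterprise Investment Banking Department at the sponsor, CICC (中金公司); sponsor representative Mr. Zhang Weijian (张伟健)
III. Main content of the investor relations activity
The earnings briefing introduced the company's development and its 2024 operating results through an annual report explainer video, image displays and other formats. At the same time, the company ...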
Aurora Mobile Expects Q1 2025 Revenue Significantly Above Its Guidance
Ge Long Hui· 2025-05-19 07:56
Core Viewpoint - Aurora Mobile has raised its revenue guidance for Q1 2025, expecting revenue between RMB 87 million and RMB 90 million, representing a year-on-year growth of approximately 35% to 40% [1][2]. Financial Performance - The expected revenue for Q1 2025 is between RMB 87 million and RMB 90 million, compared to RMB 64.5 million in the same period of 2024, indicating a growth of about 35% to 40% [2]. - The adjusted net loss is anticipated to be between RMB 1 million and RMB 2 million, an improvement from a net loss of RMB 2.6 million in the same period of 2024 [3]. - As of March 31, 2025, the company's cash and cash equivalents are expected to be between RMB 113 million and RMB 114 million, down from RMB 119.5 million as of December 31, 2024 [3]. Business Growth Drivers - EngageLab, a core component of the company's overseas business, has shown strong growth with revenue increasing over 120% year-on-year [2]. - The launch of a large language model (R1 LLM) by a client has driven significant demand, contributing to revenue growth for Aurora Mobile [2]. - The company's financial risk management business has also seen substantial revenue increases due to heightened client demand [2]. - The AI platform GPTBots.ai continues to empower enterprises by providing no-code AI bot construction technology, facilitating efficient digital transformation [2]. - The dual strategy of "going global + AI empowerment" is proving effective in expanding market share and commercializing technology [2].
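The growth range quoted in the summary above can be checked from the figures it gives. The following is a minimal sketch, assuming the usual year-on-year formula (current minus prior, divided by prior); the function name `yoy_growth` and the constant names are illustrative and do not come from the source.

```python
def yoy_growth(current: float, prior: float) -> float:
    """Return year-on-year growth as a percentage."""
    return (current - prior) / prior * 100.0


# Figures in RMB million, taken from the summary above.
PRIOR_Q1_2024 = 64.5
GUIDANCE_LOW, GUIDANCE_HIGH = 87.0, 90.0

low = yoy_growth(GUIDANCE_LOW, PRIOR_Q1_2024)
high = yoy_growth(GUIDANCE_HIGH, PRIOR_Q1_2024)
print(f"{low:.1f}% to {high:.1f}%")  # prints "34.9% to 39.5%"
```

Both ends of the range round to the "approximately 35% to 40%" stated in the summary.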
Outlook Worrying! Apple (AAPL.US) Reportedly Hits Repeated Setbacks in AI
Zhi Tong Cai Jing· 2025-05-18 23:53
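According to media reports citing people familiar with internal discussions at Apple (AAPL.US), Apple's continued struggles in artificial intelligence (AI) could undermine its dominance of the smartphone market and jeopardize its broader ambitions, from robotics to next-generation hardware. A high-profile hire in 2018 once raised expectations for Apple's AI strategy, but the tech giant's AI efforts have since met serious resistance. In 2018, Apple hired former Google (GOOGL.US) executive John Giannandrea to lead its AI strategy. The appointment was seen as a key turning point, especially as Siri lagged far behind rival voice assistants. Apple is now restructuring. John Giannandrea has lost control of Siri and related product development, and leadership has passed to Mike Rockwell, head of the Vision Pro headset project. Apple is also seeking partnerships with outside AI companies such as OpenAI and Anthropic to strengthen its capabilities.
Testing a chatbot
Meanwhile, engineers are rebuilding Siri's architecture to create a new system based entirely on large language models. Apple is also testing its own chatbot internally, with the goal of matching ChatGPT. On the marketing side, Apple plans to take Siri from the broader "Apple Intelligence" brand ...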
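AI Healthcare Enters the "Deep Water" of Precision: OpenAI's Medical Evaluation Benchmark Arrives and Large Models Accelerate Change | AI Healthcare Wave ㉑
21 Shi Ji Jing Ji Bao Dao· 2025-05-17 05:05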
Core Insights - OpenAI has launched HealthBench, an open-source benchmark for evaluating the performance and safety of large language models in healthcare, which has sparked widespread discussion in the industry [1][3] - The benchmark was developed with 262 practicing doctors from 60 countries and comprises 5,000 real medical dialogues, graded against 48,562 unique scoring criteria written by doctors to support meaningful open-ended assessment [1][3] - HealthBench is expected to make the evaluation of AI medical models more scientific and comprehensive, accelerating the application of AI in healthcare and creating new development opportunities for related companies [1][3] Group 1: HealthBench Overview - HealthBench consists of 7 themes and 5 evaluation dimensions, covering areas such as emergency referrals and professional communication, with dimensions including accuracy and contextual understanding [3][4] - OpenAI has also introduced two special versions of HealthBench: HealthBench Consensus, which includes 34 critical evaluation dimensions verified by doctors, and HealthBench Hard, which presents more challenging assessment scenarios [4] - The credibility of HealthBench is supported by a meta-evaluation comparing model scores with human doctor scores, which showed high consistency in 6 out of 7 evaluation areas [4] Group 2: Trends in AI Healthcare Applications - The AI healthcare market is projected to grow at an annual rate of 43% from 2024 to 2032, potentially reaching a market size of $491 billion [6] - AI is expected to improve healthcare accessibility and efficiency, addressing issues such as staff shortages in hospitals and improving diagnostic accuracy [6] - AI in healthcare has moved from rule-driven to data-driven approaches and is now entering a multi-modal integration phase, allowing better understanding and modeling of diverse medical data [6][7] Group 3: Future Directions in AI Models - Competition among large models has shifted from simply increasing parameter counts to optimizing model efficiency and performance under limited computational resources [7] - Key trends in AI applications within the pharmaceutical industry include the emergence of models as products, local and edge deployment, and rapid expansion of AI applications in research and development [7][8] - The pharmaceutical industry is expected to see more specialized models tailored to specific scenarios, improving the adaptability and effectiveness of AI solutions [7][8]
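365 Days of a Hundred Models Racing: Hugging Face's Annual Review Reveals the VLM Capability Curve and Its Inflection Points | Jinqiu Select
Jin Qiu Ji (锦秋集) · 2025-05-16 15:42
At the start of 2024 we were still marveling at the "ten-billion-parameter race" among large models. In the blink of an eye, "small but powerful" multimodal architectures have sprung up everywhere. From Meta Chameleon to Qwen2.5-Omni, and from DeepSeek Janus-Pro to Gemma 3, the new generation of models is not only smaller in parameter count and stronger at reasoning, but also brings breakthroughs such as multimodal reasoning, agentic capabilities and long-video understanding. At the same time, new paradigms such as "multimodal retrieval-augmented generation (RAG)" and "multimodal agents" are beginning to take shape. Every model release and every technical milestone keeps expanding our sense of what is possible in the "vision + language" field. The Hugging Face team reviewed and analyzed the key events and latest trends in vision language models over the past year: The most notable key advances of the year include: Overall, the vision language model field has shown the following main trends over the past year: Jinqiu Capital (WeChat official account: 锦秋集; ID: jqcapital) believes that whether you care about breakthroughs in model architecture, advances in capability, the creation of new benchmarks, or tools for practical deployment, this article offers a good starting point. 01 New Model Trends In this section, we look at new types of VLMs. Some are entirely new, while others are improved versions of earlier research. Any-to-any models Any-to-any models, as the name suggests, are ...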
Latest! Portfolio Moves by Hillhouse's HHLR and Jinglin Asset Management Revealed
Zheng Quan Shi Bao· 2025-05-16 05:24
Group 1 - HHLR Advisors reported a total market value increase from $2.887 billion to $3.539 billion, a nearly 23% rise in Q1 2025 [1] - HHLR's top ten holdings include Pinduoduo, Alibaba, Futu Holdings, BeiGene, NetEase, Beike, Legend Biotech, JD.com, Vipshop, and WNS Holdings, with 9 out of 10 being Chinese concept stocks [1][2] - HHLR made new investments in 10 companies including Atour Group, Huazhu Group, Baidu, and Li Auto, with significant increases in holdings for Yum China, Li Auto, and Baidu [2] Group 2 - Jinglin Asset Management's Hong Kong subsidiary reported a total market value of $3.228 billion, showing an increase from the previous year [3] - Jinglin's new investments include Alibaba and Hesai Technology, while it increased holdings in Meta, Beike, TSMC, and New Oriental, and reduced positions in Google, Microsoft, and Nvidia [3] - The market value increase for both HHLR and Jinglin is attributed to the revaluation of Chinese concept stocks, with the Nasdaq Golden Dragon China Index rising by 13.33% in Q1 2025 [4] Group 3 - The rapid development of AI models like DeepSeek is reshaping the valuation of Chinese tech companies, moving away from traditional pricing models that undervalued growth potential [5] - The shift in valuation logic is driven by the transition of Chinese AI companies from relying on foreign platforms to establishing their own technological capabilities [5]
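As a quick check of the "nearly 23%" figure in the summary above, the percentage change can be worked out from the two market values it quotes. This sketch assumes the standard percentage-change formula; the symbols $V_{\text{prev}}$ and $V_{\text{Q1}}$ are introduced here for illustration only.

```latex
\[
\frac{V_{\text{Q1}} - V_{\text{prev}}}{V_{\text{prev}}}
  = \frac{3.539 - 2.887}{2.887}
  = \frac{0.652}{2.887}
  \approx 0.226
  = 22.6\%
\]
```

Here $V_{\text{prev}} = \$2.887$ billion and $V_{\text{Q1}} = \$3.539$ billion, both taken from the summary. The result is consistent with the stated increase of nearly 23%.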
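Latest! Portfolio Moves by Hillhouse's HHLR and Jinglin Asset Management Revealed →
Zheng Quan Shi Bao· 2025-05-16 05:15
It is once again the time when the holdings of major investors come to light. On May 16, HHLR Advisors, part of Hillhouse, disclosed its US stock holdings for the first quarter of 2025. The 13F filing shows that HHLR Advisors, Hillhouse's independent fund management platform focused on secondary-market investment, initiated or added to positions in nearly 20 stocks in the first quarter. The total market value of its holdings rose from $2.887 billion at the end of the previous quarter to $3.539 billion, an increase of nearly 23%. As of the end of the first quarter of 2025, HHLR's top ten holdings were Pinduoduo, Alibaba, Futu Holdings, BeiGene, NetEase, Beike, Legend Biotech, JD.com, Vipshop and WNS HLDGS LTD, with Chinese concept stocks taking 9 of the 10 places. In addition, Jinglin Asset Management Hong Kong Limited, the overseas subsidiary of private fund giant Jinglin Asset Management, recently filed its US stock holdings as of the end of the first quarter of 2025. The data show that the total market value of its holdings at the end of the first quarter was $3.228 billion, up from the end of last year, and that its new first-quarter positions included Alibaba and Hesai Technology.
New positions in 10 companies, including Baidu and Li Auto
The data show that HHLR continued to increase its allocation to Chinese assets in the first quarter, opening new positions in 10 companies including Atour Group, Huazhu Group, Baidu, China Yuchai International, Li Auto, BOSS Zhipin and ECARX. By portfolio weight, among the positions HHLR newly bought during the first quarter, Yum China, Li Auto, Baidu, BOSS Zhipin and Huazhu account for relatively large shares. It is understood that as of May 14, H ...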