Large Language Models
Zhongkang Technology's Tiangong-1: Adaptation of the Cutting-Edge Large Language Model DeepSeek-V3.2-Exp Completed, Continuing to Deepen the Open Health-Industry AI Application Ecosystem
Ge Long Hui· 2025-10-11 02:03
Core Insights
- Zhongkang Technology's Tiangong-1 platform has recently completed the adaptation of the advanced large language model DeepSeek-V3.2-Exp, emphasizing a dual strategy of technological independence and ecological openness [1][2]

Group 1: Technology and Innovation
- The Tiangong-1 platform serves as the AI application capability hub for the health industry, built on the dual-core driving architecture of the self-developed "Zhuomuniao" medical model and the "Tiangong-1" decision-making model [1]
- This architecture integrates the professionalism of the medical field with the broad applicability of business decision-making, securing Tiangong-1's leading position and professional barriers in the complex health industry [1]

Group 2: Ecosystem and Product Offering
- The intelligent agent ecosystem of Tiangong-1 is designed as a combination of a "supermarket" and a "factory," providing standardized intelligent agent products that cover the full spectrum of "medicine, pharmacy, patients, and management" so users can quickly address common issues [2]
- The platform also offers powerful agent-creation tools, allowing clients to customize agents around their unique business processes, thereby building proprietary intelligent agent assets and enabling continuous evolution of core capabilities [2]
- Adapting excellent third-party models such as DeepSeek-V3.2-Exp significantly enriches the "raw materials" library of the "factory" model, allowing enterprises to freely combine and call upon models according to each task's performance, cost, and efficiency requirements, achieving a "1+1>2" synergy [2]
RobotGym (如身机器人) Completes Angel++ Round Worth Tens of Millions of RMB to Develop Intelligent Elderly-Care Robots | Early-Stage Morning Watch
36Kr · 2025-10-10 23:57
Core Viewpoint
- The window for developing general-purpose elderly care service robots has arrived, driven by advances in AI and natural language processing technologies [2][4]

Company Overview
- RobotGym, a smart healthcare company, recently secured a multi-million RMB angel round to strengthen core technology, product engineering, and market expansion [3]
- The founding team has a strong background in robotics and AI, with experience from top institutions and companies in the field [3]

Product Lines
- RobotGym has planned two product lines: UniGym, a multi-functional rehabilitation robot series, and Qijia, an elderly care robot series [4]
- The UniGym series targets home rehabilitation, supporting personalized training plans and real-time adjustments, with over a thousand units already produced and exported [5]

Data Strategy
- The company emphasizes the importance of data accumulation for AI model development, aiming to build a hardware network for large-scale data collection [5]
- The Qijia series addresses immediate elderly care needs, focusing on mobility assistance, emotional companionship, and intelligent care [6]

Technological Features
- The Qijia robots are designed to assist elderly individuals with mobility and provide emotional support through natural conversation [7]
- The robots are categorized into capability levels, with L1-L2 handling low-risk tasks and L3-L5 managing more complex care functions [8]

Future Development
- Achieving L3-and-above autonomous care services may take around five years, prompting the company to adopt a hybrid model of AI plus remote operation for near-term commercialization [8]
- Safety is a design priority, with features ensuring stability and reliability during operation [9]

Market Positioning
- The Qijia product line has established pilot cooperation intentions with leading elderly care institutions and is part of Tencent's "Silver Technology Partner Program," with plans for standardized mass production by 2026 [9]
Traditional Perception Falls Out of Favor as VLA Rises to Become the New Star...
自动驾驶之心· 2025-10-10 23:32
Core Insights
- The focus of academia and industry is shifting toward VLA (Vision-Language-Action) for enhancing autonomous driving capabilities, providing human-like reasoning in vehicle decision-making [1][4]
- Traditional methods in perception and lane detection are maturing, leading to a decline in interest, while VLA is seen as a critical development area by major players in the autonomous driving sector [4][6]
- A comprehensive learning roadmap for VLA has been designed, covering foundational principles through practical applications [6]

Summary by Sections

Course Overview
- The course "Autonomous Driving VLA and Large Model Practical Course" aims to deepen understanding of VLA through detailed explanations of cutting-edge algorithms and practical assignments [6][22]

Chapter 1: Introduction to VLA Algorithms
- Provides a conceptual overview of VLA algorithms and their historical development, and introduces open-source benchmarks and evaluation metrics relevant to VLA [13]

Chapter 2: Algorithm Fundamentals of VLA
- Covers foundational knowledge in the Vision, Language, and Action modules, and includes a section on deploying and using popular open-source large models [14]

Chapter 3: VLM as an Autonomous Driving Interpreter
- Discusses the role of VLMs (Vision-Language Models) in scene understanding prior to the introduction of VLA, covering classic and recent algorithms such as DriveGPT4 and TS-VLM [15]

Chapter 4: Modular and Integrated VLA
- Explores the evolution of language models from passive description to active planning components, detailing modular and integrated VLA approaches, with practical coding exercises [16]

Chapter 5: Reasoning-Enhanced VLA
- Concentrates on the reasoning-enhanced VLA subfield, introducing new reasoning modules and discussing various algorithms and their applications in autonomous driving [17][19]

Chapter 6: Major Project
- The final chapter emphasizes hands-on practice, guiding participants through network construction, dataset customization, and model training using the ms-swift framework [20]

Learning Requirements and Outcomes
- Participants are expected to have a foundational understanding of autonomous driving, large models, and relevant mathematical concepts; the course is designed to equip them to understand and apply VLA algorithms in practical scenarios [24]
A ChatGPT Hand-Built from 439 Million Blocks in Minecraft? A Player Pulls Off Another Stunt and Takes the Game to New Heights!
猿大侠· 2025-10-10 04:11
Core Viewpoint
- The article discusses an innovative project in which a developer named Sammyuri built a small language model called CraftGPT inside Minecraft, using the game's redstone mechanics to simulate a functional AI model with 5 million parameters [6][9][15]

Group 1: Project Overview
- CraftGPT was built from approximately 439 million blocks in Minecraft, with dimensions of 1020 blocks long, 260 blocks high, and 1656 blocks wide [9][6]
- The model operates at a small scale, with 5,087,280 parameters, trained on the TinyChat dataset for basic English conversations [15][16]
- The project showcases Minecraft's potential as a platform for complex computational tasks, previously demonstrated by projects such as a 16-bit CPU and a computer running DOOM [25][26]

Group 2: Technical Details
- CraftGPT's architecture includes components such as tokenizers, matrix multipliers, and multi-headed attention mechanisms, all constructed from redstone circuits [11][13]
- The model's embedding dimension is 240, with a vocabulary of 1920 tokens and a total of 6 layers [16]
- To conserve resources, most weights are quantized to 8 bits, while embedding and LayerNorm weights retain higher precision [17]

Group 3: Performance and Limitations
- Response times can be extremely long, with simple queries taking up to two hours to generate a reply [22][20]
- The model's context window is limited to 64 tokens, restricting the length of conversations it can handle [18]
- Users are advised to run the model on MCHPRS (Minecraft High-Performance Redstone Server) to improve performance, as running it on standard Minecraft could take years for a single response [22][23]

Group 4: Community and Future Implications
- The project has sparked interest and excitement in the gaming and AI communities, highlighting Minecraft's creative potential [25][34]
- Sammyuri's work raises the bar for what can be achieved in Minecraft, suggesting the game's limits are defined primarily by human creativity [25][33]
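The mixed-precision scheme described above (most weights stored as 8-bit integers, embedding and LayerNorm weights kept at higher precision) follows the standard symmetric-quantization recipe. The sketch below illustrates that recipe in plain Python; it is not Sammyuri's actual redstone encoding, and the toy weight values and per-tensor scale are assumptions for illustration.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization to 8-bit integers.

    Returns (q, scale): integers in [-127, 127] plus the scale needed
    to recover approximate floats via q * scale.
    """
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the stored 8-bit integers back to approximate float weights."""
    return [v * scale for v in q]

# A toy row of weights; CraftGPT's real matrices are far larger.
weights = [0.31, -1.20, 0.05, 0.88, -0.47]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Per-weight error is bounded by half a quantization step (= scale / 2),
# which is why low-magnitude, sensitive tensors like LayerNorm gains are
# better kept at higher precision.
assert all(abs(w - r) <= scale / 2 + 1e-9 for w, r in zip(weights, restored))
print(q)         # small integers, storable in one byte each
print(restored)  # close to the original weights
```

Storing one byte per weight instead of a full-precision value is what keeps the block count of a 5-million-parameter build within (barely) feasible bounds.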
Nature Sub-Journal: Zhang Lei / Zhao Guoping's Team at Shandong University Develops an AI Large Model to Discover Antimicrobial Peptides Against Multidrug-Resistant Bacteria
生物世界· 2025-10-10 04:05
Written by Wang Cong | Edited by Wang Duoyu | Layout by Shui Chengwen

The World Health Organization (WHO) once published a list of multidrug-resistant bacteria, collectively known as ESKAPE, representing the six most intractable and most common multidrug-resistant pathogens. At the top of the list is carbapenem-resistant Acinetobacter baumannii (CRAB). Carbapenem antibiotics are the "last line of defense" when all other treatments fail, yet they are highly vulnerable to the emergence and spread of antibiotic resistance. Given this urgent problem, antimicrobial peptides (AMPs) have attracted growing attention as promising alternatives to traditional antibiotics.

Compared with traditional antibiotics, AMPs are promising substitutes owing to their broad-spectrum activity, rapid bactericidal mechanism, and lower likelihood of inducing resistance. Discovering novel AMPs effective against clinical multidrug-resistant bacteria is crucial for addressing the ongoing antibiotic-resistance crisis.

On October 3, 2025, the team of Professor Zhang Lei and Professor Zhao Guoping at Cheeloo College of Medicine, Shandong University published a paper in the Nature journal Nature Microbiology titled: A generative artificial intelligence approach for the discovery of antimicrobial peptides against multidrug-resistant bacteria ...
A ChatGPT Hand-Built from 439 Million Blocks in Minecraft? A Player Pulls Off Another Stunt and Takes the Game to New Heights
36Kr · 2025-10-09 11:44
Core Insights
- A developer named Sammyuri has successfully created a small language model called CraftGPT within Minecraft, using 439 million blocks to build a virtual environment in which the model operates [4][20]
- CraftGPT consists of 5,087,280 parameters and is designed to handle basic English conversations, though it is far smaller than models like GPT-1 and GPT-3 [25]

Technical Details
- The CraftGPT build occupies a massive area in Minecraft, measuring 1020 blocks long, 260 blocks high, and 1656 blocks wide [7]
- Its internal structure includes components such as tokenizers, matrix multipliers, and multi-headed attention mechanisms, all constructed from Minecraft's redstone circuitry [12][13]
- The model was trained on the TinyChat dataset, focusing on basic conversational English [13]

Performance and Limitations
- Despite its innovative design, CraftGPT has significant limitations, including response times of several hours for a single reply [16][17]
- Its context window is limited to 64 tokens, restricting its ability to handle longer conversations [14]
- Users are advised to use a high-performance redstone server (MCHPRS) to improve response times, as running it on standard Minecraft would lead to impractically long waits [16][17]

Community Reaction
- The project has garnered significant attention and admiration from the gaming community, with many astonished by the creativity and technical achievement of building such a model in Minecraft [20][23]
- CraftGPT is seen as a continuation of previous impressive redstone projects in Minecraft, such as functioning CPUs and even a playable version of DOOM [20]
RobotGym (如身机器人) Completes Angel++ Round Worth Tens of Millions of RMB to Develop Intelligent Elderly-Care Robots | 36Kr Exclusive
36Kr · 2025-10-09 07:50
36Kr has learned that RobotGym (如身机器人), an embodied-intelligence company focused on health and elderly care, recently closed an angel++ round of financing worth tens of millions of RMB, invested solely by Leaguer Financial (力合金融). The funds will mainly be used for continued iteration of core technology, advancing product engineering, large-scale pilots in elderly-care scenarios, and early market positioning. RobotGym has already launched its Pre-A round.

The UniGym (格物) series mainly targets home rehabilitation scenarios, covering full-body rehabilitation training for the hands, upper limbs, and lower limbs; it supports personalized rehabilitation training plans, real-time adjustment of training parameters, and viewing of training reports. These relatively lightweight products have reached thousand-unit mass production and are exported to North America, Europe, and Southeast Asia. Besides providing the company with continuous cash flow, the UniGym series is also RobotGym's antenna for penetrating rehabilitation scenarios and accumulating real-world data and users.

The value of data accumulation for embodied intelligence is beyond doubt. In Shi Yunlei's view, the value of data depends heavily on the AI model's architecture: future AI models capable of meeting advanced-care needs will inevitably require more modalities of data, such as tactile and force data that are not yet collected at scale. RobotGym has therefore chosen to commercialize first, selling as many products as possible to build a hardware network capable of fast, large-scale multimodal data collection, securing an early advantage for future technology iteration.

The Qijia (齐家) series directly targets today's hard demand in elderly care, attempting to bring intelligent robots into the daily care of elderly people living alone or with partial or full loss of autonomy. Based on in-depth research, the core functions of the Qijia Q1 series elderly-care robots are planned as three tiers of modules ...
Bigger, Yet Faster and More Accurate! Ant Group Open-Sources the Trillion-Parameter Language Model Ling-1T, Setting Multiple New SOTA Results
机器之心· 2025-10-09 02:24
Core Insights
- The article discusses the launch of Ling-1T, a trillion-parameter open-source language model from Ant Group, highlighting its efficiency and performance across benchmarks [2][5][52]

Group 1: Model Performance
- Ling-1T has achieved impressive results in multiple benchmark tests, outperforming several leading models in key areas such as knowledge understanding and reasoning [6][9][10]
- In coding and math reasoning tasks, Ling-1T consistently ranks among the top performers, demonstrating strong logical consistency and cross-domain reasoning capabilities [8][11]
- Specific results include a score of 92.19 on C-Eval and 87.45 on FinanceReasoning, indicating high knowledge density and reasoning ability [9][10]

Group 2: Efficiency and Architecture
- Ling-1T uses a Mixture of Experts (MoE) architecture, maintaining strong reasoning capability while significantly reducing computational cost [5][52]
- The model operates on a paradigm of "large parameter reserves + small parameter activation," handling complex problems efficiently with a lower energy footprint [53][54]
- It supports a 128K context length, enhancing its ability to process long documents without losing context, which is crucial for industries like finance and law [62]

Group 3: Open Source Philosophy
- The article emphasizes the importance of open-source models in the AI landscape, arguing that they enable faster iteration and lower development costs [72][73]
- Ant Group's open-sourcing of Ling-1T broadens accessibility and collaboration, fostering an ecosystem in which developers and small businesses can participate [74][75]
- Open-sourcing not only democratizes access to advanced AI capabilities but also strengthens transparency and trust in AI applications across sectors [72][74]
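The "large parameter reserves + small parameter activation" paradigm described above is the defining property of Mixture-of-Experts routing: a gate scores all experts for each token, but only the top-k actually run. The sketch below is a generic, illustrative top-k router in plain Python, not Ling-1T's actual architecture; the expert count, k, and the dot-product gate are assumptions for illustration.

```python
import math
import random

random.seed(0)

DIM, NUM_EXPERTS, TOP_K = 8, 16, 2  # assumed toy sizes, far from Ling-1T's real config

# Each "expert" is a tiny linear map; the gate holds one score vector per expert.
experts = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]
           for _ in range(NUM_EXPERTS)]
gate = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def moe_forward(x):
    """Route token x to its top-k experts and softmax-weight their outputs.

    Only TOP_K of NUM_EXPERTS run per token, so the activated parameter
    count is a small fraction of the total reserve.
    """
    scores = [dot(g, x) for g in gate]
    top = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    exp_scores = [math.exp(scores[i]) for i in top]
    z = sum(exp_scores)
    out = [0.0] * DIM
    for i, es in zip(top, exp_scores):
        y = [dot(row, x) for row in experts[i]]             # run one selected expert
        out = [o + (es / z) * yi for o, yi in zip(out, y)]  # accumulate weighted output
    return out, top

x = [random.gauss(0, 1) for _ in range(DIM)]
y, chosen = moe_forward(x)
print(f"experts used: {chosen}, active fraction: {TOP_K / NUM_EXPERTS:.2%}")
```

With 16 experts and k = 2, only 12.5% of expert parameters touch any given token; scaled up, this same mechanism lets a trillion-parameter reserve run at a much smaller activated cost per token.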
Tsinghua, BISTU, and Fudan Teams Explain Embodied Intelligence! How Do Large Language Models and World Models Teach Robots to Understand Physics and Think?
机器人大讲堂· 2025-10-06 04:05
Core Insights
- The article discusses advancements in embodied AI, particularly the integration of large language models (LLMs) and world models (WMs) to achieve human-like understanding and interaction in physical environments [1][22]

Understanding Embodied Intelligence
- Embodied intelligence differs from traditional AI in that it actively interacts with the physical world, using sensors for perception, cognitive systems for processing experience, and actuators for action, forming a closed loop of perception, cognition, and interaction [2][4]
- The ultimate goal of embodied intelligence is to approach human-level general intelligence, enabling robots to adapt autonomously in dynamic, uncertain environments [4]

Transition from Unimodal to Multimodal
- Early embodied intelligence systems relied on single modalities, limiting their performance [5][7]
- The shift to multimodal systems integrates visual, auditory, and tactile inputs to enhance task-handling capabilities, allowing robots to perform complex tasks more flexibly [8][9]

Core Technologies: LLMs and WMs
- LLMs provide semantic understanding, enabling robots to comprehend and plan tasks from human language, while WMs simulate physical environments to predict the outcomes of actions [9][10]
- Combining LLMs and WMs addresses the shortcomings of each technology, yielding a more comprehensive approach to embodied intelligence [12][19]

Applications of Embodied Intelligence
- In service robotics, modern robots can understand complex instructions and adapt their actions in real time, improving efficiency and user interaction [20]
- In industrial settings, robots can switch tasks without reprogramming thanks to the integration of LLMs and WMs, enhancing operational flexibility [20]

Future Challenges
- Embodied intelligence still requires extensive human-labeled training data and must evolve toward autonomous learning and exploration in new environments [21]
- Hardware advances are needed to support real-time processing of multimodal data, particularly efficient chips and low-latency sensors [21]
- Safety and interpretability are critical as robots interact directly with humans, requiring traceable actions and adherence to ethical standards [21]
From "Knowing the Problem" to "Knowing the Person": UserRL Teaches Agents to Put Users First
机器之心· 2025-10-05 06:42
"He who knows others is wise; he who knows himself is enlightened." — Tao Te Ching

The ancients saw it long ago: true human wisdom lies not only in deriving formulas and mastering skills, but in understanding others and reading their minds. Today's large language models can already accomplish tasks in code, mathematics, and tool use admirably, yet they still lack the ability to "know people" needed to become genuine partners to users. This stems mainly from the fact that real-world interaction is far more complex than problem solving:

This is the next-era challenge facing agents: moving from "solving problems" to "understanding users." To truly meet it, we need new dynamic evaluation frameworks and training mechanisms: ones that not only measure a model's performance in interaction, but also drive it to learn, in a world of user uncertainty and competing objectives, to ask questions skillfully, judge with balance, and answer with grounding. To this end, a research team from UIUC and Salesforce proposes a systematic solution:

The two components complement each other, turning "user-centered" from a philosophy into reproducible processes, interfaces, and evaluation metrics.

UserBench paper: https://arxiv.org/pdf/2507.22034
UserBench repository: https://github.com/SalesforceAIResearch/UserBench

In real interaction, user goals are often not fully formed at the outset (underspecification), but ...