The RoboSense 2025 Robot Perception Challenge officially launches! Autonomous driving & embodied AI tracks
自动驾驶之心· 2025-06-25 09:54
Core Viewpoint
- The RoboSense Challenge 2025 aims to systematically evaluate the perception and understanding capabilities of robots in real-world scenarios, addressing key challenges in the stability, robustness, and generalization of perception systems [2][43]

Group 1: Challenge Overview
- The challenge consists of five major tracks focusing on real-world tasks: language-driven autonomous driving, social navigation, sensor placement optimization, cross-modal drone navigation, and cross-platform 3D object detection [8][9][29][35]
- The event is co-hosted by several prestigious institutions and will be officially recognized at the IROS 2025 conference in Hangzhou, China [5][43]

Group 2: Task Details
- **Language-Driven Autonomous Driving**: Evaluates the ability of robots to understand and act upon natural language commands, aiming for a deep coupling of language, perception, and planning [10][11]
- **Social Navigation**: Focuses on robots navigating shared spaces with humans, emphasizing social compliance and safety [17][18]
- **Sensor Placement Optimization**: Assesses the robustness of perception models under various sensor configurations, which is crucial for reliable deployment in autonomous systems [23][24]
- **Cross-Modal Drone Navigation**: Involves training models to retrieve aerial images based on natural language descriptions, enhancing the efficiency of urban inspection and disaster response [29][30]
- **Cross-Platform 3D Object Detection**: Aims to develop models that maintain high performance across different robotic platforms without extensive retraining [35][36]

Group 3: Evaluation and Performance Metrics
- Each task includes specific performance metrics and baseline models, with detailed requirements for training and evaluation [16][21][28][42]
- The challenge encourages innovative solutions and offers a prize pool of up to $10,000, shared across the five tracks [42]
Group 4: Timeline and Participation
- The challenge officially starts on June 15, 2025, with key deadlines for submissions and evaluations leading up to the award ceremony on October 19, 2025 [4][42]
- Participants are encouraged to engage in this global initiative to advance robotic perception technologies [43]
A big internationalization move by an AI giant!
中国基金报· 2025-06-25 01:33
Core Viewpoint
- The article highlights the internationalization strategy upgrade of iFlytek and iFlytek Medical, marking the launch of their global strategy with Hong Kong as a key hub for their artificial intelligence applications [2][3]

Group 1: Internationalization Strategy
- iFlytek and iFlytek Medical have officially launched their international headquarters and research institute in Hong Kong, aiming to leverage the city's advantages as an innovation and technology hub [4][5]
- The companies have introduced Hong Kong and international versions of their AI products across various sectors, including healthcare, education, and office applications, based on the iFlytek Spark large model [4][6]

Group 2: Collaboration and Development
- iFlytek plans to deepen collaborations with local universities and institutions in Hong Kong to enhance technology exchange and application expansion, targeting markets in Southeast Asia and along the Belt and Road [5][6]
- The Hong Kong government supports the establishment of iFlytek's international headquarters, which aligns with the local innovation and technology development direction, particularly in smart healthcare [6]

Group 3: Achievements and Impact
- iFlytek Medical has successfully listed on the Hong Kong Stock Exchange, becoming the first medical large model stock in the market and included in the Hang Seng Composite Index [6]
- The establishment of iFlytek in Cyberport has contributed to the local AI ecosystem, with Cyberport housing over 2,200 companies, including 400 focused on AI and data science [6]
X @Avi Chawla
Avi Chawla· 2025-06-24 19:17
Model Fine-tuning Overview
- The document outlines the process of fine-tuning models like DeepSeek-R1 [1]
- The process includes dataset preparation, LoRA configuration, trainer definition, fine-tuning, and exporting to Ollama [1]

Technical Implementation
- The fine-tuning of DeepSeek-R1 (distilled Llama) can be done 100% locally [1]
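The five-step workflow above can be sketched as an ordered configuration outline. This is a minimal illustration, not the code from the post: the step names mirror the post's pipeline, but every parameter value (LoRA rank, learning rate, quantization format, etc.) is an assumed, typical setting.

```python
# Hypothetical outline of a local LoRA fine-tuning pipeline for a
# DeepSeek-R1 distilled Llama model. Step names follow the workflow
# described in the post; all parameter values are illustrative defaults.

pipeline = [
    ("prepare_dataset", {"format": "chat", "split": "train"}),
    ("configure_lora", {"r": 16, "lora_alpha": 32,
                        "target_modules": ["q_proj", "v_proj"]}),
    ("define_trainer", {"epochs": 3, "lr": 2e-4, "batch_size": 4}),
    ("finetune", {}),
    ("export_to_ollama", {"quantization": "q4_k_m"}),
]

def step_names(steps):
    """Return the ordered step names of the pipeline."""
    return [name for name, _ in steps]
```

The point of the outline is that each stage is a discrete, locally runnable step, which is what makes the end-to-end process feasible without any hosted service.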
MiniMax-M1: Surpassing DeepSeek, with support for million-token contexts
自动驾驶之心· 2025-06-21 13:15
The following article is from AIGC面面观; author: 欠阿贝尔两块钱.

Main Contributions
1. Efficient hybrid architecture: MiniMax-M1 combines an MoE architecture with Lightning Attention, supporting a million-token (1M) context window; at a generation length of 80K tokens it requires only 25% of the FLOPs of a conventional-attention model.
2. CISPO, an algorithm surpassing DAPO: improves RL efficiency by clipping importance-sampling weights, achieving a 2x speedup over DAPO while avoiding the suppression of low-probability-token updates seen in traditional methods such as PPO/GRPO.
3. Scalable context: supports extending the generation length from 40K to 80K tokens.

1. Hybrid Attention Architecture
Lightning Attention: adopts I/O-aware linear attention, using blockwise computation and memory optimization to reduce long- ...
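The clipped importance-sampling idea behind CISPO can be sketched in a few lines. This is a generic illustration of the mechanism, assuming a token-level policy-gradient loss; the clipping bounds and all function names are our assumptions, not values from the paper. The key contrast with PPO/GRPO-style clipping is that the importance-sampling ratio itself is clipped and used as a constant weight, so low-probability tokens still contribute gradient signal instead of being zeroed out.

```python
def cispo_token_weight(pi_new, pi_old, eps_low=0.2, eps_high=0.2):
    """Clip the importance-sampling ratio itself (treated as a constant
    weight on the token loss) rather than clipping the policy update,
    so low-probability tokens still receive gradient signal.
    eps_low / eps_high are illustrative, not the paper's values."""
    ratio = pi_new / pi_old
    return max(1.0 - eps_low, min(ratio, 1.0 + eps_high))

def cispo_loss(token_grad_terms, ratios):
    """Weighted policy-gradient loss: each token's advantage-weighted
    gradient term is scaled by its clipped IS weight."""
    return sum(cispo_token_weight(r, 1.0) * g
               for g, r in zip(token_grad_terms, ratios))
```

Under PPO-style clipping, a token whose ratio falls outside the trust region can receive zero gradient; here the weight saturates at the clip boundary instead, which is the property the paper credits for the RL speedup.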
Sam Altman Says Meta Offered OpenAI Staffers $100 Million Bonuses
Bloomberg Television· 2025-06-18 08:04
So the figures are astronomical, and there is a way to make sense of this stuff. And I'm going to use the acronym CDT: Chips, Data, Talent. So those are the three pillars of AI development. And basically, you can kind of make sense of what all of these different AI firms are doing based on where they have weaknesses in those areas. So if you think about that, they have got the chips, they most definitely have got the data, but it's the talent where it's seeming like they're needing to pick up the pace a little bit. ...
A comeback in 1,200 lines of code! DeepSeek engineer open-sources a lightweight vLLM with throughput approaching the original
机器之心· 2025-06-13 04:31
Core Viewpoint
- vLLM is a high-performance, open-source LLM inference and service engine developed at the University of California, Berkeley, aimed at enhancing inference speed and resource utilization, particularly memory efficiency, while being compatible with popular model libraries like Hugging Face [2][3]

Group 1: vLLM and Nano-vLLM
- vLLM enables mainstream models like GPT, Mistral, and LLaMA to run faster and consume fewer resources through its innovative attention mechanism, PagedAttention [3]
- A lightweight implementation of vLLM, named Nano-vLLM, was developed by DeepSeek AI researcher Yu Xingkai, simplifying the code to under 1,200 lines [4][7]
- Nano-vLLM has gained over 200 stars on GitHub, indicating community interest and engagement [5]

Group 2: Features of Nano-vLLM
- Nano-vLLM offers three core functionalities:
  1. Fast offline inference with performance comparable to vLLM [6]
  2. A readable codebase with a simplified implementation [7]
  3. An optimization suite that includes features like prefix caching, Torch compilation, and CUDA computation graphs [8]

Group 3: Benchmarking Results
- Benchmark tests showed that Nano-vLLM produced the same output tokens as vLLM but took slightly longer, resulting in a throughput of 1314.65 tokens/s compared to vLLM's 1353.86 tokens/s [9][11]
- The testing configuration used an RTX 4070 GPU with the Qwen3-0.6B model, with input and output lengths randomly sampled between 100 and 1024 tokens [10]
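The benchmark comparison above reduces to simple throughput arithmetic. The figures below are the ones reported in the article; the helper function and variable names are ours:

```python
def throughput(tokens, seconds):
    """Tokens generated per second."""
    return tokens / seconds

# Throughput figures reported in the article's benchmark
# (RTX 4070, Qwen3-0.6B, random 100-1024 token sequences):
vllm_tps = 1353.86
nano_tps = 1314.65

# Nano-vLLM reaches roughly 97% of vLLM's throughput.
relative = nano_tps / vllm_tps
```

A ~3% throughput gap in exchange for a codebase under 1,200 lines is the trade-off the article highlights.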
Large models can spontaneously form a "map of human thought"! A major Nature sub-journal study reveals brain-like mechanisms in multimodal large models
机器人圈· 2025-06-11 11:43
Core Viewpoint
- The research published in "Nature Machine Intelligence" demonstrates that multimodal large language models (MLLMs) can develop human-like object concept representations, challenging the notion that these models merely mimic human language without true understanding [2][4]

Group 1: Research Findings
- The study analyzed 4.7 million behavioral judgment data points to construct a "concept map" of AI models, confirming that MLLMs can form object concept representations similar to humans [3][6]
- The research identified 66 core dimensions of cognition through a sparse positive definite similarity embedding method, revealing that both ChatGPT-3.5 and the multimodal Gemini model exhibit stable low-dimensional representation structures [9]
- MLLMs spontaneously formed 18 high-level object concept categories with a classification accuracy of 78.3%, approaching human accuracy of 87.1% [13]

Group 2: Methodology
- The research employed a novel "behavioral cognitive probe" method, integrating computational modeling, behavioral experiments, and neuroscience to analyze AI cognition [8]
- A triplet odd-one-out task was designed to assess the similarity of object representations between AI and humans, allowing for a comparative analysis of decision-making processes [5][31]

Group 3: Cognitive Dimensions
- The study provided semantic labels for the cognitive dimensions of AI models, categorizing them into dimensions related to semantic categories, perceptual features, and physical components [17][19][20]
- The findings indicated a significant correlation between MLLM representations and human brain activity patterns, particularly in areas responsible for processing faces, scenes, and bodies [23][24]

Group 4: Implications and Future Directions
- The research has broad applications, including the development of neuro-aligned AI systems, exploration of neural mechanisms for concept combination and reasoning, and enhancement of brain-computer interface systems [35]
- Future work will focus on expanding to next-generation multimodal models and establishing a cognitive benchmark testing platform to objectively assess AI's semantic understanding [35][36]
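The triplet odd-one-out probe can be sketched as follows, assuming each object is represented as an embedding vector. The cosine-similarity choice and all function names are our assumptions for illustration; the study derives its embeddings from behavioral judgments rather than raw model vectors.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def odd_one_out(triplet):
    """Return the index of the item least similar to the other two,
    mirroring the behavioral probe used to compare AI and human choices."""
    scores = []
    for i in range(3):
        others = [triplet[j] for j in range(3) if j != i]
        scores.append(sum(cosine(triplet[i], o) for o in others))
    return scores.index(min(scores))
```

Running the same triplets past humans and a model, then comparing which item each picks as the odd one out, is what lets the study quantify how closely the model's concept space tracks the human one.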
WWDC25: Introducing the Foundation Models framework
Apple Developer· 2025-06-10 23:01
Core Functionality
- The FoundationModels framework provides access to the on-device large language model behind Apple Intelligence via a Swift API [1]
- It is optimized for content generation, text summarization, and user input analysis [2]
- It enables features like personalized search suggestions and dynamic dialog creation [1]

Privacy and Efficiency
- All data processed by the model remains private, as it runs on-device [2]
- The model can operate offline [2]
- Integration into the operating system ensures no increase in app size [2]
One trick to ease LLMs' lopsided performance: adjust the training-set composition; the "secret recipe" is here | SJTU, Shanghai AI Lab, et al.
量子位· 2025-06-10 07:35
Core Viewpoint
- The IDEAL method proposed by a joint team from Shanghai Jiao Tong University and Shanghai AI Lab significantly enhances the performance of large language models (LLMs) across various domains by adjusting the composition of the supervised fine-tuning (SFT) training dataset [3][4]

Group 1: Methodology
- The IDEAL method focuses on preparing high-quality training datasets for different domains and modeling the optimization problem to minimize validation loss [5]
- The quantity of training data during the SFT phase is not the key factor; rather, an appropriate distribution of data is crucial to avoid exacerbating the model's imbalance across domains [6][15]
- The research quantifies the impact of data adjustment on the optimized model's performance on the validation set, providing a theoretical foundation for the IDEAL approach [7]

Group 2: Computational Efficiency
- The paper employs K-FAC theory to approximate the inverse of the Hessian matrix, which simplifies the computation and allows scaling to LLM parameter sizes [8]

Group 3: Experimental Results
- The IDEAL method was tested on the Llama 3.1 8B model, demonstrating a significant improvement in coding capabilities after just two iterations, regardless of the epoch [10]
- The initial distribution of training data can be further optimized, as IDEAL consistently improved average results across various benchmarks, regardless of the initial distribution [11]

Group 4: Practical Applications
- IDEAL addresses the challenge of how to effectively combine high-quality training data from various domains into a unified training set, eliminating the need for manual adjustments [14]
- The paper suggests that the optimal value for the hyperparameter m is around 0.15, as it balances the need for data distribution optimization without being too aggressive [15]
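The core loop of validation-loss-driven mixture reweighting can be sketched as below. This is a generic illustration of the idea only, not the paper's algorithm: IDEAL estimates each domain's effect on validation loss via a K-FAC approximation of the Hessian inverse, whereas here that effect is taken as a given input, and all names are ours. The cap m on the per-domain relative change corresponds to the hyperparameter the paper tunes (suggested value around 0.15).

```python
def reweight(proportions, val_loss_effects, m=0.15):
    """Nudge each domain's share of the SFT mix against that domain's
    estimated effect on validation loss, capping the per-domain
    relative change at m, then renormalize to a distribution.
    A positive effect means more of that domain raises validation loss."""
    adjusted = []
    for p, g in zip(proportions, val_loss_effects):
        delta = max(-m, min(m, -g))  # move against the loss effect
        adjusted.append(p * (1.0 + delta))
    total = sum(adjusted)
    return [p / total for p in adjusted]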
Concord Healthcare Announces Official Release of the Proton Therapy Large Model
Prnewswire· 2025-05-29 20:30
Core Viewpoint
- Concord Healthcare Group has made significant advancements in precise tumor diagnosis and treatment technology, particularly with the launch of its self-developed large language model (LLM) for proton therapy, which has been successfully implemented in Guangzhou Concord Cancer Hospital [1][2]

Company Overview
- Concord Medical Services Holdings Limited is a healthcare provider specializing in comprehensive oncology services, including cancer diagnosis, treatment, education, and prevention, with a focus on improving the quality and accessibility of cancer care across China [4]
- The company operates a network of self-owned cancer hospitals and clinics, equipped with advanced technology such as proton therapy systems, and aims to provide multidisciplinary cancer care [4]

Technology and Innovation
- The proton LLM developed by Concord Healthcare is the first of its kind in China, utilizing a robust tumor diagnosis and treatment technology system built on extensive data accumulated over the years, including nearly 10,000 high-quality radiotherapy cases [2]
- The model integrates data from Proton China and professional journal literature to enhance its training and effectiveness in patient treatment [2]

Market Position
- Concord Healthcare serves cancer patients directly through its own medical institutions and indirectly through third-party medical institutions by providing medical equipment, software, and related services [5]
- The company has established a widespread network of enterprise customers, primarily hospitals, offering integrated oncology-related services, including sales and installation of medical equipment, management, technical support, and operating leases [5]