Hardware-Software Co-Design
Li Auto CTO Xie Yan Shares the Design Thinking Behind Li Auto's Autonomous Driving Chip at the Yunqi (Apsara) Conference
理想TOP2· 2025-09-27 08:58
Core Viewpoint
- The article discusses the evolution of intelligent driving algorithms and the importance of data flow architecture in autonomous driving, emphasizing the need for advanced computational architectures to handle growing demands for processing power and reasoning capability.

Group 1: Evolution of Intelligent Driving Algorithms
- The evolution of autonomous driving algorithms can be divided into three phases: an initial phase built on rule-based algorithms, a second phase that shifted to end-to-end (E2E) learning, and the current phase, which integrates vision-language models (VLM) with reinforcement learning (RL) to strengthen decision-making [4][5][6].

Group 2: Importance of Language Models
- Language models are deemed essential for long-chain reasoning in autonomous driving, because they let the system generalize to corner cases that cannot be covered by data collection or world models alone [7][8].
- The psychological dimension of a driving model that shares human values and reasoning is also highlighted: language models can help instill a human-like worldview in autonomous systems [8][9].

Group 3: Computational Architecture
- The article critiques the traditional von Neumann architecture, which is organized around computation rather than data, and proposes a shift toward data-driven computation to better handle the complexity of AI workloads [12][13].
- The company has developed its own NPU architecture organized around data flow rather than a conventional SoC design, aiming to improve efficiency and performance in AI inference (a toy dataflow sketch follows this summary) [17][18].

Group 4: Performance Metrics
- The NPU architecture is reported to deliver significantly higher performance than existing solutions: up to 4.4x on CNN workloads and 2 to 3x on Llama 2 7B workloads at a similar transistor count [2][18].
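To make the data-flow idea concrete, here is a minimal toy sketch, not Li Auto's NPU design (which is not public at this level of detail): in a dataflow model, an operation fires as soon as all of its input operands are available, instead of being stepped through by a program counter as in a von Neumann machine. The graph, node names, and operators below are illustrative assumptions.

```python
# Toy sketch (not Li Auto's actual design): a dataflow scheduler in which each node
# fires once all of its input operands have arrived, in contrast to a von Neumann-style
# program counter stepping through instructions. Names and operators are illustrative.
from collections import defaultdict

class DataflowGraph:
    def __init__(self):
        self.ops = {}                        # node name -> (function, input node names)
        self.consumers = defaultdict(list)   # producer name -> nodes waiting on it

    def add(self, name, fn, inputs=()):
        self.ops[name] = (fn, list(inputs))
        for src in inputs:
            self.consumers[src].append(name)

    def run(self, feeds):
        values = dict(feeds)                 # tokens that have already arrived
        ready = [n for n, (_, ins) in self.ops.items()
                 if all(i in values for i in ins)]
        while ready:
            node = ready.pop()
            fn, ins = self.ops[node]
            values[node] = fn(*[values[i] for i in ins])   # node "fires"
            for consumer in self.consumers[node]:
                _, cins = self.ops[consumer]
                if consumer not in values and all(i in values for i in cins):
                    ready.append(consumer)   # data availability drives scheduling
        return values

# Usage: y = (a + b) * (a - b); execution order is dictated by data readiness.
g = DataflowGraph()
g.add("sum",  lambda a, b: a + b, ["a", "b"])
g.add("diff", lambda a, b: a - b, ["a", "b"])
g.add("y",    lambda s, d: s * d, ["sum", "diff"])
print(g.run({"a": 3.0, "b": 1.0})["y"])      # 8.0
```

In silicon, the analogous idea is that operand availability rather than instruction order determines when functional units execute, which is the general property a data-flow NPU exploits to keep compute units busy.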
The Core of Li Auto's Autonomous Driving Chip: Data Flow Architecture and Hardware-Software Co-Design
理想TOP2· 2025-09-05 04:56
Core Viewpoint
- The article discusses advances in Li Auto's self-developed chip architecture, focusing on the VLA architecture and its implications for autonomous driving capability [1][2].

Group 1: Chip Development and Architecture
- Li Auto's self-developed chip uses a data flow architecture with hardware-software co-design at its core, making it well suited to running large neural networks efficiently [5][9].
- The chip is expected to deliver roughly 2x the performance of leading chips on large language models such as GPT and about 3x on vision networks such as CNNs [5][8].
- The timeline from project initiation to vehicle deployment is about three years, a rapid pace compared with similar projects [5][8].

Group 2: Challenges and Innovations
- Achieving real-time inference on the in-vehicle chip is a major challenge, with optimization effort spread across a range of engineering techniques [3][4].
- Li Auto is implementing innovative parallel decoding methods to speed up action token inference, which is crucial for autonomous driving (see the hedged decoding sketch after this summary) [4].
- The integration of CPU, GPU, and NPU in the Thor chip aims to improve versatility and throughput when processing the large volumes of data that autonomous driving requires [3][6].

Group 3: Future Outlook
- The company expresses strong confidence in its innovative architecture and full-stack development capability, which it expects to become a key differentiator [7][10].
- The link between greater compute and better advanced driver-assistance system (ADAS) performance is highlighted, suggesting predictable capability gains as the technology evolves [6][9].
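The summary mentions "parallel decoding" for action tokens without specifying the method. Below is a hedged sketch of one well-known family of parallel decoding, draft-and-verify (speculative) decoding, purely to illustrate how several tokens can be committed per pass of a large model; `draft_model`, `target_model`, and the greedy acceptance rule are assumptions, not Li Auto's disclosed implementation.

```python
# Hedged sketch of draft-and-verify (speculative) decoding. The article does not
# disclose Li Auto's exact method; the interfaces and acceptance rule are assumptions.
def speculative_decode(target_model, draft_model, prompt, n_tokens, k=4):
    """target_model(seq) -> next-token id for every prefix of seq (one batched pass);
    draft_model(seq) -> a single next-token id (cheap, fast)."""
    seq = list(prompt)
    while len(seq) < len(prompt) + n_tokens:
        # 1) Cheap draft model proposes k tokens autoregressively.
        draft = []
        for _ in range(k):
            draft.append(draft_model(seq + draft))
        # 2) Expensive target model scores all k positions in a single parallel pass.
        verified = target_model(seq + draft)
        accepted = []
        for i, tok in enumerate(draft):
            if verified[len(seq) + i - 1] == tok:        # target agrees with the draft
                accepted.append(tok)
            else:
                accepted.append(verified[len(seq) + i - 1])  # take target's token, stop
                break
        seq.extend(accepted)          # several tokens committed per target-model call
    return seq[:len(prompt) + n_tokens]
```

The efficiency gain comes from step 2: the expensive model scores all k draft positions in one batched forward pass instead of k sequential ones, which is what makes this style of decoding attractive for latency-bound in-vehicle inference.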
After a Quiet Month, openPangu's Performance Jumps 8%! Huawei's 1B Open-Source Model Arrives
机器之心· 2025-09-05 04:31
Core Viewpoint
- Huawei's openPangu Embedded-1B model represents a significant advance in edge AI, bringing strong AI capabilities to resource-constrained devices and paving the way for intelligent upgrades across industries [1][5].

Group 1: Model Performance and Efficiency
- With 1 billion parameters, openPangu Embedded-1B sets a new state-of-the-art (SOTA) balance of performance and efficiency, showing that smaller models can deliver substantial capability [2][3].
- The model's overall average score reached 63.90, surpassing similarly sized models and matching larger ones such as Qwen3-1.7B, a showcase of parameter efficiency [3][4].
- In mathematical reasoning it scored 82.76% on the GSM8K benchmark and 81.83% on the MATH dataset, significantly outperforming its peers [3][4].

Group 2: Technical Innovations
- The model employs hardware-software co-design, with its architecture tuned to the characteristics of Ascend hardware for efficient resource utilization [9][10].
- A two-stage curriculum learning approach strengthens the model's reasoning ability by mimicking a human-like progression from easier to harder material [15][16].
- Offline on-policy knowledge distillation makes the training process more flexible and effective, improving accuracy and generalization (a hedged sketch of the distillation idea follows this summary) [18][19].

Group 3: Reinforcement Learning and Future Directions
- The model incorporates a multi-source reward reinforcement learning mechanism, providing targeted feedback according to task complexity [22][25].
- Future work aims to integrate fast and slow thinking within a single model so it can adapt its response depth to problem difficulty, improving both speed and accuracy [29][30].
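As a rough illustration of the distillation idea described above (student-generated rollouts supervised by teacher distributions), here is a hedged PyTorch-style sketch; the `generate`/`logits` interfaces, the temperature, and the optimizer wiring are assumptions, not openPangu's released training code.

```python
# Hedged sketch of on-policy knowledge distillation: the student generates its own
# rollouts and the teacher's token distributions on those rollouts supervise it.
# Model interfaces and hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F

def on_policy_distill_step(student, teacher, prompts, optimizer, temperature=1.0):
    # 1) Student samples its own continuations (on-policy data); no gradients needed here.
    with torch.no_grad():
        rollouts = student.generate(prompts, do_sample=True, max_new_tokens=256)

    # 2) Teacher provides soft targets on the student's own rollouts.
    with torch.no_grad():
        teacher_logits = teacher(rollouts).logits / temperature

    # 3) Student is trained to match the teacher distribution token by token.
    student_logits = student(rollouts).logits / temperature
    loss = F.kl_div(
        F.log_softmax(student_logits, dim=-1),
        F.softmax(teacher_logits, dim=-1),
        reduction="batchmean",
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The "on-policy" aspect is that the teacher's soft targets are computed on sequences the student itself generated; the article's "offline" variant presumably precomputes those teacher outputs in a separate pass rather than inline as shown here.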
CoDesign 2025 International Symposium Held in Osaka to Explore New Paths for Integrating High-Performance Computing and AI
Cai Jing Wang· 2025-07-18 04:22
Group 1
- The CoDesign 2025 International Symposium was successfully held in Osaka, Japan, focusing on the challenges of large-scale computing and big data and emphasizing the importance of hardware-software co-design for the development of high-performance computing (HPC) [1].
- The conference highlighted four core areas: algorithms, application systems, system software and middleware, and hardware-software co-design architecture, covering the key fields of high-performance and scalable computing [2].
- Keynote speeches and technical presentations showcased cutting-edge research and development, including the challenge of system fragmentation and the need for collaborative design between hardware and software [3].

Group 2
- Roundtable discussions addressed the integration of HPC and AI, with experts sharing differing views on the future direction of computing architectures and the role of AI in scientific programming [4].
- The pursuit of zettascale computing was discussed, with experts identifying system reliability and power consumption as the core obstacles to further scaling [4].
- The symposium provided a platform for global experts to share insights and reach consensus, which will significantly advance the integration of HPC and AI and help address future challenges and opportunities in computing [4].