Dataflow Architecture
Why NVIDIA Spent $20 Billion to Acquire Groq
半导体行业观察· 2026-01-01 01:26
This summer, AI chip startup Groq raised $750 million at a valuation of $6.9 billion. Just three months later, over the holidays, NVIDIA spent nearly three times that amount to license its technology and poach its people.

In the days that followed, the internet's AI experts speculated about how NVIDIA could justify paying $20 billion for Groq's technology and talent. The assumption was that NVIDIA knows something we don't. Theories abounded: NVIDIA plans to abandon HBM in favor of SRAM; it wants more foundry capacity from Samsung; it is trying to smother a potential competitor. Some theories are more convincing than others, and we have a few views of our own.

What we know so far

NVIDIA paid $20 billion for a non-exclusive license to Groq's intellectual property, including its Language Processing Unit (LPU) and the accompanying software libraries. The LPU is the foundation of Groq's high-performance inference-as-a-service offering, which Groq will retain and continue to operate without interruption after the deal closes.

The arrangement is plainly designed to sidestep regulatory scrutiny: NVIDIA is not acquiring Groq, merely licensing its technology. In practice, though... it is acquiring Groq.

Sorry to disappoint, but there is nothing special about SRAM. It is present in virtually every modern processor, including NVIDIA's ...
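The SRAM-versus-HBM question above ultimately reduces to bandwidth and capacity arithmetic: at batch size one, decode speed is bounded by how fast the weights can be streamed, while SRAM's tiny per-chip capacity forces large deployments. A back-of-the-envelope sketch; the model size, bandwidth, and SRAM figures below are illustrative assumptions, not vendor specifications:

```python
# Back-of-the-envelope: decode throughput of a memory-bandwidth-bound LLM.
# Each generated token must stream every weight once, so roughly:
#   tokens/s <= aggregate_bandwidth / model_bytes
# All numbers below are illustrative assumptions, not vendor specs.

def tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on single-stream decode rate when weight streaming dominates."""
    return bandwidth_bytes_per_s / model_bytes

MODEL_BYTES = 70e9          # a 70B-parameter model at 1 byte/weight (INT8), assumed
HBM_BW = 3.35e12            # ~3.35 TB/s, an H100-class HBM figure, assumed
SRAM_CAP_PER_CHIP = 230e6   # ~230 MB on-chip SRAM per chip, assumed

# SRAM is fast but small: many chips are needed just to hold the weights.
chips_needed = MODEL_BYTES / SRAM_CAP_PER_CHIP

print(f"HBM-bound decode: {tokens_per_second(MODEL_BYTES, HBM_BW):.0f} tok/s")
print(f"SRAM chips needed to hold the weights: {chips_needed:.0f}")
```

The point is not the exact numbers but the shape of the trade-off: on-chip SRAM buys bandwidth at the cost of capacity, which is why SRAM-only designs scale out to racks of chips.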
Li Auto CTO Xie Yan Shares the Design Philosophy of Li Auto's Autonomous Driving Chip at the Apsara Conference
理想TOP2· 2025-09-27 08:58
Core Viewpoint - The article discusses the evolution of intelligent driving algorithms and the importance of dataflow architecture in autonomous driving, emphasizing the need for advanced computational architectures to meet growing demands for processing power and reasoning capability.

Group 1: Evolution of Intelligent Driving Algorithms
- The evolution of autonomous driving algorithms falls into three phases: an initial phase of rule-based algorithms, a second phase of end-to-end (E2E) learning, and a current phase that integrates visual language models (VLM) with reinforcement learning (RL) to strengthen decision-making [4][5][6].

Group 2: Importance of Language Models
- Language models are deemed essential for long reasoning in autonomous driving: they let the system generalize and handle corner cases that cannot be covered by data collection or world models alone [7][8].
- There is also a psychological dimension: a driving model aligned with human values and reasoning is reassuring, and language models can help instill a human-like worldview in autonomous systems [8][9].

Group 3: Computational Architecture
- The article critiques the traditional von Neumann architecture, which prioritizes computation over data, and proposes a shift toward data-driven computation to better handle the complexity of AI workloads [12][13].
- The company has developed a distinctive NPU architecture organized around dataflow rather than a conventional SoC design, aiming to improve efficiency and performance in AI inference [17][18].

Group 4: Performance Metrics
- The company's NPU architecture is reported to significantly outperform existing solutions, reaching up to 4.4x the performance on CNN tasks and 2-3x on LLaMA2 7B tasks at a similar transistor count [2][18].
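The contrast drawn above between von Neumann control flow and data-driven computation can be made concrete with a toy scheduler: in a dataflow machine, an operator fires as soon as all of its operands have arrived, with no program counter dictating the order. This is a conceptual sketch only; the graph API and node names are my own illustration, not Li Auto's NPU design:

```python
from collections import defaultdict, deque

# Toy dataflow execution: a node fires when ALL of its inputs are available,
# not when a sequential program counter reaches it (the von Neumann model).

class DataflowGraph:
    def __init__(self):
        self.ops = {}                       # node -> (function, input names)
        self.consumers = defaultdict(list)  # node -> nodes that read it

    def add(self, name, fn, inputs):
        self.ops[name] = (fn, inputs)
        for i in inputs:
            self.consumers[i].append(name)

    def run(self, sources):
        values = dict(sources)              # node -> produced value
        ready = deque(n for n, (fn, ins) in self.ops.items()
                      if all(i in values for i in ins))
        while ready:
            node = ready.popleft()
            fn, ins = self.ops[node]
            values[node] = fn(*[values[i] for i in ins])
            # Firing this node may make downstream nodes ready to fire.
            for c in self.consumers[node]:
                if c not in values and all(i in values for i in self.ops[c][1]):
                    ready.append(c)
        return values

g = DataflowGraph()
g.add("mul", lambda a, b: a * b, ["x", "y"])
g.add("add", lambda m, c: m + c, ["mul", "z"])  # consumes "mul" when it fires
result = g.run({"x": 3, "y": 4, "z": 5})
print(result["add"])  # 3*4 + 5 = 17
```

On real dataflow hardware the "ready" nodes fire in parallel across the fabric; the serial queue here only models the firing rule.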
Focusing on "New Computing Power": Clear Microelectronics' New Architecture Helps AI Technology "Overtake by Changing Lanes"
Jing Ji Wang· 2025-09-18 09:15
Group 1
- The global AI chip market is shifting toward dataflow architecture, with companies such as SambaNova and Groq reaching valuations of $5 billion and $6 billion respectively [1]
- Clear Microelectronics, a Tsinghua University spin-off, has developed and mass-produced reconfigurable dataflow chip technology, positioning itself as a leader in this emerging field [1][2]
- Founder Wang Bo emphasizes the need for innovation beyond traditional GPU architectures to overcome limits in technology and materials, advocating a "leapfrog" approach similar to the automotive industry's transition to electric vehicles [2]

Group 2
- Clear Microelectronics' first "new computing power" chip, TX81, has secured over 20,000 orders and established intelligent computing centers across multiple regions of China within just six months of its launch [2]
- Investment institutions are increasingly recognizing the value of new computing power, with significant investments from major funds indicating a strong market trend toward dataflow architecture [3]
- The transition to dataflow architecture is seen as a critical signal for achieving self-sufficiency in the computing power industry, with support from initiatives like ChatGPT and DeepSeek3.1 [3]
The Core of Li Auto's Autonomous Driving Chip: Dataflow Architecture and Hardware-Software Co-Design
理想TOP2· 2025-09-05 04:56
Core Viewpoint - The article discusses the advances in Li Auto's self-developed chip architecture, particularly the VLA architecture and its implications for autonomous driving capability [1][2].

Group 1: Chip Development and Architecture
- Li Auto's self-developed chip is designed around a dataflow architecture that emphasizes hardware-software co-design, making it well suited to running large neural networks efficiently [5][9].
- The chip is expected to deliver 2x the performance of leading chips when running large language models such as GPT, and 3x on vision models such as CNNs [5][8].
- The timeline from project initiation to in-vehicle deployment is approximately three years, a rapid pace compared with similar projects [5][8].

Group 2: Challenges and Innovations
- Achieving real-time inference on the vehicle's chip is a significant challenge, with effort focused on optimizing performance through a range of engineering techniques [3][4].
- Li Auto is implementing innovative parallel decoding methods to improve the efficiency of action-token inference, which is crucial for autonomous driving [4].
- The integration of CPU, GPU, and NPU in the Thor chip aims to improve versatility and performance when processing large volumes of data, essential for autonomous driving applications [3][6].

Group 3: Future Outlook
- The company expresses strong confidence in its innovative architecture and full-stack development capability, which it expects to become key differentiators [7][10].
- The relationship between increased computing power and improved performance in advanced driver-assistance systems (ADAS) is highlighted, suggesting predictable capability gains as the technology evolves [6][9].
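The article does not say which parallel decoding method Li Auto uses for action tokens. One common family of techniques is draft-and-verify (speculative) decoding: a cheap proposer guesses several tokens ahead, and the expensive model checks them all in one parallel pass, keeping the longest correct prefix. The toy "models" below are stand-in functions of my own, purely to show the control flow:

```python
# Draft-and-verify parallel decoding, sketched with toy stand-in models.
# This is a generic technique, not Li Auto's (undisclosed) method.

def draft_propose(prefix, k):
    """Cheap proposer: guesses the next k tokens (toy rule: n -> n+1)."""
    out, last = [], prefix[-1]
    for _ in range(k):
        last += 1
        out.append(last)
    return out

def target_next(prefix):
    """Expensive 'ground-truth' model (toy rule: n+1, resetting after 5)."""
    last = prefix[-1]
    return 0 if last >= 5 else last + 1

def decode_parallel(prefix, k, steps):
    tokens = list(prefix)
    for _ in range(steps):
        proposal = draft_propose(tokens, k)
        # On real hardware all k positions are verified in ONE batched
        # forward pass; the loop below models only the accept/reject logic.
        ctx = list(tokens)
        for tok in proposal:
            correct = target_next(ctx)
            if tok != correct:
                ctx.append(correct)   # keep the target model's correction
                break
            ctx.append(tok)           # draft was right: accept for free
        tokens = ctx
    return tokens

print(decode_parallel([0], k=4, steps=2))  # [0, 1, 2, 3, 4, 5, 0]
```

When the draft model is usually right, each verification pass accepts several tokens at once, cutting the number of sequential passes of the big model, which is exactly what real-time in-vehicle inference needs.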
Breaking: Chinese Team Releases SRDA, a New Computing Architecture That Tackles AI Compute Costs at the Root. Has DeepSeek's "Prophecy" Come True?
Xin Lang Cai Jing· 2025-06-09 13:27
Core Insights
- The article discusses the challenges of current AI computing architectures, particularly the high cost of computational power relative to the value generated by large models, highlighting the need for innovative hardware solutions [1][3][5]
- The SRDA AI architecture white paper released by Yupan AI proposes a new system-level simplified reconfigurable dataflow architecture aimed at addressing the core bottlenecks in AI computing [3][6][17]

Current Challenges in AI Hardware
- The existing GPGPU architecture is a general-purpose solution that does not fully meet the specific needs of large-model training and inference, leading to inefficiencies [6][7]
- Many dedicated AI architectures designed before the large-model boom of 2023 did not account for these models' specific demands, resulting in low utilization rates and reliance on advanced manufacturing processes [7][8]

Key Features of Next-Generation AI Computing Chips
- The white paper identifies insufficient memory and interconnect bandwidth, low computational efficiency, complex network designs, and excessive power consumption as the major challenges for current AI architectures [8][12][18]
- The SRDA architecture emphasizes a dataflow-centric design that optimizes data movement and reduces memory-access frequency, which is crucial for performance and energy efficiency [11][12][14]

Innovations Proposed by SRDA
- SRDA integrates high-bandwidth, large-capacity 3D-DRAM memory directly into the computing chip, addressing memory bottlenecks effectively [11][14]
- The architecture features a unified network design that simplifies cluster complexity and reduces management overhead, potentially surpassing existing technologies such as NVLink [12][16]
- SRDA allows for reconfigurability to adapt to evolving AI models, focusing on core AI computations while minimizing unnecessary complexity [16][18]

Implications for the AI Industry
- The SRDA architecture presents a comprehensive, systematic answer to the I/O bottlenecks faced by AI computing and a blueprint for the development of AI chips [17][18]
- The adoption of the dataflow paradigm in AI chip design may shift industry standards, with more companies likely to explore similar architectures in the near future [17][18]
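Why a dataflow-centric design reduces memory-access frequency can be shown with simple traffic accounting: when a chain of operators is fused so intermediates flow directly between compute units, only the first input and the final output touch external memory. The operator counts and byte sizes below are illustrative assumptions, not figures from the SRDA white paper:

```python
# Illustrative traffic accounting for a chain of elementwise operators.
# Unfused: every op round-trips its tensor through external memory.
# Fused (dataflow-style): intermediates stay on chip between operators.
# All figures are illustrative assumptions, not from the SRDA white paper.

def traffic_unfused(n_elems: int, n_ops: int, bytes_per_elem: int = 2) -> int:
    # Each op reads its input from memory and writes its output back.
    return n_ops * 2 * n_elems * bytes_per_elem

def traffic_fused(n_elems: int, n_ops: int, bytes_per_elem: int = 2) -> int:
    # One read of the original input, one write of the final result;
    # everything in between flows through on-chip buffers.
    return 2 * n_elems * bytes_per_elem

N, OPS = 1 << 20, 4   # a 1M-element FP16 tensor through 4 chained ops
print(traffic_unfused(N, OPS) // traffic_fused(N, OPS))  # prints 4
```

The saving grows linearly with the length of the fused chain, which is why dataflow architectures emphasize mapping whole subgraphs, not single kernels, onto the chip.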