VLM Models
MinerU Completes Compute Adaptation for More Than Ten Domestic AI Chip Makers
Xin Lang Cai Jing (Sina Finance) · 2026-02-12 04:40
Core Viewpoint
- Shanghai Artificial Intelligence Laboratory's OpenDataLab and DeepLink teams, working with domestic chip manufacturers, have adapted the MinerU AI document parsing tool to more than ten mainstream domestic computing platforms, broadening its ecosystem compatibility [1]

Group 1: Collaboration and Development
- The OpenDataLab and DeepLink teams partnered with more than ten domestic chip manufacturers, including Ascend, PingTouGe, MuXi, HaiGuang, SuiYuan, MoErThread, TianShuZhiXin, HanWuJi, KunLunChip, TaiChuYuanQi, and BiRan [1]
- The collaboration aims to implement full-stack optimization through hardware-software co-design [1]

Group 2: Product Features
- MinerU, developed by the Shanghai Artificial Intelligence Laboratory, is an AI document parsing tool built on a self-developed vision-language model (VLM) [1]
- The tool achieves 99% accuracy in capturing elements from PDFs and complex web pages [1]
He Xiaopeng: On the Road to Large Models, Everyone Is Crossing the River by Feeling the Stones | 36Kr Exclusive Interview
36Kr · 2025-06-12 11:29
Core Insights
- Xiaopeng Motors has introduced the G7 SUV, which features the self-developed Turing AI chip and an effective computing power of over 2,200 TOPS, well above competitors' offerings [1][2]
- The G7 is positioned as the first AI vehicle with L3-level computing power, with a starting price of 235,800 yuan [1][2]
- The Turing chip is intended to secure a five-year safety margin in computing power, which the company considers essential for future advances in autonomous driving [1][2]

Summary by Sections

Product Launch and Features
- The G7 SUV was unveiled on June 10 with three Turing AI chips, which together provide computing power equivalent to nine Orin-X chips [1]
- The G7 is designed to meet the growing demand for computing power in autonomous driving, with a pre-sale price starting at 235,800 yuan [1][2]

Technology and Innovation
- The Turing AI chip's effective computing power is claimed to be 3-28 times that of other industry chips, with current mainstream solutions offering around 508 TOPS [1][2]
- Xiaopeng's strategy includes on-vehicle deployment of VLA-OL and VLM models, enhancing both driving assistance and smart cockpit features [3][4]

Market Position and Competition
- The G7 is expected to fill the price gap between the G6 and G9 models, targeting the 200,000-250,000 yuan electric SUV market, where it will compete with models such as Xiaomi's YU7 and Li Auto's i6 [6]
- The industry is shifting toward high-computing-power solutions, with competitors such as Tesla also advancing their hardware capabilities [5][6]

Future Outlook
- Xiaopeng aims to improve the Turing chip's performance through ongoing optimization, and expects significant feature updates via OTA in the coming months [8][9]
- The company acknowledges the difficulty of keeping pace with rapid technological change in autonomous driving and emphasizes the need for continuous innovation [6][14]
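Talking AI Again After More Than 130 Days! Li Xiang Reveals the Three Stages Toward VLA and Responds to Whether "Intelligent Driving" Should Be Halted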
Mei Ri Jing Ji Xin Wen (National Business Daily) · 2025-05-08 02:01
Group 1
- Li Xiang's core argument is that the true breakthrough of artificial intelligence (AI) will come when it becomes a production tool, much as people employ human drivers [2][6]
- Li Xiang emphasizes that the company's VLA (Vision-Language-Action) model is a significant advance in AI, allowing natural-language communication with the driver agent and improved decision-making [2][3]
- The VLA model is described as a combination of end-to-end and vision-language models, which lets it handle complex traffic scenarios better than previous models [3][4]

Group 2
- The evolution toward VLA is outlined in three stages: rule-based algorithms first, then end-to-end + VLM, and finally VLA, which aims to emulate human intelligence [4][6]
- Li Xiang asserts that VLA is currently the most capable architecture, although implementing it is difficult because of its greater complexity and hardware requirements [6][7]
- According to the article, industry consensus holds that VLA could serve as a critical bridge from L2 driver assistance to L4 autonomous driving [6]