量子位
The first "Hangzhou Six Little Dragons" stock is here! Founded by Zhejiang University alumni, 800 million RMB in annual revenue, sprinting toward an IPO
量子位· 2026-03-30 09:16
Core Viewpoint
- Qunhe Technology is in the final stage of its IPO process on the Hong Kong Stock Exchange, aiming to become the first "space intelligence stock" and the first company among the "Hangzhou Six Little Dragons" to complete an IPO [2][3]

Company Overview
- Founded in 2011 and headquartered in Hangzhou, Qunhe Technology focuses on GPU clusters and artificial intelligence technology [4]
- The company was co-founded by Huang Xiaohuang, Chen Hang, and Zhu Hao, and offers products such as CoolJia, Coohom, and the Qunhe Space Intelligence Platform [5]

Financing and Market Position
- Prior to the IPO, Qunhe Technology completed eight rounds of financing from notable investors including IDG Capital and Hillhouse Capital [6]
- The company had previously aimed for a US IPO in 2021 at a valuation of $2 billion but shifted to the Hong Kong market due to various factors [7][8]

Product and Service Structure
- Qunhe's product and service system consists of three layers: specialized infrastructure, a proprietary technology engine, and a product matrix centered on CoolJia [9][13]
- CoolJia, launched in 2013, is cloud-native space design software that became the largest of its kind in China amid the booming real estate market [12]
- Coohom targets overseas markets, providing localized space design solutions in multiple languages [15]
- The upcoming Qunhe Space Intelligence Platform (SpatialVerse) is designed to generate realistic virtual datasets for training AI models [16][18]

Market Share and Growth
- As of 2024, Qunhe is the largest space design software provider in China, holding approximately 23.2% market share [23]
- The space design software industry is projected to grow significantly, with the Chinese market expected to expand from 3.3 billion RMB in 2024 to 6.6 billion RMB by 2029 [25]

Financial Performance
- Qunhe's total revenue is projected to reach 820 million RMB in 2025, up from 754.83 million RMB in 2024 and 663.54 million RMB in 2023, indicating steady growth [26][27]
- The company achieved a gross margin of 82.2% in 2025, up from 76.8% in 2023, reflecting improved cost efficiency [34][35]
- Qunhe is expected to turn a profit in 2025, with an adjusted net profit of 57.1 million RMB, a significant improvement from previous losses [39][40]

Customer Base and Revenue Model
- As of December 31, 2025, Qunhe had 47,416 enterprise customers contributing 669 million RMB in subscription revenue, accounting for 84.2% of total subscription income [30]
- The company also grew its individual customer base to 416,175 users, generating 126 million RMB in subscription revenue [32]
- Qunhe employs a "land first, then expand" strategy, attracting users through a freemium model and converting them into paying customers [33]

Future Plans Post-IPO
- The funds raised from the IPO will primarily be used for international expansion, particularly in markets such as South Korea, Southeast Asia, India, the US, and Japan [46]
- Qunhe plans to enhance existing product functionality, especially in AIGC and geometric modeling, and invest in core technologies and infrastructure [47][48]
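As a quick sanity check on the revenue trajectory above, the year-over-year growth rates implied by the three figures in the summary can be computed directly. The revenue numbers come from the article; the script itself is just illustrative arithmetic:

```python
# Revenue figures (million RMB) as stated in the article summary.
revenues = {2023: 663.54, 2024: 754.83, 2025: 820.00}

years = sorted(revenues)
for prev, curr in zip(years, years[1:]):
    growth = (revenues[curr] - revenues[prev]) / revenues[prev] * 100
    print(f"{prev} -> {curr}: {growth:.1f}% YoY")
# Output:
# 2023 -> 2024: 13.8% YoY
# 2024 -> 2025: 8.6% YoY
```

So growth is steady but decelerating in percentage terms, which is consistent with the summary's emphasis on margin improvement and a first profitable year rather than top-line acceleration.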
Making depth information a core VLM input! 视启未来 × Tsinghua × IDEA help robots understand the physical world
量子位· 2026-03-30 03:39
Core Viewpoint
- The article discusses the limitations of current Vision-Language Models (VLMs) in physical interaction and introduces the SpatialPoint framework, which integrates depth information to enhance the spatial perception and interaction capabilities of AI systems [5][11][32]

Group 1: Limitations of Current VLMs
- Current VLMs can recognize objects but struggle with spatial operations because they rely on RGB images without accurate depth information, leading to issues such as misgrasping and collisions [6][8]
- Traditional VLMs output 2D bounding boxes and semantic labels, which lack the actionable 3D coordinates needed for robotic execution, creating a gap between perception and action [8][9]
- Existing approaches often treat real and virtual points separately, lacking a unified framework to predict both types of critical spatial points needed for effective interaction [9][12]

Group 2: Introduction of the SpatialPoint Framework
- SpatialPoint addresses the shortcomings of traditional VLMs by incorporating structured depth information as a core input alongside RGB and language data, enabling direct output of actionable 3D coordinates [11][12]
- The framework employs a two-stage training strategy to integrate depth information without compromising the existing capabilities of pre-trained VLMs [17][19]
- SpatialPoint predicts both TouchablePoints (real points) and AirPoints (virtual points) simultaneously, significantly improving the efficiency and accuracy of robotic tasks [11][13]

Group 3: Technical Implementation
- The framework includes a depth encoding process that converts single-channel depth maps into a format compatible with RGB inputs, ensuring aligned feature extraction [16]
- Multi-modal collaborative reasoning is enabled by specific boundary markers for depth tokens, allowing integrated processing of RGB, depth, and language features [17][18]
- The output is structured as 3D coordinates (u, v, Z) that robotic systems can interpret directly, reducing the complexity of translating model predictions into executable actions [18]

Group 4: Experimental Results
- SpatialPoint demonstrated a significant improvement in identifying effective operational positions, achieving a 79% success rate in locating TouchablePoints, compared with 74.1% and 50.3% for other models [23]
- For AirPoints, the model achieved a 50.71% success rate in direction finding and a 33.47% success rate in locating positions within 5 centimeters, outperforming traditional models [26]
- The framework consistently exceeded other models in complex spatial positioning tasks, indicating robustness across a range of scenarios [28]

Group 5: Practical Applications
- SpatialPoint has been validated in real-world robotic applications, successfully executing tasks such as object retrieval and navigation without model fine-tuning [29][30]
- The framework's unified visual interface allows integrated multi-task operation, enhancing the efficiency of robotic systems in dynamic environments [31]
- By addressing the core challenges of spatial interaction, SpatialPoint aims to help AI move from virtual environments into real-world applications, contributing to the development of embodied intelligence [32][36]
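The (u, v, Z) output format mentioned above is a pixel coordinate plus a depth value. The article does not include SpatialPoint's conversion code, but turning such a prediction into a metric 3D point for a robot is standard pinhole-camera back-projection; a minimal sketch, where the camera intrinsics (fx, fy, cx, cy) are hypothetical illustration values, not from the paper:

```python
def backproject(u, v, Z, fx, fy, cx, cy):
    """Convert a pixel-plus-depth prediction (u, v, Z) into a metric
    3D point (X, Y, Z) in the camera frame using the standard pinhole
    camera model. Z is the depth along the optical axis in meters."""
    X = (u - cx) * Z / fx
    Y = (v - cy) * Z / fy
    return X, Y, Z

# Hypothetical intrinsics for illustration (not from the article):
point = backproject(u=320, v=240, Z=0.5, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
print(point)  # -> (0.0, 0.0, 0.5): a point on the optical axis, 0.5 m away
```

A real deployment would then transform this camera-frame point into the robot's base frame using the camera's extrinsic calibration, which is the step that makes the coordinates directly executable.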
量子位 is hiring editors and writers
量子位· 2026-03-30 03:39
Core Viewpoint
- The article emphasizes the ongoing AI boom and invites individuals to join 量子位 (Quantum Bit), which focuses on tracking AI advancements and has established itself as a leading content platform in the industry [1]

Group 1: Job Opportunities
- The company is hiring for three main areas: AI Industry, AI Finance, and AI Product, seeking content experts in these fields [3][5]
- Positions are available as both full-time and internship roles, with opportunities for recent graduates to transition into permanent positions [5][9]

Group 2: AI Industry Direction
- Responsibilities include tracking innovations in AI infrastructure, such as chips and cloud computing, and producing accessible interpretations of technical reports and papers [7][9]
- Candidates should have a basic understanding of chips, GPUs, NPUs, servers, and cloud computing; those with technical backgrounds in engineering or computer science are preferred [9]

Group 3: AI Finance Direction
- The role focuses on venture capital, AI startups, and public companies, analyzing capital movements within the industry [8][9]
- Candidates should be data-sensitive, interested in financial reports and strategic planning, and possess strong logical structuring skills [9]

Group 4: AI Product Direction
- This position involves evaluating AI applications and tracking new product releases across platforms including mobile, PC, and automotive [10][12]
- Candidates should have a keen understanding of smart hardware trends and strong, structured communication skills [12]

Group 5: Company Overview
- By 2025, Quantum Bit had over 2.4 million WeChat subscribers and more than 7 million users across platforms, with daily reading volume exceeding 2 million [11]
- The company is recognized as the top media outlet in AI and frontier technology according to third-party data platforms [11]
A homegrown world model tops the global rankings! Far ahead of Google and NVIDIA, with 3D accuracy approaching a perfect score
量子位· 2026-03-30 03:39
Core Insights
- The article highlights the achievement of GigaWorld-1, developed by the Chinese company 极佳视界, which has surpassed major competitors such as Google and NVIDIA to become the top-ranked embodied world model globally [1][10]

Group 1: GigaWorld-1 Performance
- GigaWorld-1 is the only embodied world model to score over 60 on the WorldArena leaderboard, with an overall score of 62.34% [2]
- It leads in several key dimensions, including Visual Quality (63.04), Motion Quality (39.16), Content Consistency (65.17), Physics Adherence (97.02), and 3D Accuracy (57.28) [2][6]
- The model shows a 16% improvement in Physics Adherence over the second-ranked model, Ctrl-World [6]

Group 2: Technical Advancements
- GigaWorld-1 is designed as an Action-Conditioned World Model (AC-WM), integrating explicit action modeling and a differentiable physics engine for accurate physical interactions [11][14]
- The model has been trained on high-quality real-robot operation video data, enhancing its generalization in open scenarios [14]

Group 3: Company Background and Funding
- 极佳视界 is recognized as the first company in China to focus on world models, combining technology development with substantial financing [20]
- The company recently completed a nearly 1 billion yuan Pre-B financing round, attracting investment from top firms in the semiconductor and automotive industries [21][22]
- Previous backers include Huawei's Hubble Investment as a strategic investor, indicating strong interest in the world model sector [24][25]

Group 4: Product Ecosystem
- The company's product matrix includes GigaWorld, a world model platform; GigaBrain, an embodied foundation model; and Maker, a general-purpose embodied robot body [28]
- GigaWorld serves as a digital sandbox for simulating physical-world operations and generating high-fidelity synthetic data, achieving efficiency improvements of 10-100x over traditional simulators [30][32]

Group 5: Team and Expertise
- The core team of 极佳视界 includes experts with extensive experience in physical AI, robotics, and world models, led by founder and CEO Huang Guan, who has a strong background in automation and AI competitions [41][46]
- The team has a track record of global recognition in AI competitions and has published numerous influential papers in the field [44][47]
The Hugging Face robot that even Jensen Huang endorsed is selling out, and the company behind it is from China
量子位· 2026-03-30 02:35
Core Insights
- Hugging Face's Reachy Mini has achieved impressive sales, surpassing $1 million in revenue within 5 days and shipping over 3,000 units [2]
- The open-source philosophy is being effectively brought into the embodied intelligence sector, with companies like Seeed Studio playing a crucial role [4][9]
- Reachy Mini is positioned as both an AI companion and a platform for developers, emphasizing its dual function [27]

Group 1: Product Overview
- Reachy Mini is a highly open desktop robot with multimodal interaction capabilities, including vision and voice [12][18]
- It features nine axes of motion, allowing it to express emotions and states through physical movement [13][14]
- The robot is designed for both entertainment and development, supporting a wide range of applications thanks to its open-source nature [15][19]

Group 2: Company Role and Ecosystem
- Seeed Studio is a key player in the hardware system platform, providing essential components such as control solutions, sensors, and open-source robot kits [6][9]
- The company is a premier partner in the NVIDIA Jetson ecosystem, strengthening its position in the embodied intelligence industry [8][9]
- The collaboration between Hugging Face and Seeed Studio aims to connect models, hardware, and developer ecosystems, fostering innovation [9]

Group 3: User Experience and Accessibility
- Reachy Mini is designed to be user-friendly, with assembly instructions simple enough that even children can build it [45]
- Its design includes customizable features, enabling users to modify its appearance easily [30][34]
- The platform aims to lower barriers for developers and non-developers alike, promoting engagement with embodied intelligence [86][90]

Group 4: Market Position and Future Trends
- The future of low-cost robots is not solely about price but about accessibility and usability for developers [75][77]
- The industry is moving toward modularity and standardization, which will advance the development of embodied intelligence [97][130]
- Community and open-source development are seen as vital for the growth and sustainability of the robotics ecosystem [94][138]
200,000 4D interaction sequences plus kinematic anchoring: NTU stops generative simulation from "hallucinating" robot motions
量子位· 2026-03-30 02:35
Core Viewpoint
- The article discusses Kinema4D, a high-fidelity 4D embodied simulator developed by NTU MMLab, which aims to improve robot-environment interaction modeling by overcoming the limitations of traditional simulators and 2D video generation models [2][3]

Background and Challenges
- Robot-environment interaction simulation is crucial for data augmentation, policy evaluation, and reinforcement learning in embodied intelligence; traditional physics simulators suffer from insufficient visual realism and reliance on preset physical rules, making them hard to scale to complex new scenarios [7]
- Recent efforts have used video generation models to synthesize robot-environment interactions, bypassing cumbersome physical modeling [8]
- Existing generative simulation methods have two key deficiencies [9]:
  1. Dimensional limitations: most models are confined to 2D pixel space and lack the necessary 4D spatiotemporal constraints.
  2. Insufficient accuracy: reliance on high-level language instructions and static environment priors leads to imprecise control and dynamic guidance.

Core Method
- Kinema4D's core motivation is to guarantee precise robot control while restoring the 4D spatiotemporal nature of interactions. It adopts a "simulation decoupling" design, breaking the interaction process into robot control and the resulting environmental changes [13]
- Two insights support this design [13]:
  1. Kinematics-driven, precise 4D action representation: robot actions in 4D space are physically deterministic, not predicted by the generative model.
  2. Controllable generative modeling of 4D environmental responses: the model focuses on synthesizing dynamic environmental responses rather than modeling the robot's own kinematics.

Dataset
- The article introduces Robo4D-200k, the largest 4D robot interaction dataset, comprising 201,426 high-fidelity interaction sequences. The dataset integrates diverse real-world demonstration data and synthetic data to provide robust reasoning capabilities for embodied foundation models [17]

Experimental Analysis
- Kinema4D was benchmarked along three dimensions: video generation quality, geometric quality, and downstream policy evaluation. It achieved leading results in video generation quality, outperforming existing models [18]
- On geometric quality, Kinema4D outperformed another 4D generative simulator, accurately replicating real trajectory execution [22]
- The simulator's results align closely with actual execution performance: it synthesizes successful execution trajectories and accurately identifies failure cases, even under out-of-distribution conditions [29]

Summary and Outlook
- Kinema4D marks a shift in robot simulation from traditional 2D pixel generation to 4D spatiotemporal reasoning, successfully integrating deterministic kinematic control with dynamic environmental feedback [30]
- The article highlights Kinema4D's potential to bridge virtual and real-world applications, showing strong zero-shot generalization. Future work may incorporate explicit physical laws into generative networks to address extreme physical scenarios [30]
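The "kinematics-driven" insight above is that the robot's own motion is computed, not generated. The article does not include Kinema4D's implementation, but the idea can be sketched with textbook forward kinematics for a planar two-link arm; the link lengths and joint trajectory below are made-up illustration values:

```python
import math

def fk_2link(theta1, theta2, l1=0.3, l2=0.25):
    """Forward kinematics of a planar 2-link arm: joint angles (rad) ->
    end-effector position (m). The point of the sketch: given joint
    commands, the robot's pose is computed deterministically and is
    never left for a generative model to predict."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

# A joint-space trajectory yields a deterministic spatiotemporal action
# track; a generative model would then synthesize only the environment's
# response to it (object motion, contacts, appearance changes).
trajectory = [fk_2link(t * 0.1, t * 0.05) for t in range(10)]
```

Anchoring the action channel this way is what the article calls "simulation decoupling": the generator never has to invent (or "hallucinate") the robot's motion, only the environment's reaction.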
Major upgrade to the DeepSeek web version! An 11-hour outage then trends on social media; the new model has really arrived
量子位· 2026-03-30 02:35
Core Viewpoint
- DeepSeek experienced a significant service interruption lasting over 8 hours, which users interpreted as a sign of a model upgrade rather than a typical outage [1][2]

Group 1: Service Interruption and Model Upgrade
- Users reporting the disruption noted changes in the DeepSeek web version that suggested a substantial enhancement in model capabilities [4]
- For instance, the model's performance in generating SVG images, such as a pelican riding a bicycle, showed marked improvement on March 29 compared with the previous week [5]
- DeepSeek has a history of silently upgrading models without prior announcement, suggesting this is not an isolated incident [8]

Group 2: Model Versioning and Knowledge Cutoff
- The updated model, which identifies itself as DeepSeek-V3, gives a more stable self-introduction than the previous version, which was unclear about its own version number [10]
- The knowledge cutoff appears to have moved: the model now knows U.S. election results up to 2025 but not events from February 2026, suggesting a cutoff around January 2026 [11]

Group 3: Performance and Future Developments
- The model's ability to generate front-end page code improved significantly as of March 29 [14]
- Despite the service restoration, some issues remain, such as the model ceasing output after deep-thinking mode without displaying the answer in the main text [18]
- The company recently opened 17 positions related to agent development, hinting at significant upcoming advances [21]
The last standard-bearer of American open-source AI has also fallen
量子位· 2026-03-30 01:34
Core Viewpoint
- The Allen Institute for Artificial Intelligence (AI2) is significantly reducing funding for open-source model development, including OLMo, and shifting focus toward AI applications, which has led key personnel to depart for Microsoft [1][27][39]

Group 1: Personnel Changes
- Key members of AI2, including former CEO Ali Farhadi and COO Sophie Lebrecht, have left to join Mustafa Suleyman's superintelligence team at Microsoft [2][10]
- Farhadi's departure ends more than two and a half years of leadership at AI2, where he was instrumental in a range of AI projects [11][13]
- Other notable departures include Hanna Hajishirzi and Ranjay Krishna, both of whom were involved in major AI initiatives at AI2 [3][19]

Group 2: Funding and Strategic Shifts
- AI2 board chairman Bill Hilf said the organization struggles to compete with tech giants like OpenAI and Google, which invest billions in training advanced models [27][28]
- The funding model, primarily supported by the Paul G. Allen Family Foundation, is shifting from annual funding to project-proposal-based funding, which may limit AI2's ability to pursue long-term open-source projects [33][38]
- Training a cutting-edge model like GPT-4 is estimated to cost $100 million to $200 million, highlighting the financial challenge facing non-profits like AI2 [29][30]

Group 3: Impact on Open-Source AI
- The reduction in AI2's commitment to open-source model development is seen as a major setback for the open-source AI community, with many concerned about the future of open-source initiatives in the U.S. [39][41]
- AI2's OLMo series was recognized for its transparency and open-source principles, but the recent changes may undermine these efforts [42][46]
- The shift toward AI applications rather than foundational model development could widen the gap between U.S. and Chinese open-source AI capabilities [58][65]

Group 4: Future Outlook
- Despite the challenges, AI2's interim CEO, Peter Clark, has stated that the organization remains committed to its mission and ongoing collaborations, such as the OMAI project with the NSF and NVIDIA [52]
- The open-source AI landscape is evolving, with U.S. companies increasingly adopting models from China, indicating a shift in the global open-source dynamics [64][66]
The more humans hesitate over a prediction, the greater this large model's advantage
量子位· 2026-03-30 01:34
Core Viewpoint
- UniPat AI has built a comprehensive predictive-intelligence infrastructure called Echo, comprising a dynamic evaluation engine, a future-event training paradigm, and a dedicated predictive model, EchoZ-1.0, which has shown significant predictive advantages even over human trading markets [1][3]

Group 1: Echo System Overview
- Echo consists of three tightly coupled components: a continuously operating dynamic evaluation engine, a future-event training process (Train-on-Future), and a planned AI-native prediction API [4]
- The core model, EchoZ-1.0, is the first large language model trained end-to-end under the Train-on-Future paradigm; it ranks first on the General AI Prediction Leaderboard with an Elo score of 1034.2, ahead of Google's Gemini-3.1-Pro and Anthropic's Claude-Opus-4.6 [5]

Group 2: Validation Challenges
- The predictive capability of models has drawn growing attention, but a fundamental validation problem remains: how do you prove an ability to predict the future? [2]
- Existing benchmarks primarily measure language understanding and reasoning, which do not equate to actual predictive performance [2]

Group 3: Robustness and Verification
- EchoZ-1.0 held first place across all sensitivity tests, a stability other models, such as GPT-5.2, could not match [8]
- The model's performance is also compared against real human traders, with EchoZ showing a significant Elo advantage over this baseline [8]

Group 4: Predictive Performance Comparison
- Across domains, EchoZ posts a 63.2% win rate in governance, 59.3% on long-term predictions (over 7 days), and 57.9% in high-uncertainty scenarios [10][11]
- Its advantage is most pronounced in complex scenarios where human intuition is least reliable [11]

Group 5: Dynamic Evaluation Engine
- Echo's evaluation engine is dynamic, continuously updating rankings and generating new predictive questions from real-time data streams, addressing the structural problems of static benchmarks [13][15]
- The system draws on three data pipelines: prediction markets, real-time trends, and expert contributions in specialized fields [19][21]

Group 6: Train-on-Future Paradigm
- The Train-on-Future paradigm addresses the limitations of traditional training methods by generating high-information predictive questions from real-time data, thus avoiding data leakage [28][30]
- It incorporates three core mechanisms: dynamic question synthesis, automated rubric search for evaluating reasoning quality, and a Map-Reduce agent architecture for distributed processing [31][35]

Group 7: Future Developments
- UniPat plans to package EchoZ-1.0's predictive capabilities into an AI-native Prediction API that accepts natural-language predictive questions and returns structured reports with probability distributions and evidence chains [37]
- The integration of predictive capability into decision-making across sectors, including finance and corporate strategy, is expected to expand significantly [38]
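The leaderboard rankings above are reported as Elo scores. The article does not describe Echo's exact rating mechanics, but for context, the standard pairwise Elo update that such leaderboards are typically built on looks like this:

```python
def elo_update(r_a, r_b, score_a, k=32):
    """One round of the textbook Elo update. score_a is 1.0 for a win
    by A, 0.5 for a draw, 0.0 for a loss. Returns the new ratings for
    A and B. (Echo's actual rating scheme is not detailed in the
    article; this is the generic formula, shown for background.)"""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta

# Equal ratings, A wins: A gains k/2 = 16 points.
print(elo_update(1000.0, 1000.0, 1.0))  # -> (1016.0, 984.0)
```

Under this scheme, beating a higher-rated opponent moves a rating more than beating a lower-rated one, which is why a stable Elo lead across many matchups (as claimed for EchoZ-1.0) is a meaningful signal rather than a raw win count.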
Will Skills eat apps? In the "lobster era," this question deserves a serious conversation | Salon registration
量子位· 2026-03-30 01:34
Core Viewpoint
- The article discusses a potential shift from traditional apps to "Skills" in the context of AI agents, suggesting that Skills may replace apps as the primary unit of software distribution and interaction [3][5][14]

Group 1: The Shift from Apps to Skills
- There is a growing sentiment that apps may become redundant, with Skills emerging as callable units of capability embedded in agent workflows [3][5]
- The article asks whether products evolving into Skills represent an opportunity or a downgrade in product development [7]
- Product forms are transforming rapidly, signaling a significant change in how software is designed and used [15][14]

Group 2: AI Salon and Community Engagement
- The company's "AI Salon" will explore whether Skills will indeed replace apps, inviting industry leaders to share their insights [5][18]
- The event encourages participants to bring their questions and ideas, fostering a collaborative discussion of the future of AI applications [6][18]
- The salon is positioned as a platform for AI practitioners to engage in deep discussion of practical applications and future opportunities in the AI landscape [18][19]