3D Generation
From image to simulation! This AI makes 3D assets "ready out of the box," directly empowering robot training
量子位· 2025-11-23 04:09
Core Insights
- The article introduces PhysX-Anything, the first framework for generating 3D assets with physical properties directly from a single image, aimed at enhancing embodied AI and robotics applications [5][27][28].
Group 1: Framework Overview
- PhysX-Anything generates high-quality, sim-ready 3D assets that include explicit geometric structures, joint movements, and physical parameters, addressing the limitations of existing 3D generation methods [5][6].
- The framework employs a "coarse-to-fine" generation approach, using multiple dialogue rounds to create both global physical descriptions and detailed geometric information from a single image [8][14].
Group 2: Technical Innovations
- A novel 3D representation, inspired by voxel representation, achieves a 193x compression ratio while retaining geometric structure [9][27].
- The framework uses a tree-structured, VLM-friendly format to enrich physical attributes and textual descriptions, making them easier for the VLM to understand and reason about [12].
Group 3: Performance Evaluation
- PhysX-Anything outperforms existing methods such as URDFormer and PhysXGen on both geometric and physical attribute metrics, demonstrating superior generalization [18][20].
- In human evaluations, structures generated by PhysX-Anything received the highest scores for both geometric and physical attributes, confirming its effectiveness [22].
Group 4: Practical Applications
- The generated sim-ready 3D assets can be imported directly into simulators for various robot policy learning tasks, showing their practical utility in embodied intelligence applications [25][26].
- The framework is expected to drive a paradigm shift from "visual modeling" to "physical modeling" in 3D vision and robotics research [28].
A post-95 team building 3D large models lands a major collaboration with a top-tier game and is defining the new rules of 3D generation
Founder Park· 2025-11-18 11:06
Core Insights
- The article highlights the advances made by Yingmou Technology in 3D generation, particularly through its model Rodin and the latest iteration, Rodin Gen-2, which substantially improves generation quality and controllability [2][6][9].
Group 1: Company Achievements
- Yingmou Technology's Rodin model was showcased at GDC, capturing the attention of top game developers and leading to the successful application of 3D generation technology in mobile gaming [2].
- The company recently completed a multi-million dollar funding round led by BlueRun Ventures, with participation from ByteDance and Sequoia China, positioning it as a leading startup in the 3D large model sector [2].
- The research paper "CLAY" was nominated for best paper at SIGGRAPH, a significant milestone for the young team, which has focused on 3D research since its inception [2][3].
Group 2: Technological Innovations
- Rodin Gen-2 has been scaled up to a dataset numbering in the millions and a parameter count in the billions, resulting in a qualitative leap in generation quality, including smoother geometric surfaces and lower post-processing costs [6][9].
- The "Bang to Parts" feature lets users decompose generated models into smaller components, improving the controllability of 3D models and streamlining workflows in various applications [9][12].
- The model's ability to generate clean 3D meshes reduces the need for extensive repairs in software like Blender and Unity, making its output more production-ready [8].
Group 3: Industry Trends
- Major companies are increasingly investing in 3D generation technologies, with Roblox open-sourcing CUBE 3D and ByteDance releasing Seed3D 1.0 [6].
- Demand for rapid and accurate 3D model generation is driving innovation, with Yingmou's technology generating models in under 10 seconds to serve diverse industry needs [24].
- The team believes that 3D generation will serve as a foundational technology for various sectors, including digital content creation, industrial design, and AR/VR interactions [29].
AI Morning Brief | ByteDance launches a 3D generative large model; US judges admit that using AI led to errors in court rulings
Guan Cha Zhe Wang· 2025-10-24 02:00
Group 1
- ByteDance's Seed team launched a 3D generative model called Seed3D 1.0, which uses a Diffusion Transformer architecture to generate high-quality, simulation-grade 3D models from a single image [1].
- Kuaishou's StreamLake officially released an AI coding product matrix, including the intelligent development tool CodeFlicker and the self-developed KAT-Coder large models, with KAT-Coder-Pro V1 achieving a 73.4% solution rate on SWE-bench Verified, surpassing GPT-5 and Claude Sonnet 4 [2].
- Apple is reportedly considering acquiring Warner Bros. to expand its Apple TV streaming lineup, with other major players such as Amazon and Paramount also interested in bidding [3].
Group 2
- Two U.S. federal judges acknowledged that court rulings were flawed because AI was used in drafting and the drafts did not undergo the usual review process; both have since moved to improve their review methods [4].
- Amid worsening chip supply issues, semiconductor supplier Nexperia has reduced or suspended deliveries, causing concern in the German automotive industry, with Volkswagen forced to halt production at its Wolfsburg plant [5].
- Nexperia's largest packaging and testing facility is located in Dongguan, China, and handles about 70% of its global packaging work, which makes the facility critical to the automotive supply chain [5].
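Oct. 23 Xi Niu Cai Jing Evening Brief: equity fund launches see "one-day sell-out" funds again; JD subsidiary obtains Hong Kong insurance brokerage license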
Xi Niu Cai Jing· 2025-10-23 10:25
Group 1: Equity Fund Market
- The equity fund issuance market has seen a resurgence of "one-day sold-out" funds, with 16 equity funds selling out in a single day since September [1].
- The recently issued Huatai Bairui Yingtai Stable 3-Month Holding Mixed FOF fund raised over 5 billion yuan in a single day [1].
- The increase in active fund issuance indicates a notable rise in investor risk appetite [1].
Group 2: Banking and Financial Products
- As of the end of Q3 2025, the total scale of the banking wealth management market reached 32.13 trillion yuan, a year-on-year increase of 9.42% [1].
- The number of existing wealth management products in the market is 43,900, a year-on-year increase of 10.01% [1].
- Wealth management products from financial companies account for 91.13% of the total market [1].
Group 3: Corporate Developments
- JD's subsidiary Jingda HK Trading Co., Limited has obtained a Hong Kong insurance brokerage license, valid until October 2028 [1].
- ByteDance's Seed team launched a 3D generative model, Seed3D 1.0, which can create high-quality 3D models from single images [2].
- Nexperia (China) has assured clients that all products produced in China comply with local laws and regulations [2].
Group 4: Regulatory Actions
- The Beijing Securities Regulatory Bureau has ordered corrective measures for Beijing Sunshine Tianhong Asset Management Co., Ltd. due to non-compliance with information disclosure regulations [3].
Group 5: Financing and Investments
- New Stone Technology has completed over $500 million in Pre-IPO financing, with Tencent and other notable investors participating [7].
- Xinhua Securities has received approval from the China Securities Regulatory Commission to issue up to 10 billion yuan in technology innovation corporate bonds [7].
Group 6: Project Contracts and Investments
- Jinggong Steel Structure signed a contract for a project in Saudi Arabia worth 6.5 billion Saudi Riyals (approximately 1.23 billion yuan) [8].
- Chuanfa Longmang plans to invest 366 million yuan in a 100,000 tons/year lithium dihydrogen phosphate project [9].
Group 7: Financial Performance
- High-speed Rail Electric reported a 54.32% year-on-year increase in net profit for the first three quarters of 2025 [10].
- Huaguang Bio achieved a 146.55% year-on-year increase in net profit for the same period [11].
- Northern Navigation turned a profit, with net profit of 125 million yuan compared to a loss in the previous year [13].
Racing through Tokyo Game Show: the Game Show has gone AI too
量子位· 2025-09-27 07:00
Core Viewpoint
- The article highlights the significant presence and influence of Chinese companies at the Tokyo Game Show (TGS), showcasing advancements in AI technology and its integration into the gaming industry [1][36].
Group 1: Chinese Companies at TGS
- Major Chinese gaming companies such as NetEase, Tencent, and others have established impressive exhibition spaces, attracting numerous players [2][8].
- AI companies are also making their mark at TGS, demonstrating their capabilities and innovations in the gaming sector [8][10].
Group 2: AI Technology Showcase
- Alibaba's booth prominently featured its open-source models, including Tongyi Qianwen and Tongyi Wanxiang, offering a range of commercial solutions from IaaS to SaaS [11][12].
- The Model Studio platform and the AI development platform PAI were highlighted as part of Alibaba's offerings, indicating a strong push for AI integration in gaming [13][15].
Group 3: 3D Generation Technology
- Tencent Cloud emphasized its cloud computing capabilities for game security and operations, while also discussing the potential of mixed reality 3D technology [21][22].
- VAST's Tripo, a leading open-source 3D generation project, is gaining attention from game developers both domestically and internationally [26][27].
Group 4: AI Applications in Gaming
- HakkoAI, an AI gaming companion, showcased its ability to understand and interact with various games, outperforming several top general models in specific gaming scenarios [34].
- The integration of AI in gaming is creating new possibilities and enhancing player experiences, indicating a growing trend in the industry [36].
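3D generation fills its physics gap! The first systematically annotated physical 3D dataset goes live, along with an end-to-end framework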
量子位· 2025-07-23 04:10
Core Viewpoint
- The article discusses the introduction of PhysXNet, the first 3D dataset systematically annotated with physical properties, which aims to bridge the gap between virtual 3D generation and physical realism [1][3].
Group 1: Introduction of PhysXNet
- PhysXNet contains over 26,000 richly annotated 3D objects, covering five core dimensions: physical scale, materials, affordance, kinematic information, and textual descriptions [3][11].
- An extended version, PhysXNet-XL, includes over 6 million programmatically generated 3D objects with physical annotations [12].
Group 2: Current Research Landscape
- Existing 3D generation methods primarily focus on geometric structure and texture, neglecting the modeling of physical properties [2][8].
- The demand for physical modeling, understanding, and reasoning in 3D space is increasing, which calls for a comprehensive physics-based 3D object modeling system [8][9].
Group 3: Data Annotation Process
- The team designed a human-in-the-loop annotation process to efficiently collect and annotate physical information [16][19].
- The annotation framework consists of two main phases: initial data collection and determination of kinematic parameters [19].
Group 4: Generation Methodology
- PhysXGen is introduced as a novel framework for generating 3D assets with physical properties, using pre-trained 3D priors to achieve efficient training and good generalization [13][26].
- The method integrates basic physical properties during the generation process itself, optimizing structural branches for dual objectives [29][30].
Group 5: Experimental Evaluation
- The team conducted qualitative and quantitative evaluations of the model, comparing it against a baseline that uses a separate structure to predict physical properties [33][34].
- PhysXGen demonstrated significant improvements in generating physical attributes, achieving relative performance gains of 24%, 64%, 28%, and 72% across various dimensions [38].
Group 6: Future Directions
- The article emphasizes the importance of addressing key challenges in physical 3D generation tasks and outlines future research directions [43].
On the ground at CVPR: crowds gather at Chinese exhibitors' booths, and Tencent stands out with 40+ accepted papers
具身智能之心· 2025-06-18 10:41
Core Insights
- The article highlights the significant participation of Chinese companies in CVPR 2025, showcasing their technological advancements and commitment to AI development [4][9][46].
- Key trends identified include a focus on multimodal and 3D generation technologies, with Gaussian Splatting emerging as a prominent technique [8][15][17].
Group 1: Event Overview
- CVPR 2025 has gained increased attention and social engagement, with a record number of Chinese enterprises participating [2][4].
- The conference is recognized as a leading event in the field of computer vision, with the accepted papers indicating cutting-edge technological trends [12][13].
Group 2: Research Trends
- Multimodal and 3D generation are highlighted as popular research directions, with Gaussian Splatting being a frequently mentioned keyword in accepted papers [8][15][17].
- A total of 2878 papers were analyzed, revealing high-frequency terms such as "Multimodal" (75 occurrences) and "Diffusion Model" (153 occurrences) [16].
Group 3: Chinese Companies' Participation
- Chinese companies, particularly Tencent, have shown deep involvement, with Tencent alone having over 40 accepted papers across various research areas [33][34].
- The participation of Chinese firms in sponsorship and workshops indicates their commitment to the conference and the broader AI landscape [36][38].
Group 4: Technological Advancements
- Tencent's investment in AI research is substantial, with R&D spending exceeding 70.686 billion RMB in 2024, reflecting a strong commitment to technological innovation [46].
- The company has also made significant strides in patent applications, with over 85,000 applications filed globally [46].
Group 5: Talent Attraction
- The presence of Chinese companies at top conferences serves to attract talent, emphasizing the importance of technical recognition over salary for top-tier professionals [47].
- Tencent's diverse application scenarios, including WeChat and gaming, provide a robust ecosystem that supports ongoing technological development [49][50].
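A keyword tally like the one described under Research Trends can be sketched in a few lines of Python. This is only an illustration and not the article's actual methodology: the titles below are made-up placeholders, and `count_keyword_titles` is a hypothetical helper that counts how many titles contain each keyword, ignoring case.

```python
from collections import Counter


def count_keyword_titles(titles, keywords):
    """Count how many titles mention each keyword (case-insensitive substring match)."""
    counts = Counter()
    for title in titles:
        lowered = title.lower()
        for kw in keywords:
            if kw.lower() in lowered:
                counts[kw] += 1
    return counts


# Placeholder titles for illustration only; these are not real CVPR 2025 papers.
sample_titles = [
    "A Multimodal Diffusion Model for Scene Editing",
    "Fast Gaussian Splatting for Dynamic Scenes",
    "Diffusion Model Priors for 3D Generation",
]
keywords = ["Multimodal", "Diffusion Model", "Gaussian Splatting"]

counts = count_keyword_titles(sample_titles, keywords)
for kw in keywords:
    print(kw, counts[kw])
```

Counts produced this way depend heavily on the matching rule (substring versus whole word, titles versus abstracts), which may be one reason different write-ups report different totals for the same keyword.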
On the ground at CVPR: crowds gather at Chinese exhibitors' booths, and Tencent stands out with 40+ accepted papers
量子位· 2025-06-17 07:41
Core Insights
- The CVPR 2025 conference showcased significant participation from Chinese companies, highlighting their growing influence in the global AI and computer vision landscape [3][7][30].
- The conference emphasized advanced topics such as multimodal and 3D generation technologies, with Gaussian Splatting emerging as a key focus area [6][15][17].
- The acceptance rate for papers at CVPR 2025 was 22.1%, indicating a competitive environment and increasing recognition for high-quality research [11][13].
Group 1: Conference Highlights
- The conference received a record number of submissions, with 13,008 valid papers and 2,878 accepted, reflecting a growing interest in cutting-edge research [11].
- Key topics included multimodal models, diffusion models, and large language models, with "multimodal" appearing 175 times in accepted paper titles [14].
- The integration of computer vision and graphics was noted, with a significant rise in 3D-related research due to advancements in neural rendering [17][18].
Group 2: Chinese Companies' Participation
- Chinese companies, particularly Tencent, demonstrated strong engagement, with Tencent alone having over 40 accepted papers across various research areas [32].
- The participation of Chinese firms in sponsorship and workshops indicates their commitment to advancing technology and attracting talent [34][36].
- Tencent's investment in R&D reached approximately 70.686 billion RMB in 2024, showcasing its dedication to AI and technology development [44].
Group 3: Talent Acquisition and Development
- The conference served as a platform for companies to attract top talent, with Tencent's "Qingyun Plan" offering competitive salaries and career advancement opportunities [50][51].
- The focus on technical talent is evident, with 73% of Tencent's workforce in technology roles, emphasizing the importance of skilled personnel in driving innovation [51].
- The initiative aims to create a positive cycle where talent is nurtured and retained, contributing to the company's long-term technological advancements [46][48].
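The 22.1% acceptance rate follows directly from the two figures quoted in the summary above. A quick check in Python, using only those reported numbers (the variable names are illustrative):

```python
submitted = 13_008  # valid submissions reported in the summary
accepted = 2_878    # accepted papers reported in the summary

acceptance_rate = accepted / submitted
print(f"{acceptance_rate:.1%}")  # prints 22.1%
```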
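3D large-model company VAST raises tens of millions of dollars more; Tripo Studio, the world's first AI 3D workbench: from "algorithm leadership" to a "closed workflow loop"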
智通财经网· 2025-06-11 10:52
Core Insights
- VAST has successfully completed a multi-million dollar Pre-A+ funding round led by the Beijing Artificial Intelligence Industry Investment Fund, with participation from Jingya Capital and other investors [1][12].
- The company has launched Tripo Studio, the world's first AI-driven all-in-one 3D workspace, and is set to release the new algorithm Tripo 3.0, focusing on the development of the Tripo series of large models and the construction of an ecosystem platform [1][2].
- VAST aims to create a comprehensive product system that covers professional (PGC), influencer (PUGC), and general user (UGC) creator profiles, solidifying its global leadership in the 3D generation field [1][3].
Funding and Investment
- The recent funding round will primarily be invested in the research and development of the Tripo series and the Tripo Studio product [1].
- The Beijing Artificial Intelligence Industry Investment Fund and Jingya Capital express confidence in VAST's potential in the 3D model generation sector, highlighting the company's innovative capabilities and market opportunities [11][12].
Product Development
- VAST has iterated on the Tripo large model series, launching versions from Tripo 1.0 to Tripo 2.5, and has developed widely recognized 3D foundational models [2].
- Tripo Studio has received high praise from users, with a 2.5x increase in platform payment rates and an annual recurring revenue (ARR) surpassing $3 million [2].
- The company has introduced several innovative features in Tripo Studio, including intelligent part segmentation, magic texture brushes, intelligent low-poly generation, and automatic rigging, significantly enhancing the 3D creation process [4][5][6][8].
Market Position and User Engagement
- VAST has provided services to over 2 million 3D creators, 20,000 small developers, and 700 large enterprises, generating nearly 30 million models [2].
- The company aims to redefine the 3D content creation process, allowing non-professional users to independently complete the entire workflow [9].
- VAST collaborates with various industries, including gaming, industrial design, and home 3D printing, to enhance user engagement and creativity in 3D content generation [10].
Future Outlook
- VAST's CEO emphasizes the shift from merely providing tools to delivering complete solutions that enhance creator control and creativity [11].
- The company envisions a future where 3D content creation becomes as ubiquitous and creative as photography, transforming the industry landscape [12].
StepFun × LightIllusions jointly build the powerful 3D generation engine Step1X-3D, and open-source the full training pipeline code
机器之心· 2025-05-16 02:42
Core Viewpoint
- Step1X-3D is a newly released, open-source 3D model with a total of 4.8 billion parameters, designed to generate high-fidelity and controllable 3D content for applications including gaming, film, and industrial design [1][3].
Group 1: Data and Algorithm Optimization
- Step1X-3D is built on over 5 million raw data points, curated into a training library of 2 million high-quality, standardized samples, addressing the industry's data scarcity and quality issues [4].
- The model employs enhanced mesh-to-SDF conversion techniques that raise the success rate of watertight geometry conversion by 20%, improving its generalization ability and detail capture [7].
Group 2: 3D Native Generation
- The model features a two-stage architecture that decouples geometry and texture representation, so that generated models are structurally reliable and visually accurate and avoid geometric distortion [10].
- Geometry generation uses a hybrid VAE-DiT architecture to produce watertight TSDF representations, capturing rich geometric details through techniques such as sharp edge sampling [15].
- Texture generation builds on the SD-XL model, producing vibrant colors and realistic textures that stay consistent across multiple views and avoid common distortions and seams [16].
Group 3: Control and Usability
- Step1X-3D significantly improves the controllability and usability of 3D content generation, allowing users to intuitively adjust attributes such as symmetry and surface details [18][19].
- The architecture aligns closely with mainstream 2D generation models, which makes it easier to bring in established 2D control techniques and makes the creation process more precise [18].
Group 4: Performance Evaluation
- Step1X-3D underwent rigorous quantitative and qualitative assessments, outperforming several mainstream models in key dimensions and achieving the highest CLIP-Score among the compared models, indicating strong semantic consistency between generated content and input [23][25].
Group 5: Team and Vision
- The development teams, StepFun and LightIllusions, aim to advance AGI and focus on 3D AIGC and spatial intelligence technologies, with a commitment to enhancing 3D content production capabilities and commercializing 3D applications [27].
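For readers unfamiliar with the TSDF representation mentioned in the summary above, the following is a minimal sketch of what a truncated signed distance field looks like. It is not Step1X-3D's pipeline, which converts meshes to SDFs: here the shape is an analytic sphere, and `sphere_tsdf`, the grid resolution, and the truncation value are all assumptions chosen for illustration.

```python
import math


def sphere_tsdf(resolution=8, radius=0.5, truncation=0.2):
    """Sample a truncated signed distance field of a sphere on a cubic grid.

    Grid points span [-1, 1] on each axis. Values are negative inside the
    surface, positive outside, and clamped to [-truncation, truncation].
    """
    step = 2.0 / (resolution - 1)
    coords = [-1.0 + i * step for i in range(resolution)]
    grid = []
    for x in coords:
        plane = []
        for y in coords:
            row = []
            for z in coords:
                # Signed distance to the sphere surface, then truncated.
                sdf = math.sqrt(x * x + y * y + z * z) - radius
                row.append(max(-truncation, min(truncation, sdf)))
            plane.append(row)
        grid.append(plane)
    return grid


grid = sphere_tsdf(resolution=9)
print(grid[4][4][4])  # grid centre, deep inside the sphere: clamped to -0.2
print(grid[0][0][0])  # grid corner, far outside the sphere: clamped to 0.2
```

The surface is the zero level set of the field, and a closed zero level set is what makes the extracted geometry watertight. Truncation keeps only values near the surface informative, which is the property that distinguishes a TSDF from a plain SDF.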