Vision-Language-Action (VLA) Models


机器之心· 2026-01-22 08:13

**Core Insights**

- The AAAI 2026 conference announced five outstanding papers, three of them led by Chinese teams from various universities, highlighting the significant contributions of Chinese researchers to artificial intelligence [1][2].

**Summary by Sections**

**Conference Overview**

- AAAI 2026 takes place in Singapore from January 20 to 27. It received 23,680 submissions and accepted 4,167 papers, an acceptance rate of 17.6% [2].

**Awarded Papers**

- **Paper 1: ReconVLA: Reconstructive Vision-Language-Action Model as Effective Robot Perceiver**
  - This paper addresses the difficulty existing VLA models have in allocating visual attention to target areas. The authors propose ReconVLA, which uses an implicit alignment paradigm to improve the grounding of visual attention [4][5][6].
  - The model uses a diffusion Transformer to reconstruct the gaze regions corresponding to target objects, which improves its ability to use task-relevant visual information for precise manipulation. The authors also built a large-scale pre-training dataset of over 100,000 trajectories and 2 million data samples to improve the model's generalization [9].
- **Paper 2: LLM2CLIP: Powerful Language Model Unlocks Richer Cross-Modality Representation**
  - This research builds on the foundational CLIP model, integrating a powerful large language model (LLM) to improve performance on complex textual descriptions. The authors developed an efficient fine-tuning framework that embeds an LLM into a pre-trained CLIP model, achieving significant performance improvements without extensive retraining [10][12][16].
- **Paper 3: Model Change for Description Logic Concepts**
  - Although awarded, this paper has not yet been publicly released [17].
- **Paper 4: Causal Structure Learning for Dynamical Systems with Theoretical Score Analysis**
  - The authors introduce CADYT, a new method for causal discovery in dynamical systems that addresses the challenges of continuous-time evolution and unknown causal structure. The method combines exact Gaussian process inference with a greedy search strategy to identify causal structures from trajectory data [19][20][23][24].
- **Paper 5: High-Pass Matters: Theoretical Insights and Sheaflet-Based Design for Hypergraph Neural Networks**
  - This paper has not yet been publicly released; its authors are affiliated with several prestigious institutions [25][27].