ReconVLA
AAAI 2026 results are out: one paper scores 8/8/8/8/7! With 23,000+ submissions, the acceptance rate is only 17.6%
具身智能之心· 2025-11-11 00:02
Core Insights
- The AAAI 2026 conference received a record-high 23,680 submissions, with an acceptance rate of only 17.6%, indicating significantly stiffer competition than in previous years [3][4][45].

Submission Statistics
- AAAI 2026 drew 23,680 submissions, a substantial rise from 12,957 in 2025 [3][45].
- A total of 4,167 papers were accepted, up from 3,032 accepted papers in 2025, but the much larger submission pool pushed the acceptance rate down [4][45].

Research Highlights
- Researchers from various institutions showcased their accepted submissions, with notable works including:
  - "CogniTrust," which combines verifiable supervision with a three-tier memory model to enhance AI model reliability [12][14].
  - Papers on privacy protection in large models, multi-modal safety, and robust communication in autonomous driving [18][20].
  - "ReconVLA," which received review scores of 8/8/8/8/7 and proposes a new approach to visual representation learning [24][25].

Competitive Landscape
- The competition at AAAI 2026 was described as exceptionally fierce, with some reviewers noting that only highly innovative papers were accepted [43][46].
- The overall trend suggests that papers averaging around 5 or higher had a chance of acceptance, yet many authors faced rejection despite high scores [51][52].

Reviewer Experiences
- Some reviewers reported unusual experiences during the review process, including large post-rebuttal score adjustments and perceived biases in evaluations [48][56][62].
AAAI 2026 results are out: one paper scores 8/8/8/8/7, and with 23,000+ submissions the acceptance rate is only 17.6%
36Kr· 2025-11-10 09:55
Core Insights
- The AAAI 2026 conference received a record 23,680 submissions, with an acceptance rate of only 17.6%, indicating a far more competitive environment than in previous years [1][37][40].
- The conference will take place from January 20 to January 27, 2026, at the Singapore Expo, marking its 40th annual meeting [3].

Submission Statistics
- AAAI 2026 received 23,680 submissions, a significant increase from 12,957 in 2025 [1][37].
- A total of 4,167 papers were accepted, compared with 3,032 the previous year, while the acceptance rate fell from 23.4% to 17.6% [1][37].

Research Highlights
- Researchers from various institutions shared their accepted submissions, with notable works including "CogniTrust," which combines verifiable supervision with a three-tier memory model [5][7].
- Other accepted papers address critical areas such as privacy protection in large models, multi-agent safety communication, and robust methods for autonomous driving [11][12][16].

Notable Achievements
- A student from Peking University reported review scores of 8/8/8/8/7 for their paper on "CogniTrust" [5][18].
- Teams from Nanyang Technological University and the Hong Kong University of Science and Technology also reported multiple accepted papers, showcasing significant contributions to the field [10][18][27].

Community Reactions
- The competitiveness of AAAI 2026 has sparked discussion online, with some expressing concerns about the fairness of the review process and the influence of personal relationships on paper evaluations [35][40][46].
- There are reports of scoring discrepancies, with some reviewers allegedly adjusting scores after the rebuttal, raising questions about the integrity of the review process [42][48][51].
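As a quick check, the acceptance rates quoted in both write-ups follow directly from the reported counts:

$$\frac{4167}{23680} \approx 0.176 \;(17.6\%), \qquad \frac{3032}{12957} \approx 0.234 \;(23.4\%)$$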
ReconVLA: A Robot Perception Method Based on a Reconstructive VLA Model
具身智能之心· 2025-08-29 16:03
Core Viewpoint
- The article discusses the rapid development of Vision-Language-Action (VLA) models and introduces ReconVLA, a model that improves the precision of robotic actions by sharpening visual attention on target objects [2][3][27].

Summary by Sections

Introduction
- Existing VLA models struggle to allocate visual attention in complex scenes, leading to errors in object manipulation. Traditional methods for improving visual localization have not significantly improved attention distribution [6].

Model Overview
- ReconVLA introduces a reconstructive approach to visual grounding: the model first reconstructs the gaze region and then predicts actions. This implicit supervision forces the model to focus on the correct object, improving action precision [8][11][14].

Methodology
- The framework consists of two branches, visual reconstruction and action prediction. The model uses a frozen visual tokenizer to encode the gaze region and employs a diffusion transformer for denoising and reconstruction (a minimal sketch of this two-branch setup appears after this summary) [13][16].
- A large-scale dataset with over 100,000 trajectories and 2 million samples was assembled to pre-train the model, strengthening its visual generalization and implicit grounding capabilities [19].

Performance Results
- In simulation, ReconVLA achieved a success rate approaching 95% on long-horizon tasks, outperforming existing methods. The model also transferred well to unseen objects, maintaining success rates above 40% on novel items [9][26].
- In real-world tasks such as stacking bowls and placing fruit, the model showed significant improvements over previous models, reaching up to 90% success on specific tasks [25].

Contributions
- ReconVLA is the first model to adopt a gaze-region reconstruction paradigm, significantly enhancing visual attention and action-prediction accuracy. Extensive pre-training on diverse datasets provides a solid foundation for its performance across tasks [14][27].

Conclusion
- The study highlights the limitations of current VLA models in visual focus and presents ReconVLA as a solution that effectively directs attention to key objects, paving the way for more reliable multi-modal robotic control [27].
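The Methodology bullets describe a two-branch setup: a shared VLA backbone, a frozen visual tokenizer that turns the gaze region into target latents, a diffusion-style denoiser that reconstructs those latents, and an action head. The PyTorch sketch below shows how such a combined objective could be wired together. It is a minimal illustration under stated assumptions: the module sizes, the linear noising schedule, the 0.1 loss weight, and all names (e.g. `ReconVLASketch`) are placeholders, not the paper's released implementation.

```python
# Minimal sketch of a reconstructive VLA training step: an action-prediction
# branch plus an auxiliary gaze-region reconstruction branch. Shapes, schedule,
# and loss weighting are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ReconVLASketch(nn.Module):
    def __init__(self, d_model=512, n_gaze_tokens=16, action_dim=7):
        super().__init__()
        # Shared backbone over fused image + instruction tokens (stand-in transformer).
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)

        # Frozen visual tokenizer: provides fixed target latents for the gaze region.
        self.frozen_tokenizer = nn.Linear(3 * 32 * 32, d_model)
        for p in self.frozen_tokenizer.parameters():
            p.requires_grad = False

        # Branch 1: denoiser that reconstructs (predicts the noise of) gaze-region latents.
        self.denoiser = nn.Sequential(
            nn.Linear(2 * d_model + 1, d_model), nn.GELU(), nn.Linear(d_model, d_model)
        )
        # Branch 2: action head predicting a continuous end-effector command.
        self.action_head = nn.Linear(d_model, action_dim)
        self.n_gaze_tokens = n_gaze_tokens

    def forward(self, vis_lang_tokens, gaze_patches, expert_action):
        """vis_lang_tokens: (B, T, D) fused image + instruction tokens
        gaze_patches:    (B, n_gaze_tokens, 3*32*32) crops around the target object
        expert_action:   (B, action_dim) demonstration action"""
        ctx = self.backbone(vis_lang_tokens)             # (B, T, D)
        cond = ctx.mean(dim=1, keepdim=True)             # pooled conditioning vector (B, 1, D)

        # --- Reconstruction branch: denoise noised gaze-region latents. ---
        with torch.no_grad():
            target = self.frozen_tokenizer(gaze_patches)         # frozen targets (B, G, D)
        t = torch.rand(target.size(0), 1, 1, device=target.device)  # noise level in [0, 1]
        noise = torch.randn_like(target)
        noisy = (1 - t) * target + t * noise                     # simple linear noising schedule
        denoiser_in = torch.cat(
            [noisy,
             cond.expand(-1, self.n_gaze_tokens, -1),
             t.expand(-1, self.n_gaze_tokens, 1)],
            dim=-1,
        )
        recon_loss = F.mse_loss(self.denoiser(denoiser_in), noise)

        # --- Action branch: behaviour cloning on expert actions. ---
        pred_action = self.action_head(cond.squeeze(1))
        action_loss = F.mse_loss(pred_action, expert_action)

        # Reconstruction is implicit supervision: its gradient flows through the
        # shared conditioning vector, nudging the backbone to encode the gaze
        # region, while action prediction stays the primary objective.
        return action_loss + 0.1 * recon_loss


# Usage with random tensors standing in for real data:
model = ReconVLASketch()
loss = model(
    torch.randn(2, 64, 512),          # fused vision-language tokens
    torch.randn(2, 16, 3 * 32 * 32),  # gaze-region patches
    torch.randn(2, 7),                # expert actions
)
loss.backward()
```

The design point the sketch mirrors is that the gaze region is never fed to the action head directly; it only shapes the shared representation through the auxiliary reconstruction loss, which is the "implicit grounding" the summary refers to.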