Multimodal Large Models
Judging from submissions, papers in the embodied-intelligence direction are already piling up...
具身智能之心· 2025-11-18 10:00
Core Insights
- The article discusses the increasing number of submissions to various conferences and researchers' concerns about which conferences suit their work and which directions reviewers prefer [1]
- It highlights the active research directions in embodied intelligence, including VLN, VLA, reinforcement learning, and real2sim2real, and provides guidance for newcomers on choosing a research focus [1][3]
- The article promotes a customized paper-mentoring service aimed at helping researchers navigate the complexities of paper writing and submission [3][4][5]

Group 1
- Many researchers are anxious about selecting the right conference and understanding which research directions reviewers favor [1]
- Humanoid robots are particularly active in reinforcement learning and sim2real/real2sim2real research, suggesting that labs with relevant embodiments should explore these areas [1]
- Mechanical-arm embodiments are suitable for VLA, VLA+RL, and diffusion-policy research, with VLA carrying a high computational-power requirement [1]

Group 2
- Quadrupedal robots are also suitable for reinforcement-learning research, although extensive prior work in this area may leave fewer points of novelty [2]
- Combining VLN and VLA with mobile manipulation could be a promising research direction [3]
- The article introduces a paper-mentoring service that offers one-on-one customized guidance across various top-tier conference topics, emphasizing the importance of a good idea and of helping new researchers avoid common pitfalls [3][4]

Group 3
- The mentoring service covers the full process from topic innovation to experimental design, code debugging, paper writing, and submission strategy, aimed at producing high-quality results quickly [4]
- It highlights a dual perspective of industrial and academic value, focusing not only on publishing papers but also on practical applications [5]
- The article offers a free matching service for the first ten inquiries, allowing researchers to have in-depth meetings with mentors matched to their research direction and academic background [6]
AI + Consumer Robotics: Ling Universe's Gu Jiawei on Two Waves of Dividends Creating New Opportunities, and Why a Good AI Product Must Be "Proactive"
IPO早知道· 2025-11-18 03:22
Core Insights
- Ling Universe, an AI and consumer-robotics company, has recently completed a 200 million RMB Pre-A funding round, with participation from major financial institutions and listed companies [7][10]
- The company aims to create "partner-type" AI robot products for global households and individuals, focusing on enhancing human-computer interaction [7][9]
- The funding will primarily be used for product technology development and market expansion, particularly in optimizing the LingOS operating system and multimodal AI interaction technology [7][9]

Company Background
- Founder Gu Jiawei has a strong background in human-computer interaction, having worked at Microsoft Research and Baidu, and has been recognized in various prestigious innovation lists [8]
- Ling Universe's previous product, Luka, was the world's first multimodal AI reading robot, with nearly 10 million units sold globally [9]

Product Offerings
- The product matrix includes reading robots for children aged 0-8, such as Luka and the portable AI companion Ling Universe Xiaofangji [9]
- The LingOS operating system and data flywheel are key technological barriers, enabling multimodal perception and proactive interaction [9][15]

Market Performance
- The Ling Universe Xiaofangji topped sales charts during major shopping events, with sales up more than 230% over the previous period [10]
- The company has secured multiple rounds of financing within a short time frame, indicating strong investor confidence [9]

Investment Insights
- Investors are attracted to Ling Universe by its clear path in the niche market of family AI terminals and robots, supported by strong technological capabilities [11][12]
- The company emphasizes the importance of a solid business model and the ability to adapt to market needs, which is crucial for attracting investment [12][14]

Target Audience
- The primary purchasing demographic for educational products is parents, who seek to balance their children's learning and entertainment [13]
- Ling Universe targets high-net-worth individuals willing to invest in innovative educational tools for their children [14]

Competitive Advantage
- Ling Universe's competitive edge lies in delivering personalized experiences through advanced AI algorithms and the extensive data accumulated from previous products [15][16]
- The company aims to create a seamless interaction experience that transcends traditional voice commands, focusing on proactive engagement [17][18]

Future Expansion
- Ling Universe plans to broaden its product offerings to serve users from children to the elderly, emphasizing the adaptability of its technology [20][21]
- The company is exploring international markets, leveraging its existing user base and adapting products to local demands [23][25][26]
From "Technical Strength" to "Growth Strength": Hikvision Advances Large-Scale AI Deployment
Zheng Quan Shi Bao· 2025-11-17 16:58
Core Viewpoint
- The rise of AI technology presents a significant opportunity for the smart-IoT sector, comparable to previous technological shifts such as the transition from analog to digital and from standard definition to high definition [5]

Group 1: Company Growth and Development
- Hikvision has grown from a small team to nearly 60,000 employees, becoming a global leader in security and smart IoT by seizing multiple technological paradigm shifts [1]
- Since its IPO in 2010, the company has accumulated a net profit of approximately 138 billion yuan and distributed cash dividends totaling around 68.5 billion yuan [6]
- The company has invested over 47.7 billion yuan in R&D over the past five years, maintaining an R&D expense ratio exceeding 10% [6]

Group 2: AI Integration and Product Development
- The majority of Hikvision's product lines now incorporate AI technology, enhancing their ability to meet diverse industry needs [3][4]
- In collaboration with the National Energy Group, the company developed a rapid coal-quality analysis instrument that cuts detection time from 8 hours to real time [3]
- Hikvision's product offerings include over 30,000 hardware models, with AI integrated to improve problem-solving capabilities [4]

Group 3: Focus on Multi-Modal Large Models
- Hikvision is prioritizing the development of multi-modal large models, leveraging its advantages in various sensing technologies to enhance perception capabilities [7]
- Applying these models has significantly improved detection rates, including an 86% reduction in missed detections of prohibited items using millimeter-wave technology [7]
- The "WenSou" series products enable cross-modal information retrieval, improving the efficiency of security-video searches [7]

Group 4: Future Outlook and Strategic Direction
- The company aims to continue innovating and launching more advanced large-model products to accelerate the large-scale implementation of AI [8]
- Hikvision is committed to providing AI-enabled intelligent applications across various industries, positioning itself to capture new growth opportunities [11]
- Integrating AI with industry experience is seen as essential for effective implementation, with ongoing efforts to apply AI in both internal operations and external market strategies [10]
Yushu Technology's Wang Xingxing: AI Will Give Robots the Ability to Truly "Understand the World"
Zheng Quan Ri Bao Wang· 2025-11-16 12:49
Core Insights
- The next decade in robotics is expected to be one of growth and blossoming, with robots transitioning from mere movement capabilities to functional tasks and evolving from industry tools into life partners [1]
- AI technology will enable robots to truly understand the world, with deep integration of multimodal large models enhancing their sensitivity and capabilities [1]

Group 1: Future of Robotics
- Industrial robots will collaborate with workers on production lines, autonomously handling material transport and precision assembly from simple human instructions, freeing workers from repetitive tasks [1]
- Small nursing robots will serve elderly individuals in community care stations, measuring blood pressure, reminding about medication, and offering companionship, addressing the shortage of nursing staff [1]
- Home robots will take on tasks like cleaning, caregiving, and assisting with learning, becoming versatile helpers in every household [1]

Group 2: Industry Collaboration and Standards
- The robotics industry requires stronger collaboration across the entire supply chain for robots to operate reliably in more complex, open environments [2]
- Building an ecosystem through partnerships is needed, with emphasis on cooperating with open-source communities to accelerate technology sharing and reduce innovation costs [2]
- Establishing ethical and safety standards for robotics is crucial to ensuring the technology develops toward positive societal impacts, which requires global collaboration [2]
Wang Xingxing: The Next Decade Is the Decade of Robots Becoming "Life Partners"
Xin Lang Ke Ji· 2025-11-16 02:01
Core Viewpoint
- The next decade is expected to be a period of "growth and blossoming" for AI and robotics, with robots transitioning from basic movement capabilities to performing tasks and becoming life partners for humans [1]

Group 1: AI and Robotics Development
- The past decade has been one of "germination and exploration," while the coming decade will focus on integrating AI technology into robotics [1]
- AI technology will enable robots to truly "understand the world," enhancing their functionality and adaptability [1]

Group 2: Company Insights
- Yushu Technology has developed humanoid robots capable of performing the majority of work actions, using both offline pre-learning and real-time imitation [1]
- The future will see deeper integration of multimodal large models with robotics, producing more sensitive and capable robots [1]
JD and HKUST Establish Joint Laboratory Focusing on Intelligent Supply Chain and Embodied Intelligence
Xin Lang Cai Jing· 2025-11-14 04:59
Core Insights
- JD Group and the Hong Kong University of Science and Technology (HKUST) have officially established a joint laboratory focused on intelligent supply-chain and embodied-intelligence technology [1]

Group 1: Joint Laboratory Overview
- The "HKUST-JD Group Joint Laboratory" will be jointly managed by HKUST's Zheng Jiachun Robotics Research Institute, JD Exploration Research Institute, and JD Logistics [1]
- The laboratory aims to conduct research across the logistics, healthcare, retail, and industrial sectors [1]

Group 2: Research Focus Areas
- Key research areas include tumor prediction and assisted diagnosis in healthcare, and the construction of intelligent e-commerce scenarios in retail [1]
- The laboratory will leverage technologies such as multimodal large models and edge-computing optimization algorithms to develop replicable industry-specific intelligent solutions [1]
JD and HKUST Establish Joint Laboratory
Xin Lang Cai Jing· 2025-11-14 04:48
Core Insights
- JD Group and the Hong Kong University of Science and Technology (HKUST) have officially established a joint laboratory focused on intelligent supply-chain and embodied-intelligence technology [1]

Group 1: Joint Laboratory Overview
- The "HKUST-JD Group Joint Laboratory" is co-managed by HKUST's Zheng Jiachun Robotics Research Institute, JD Exploration Research Institute, and JD Logistics [1]
- The laboratory will focus on research in the logistics, healthcare, retail, and industrial sectors [1]

Group 2: Research Focus Areas
- Key research areas include tumor prediction and assisted diagnosis in healthcare, and the construction of intelligent e-commerce scenarios in retail [1]
- The laboratory aims to integrate technologies such as multimodal large models and edge-computing optimization algorithms to create replicable industry intelligence solutions [1]
Open Source Beats Closed Source Again: SenseTime's 8B Model Crushes GPT-5 in Spatial Intelligence, Bringing AI a Step Closer to Understanding the World
36 Ke· 2025-11-11 08:45
Core Insights
- The SenseNova-SI series models released by SenseTime demonstrate superior performance on spatial-intelligence benchmarks; the SenseNova-SI-8B model achieved an average score of 60.99, significantly outperforming other open-source models such as Qwen3-VL-8B (40.16) and BAGEL-7B (35.01) [1][2]
- At the same 8-billion-parameter scale, SenseNova-SI-8B also surpasses closed-source models such as GPT-5 (49.68) and Gemini-2.5-Pro (48.81) [2]
- The performance improvement is attributed to systematic training design and SenseTime's "spatial capability classification system," which expanded the scale of spatial-understanding data and validated the existence of a scaling law in this domain [2][5]

Model Performance
- SenseNova-SI-8B outperformed GPT-5 across various spatial-reasoning tasks, showing stability and accuracy in understanding spatial relationships [3][18]
- In specific tests, SenseNova-SI-8B consistently provided correct answers where GPT-5 made errors on perspective judgment and spatial reasoning [6][10][12][15][16]

Technological Advancements
- The training methodology for SenseNova-SI takes a comprehensive approach to spatial intelligence, categorizing it into six core dimensions: spatial measurement, reconstruction, relationships, perspective transformation, deformation, and reasoning [5]
- The model's architecture supports enhancing spatial capabilities across various foundational models, indicating versatile application potential [5]

Strategic Implications
- The launch of SenseNova-SI aligns with SenseTime's broader strategy in spatial intelligence, complementing its "Wuneng" embodied-intelligence platform aimed at improving robots' understanding of and adaptability to the physical world [19]
- The introduction of the EASI spatial-intelligence evaluation platform further supports development and collaboration within the open-source ecosystem [19]

Future Outlook
- The ongoing development of spatial-intelligence capabilities is crucial for advancing AI's understanding of the physical world, which is essential for applications in autonomous driving and robotics [24]
How Did Humanoid Robots Perform Music Together at the Opening Ceremony of the 15th National Games? The Secret Revealed
Ren Min Ri Bao· 2025-11-11 02:13
Core Viewpoint
- The performance of humanoid robots at the opening ceremony of the 15th National Games showcased advances in robotics technology, particularly in collaborative music performance, highlighting the potential for robots to achieve precision and synchronization comparable to human musicians [1]

Group 1: Technology and Innovation
- The humanoid robots performed on "bronze jue," an ancient percussion instrument that demands precise control of striking position and force, a significant challenge for humans and robots alike [1]
- The largest bronze jue used in the performance measures 64 cm in height and weighs 40 kg, while the smallest is 36.8 cm tall and weighs 10.75 kg, indicating the scale and complexity of the instruments involved [1]
- Breakthroughs in group intelligence, multimodal large models, and "humanoid-eye" stereo-vision perception were crucial to enabling the robots to perform accurately [1]

Group 2: Performance Metrics
- The robots achieved millimeter-level striking precision with an error margin within 2 mm, and synchronization error among the three robots was within 10 milliseconds, demonstrating a high level of coordination [1]
- The robots struck each bronze jue with stability and precision rivaling human musicians, showcasing the potential of humanoid robots in the performing arts [1]
How Do Humanoid Robots Perform Music Together? (The National Games in Seconds)
Ren Min Ri Bao· 2025-11-10 22:15
Core Viewpoint
- The performance of humanoid robots playing the ancient bronze instrument "Qing Tong Ju Zhi" at the opening ceremony of the 15th National Games showcases advances in robotics and artificial intelligence, highlighting the potential for robots to perform complex tasks traditionally done by humans [1]

Group 1: Technological Advancements
- The humanoid robots demonstrated precise control in playing the "Qing Tong Ju Zhi," achieving millimeter-level striking accuracy with an error margin within 2 millimeters [1]
- Synchronization among the three robots was impressive, with a movement error of only 10 milliseconds, indicating significant progress in multi-robot coordination [1]
- The technology drew on breakthroughs in group intelligence, multi-modal large models, and binocular stereo-vision perception, which are essential for collaborative performance [1]

Group 2: Physical Specifications
- The instruments varied in size, with the largest measuring 64 centimeters in height and weighing 40 kilograms, while the smallest stood at 36.8 centimeters and weighed 10.75 kilograms [1]
- The engineering team faced challenges in ensuring the robots could work together effectively, which required advanced technological solutions [1]