Gemini 2.5 Pro model
Large models get a "neck" for the first time! NYU team achieves 360-degree humanoid visual search
量子位 (QbitAI) · 2025-11-27 07:30
Core Insights
- The research introduces a new task called Humanoid Visual Search (HVS), enabling models to perform 360-degree visual searches in real-world environments like train stations and shopping malls [6][10][12]
- A new benchmark, H*Bench, has been developed to evaluate the search capabilities of intelligent agents in complex environments, moving beyond traditional simple household scenarios [7][8][9]
- The study aims to transition visual spatial reasoning from a "disembodied passive paradigm" to an "embodied active paradigm," enhancing the model's ability to integrate physical actions with visual reasoning [9][12]

Group 1: Humanoid Visual Search
- HVS allows intelligent agents to autonomously rotate their heads to search for target objects or paths in immersive environments [6][12]
- The task focuses on two main search problems: Humanoid Object Search (HOS) and Humanoid Path Search (HPS), each with varying levels of difficulty based on visibility and environmental cues [12][16]
- HOS involves locating and focusing on target objects, while HPS requires identifying navigable paths and adjusting body orientation [16][12]

Group 2: Benchmark and Dataset
- The H* dataset consists of approximately 3,000 labeled task instances derived from diverse high-resolution panoramic videos, providing broad geographical coverage [21][22]
- The benchmark includes scenes from six main categories: retail environments, transportation hubs, urban streets, public institutions, offices, and entertainment venues [24]
- The dataset allows for 12,000 search rounds by initializing agents from four different starting directions [22]

Group 3: Model Training and Performance
- The research utilizes a multi-modal reasoning task, employing a strategy network to integrate tool usage and head rotation, enhancing the model's decision-making capabilities [17][28]
- Training results show significant improvements in search accuracy for target search (from 14.83% to 47.38%) and path search (from 6.44% to 24.94%) after model training [28]
- The study highlights that larger model sizes do not necessarily guarantee better performance, with smaller models outperforming larger counterparts in certain tasks [33][34]

Group 4: Challenges and Insights
- The research identifies fundamental bottlenecks in advanced reasoning that require physical, spatial, and social common sense, despite improvements in low-level perception and motion capabilities [34][36]
- Errors in HOS primarily stem from insufficient perception in cluttered environments, while HPS errors are more complex, involving a lack of physical and social common sense [36]
- The study emphasizes that active visual search (rotating in panoramic views) is more intuitive and effective than passive analysis of static images [36]
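The benchmark size and the training gains quoted in the summary above can be checked with a few lines of arithmetic. This is only a sketch built on the reported figures; the variable and function names are illustrative and do not come from the paper or its code.

```python
# Figures quoted in the summary above; all names here are illustrative.
TASK_INSTANCES = 3_000      # "approximately 3,000 labeled task instances"
STARTING_DIRECTIONS = 4     # agents are initialized from four directions

search_rounds = TASK_INSTANCES * STARTING_DIRECTIONS

# Reported accuracy (%) before and after training, per task.
accuracy = {
    "target search": (14.83, 47.38),
    "path search": (6.44, 24.94),
}


def gain_in_points(before: float, after: float) -> float:
    """Absolute improvement, in percentage points."""
    return round(after - before, 2)


gains = {task: gain_in_points(*pair) for task, pair in accuracy.items()}

print(search_rounds)  # total search rounds implied by the two figures
print(gains)          # percentage-point gain for each task
```

Read this way, the path-search score starts lower and gains fewer percentage points than the target-search score, which is consistent with the summary's remark that HPS errors are the more complex ones.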
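Google showers perks on 500 million Jio users! Huaxia Sci-Tech AI ETF (589010) rebounds after an early dip, as the tech growth sector shows signs of stabilizing amid its pullback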
Mei Ri Jing Ji Xin Wen· 2025-11-05 02:42
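Galaxy Securities said that global AI giants such as NVIDIA, OpenAI, and Alibaba are stepping up investment in compute infrastructure and ecosystem building at the same time. On the one hand, this confirms that current market demand is strong and that AI commercialization is accelerating toward a closed commercial loop; on the other, it lends confidence to global AI infrastructure investment. AI is expected to be deployed broadly in more scenarios, including healthcare, education, scientific research, and industry.

The Huaxia Sci-Tech AI ETF (589010) closely tracks the SSE Sci-Tech Innovation Board Artificial Intelligence Index and covers quality companies across the entire industry chain. It combines high R&D investment with policy support, and its 20% daily price limit and small- and mid-cap elasticity help capture the AI industry's "singularity moment."

As of 10:11, the Sci-Tech AI ETF (589010) had drifted lower in morning trading and was quoted at 1.365 yuan, down 1.59%. On the intraday chart, it rose briefly at the open, fell quickly to a low of 1.357 yuan, then stabilized near the intraday average price line and rebounded slightly. Holdings were broadly under pressure: only 4 of the 30 constituents rose, with Roborock (石头科技) and Tztek Technology (天准科技) up about 2% and Scantech (思看科技) edging higher. Decliners were in the majority, with Intsig Information (合合信息) and Foxit Software (福昕软件) down more than 5%, and Fudan Microelectronics (复旦微电) and Haitian Ruisheng (海天瑞声) among the biggest losers. The tape suggests short-term adjustment pressure remains, but the moving averages below are providing relatively strong support. Trading activity held steady, with morning turnover exceeding 20 million yuan and ample turnover.

On the news front, Google announced that it will partner with Reliance Jio, India's largest telecom operator, to provide Gemini artificial intelligence services free of charge to its more than 500 million users ...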
Tencent Research Institute AI Express 20250718
Tencent Research Institute · 2025-07-17 14:12
Group 1
- Google DeepMind's MoR architecture doubles inference speed by combining parameter sharing with adaptive computation, using fewer parameters while maintaining large-model performance [1]
- The dynamic routing mechanism allocates different recursion depths based on token complexity, reducing redundant computation and optimizing the KV cache [1]
- Experimental results show that MoR improves inference throughput by 2.06 times, reduces training time by 19%, and decreases peak memory usage by 25% [1]

Group 2
- Amazon launches a preview of Bedrock AgentCore, offering seven core AI agent services including runtime, memory, and authentication [2]
- The introduction of Nova customization options and Strands Agents V1.0 simplifies agent development and enables multi-agent collaboration [2]
- Amazon S3 Vectors cloud object storage is released, reducing vector storage costs by 90%, along with the Kiro AI IDE to improve the developer experience [2]

Group 3
- Elon Musk is seeking names for Grok's male AI companion, with suggestions like "Draven" that align with characters from "Twilight" and "Fifty Shades of Grey" [3]
- A user named Jackywine has created an open-source 3D digital companion, "Bella," which currently has only the visual component and no large language model capabilities [3]
- The "Bella" project follows an "AI native" development path in three phases: perception core, generative self, and proactive companionship, with plans to incorporate voice recognition and affinity systems [3]

Group 4
- Google Search introduces an AI feature that can make phone calls to book local services for users, such as pet grooming [4]
- Search integrates the Gemini 2.5 Pro model and Deep Search functionality, capable of handling complex queries and generating in-depth reports [4]
- The new feature has launched in the U.S. and will be rolled out globally in stages, sparking discussion about the effectiveness of automated AI calls and the experience for merchants [4]

Group 5
- The AI programming platform Windsurf reintroduces the Claude Sonnet 4 model, giving Pro users 250 free calls per month [6]
- Claude Sonnet 4 offers advantages such as intelligent cross-file refactoring, a 200,000-token context window, and precise code completion [6]
- The renewed partnership follows OpenAI's failed acquisition and changes in the executive team, and represents Windsurf's strategic move to regain user trust [6]

Group 6
- Anthropic rehires core programming leaders Boris Cherny and Cat Wu from Cursor within two weeks [7]
- Anthropic reveals that direct sales of its models and Claude yield a gross margin of 60%, while sales through AWS and Google Cloud result in a margin of negative 30% [7]
- Claude Code has become a new asset for Anthropic, with weekly downloads increasing sixfold to 3 million since June and contributing over $200 million in annualized revenue [7]

Group 7
- CrePal launches the first AI video creation agent, allowing users to produce videos through a single command that orchestrates multiple models [8]
- The system can automatically plan scripts, select appropriate models, generate visuals, and add sound effects, addressing the high barriers of traditional AI video creation [8]
- The innovation lies in transforming the creative process: by integrating dispersed tools into a unified intelligent task, it lets users focus on creative expression rather than technical operations [8]

Group 8
- Apple's MLX framework adds CUDA support, enabling developers to train models on NVIDIA GPUs and deploy them back to Apple devices [9]
- The move is seen as Apple's concession to the NVIDIA ecosystem, which dominates AI development with 5 million developers [9]
- Despite past tensions over NVIDIA support, Apple opts to leverage NVIDIA's ecosystem for compliance and to expand its influence [9]

Group 9
- HeShan Technology, founded by alumni of Tsinghua and Beihang University, focuses on AI tactile sensing technology and has developed the world's first AI tactile perception chip [10]
- Using capacitive tomography technology, HeShan achieves "sensing and control integration," addressing the need for tactile feedback in precise robotic operations [10]
- The company has completed four rounds of financing and serves over 70% of domestic robot manufacturers, transitioning from a hardware provider to a comprehensive tactile solution provider [10]

Group 10
- Nobel laureate John Jumper discusses the journey of AlphaFold, highlighting that the value of algorithm research is 100 times that of data [11]
- AlphaFold predicts protein structures with atomic-level precision and has been cited 35,000 times, accelerating scientific discoveries [11]
- Jumper predicts that AI4Science will become more generalized in the future, with AlphaFold speeding up the development of structural biology by 5-10% and leading to widespread advances across scientific fields [11]
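The AI developer tools space is undergoing a major shift; Sci-Tech 100 Index ETF (588030) turns positive in the afternoon, with notable growth in fund size over the past two weeks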
Xin Lang Cai Jing· 2025-07-01 06:19
Group 1: Market Performance
- As of July 1, 2025, the Sci-Tech Innovation Board 100 Index (000698) increased by 0.30%, with notable gains from Rongchang Bio (688331) at 10.76%, Xinmai Medical (688016) at 8.22%, and Zai Lab (688266) at 7.26% [3]
- The Sci-Tech 100 Index ETF (588030) also rose by 0.30%, with a latest price of 1.02 yuan, and a cumulative increase of 3.99% over the past week as of June 30, 2025 [3]
- The trading volume for the Sci-Tech 100 Index ETF reached 2.22 billion yuan, with a turnover rate of 3.52% [3]

Group 2: Fund Growth and Leverage
- The Sci-Tech 100 Index ETF grew by 256 million yuan in scale over the past two weeks, ranking second among comparable funds [4]
- The latest margin buying amount for the Sci-Tech 100 Index ETF reached 12.50 million yuan, with a margin balance of 217 million yuan [4]

Group 3: Performance Metrics
- As of June 30, 2025, the net value of the Sci-Tech 100 Index ETF increased by 13.69% over the past six months, ranking 416 out of 3427 index equity funds, placing it in the top 12.14% [5]
- The ETF achieved a maximum monthly return of 27.67% since inception, with the longest consecutive monthly gain of 3 months and an average monthly return of 8.57% [5]
- The ETF's Sharpe ratio for the past year was 1.03, indicating a favorable risk-adjusted return [5]

Group 4: Fee Structure and Tracking Accuracy
- The management fee for the Sci-Tech 100 Index ETF is 0.15%, and the custody fee is 0.05%, which are among the lowest in comparable funds [5]
- The tracking error for the ETF over the past six months was 0.021%, demonstrating high tracking precision compared to similar funds [5]

Group 5: Index Composition
- The Sci-Tech 100 Index is composed of 100 securities selected from the Shanghai Stock Exchange's Sci-Tech Innovation Board, focusing on medium market capitalization and good liquidity [6]
- As of June 30, 2025, the top ten weighted stocks in the index accounted for 22.99% of the total index weight, including companies like BeiGene (688235) and Huahong Semiconductor (688347) [6]
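Two of the figures in the summary above follow directly from others quoted alongside them: the "top 12.14%" placement from the rank of 416 out of 3427 funds, and a combined annual fee from the management and custody fees. The sketch below simply redoes that arithmetic; the variable names are illustrative, and treating the two fees as additive is an assumption made here, not a statement from the article.

```python
# Figures quoted in the summary above; all names here are illustrative.
rank, total_funds = 416, 3427

# Share of funds ranked at or above this one, as a percentage.
top_percentile = round(100 * rank / total_funds, 2)

management_fee_pct = 0.15
custody_fee_pct = 0.05

# Assumption: the two annual fees simply add up.
combined_fee_pct = round(management_fee_pct + custody_fee_pct, 2)

print(top_percentile)    # placement implied by the rank
print(combined_fee_pct)  # combined annual fee in percent
```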