Artificial Intelligence
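Beyond CLIP! Peking University open-sources a fine-grained visual recognition large model that needs only 4 training images per class
QbitAI (量子位) · 2026-02-11 01:55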
Core Viewpoint - The article discusses the limitations of current multimodal large models in fine-grained visual recognition and introduces Fine-R1, a model developed by Professor Peng Yuxin's team at Peking University that significantly improves recognition accuracy with minimal training data [1][2][5].

Group 1: Fine-Grained Visual Recognition Challenges
- Current multimodal large models excel at complex tasks but lag behind their own visual encoders, such as CLIP, in fine-grained visual recognition [1].
- Real-world objects are fine-grained in nature and fall into numerous subclasses (fixed-wing aircraft alone have over 500 types), which makes fine-grained recognition important in practical applications [3].

Group 2: Fine-R1 Model Overview
- Fine-R1 aims to combine the rich knowledge of fine-grained subclasses held by multimodal large models with a generative decoding paradigm to overcome the limitations of traditional recognition methods, enabling fine-grained recognition of any visual object in an open domain [5].
- Fine-R1 strengthens the model's ability to reason about unseen subclasses using a small number of training images (only 4 per subclass), outperforming models such as OpenAI's CLIP and Google DeepMind's SigLIP [5][15].

Group 3: Model Development Process
- The development of Fine-R1 involves two main steps:
  1. Chain-of-thought supervised fine-tuning, which simulates human reasoning to build inference capabilities [7].
  2. Triplet enhancement strategy optimization, which uses positive and negative samples to improve robustness to intra-class variation and sensitivity to inter-class differences [8][10].

Group 4: Experimental Results
- Fine-R1 was evaluated on six authoritative fine-grained image classification datasets and achieved higher accuracy than other models on both seen and unseen categories [15][17].
- The primary factor behind the improved accuracy was identified as the model's ability to make effective use of fine-grained subclass knowledge, rather than improvements in visual representation or knowledge storage [19].

Group 5: Conclusion and Future Work
- The article concludes that Fine-R1 has the potential to excel in fine-grained visual recognition tasks, emphasizing its approach to reasoning and knowledge application [21].
- The research has been accepted to ICLR 2026, and the code is open-sourced for further exploration [2][22].
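The summary says only that the triplet strategy relies on positive and negative samples; it does not give the team's sampling procedure. As a rough illustration of the general idea, the sketch below pairs each anchor image with a positive from the same subclass and a negative from a different one, using 4 images per subclass. The function name, the subclass labels, and the image IDs are all hypothetical, and this is not Fine-R1's released code.

```python
import random
from collections import defaultdict

def build_triplets(samples, seed=0):
    """Build (anchor, positive, negative) triplets from (image_id, subclass) pairs.

    Illustrative only: a generic triplet sampler, not the Fine-R1 implementation.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for image_id, subclass in samples:
        by_class[subclass].append(image_id)

    triplets = []
    for subclass, images in by_class.items():
        # Candidate negatives: every image that belongs to a different subclass.
        others = [img for c, imgs in by_class.items() if c != subclass for img in imgs]
        if len(images) < 2 or not others:
            continue  # need at least one positive and one negative
        for anchor in images:
            positive = rng.choice([img for img in images if img != anchor])
            negative = rng.choice(others)
            triplets.append((anchor, positive, negative))
    return triplets

# Hypothetical 4-shot data: two aircraft subclasses, four images each.
SHOTS = [(f"{name}_{i}", name) for name in ("subclass_a", "subclass_b") for i in range(4)]
triplets = build_triplets(SHOTS)
print(len(triplets))  # one triplet per anchor image
```

With two subclasses and four images each, every image serves once as an anchor, so the sampler yields eight triplets. How the positives and negatives then enter the optimization objective is not described in the summary.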
Unknown institution: Qwen research on the potential risks of AI Agent deployment, as AI Agents... (2026-02-11)
Unknown institution · 2026-02-11 01:50
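Major companies such as ByteDance and Alibaba are focusing on security and traceability. Alibaba uses the ACT protocol to require that the payment step go through Alipay, relying on its mature payment protocol to secure the transaction chain and recording transaction logs so that the whole process is traceable and problems can be traced back.
At the national level, no specific laws or regulatory provisions governing AI Agent execution permissions have been issued yet.
Qwen research, potential risks of AI Agent deployment: as AI Agents extend from assisting decisions to directly executing operations, domestic companies and regulators face new risks. The AI may skip user confirmation and make decisions directly, leading to unintended consequences.
Qwen overseas expansion plan: no standalone overseas product has been released, but multilingual model development is complete. Qwen connects to overseas products and uniformly uses Google's UCP protocol to integrate overseas e-commerce, local services, Uber, and others, with a middle layer linking Alibaba's overseas e-commerce and payment products.
Target markets: Southeast Asia and South America come first, since Alibaba's ecosystem there is well established and its market share is large.
Third parties: leading platforms such as Amazon show little willingness to integrate, while vertical players such as Uber and mid-tier and long-tail service providers are keen to cooperate. Discussions are currently at the initial-contact stage.
Expected GMV penetration of the AI-driven e-commerce model: AI-driven e-commerce is expected to account for 1.5%-2% of GMV in 2026; penetration accelerates in 2027-2028, reaching 5% in 2027 and exceeding 15%-20% in 2028.
- Drivers: a rapid shift in user mindset; product iteration priorities shifting toward task ...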
Personnel shake-up at Musk's xAI: two co-founders depart within 48 hours
Sou Hu Cai Jing· 2026-02-11 01:28
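IT Home (IT之家), February 11: xAI co-founder Jimmy Ba said he left Elon Musk's startup on Tuesday local time.
Ba wrote on X: "It's time to recalibrate my gradient on the big picture. 2026 will be extraordinary, and quite possibly the busiest (and most decisive) year for the future of humanity."
According to an organizational chart from early last year, Ba had also managed a team responsible for more than a thousand AI tutors; that role was handed over to Diego Pasini last September.
Ba is the second co-founder to leave the company in less than 48 hours. Tony Wu announced his resignation from the AI startup on Monday evening local time.
Before Wu's departure, xAI had reorganized again, and several of his responsibilities were reassigned to Guodong Zhang (张国栋).
Musk founded the AI company in 2023 together with 11 other co-founders. Six have now left, five of them within the past year.
In addition to his role at xAI, Ba is an assistant professor in the Department of Computer Science at the University of Toronto, where he earned his PhD under Nobel laureate Geoffrey Hinton, often called the "godfather of AI."
Musk has said he founded xAI to build something different from what he calls "woke" chatbots ...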
OpenAI pushes adult content; female executive who objected is fired on grounds of "sex discrimination"
Feng Huang Wang· 2026-02-11 01:25
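OpenAI CEO Sam Altman has defended the decision to broaden the range of content allowed on the platform, calling it part of "treating adult users like adults."
According to people familiar with the matter, before she was fired Beiermeister (贝默斯特) told colleagues that she opposed the adult mode and worried the feature could harm users. She told them she believed OpenAI's mechanisms for preventing child-exploitation content were not effective enough and that the company could not adequately keep teenagers away from adult content.
The people said she was one of several OpenAI employees who raised concerns internally about launching the adult mode.
Beiermeister responded in a statement: "The allegation that I discriminated against anyone is completely false."
An OpenAI spokesperson said in a statement that Beiermeister "made valuable contributions during her time at the company, and her departure was not related to any issue she raised while working at the company."
Beiermeister joined OpenAI in mid-2024 as vice president of its product policy team, which sets the rules for how users may use the company's products and helps design the mechanisms that enforce those policies. Her dismissal came ahead of the adult mode that OpenAI plans to launch early this year, which will let users create AI-generated adult content in ChatGPT.
According to some of the people, the planned feature would let adult users hold conversations on adult themes, including sexual topics, but the plan has drawn criticism from researchers inside the company, who had ...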
"AI agent BlackZero": 第零智能 makes a bid for a Hong Kong IPO
Xin Lang Cai Jing· 2026-02-11 00:31
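Source: 独角兽早知道
Compiled from: prospectus | Editor: Arti
Contract assistant. The company's contract assistant solution digitizes and automates commercial contract workflows by supporting contract review, suggesting ways to resolve potential disputes, and streamlining the generation of legal document templates.
Investment assistant. The company's investment assistant solution streamlines the investment life cycle by providing comprehensive pre-investment reports that include risk assessments, analyzing large volumes of commercial contract data, and generating post-investment reports that track project performance.
Brand assistant. The company's brand assistant solution performs real-time multimodal monitoring and analysis of social media and other digital channels to help consumer brands forecast market trends and optimize strategy, helping the company's clients transform their operations, cut costs, and uncover new growth opportunities.
According to the prospectus, 第零智能 is a leading provider of enterprise AI agent solutions in China. Enterprise-grade AI agent solutions are systematic solutions built on large language models (LLMs) and multi-agent systems; they deploy automated AI agents for enterprises in order to optimize existing workflows, automate complex procedures, and assist and support decision-making.
The company is committed to developing AI agent solutions that improve efficiency, scalability, and accuracy for its customers. According to Frost & Sullivan, measured by 2024 revenue, 第零智能 ranked as the fifth-largest provider of enterprise-grade AI agent solutions in China, with a market share of 3.0%.
第零智能's AI agent solutions are supported by the company's proprietary AI platform B ...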
Cloud, AI, and manufacturing: the three new elements of China's overseas expansion
吴晓波频道 (Wu Xiaobo Channel) · 2026-02-11 00:20
Core Viewpoint - The article argues that the overseas-expansion wave combined with the artificial intelligence revolution presents significant opportunities for Chinese entrepreneurs, making "AI+ going global" a crucial theme for the future [3][5].

Group 1: AI and Global Expansion
- The adoption rate of generative AI in China surpassed 35% in early 2024, indicating a significant breakthrough in AI technology adoption [3].
- The appearance of humanoid robots during the Spring Festival has drawn attention to embodied intelligence, which is expected to become a trillion-dollar industry in China, succeeding the electric vehicle sector [3].
- Demand for cloud services has surged as Chinese companies increasingly view international expansion as a necessity rather than an option, with Alibaba Cloud projected to surpass AWS in growth index by 2025 [5].

Group 2: Stages of Chinese Companies Going Global
- Chinese companies have gone through four waves of international expansion. The current phase is characterized as "full-factor going global": companies are not just exporting products but also relocating equipment, technology, talent, and capital [11][14].
- The first wave, in the mid-1990s, involved component manufacturers; the second, in the late 1990s, brought "Made in China" products; the third, in the mid-2010s, was marked by the rise of cross-border e-commerce [12][13].
- In the current fourth wave, AI companies are designed for global markets from the start, diverging from previous models of internationalization [15].

Group 3: New Challenges for AI Companies
- AI companies face unique challenges in their global expansion because, unlike previous generations of companies, their initial setup is already geared toward international markets [15].
- New entrepreneurs are using AI technology to create products aimed at global markets from the outset, with companies like MiniMax achieving rapid success overseas [16][18].
- Established companies like Meitu are also accelerating their international presence, with significant user growth driven by AI features [21][22].

Group 4: Complexities of Global Operations
- Companies expanding internationally must navigate geopolitical tensions and a shift toward a multi-core world, both of which complicate standardization and compliance [28][29].
- Operational challenges include adapting to diverse regulatory environments and cultural differences across regions, which calls for flexible and adaptive strategies [30][35].
- AI companies require robust cloud infrastructure to support their global operations, with a focus on seamless deployment and compliance with local regulations [36][38].

Group 5: Role of Cloud Services
- Alibaba Cloud has become the choice of over 80% of Chinese companies going global, providing a standardized global technology architecture and support [42].
- The company is investing significantly in AI infrastructure, with plans to establish data centers in multiple countries to support international operations [44].
- Cloud services will be critical for AI companies seeking a competitive edge in global markets, with a focus on operational efficiency and technological support [45][50].
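Reshaping search visibility and content marketing in the AI era: 2026 GEO (Generative Engine Optimization) industry research report
iResearch (艾瑞咨询) · 2026-02-11 00:02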
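GEO | Research report
Driven by both ecosystem strength and technical breakthroughs, the user base of AI applications kept growing in 2025, while traffic distribution became polarized
The AI application market expanded rapidly in 2025, with a marked internal split: Doubao, DeepSeek, Tencent Yuanbao, and Qwen achieved explosive growth on the back of ecosystem advantages or breakthroughs in vertical scenarios, contributing most of the industry's incremental growth. At the same time, growth of some applications stalled or even declined slightly, showing that users are concentrating faster on leading apps with core value and ecosystem synergy.
Core summary: GEO is an emerging marketing optimization strategy. Building on how large language models (LLMs) move from information recognition to answer output, it optimizes content so that brand or product information is more easily crawled, understood, and cited by generative AI engines and presented in AI-generated answers. The core goal of GEO is to build a trust relationship between brands and AI, so that brands and products are "seen" and "trusted" by AI.
Current state of the AI industry: China's AI industry has entered a stage of large-scale application centered on generative AI, and AI is evolving from an efficiency tool into a high-frequency entry point for users to obtain information and make decisions
AI application traffic scale
As AI applications become widespread, how has user search behavior changed? The search paradigm is shifting from the traditional "link-oriented" model to an "answer-oriented" one
In the era of generative AI, search engines have shifted from conversational tools ...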
GLM-5 architecture details surface: DeepSeek remains a bar that cannot be bypassed
36Kr · 2026-02-10 23:57
Core Insights - The article discusses the imminent release of new AI models in the Chinese market, focusing on Zhipu AI's GLM-5, which is expected to draw on advanced techniques and compete effectively in the AI landscape [1][16].

Group 1: Model Development and Features
- The GLM-5 model has been linked to multiple technical platforms, indicating a strong collaborative effort in its development [2][4].
- GLM-5 uses a 78-layer Transformer decoder with a total of approximately 745 billion parameters, combining dense and sparse components [6][8].
- The model uses a mixture-of-experts (MoE) architecture that activates only a small fraction of its parameters during inference, improving efficiency while maintaining performance [9][10].

Group 2: Technological Innovations
- Integrating DeepSeek Sparse Attention (DSA) allows GLM-5 to handle long sequences more efficiently, significantly reducing computational cost [12][13].
- Multi-Token Prediction (MTP) accelerates generation by letting the model predict multiple tokens at once, which is particularly beneficial for structured text generation tasks [15][16].
- The architecture prioritizes efficiency over sheer parameter count, reflecting an industry trend toward optimizing performance rather than simply increasing size [9][17].

Group 3: Market Position and Challenges
- GLM-5 is expected to excel in code generation and logical reasoning, positioning it competitively in software development and algorithm design [16].
- However, the model currently lacks multimodal capabilities, which may limit its use in creative AI-generated content (AIGC) scenarios, especially as competitors advance in this area [16].
- The article highlights a broader industry trend of companies integrating open-source technology and emphasizing efficiency and practicality in AI model development [16][17].
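To make the "only a small fraction of parameters is active" point concrete, here is a minimal sketch of generic top-k expert routing, the mechanism most MoE layers rely on. It is not GLM-5's router: the four toy experts, the gate scores, and `k=2` are all made up for illustration, and the article does not state GLM-5's expert count or routing rule.

```python
import math

def top_k_route(gate_logits, k):
    """Select the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    peak = max(gate_logits[i] for i in top)  # subtract the max for numerical stability
    exp = {i: math.exp(gate_logits[i] - peak) for i in top}
    total = sum(exp.values())
    return {i: exp[i] / total for i in top}

def moe_forward(x, experts, gate_logits, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    weights = top_k_route(gate_logits, k)
    output = sum(w * experts[i](x) for i, w in weights.items())
    return output, sorted(weights)

# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
gate_logits = [0.1, 2.0, -1.0, 2.0]  # made-up router scores for one token

output, active = moe_forward(1.0, experts, gate_logits, k=2)
print(active, output)  # only 2 of the 4 experts were evaluated
```

Here two of the four experts run for the token, so half of the expert parameters stay idle. Production MoE models typically use far more experts with a small `k`, which is how the total parameter count can be large while per-token compute stays modest.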
Backed by Li Shufu and courted by capital: has a "Chinese Musk" burst onto the scene?
36Kr · 2026-02-10 23:31
Core Insights - The article discusses the emergence of a new player in the AI industry, focusing on Yin Qi's leadership at Qianli Technology (千里科技) and the launch of Step 3.5 Flash, a new model from StepFun (阶跃星辰) that aims to bring AI into the physical world [2][9][35].

Group 1: Industry Trends
- The domestic large model sector is intensely competitive, with major players releasing new model versions and engaging in aggressive talent acquisition, including a combined investment of 4.5 billion yuan for recruitment during the Spring Festival [2].
- The landscape is shifting toward physical AI: companies like OpenAI and Google are investing in robotics and integrating AI into hardware, marking a transition from virtual to physical applications of AI [4][28].

Group 2: Company Developments
- Under Yin Qi's leadership, StepFun is the only company in China focusing on both large models and terminal applications, aiming to bring AI into the physical world [2][9].
- The company has secured 5 billion yuan in B+ round financing, the largest single financing in the Chinese large model sector in the past year [9][25].

Group 3: Leadership and Vision
- Yin Qi, a veteran of the AI field, has shifted his focus from his previous company, Megvii, to the automotive sector, believing that combining AI with vehicles represents a significant growth opportunity [9][12][28].
- His strategy emphasizes profitability and a closed-loop system for AI applications, which he considers essential for long-term success [14][25].

Group 4: Technological Innovations
- StepFun's new model, Step 3.5 Flash, is designed for intelligent agents and offers low latency and high speed, which are crucial for autonomous driving and smart cabin applications [26][29].
- The company has achieved deep compatibility with several domestic AI chip manufacturers, closing the loop on domestic computing power [27].

Group 5: Market Positioning
- The collaboration between StepFun and Qianli Technology aims to create a comprehensive intelligent driving system, which has already been deployed in over 300,000 vehicles [29].
- Yin Qi's approach focuses on partnering with leading automotive companies to establish a strong market presence and build a data-driven feedback loop for continuous improvement [32].
ByteDance's Seedance 2.0 takes off; an overseas blogger's sharp take: "A week ago I was still bullish on Kling..."
36Kr · 2026-02-10 23:25
Core Insights - ByteDance has launched its AI video generation model Seedance 2.0, which has been described as a "game changer" for the industry and shows significant advances in video generation capability [1][4][23].

Group 1: Product Development and Features
- Seedance was initially developed as a text-to-video model in 2023 and went through two years of internal testing before its public release in June 2025 [1].
- The model has been updated repeatedly, with versions 1.0, 1.0 pro, 1.5 pro, and 2.0 released over the last eight months, culminating in the current Seedance 2.0 [1].
- Seedance 2.0 is praised for generating complex multi-shot scenes with synchronized sound effects, music, and multilingual dialogue, earning it the title of "new king" among AI video models [4][9].

Group 2: Market Reception and Impact
- The launch of Seedance 2.0 has generated significant buzz in both domestic and international markets, with notable creators and media outlets discussing its capabilities [4][6].
- Global media coverage has highlighted Seedance 2.0's role in establishing ByteDance as a key player in the global AI video market, and early users have praised its features highly [9][14].
- The model has been compared favorably with competitors such as OpenAI's Sora and Google's Veo, with reports indicating that it surpasses them in video generation speed and narrative control [14][28].

Group 3: Competitive Landscape
- Competition in AI video generation is intensifying, with other Chinese companies such as Kuaishou launching their own models, including Kling 3.0, which Seedance 2.0 has outperformed [14][30].
- Analysts believe Seedance 2.0 could raise ByteDance's valuation in the capital markets, reflecting its potential impact on the company's financial performance [14].
- The advances in Seedance 2.0 reflect a broader trend of Chinese tech companies leading in AI innovation, potentially reshaping the global technology landscape [30].