Multimodal Foundation Models
Alibaba (09988) Forms Robotics and Embodied Intelligence Team to Explore Taking AI from the Virtual World into the Physical World
智通财经网· 2025-10-09 07:49
Group 1
- Alibaba is transitioning its Tongyi Qianwen language model towards becoming an intelligent agent capable of real-world actions, utilizing tools and memory through reinforcement learning for long-horizon reasoning [1]
- The company has formed a small team focused on robotics and embodied intelligence, indicating a strategic shift towards integrating AI with physical applications [1]
- Alibaba has invested $140 million in the robotics company 自变量机器人 (X Square Robot) to accelerate AI and robotics technology development, product iteration, and commercialization [1]

Group 2
- Alibaba's CEO stated that global AI investment is expected to reach $4 trillion over the next five years, emphasizing the need for Alibaba to keep pace with this growth [2]
- The company plans to invest an additional 380 billion yuan in cloud and AI hardware infrastructure over the next three years, building on previously announced investments [2]
原力无限 Signs an RMB 260 Million Single Order for Embodied Intelligence; Alibaba Tongyi Has Built a Small Robotics and Embodied Intelligence Team | Smart Manufacturing Daily
创业邦· 2025-10-09 03:23
1.【Honor Magic8 series launch officially scheduled for October 15】On October 9, Honor officially announced that it will hold the "Open a New Chapter, See the Future" Honor Magic8 series and MagicOS 10 launch event on October 15. The new phones are reportedly positioned as "self-evolving AI-native smartphones" and carry an exclusive physical AI side key, supporting a short press to jump straight into the camera interface and a long press to wake the YOYO agent. (Sina Tech)

2.【原力无限 signs an RMB 260 million single order for embodied intelligence】On October 9, according to the 原力无限 robotics WeChat account, 原力无限智能科技(杭州)有限公司 and 时华文旅控股集团 formally signed a strategic cooperation agreement in Hangzhou in October, with a project value of RMB 260 million, the largest single commercial order for embodied intelligence worldwide to date. The two sides will cooperate across the "robots + culture and tourism" strategic track, jointly building one of the country's first embodied-intelligence smart scenic area demonstration projects, with systematic innovation across intelligent guided tours, interactive experiences, operations services, digital management, and other areas. (Cailian Press)

3.【US reusable-rocket developer Stoke Space completes a $510 million Series D round】US reusable-rocket developer Stoke Space announced on October 8 local time that it has completed a $510 million Series D round led by the US Innovative Technology Fund (USIT). The new funding brings Stoke Space's total raised to $990 million and will be used to accelerate ...
Alibaba Tongyi Forms Robotics and Embodied Intelligence Team to Give Agents the Ability to Act
Xin Lang Cai Jing· 2025-10-09 02:07
Core Insights
- Alibaba's Tongyi Qianwen team is transitioning from a language model to an intelligent agent capable of real-world actions, indicating a significant shift in their AI strategy [1]
- The Tongyi Qianwen model family now covers multiple modalities, achieving top-tier performance globally, with flagship model Qwen3-Max surpassing competitors like GPT-5 and Claude Opus 4 [3]
- Alibaba's financial performance supports its technological advancements, with a notable increase in revenue and AI-related product growth [4]

Group 1: Technological Developments
- The team led by Lin Junyang is developing small teams focused on robotics and embodied intelligence, aiming to enhance the capabilities of the Tongyi Qianwen model [1]
- The flagship model Qwen3-Max has a pre-training data volume of 36 trillion tokens and over one trillion parameters, showcasing strong coding and agent tool capabilities [3]
- Alibaba has open-sourced over 300 models, achieving over 600 million downloads globally, with 170,000 derivative models available [3]

Group 2: Market Position and Growth
- In the first half of 2025, the daily usage of enterprise-level models in China is expected to grow by 363% compared to the end of 2024, with Alibaba's Tongyi Qianwen holding a 17.7% market share [3]
- Alibaba Cloud's revenue growth accelerated to 18%, reaching 30.127 billion yuan, driven by strong AI demand, with AI-related product revenue growing for seven consecutive quarters [4]
- For the fourth quarter of fiscal year 2025, Alibaba reported revenue of 236.454 billion yuan, a 7% year-on-year increase, and operating profit of 28.465 billion yuan, up 93% [4]
Alibaba Steps In: Tongyi Qianwen Leads the Formation of a Robotics AI Team
Xuan Gu Bao· 2025-10-09 00:14
Core Insights
- Alibaba Group has established an internal robotics team, marking its entry into the competitive AI hardware market alongside global tech giants [1][3]
- The formation of the "Robotics and Embodied AI Group" signifies a strategic shift from AI software to hardware applications [1][3]
- Alibaba Cloud has made its first investment in embodied intelligence by leading a $140 million funding round for the startup X Square Robot [1][4]

Group 1: Company Developments
- Alibaba's CEO stated that global AI investment is expected to accelerate to $4 trillion over the next five years, necessitating Alibaba's alignment with this growth [1]
- The company plans to invest an additional $58 billion in cloud and AI hardware infrastructure over the next three years [1]
- The newly formed robotics team aims to leverage Alibaba's strengths in large models and AI technology to capture a share of the rapidly growing embodied AI market [3]

Group 2: Market Context
- The establishment of Alibaba's robotics team coincides with significant investments in the robotics sector by other tech giants, including SoftBank's $5.4 billion acquisition of ABB's industrial robotics business [1][6]
- The global robotics market is projected to reach $7 trillion by 2050, attracting substantial capital from various investors [6]
- NVIDIA's CEO highlighted AI and robotics as major growth opportunities, with autonomous vehicles expected to be a primary commercial application of robotics technology [6]

Group 3: Startup Investment
- Alibaba's investment in X Square Robot represents its first foray into the embodied intelligence sector, with the startup having raised a total of approximately $280 million in less than two years [4]
- X Square Robot has developed a humanoid robot capable of 360-degree cleaning and is currently targeting institutional clients such as schools and hotels [5]
- The company plans to prepare for an IPO next year, with expectations that its "robot butler" will become a reality within five years [5]
Alibaba Tongyi's Lin Junyang: A Small Robotics and Embodied Intelligence Team Has Been Established
Xin Lang Cai Jing· 2025-10-08 15:00
Core Insights
- The head of Alibaba's Tongyi Qianwen large language model, Lin Junyang, announced the establishment of a small team focused on robotics and embodied intelligence [1]
- Lin emphasized that multimodal foundational models are evolving into foundational agents capable of long-horizon reasoning through reinforcement learning, suggesting a transition from virtual to physical environments [1]

Group 1
- The formation of a dedicated team for robotics and embodied intelligence indicates Alibaba's commitment to advancing AI technologies [1]
- The shift from multimodal foundational models to foundational agents highlights a significant evolution in AI capabilities, particularly in reasoning and tool utilization [1]
- The intention to move from virtual to physical applications suggests potential new market opportunities for Alibaba in the robotics sector [1]
Three People, One Paper, and an RMB 85 Billion Valuation
36Kr· 2025-09-17 08:40
Core Insights
- Thinking Machines Lab has achieved a remarkable valuation of $12 billion (approximately 85 billion RMB) within just seven months of its establishment, despite not having launched any formal products or having actual users [1][3]
- The company, founded by former OpenAI CTO Mira Murati, has successfully completed a $2 billion seed funding round, attracting investments from major industry players like AMD and NVIDIA, positioning itself as a potential competitor to leading firms such as OpenAI, Anthropic, and Google DeepMind [1][3][4]

Company Overview
- Thinking Machines Lab focuses on multimodal foundational models and next-generation human-machine collaboration, with a core team of around 30 members, two-thirds of whom are from OpenAI [3][4]
- The company has established a partnership with Google Cloud for computing power and plans to release its first product, which will include open-source components, in the coming months [3][4]

Investment Dynamics
- The investment landscape has shifted towards a GPU arms race, with Thinking Machines Lab securing a significant allocation of NVIDIA and AMD GPUs, which are critical for training large models [4][6]
- The valuation reflects not just potential revenue but also strategic positioning within the AI ecosystem, as the company is seen as a last major opportunity for investors to back a team with OpenAI's core decision-makers [5][6]

Research and Development Focus
- Thinking Machines Lab has adopted a "technology-driven" approach, using research publications and blogs to showcase its advancements in the field, which serves as a new model for AI startups [2][7]
- The company recently published a paper addressing non-determinism in large language model (LLM) inference, highlighting the importance of output stability and predictability for user trust and system reliability (see the sketch after this summary) [7][8][10]

Industry Implications
- The focus on output consistency and predictability is crucial for high-risk sectors such as healthcare and finance, where user trust is paramount [10][12]
- The insights from Thinking Machines Lab's research may lead to a shift in industry standards, emphasizing the need for "deterministic AI" and potentially creating a certification system for trustworthy AI [12][14]

Future Trends
- The AI industry is expected to evolve towards more efficient and interpretable model architectures, moving away from merely increasing parameter counts [13][14]
- There will be a growing emphasis on energy efficiency and sustainable practices in AI model deployment, with expectations for significant reductions in energy consumption by 2027 [14]
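For readers unfamiliar with why LLM inference can be non-deterministic even at temperature 0, one commonly cited ingredient is that floating-point addition is not associative: GPU kernels that reduce activations in different orders (for example, depending on how requests are batched) can produce slightly different logits, which can flip the top token when two candidates are nearly tied. The minimal PyTorch sketch below is illustrative only and is not taken from the Thinking Machines Lab paper; the tensor size and chunking scheme are arbitrary choices made to keep the effect visible.

```python
import torch

torch.manual_seed(0)

# The same 100,000 float32 activations, summed in two different orders.
x = torch.randn(100_000, dtype=torch.float32)

sum_sequential = x.sum()                       # one reduction order
sum_chunked = torch.stack(                     # another order, as a batched/parallel
    [chunk.sum() for chunk in x.split(128)]    # kernel might effectively use
).sum()

print(f"sequential: {sum_sequential.item():.8f}")
print(f"chunked   : {sum_chunked.item():.8f}")
print(f"difference: {(sum_sequential - sum_chunked).item():.2e}")

# The two results differ by a tiny amount. When two candidate tokens have
# nearly identical logits, a discrepancy of this size is enough to change the
# argmax, so even greedy (temperature-0) decoding can vary from run to run
# if the reduction order changes between requests.
```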
Qwen Team Open-Sources the Image Foundation Model Qwen-Image
AI前线· 2025-09-02 06:52
Author | Anthony Alford    Translator | 明知山

The Qwen model team recently open-sourced Qwen-Image, an image foundation model. Qwen-Image supports text-to-image (T2I) generation as well as text-image-to-image (TI2I) editing, and outperforms other models across multiple benchmarks.

Qwen-Image uses Qwen2.5-VL to process text input, a variational autoencoder (VAE) to process image input, and a multimodal diffusion transformer (MMDiT) for image generation. The models are particularly strong at text rendering and support both English and Chinese text. The Qwen team evaluated the model on T2I and TI2I benchmarks including DPG, GenEval, GEdit, and ImgEdit, where Qwen-Image achieved the highest overall scores. On image understanding tasks, although it does not match specially trained models, its performance is "very close" to theirs. The Qwen team also created AI Arena, a comparison website where human evaluators can rate pairs of generated images; Qwen-Image currently ranks third, competing against five high-quality closed-source models including GPT Image 1. According to the Qwen team: Qwen-Image is not just a ...
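As a rough idea of how such a model is typically used, the sketch below assumes the open-source Qwen-Image weights are exposed through the Hugging Face diffusers DiffusionPipeline interface under the "Qwen/Qwen-Image" model id; the exact generation arguments supported by the release should be checked against the official model card rather than taken from this sketch.

```python
# A minimal text-to-image sketch. Assumes the open-source Qwen-Image weights
# are loadable via Hugging Face diffusers under the "Qwen/Qwen-Image" id and
# that a CUDA GPU with enough memory is available; arguments beyond the
# standard prompt/size/steps ones are deliberately omitted.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image", torch_dtype=torch.bfloat16
).to("cuda")

image = pipe(
    prompt='A street-side cafe with a chalkboard sign that reads "Qwen-Image"',
    width=1024,
    height=1024,
    num_inference_steps=50,
).images[0]

image.save("qwen_image_demo.png")
```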
Apple's Latest Model Can Run on a Five-Year-Old iPhone
36Kr· 2025-09-01 11:37
Core Insights
- Apple has made significant advancements in large model development with the introduction of the new multimodal foundation model MobileCLIP2, which features a multimodal reinforcement training mechanism [1][12]
- The model is designed for zero-shot classification and retrieval tasks, with inference latency ranging from 3 to 15 milliseconds and parameter sizes between 50 million and 1.5 billion (a minimal zero-shot classification sketch follows this summary) [1][3]

Model Performance
- MobileCLIP2-B has achieved a 2.2% improvement in zero-shot accuracy on the ImageNet-1k dataset compared to its predecessor [1][11]
- The MobileCLIP2-S4 variant matches the zero-shot accuracy of the larger SigLIP-SO400M/14 model while having only half the parameter count [4][6]

Training Mechanism
- The improved training mechanism integrates enhanced teacher supervision and caption data to boost zero-shot performance [2][9]
- This mechanism allows multimodal models to be deployed directly on mobile and edge devices while keeping latency and memory usage low [2][8]

Open Source and Developer Support
- Pre-trained weights for all model variants and the data generation code have been made publicly available, facilitating direct deployment and benchmarking for developers [2][12]
- The data generation code supports distributed, scalable processing, enabling developers to create customized datasets for further research and rapid prototyping [8][12]

Technical Details
- The training mechanism effectively distills knowledge from multiple sources into a smaller model, enhancing semantic coverage and reducing computational overhead during training and inference [9][10]
- The integration of teacher models and caption generation has been optimized through a two-phase protocol, significantly improving the model's ability to express image content [11][12]
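To make the "zero-shot classification" task concrete, the following sketch shows the generic CLIP-style recipe that models like MobileCLIP2 are evaluated with: embed the image and one text prompt per class, normalize, and softmax over the cosine similarities. The two encoder modules below are stand-ins with random weights so the example runs on its own; they are not Apple's released MobileCLIP2 weights or API.

```python
# CLIP-style zero-shot classification, sketched with stand-in encoders.
# The encoders are placeholders, not the MobileCLIP2 towers; only the scoring
# logic reflects how zero-shot evaluation of image-text models works.
import torch
import torch.nn.functional as F

class DummyImageEncoder(torch.nn.Module):
    """Stand-in for an image tower: (B, 3, 64, 64) images -> (B, D) embeddings."""
    def __init__(self, dim: int = 512):
        super().__init__()
        self.proj = torch.nn.Linear(3 * 64 * 64, dim)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        return self.proj(images.flatten(1))

class DummyTextEncoder(torch.nn.Module):
    """Stand-in for a text tower: (C, T) token ids -> (C, D) embeddings."""
    def __init__(self, vocab_size: int = 10_000, dim: int = 512):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab_size, dim)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.embed(token_ids).mean(dim=1)  # mean-pool over tokens

def zero_shot_classify(image_enc, text_enc, images, class_prompt_ids):
    """Score each image against one prompt per class (e.g. 'a photo of a {label}')."""
    img = F.normalize(image_enc(images), dim=-1)            # (B, D)
    txt = F.normalize(text_enc(class_prompt_ids), dim=-1)   # (C, D)
    logits = 100.0 * img @ txt.t()                          # temperature-scaled cosine sims
    return logits.softmax(dim=-1)                           # (B, C) class probabilities

images = torch.randn(2, 3, 64, 64)           # two fake images
prompts = torch.randint(0, 10_000, (5, 8))   # five fake class prompts, 8 tokens each
probs = zero_shot_classify(DummyImageEncoder(), DummyTextEncoder(), images, prompts)
print(probs.shape)                           # torch.Size([2, 5]); each row sums to 1
```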