After Five Years, Transformers v5 Finally Arrives
具身智能之心· 2025-12-03 03:47
Edited by 机器之心. Transformers v5 has just shipped its first release candidate, v5.0.0rc0. GitHub: https://github.com/huggingface/transformers/releases/tag/v5.0.0rc0 . The update closes a five-year technical cycle from v4 to v5 for what has become the world's most popular AI infrastructure library. As Hugging Face's flagship open-source project, Transformers has grown from roughly 20,000 daily downloads when v4 shipped in November 2020 to more than 3 million today, with total installations exceeding 1.2 billion. It has defined how the industry uses models: supported architectures have expanded from the original 40 to more than 400, spanning text, vision, audio, and multimodal domains, and community-contributed model weights now exceed 750,000. The team writes that in the field of AI, "reinvention" is how to stay ...
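For readers who want to try the RC, below is a minimal smoke test. It is a sketch, not anything from the announcement: it assumes the long-standing pipeline API carries over to v5, and "gpt2" is used purely as an illustrative small checkpoint. Note that pip skips pre-releases by default, so the RC version must be pinned (or installed with --pre).

```python
# Minimal sketch for trying the v5 release candidate.
# Install first with: pip install transformers==5.0.0rc0
# "gpt2" is just an illustrative small checkpoint, not one the release notes name.
import transformers
from transformers import pipeline

print(transformers.__version__)  # expect "5.0.0rc0" after installing the RC

generator = pipeline("text-generation", model="gpt2")
print(generator("Transformers v5 is", max_new_tokens=10)[0]["generated_text"])
```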
DeepSeek Ships a Major Release, Taking On US Industry Giants: "Every Group Chat Is Blowing Up!"
Xin Lang Cai Jing· 2025-12-02 10:24
Core Insights
- DeepSeek, a Chinese AI startup, launched two new models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, achieving performance levels comparable to leading models from OpenAI and Google DeepMind [1][4][7]
- The release coincides with the NeurIPS conference, generating significant interest in the AI research community [2][7]
- The V3.2 model is designed for practical use, while the V3.2-Speciale focuses on enhanced reasoning capabilities, achieving gold medal-level performance in prestigious competitions [5][6][7]

Model Performance
- DeepSeek-V3.2 matches OpenAI's GPT-5 in mainstream reasoning benchmarks and is slightly below Google's Gemini-3.0 Pro [4][6]
- The V3.2-Speciale version excels in reasoning tests, achieving scores that rival Gemini-3.0 Pro [4][5]
- Both models have shown significant improvements in efficiency, reducing computational costs and user wait times [4][6]

Competitive Landscape
- The success of DeepSeek's models indicates that Chinese open-source AI systems are becoming competitive with top proprietary models from Silicon Valley [7][8]
- The trend toward open-source AI in China contrasts with the closed strategies of major US tech companies, which prefer to maintain control over their advanced technologies [9][10]
- Recent data shows that the download share of open-source AI models from Chinese teams has surpassed that of US teams for the first time [8][9]

Industry Implications
- DeepSeek's advancements suggest a shift in the AI model release paradigm, with Chinese companies frequently launching new models and versions [9][10]
- China's focus on open-source models may lead to broader applications of AI technology, potentially challenging the dominance of US AI labs [10]
Taking On US Industry Giants: "Every Group Chat Is Blowing Up"
Guan Cha Zhe Wang· 2025-12-02 08:46
Core Insights
- DeepSeek, a Chinese AI startup, has launched two new models, DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, which have achieved performance levels comparable to leading models from OpenAI and Google DeepMind [1][8]
- The release of these models coincides with the upcoming NeurIPS conference, generating significant interest in the AI research community [2][8]

Model Performance
- DeepSeek-V3.2 is designed for practical use, achieving performance on par with OpenAI's GPT-5 in mainstream reasoning benchmarks, while DeepSeek-V3.2-Speciale excels in reasoning capabilities, matching Google DeepMind's Gemini 3.0 Pro [1][4]
- The V3.2 model has shown a significant reduction in output length compared with Kimi-K2-Thinking, leading to lower computational costs and reduced user wait times [4]
- DeepSeek-V3.2-Speciale has demonstrated exceptional performance in international competitions, including gold medals at IMO 2025 and IOI 2025, a significant achievement for open-source AI models [5][8]

Competitive Landscape
- DeepSeek's advancements indicate that Chinese open-source AI systems are becoming competitive with top proprietary models from Silicon Valley [8][10]
- The trend toward open-source models in China contrasts with the closed strategies of major US tech companies, which tend to keep their advanced AI technologies proprietary [10][11]
- Recent data shows that the download share of open-source AI models developed by Chinese teams has surpassed that of US teams for the first time, indicating a shift in the global AI landscape [9][10]

Community and Industry Impact
- The announcement of DeepSeek's new models has sparked excitement within the AI research community, with discussions and engagement across various platforms [2][8]
- The models are now available on DeepSeek's official website, app, and API, with the Speciale version currently offered as a temporary API for community evaluation [5][7]
After Five Years, Transformers v5 Finally Arrives
机器之心· 2025-12-02 06:47
Core Insights
- The article discusses the release of the first release candidate, v5.0.0rc0, of the Transformers library, marking the transition from version 4 to version 5 after a five-year technical cycle [2]
- The library has seen a dramatic increase in usage, with daily downloads rising from 20,000 at the time of v4's release to over 3 million today, and total installations surpassing 1.2 billion [2]
- The core focus of the v5 update is on simplicity, pre-training, interoperability with high-performance inference engines, and making quantization a core feature [2][3]

Evolution and Features
- The v5 version establishes PyTorch as the sole core backend and emphasizes four key dimensions of evolution: extreme simplicity, the transition from fine-tuning to pre-training, interoperability with high-performance inference engines, and enhanced quantization capabilities [2]
- The team aims for a clean and clear model-integration approach, promoting broader standardization and stronger generality [4]
- Over the past five years, one to three new models have been added weekly on average, with the goal of becoming the single trusted source for model definitions [4]

Modular Design and Tools
- Hugging Face has advanced a modular design approach, simplifying maintenance and speeding up integration while fostering community collaboration [6]
- The AttentionInterface provides a centralized abstraction layer for attention mechanisms, streamlining the management of common auxiliary functions (see the sketch after this summary) [8]
- Tools are being developed to identify similarities between new models and existing architectures, aiming to automate conversion of models into the Transformers format [9][10]

Training Enhancements
- The v5 version expands support for pre-training, with redesigned model initialization and support for optimized forward- and backward-pass operators [15][16]
- Hugging Face continues to collaborate closely with fine-tuning tools in the Python ecosystem and ensures compatibility with tools in the JAX ecosystem [17]

Inference Improvements
- Inference is a key focus of the v5 update, introducing dedicated kernels, cleaner default settings, new APIs, and optimized support for inference engines [18][19]
- The v5 version aims to complement specialized inference engines rather than replace them, ensuring compatibility with engines such as vLLM, SGLang, and TensorRT-LLM [21]

Local Deployment and Quantization
- The team collaborates with popular inference engines so that Transformers can be used as a backend, increasing the value of models added to Transformers [23]
- Quantization is positioned as a core capability of Transformers, compatible with major functionality and providing a reliable framework for training and inference [27]
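The AttentionInterface mentioned above is a registry for pluggable attention implementations. The sketch below follows the interface as documented for recent transformers releases; the exact import paths and signatures may differ in the final v5, and the Qwen checkpoint is purely illustrative. It wraps the stock SDPA attention function with a logging hook and registers it under a custom name.

```python
# Hedged sketch of the AttentionInterface extension point: wrap the stock
# SDPA attention function so every attention call can be observed, then
# select it by name like any built-in implementation. Import paths follow
# recent transformers docs and may shift in the final v5 release.
from transformers import AttentionInterface, AutoModelForCausalLM, AutoTokenizer
from transformers.integrations.sdpa_attention import sdpa_attention_forward  # assumed path

def logged_sdpa(module, query, key, value, attention_mask=None, **kwargs):
    # Tensors arrive as (batch, heads, seq_len, head_dim) at this point.
    print(f"attention call: q={tuple(query.shape)} k={tuple(key.shape)}")
    return sdpa_attention_forward(module, query, key, value, attention_mask, **kwargs)

AttentionInterface.register("logged_sdpa", logged_sdpa)

model_id = "Qwen/Qwen2.5-0.5B"  # illustrative small checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id, attn_implementation="logged_sdpa")
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Hello", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Registering under a name rather than monkey-patching keeps the model definition itself untouched, which is the kind of standardization the v5 notes emphasize for interoperability with external inference engines.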
In Depth | Hugging Face Co-founder: Chinese Models Are Becoming Startups' First Choice; Open Source Will Decide the Next Round of AI Technological Leadership
Z Potentials· 2025-11-28 02:52
Core Insights
- The article discusses the evolving landscape of AI competition heading into 2026, highlighting trends such as the concentration of power among a few key players and the rise of new entrants in the open-source community, particularly from China [3][7][8]
- It emphasizes the limitations of current large language models (LLMs) in achieving superintelligence and the challenges in generalization capabilities [15][18][22]
- The article also explores the implications of open-source versus closed-source models, talent attraction, and the importance of policy support for fostering innovation in the AI sector [33][40][41]

Group 1: AI Competition Trends
- The AI industry is witnessing a concentration of power among a few core players due to the availability of computational resources, which will be a significant topic in 2026 [7][11]
- There is a notable emergence of new laboratories in China producing high-quality models, which has prompted a resurgence of open-source initiatives in the US as a response to China's advancements [8][9]
- Companies seeking to explore new AI applications are increasingly turning to open-source models, as closed-source systems impose limitations [8][10]

Group 2: Limitations of Current AI Models
- Current LLMs exhibit weaker generalization capabilities than previously expected, leading to a ceiling effect that hinders the achievement of superintelligence [15][18]
- While AI can serve as a valuable research assistant, it struggles to define new research questions, which is crucial for groundbreaking scientific discoveries [20][22]
- The notion that expanding model size naturally leads to greater intelligence is challenged, with the argument that true innovation requires more than just scaling [22][24]

Group 3: Open-source vs Closed-source Dynamics
- The choice between open-source and closed-source models is influenced by various factors, including the need to attract top talent and the cultural context of the research environment [36][37]
- In the US, closed-source models are becoming more attractive to researchers, while in China, open-source models are preferred [37][39]
- Policy support for open-source initiatives is crucial for maintaining a competitive edge in AI development [40][41]

Group 4: Business Model and Future Directions
- Hugging Face is transitioning its business model to focus on enterprise solutions, providing tools for organizations to manage and deploy AI models securely [50][51]
- The company has entered the robotics field, emphasizing the importance of open-source ecosystems in this domain and launching affordable entry-level robotic products [52][58]
- The introduction of a low-cost robotic arm and the Reachy Mini robot aims to enhance human-robot interaction and make robotics more accessible [58][59]
What the head of IBM’s $500 million AI and quantum venture fund is looking for in a startup
Yahoo Finance· 2025-11-27 08:01
Core Insights
- IBM Ventures is a $500 million venture fund primarily focused on AI and quantum computing, having made 23 investments in various innovative companies [1]
- The investment strategy emphasizes business-to-business companies that align with IBM's ecosystem [2]
- IBM aims for investments that are scalable, ready for partnerships, and built on responsible AI, boasting a collaboration rate exceeding 90% with portfolio companies [3]

Investment Criteria
- Investments are evaluated on three key areas: product capabilities, ecosystem partnership potential, and industry disruption [4]
- IBM actively seeks products it can use internally, exemplified by its AI-driven internal HR app, AskHR [5]

Operational Efficiency
- Internal adoption of AI tools is projected to save IBM $4.5 billion in operating expenses this year [6]

Quantum Computing Focus
- IBM's current focus in quantum computing is more on software and algorithms than hardware, with notable investments such as QEDMA, which specializes in quantum error correction [7]
We Are in an "LLM Bubble," Not an AI Bubble
阿尔法工场研究院· 2025-11-20 02:21
Group 1
- The core viewpoint is that large language models (LLMs) are not a universal solution; the future will bring more customized and specialized models that address specific problems [1][2]
- Current attention and funding are heavily concentrated on the idea of building a "universal model" with vast computational resources, but smaller, specialized models will emerge to effectively solve different issues [2]
- A bursting of the LLM bubble may have limited impact on the company: the AI industry is large and diversified, so even if the sector is overvalued, the overall industry and its business will not be significantly affected [3]

Group 2
- Hugging Face has adopted a cautious spending strategy, with half of the $400 million it raised still in the bank, contrasting sharply with the "burn rate" of other AI companies, particularly in the large language model space [3]
- The company aims to build a "long-term, sustainable, and globally impactful" business, learning from past industry cycles in which many practitioners, eager for quick results, acted with a short-term perspective [3]
Google CEO Issues AI Bubble Warning; Industry Splits Over the Path Forward
Sou Hu Cai Jing· 2025-11-19 04:10
[Pacific Technology News Flash] November 19 — Sundar Pichai, CEO of Google parent Alphabet, said in a recent BBC interview that the current wave of AI investment contains a degree of "irrationality." He warned that if the AI bubble bursts, every company will be affected and none will be spared, not even Google with its full-stack technology advantage. The remarks come as AI company valuations continue to soar, sharpening market concern about industry overheating.

In contrast, Hugging Face CEO Clément Delangue offered a different view. He argued that there is no across-the-board AI bubble but rather a specific "large language model bubble," which he predicts may burst next year. The trend, he said, will shift from pursuing general-purpose large models toward developing more specialized small models, a change better aligned with real business needs.

Rapid AI progress also brings serious challenges. Pichai noted that AI already accounts for 1.5% of global electricity consumption, an "enormous" energy demand that could become an economic bottleneck. He acknowledged that the company's climate goals have been affected as a result, but pledged to maintain the 2030 net-zero target. On employment, he called AI "the most transformative technology," saying that people adept at AI tools will stand out in every profession.

Pichai likened the current situation to the "irrational exuberance" of the 1996 internet-bubble era ...
Hugging Face CEO: We Are in a "Large Model Bubble," Not an "AI Bubble," and It Is About to Burst
Hua Er Jie Jian Wen· 2025-11-19 01:06
Core Insights
- The current market is in a "large model bubble" rather than an AI bubble; it may soon burst, but this will not pose a significant threat to the overall AI industry [1]
- Large models are receiving excessive attention, with resources concentrated on building a single model to solve all problems, while smaller, specialized models will emerge to address specific issues more efficiently [2]
- The AI industry is large and diversified, providing a buffer against the potential impact of the large model bubble bursting [3]
- Hugging Face follows a capital-efficient strategy, retaining half of its $400 million in funding, in contrast with other AI companies that spend billions [4]

Summary by Sections

Large Model Bubble
- Clem Delangue, CEO of Hugging Face, believes the market is experiencing a "large model bubble" that may burst next year, but that this will not significantly harm the AI industry as a whole [1]

Limitations of Large Models
- Delangue argues that large models are not a one-size-fits-all solution; smaller, specialized models, being cheaper and faster, will be more widely adopted in the future [2]

Industry Diversification
- A bursting of the large model bubble may affect Hugging Face to some extent, but the AI industry is vast and diversified, which mitigates the impact on the overall sector [3]

Capital Efficiency Strategy
- Hugging Face has a distinctive capital strategy, having raised $400 million while retaining half of it, which Delangue views as a sign of profitability compared with other companies that spend excessively [4]
X @TechCrunch
TechCrunch· 2025-11-18 21:44
Hugging Face CEO says we’re in an ‘LLM bubble,’ not an ‘AI bubble’ https://t.co/mKirsu3wzV ...