Transformers v5
After Five Years, Transformers v5 Has Finally Arrived
自动驾驶之心· 2025-12-04 03:03
Core Insights
- The article discusses the release of Transformers v5.0.0rc0, marking a significant evolution in the AI infrastructure library after a five-year development cycle from v4 to v5 [3]
- The update highlights the growth of the Transformers library, with daily downloads increasing from 20,000 to over 3 million and total installations surpassing 1.2 billion since the v4 release in November 2020 [3]
- The new version focuses on four key dimensions: simplicity, the transition from fine-tuning to pre-training, interoperability with high-performance inference engines, and making quantization a core feature [3]

Simplification
- The team's primary focus is simplicity, aiming for a clean and clear integration of models, which will enhance standardization, versatility, and community support [5][6]
- The library has adopted a modular design approach, facilitating easier maintenance and faster integration while promoting collaboration within the community [10]

Model Updates
- Transformers serves as a toolbox for model architectures, with the goal of including all the latest models and becoming the trusted source for model definitions [7]
- Over the past five years, an average of 1-3 new models has been added each week [8]

Model Conversion Tools
- Hugging Face is developing tools to identify similarities between new models and existing architectures, aiming to automate the conversion of models into the Transformers format [13][14]

Training Enhancements
- The v5 version emphasizes support for pre-training, with redesigned model initialization and broader compatibility with optimization operators [20]
- Hugging Face continues to collaborate with fine-tuning tools in the Python ecosystem and is ensuring compatibility with tools in the JAX ecosystem [21]

Inference Improvements
- Inference is a key area of optimization in v5, with updates including dedicated kernels, cleaner default settings, new APIs, and enhanced support for inference engines [22][25]
- The goal is not to replace specialized inference engines but to achieve compatibility with them [25]

Local Deployment
- The team collaborates with popular inference engines to ensure that models added to Transformers are immediately available and can leverage the advantages of those engines [27]
- Hugging Face is also working on local inference capabilities, allowing models to run directly on devices, with expanding support for multimodal models [28]

Quantization
- Quantization is becoming a standard in modern model development, with many state-of-the-art models being released in low-precision formats such as 8-bit and 4-bit [29]
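To make the quantization point above concrete, the sketch below shows how a model can already be loaded in 4-bit through the existing bitsandbytes integration in Transformers; the checkpoint name is only an illustrative assumption, and exact defaults may shift in v5.

```python
# Minimal sketch of 4-bit loading via the bitsandbytes integration that
# Transformers exposes today; checkpoint name and defaults are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NF4 quantization format
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for stability
)

model_id = "meta-llama/Llama-3.2-1B"        # example checkpoint only
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)

inputs = tokenizer("Transformers v5 makes quantization", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```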
After Five Years, Transformers v5 Has Finally Arrived
机器之心· 2025-12-02 06:47
Core Insights
- The article discusses the release of the first release candidate, v5.0.0rc0, of the Transformers library, marking a significant transition from version 4 to version 5 after a five-year technical cycle [2]
- The library has seen a dramatic increase in usage, with daily downloads rising from 20,000 at the time of v4's release to over 3 million today, and total installations surpassing 1.2 billion [2]
- The core focus of the v5 update is on simplicity, pre-training, interoperability with high-performance inference engines, and making quantization a core feature [2][3]

Evolution and Features
- The v5 version establishes PyTorch as the sole core backend and emphasizes four key dimensions of evolution: extreme simplicity, the transition from fine-tuning to pre-training, interoperability with high-performance inference engines, and enhanced quantization capabilities [2]
- The team aims for a clean and clear model integration approach, promoting broader standardization and stronger generality [4]
- Over the past five years, an average of 1-3 new models has been added each week, with the goal of becoming the only trusted source for model definitions [4]

Modular Design and Tools
- Hugging Face has advanced a modular design approach, simplifying maintenance and speeding up integration while fostering community collaboration [6]
- The introduction of the AttentionInterface provides a centralized abstraction layer for attention mechanisms, streamlining the management of common auxiliary functions (a sketch of how it can be used follows at the end of this summary) [8]
- Tools are being developed to identify similarities between new models and existing architectures, aiming to automate the conversion of models into the Transformers format [9][10]

Training Enhancements
- The v5 version increases support for pre-training, with redesigned model initialization and support for forward- and backward-propagation optimization operators [15][16]
- Hugging Face continues to collaborate closely with fine-tuning tools in the Python ecosystem and ensures compatibility with tools in the JAX ecosystem [17]

Inference Improvements
- Inference is a key focus of the v5 update, introducing dedicated kernels, cleaner default settings, new APIs, and optimized support for inference engines [18][19]
- The v5 version aims to complement specialized inference engines rather than replace them, ensuring compatibility with engines such as vLLM, SGLang, and TensorRT-LLM [21]

Local Deployment and Quantization
- The team collaborates with popular inference engines to allow Transformers to be used as a backend, enhancing the value of models added to Transformers [23]
- Quantization is positioned as a core capability of Transformers, ensuring compatibility with major functionalities and providing a reliable framework for training and inference [27]
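As a concrete illustration of the AttentionInterface mentioned above, the sketch below registers a custom attention implementation by name, delegating the actual computation to the built-in SDPA path; the import path, function signature, and checkpoint name reflect recent Transformers releases and are assumptions that may change in v5.

```python
# Minimal sketch, assuming the AttentionInterface registry in recent
# Transformers versions; import paths and signatures may differ in v5.
import torch
from transformers import AttentionInterface, AutoModelForCausalLM
from transformers.integrations.sdpa_attention import sdpa_attention_forward

def logged_sdpa(module, query, key, value, attention_mask, **kwargs):
    # Delegate the math to the built-in SDPA implementation and only add a
    # hook around it (logging, profiling, mask tweaks, ...).
    print(f"attention called in {module.__class__.__name__}")
    return sdpa_attention_forward(module, query, key, value, attention_mask, **kwargs)

# Make the implementation selectable by name across supported models.
AttentionInterface.register("logged_sdpa", logged_sdpa)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",              # example checkpoint only
    attn_implementation="logged_sdpa",
)
model(torch.ones(1, 5, dtype=torch.long))   # runs a forward pass through the custom hook
```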