Anyone Can Create Music! Tencent AI Lab Open-Sources the SongGeneration Music Generation Model
机器之心 (Jiqizhixin) · 2025-06-20 00:58
Core Viewpoint
- Tencent AI Lab has released and open-sourced the SongGeneration music generation model. It targets common challenges in music AIGC (sound quality, musicality, and generation speed) and achieves better performance than existing models [1][6].

Group 1: Model Performance and Features
- SongGeneration significantly improves sound quality while maintaining generation speed, and outperforms many existing open-source models on melody, accompaniment, sound quality, and structure [1][5].
- The model supports text control, multi-track synthesis, and style following, serving both consumer-facing (C-end) creators and business (B-end) users who need stability and scalability [2][8].
- Compared with traditional rule-based or small models, music generation based on large models generalizes better and has greater generative potential, moving AI music creation from "assistance" to "intelligent co-creation" [5][6].

Group 2: Technical Solutions
- SongGeneration's architecture consists of a music data pipeline and a generation model. The pipeline uses modules for audio separation, structure analysis, and lyric recognition to process a large dataset of songs for training [10][12].
- The model uses a low-bitrate music codec that reconstructs high-quality music at extremely low bitrates, which eases the modeling burden on the language model [19][20].
- A multi-category token parallel prediction strategy improves the harmony between vocals and accompaniment, and with it sound quality and musicality [21].

Group 3: Training Paradigm and Evaluation
- SongGeneration adopts a three-stage training paradigm for language-model-based music generation: pre-training, modular expansion training, and multi-preference alignment [27][30].
- The evaluation framework combines objective analysis with subjective listening, comparing SongGeneration against commercial and open-source models across multiple key dimensions [29][31].
- In objective assessments, SongGeneration ranks first among open-source models and is competitive with commercial models, showcasing its technical completeness and artistic expressiveness [32][33].

Group 4: User Experience and Accessibility
- SongGeneration can be tried online on Hugging Face, and all model weights and code are open-sourced for community use and feedback [36].
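
To make the low-bitrate point under Group 2 concrete, here is a minimal arithmetic sketch of why a lower-bitrate codec eases the language model's burden: fewer bits per second means fewer tokens per song to predict. The formula (frame rate × number of codebooks × log2 of codebook size) is the standard bitrate of a discrete token stream. The two configurations are hypothetical values chosen for illustration and are not figures from the article.

```python
import math


def token_bitrate_bps(frame_rate_hz: float, codebook_size: int, num_codebooks: int = 1) -> float:
    """Bits per second carried by a discrete token stream."""
    bits_per_token = math.log2(codebook_size)
    return frame_rate_hz * num_codebooks * bits_per_token


def tokens_for_song(duration_s: float, frame_rate_hz: float, num_codebooks: int = 1) -> int:
    """Number of tokens the language model must predict for one song."""
    return round(duration_s * frame_rate_hz * num_codebooks)


# Hypothetical codec settings (assumptions for illustration, not from the article).
low_rate = {"frame_rate_hz": 25.0, "codebook_size": 16384, "num_codebooks": 1}
high_rate = {"frame_rate_hz": 75.0, "codebook_size": 1024, "num_codebooks": 8}

low_bps = token_bitrate_bps(**low_rate)
high_bps = token_bitrate_bps(**high_rate)

song_seconds = 180.0  # a three-minute song
low_tokens = tokens_for_song(song_seconds, low_rate["frame_rate_hz"], low_rate["num_codebooks"])
high_tokens = tokens_for_song(song_seconds, high_rate["frame_rate_hz"], high_rate["num_codebooks"])

print(f"low-rate codec:  {low_bps / 1000:.2f} kbps, {low_tokens} tokens per song")
print(f"high-rate codec: {high_bps / 1000:.2f} kbps, {high_tokens} tokens per song")
```

Under these assumed settings the low-rate stream needs 24 times fewer tokens for the same song, which is the kind of sequence-length reduction that makes full-song modeling more tractable for a language model.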