Shanghai AI Unicorn Rolls Out a Full-Modality "All-in-One" Model Suite

Core Insights
- MiniMax has launched several new AI models spanning text, video, speech, and music: the text model M2, the video model Hailuo 2.3, the speech model Speech 2.6, and the music model Music 2.0 [1][2]

Group 1: Text Model M2
- The newly released MiniMax M2 text model has 10 billion active parameters out of 230 billion total parameters. On the Artificial Analysis (AA) leaderboard it ranks in the global top five and first among open-source models, which the company describes as a historic breakthrough [1]
- The model costs approximately $0.53 per million tokens, only 8% of the cost of Claude 4.5 Sonnet, and its inference speed is nearly twice that of this competitor [1]
- Within five days of launch, M2 became the fourth most-called model globally on the OpenRouter API aggregation platform, ranking first among Chinese models and third in programming scenarios [1]

Group 2: Video Model Hailuo 2.3
- The Hailuo 2.3 video generation model is a comprehensive technical upgrade over its predecessor, Hailuo 02, with significant improvements in dynamic expressiveness, stylization, and the subtlety of character performances [1]

Group 3: Speech Model Speech 2.6
- The Speech 2.6 model has been deeply optimized for Voice Agent scenarios. It reduces first-response latency to 250 milliseconds and supports full interactive capabilities beyond simple voice-to-text conversion [2]

Group 4: Music Model Music 2.0
- The Music 2.0 model captures and reproduces the subtle emotions of the human voice and the dynamic tension of instruments, functioning like a "singing producer" that understands rhythm and emotion [2]
- Vocal performance has improved significantly: the model's sound quality closely resembles a real human voice, and it can handle a variety of singing styles and emotional expressions [2]