Core Viewpoint
- MiniMax, a Shanghai-based AI unicorn, has launched a full multimodal model suite billed as "全家桶" (literally "family bucket," meaning an all-in-one bundle). The launch marks a significant breakthrough for Chinese AI companies in multimodal technology and opens new avenues for commercialization [1][2].

Group 1: Investment Insights
- MiniMax's multimodal "全家桶" is a technology stack covering text, vision, speech, and music. Its text model, M2, ranks among the top models globally in authoritative evaluations [2].
- M2 balances performance, speed, and cost, setting a new benchmark for model efficiency and cost control [3].

Group 2: Model Performance
- M2's inference cost is as low as $0.53 per million tokens, only 8% of Claude 4.5 Sonnet's cost, and its inference speed is nearly double that of Claude 4.5 Sonnet [3].
- Within five days of release, M2's API call volume surged to fourth place globally and first among Chinese domestic models, reflecting its balance of high performance and low cost [3].

Group 3: Product Matrix and Technical Layout
- The suite includes Hailuo 2.3 for video generation, which produces native 1080p video up to 10 seconds long, and Speech 2.6, which is optimized for voice agent scenarios and cuts response time to 250 milliseconds [4].
- Music 2.0 can generate complete songs up to 5 minutes long, reflecting the company's focus on generation quality and stability through its use of a full attention mechanism [4].
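The cost comparison in Group 2 can be checked with simple arithmetic. The sketch below is illustrative only: it uses just the two figures quoted in the summary ($0.53 per million tokens and the 8% ratio) to back out the comparison model's implied cost, which the summary does not state directly. The function names and the 50-million-token workload are hypothetical, chosen only for the example.

```python
# Figures quoted in the summary; everything else is derived from them.
M2_COST_PER_MILLION_USD = 0.53  # M2 inference cost per million tokens
M2_COST_RATIO = 0.08            # M2 cost as a fraction of the comparison model's cost


def implied_reference_cost(m2_cost: float, ratio: float) -> float:
    """Back out the comparison model's cost per million tokens from the ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return m2_cost / ratio


def cost_for_tokens(cost_per_million: float, tokens: int) -> float:
    """Total cost in USD for a given number of tokens."""
    return cost_per_million * tokens / 1_000_000


# Implied cost of the comparison model (an inference, not a quoted price).
reference_cost = implied_reference_cost(M2_COST_PER_MILLION_USD, M2_COST_RATIO)

# Hypothetical workload, used only to make the gap concrete.
workload_tokens = 50_000_000
m2_bill = cost_for_tokens(M2_COST_PER_MILLION_USD, workload_tokens)
reference_bill = cost_for_tokens(reference_cost, workload_tokens)

print(f"Implied comparison cost: ${reference_cost:.3f} per million tokens")
print(f"M2 bill: ${m2_bill:.2f}; comparison bill: ${reference_bill:.2f}")
```

On these assumptions, the 8% ratio implies a comparison cost of roughly $6.6 per million tokens, so the gap widens linearly with token volume.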
Guotai Haitong: MiniMax Releases Full-Modality AI "全家桶" (All-in-One Suite); M2 Tops Global Open-Source Models