Spectrum-to-Signal Principle (SSP)
Huh? Weibo's $7,800-Trained Large Model Beats DeepSeek-R1 in Math
量子位 (QbitAI) · 2025-11-18 05:02
Core Insights
- Weibo has launched its first self-developed open-source large model, VibeThinker, which has only 1.5 billion parameters yet outperformed the far larger 671-billion-parameter DeepSeek-R1 in benchmark tests [1][7]
- A single post-training run for VibeThinker costs only $7,800, far below competitors such as DeepSeek and MiniMax, whose post-training costs run into the hundreds of thousands of dollars [2][10]
- This breakthrough may shift the AI industry's focus from a "scale competition" to an "efficiency revolution" [3][9]

Industry Disruption
- The AI industry has traditionally treated parameter count as the primary measure of model capability, on the belief that complex reasoning requires more than 100 billion parameters [5][6]
- VibeThinker challenges this notion by showing that a smaller model can achieve superior performance through an optimized model structure and training method, the "Spectrum-to-Signal Principle" (SSP); a toy sketch of the idea follows this summary [7][8]
- The model's performance on high-difficulty mathematics tests has drawn significant attention, including endorsements from platforms such as HuggingFace [7]

Cost Revolution
- VibeThinker's training cost is a fraction of the industry norm, with the entire post-training process totaling approximately $7,800 [10][13]
- This cost efficiency broadens access to advanced AI capabilities, enabling smaller companies and research institutions to participate in AI innovation [13]

Application and Ecosystem Development
- Weibo is actively integrating AI technology across its business scenarios, improving user experience and content-production efficiency [15][20]
- The company plans to leverage its unique data assets to build a model that better understands public sentiment and social needs [17][18]
- VibeThinker is expected to drive multiple AI applications within Weibo, enhancing user experience and potentially creating a new "social super-ecosystem" [19][20]
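The summaries describe SSP only at a high level: a learning phase that keeps many candidate solution paths alive (the "spectrum"), followed by a reinforcement learning phase that concentrates probability on the paths that actually work (the "signal"). The Python sketch below is a deliberately minimal toy of that two-stage shape; the strategy pool, reward definition, step size, and update rule are all my illustrative assumptions, not details from the VibeThinker paper.

```python
import random

# Toy sketch of the two-stage Spectrum-to-Signal idea described above.
# Every concrete choice here (strategy count, which strategies are
# "correct", the step size) is an illustrative assumption.

random.seed(0)

N_STRATEGIES = 8   # hypothetical pool of distinct solution styles
CORRECT = {2, 5}   # hypothetical: only these styles solve the task
LR = 0.05          # multiplicative-weights step size (assumed)

# Stage 1 ("spectrum"): start from a uniform distribution over
# strategies, i.e. preserve diversity rather than collapsing early
# onto a single answer style.
weights = [1.0] * N_STRATEGIES

def sample(probs):
    """Draw one strategy index from a categorical distribution."""
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

# Stage 2 ("signal"): reinforcement-style sharpening. Sample a
# strategy, observe a binary reward, and multiplicatively reinforce
# strategies that earned reward.
for _ in range(3000):
    total = sum(weights)
    probs = [w / total for w in weights]
    i = sample(probs)
    reward = 1.0 if i in CORRECT else 0.0
    weights[i] *= 1.0 + LR * reward

total = sum(weights)
mass = sum(weights[i] for i in CORRECT) / total
print(f"probability mass on correct strategies after RL stage: {mass:.3f}")
```

Per the summaries, the real first stage encourages exploration of all possible solutions during learning and the second applies reinforcement learning for efficient strategy optimization; this bandit-style loop is only the smallest runnable shape of that idea.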
Sina Weibo Releases Its First Open-Source Large Model, VibeThinker-1.5B
Sou Hu Cai Jing (Sohu Finance) · 2025-11-13 21:18
Core Insights
- Weibo has launched its first open-source large model, VibeThinker-1.5B, arguing that smaller models can also exhibit high intelligence [1][2]
- With 1.5 billion parameters, the model challenges the notion that only massive parameter counts can deliver high performance, demonstrating that innovative algorithm design can yield significant results [2][5]

Model Performance
- VibeThinker-1.5B outperformed DeepSeek-R1-0120, which has 671 billion parameters, on three challenging mathematics test sets (AIME24, AIME25, HMMT25) [2][5]
- Its performance is comparable to that of the 456-billion-parameter MiniMax-M1, and on the LiveCodeBench v6 programming test set it matched models with far higher parameter counts [2][5]

Training Methodology
- The model's success is attributed to the "Spectrum-to-Signal Principle" (SSP) training method, which first encourages exploration of all possible solutions during the learning phase and then applies reinforcement learning for efficient strategy optimization [5][6]
- VibeThinker-1.5B's post-training cost is under $8,000, far below the roughly $290,000 and $530,000 reported for DeepSeek-R1 and MiniMax-M1 respectively; a back-of-envelope comparison follows this summary [6]

Accessibility and Impact
- By open-sourcing VibeThinker-1.5B, Weibo aims to offer a cost-effective research path for medium-sized enterprises and academic teams with limited computational resources, promoting inclusivity in cutting-edge model training [6]
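Because the summaries report dollar totals only, the small script below simply puts the three reported post-training budgets side by side. The $2 per GPU-hour rental price is my assumption, used purely to convey scale; neither article reports hardware, hours, or rates.

```python
# Side-by-side view of the post-training budgets reported above.
# Dollar totals come from the article summaries; the GPU-hour column
# uses an ASSUMED rental price of $2/GPU-hour purely for scale.

PRICE_PER_GPU_HOUR = 2.0  # USD per GPU-hour, assumed

reported_budgets = {
    "VibeThinker-1.5B": 7_800,    # "less than $8,000" per the articles
    "DeepSeek-R1":      290_000,  # per the second article
    "MiniMax-M1":       530_000,  # per the second article
}

base = reported_budgets["VibeThinker-1.5B"]
for model, usd in reported_budgets.items():
    ratio = usd / base
    hours = usd / PRICE_PER_GPU_HOUR
    print(f"{model:<18} ${usd:>8,}  {ratio:>5.1f}x  ~{hours:>8,.0f} GPU-hours (assumed)")
```

At the reported figures, DeepSeek-R1's post-training budget is roughly 37 times VibeThinker's and MiniMax-M1's roughly 68 times, which is the gap the "efficiency revolution" framing rests on.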