Multi-modal Training

A Changing of the Guard in Silicon Valley
Hu Xiu · 2025-08-05 01:40
Two or three years later, like many of my peers, I packed my bags and came to this city, the one farthest from Silicon Valley yet the most obsessed with it.

Beijing back then was a place overflowing with ambition. The cafes around Zhongguancun were often packed; venture capitalists and founders sat shoulder to shoulder in hushed conversation, now opening a laptop to demo a product prototype, now sketching a cap table on a napkin. The air mixed the scent of freshly compiled code with the anxiety of overheated valuations.

It was an era when a reporter could dial a founder's cell phone directly. Big names who are now impossible to reach would, back then, invite reporters to dinner and talk endlessly about their dreams. Sometimes they talked a little too much.

You never knew whether, at the next day's launch event, this new company would become a runaway success like Sina or Sohu, nor whether a single lunch might let you glimpse the beginning of a product or technology wave.

More than twenty years ago, I graduated with a journalism degree from a university in western China and, like most of my classmates, joined a local newspaper: writing, editing, pulling all-nighters, quickly learning the rhythms of the lowest rungs of the Chinese news industry.

But Beijing's pull never faded.

Even the bursting of the internet bubble in 2000 did not extinguish the faith. We watched the layoffs and shutdowns the crash brought, while also living through the flame rekindled by the mobile internet.

There was a time when an adolescent Facebook and Google were displacing old-guard companies like Sun Microsystems as the coolest places to work. Morning meetings were held on rainbow-colored bean bags, and lunch was free ...
A big speedup for DeepSeek-style GRPO training! ModelScope (魔搭) open-sources a full-pipeline solution covering multi-modal training, training acceleration, and end-to-end evaluation
量子位 (QbitAI) · 2025-03-09 04:45
Core Viewpoint
- The article covers ModelScope's advances in GRPO training tooling, highlighting the SWIFT framework and its optimizations for faster, more stable reinforcement learning training [1][10].

Group 1: GRPO Training Enhancements
- GRPO builds on the PPO algorithm but derives advantages from groups of sampled completions rather than a learned value model, which improves training stability and maintainability [1] (see the advantage sketch below).
- The SWIFT framework has been optimized for GRPO training, addressing challenges such as slow training and complex cluster configuration [3][10].
- Asynchronous sampling lets sampling and training run simultaneously, cutting wall-clock time significantly compared with synchronous methods [4][5] (a producer/consumer sketch follows below).

Group 2: Sampling Efficiency
- Sampling time is a critical factor in GRPO training, and a single inference instance is often insufficient for larger models [3].
- Allowing multiple instances to sample in data-parallel fashion lets SWIFT allocate resources effectively and raise sampling throughput [3].
- Experiments show asynchronous sampling reduces training time to about two-thirds of the synchronous baseline [5].

Group 3: Multi-Round Updates
- Multi-round updates reuse each sampled batch across several optimizer iterations, balancing resources between sampling and training [11][12] (see the batch-reuse sketch below).
- Choosing a suitable number of update iterations can significantly raise training speed without hurting model performance [11][14].

Group 4: Performance Comparison
- In comparative tests, SWIFT trained at roughly 120 seconds per step, outperforming frameworks such as veRL and TRL [18].
- The acceleration techniques integrated into SWIFT bring significant GRPO efficiency gains on small and medium clusters [18].

Group 5: Multi-Modal GRPO Training
- SWIFT supports multi-modal GRPO training across data types such as images, video, and audio [20].
- The framework was validated on the CLEVR-70k-Counting dataset, reaching high accuracy on the multi-modal counting task [20][22] (a counting-reward sketch appears below).

Group 6: Evaluation Framework
- EvalScope is introduced as a comprehensive evaluation tool for large models, providing performance assessment and visualization [23] (a usage sketch appears below).
- It also diagnoses underthinking and overthinking in reasoning models, improving their efficiency at reaching correct answers [23][27].

Group 7: Conclusion and Future Directions
- SWIFT aims to give developers a differentiated technical route for RL training, with continued support across training domains [26][27].
- Future exploration will focus on reasoning models' thinking efficiency and the emerging paradigm of multi-modal reasoning [27].
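To ground the Group 1 point about dropping the value model, here is a minimal sketch (my illustration, not the article's or SWIFT's code) of the group-relative advantage at the heart of GRPO: each completion's reward is normalized against the mean and standard deviation of its own group, so no learned critic is required.

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """Group-relative advantages: one row of rewards per prompt,
    one column per sampled completion. Each completion is scored
    against its own group's statistics instead of a value model."""
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + eps)

# Example: 2 prompts, 4 sampled completions each.
rewards = torch.tensor([[1.0, 0.0, 0.0, 1.0],
                        [0.0, 0.0, 0.0, 1.0]])
print(grpo_advantages(rewards))
```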
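The asynchronous sampling described in Groups 1 and 2 can be pictured as a producer/consumer pipeline: a sampler thread keeps generating the next batch of rollouts while the trainer consumes the previous one. This is a simplified single-machine sketch of the overlap, not SWIFT's multi-instance implementation, and it glosses over the fact that the sampling policy then lags one step behind the trained policy.

```python
import queue
import threading

def sampler(batch_queue: queue.Queue, num_batches: int) -> None:
    """Producer: generates rollouts while training consumes earlier ones."""
    for step in range(num_batches):
        batch = f"rollouts_for_step_{step}"  # stand-in for model.generate(...)
        batch_queue.put(batch)               # blocks while the queue is full
    batch_queue.put(None)                    # sentinel: no more batches

def trainer(batch_queue: queue.Queue) -> None:
    """Consumer: trains on batch k while batch k+1 is being sampled."""
    while (batch := batch_queue.get()) is not None:
        print(f"training on {batch}")        # stand-in for optimizer step(s)

q: queue.Queue = queue.Queue(maxsize=1)      # at most one batch "in flight"
t = threading.Thread(target=sampler, args=(q, 5))
t.start()
trainer(q)
t.join()
```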
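Group 3's multi-round updates boil down to taking several clipped policy-gradient steps on one sampled batch. The sketch below is generic PyTorch under assumed names (`policy` as a callable returning log-probs, the `batch` dictionary keys), not SWIFT's internals; the PPO-style importance ratio against the sampling-time log-probs is what keeps the repeated reuse stable.

```python
import torch

def multi_round_update(policy, optimizer, batch,
                       num_iterations: int = 4, clip_eps: float = 0.2) -> None:
    """Reuse one sampled batch for several optimizer steps. `batch`
    holds token data, group-relative advantages, and the log-probs
    recorded at sampling time (the "old" policy)."""
    for _ in range(num_iterations):
        new_logp = policy(batch["tokens"])               # current log-probs
        ratio = torch.exp(new_logp - batch["old_logp"])  # vs. sampling-time policy
        adv = batch["advantages"]
        # PPO-style clipping keeps repeated reuse of the same batch stable.
        loss = -torch.min(ratio * adv,
                          torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * adv).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# Toy demo: a one-parameter "policy" whose log-prob is scale * tokens.
scale = torch.nn.Parameter(torch.tensor(1.0))
opt = torch.optim.SGD([scale], lr=0.1)
batch = {"tokens": torch.tensor([0.5, 1.0]),
         "old_logp": torch.tensor([0.5, 1.0]),
         "advantages": torch.tensor([1.0, -1.0])}
multi_round_update(lambda toks: scale * toks, opt, batch)
```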
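For the CLEVR-70k-Counting experiment mentioned in Group 5, GRPO setups of this kind typically score completions with a rule-based accuracy reward rather than a learned reward model. A hypothetical minimal version for a counting task:

```python
import re

def counting_reward(completion: str, answer: str) -> float:
    """Rule-based reward for a counting task: 1.0 if the last number
    appearing in the model's completion matches the reference count,
    else 0.0. (Illustrative; not the article's exact reward.)"""
    numbers = re.findall(r"\d+", completion)
    return 1.0 if numbers and numbers[-1] == str(int(answer)) else 0.0

print(counting_reward("I count the cubes and spheres: there are 3 objects.", "3"))  # 1.0
print(counting_reward("There are four objects.", "4"))                              # 0.0 (no digit found)
```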
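Group 6's EvalScope can be driven from Python. The sketch below follows the pattern in EvalScope's documentation, but the model id and dataset name are placeholders, and exact argument names may vary across versions, so check the current docs before running.

```python
# Assumes: pip install evalscope. The import path and TaskConfig fields
# follow EvalScope's documented entry point; treat them as assumptions.
from evalscope import TaskConfig, run_task

task_cfg = TaskConfig(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # placeholder model id
    datasets=["gsm8k"],                  # placeholder benchmark
    limit=5,                             # small smoke-test run
)
run_task(task_cfg=task_cfg)
```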