Scaling Law(规模定律)

Search documents
模型训练最重要的依然是 Scaling —— 对话阿里通义千问 Qwen 多语言负责人杨宝嵩 | Open AGI Forum
AI科技大本营· 2025-06-25 06:49
Core Viewpoint - The article discusses the rapid rise of large model technology globally, emphasizing Alibaba's Tongyi Qwen model's international success and its strategic focus on multilingual capabilities to cater to a global audience [2][3]. Group 1: Multilingual Strategy - Tongyi Qwen supports 119 languages, with a core strategy prioritizing multilingual data optimization from the outset to ensure equitable access to AI technology for global users [2][3]. - The team has developed a complex cultural annotation system to address the challenges of multilingual safety and cultural alignment, covering thousands of detailed categories to ensure compliance and effectiveness across different regions [3][12]. - The current industry faces a "multilingual reasoning challenge," where models often mix languages during processing, leading to inconsistencies. The team has adopted a compromise strategy to use native languages for strong languages and English for low-resource languages to maintain output stability [3][16]. Group 2: Scaling Law and Knowledge Density - The article highlights the importance of scaling model size and data volume while also focusing on increasing "knowledge density," which refers to the concentration of useful knowledge within the training data [19][20]. - Recent trends show that smaller models with higher knowledge density can outperform larger models, indicating a shift in focus from merely increasing data volume to refining data quality [20][21]. - The team is exploring data synthesis methods to enhance training data quality, which includes generating new knowledge and filtering redundant data to improve knowledge density [22][23]. Group 3: AI Integration and Future Prospects - The integration of AI models into various devices, such as smart glasses and earphones, is a growing trend, with the company planning to release smaller model versions optimized for these applications [28][30]. - The article discusses the potential for AI to enhance user experiences in everyday tasks, such as real-time translation and contextual assistance, although challenges remain in achieving seamless integration [30][32]. - The company acknowledges the importance of balancing the use of synthetic data with human-generated content to maintain diversity and avoid narrowing the model's knowledge base [25][26].