Training Framework Innovation

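"Replicating" High-Flyer Quant's Creation of DeepSeek: Quantitative Private Equity Fund NianKong Achieves a Breakthrough in Foundational Large Model R&D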

Economic Observer · 2025-06-03 11:17

Core Viewpoint
- The article discusses the advancements in AI large models and the increasing focus of quantitative private equity funds on algorithm optimization in their research and development efforts, emphasizing the importance of collaboration between academia and industry for breakthroughs in foundational technology [1][6].

Group 1: AI Large Model Developments
- Since May, global companies in large model development have intensified competition in areas such as semantic understanding and multimodal capabilities [2].
- DeepSeek's R1 model has undergone a minor upgrade, significantly enhancing its reasoning ability and depth of thought [2].
- The introduction of new models by Anthropic, such as the "Claude 4" series, sets higher standards for programming and reasoning applications in the industry [2].

Group 2: New Training Framework
- The new training framework (SASR) proposed by NianKong Technology in collaboration with Shanghai Jiao Tong University has shown promising results, achieving over 80% accuracy on the GSM8K task with a 1.5B model, nearing GPT-4o's performance [2][5].
- This framework aims to optimize the balance between supervised fine-tuning (SFT) and reinforcement learning (RL), addressing the challenge of enhancing the model's intelligence without increasing data volume [3][10].

Group 3: Impact on Quantitative Investment
- The new training framework has been applied in quantitative investment strategy development, achieving approximately 80% accuracy in market predictions compared to traditional models, with a correlation of less than 50% [5][6].
- The combination of the new framework and traditional models is expected to yield synergistic effects, enhancing overall investment strategy effectiveness [6].

Group 4: Industry Trends and Challenges
- Many quantitative private equity funds have established AI Labs to focus on foundational technology research for large models, but replicating the success of DeepSeek is challenging due to high costs and resource requirements [9].
- The optimization of algorithms for general large models is becoming a crucial breakthrough point for enhancing overall model capabilities [9][12].
- The integration of academic research and private equity fund expertise is essential for achieving advancements in algorithm optimization and training framework innovation [12][13].
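
The synergy claim in Group 3 (a weakly correlated new signal blended with traditional models) can be illustrated with a small calculation. The sketch below is not from the article: it assumes two signals scaled to unit volatility, reads "correlation of less than 50%" as the correlation between the new model's signal and the traditional one, and uses a hypothetical function name `combined_sharpe` with purely illustrative inputs (1.0 for the traditional signal, 0.8 for the new one).

```python
import math


def combined_sharpe(s1: float, s2: float, rho: float) -> float:
    """Sharpe ratio of an equal-weight blend of two unit-volatility signals.

    s1, s2: standalone Sharpe ratios (hypothetical inputs).
    rho:    correlation between the two signals.
    The blend has mean s1 + s2 and variance 2 + 2 * rho.
    """
    if not -1.0 < rho <= 1.0:
        raise ValueError("rho must lie in (-1, 1]")
    return (s1 + s2) / math.sqrt(2.0 + 2.0 * rho)


# Illustrative numbers only, not results reported in the article.
traditional = 1.0
new_model = 0.8

for rho in (0.0, 0.5, 1.0):
    blend = combined_sharpe(traditional, new_model, rho)
    print(f"rho={rho:.1f}  blended Sharpe={blend:.3f}")
```

With these illustrative inputs the blend at a correlation of 0.5 comes out at about 1.04, slightly above the stronger signal's 1.0, while at a correlation of 1.0 it falls to 0.9. Under these assumptions, a somewhat weaker signal still helps as long as its correlation with the existing one is low enough.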

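"Replicating" High-Flyer Quant's Creation of DeepSeek: Quantitative Private Equity Fund NianKong Achieves a Breakthrough in Foundational Large Model R&D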

Jing Ji Guan Cha Wang · 2025-06-03 06:57

Core Insights
- The competition among global large model development companies has intensified, particularly in semantic understanding and multimodal capabilities since May [2]
- Domestic quantitative private equity funds are also entering the race, achieving breakthroughs in AI large model foundational technology [2][5]
- A new training framework (SASR) proposed by NianKong Technology in collaboration with Shanghai Jiao Tong University has shown promising results, achieving over 80% accuracy on the GSM8K task with a 1.5B model, nearing GPT-4o's performance [2][4]

Group 1: Training Framework and Algorithm Optimization
- The current training frameworks for large models primarily focus on Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL), with the challenge being to optimize the balance between these two methods [3][8]
- The new training framework aims to dynamically adjust the relationship between SFT and RL, allowing the model to become "smarter" without increasing data volume [3][9]
- The innovative training framework has been applied in quantitative investment strategy development, achieving approximately 80% market prediction accuracy compared to traditional models [4][13]

Group 2: Industry Trends and Collaborations
- Many quantitative private equity firms are establishing AI Labs to focus on foundational technology research for large models, emphasizing algorithm optimization [6][11]
- The integration of academic research and private equity expertise is seen as a shortcut to breakthroughs in large model foundational technology [5][11]
- The emergence of smarter large models with lower parameter counts but superior overall capabilities is attributed to innovations in training frameworks and algorithm optimization [10]

Group 3: Future Directions and Challenges
- The ability of large models to become "smarter" in various vertical fields depends on high-quality data and effective training modes [12]
- NianKong Technology aims to empower large models to excel in more vertical fields, enhancing China's competitiveness in the global AI landscape [14]
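
The idea in Group 1 of dynamically adjusting the relationship between SFT and RL can be pictured as a per-step mixing weight between two losses. The sketch below is a toy illustration only and does not reproduce SASR: the article does not describe the actual rule, so the weighting heuristic, the function names `sft_weight` and `blended_loss`, and all numbers are hypothetical assumptions.

```python
def sft_weight(current_grad_norm: float, warmup_grad_norm: float) -> float:
    """Hypothetical rule: rely more on SFT while its gradient is still large.

    current_grad_norm: norm of the SFT gradient at the current step.
    warmup_grad_norm:  reference norm recorded during an assumed warm-up phase.
    Returns a weight in [0, 1) that shrinks as the SFT gradient shrinks.
    """
    if current_grad_norm < 0 or warmup_grad_norm <= 0:
        raise ValueError("gradient norms must be non-negative (reference > 0)")
    return current_grad_norm / (current_grad_norm + warmup_grad_norm)


def blended_loss(sft_loss: float, rl_loss: float, weight: float) -> float:
    """Convex combination of the two objectives for one training step."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * sft_loss + (1.0 - weight) * rl_loss


# Illustrative trajectory: the SFT gradient norm decays over training,
# so the blend drifts from mostly-SFT toward mostly-RL.
for step, grad_norm in enumerate([2.0, 1.0, 0.5, 0.1]):
    w = sft_weight(grad_norm, warmup_grad_norm=1.0)
    loss = blended_loss(sft_loss=1.2, rl_loss=0.7, weight=w)
    print(f"step={step}  sft_weight={w:.3f}  blended_loss={loss:.3f}")
```

The design choice shown here is that the balance is driven by a training signal rather than by a fixed schedule, so no extra data is needed to shift emphasis between the two objectives. Whether SASR uses a signal of this kind is not stated in the article.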