AI Roundup: NVIDIA's Llama-Nemotron Model Performs Strongly; Xiaomi's Mi-BRAG Intelligent Engine Debuts

Quantitative Models and Construction Methods

1. Model Name: Llama-Nemotron
  - Model Construction Idea: Llama-Nemotron aims to strengthen reasoning capability and reduce memory usage without sacrificing performance[12][13]
  - Model Construction Process:
    - Stage 1: Neural Architecture Search (NAS): Starts from the Llama 3 model and accelerates inference via block-level local distillation, with a mixed-integer programming (MIP) solver selecting the most efficient block configuration[14]
    - Stage 2: Vertical Compression and FFN Fusion: Introduces FFN fusion, which identifies runs of consecutive FFN blocks and replaces them, reducing sequence depth and improving computational efficiency[14]
    - Stage 3: Knowledge Distillation and Continued Pre-training: Applies knowledge distillation and continued pre-training to improve model quality and recover any quality lost to block replacement[15]
    - Stage 4: Supervised Fine-Tuning (SFT): Fine-tunes on a mix of instruction data and reasoning trajectories drawn from strong teacher models[15]
    - Stage 5: Large-Scale Reinforcement Learning: Trains the model with large-scale reinforcement learning, particularly on challenging mathematics and STEM datasets[15]
  - Model Evaluation: The model improves inference efficiency and reduces memory usage while maintaining high performance[13][16]

Model Backtesting Results

- Llama-Nemotron Model:
  - HumanEval 0-shot: 92.1%[53]
  - LiveCodeBench (v6) 0-shot: 30.3%[53]
  - MultiPL-E average 0-shot: 81.4%[53]
  - ArenaHard 0-shot: 97.1%[53]
  - IFEval 0-shot: 89.4%[53]
  - MATH500 Instruct 0-shot: 91.0%[53]
  - GPQA Diamond 5-shot CoT: 57.1%[53]
  - MMLU Pro 5-shot CoT: 77.2%[53]
  - RULER 32K: 96.0%[53]
  - RULER 128K: 90.2%[53]
  - MMMU 0-shot: 66.1%[53]
  - DocVQA 0-shot: 95.3%[53]
  - AI2D 0-shot: 93.7%[53]
  - ChartQA 0-shot: 82.6%[53]

Quantitative Factors and Construction Methods

1. Factor Name: Mi-BRAG
  - Factor Construction Idea: The Mi-BRAG system addresses the high cost of knowledge updates, the lack of insight into proprietary knowledge bases, and the data-leakage risks of conventional large models[25]
  - Factor Construction Process:
    - Full-Format Compatibility: Integrates an intelligent parsing engine that handles document formats such as PDF, Word, and Excel[27]
    - Full-Modal Parsing: Accurately parses complex images, tables, and mixed-content information[27]
    - Multilingual Q&A: Supports document parsing and interactive Q&A in major languages[27]
    - Fine-Grained Traceability: Uses dynamic traceability to tag each generated result with its source document and citation location (a minimal sketch of this retrieve-and-cite pattern follows the backtesting results below)[27]
  - Factor Evaluation: The system serves as an intelligent knowledge hub across application scenarios, improving product intelligence and user experience[28]

Factor Backtesting Results

- Mi-BRAG Factor:
  - SuperCLUE-RAG Generation Capability Ranking: Ranked first in April 2025[31]
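The report gives only a feature-level description of Mi-BRAG. As referenced above, the following is a minimal, self-contained sketch of the retrieve-and-cite pattern behind fine-grained traceability: every indexed chunk carries a (doc_id, location) tag, and the composed answer cites the tag of each supporting chunk. All names here (Chunk, retrieve, the toy token-overlap scorer) are illustrative assumptions, not Xiaomi's implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Chunk:
    doc_id: str      # source document identifier
    location: str    # page/paragraph marker used for the citation
    text: str

def score(query: str, chunk: Chunk) -> float:
    """Toy relevance score: fraction of query tokens present in the chunk."""
    q_tokens = set(query.lower().split())
    c_tokens = set(chunk.text.lower().split())
    return len(q_tokens & c_tokens) / max(len(q_tokens), 1)

def retrieve(query: str, index: list[Chunk], k: int = 3) -> list[Chunk]:
    """Return the k most relevant chunks for the query."""
    return sorted(index, key=lambda c: score(query, c), reverse=True)[:k]

def answer_with_citations(query: str, index: list[Chunk]) -> str:
    """Compose an answer in which every supporting chunk keeps its
    (doc_id, location) tag, so each statement is traceable to a source."""
    hits = retrieve(query, index)
    lines = [f"Q: {query}"]
    for c in hits:
        lines.append(f"- {c.text} [source: {c.doc_id}, {c.location}]")
    return "\n".join(lines)

if __name__ == "__main__":
    index = [
        Chunk("manual.pdf", "p.3", "The engine parses PDF, Word, and Excel input."),
        Chunk("faq.docx", "sec.2", "Answers cite the original document and location."),
        Chunk("spec.xlsx", "sheet.1", "Tables and images are parsed into structured text."),
    ]
    print(answer_with_citations("Which input formats are supported?", index))
```

A production system would replace the overlap scorer with embedding-based retrieval and pass the tagged chunks to an LLM whose output keeps the citations attached, but the traceability mechanism is the same: provenance tags travel with the retrieved text.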
2. Factor Name: VPP (Video Prediction Policy)
  - Factor Construction Idea: VPP generates actions from text instructions, leveraging AIGC video diffusion models for predictive visual representation and action learning[36][39]
  - Factor Construction Process:
    - Stage 1: Uses a video diffusion model to learn predictive visual representations[36]
    - Stage 2: Employs a Video Former and a DiT diffusion policy for action learning (a sketch follows the backtesting results below)[36]
  - Factor Evaluation: VPP substantially improves the generalization ability of humanoid robots by learning from human actions, reducing dependence on high-quality robot data[36][40]

Factor Backtesting Results

- VPP Factor:
  - CALVIN ABC-D Task Average Length: 4.33[42]
  - Real-World Dexterous Hand Task Success Rate: 67%[42]
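The report names VPP's components but not their internals. The sketch below, under stated simplifying assumptions, wires together the two stages it describes: a frozen video-diffusion backbone supplying predictive visual features, a Video Former that distills them into conditioning tokens, and a DiT-style diffusion head trained to denoise action chunks. The module designs, dimensions, and noise schedule are illustrative stand-ins, not the published architecture.

```python
import torch
import torch.nn as nn

class VideoDiffusionBackbone(nn.Module):
    """Stand-in for a pretrained video diffusion model; VPP reuses such a
    model's intermediate features as predictive visual representations.
    Kept frozen here, mirroring the use of a pretrained backbone."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.encoder = nn.Conv3d(3, feat_dim, kernel_size=3, padding=1)

    @torch.no_grad()
    def forward(self, video: torch.Tensor) -> torch.Tensor:
        # video: (B, 3, T, H, W) -> feature tokens (B, T*H*W, C)
        feats = self.encoder(video)
        return feats.flatten(2).transpose(1, 2)

class VideoFormer(nn.Module):
    """Compresses dense backbone features into a few conditioning tokens via
    cross-attention from learned queries (illustrative design choice)."""
    def __init__(self, feat_dim: int = 256, n_queries: int = 16):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(n_queries, feat_dim))
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        q = self.queries.expand(feats.size(0), -1, -1)   # (B, n_queries, C)
        out, _ = self.attn(q, feats, feats)
        return out

class DiTActionHead(nn.Module):
    """DiT-style diffusion policy head: predicts the noise added to an action
    chunk, conditioned on the visual tokens (reduced to an MLP for brevity)."""
    def __init__(self, feat_dim: int = 256, act_dim: int = 7, horizon: int = 8):
        super().__init__()
        self.act_dim, self.horizon = act_dim, horizon
        self.net = nn.Sequential(
            nn.Linear(act_dim * horizon + feat_dim + 1, 512),
            nn.GELU(),
            nn.Linear(512, act_dim * horizon),
        )

    def forward(self, noisy_actions, cond_tokens, t):
        # noisy_actions: (B, horizon, act_dim); t: (B,) diffusion timestep
        cond = cond_tokens.mean(dim=1)                   # pool conditioning
        x = torch.cat([noisy_actions.flatten(1), cond, t.unsqueeze(1)], dim=-1)
        return self.net(x).view(-1, self.horizon, self.act_dim)

# One training step with a standard noise-prediction diffusion objective:
backbone, former, head = VideoDiffusionBackbone(), VideoFormer(), DiTActionHead()
video = torch.randn(2, 3, 4, 32, 32)     # batch of short observation clips
actions = torch.randn(2, 8, 7)           # ground-truth action chunks
t = torch.rand(2)                        # timesteps in [0, 1)
noise = torch.randn_like(actions)
a, b = torch.sqrt(1 - t).view(-1, 1, 1), torch.sqrt(t).view(-1, 1, 1)
noisy = a * actions + b * noise          # interpolate clean actions and noise
pred = head(noisy, former(backbone(video)), t)
loss = nn.functional.mse_loss(pred, noise)   # gradients reach head + former only
```

The design point the sketch preserves is that only the Video Former and the action head are trained, so the policy inherits the backbone's predictive representations rather than learning vision from scratch on scarce robot data.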