AI Dynamics Roundup: NVIDIA's Llama-Nemotron Model Performs Strongly; Xiaomi's Mi-BRAG Intelligent Engine Debuts
China Post Securities · 2025-05-14 13:08
Quantitative Models and Construction Methods

1. Model Name: Llama-Nemotron

- **Model Construction Idea**: The Llama-Nemotron model aims to strengthen inference capability and reduce memory usage without sacrificing performance[12][13]
- **Model Construction Process**:
  - **Stage 1: Neural Architecture Search (NAS)**: Starts from the Llama 3 model and accelerates inference via block-level local distillation, using a mixed-integer programming (MIP) solver to select the most efficient block configuration[14]
  - **Stage 2: Vertical Compression and FFN Fusion**: Introduces FFN fusion, which identifies runs of consecutive FFN blocks and replaces them to reduce sequential depth and improve computational efficiency[14]
  - **Stage 3: Knowledge Distillation and Continued Pre-training**: Performs knowledge distillation and continued pre-training to improve model quality and recover any quality lost to block replacement[15]
  - **Stage 4: Supervised Fine-Tuning (SFT)**: Fine-tunes on mixed instruction data and reasoning trajectories drawn from strong teacher models[15]
  - **Stage 5: Large-Scale Reinforcement Learning**: Trains with large-scale reinforcement learning, particularly on complex mathematical and STEM datasets[15]
- **Model Evaluation**: The model is designed to improve inference efficiency and reduce memory usage while maintaining high performance[13][16]

Model Backtesting Results

- **Llama-Nemotron Model**:
  - HumanEval (0-shot): 92.1%[53]
  - LiveCodeBench v6 (0-shot): 30.3%[53]
  - MultiPL-E average (0-shot): 81.4%[53]
  - ArenaHard (0-shot): 97.1%[53]
  - IfEval (0-shot): 89.4%[53]
  - Math500 Instruct (0-shot): 91.0%[53]
  - GPQA Diamond (5-shot CoT): 57.1%[53]
  - MMLU Pro (5-shot CoT): 77.2%[53]
  - RULER 32K: 96.0%[53]
  - RULER 128K: 90.2%[53]
  - MMMU (0-shot): 66.1%[53]
  - DocVQA (0-shot): 95.3%[53]
  - AI2D (0-shot): 93.7%[53]
  - ChartQA (0-shot): 82.6%[53]
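The FFN fusion idea in Stage 2 of the Llama-Nemotron pipeline above can be illustrated with a toy sketch: consecutive residual FFN blocks, normally applied one after another, are instead run in parallel on the same input and summed, cutting sequential depth at the cost of an approximation. The scalar "FFN" functions below are hypothetical stand-ins, not the model's actual blocks.

```python
# Toy illustration of FFN fusion (hypothetical blocks, not the real model).

def ffn1(x):
    # stand-in FFN block with made-up weights
    return [0.1 * v for v in x]

def ffn2(x):
    return [0.2 * v for v in x]

def sequential(x):
    # standard execution: each residual block sees the previous block's output
    y = [a + b for a, b in zip(x, ffn1(x))]   # x + FFN1(x)
    return [a + b for a, b in zip(y, ffn2(y))]  # y + FFN2(y)

def fused(x):
    # FFN fusion: both blocks read the same input and their outputs are
    # summed, so two sequential blocks collapse into one parallel step
    return [a + b + c for a, b, c in zip(x, ffn1(x), ffn2(x))]

x = [1.0, 2.0, 3.0]
print("sequential:", sequential(x))
print("fused:     ", fused(x))
```

The two forms differ slightly (the fused block drops the inter-block dependency), which is why the pipeline follows fusion with distillation and continued pre-training to recover quality.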
Quantitative Factors and Construction Methods

1. Factor Name: Mi-BRAG

- **Factor Construction Idea**: The Mi-BRAG system targets three pain points of traditional large models: high knowledge-update costs, lack of insight into proprietary knowledge bases, and data-leakage risk[25]
- **Factor Construction Process**:
  - **Full-Format Compatibility**: An integrated intelligent parsing engine handles document formats such as PDF, Word, and Excel[27]
  - **Full-Modal Parsing**: Accurately parses complex images, tables, and mixed content[27]
  - **Multilingual Q&A**: Supports document parsing and interactive Q&A in major languages[27]
  - **Fine-Grained Traceability**: Dynamic traceability technology marks the source document and citation location for every generated result[27]
- **Factor Evaluation**: The system serves as an intelligent knowledge hub across application scenarios, improving product intelligence and user experience[28]

Factor Backtesting Results

- **Mi-BRAG Factor**:
  - SuperCLUE-RAG generation-capability ranking: first place, April 2025[31]

2. Factor Name: VPP (Video Prediction Policy)

- **Factor Construction Idea**: VPP generates actions from text instructions, leveraging AIGC video diffusion models for predictive visual representation and action learning[36][39]
- **Factor Construction Process**:
  - **Stage 1**: Uses video diffusion models to learn predictive visual representations[36]
  - **Stage 2**: Employs a Video Former and DiT diffusion policy for action learning[36]
- **Factor Evaluation**: VPP markedly improves the generalization ability of humanoid robots by learning from human actions, reducing dependence on high-quality robot data[36][40]

Factor Backtesting Results

- **VPP Factor**:
  - Calvin ABC-D task average length: 4.33[42]
  - Real-world dexterous-hand task success rate: 67%[42]
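The fine-grained traceability described for Mi-BRAG can be sketched as a minimal retrieval step that tags every answer with its source document and chunk position. This is a generic illustration under stated assumptions (the documents, scoring function, and citation format below are invented), not Xiaomi's implementation.

```python
# Minimal sketch of citation-tagged retrieval, in the spirit of the
# fine-grained traceability feature above. Documents are hypothetical.

docs = {
    "manual.pdf": ["Press the reset button for 5 seconds.",
                   "Battery lasts 12 hours."],
    "faq.docx":   ["Warranty covers two years of normal use."],
}

def retrieve(query, k=1):
    """Score every chunk by word overlap with the query; keep top-k
    together with their provenance (document name, chunk index)."""
    q = set(query.lower().split())
    scored = []
    for doc, chunks in docs.items():
        for i, chunk in enumerate(chunks):
            overlap = len(q & set(chunk.lower().rstrip(".").split()))
            scored.append((overlap, doc, i, chunk))
    scored.sort(reverse=True)
    return scored[:k]

def answer(query):
    """Return the best-matching chunk tagged with its citation location,
    so the result can be traced back to the original document."""
    score, doc, i, chunk = retrieve(query)[0]
    return f"{chunk} [source: {doc}, chunk {i}]"

print(answer("battery life in hours"))
```

A production system would replace the word-overlap scorer with embedding retrieval and mark spans inside the generated text, but the provenance bookkeeping is the same idea.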