AI News Roundup: Apple Releases Xcode 26 Beta 7, NVIDIA Open-Sources the High-Performance Jet-Nemotron Language Model

Quantitative Models and Construction Methods

Model Name: Jet-Nemotron
- Model Construction Idea: The model is built with PostNAS (Post Neural Architecture Search), which searches for an efficient attention architecture on top of a pre-trained Transformer instead of training a new model from scratch[15][16]
- Model Construction Process:
  - Start from a pre-trained full-attention model and inherit its multi-layer perceptron (MLP) weights, which are kept frozen during the search
  - Use PostNAS to determine the optimal placement of full-attention layers by training a once-for-all super network (a minimal sketch of such a placement search appears at the end of this summary)
  - Evaluate existing linear attention modules, select Gated DeltaNet as the base, and design the JetBlock module with dynamic causal convolution kernels (see the second sketch at the end)
  - Run a hardware-aware architecture search to keep the final design efficient on real deployment hardware[16][17][19]
- Model Evaluation: The model delivers significant gains in both accuracy and efficiency, setting a new benchmark for linear attention design[20][22]

Model Backtest Results

Jet-Nemotron
- MMLU Accuracy: 49.6[19]
- Commonsense Reasoning Accuracy: 62.0[19]
- Generation Throughput: 47x that of Qwen3-1.7B-Base[19]
- KV Cache Size: reduced to 1/47 of the original[19]

Quantitative Factors and Construction Methods

Factor Name: RLCF (Reinforcement Learning from Checklist Feedback)
- Factor Construction Idea: Evaluate model responses against dynamically generated checklists, which provides a more effective alignment signal than traditional reward models[48][49]
- Factor Construction Process:
  - Define the core property of a checklist: every item must be a verifiable yes/no question
  - Generate checklists with both the direct method and the candidate-based method
  - Sample candidate response pairs from the base policy
  - Score each checklist item with AI judges and verification programs
  - Compute weighted average scores and keep only the response pairs whose scores differ significantly (see the third sketch at the end)
  - Train with direct preference optimization (DPO)[49][51][52]
- Factor Evaluation: The method yields consistent gains in instruction following across benchmarks and is particularly strong on "content" constraints[51][52]

Factor Backtest Results

RLCF
- IFEval Improvement: 2.8-3.0%[51]
- FollowBench Constraint Satisfaction Level: 8.2% improvement[51]
- InFoBench Overall Requirement Adherence Rate: 6.9% improvement[51]
- Hard Satisfaction Rate on Content Constraints: 6.4 percentage points above baseline[51]
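The PostNAS placement step above is described only at a high level. The sketch below shows one way such a search could look: a once-for-all super network in which each layer can route through either a full-attention or a linear-attention path, with candidate placements scored on a held-out set. The `set_placement` API, the loss-based scoring, and the exhaustive search over combinations are illustrative assumptions, not the released PostNAS implementation.

```python
import itertools
import torch

def evaluate_placement(supernet, placement, val_loader):
    """Score one candidate placement (the set of layer indices that keep
    full attention). Hypothetical helper: assumes the super network exposes
    `set_placement` to switch each layer between its full-attention and
    linear-attention path. Lower validation loss is better."""
    supernet.set_placement(placement)
    total_loss, batches = 0.0, 0
    with torch.no_grad():
        for input_ids, labels in val_loader:
            logits = supernet(input_ids)           # (batch, seq, vocab)
            loss = torch.nn.functional.cross_entropy(
                logits.flatten(0, 1), labels.flatten()
            )
            total_loss += loss.item()
            batches += 1
    return total_loss / max(batches, 1)

def search_full_attention_placement(supernet, num_layers, budget, val_loader):
    """Try every way of keeping `budget` full-attention layers out of
    `num_layers`; all other layers fall back to the linear-attention path
    trained once in the shared super network."""
    best_placement, best_loss = None, float("inf")
    for placement in itertools.combinations(range(num_layers), budget):
        loss = evaluate_placement(supernet, placement, val_loader)
        if loss < best_loss:
            best_placement, best_loss = placement, loss
    return best_placement
```

The key design point this illustrates is that the expensive pre-training happens once for the shared super network; each placement is then only evaluated, not retrained.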
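The JetBlock bullet mentions dynamic causal convolution kernels. The second sketch isolates that one ingredient: a depthwise causal 1-D convolution whose kernel is predicted from the input rather than stored as a fixed parameter. The class name, dimensions, and the mean-pooled kernel generator are assumptions for illustration; the actual JetBlock also contains the Gated DeltaNet linear-attention path, omitted here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicCausalConv(nn.Module):
    """Depthwise causal 1-D convolution with data-dependent kernels.
    A minimal sketch of the 'dynamic convolution kernel' idea; not the
    released JetBlock."""

    def __init__(self, dim: int, kernel_size: int = 4):
        super().__init__()
        self.dim = dim
        self.kernel_size = kernel_size
        # Predict one depthwise kernel per channel from a pooled
        # summary of the sequence (illustrative choice).
        self.kernel_gen = nn.Linear(dim, dim * kernel_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        b, t, d = x.shape
        # Per-sample depthwise kernels: (batch * dim, 1, kernel_size)
        kernels = self.kernel_gen(x.mean(dim=1)).view(b * d, 1, self.kernel_size)
        # Left-pad the time axis so the convolution is causal
        # (no position sees future tokens).
        x_pad = F.pad(x.transpose(1, 2), (self.kernel_size - 1, 0))
        x_pad = x_pad.reshape(1, b * d, t + self.kernel_size - 1)
        # Grouped conv applies each sample's own kernel to each channel.
        out = F.conv1d(x_pad, kernels, groups=b * d)
        return out.view(b, d, t).transpose(1, 2)
```

For example, `DynamicCausalConv(dim=256)(torch.randn(2, 16, 256))` returns a tensor of shape `(2, 16, 256)`, with each output position depending only on the current and previous `kernel_size - 1` positions.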
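Finally, the third sketch shows the RLCF scoring and filtering steps in plain Python: each checklist item is a weighted yes/no question scored by an AI judge or, where one exists, by a verification program; response pairs are kept as (chosen, rejected) DPO examples only when their weighted scores differ by a margin. The `judge` stub, the `margin` value, and all names are illustrative assumptions, not the paper's code.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

@dataclass
class ChecklistItem:
    question: str        # must be phrased as a verifiable yes/no question
    weight: float = 1.0  # importance weight for the weighted average
    # Optional verification program: (question, response) -> score in [0, 1].
    verifier: Optional[Callable[[str, str], float]] = None

def judge(question: str, response: str) -> float:
    """AI-judge stub returning a score in [0, 1] for one checklist item.
    Hypothetical: in practice this would call an LLM grader."""
    raise NotImplementedError

def score_response(response: str, checklist: List[ChecklistItem]) -> float:
    """Weighted average over checklist items; a verification program
    overrides the AI judge whenever one exists for an item."""
    total = weight_sum = 0.0
    for item in checklist:
        scorer = item.verifier or judge
        total += item.weight * scorer(item.question, response)
        weight_sum += item.weight
    return total / max(weight_sum, 1e-9)

def filter_pairs(
    pairs: List[Tuple[str, str]],
    checklist: List[ChecklistItem],
    margin: float = 0.2,  # illustrative threshold, not from the paper
) -> List[Tuple[str, str]]:
    """Keep only pairs whose checklist scores differ by at least `margin`,
    ordered as (chosen, rejected) examples for direct preference
    optimization."""
    kept = []
    for a, b in pairs:
        sa, sb = score_response(a, checklist), score_response(b, checklist)
        if abs(sa - sb) >= margin:
            kept.append((a, b) if sa > sb else (b, a))
    return kept
```

The filtering step is what makes the signal usable for DPO: pairs with near-identical checklist scores carry little preference information, so only clearly separated pairs enter training.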