SmODE (Smooth Ordinary Differential Equations)
ICLR 2025 | SmODE: An Ordinary Differential Equation Neural Network for Generating Smooth Control Actions
自动驾驶之心· 2025-09-01 23:32
Core Viewpoint - The research team led by Professor Li Shengbo from Tsinghua University has developed a novel smoothing neural network called SmODE, which utilizes ordinary differential equations (ODE) to enhance the smoothness of control actions in reinforcement learning tasks, thereby improving the usability and safety of intelligent systems [4][23].

Background - Deep Reinforcement Learning (DRL) has proven effective in solving optimal control problems in various applications, including drone control and autonomous driving. However, the smoothness of control actions remains a significant challenge due to high-frequency noise and unregulated Lipschitz constants in neural networks [5][19].

Key Technologies of SmODE
- **Smoothing ODE Design**: The team designed a smoothing neuron structure based on ODEs that can adaptively filter high-frequency noise while controlling the Lipschitz constant, thus enhancing the performance of control systems [8][9].
- **Smoothing Network Structure**: SmODE is structured to be integrated into various reinforcement learning frameworks, featuring an input module, a smoothing ODE module, and an output module, which can be adjusted based on task complexity [14][16].
- **Reinforcement Learning Algorithm Based on SmODE**: SmODE can be easily combined with existing deep reinforcement learning algorithms, requiring additional loss terms to regulate the time constant and Lipschitz constant during training [16][17].

Experimental Results
- In experiments with Gaussian noise variance set at 0.05, SmODE demonstrated significantly lower action volatility compared to traditional MLP networks, enhancing vehicle comfort and safety during tasks such as sine curve tracking and lane changing [19][21].
- In the MuJoCo benchmark tests, SmODE outperformed other networks (LTC, LipsNet, and MLP) in terms of average action smoothness across various tasks, indicating its effectiveness in real-world applications [21][22].

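
The role a time constant plays in filtering high-frequency noise can be illustrated with a minimal Python sketch. It is not the SmODE neuron itself: it assumes a generic first-order low-pass ODE, dx/dt = (u - x) / tau, integrated with forward Euler, and it assumes "action fluctuation" means the mean absolute change between consecutive actions (the summary does not define the metric). The function names, the step size `dt`, and the value of `tau` are hypothetical; only the noise variance of 0.05 and the sine-shaped reference come from the experiments described above.

```python
import math
import random


def action_fluctuation(actions):
    """Mean absolute change between consecutive actions (lower is smoother).

    Assumed metric for illustration; the source does not define one.
    """
    diffs = [abs(b - a) for a, b in zip(actions, actions[1:])]
    return sum(diffs) / len(diffs)


def low_pass_ode(inputs, tau, dt):
    """Forward-Euler integration of dx/dt = (u - x) / tau.

    A generic first-order low-pass filter, not the SmODE neuron.
    The state starts at the first input sample.
    """
    x = inputs[0]
    outputs = [x]
    for u in inputs[1:]:
        x = x + dt * (u - x) / tau  # larger tau -> slower, smoother response
        outputs.append(x)
    return outputs


def demo(seed=0, steps=1000, dt=0.01, tau=0.1, noise_variance=0.05):
    """Compare fluctuation of a noisy sine signal before and after filtering."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_variance)  # standard deviation from variance
    noisy = [math.sin(k * dt) + rng.gauss(0.0, sigma) for k in range(steps)]
    smooth = low_pass_ode(noisy, tau, dt)
    return action_fluctuation(noisy), action_fluctuation(smooth)


if __name__ == "__main__":
    raw, filtered = demo()
    print(f"raw fluctuation: {raw:.4f}, filtered fluctuation: {filtered:.4f}")
```

In this sketch a larger `tau` suppresses more noise but makes the output lag the input further. A fixed filter cannot resolve that trade-off, which is presumably why the summary describes SmODE as regulating the time constant through a loss term during training.
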
Conclusion - The SmODE network effectively addresses the oscillation issues in action outputs within deep reinforcement learning, providing a new approach to enhance the performance and stability of intelligent systems in real-world applications [23].