
Diffusion/VAE/RL Mathematical Principles
自动驾驶之心 · 2025-07-29 00:52
Core Viewpoint
- The article discusses the mathematical foundations and training methodologies of Diffusion Models, Variational Autoencoders (VAE), and Reinforcement Learning (RL) in the context of machine learning.

Group 1: Diffusion Models
- During denoising, the network's training objective is to fit the mean and variance of two Gaussian distributions [7]
- The KL divergence term is what aligns the theoretical (posterior) values with the network's predicted values in the denoising process [9]
- The uncertainty in the clean sample \(x_0\) is recast as uncertainty in the injected noise \(\epsilon\), which the network predicts iteratively (see the noise-prediction sketch below) [15]

Group 2: Variational Autoencoders (VAE)
- VAE assumes that the latent distribution is Gaussian, which is essential for its generative capabilities [19]
- VAE training reduces to a combination of a reconstruction loss and a KL-divergence constraint loss that keeps the latent space from degenerating into a sharp, near-deterministic distribution [26]
- Minimizing the KL loss (between the approximate posterior and the true posterior) corresponds to maximizing the Evidence Lower Bound (ELBO) (see the VAE loss sketch below) [27]

Group 3: Reinforcement Learning (RL)
- The problem is framed as a Markov Decision Process (MDP): a sequence of states and actions in which the next state depends only on the current state and action [35]
- The semantic representation is pushed toward an impulse (sharply peaked) distribution, while the generated representation is expected to follow a Gaussian distribution [36]
- Policy gradient methods are employed so the network learns the optimal action to take in a given state (see the policy-gradient sketch below) [42]
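
To make the Group 1 noise-prediction objective concrete, here is a minimal sketch of a DDPM-style training step in PyTorch. The linear beta schedule, the `T = 1000` steps, and the `model(x_t, t)` interface are illustrative assumptions, not details taken from the article; with fixed variances, the per-step KL between the two Gaussians reduces to a mean-squared error on the predicted noise.

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)              # assumed linear noise schedule
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)  # \bar{alpha}_t

def ddpm_loss(model, x0):
    """One DDPM training step: predict the noise added to x0 at a random timestep."""
    b = x0.shape[0]
    t = torch.randint(0, T, (b,), device=x0.device)
    eps = torch.randn_like(x0)                       # the noise the network must recover
    a_bar = alphas_cumprod.to(x0.device)[t].view(b, *([1] * (x0.dim() - 1)))
    # forward process: x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps
    eps_pred = model(x_t, t)                         # network predicts eps
    return torch.nn.functional.mse_loss(eps_pred, eps)
```

At sampling time the same \(\epsilon\)-prediction is applied step by step from \(t = T\) down to \(0\), which is the iterative denoising the Group 1 bullets refer to.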
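
A similarly minimal sketch of the Group 2 VAE objective, assuming a diagonal-Gaussian posterior, a standard-normal prior, and a Gaussian (MSE) decoder; `encoder` and `decoder` are hypothetical modules, not the article's implementation. Since \(\log p(x) = \mathrm{ELBO} + \mathrm{KL}(q(z|x)\,\|\,p(z|x))\), minimizing the loss below (the negative ELBO) is equivalent to maximizing the ELBO.

```python
import torch
import torch.nn.functional as F

def vae_loss(encoder, decoder, x):
    """Negative ELBO = reconstruction loss + KL(q(z|x) || N(0, I))."""
    mu, logvar = encoder(x)                                    # q(z|x) = N(mu, diag(exp(logvar)))
    z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)    # reparameterization trick
    x_hat = decoder(z)
    recon = F.mse_loss(x_hat, x, reduction="sum")              # reconstruction term (Gaussian decoder assumption)
    # closed-form KL between N(mu, sigma^2) and N(0, 1), summed over latent dimensions
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```

The KL term is the constraint mentioned above: without it the encoder could collapse every input onto a nearly deterministic latent code, and the latent space would lose its generative structure.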
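
For the Group 3 policy-gradient bullet, below is a bare REINFORCE sketch: sample a trajectory from the current policy, compute discounted returns, and increase the log-probability of actions in proportion to their returns. `policy` and `env` (with a minimal `reset()`/`step()` interface returning `state, reward, done`) are hypothetical stand-ins, not objects from the article.

```python
import torch

def reinforce_episode(policy, env, optimizer, gamma=0.99):
    """Run one episode and update the policy so high-return actions become more likely."""
    log_probs, rewards = [], []
    state, done = env.reset(), False
    while not done:
        logits = policy(torch.as_tensor(state, dtype=torch.float32))
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()                     # sample an action from the current policy
        log_probs.append(dist.log_prob(action))
        state, reward, done = env.step(action.item())   # hypothetical minimal MDP interface
        rewards.append(reward)
    # discounted return G_t for every step of the episode
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.insert(0, g)
    returns = torch.tensor(returns)
    # policy-gradient loss: -sum_t log pi(a_t | s_t) * G_t
    loss = -(torch.stack(log_probs) * returns).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return sum(rewards)
```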