Normalizing Flows

Normalizing flows are a technique for transforming complex data distributions into much simpler latent distributions through an invertible mapping. They allow exact log-likelihood evaluation and exact posterior inference (retrieving the unique latent \(z\) for each input \(x\)).

Change of Variables

We describe the flow as a transformation from a latent distribution \(p_{\theta}(z)\) to the data distribution \(p_{\theta}(x)\) using an invertible function \(f\) with \(f(z)=x\) and \(f^{-1}(x)=z\). The density of \(x\) then follows from the density of \(z\) through the change of variables formula:

\[ p_{\theta}(x)=p_{\theta}(f^{-1}(x))\left|\det\left(\frac{\partial f^{-1}(x)}{\partial x}\right)\right| \]
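As a quick sanity check of the formula, here is a minimal sketch for a one-dimensional affine flow \(x=f(z)=az+b\) with a standard normal latent. The function names and the values of `a`, `b`, and `x` are illustrative assumptions, not part of the original notes. In this special case the result should match the closed-form density of \(\mathcal{N}(b, a^{2})\).

```python
import math


def standard_normal_logpdf(z):
    """Log-density of the latent distribution, assumed here to be N(0, 1)."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def affine_flow_logpdf(x, a, b):
    """log p(x) for x = f(z) = a * z + b via the change of variables formula."""
    z = (x - b) / a                      # f^{-1}(x)
    log_abs_det = -math.log(abs(a))      # log |d f^{-1}(x) / dx| = log(1 / |a|)
    return standard_normal_logpdf(z) + log_abs_det


def normal_logpdf(x, mean, std):
    """Closed-form log-density of N(mean, std^2), used only for comparison."""
    return -0.5 * (((x - mean) / std) ** 2 + math.log(2.0 * math.pi)) - math.log(std)


# Illustrative values (assumptions for this sketch).
x, a, b = 1.3, 2.0, 0.5
flow_value = affine_flow_logpdf(x, a, b)
closed_form = normal_logpdf(x, mean=b, std=abs(a))
print(flow_value, closed_form)  # the two values should agree up to rounding
```

The affine map is chosen only because its inverse and derivative are trivial; any invertible, differentiable \(f\) could be substituted as long as both can be evaluated.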

To make this model expressive without an extremely complicated transformation function, we compose \(K\) simple functions, \(x=f_{\theta}(z)=f_{K}\circ\cdots\circ f_{2}\circ f_{1}(z)\), where each \(f_{i}\) has an easy-to-compute [[Jacobian Matrix]] determinant. Writing \(z_{0}=z\), \(z_{i}=f_{i}(z_{i-1})\), and \(z_{K}=x\), we can apply the change of variables formula once per layer:

\[ p_{\theta}(x)=p_{\theta}(f^{-1}(x))\prod_{i=1}^{K}\left|\det\left(\frac{\partial f_{i}^{-1}(z_{i})}{\partial z_{i}}\right)\right| \]

Taking the logarithm of both sides turns the product into a sum:

\[ \log p_{\theta}(x)=\log p_{\theta}(f^{-1}(x)) + \sum_{i=1}^{K}\log \left|\det\left(\frac{\partial f_{i}^{-1}(z_{i})}{\partial z_{i}}\right)\right| \]
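The log-space sum can be sketched for a small one-dimensional flow. The layer classes `Affine` and `Sinh`, their parameters, and the standard normal latent are assumptions chosen for illustration. Each layer returns its inverse together with the log absolute derivative of that inverse, and the flow accumulates these terms while walking from \(x\) back to \(z\).

```python
import math


def standard_normal_logpdf(z):
    """Log-density of the latent distribution, assumed here to be N(0, 1)."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


class Affine:
    """Hypothetical layer f(u) = scale * u + shift."""

    def __init__(self, scale, shift):
        self.scale, self.shift = scale, shift

    def inverse(self, y):
        # Returns f^{-1}(y) and log |d f^{-1}(y) / dy|.
        return (y - self.shift) / self.scale, -math.log(abs(self.scale))


class Sinh:
    """Hypothetical layer f(u) = sinh(u), invertible on the whole real line."""

    def inverse(self, y):
        # d/dy asinh(y) = 1 / sqrt(1 + y^2)
        return math.asinh(y), -0.5 * math.log1p(y * y)


def flow_inverse(x, layers):
    """Apply f_K^{-1}, ..., f_1^{-1} to x; return z and the summed log-dets."""
    z_i, sum_log_det = x, 0.0
    for layer in reversed(layers):
        z_i, log_det = layer.inverse(z_i)
        sum_log_det += log_det
    return z_i, sum_log_det


def flow_log_prob(x, layers):
    """log p(x) = log p(f^{-1}(x)) + sum of per-layer log |det| terms."""
    z, sum_log_det = flow_inverse(x, layers)
    return standard_normal_logpdf(z) + sum_log_det


# layers[0] is f_1 and layers[-1] is f_K (illustrative parameters).
layers = [Affine(1.5, -0.2), Sinh(), Affine(0.7, 1.0)]
x = 0.8
log_px = flow_log_prob(x, layers)
print(log_px)
```

Returning the log-determinant alongside the inverse keeps the bookkeeping local to each layer, so adding a layer does not require changing the accumulation loop.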

Triangular Jacobian

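By constraining each layer's Jacobian to be a triangular matrix, for example with coupling layers (see @dinhNICENonlinearIndependent2014), the determinant reduces to the product of the diagonal elements of the matrix. For \(D\)-dimensional data this costs \(O(D)\) instead of the \(O(D^{3})\) needed for a general determinant.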
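To illustrate, the following is a minimal two-dimensional sketch of an affine coupling layer. Note the swap: the cited NICE paper uses additive coupling, whose Jacobian determinant is exactly 1, so the affine variant is used here only to make the diagonal product non-trivial. The functions `scale_fn` and `shift_fn` are hypothetical stand-ins for what would normally be small neural networks.

```python
import math


def scale_fn(x1):
    """Hypothetical log-scale function s(x1); any function of x1 works."""
    return math.tanh(x1)


def shift_fn(x1):
    """Hypothetical shift function t(x1); it need not be invertible."""
    return x1 * x1


def coupling_forward(x1, x2):
    """y1 = x1, y2 = x2 * exp(s(x1)) + t(x1).

    The Jacobian is [[1, 0], [dy2/dx1, exp(s(x1))]], which is lower
    triangular, so log |det J| = log(1 * exp(s(x1))) = s(x1).
    """
    s = scale_fn(x1)
    return x1, x2 * math.exp(s) + shift_fn(x1), s


def coupling_inverse(y1, y2):
    """Invert the layer without ever inverting scale_fn or shift_fn."""
    s = scale_fn(y1)
    return y1, (y2 - shift_fn(y1)) * math.exp(-s)


def numerical_jacobian(x1, x2, h=1e-6):
    """Central-difference 2x2 Jacobian, used only to check the claim."""
    a1, a2, _ = coupling_forward(x1 + h, x2)
    b1, b2, _ = coupling_forward(x1 - h, x2)
    c1, c2, _ = coupling_forward(x1, x2 + h)
    d1, d2, _ = coupling_forward(x1, x2 - h)
    return [
        [(a1 - b1) / (2 * h), (c1 - d1) / (2 * h)],
        [(a2 - b2) / (2 * h), (c2 - d2) / (2 * h)],
    ]


y1, y2, log_det = coupling_forward(0.4, -1.2)
jac = numerical_jacobian(0.4, -1.2)
full_det = jac[0][0] * jac[1][1] - jac[0][1] * jac[1][0]
diag_product = jac[0][0] * jac[1][1]
print(full_det, diag_product, math.exp(log_det))  # should agree up to rounding
```

Because the upper-right entry of the Jacobian is zero, the off-diagonal term \(\partial y_{2}/\partial x_{1}\) never enters the determinant, which is why `scale_fn` and `shift_fn` can be arbitrarily complex without making the log-determinant more expensive.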