Direct Optimization through Argmax for Discrete Variational Auto-Encoder
Reparameterization of variational auto-encoders with continuous random variables is an effective method for reducing the variance of their gradient estimates. In this work we reparameterize discrete variational auto-encoders using the Gumbel-Max perturbation model, which represents the Gibbs distribution as the $\arg\max$ of a randomly perturbed encoder. We then apply the direct loss minimization technique to propagate gradients through the reparameterized $\arg\max$. The resulting gradient is estimated by the difference of the encoder gradients evaluated at two $\arg\max$ predictions.
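The two ingredients named in the abstract can be illustrated with a minimal sketch on a toy problem. Everything below is an assumption made for illustration and is not taken from the paper's implementation: the encoder is reduced to a plain vector of logits (so the gradient of the encoder score at a category is a one-hot vector), `f` is a hypothetical per-category decoder score, and the helper names `sample_gumbel`, `gumbel_max_sample`, and `direct_gradient` are invented here. The first prediction is the Gumbel-Max sample, and the second is the $\arg\max$ after adding `eps * f` to the same perturbed logits. For finite `eps` the estimate is biased.

```python
import math
import random


def sample_gumbel(rng):
    # Standard Gumbel(0, 1) noise via inverse transform sampling.
    u = max(rng.random(), 1e-300)  # guard against log(0)
    return -math.log(-math.log(u))


def argmax(values):
    # Index of the largest entry.
    return max(range(len(values)), key=values.__getitem__)


def softmax(logits):
    # Gibbs distribution over categories induced by the logits.
    m = max(logits)
    exps = [math.exp(h - m) for h in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_max_sample(logits, rng):
    # Gumbel-Max: the argmax of Gumbel-perturbed logits is a softmax sample.
    return argmax([h + sample_gumbel(rng) for h in logits])


def direct_gradient(logits, f, eps, n, rng):
    # Estimates d/d(logits) of E[f(z)], z ~ softmax(logits), as the
    # difference of one-hot encoder gradients at two argmax predictions
    # that share the same Gumbel noise, divided by eps.
    k = len(logits)
    grad = [0.0] * k
    for _ in range(n):
        gamma = [sample_gumbel(rng) for _ in range(k)]
        z_opt = argmax([h + g for h, g in zip(logits, gamma)])
        z_dir = argmax([eps * fz + h + g
                        for fz, h, g in zip(f, logits, gamma)])
        grad[z_dir] += 1.0  # one-hot gradient at the f-shifted argmax
        grad[z_opt] -= 1.0  # one-hot gradient at the plain argmax
    return [g / (eps * n) for g in grad]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [1.0, 0.0, -1.0]  # toy encoder outputs (assumed)
    f = [0.5, -1.0, 2.0]       # toy decoder scores (assumed)
    p = softmax(logits)
    mean_f = sum(pi * fi for pi, fi in zip(p, f))
    print("exact :", [pi * (fi - mean_f) for pi, fi in zip(p, f)])
    print("direct:", direct_gradient(logits, f, 0.1, 200000, rng))
```

In this toy setting the exact gradient is available in closed form, `p[j] * (f[j] - sum_i p[i] * f[i])`, so the estimate can be compared against it. Smaller `eps` reduces the bias but increases the variance, because the two predictions then differ less often.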