We are very excited to join forces with MLCommons and OctoML.ai! Contact Grigori Fursin for more details!

Conditional Variational Autoencoder for Neural Machine Translation

lib:6fae2f8499089e8f (v1.0.0)

Authors: Artidoro Pagnoni,Kevin Liu,Shangyan Li
ArXiv: 1812.04405
Document:  PDF  DOI 
Abstract URL: http://arxiv.org/abs/1812.04405v1

We explore the performance of latent variable models for conditional text generation in the context of neural machine translation (NMT). Similar to Zhang et al., we augment the encoder-decoder NMT paradigm by introducing a continuous latent variable to model features of the translation process. We extend this model with a co-attention mechanism motivated by Parikh et al. in the inference network. Compared to the vision domain, latent variable models for text face additional challenges due to the discrete nature of language, namely posterior collapse. We experiment with different approaches to mitigate this issue. We show that our conditional variational model improves upon both discriminative attention-based translation and the variational baseline presented in Zhang et al. Finally, we present some exploration of the learned latent space to illustrate what the latent variable is capable of capturing. This is the first reported conditional variational model for text that meaningfully utilizes the latent variable without weakening the translation model.

Relevant initiatives  

Related knowledge about this paper Reproduced results (crowd-benchmarking and competitions) Artifact and reproducibility checklists Common formats for research projects and shared artifacts Reproducibility initiatives


Please log in to add your comments!
If you notice any inapropriate content that should not be here, please report us as soon as possible and we will try to remove it within 48 hours!