Seq2seq model

A neural model particularly suited to learning mappings between sequences, such as language translation. Its architecture is similar to that of an Autoencoder (AE), but it operates on sequences of inputs and outputs, as sketched below the list.

  • Encoder: Transforms the input sequence into a latent context vector.
  • Decoder: Transforms the context vector into the target sequence.
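
A minimal sketch of this encoder-decoder structure, assuming PyTorch; the `Seq2Seq` class name, the choice of GRU cells, and the hidden size are illustrative assumptions, not a reference implementation.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Illustrative encoder-decoder over token sequences."""

    def __init__(self, vocab_size: int, hidden_size: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        # Encoder: compresses the input sequence into a context vector
        # (here, the GRU's final hidden state).
        self.encoder = nn.GRU(hidden_size, hidden_size, batch_first=True)
        # Decoder: unrolls the target sequence conditioned on that context,
        # which is passed in as its initial hidden state.
        self.decoder = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        # src, tgt: (batch, seq_len) integer token ids
        _, context = self.encoder(self.embed(src))   # context: (1, batch, hidden)
        dec_out, _ = self.decoder(self.embed(tgt), context)
        return self.out(dec_out)                     # (batch, tgt_len, vocab)
```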

The two networks are trained jointly using gradient descent. At inference time, the decoder predicts each output token sequentially, conditioned on the last output it generated and the encoder's context vector. During training, the decoder instead receives the context vector together with the actual target tokens as inputs (teacher forcing), rather than its own predictions.
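
A sketch of the two regimes, assuming the `Seq2Seq` class from the block above: teacher forcing during training (the decoder sees the true previous tokens) and greedy autoregressive decoding at inference (the decoder feeds back its own predictions). The function names and special-token ids (`pad_id`, `bos_id`, `eos_id`) are hypothetical.

```python
import torch
import torch.nn.functional as F

def train_step(model, src, tgt, optimizer, pad_id=0):
    # Teacher forcing: the decoder input is the target shifted right
    # (tgt[:, :-1]), and the labels are the next tokens (tgt[:, 1:]).
    logits = model(src, tgt[:, :-1])
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        tgt[:, 1:].reshape(-1),
        ignore_index=pad_id,   # don't penalize padding positions
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def greedy_decode(model, src, bos_id, eos_id, max_len=50):
    # Encode once, then feed each predicted token back as the next input.
    _, hidden = model.encoder(model.embed(src))
    token = torch.full((src.size(0), 1), bos_id, dtype=torch.long)
    outputs = []
    for _ in range(max_len):
        dec_out, hidden = model.decoder(model.embed(token), hidden)
        token = model.out(dec_out).argmax(dim=-1)  # most likely next token
        outputs.append(token)
        if (token == eos_id).all():                # stop once every sequence ends
            break
    return torch.cat(outputs, dim=1)
```

In practice, beam search is often used instead of greedy decoding, but the feedback loop is the same: each step conditions on previously generated tokens and the encoder context.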