Sequence to sequence learning

In September 2014, Ilya Sutskever, Oriol Vinyals, and Quoc Le of Google published “Sequence to Sequence Learning with Neural Networks” on arXiv (1409.3215). The paper introduced an end-to-end approach in which one LSTM network (the encoder) reads an input sequence into a fixed-length vector, and a second LSTM (the decoder) generates the output sequence from it.
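The encoder-decoder data flow described above can be illustrated with a toy sketch. This is not the paper's model: a plain tanh recurrence stands in for the LSTM cell, the weights are random and untrained, and the names (`HIDDEN`, `VOCAB`, `encode`, `decode`), the sizes, and the use of token 0 as both start and end-of-sequence symbol are assumptions made for illustration. It shows only that inputs of any length are compressed into one fixed-length vector, from which the decoder emits tokens one at a time.

```python
import math
import random

HIDDEN = 4   # size of the fixed-length vector (hypothetical)
VOCAB = 6    # toy vocabulary size (hypothetical)
EOS = 0      # token 0 doubles as start and end-of-sequence symbol (assumption)

rng = random.Random(0)  # fixed seed so the untrained weights are reproducible


def rand_matrix(rows, cols):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def one_hot(token):
    return [1.0 if i == token else 0.0 for i in range(VOCAB)]


def rnn_step(w_in, w_rec, token, state):
    """One recurrent step. A plain tanh cell stands in for the LSTM cell."""
    from_input = matvec(w_in, one_hot(token))
    from_state = matvec(w_rec, state)
    return [math.tanh(a + b) for a, b in zip(from_input, from_state)]


# Separate parameters for the encoder and the decoder.
enc_in, enc_rec = rand_matrix(HIDDEN, VOCAB), rand_matrix(HIDDEN, HIDDEN)
dec_in, dec_rec = rand_matrix(HIDDEN, VOCAB), rand_matrix(HIDDEN, HIDDEN)
dec_out = rand_matrix(VOCAB, HIDDEN)


def encode(tokens):
    """Read the whole input sequence and return the final hidden state."""
    state = [0.0] * HIDDEN
    for token in tokens:
        state = rnn_step(enc_in, enc_rec, token, state)
    return state


def decode(context, max_len=10):
    """Greedily emit tokens, starting from the encoder's vector."""
    state, token, output = context, EOS, []
    for _ in range(max_len):
        state = rnn_step(dec_in, dec_rec, token, state)
        scores = matvec(dec_out, state)
        token = max(range(VOCAB), key=lambda i: scores[i])
        if token == EOS:
            break
        output.append(token)
    return output


short_vector = encode([1, 2, 3])
long_vector = encode([1, 2, 3, 4, 5, 2, 1])
generated = decode(short_vector)
```

Both `short_vector` and `long_vector` have length `HIDDEN` regardless of input length, which is the bottleneck the fixed-length representation imposes. With untrained weights the generated tokens are meaningless; a real system would train both networks jointly on sentence pairs.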

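Applied to English-to-French machine translation on the WMT'14 task, an ensemble of LSTMs reached a BLEU score of 34.8, surpassing the 33.3 of a strong phrase-based baseline, and the score rose to 36.5 when the LSTM was used to rerank the baseline's 1000-best candidate translations. The authors also found that reversing the order of words in the source sentence, but not the target, markedly improved performance: it places the first source words close to the first target words, introducing many short-term dependencies that make the optimization problem easier.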
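The effect of source reversal on dependency length can be shown with a small calculation. The sketch below assumes a toy setting that the paper does not require: source and target have equal length and align word for word in order. The helper names `reverse_source` and `alignment_gaps` are hypothetical.

```python
def reverse_source(src_tokens):
    """Reverse the source sentence; the target is left untouched."""
    return src_tokens[::-1]


def alignment_gaps(n, reversed_source):
    """Time steps between source word i and target word i.

    The model sees the n source words followed by the n target words.
    Assumes equal lengths and a monotonic word-for-word alignment
    (a toy assumption made for illustration).
    """
    gaps = []
    for i in range(n):
        src_pos = (n - 1 - i) if reversed_source else i
        tgt_pos = n + i
        gaps.append(tgt_pos - src_pos)
    return gaps


reversed_tokens = reverse_source(["a", "b", "c"])
plain_gaps = alignment_gaps(3, reversed_source=False)
reversed_gaps = alignment_gaps(3, reversed_source=True)
```

Under these assumptions the average gap is the same in both cases, but with reversal the smallest gap drops from the sentence length to 1, so the first target words sit immediately after the source words they depend on.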

The sequence-to-sequence framework became foundational for translation, summarization, and dialogue, and its encoder-decoder structure directly influenced the attention mechanisms and transformer architectures that followed.
