Machine Learning Academy · Lesson

Sequence-to-One: Sentiment Analysis with an LSTM

Learners will build an end-to-end sentiment classifier: tokenise reviews, embed words with nn.Embedding, run the embeddings through an LSTM, and classify from the final hidden state.

Sequence-to-One Classification Explained

In sequence-to-one tasks, the model reads an entire sequence and produces a single output label. Sentiment analysis is the classic example: given a full movie review ('I loved every minute of this film'), predict a single label — positive or negative.

This contrasts with sequence-to-sequence tasks (such as translation, where the model emits an output sequence whose length may differ from the input's) and one-to-sequence tasks (such as image captioning, where a single input produces a sequence). In sequence-to-one, we discard the intermediate LSTM outputs and use only the final hidden state h_T, which in principle summarises the entire input sequence.
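To make the "use only the final hidden state" idea concrete, here is a minimal sketch of a sequence-to-one model in PyTorch. The class name SentimentLSTM, the layer sizes, and pad_idx=1 are illustrative assumptions, not values given in the lesson; the random batch stands in for real tokenised reviews.

```python
import torch
import torch.nn as nn


class SentimentLSTM(nn.Module):
    """Hypothetical sequence-to-one classifier: embed, run an LSTM, classify h_T."""

    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, pad_idx=1):
        super().__init__()
        # padding_idx keeps the <pad> embedding fixed at zero (assumed index 1)
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=pad_idx)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # One output logit: positive vs. negative
        self.classifier = nn.Linear(hidden_dim, 1)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)        # (batch, seq_len, embed_dim)
        outputs, (h_n, c_n) = self.lstm(embedded)   # h_n: (num_layers, batch, hidden_dim)
        final_hidden = h_n[-1]                      # last layer's h_T: (batch, hidden_dim)
        # The per-step outputs are ignored; only h_T feeds the classifier
        return self.classifier(final_hidden).squeeze(1)  # (batch,)


model = SentimentLSTM(vocab_size=1000)
batch = torch.randint(0, 1000, (4, 20))   # 4 fake reviews, 20 token ids each
logits = model(batch)
print(logits.shape)
```

One caveat with this sketch: if a batch is right-padded, h_T is the state after the padding tokens rather than after the last real word. Packing the batch with torch.nn.utils.rnn.pack_padded_sequence is one common way to address that.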

Dataset: IMDB Movie Reviews

The IMDB dataset contains 50,000 movie reviews (25,000 train, 25,000 test) labelled as positive or negative. Each review is a variable-length string of text. Before feeding it to an LSTM, we need three preprocessing steps: tokenisation (split text into words), vocabulary building (map words to integer indices), and padding (make all sequences the same length).

We also need to handle very long reviews. An LSTM processes tokens one step at a time, so training time grows roughly linearly with sequence length and the steps cannot be parallelised across time. Truncating reviews to 256 or 512 tokens is therefore common practice and usually costs little accuracy.

from torchtext.datasets import IMDB
from torchtext.data.utils import get_tokenizer
from torchtext.vocab import build_vocab_from_iterator

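# Lowercases the text and separates punctuation before splitting on whitespace
tokenizer = get_tokenizer('basic_english')

def yield_tokens(data_iter):
    # Each IMDB item is a (label, text) pair; only the text is needed here
    for _, text in data_iter:
        yield tokenizer(text)

# Build the vocabulary from the training split only
train_iter = IMDB(split='train')
vocab = build_vocab_from_iterator(yield_tokens(train_iter),
                                   specials=['<unk>', '<pad>'])
# Map out-of-vocabulary words to <unk> instead of raising an error
vocab.set_default_index(vocab['<unk>'])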
print('Vocabulary size:', len(vocab))
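
The remaining preprocessing steps, truncation and padding, can be sketched in plain Python. The helper names encode and truncate_and_pad, the toy stoi dictionary, and max_len=6 are assumptions made for illustration; the indices 0 for <unk> and 1 for <pad> follow the order in which the specials are listed when the vocabulary is built.

```python
def encode(tokens, stoi, unk_idx=0):
    """Map tokens to integer ids, sending unknown words to <unk>."""
    return [stoi.get(tok, unk_idx) for tok in tokens]


def truncate_and_pad(ids, max_len, pad_idx=1):
    """Cut a sequence to max_len, then right-pad it with <pad> up to max_len."""
    ids = ids[:max_len]
    return ids + [pad_idx] * (max_len - len(ids))


# Toy vocabulary standing in for the real one (hypothetical)
stoi = {'<unk>': 0, '<pad>': 1, 'i': 2, 'loved': 3, 'this': 4, 'film': 5}

short_review = encode('i loved this film'.split(), stoi)
long_review = encode('i loved this film i loved this masterpiece'.split(), stoi)

# Every sequence in the batch ends up with exactly max_len ids
batch = [truncate_and_pad(ids, max_len=6) for ids in (short_review, long_review)]
print(batch)  # [[2, 3, 4, 5, 1, 1], [2, 3, 4, 5, 2, 3]]
```

Truncating from the end, as here, keeps the opening of each review; keeping the last max_len tokens instead is an equally simple variant, and which works better depends on the data.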
