Sequence-to-One: Sentiment Analysis with an LSTM
Learners will build an end-to-end sentiment classifier: tokenise reviews, embed words with nn.Embedding, run them through an LSTM, and classify the final hidden state.
Sequence-to-One Classification Explained
In sequence-to-one tasks, the model reads an entire sequence and produces a single output label. Sentiment analysis is the classic example: given a full movie review ('I loved every minute of this film'), predict a single label — positive or negative.
This contrasts with sequence-to-sequence tasks (such as translation, where the output is itself a sequence whose length may differ from the input) and one-to-sequence tasks (such as image captioning). In sequence-to-one, we discard the intermediate LSTM outputs and use only the final hidden state h_T, which in principle summarises the entire input sequence.
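The idea of keeping only the final hidden state can be sketched in PyTorch as follows. This is a minimal illustration, not a reference implementation: the class name SentimentLSTM, the layer sizes, and pad_idx=1 (which assumes '<pad>' is the second special token in the vocabulary) are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class SentimentLSTM(nn.Module):
    """Sequence-to-one classifier: embed tokens, run an LSTM, classify h_T."""

    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, pad_idx=1):
        super().__init__()
        # padding_idx keeps the '<pad>' embedding fixed at zero (assumed index 1)
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=pad_idx)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # A single logit is enough for a positive/negative decision
        self.classifier = nn.Linear(hidden_dim, 1)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)       # (batch, seq_len, embed_dim)
        # outputs holds the hidden state at every time step; it is ignored here
        outputs, (h_n, c_n) = self.lstm(embedded)  # h_n: (num_layers, batch, hidden_dim)
        final_hidden = h_n[-1]                     # h_T of the last layer: (batch, hidden_dim)
        return self.classifier(final_hidden).squeeze(1)  # (batch,)


# Toy usage: a batch of 4 random "reviews", each 12 token indices long
model = SentimentLSTM(vocab_size=1000)
batch = torch.randint(low=2, high=1000, size=(4, 12))
logits = model(batch)
print(logits.shape)  # torch.Size([4])
```

Applying a sigmoid to each logit gives the predicted probability of a positive review. Note that with padded batches the last time step may be a padding token, so h_n can be influenced by padding; torch.nn.utils.rnn.pack_padded_sequence is one common way to avoid this.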
Dataset: IMDB Movie Reviews
The IMDB dataset contains 50,000 movie reviews (25,000 train, 25,000 test) labelled as positive or negative. Each review is a variable-length string of text. Before feeding it to an LSTM, we need three preprocessing steps: tokenisation (split text into words), vocabulary building (map words to integer indices), and padding (make all sequences the same length).
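We also need to handle very long reviews. An LSTM processes tokens one at a time, so training time grows roughly linearly with sequence length and the time steps cannot be computed in parallel. Very long sequences also make it harder for the final hidden state to retain information from early tokens. Truncating reviews to 256 or 512 tokens is therefore common practice and typically costs little accuracy.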
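from torchtext.datasets import IMDB
from torchtext.data.utils import get_tokenizer
from torchtext.vocab import build_vocab_from_iterator

# Lowercases the text and splits it into word and punctuation tokens
tokenizer = get_tokenizer('basic_english')

def yield_tokens(data_iter):
    # Each IMDB example is a (label, text) pair; only the text is needed here
    for _, text in data_iter:
        yield tokenizer(text)

train_iter = IMDB(split='train')
# Special tokens come first, so '<unk>' gets index 0 and '<pad>' gets index 1
vocab = build_vocab_from_iterator(yield_tokens(train_iter),
                                  specials=['<unk>', '<pad>'])
# Words not seen during vocabulary building map to '<unk>'
vocab.set_default_index(vocab['<unk>'])
print('Vocabulary size:', len(vocab))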
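The snippet above covers tokenisation and vocabulary building but not the third step, padding, or the truncation of long reviews. A minimal sketch of both in plain Python follows. The helper name pad_or_truncate, the toy index lists, and PAD_IDX = 1 (the index '<pad>' would receive as the second special token) are assumptions made for illustration.

```python
def pad_or_truncate(token_ids, max_len, pad_idx):
    """Return a list of exactly max_len indices."""
    clipped = token_ids[:max_len]  # keep only the first max_len tokens
    # Right-pad short sequences with the padding index
    return clipped + [pad_idx] * (max_len - len(clipped))


PAD_IDX = 1  # assumed index of '<pad>' in the vocabulary

# Two already-indexed toy "reviews" of different lengths
encoded_reviews = [[5, 8, 2], [7, 3, 9, 4, 6, 2]]

batch = [pad_or_truncate(ids, max_len=4, pad_idx=PAD_IDX) for ids in encoded_reviews]
print(batch)  # [[5, 8, 2, 1], [7, 3, 9, 4]]
```

In a real pipeline max_len would be something like 256 or 512, and the resulting equal-length lists can be stacked into a single tensor for the LSTM.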