Machine Learning Academy · Lesson

Projecting Data and Reconstructing from Components

Learners will transform a dataset to the principal-component space, visualise the 2D projection, and reconstruct the original features to quantify information loss.

Projection: From High-D to Low-D

After PCA finds the principal components, projection transforms each data point into the new component space. The projected coordinates are called scores. If you keep only 2 of the components found from 64 original features, each 64-dimensional point becomes a 2-dimensional score vector. This is achieved by subtracting the per-feature mean from the data and multiplying the centred data matrix by the matrix whose columns are the retained eigenvectors (often called the loadings matrix).
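The matrix multiplication described above can be checked by hand. The following is a minimal sketch, assuming the digits dataset and 2 components as an illustration; the variable names scores_manual and scores_sklearn are chosen here and are not part of any API. It relies on the fitted attributes pca.mean_ and pca.components_, and on the default setting whiten=False.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)  # shape (1797, 64)

# Learn the per-feature mean and the top 2 eigenvectors
pca = PCA(n_components=2).fit(X)

# Centre the data with the mean learned during fit
X_centred = X - pca.mean_

# components_ has shape (n_components, n_features), one eigenvector per row.
# Transposing puts one eigenvector in each column before multiplying.
scores_manual = X_centred @ pca.components_.T  # shape (1797, 2)

# The library projection should agree up to floating-point error
scores_sklearn = pca.transform(X)
matches = np.allclose(scores_manual, scores_sklearn)
print(scores_manual.shape, matches)
```

Note that scikit-learn stores the eigenvectors as rows, which is why the transpose is needed to match the column convention used in the text.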

The transform Method in sklearn

In scikit-learn, pca.fit(X) learns the feature means and the components from X, and pca.transform(X) centres the data with those means and projects it onto the components. The convenience method pca.fit_transform(X) does both in one call. The result is a matrix of shape (n_samples, n_components), in which each row is a point in the reduced space.

from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_digits

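X, y = load_digits(return_X_y=True)  # 1797 samples x 64 pixel features
# Standardise each feature to zero mean and unit variance before PCA
X_scaled = StandardScaler().fit_transform(X)

# Keep the first 10 principal components
pca = PCA(n_components=10)
X_reduced = pca.fit_transform(X_scaled)  # fit and project in one call

print('Original shape:', X_scaled.shape)  # (1797, 64)
print('Reduced shape:', X_reduced.shape)  # (1797, 10)
# Fraction of the total variance captured by the 10 kept components
print('Variance retained:', pca.explained_variance_ratio_.sum().round(4))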
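
The lesson objective also mentions reconstructing the original features to quantify information loss. One way to do this in scikit-learn is pca.inverse_transform, which maps scores back to the original feature space. The sketch below is illustrative only: the choice of 2, 10 and 64 components and the use of mean squared error as the loss measure are assumptions made here, not something the lesson prescribes.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_digits(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

errors = {}  # n_components -> mean squared reconstruction error
for k in (2, 10, 64):
    pca = PCA(n_components=k)
    scores = pca.fit_transform(X_scaled)    # project: (1797, k)
    X_back = pca.inverse_transform(scores)  # reconstruct: (1797, 64)
    # Average squared difference over all samples and features
    errors[k] = np.mean((X_scaled - X_back) ** 2)
    print(k, X_back.shape, errors[k])
```

The error should shrink as more components are kept. With all 64 components nothing is discarded, so the reconstruction should match the scaled data up to floating-point error.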
