Convolution and Filters: Detecting Edges and Patterns
Learners will apply a hand-crafted edge-detection kernel to an image, then let nn.Conv2d learn filters automatically and inspect their weights after training.
What Is a Convolution Operation?
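Convolution slides a small matrix called a filter (or kernel) across an input (e.g., an image). At each position it multiplies the filter element-wise with the patch beneath it and sums the products, which is a dot product between the filter and that patch. The result is a feature map whose values are large where the input resembles the pattern encoded by the filter. (Strictly speaking, deep learning libraries such as PyTorch compute cross-correlation, i.e. the filter is not flipped, but the operation is conventionally called convolution.) In image processing, different filters detect different features: edges, corners, and textures, or higher-level concepts when convolutional layers are stacked.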
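To make the sliding-window arithmetic concrete, here is a minimal sketch that spells out the loop by hand and compares it with PyTorch's built-in routine. The helper name conv2d_naive, the 4x4 ramp image, and the Prewitt-style kernel are illustrative choices, not part of the lesson's own code; the sketch assumes stride 1 and no padding.

```python
import torch
import torch.nn.functional as F

def conv2d_naive(image: torch.Tensor, kernel: torch.Tensor) -> torch.Tensor:
    """Slide `kernel` over a 2D `image` (stride 1, no padding).

    Hypothetical helper for illustration only.
    """
    H, W = image.shape
    kH, kW = kernel.shape
    # Output shrinks by (kernel size - 1) along each dimension
    out = torch.zeros(H - kH + 1, W - kW + 1)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kH, j:j + kW]
            # Element-wise multiply, then sum: a dot product of patch and kernel
            out[i, j] = (patch * kernel).sum()
    return out

# Ramp image: each value is 1 larger than its left neighbour
image = torch.arange(16.).reshape(4, 4)
# Prewitt-style kernel: right column minus left column
kernel = torch.tensor([[-1., 0., 1.],
                       [-1., 0., 1.],
                       [-1., 0., 1.]])

naive = conv2d_naive(image, kernel)
# F.conv2d expects (batch, channels, H, W) inputs and (out, in, kH, kW) weights
reference = F.conv2d(image[None, None], kernel[None, None])[0, 0]

print(naive)                              # tensor([[6., 6.], [6., 6.]])
print(torch.allclose(naive, reference))   # True
```

Every output value is 6 because, in each of the three rows of a patch, the right column is 2 brighter than the left column. The explicit loop is far slower than F.conv2d and is only meant to show what the library call computes.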
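import torch
import torch.nn.functional as F
# 2D convolution with a single hand-crafted filter
image = torch.tensor([
[1., 0., 1., 0.],
[0., 1., 0., 1.],
[1., 0., 1., 0.],
[0., 1., 0., 1.]
]).unsqueeze(0).unsqueeze(0) # shape: (1, 1, 4, 4) = (batch, channels, height, width)
# Vertical edge detector (Prewitt-style): responds to left-to-right brightness changes
filter_ = torch.tensor([[[-1., 0., 1.],
[-1., 0., 1.],
[-1., 0., 1.]]]).unsqueeze(0) # shape: (1, 1, 3, 3) = (out_channels, in_channels, kH, kW)
# No padding, stride 1: output size is 4 - 3 + 1 = 2 along each spatial dimension
feature_map = F.conv2d(image, filter_, padding=0)
print(feature_map.shape) # torch.Size([1, 1, 2, 2])
Filters and Edge Detection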
Classic image processing uses hand-crafted filters: the Sobel filter detects horizontal or vertical edges, depending on its orientation; the Laplacian filter responds to edges in every direction; the Gaussian filter blurs (smooths) images. In deep learning, the filter weights are learned from data rather than hand-crafted. The key insight of CNNs is that the network automatically discovers which filters are useful for the task by minimising the training loss.
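As a minimal sketch of filters being learned rather than hand-crafted, the toy setup below trains a single nn.Conv2d layer to reproduce the output of a Sobel kernel and then inspects its weights. The random-noise training images, the learning rate, the batch size, and the number of steps are all assumptions chosen for illustration; a real CNN learns its filters from a task loss such as classification, not from a known target kernel.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # make the toy run repeatable

# Target behaviour: the hand-crafted horizontal-edge Sobel kernel
sobel_h = torch.tensor([[-1., -2., -1.],
                        [ 0.,  0.,  0.],
                        [ 1.,  2.,  1.]]).reshape(1, 1, 3, 3)

# One learnable 3x3 filter, randomly initialised, no bias term
conv = nn.Conv2d(1, 1, kernel_size=3, bias=False)
optimizer = torch.optim.SGD(conv.parameters(), lr=0.1)

for step in range(200):
    images = torch.randn(32, 1, 8, 8)        # assumed toy data: random noise
    target = F.conv2d(images, sobel_h)       # what the Sobel filter would output
    loss = F.mse_loss(conv(images), target)  # how far the learned filter is off
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Inspect the learned weights: they should be close to the Sobel kernel
learned = conv.weight.detach()
print(learned.squeeze())
print('Final loss:', loss.item())
```

Because the target here is produced by a single 3x3 kernel, gradient descent can recover it almost exactly. In a real network the learned filters have no such reference to match, which is why inspecting conv.weight after training is informative.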
import torch
import torch.nn.functional as F
# Sobel horizontal edge detector
sobel_h = torch.tensor([[
[-1., -2., -1.],
[ 0., 0., 0.],
[ 1., 2., 1.]
]]).unsqueeze(0) # shape (1, 1, 3, 3)
# Create a simple ramp image: brightness rises by 1 per column and by 4 per row
image = torch.arange(16.).reshape(1, 1, 4, 4)
feature_map = F.conv2d(image, sobel_h, padding=1)
print('Edge response shape:', feature_map.shape)
print('Edge values:', feature_map.squeeze())