Regex Patterns and Character Classes
\d, \w, \s, ., ^, $, +, *, ?, {n,m}, character classes [a-z], negation [^...].
Why Regex Matters for Text AI
Text AI starts with messy raw text. Regular expressions (regex) are a compact language for finding and extracting patterns such as phone numbers, dates, hashtags, and URLs, so you can clean and structure text before feeding it to a model.
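As a rough illustration of that kind of extraction, the sketch below pulls dates, hashtags, and URLs out of one string. The sample text and the three patterns are assumptions made up for this example; real data usually needs stricter patterns.

```python
import re

# Hypothetical sample text, invented for illustration only.
raw = "Launch on 2024-03-15! Call 555-0142 or see https://example.com #ai #NLP_tips"

# \d{4}-\d{2}-\d{2}: four digits, a dash, two digits, a dash, two digits.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", raw)

# #\w+: a literal '#' followed by one or more word characters.
hashtags = re.findall(r"#\w+", raw)

# https?://[^\s]+: 'http', an optional 's', '://', then one or more
# characters that are not whitespace (a negated character class).
urls = re.findall(r"https?://[^\s]+", raw)

print(dates)     # ['2024-03-15']
print(hashtags)  # ['#ai', '#NLP_tips']
print(urls)      # ['https://example.com']
```

The raw-string prefix (r"...") keeps Python from interpreting backslashes before the regex engine sees them.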
This course teaches patterns, the re module, groups, and a real text-cleaning pipeline.
Literal Characters
The simplest pattern is literal text: it matches exactly those characters. Regex is case-sensitive by default.
The pattern cat matches "cat" inside "the cat sat" but not "Cat".
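The case-sensitivity rule can be checked with a minimal sketch. The sample sentence is an assumption chosen for this example, and re.IGNORECASE is shown only as one way to relax the default behaviour.

```python
import re

# Hypothetical sample sentence with one capitalised and one lowercase match.
text = "Cat naps; the cat sat"

# Default matching is case-sensitive, so "Cat" is skipped.
default_hits = re.findall("cat", text)

# With re.IGNORECASE the same pattern matches both spellings.
relaxed_hits = re.findall("cat", text, flags=re.IGNORECASE)

print(default_hits)  # ['cat']
print(relaxed_hits)  # ['Cat', 'cat']
```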
import re
print(re.findall("cat", "the cat sat on a cat mat"))
# ['cat', 'cat']
All lessons in this course
- Regex Patterns and Character Classes
- re Module: search, match, findall, sub
- Capturing Groups and Named Groups
- Text Cleaning for AI with Regex