Directory Traversal and File Discovery
os.walk(), glob patterns, and filtering files by type or date.
Why Agents Need Directory Traversal
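Agents often need to find files matching certain criteria: all Python files in a codebase, all CSV files in a data directory, or all recent logs. Python provides three main tools: os.walk(), Path.iterdir(), and Path.rglob(). They differ mainly in how deep they search: Path.iterdir() lists a single directory, Path.rglob() matches a glob pattern recursively through the whole tree, and os.walk() visits every directory recursively while letting you choose which subdirectories to descend into.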
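As a rough sketch of how the pathlib tools differ in depth, and of filtering by type or date, the helpers below use Path.iterdir() for a single level and Path.rglob() for the whole tree. The function names (list_top_level, find_by_suffix, find_recent) are hypothetical and chosen for illustration; they are not part of any library.

```python
import time
from pathlib import Path


def list_top_level(root: Path) -> list[Path]:
    """Files directly inside root (no recursion), via Path.iterdir()."""
    return sorted(p for p in root.iterdir() if p.is_file())


def find_by_suffix(root: Path, suffix: str) -> list[Path]:
    """Files at any depth under root whose name ends with suffix."""
    # rglob('*.py') is equivalent to glob('**/*.py'): it searches recursively.
    return sorted(p for p in root.rglob(f'*{suffix}') if p.is_file())


def find_recent(root: Path, max_age_seconds: float) -> list[Path]:
    """Files at any depth under root modified within max_age_seconds."""
    cutoff = time.time() - max_age_seconds
    return sorted(
        p for p in root.rglob('*')
        if p.is_file() and p.stat().st_mtime >= cutoff
    )
```

For example, find_by_suffix(Path('/tmp/agent_workspace'), '.csv') would collect every CSV file in the tree, and find_recent(Path('/tmp/agent_workspace'), 24 * 3600) would collect files modified in the last day. The one-day window is only an assumption; pick whatever cutoff suits the task.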
import os
from pathlib import Path
# Count all files in a directory tree
root = Path('/tmp/agent_workspace')
file_count = sum(1 for f in root.rglob('*') if f.is_file())
print(f'Found {file_count} files under {root}')
os.walk(): Recursive Directory Traversal
os.walk(root) yields a tuple of (dirpath, dirnames, filenames) for every directory in the tree. It's the classic Python approach for recursive traversal and gives you fine-grained control over which subdirectories to descend into.
import os
root = '/tmp/project'
for dirpath, dirnames, filenames in os.walk(root):
    # Skip hidden directories (like .git)
    dirnames[:] = [d for d in dirnames if not d.startswith('.')]
    print(f'In directory: {dirpath}')
    print(f' Subdirs: {dirnames}')
    print(f' Files: {filenames}')
    for filename in filenames:
        full_path = os.path.join(dirpath, filename)
        print(f' File: {full_path}')
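The os.walk() pattern from this lesson can be wrapped in a reusable function. The sketch below is one possible shape, not a definitive implementation: find_files and its skip_dirs parameter are hypothetical names, and the default directories to skip are only an assumption about what an agent would want to ignore.

```python
import os


def find_files(root, suffix, skip_dirs=('node_modules', '__pycache__')):
    """Return paths under root ending in suffix, pruning unwanted directories."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Assigning to dirnames[:] edits the list in place, which is what
        # tells os.walk() (in its default top-down mode) not to descend
        # into the removed directories.
        dirnames[:] = sorted(
            d for d in dirnames
            if not d.startswith('.') and d not in skip_dirs
        )
        for filename in sorted(filenames):
            if filename.endswith(suffix):
                matches.append(os.path.join(dirpath, filename))
    return matches
```

Note that rebinding the name (dirnames = [...]) would not prune anything, because os.walk() keeps its own reference to the original list. Sorting both lists is optional; it is done here only to make the traversal order deterministic.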