MLOps Academy · Lesson

Pods, Deployments, and Services for Models

Map ML serving onto core Kubernetes objects.

The Pod Is the Smallest Unit

Kubernetes never runs a bare container. The smallest unit it schedules is a Pod: a wrapper around one or more containers (here, your model server) that share a single network namespace, so they share one IP address and port space. 📦

One Model Server per Pod

For ML serving you usually put one model API container in each Pod. That keeps scaling, restarts, and resource limits simple to reason about per model.
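
To make the one-container-per-Pod layout concrete, here is a minimal sketch that builds a Pod manifest as a Python dictionary and prints it as JSON. The function name `build_model_pod`, the Pod name `sentiment-model`, the container name `model-server`, the image `registry.example.com/sentiment-model:1.0`, and port 8080 are all illustrative assumptions, not values from this lesson.

```python
import json


def build_model_pod(name: str, image: str, port: int = 8080) -> dict:
    """Return a Pod manifest holding exactly one model-server container.

    All argument values are supplied by the caller; nothing here is
    specific to a real model or registry.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            # A single entry in this list means one model API per Pod.
            "containers": [
                {
                    "name": "model-server",  # hypothetical container name
                    "image": image,
                    "ports": [{"containerPort": port}],
                }
            ]
        },
    }


if __name__ == "__main__":
    # Hypothetical name and image, used only for illustration.
    pod = build_model_pod(
        "sentiment-model",
        "registry.example.com/sentiment-model:1.0",
    )
    print(json.dumps(pod, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed output could be piped to `kubectl apply -f -` on a machine with cluster access. Serving a second model would mean calling the function again for a second Pod rather than adding another entry to `containers`.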
