MLOps Academy · Lesson

Pods, Deployments, and Services for Models

Map ML serving onto core Kubernetes objects.

The Pod Is the Smallest Unit

Kubernetes never runs a bare container. The smallest thing it schedules is a Pod: a wrapper around one or more containers (here, your model server) that share a network identity and, optionally, storage volumes. 📦

One Model Server per Pod

For ML serving you usually put one model API container in each Pod. That keeps scaling, restarts, and resource limits simple to reason about per model.
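
To make the one-server-per-Pod idea concrete, here is a minimal sketch that builds a Pod manifest with a single container and prints it as JSON. The Pod name, image path, and port are hypothetical placeholders, not values from this course; swap in your own.

```python
import json


def model_pod_manifest(name: str, image: str, port: int) -> dict:
    """Return a Pod manifest holding exactly one model-server container."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            # One entry in this list means one model server per Pod.
            "containers": [
                {
                    "name": "model-server",
                    "image": image,
                    "ports": [{"containerPort": port}],
                }
            ]
        },
    }


manifest = model_pod_manifest(
    name="sentiment-model",  # hypothetical Pod name
    image="registry.example.com/sentiment-model:1.0",  # hypothetical image
    port=8080,  # hypothetical port the model API listens on
)
print(json.dumps(manifest, indent=2))
```

Assuming you have a cluster and kubectl configured, you could save the output to a file such as pod.json and submit it with kubectl apply -f pod.json, since kubectl accepts JSON as well as YAML.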
