MLOps Academy · Lesson

Request CPU, Memory, and GPU

Set resource requests and limits so Pods schedule correctly and stay within bounds.

The Scheduler Needs Numbers

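Kubernetes places each Pod on a node that has enough unreserved capacity. To decide, the scheduler relies on the CPU and memory amounts you declare for each model Pod. It works from those declared numbers, not from what the Pod actually consumes at runtime.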
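To make the placement decision concrete, here is a minimal Python sketch of the fit check, under simplifying assumptions: only CPU and memory are considered, and the names Resources and fits are hypothetical, chosen for this illustration. The real scheduler applies many more filters (taints, affinity, extended resources), so treat this as a model of the idea rather than its implementation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Resources:
    """Hypothetical container for declared amounts (not a Kubernetes type)."""
    cpu_millicores: int  # 1000 millicores = 1 CPU core
    memory_mib: int


def fits(allocatable: Resources, placed: Iterable[Resources], pod: Resources) -> bool:
    """Return True if the Pod's requests fit in the node's unreserved capacity.

    Unreserved capacity is the node's allocatable amount minus the sum of
    requests of Pods already placed there. Actual usage plays no part.
    """
    placed = list(placed)
    free_cpu = allocatable.cpu_millicores - sum(p.cpu_millicores for p in placed)
    free_mem = allocatable.memory_mib - sum(p.memory_mib for p in placed)
    return pod.cpu_millicores <= free_cpu and pod.memory_mib <= free_mem


# Illustrative numbers only: a 4-core, 16 GiB node with two Pods already placed.
node = Resources(cpu_millicores=4000, memory_mib=16384)
running = [Resources(1500, 4096), Resources(2000, 8192)]

print(fits(node, running, Resources(500, 2048)))   # request fits in what is left
print(fits(node, running, Resources(1000, 2048)))  # CPU request exceeds what is left
```

Because the check uses requests only, a Pod with no requests looks free to place, which is how nodes end up overcommitted.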

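A request is the amount reserved for a container, and it is the number the scheduler uses for placement. A limit is the ceiling a container may not cross: a container that hits its CPU limit is throttled, and one that exceeds its memory limit is terminated. You almost always set both.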
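As an illustration, the Python sketch below builds a Pod manifest with requests and limits and prints it as JSON, which kubectl accepts in place of YAML. The field names (resources, requests, limits) are the standard Pod spec fields. The Pod name, image, amounts, and the helper container_spec are hypothetical, and the GPU resource name nvidia.com/gpu assumes the NVIDIA device plugin is installed on the cluster.

```python
import json


def container_spec(name, image, requests, limits, gpus=0):
    """Build one container entry with a resources stanza (hypothetical helper)."""
    resources = {"requests": dict(requests), "limits": dict(limits)}
    if gpus:
        # GPUs are an extended resource: they are declared under limits, and
        # Kubernetes uses the same value as the request when none is given.
        # The resource name assumes the NVIDIA device plugin.
        resources["limits"]["nvidia.com/gpu"] = gpus
    return {"name": name, "image": image, "resources": resources}


# All names and amounts below are placeholders for illustration.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "model-server"},
    "spec": {
        "containers": [
            container_spec(
                name="model",
                image="example.com/model:latest",
                requests={"cpu": "500m", "memory": "2Gi"},
                limits={"cpu": "1", "memory": "4Gi"},
                gpus=1,
            )
        ]
    },
}

print(json.dumps(pod, indent=2))
```

Saving the output to a file and running kubectl apply -f on it would submit the Pod. Whole GPUs are requested as integers and cannot be split into fractions the way CPU can.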