MLOps Academy · Lesson

Precompute and Cache Predictions

Blend batch and online to cut latency and cost.

Blend the Best of Both

You can mix batch and online inference: precompute the answers you are likely to need ahead of time, then serve them instantly at request time. Best of both worlds. 🧩

To precompute is to run predictions in a batch job before anyone asks for them, then store the results so the live path only has to look them up.
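Here is a minimal sketch of that idea in Python. Everything in it is an illustrative assumption rather than part of the lesson: `toy_model` stands in for a real (expensive) model, an in-memory dict stands in for whatever store you would use in production, and the names `precompute` and `serve` are hypothetical. It also assumes a cache miss falls back to calling the model online.

```python
from typing import Callable, Dict, Iterable, Tuple


def toy_model(user_id: int) -> float:
    """Hypothetical stand-in for an expensive model call."""
    return round(0.1 * (user_id % 10), 2)


def precompute(
    model: Callable[[int], float], likely_inputs: Iterable[int]
) -> Dict[int, float]:
    """Batch step: run the model ahead of time for inputs we expect to see."""
    return {x: model(x) for x in likely_inputs}


def serve(
    cache: Dict[int, float], model: Callable[[int], float], user_id: int
) -> Tuple[float, str]:
    """Online step: look up the stored answer, or compute it live on a miss."""
    if user_id in cache:
        return cache[user_id], "cache"
    # Assumed fallback: inputs the batch job did not cover go to the model.
    return model(user_id), "online"


# Batch job (e.g. nightly): precompute for the users we expect to show up.
cache = precompute(toy_model, likely_inputs=range(1, 4))

# Request time: user 2 was precomputed, user 7 was not.
hit = serve(cache, toy_model, 2)
miss = serve(cache, toy_model, 7)
```

In this sketch, the request for user 2 is a plain dictionary lookup, while the request for user 7 pays the full model cost. The more of your traffic the batch job anticipates, the more requests take the fast path.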