MLOps Academy · Lesson

Configure Dynamic Batching in Triton

Tune max batch size and queue delay.

Models Live in a Repository

Triton loads models from a model repository: a folder in which each model has its own subfolder containing a config file and one or more numbered version folders that hold the weights.
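To make the layout concrete, here is a minimal Python sketch that builds an empty repository skeleton and lists what it created. The names model_repository, my_model, and model.onnx are placeholders chosen for illustration, and the files are empty stand-ins rather than real weights or a working config.

```python
import tempfile
from pathlib import Path


def make_repo_skeleton(root: Path, model_name: str, version: int = 1) -> list[str]:
    """Create <root>/<model_name>/<version>/ plus empty placeholder files.

    Returns every created path, relative to the repository root, sorted.
    """
    model_dir = root / model_name
    version_dir = model_dir / str(version)  # version folders are named with integers
    version_dir.mkdir(parents=True, exist_ok=True)

    # config.pbtxt sits in the model folder, next to the version folder.
    (model_dir / "config.pbtxt").touch()
    # Hypothetical weights file name; the real name depends on the backend.
    (version_dir / "model.onnx").touch()

    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in make_repo_skeleton(Path(tmp) / "model_repository", "my_model"):
            print(path)
```

Run as a script, this prints my_model, my_model/1, my_model/1/model.onnx, and my_model/config.pbtxt, one per line.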

Each model gets a config.pbtxt in its own folder, next to its version folders. This protobuf text file tells Triton the model's inputs and outputs and how to schedule requests; it is where the max batch size and queue delay are set.
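As a rough illustration of what such a file can contain, the Python sketch below renders a small config.pbtxt as text. The field names (max_batch_size, dynamic_batching, max_queue_delay_microseconds) are Triton configuration fields, but every value here is an assumption made for the example: the model name, the onnxruntime backend, the tensor names, and the shapes would all need to match your actual model.

```python
def render_config(model_name: str, max_batch_size: int, max_queue_delay_us: int) -> str:
    """Return the text of a minimal config.pbtxt with dynamic batching enabled.

    Tensor names, data types, and dims below are illustrative placeholders.
    """
    return f"""name: "{model_name}"
backend: "onnxruntime"
max_batch_size: {max_batch_size}
input [
  {{
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }}
]
output [
  {{
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }}
]
dynamic_batching {{
  max_queue_delay_microseconds: {max_queue_delay_us}
}}
"""


if __name__ == "__main__":
    # Example values only: batch up to 8 requests, wait at most 100 microseconds.
    print(render_config("my_model", max_batch_size=8, max_queue_delay_us=100))
```

The dims exclude the batch dimension, since Triton adds it when max_batch_size is greater than zero. Writing the returned string to config.pbtxt in the model folder is left out here to keep the sketch free of file-system side effects.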