What Sampling Unlocks
Let a server delegate reasoning back to the model.
Flipping the Direction
Usually the model calls your tools. Sampling flips that: your server asks the model to generate text for it. 🔄
Borrow the Model You Already Have
With sampling, your server can use the host’s LLM without ever holding an API key. The client owns the model access.