Machine Learning Academy · Lesson

Monitoring Prediction Distributions and Confidence Scores

Learners will log prediction probabilities to a time-series store, plot rolling mean confidence, and flag when average confidence drops below a deployment threshold.

Why Monitor Predictions, Not Just Inputs?

Input feature monitoring detects data drift, but it requires tracking every feature separately. Prediction monitoring provides a single integrated signal: if any combination of input changes causes the model to produce different outputs, the change shows up in the prediction distribution, even if no individual feature crosses its drift threshold. Prediction monitoring therefore complements input monitoring rather than replacing it: because it reflects the model's response to all inputs together, it can catch shifts that per-feature checks miss.
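
As a rough illustration of what "showing up in the prediction distribution" can mean, the sketch below compares the share of each predicted class in a reference window against a recent window. Total variation distance is used here only as one simple choice of metric; the lesson does not prescribe one. The function names and the sample data are made up for the example.

```python
from collections import Counter

def class_distribution(predicted_classes, num_classes):
    """Return the fraction of predictions that fall in each class."""
    if not predicted_classes:
        raise ValueError("need at least one prediction")
    counts = Counter(predicted_classes)
    n = len(predicted_classes)
    return [counts.get(c, 0) / n for c in range(num_classes)]

def total_variation_distance(p, q):
    """Half the L1 distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Illustrative data: the model predicted class 1 for 30% of requests in the
# reference window and for 50% of requests in the current window.
reference = [0] * 70 + [1] * 30
current = [0] * 50 + [1] * 50

p = class_distribution(reference, num_classes=2)
q = class_distribution(current, num_classes=2)
tvd = total_variation_distance(p, q)
print(f"total variation distance: {tvd:.3f}")
```

A value near 0 means the two windows have nearly the same class mix. What counts as a meaningful shift depends on the application and on how much the distribution normally fluctuates.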

Logging Predictions in Production

The foundation of prediction monitoring is a prediction log: every inference request's features, predicted label, predicted probabilities, and timestamp stored to a database or file. This log enables retrospective analysis when drift or degradation is detected. Design the schema to include request ID (for joining with ground-truth labels when they arrive), model version, and latency alongside the prediction details.

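import datetime
import json
import time
import uuid

import torch
import torch.nn.functional as F

PREDICTION_LOG = '/tmp/prediction_log.jsonl'

def predict_and_log(features, model, model_version='v1.2.3', request_id=None):
    """Run one inference and append a JSON record to the prediction log."""
    # Assumes the model is already in eval mode (model.eval()) and returns logits.
    x = torch.as_tensor(features, dtype=torch.float32)
    start = time.perf_counter()
    with torch.no_grad():
        logits = model(x.unsqueeze(0))               # add a batch dimension
        probs = F.softmax(logits, dim=1).squeeze(0)  # shape: (num_classes,)
    latency_ms = (time.perf_counter() - start) * 1000.0
    pred_class = int(probs.argmax())
    confidence = float(probs.max())

    record = {
        # Request ID lets ground-truth labels be joined in when they arrive.
        'request_id': request_id or str(uuid.uuid4()),
        # Timezone-aware UTC; datetime.utcnow() is deprecated since Python 3.12.
        'timestamp': datetime.datetime.now(datetime.timezone.utc).isoformat(),
        'model_version': model_version,
        'latency_ms': round(latency_ms, 3),
        'features': x.tolist(),
        'predicted_class': pred_class,
        'confidence': round(confidence, 4),
        'probabilities': probs.tolist()
    }
    # One JSON object per line (JSON Lines), appended so earlier records are kept.
    with open(PREDICTION_LOG, 'a') as f:
        f.write(json.dumps(record) + '\n')
    return pred_class, confidence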
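
Once records are in the log, the rolling mean confidence and the threshold check named in the lesson goals can be sketched as follows. This is a minimal standard-library version that reads only the 'timestamp' and 'confidence' fields. The function names, the window size, and the 0.7 threshold are illustrative assumptions, not values given in the lesson.

```python
import json
from collections import deque

def parse_records(lines):
    """Parse JSON Lines text (one record per line), skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]

def rolling_mean_confidence(records, window=100):
    """Return (timestamp, mean confidence) for each full window of records."""
    buf = deque(maxlen=window)  # automatically drops the oldest value when full
    out = []
    for rec in records:
        buf.append(rec['confidence'])
        if len(buf) == window:  # wait for a full window to avoid noisy early values
            out.append((rec['timestamp'], sum(buf) / window))
    return out

def flag_low_confidence(rolling, threshold=0.7):
    """Return the rolling points whose mean confidence is below the threshold."""
    return [(ts, mean) for ts, mean in rolling if mean < threshold]

# Illustrative in-memory log lines; in practice, pass an open file handle
# for the prediction log to parse_records instead.
sample_lines = [
    '{"timestamp": "t0", "confidence": 0.9}',
    '{"timestamp": "t1", "confidence": 0.9}',
    '{"timestamp": "t2", "confidence": 0.9}',
    '{"timestamp": "t3", "confidence": 0.5}',
    '{"timestamp": "t4", "confidence": 0.5}',
    '{"timestamp": "t5", "confidence": 0.5}',
]
records = parse_records(sample_lines)
rolling = rolling_mean_confidence(records, window=3)
alerts = flag_low_confidence(rolling, threshold=0.7)
for ts, mean in alerts:
    print(f"{ts}: rolling mean confidence {mean:.3f} is below threshold")
```

The (timestamp, mean) pairs returned by rolling_mean_confidence can be passed to any plotting library to draw the rolling confidence curve. A larger window smooths out noise but delays detection, so the window size is a trade-off to tune per deployment.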
