Machine Learning Academy · Lesson

LIME: Local Interpretable Model-Agnostic Explanations

Learners will apply LIME to explain an image classifier and a text classifier, generating local linear approximations that highlight which pixels or words drove the prediction.

LIME: The Core Idea

LIME (Local Interpretable Model-Agnostic Explanations) explains individual predictions of any black-box model by approximating it locally with a simple, interpretable model. The insight is that even if a neural network or ensemble is globally complex, its behaviour near a single input can often be well-approximated by a linear model. LIME is model-agnostic: it works with any classifier or regressor through its prediction function alone.
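To make the local-linearity claim concrete, here is a minimal sketch that fits a plain least-squares line to a nonlinear function inside a small neighbourhood. The function `black_box`, the helper `local_linear_fit`, and the chosen radius are all hypothetical stand-ins for illustration; they are not part of LIME itself.

```python
import numpy as np

def black_box(X):
    # Hypothetical nonlinear model standing in for a neural network or ensemble
    return np.sin(X[:, 0]) + X[:, 1] ** 2

def local_linear_fit(f, x0, radius=0.05, n_samples=2000, seed=0):
    """Fit an ordinary least-squares linear model to f in a small box around x0."""
    rng = np.random.default_rng(seed)
    # Sample points uniformly within +/- radius of x0 in every dimension
    X = x0 + rng.uniform(-radius, radius, size=(n_samples, x0.size))
    y = f(X)
    # Centred features plus a column of ones for the intercept
    A = np.column_stack([X - x0, np.ones(n_samples)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]  # slopes, intercept

x0 = np.array([0.5, 1.0])
slopes, intercept = local_linear_fit(black_box, x0)
# Gradient of sin(x) + y^2 at x0, for comparison with the fitted slopes
analytic_gradient = np.array([np.cos(x0[0]), 2.0 * x0[1]])
print(slopes, analytic_gradient)
```

Near `x0` the fitted slopes should sit close to the function's gradient, which is why a linear surrogate can be a faithful local summary even when the function is globally curved. Widening `radius` makes the fit progressively less faithful.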

The LIME Algorithm Step by Step

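LIME follows three steps for each prediction it explains: (1) Perturb the input by creating many slightly modified copies, for example by hiding image regions (superpixels) or removing words from a text; (2) Query the black-box model for its predictions on all perturbed samples; (3) Fit a weighted linear model to those predictions, giving higher weight to samples closer to the original. The coefficients of this linear surrogate are the explanation: each one estimates how much a feature pushes the prediction up or down near that input.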

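# Minimal runnable sketch of LIME for numeric features (Gaussian perturbations and an exponential kernel are assumptions)
import numpy as np
from sklearn.linear_model import LinearRegression

def lime_explain(instance, model, class_idx=1, n_samples=5000, scale=1.0, width=1.0, seed=0):
    rng = np.random.default_rng(seed)
    # (1) Perturb: sample points around the instance (a 1-D NumPy array)
    perturbed = instance + rng.normal(0.0, scale, size=(n_samples, instance.shape[0]))
    # (2) Query the black box for the probability of the class being explained
    predictions = model.predict_proba(perturbed)[:, class_idx]
    # (3) Weight samples by proximity to the instance, then fit the linear surrogate
    distances = np.linalg.norm(perturbed - instance, axis=1)
    weights = np.exp(-(distances ** 2) / width ** 2)
    surrogate = LinearRegression()
    surrogate.fit(perturbed, predictions, sample_weight=weights)
    return surrogate.coef_  # one local coefficient per feature: the LIME explanation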
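
The same three steps carry over to text once each word is treated as an on/off feature. The sketch below is a simplified illustration, not the `lime` library's implementation: `toy_sentiment_proba` is a hypothetical keyword-based black box, and the masking scheme, distance measure, and kernel width are assumptions chosen for brevity.

```python
import numpy as np

def toy_sentiment_proba(texts):
    # Hypothetical black box: P(positive) driven by two keywords
    scores = np.array(
        [t.split().count("great") - t.split().count("boring") for t in texts],
        dtype=float,
    )
    return 1.0 / (1.0 + np.exp(-2.0 * scores))

def lime_text_explain(text, predict_fn, n_samples=1000, kernel_width=0.75, seed=0):
    rng = np.random.default_rng(seed)
    words = text.split()
    # (1) Perturb: binary masks, 1 = word kept, 0 = word removed
    masks = rng.integers(0, 2, size=(n_samples, len(words)))
    masks[0] = 1  # keep the unperturbed text as the first sample
    perturbed = [" ".join(w for w, keep in zip(words, m) if keep) for m in masks]
    # (2) Query the black box on every perturbed text
    preds = predict_fn(perturbed)
    # (3) Weight by proximity (fraction of words removed) and fit by weighted least squares
    distances = 1.0 - masks.mean(axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)
    A = np.column_stack([masks, np.ones(n_samples)])  # word indicators + intercept
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], preds * sw, rcond=None)
    return list(zip(words, coef[:-1]))

explanation = lime_text_explain("great acting but boring plot", toy_sentiment_proba)
for word, weight in explanation:
    print(f"{word:>8s} {weight:+.3f}")
```

The surrogate is fitted on the binary masks rather than on the raw text, so each coefficient reads as the estimated effect of keeping that word. For images the idea is the same, with superpixels playing the role of words.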
