Usage Metering, Quotas and Billing Hooks
Track per-tenant usage, enforce plan quotas, and emit billing events for metered SaaS pricing.
Why Metering Matters in SaaS
In a multi-tenant SaaS, every tenant shares the same FastAPI app but pays based on what they actually use. To support plans like Free / Pro / Enterprise you need three cooperating subsystems:
- Metering — count consumed resources (API calls, tokens, rows, GB).
- Quotas — block or throttle once a tenant exceeds its plan limit.
- Billing hooks — emit events that downstream billing (Stripe, etc.) turns into invoices.
The hard part is doing this per tenant, accurately, and without adding latency to every request. We will build each piece in this lesson.
Modeling Plans and Quotas
Start by modeling what each plan allows. A plan maps a metric (what we count) to a limit and a period (when the counter resets). Keeping this as plain data makes it easy to load from a DB or config.
Below, each plan meters API requests and AI tokens against monthly limits. A limit of None means unlimited.
from dataclasses import dataclass


@dataclass(frozen=True)
class Quota:
    metric: str
    limit: int | None  # None = unlimited
    period: str        # 'day' or 'month'


PLANS = {
    "free": [Quota("api_calls", 1_000, "month"), Quota("ai_tokens", 50_000, "month")],
    "pro": [Quota("api_calls", 100_000, "month"), Quota("ai_tokens", 5_000_000, "month")],
    "enterprise": [Quota("api_calls", None, "month"), Quota("ai_tokens", None, "month")],
}


def limit_for(plan: str, metric: str) -> int | None:
    for q in PLANS[plan]:
        if q.metric == metric:
            return q.limit
    return 0  # metric not allowed on this plan


print(limit_for("free", "api_calls"))  # 1000
print(limit_for("enterprise", "ai_tokens"))  # None (unlimited)
print(limit_for("free", "unknown"))  # 0