Use case

Multi-tenant SaaS

Give each customer their own model allowlist, rate-limit policy, and audit bucket — without running a separate gateway per tenant. One deployment, strict per-tenant isolation.


How it works

One gateway, isolated tenants

Each request carries an X-Tenant header that identifies the customer it belongs to. Kamiwaza evaluates only that tenant's policy tree (model allowlist, rate limit, and audit bucket). Other tenants' data and configuration are not visible to the request.

# Multi-tenant routing config
version: v1
# Per-tenant policies: model allowlist, rate limit, and audit bucket
tenants:
  - id: enterprise-acme
    model_allowlist: [llama-3.1-70b, claude-3-5-haiku]
    rate_limit_rpm: 5000
    audit_bucket: s3://acme-audit-logs
  - id: startup-beta
    model_allowlist: [claude-3-5-haiku]
    rate_limit_rpm: 500
    audit_bucket: s3://startup-beta-audit
# Routing rules: match on tenant ID and pick the target for that tenant's traffic
rules:
  - match:
      tenant: enterprise-acme
    route_to: private-gpu
  - match:
      tenant: startup-beta
    route_to: anthropic
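
To make the lookup concrete, here is a minimal Python sketch of how a request could be resolved against the config above. It is an illustration under stated assumptions, not Kamiwaza's implementation: the names TenantPolicy, POLICIES, PolicyError, and resolve_route are hypothetical, and the policies are written out as a dictionary rather than parsed from the YAML file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantPolicy:
    """Hypothetical in-memory form of one tenant entry plus its routing rule."""
    model_allowlist: frozenset
    rate_limit_rpm: int
    audit_bucket: str
    route_to: str


# Mirrors the values in the config above; a real gateway would load these
# from the config file instead of hard-coding them.
POLICIES = {
    "enterprise-acme": TenantPolicy(
        model_allowlist=frozenset({"llama-3.1-70b", "claude-3-5-haiku"}),
        rate_limit_rpm=5000,
        audit_bucket="s3://acme-audit-logs",
        route_to="private-gpu",
    ),
    "startup-beta": TenantPolicy(
        model_allowlist=frozenset({"claude-3-5-haiku"}),
        rate_limit_rpm=500,
        audit_bucket="s3://startup-beta-audit",
        route_to="anthropic",
    ),
}


class PolicyError(Exception):
    """Raised when a request has no usable tenant or violates its policy."""


def resolve_route(headers: dict, model: str) -> str:
    """Return the route target for a request, or raise PolicyError."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    tenant_id = normalised.get("x-tenant")
    if tenant_id is None:
        raise PolicyError("missing X-Tenant header")

    # Only the requesting tenant's policy is ever read.
    policy = POLICIES.get(tenant_id)
    if policy is None:
        raise PolicyError(f"unknown tenant: {tenant_id}")

    if model not in policy.model_allowlist:
        raise PolicyError(f"model {model!r} is not allowed for tenant {tenant_id}")
    return policy.route_to
```

In this sketch, a request from startup-beta asking for llama-3.1-70b raises PolicyError because that model is not on the tenant's allowlist, while the same model requested by enterprise-acme resolves to private-gpu.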

Isolation guarantees

What "isolated" means in practice

Separate audit buckets: each tenant's routing logs go to their own S3/GCS bucket, accessible only to that tenant's team.

Per-tenant model allowlists: enterprise customers can be restricted to on-prem models; trial tiers can be limited to cheaper managed endpoints.

Independent rate limits: one tenant's traffic spike doesn't degrade another tenant's latency. Rate-limit policies are enforced per tenant ID.
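
The independent rate limits described above can be illustrated with one token bucket per tenant ID. This is a hedged sketch, not the gateway's actual limiter: the TenantRateLimiter class and its allow method are hypothetical names, the limits are assumed to be requests per minute, and the code is single-threaded for brevity (a real deployment would need locking or a shared store).

```python
import time


class TenantRateLimiter:
    """One token bucket per tenant ID, refilled at that tenant's per-minute rate."""

    def __init__(self, limits_rpm, clock=time.monotonic):
        self._limits = dict(limits_rpm)
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        # tenant_id -> (tokens_remaining, timestamp_of_last_update)
        self._buckets = {}

    def allow(self, tenant_id):
        """Return True and consume a token if the tenant is under its limit."""
        limit = self._limits[tenant_id]  # raises KeyError for unknown tenants
        now = self._clock()
        tokens, last = self._buckets.get(tenant_id, (float(limit), now))

        # Refill in proportion to elapsed time, capped at the per-minute limit.
        tokens = min(float(limit), tokens + (now - last) * limit / 60.0)

        if tokens < 1.0:
            self._buckets[tenant_id] = (tokens, now)
            return False
        self._buckets[tenant_id] = (tokens - 1.0, now)
        return True


limiter = TenantRateLimiter({"enterprise-acme": 5000, "startup-beta": 500})
if not limiter.allow("startup-beta"):
    print("startup-beta is over its limit; enterprise-acme is unaffected")
```

Because each bucket is keyed by tenant ID, exhausting one tenant's tokens never touches another tenant's bucket, which is the property the guarantee describes.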
