Quickstart

Route your first request in under 5 minutes.

Step 1 — Install the CLI

Kamiwaza ships as a single binary. Install via pip (requires Python 3.9+) or Homebrew:

# pip
pip install kamiwaza-cli

# Homebrew (macOS/Linux)
brew tap kamiwazaai/cli
brew install kamiwaza

Verify the install:

kmw --version
# kamiwaza 0.9.1

Authenticate with your API key, which you can find in the Kamiwaza dashboard after signing up:

kmw auth login --api-key sk-kmw-XXXXXXXXXXXX

Step 2 — Register an endpoint

An endpoint is any model serving target — a managed API or a self-hosted vLLM server. Register your first endpoint:

kmw endpoints add \
  --id anthropic-haiku \
  --type anthropic \
  --api-key "$ANTHROPIC_API_KEY" \
  --model claude-3-5-haiku-20241022

You can add multiple endpoints in the same deployment. Kamiwaza evaluates all of them against your routing policy when a request arrives.

kmw endpoints list
# ID                    TYPE        STATUS    MODEL
# anthropic-haiku       anthropic   healthy   claude-3-5-haiku-20241022

Step 3 — Write a routing policy

Routing policies are YAML files that tell Kamiwaza which endpoint to use for which request. Create a file named policy.yaml:

version: v1

endpoints:
  - id: anthropic-haiku
    type: anthropic
    models: [claude-3-5-haiku-20241022]

rules:
  - default:
    route_to: anthropic-haiku

Deploy the policy:

kmw policies apply policy.yaml
# Policy applied. Gateway listening on https://gw.kamiwazaai.org/v1

Tip: The default rule is the catch-all: it matches any request that no earlier rule matched. Rules are evaluated top to bottom, and the first match wins.
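
To make the evaluation order concrete, here is a minimal Python sketch of first-match-wins routing. It is an illustration only, not Kamiwaza's implementation: the in-memory rule shape, the "match" key, and the pick_endpoint helper are all assumptions made for this example, and the real policy schema may differ.

```python
from typing import Optional

# Hypothetical in-memory form of a policy; the real schema may differ.
RULES = [
    {"match": {"data_class": "pii-restricted"}, "route_to": "private-gpu"},
    {"default": True, "route_to": "anthropic-haiku"},
]


def pick_endpoint(rules: list[dict], request_attrs: dict) -> Optional[str]:
    """Return the route_to of the first matching rule, scanning top to bottom."""
    for rule in rules:
        if rule.get("default"):
            return rule["route_to"]  # catch-all: matches any request
        conditions = rule.get("match", {})
        if all(request_attrs.get(k) == v for k, v in conditions.items()):
            return rule["route_to"]
    return None  # no rule matched and no default rule was present


print(pick_endpoint(RULES, {"data_class": "pii-restricted"}))  # private-gpu
print(pick_endpoint(RULES, {"data_class": "general"}))         # anthropic-haiku
```

Because scanning stops at the first match, a default rule placed above a more specific rule would shadow it, which is why the catch-all belongs last.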

Step 4 — Send your first request

Kamiwaza exposes an OpenAI-compatible /v1/chat/completions endpoint, so you can use the gateway as a drop-in base URL replacement:

curl https://gw.kamiwazaai.org/v1/chat/completions \
  -H "Authorization: Bearer sk-kmw-XXXXXXXXXXXX" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

The response is the upstream model's response, passed through unchanged. The "model": "auto" field tells Kamiwaza to apply your routing policy rather than pinning to a specific model.
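
The same request can be built from Python using only the standard library. This is a sketch that mirrors the curl call above: the build_chat_request helper is a name invented for this example, and the gateway URL is the one printed by kmw policies apply in Step 3. The request is constructed but not sent.

```python
import json
import urllib.request

# Gateway base URL as printed by `kmw policies apply` in Step 3.
GATEWAY_URL = "https://gw.kamiwazaai.org/v1"


def build_chat_request(api_key: str, prompt: str, model: str = "auto") -> urllib.request.Request:
    """Build (but do not send) a chat completion request for the gateway."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{GATEWAY_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("sk-kmw-XXXXXXXXXXXX", "Hello")
print(req.full_url)  # https://gw.kamiwazaai.org/v1/chat/completions
```

To send it, pass req to urllib.request.urlopen and decode the JSON body of the response.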

You can also pass routing hints as HTTP headers:

curl https://gw.kamiwazaai.org/v1/chat/completions \
  -H "Authorization: Bearer sk-kmw-XXXXXXXXXXXX" \
  -H "X-Tenant: acme-corp" \
  -H "X-Data-Class: pii-restricted" \
  -H "Content-Type: application/json" \
  -d '{"model": "auto", "messages": [...]}'
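
If you set these headers from application code, a small helper keeps the hint names in one place. The sketch below is hypothetical: routing_hint_headers is not part of any Kamiwaza SDK, and leaving out an unset hint is an assumption, since this page does not say how the gateway treats a missing header.

```python
from typing import Optional


def routing_hint_headers(
    tenant: Optional[str] = None, data_class: Optional[str] = None
) -> dict[str, str]:
    """Build the X-Tenant / X-Data-Class headers, omitting any unset hint."""
    hints = {"X-Tenant": tenant, "X-Data-Class": data_class}
    return {name: value for name, value in hints.items() if value}


headers = {
    "Authorization": "Bearer sk-kmw-XXXXXXXXXXXX",
    "Content-Type": "application/json",
    **routing_hint_headers(tenant="acme-corp", data_class="pii-restricted"),
}
print(headers["X-Tenant"], headers["X-Data-Class"])  # acme-corp pii-restricted
```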

Step 5 — View the audit log

Every request routed through Kamiwaza generates an audit record. View recent records:

kmw audit tail
# TIMESTAMP             TENANT        DATA_CLASS      RULE_MATCHED   ENDPOINT           LATENCY_MS
# 2025-05-28T14:22:01Z  acme-corp     pii-restricted  rule:0         private-gpu        234
# 2025-05-28T14:22:04Z  startup-beta  general         rule:default   anthropic-haiku    312

The audit log records which rule matched, which endpoint was used, and the round-trip latency. You can export records to S3 or stream them to your SIEM via the API.
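
For quick ad hoc analysis, the tabular output of kmw audit tail can be parsed into records. The following is a sketch under assumptions: it takes the six whitespace-separated columns shown above at face value, assumes no column value contains spaces, and uses an AuditRecord class and parse_audit_tail function that are names invented for this example. For production use, the export and streaming options are likely a better fit than scraping CLI output.

```python
from dataclasses import dataclass


@dataclass
class AuditRecord:
    timestamp: str
    tenant: str
    data_class: str
    rule_matched: str
    endpoint: str
    latency_ms: int


def parse_audit_tail(text: str) -> list[AuditRecord]:
    """Parse whitespace-separated audit rows, skipping the header row."""
    records = []
    for line in text.splitlines():
        fields = line.lstrip("# ").split()
        if len(fields) != 6 or fields[0] == "TIMESTAMP":
            continue  # skip blank lines, the header, and unexpected rows
        records.append(AuditRecord(*fields[:5], latency_ms=int(fields[5])))
    return records


# Sample rows copied from the output shown above.
SAMPLE = """\
TIMESTAMP             TENANT        DATA_CLASS      RULE_MATCHED   ENDPOINT           LATENCY_MS
2025-05-28T14:22:01Z  acme-corp     pii-restricted  rule:0         private-gpu        234
2025-05-28T14:22:04Z  startup-beta  general         rule:default   anthropic-haiku    312
"""

records = parse_audit_tail(SAMPLE)
slowest = max(records, key=lambda r: r.latency_ms)
print(slowest.endpoint, slowest.latency_ms)  # anthropic-haiku 312
```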

Next: Now that your first request is routing, read the Routing Policies reference to build data-class guards, tenant-specific allowlists, and latency-budget failovers.
