API Reference

Complete documentation of all KairosRoute API endpoints and schemas.

Authentication

All API requests require authentication with your API key. Pass the key as a Bearer token in the Authorization header.

curl https://api.kairosroute.com/v1/chat/completions \
  -H "Authorization: Bearer kr-your-key"

Security: Never expose your API key in client-side code. Always use it server-side.
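One way to keep the key server-side is to read it from an environment variable at request time. The following is a minimal Python sketch; the variable name KAIROSROUTE_API_KEY and the helper auth_headers are illustrative assumptions, not part of the KairosRoute API.

```python
import os


def auth_headers(env_var: str = "KAIROSROUTE_API_KEY") -> dict:
    """Build the Authorization header from a server-side environment variable.

    The variable name is a hypothetical convention chosen for this example.
    """
    key = os.environ.get(env_var)
    if not key:
        # Fail early rather than sending an unauthenticated request (401).
        raise RuntimeError(f"Set {env_var} before calling the API")
    return {"Authorization": f"Bearer {key}"}
```

Because the key is never written into source code, it stays out of version control and out of any bundle shipped to a browser.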

POST /v1/chat/completions

Creates a model response for the given chat conversation using intelligent routing.

Request Parameters

Parameter    Type      Description
model        string    Model to use. Use auto for intelligent routing.
messages     array     A list of messages comprising the conversation so far.
temperature  number    What sampling temperature to use, between 0 and 2. Default: 1.
max_tokens   integer   The maximum number of tokens to generate in the response.
stream       boolean   If true, partial message deltas will be streamed. Default: false.
plugins      string[]  Plugin IDs to enable for this request. Free: web-search, gutenberg, code-exec. Paid: image-gen, pdf-reader, memory. See Plugins.

Example Request

curl https://api.kairosroute.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer kr-your-key" \
  -d '{
    "model": "auto",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Explain quantum computing in simple terms." }
    ],
    "temperature": 0.7,
    "max_tokens": 500,
    "plugins": ["web-search"]
  }'
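The same request can be prepared from Python with only the standard library. This is a sketch rather than an official client: the helper name build_chat_request is an assumption, and the only validation it performs is the documented 0 to 2 range for temperature.

```python
import json
import urllib.request

API_URL = "https://api.kairosroute.com/v1/chat/completions"


def build_chat_request(api_key, messages, model="auto", **options):
    """Prepare (but do not send) a POST request for /v1/chat/completions.

    Extra keyword arguments such as temperature, max_tokens, or plugins
    are copied into the JSON body unchanged.
    """
    payload = {"model": model, "messages": messages, **options}
    temperature = payload.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request; building it separately makes the payload easy to inspect or log first.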

Response

{
  "id": "chatcmpl-9xKmP...",
  "object": "chat.completion",
  "created": 1743710400,
  "model": "deepseek-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing is a revolutionary..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 45,
    "completion_tokens": 180,
    "total_tokens": 225
  },
  "x_kairosroute": {
    "provider": "DeepSeek",
    "cost_usd": 0.000027,
    "savings_usd": 0.000423,
    "routed_from": "auto"
  }
}
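To show how the fields in this response fit together, here is a small Python sketch that pulls out the generated text, the token usage, and the routing details. The helper name summarize_completion is hypothetical, and the code treats x_kairosroute as optional, which is an assumption on our part.

```python
import json


def summarize_completion(body: str) -> dict:
    """Extract the commonly used fields from a chat completion response."""
    data = json.loads(body)
    choice = data["choices"][0]
    # Assumption: tolerate a missing x_kairosroute block instead of failing.
    routing = data.get("x_kairosroute", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "model": data["model"],
        "total_tokens": data["usage"]["total_tokens"],
        "provider": routing.get("provider"),
        "cost_usd": routing.get("cost_usd"),
        "savings_usd": routing.get("savings_usd"),
    }
```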

Custom Response Headers

Header                  Description
X-KairosRoute-Provider  Which model provider was used (e.g., "openai", "anthropic")
X-KairosRoute-Cost      Cost in USD for this request
X-KairosRoute-Savings   Amount saved in USD by using KairosRoute routing
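These headers can be read without parsing the response body. The sketch below assumes the cost and savings values arrive as plain decimal strings (for example "0.000027"); the exact formatting is not specified here, so anything that does not parse is returned as None. The helper name routing_info is illustrative.

```python
def routing_info(headers) -> dict:
    """Read the KairosRoute routing headers from a mapping of response headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_float(name):
        raw = lowered.get(name)
        if raw is None:
            return None
        try:
            return float(raw)
        except ValueError:
            # Assumption: unknown formats are ignored rather than raised.
            return None

    return {
        "provider": lowered.get("x-kairosroute-provider"),
        "cost_usd": as_float("x-kairosroute-cost"),
        "savings_usd": as_float("x-kairosroute-savings"),
    }
```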

Error Codes

Code  Message
401   Unauthorized - Invalid or missing API key
403   Forbidden - Premium plugin requires a paid account (PLUGIN_REQUIRES_PAID)
429   Too Many Requests - Rate limit exceeded
500   Internal Server Error - Routing service unavailable
503   Service Unavailable - All models are currently unavailable
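A client typically handles these codes in two groups: errors that need a change on your side (401, 403) and errors that may clear up if the request is repeated (429, 500, 503). The Python sketch below encodes that split. Treating 500 as retryable is our assumption, not a documented guarantee, and the function names are hypothetical.

```python
# Assumption: these statuses are transient and worth retrying.
RETRYABLE_STATUSES = {429, 500, 503}

# Statuses that require fixing the request or the account first.
NON_RETRYABLE_HINTS = {
    401: "Check that the API key is present and valid.",
    403: "The requested plugin needs a paid account (PLUGIN_REQUIRES_PAID).",
}


def should_retry(status: int) -> bool:
    """Return True when repeating the same request might succeed."""
    return status in RETRYABLE_STATUSES


def describe_error(status: int) -> str:
    """Map a status code to a short hint for logs or error messages."""
    if status in NON_RETRYABLE_HINTS:
        return NON_RETRYABLE_HINTS[status]
    if should_retry(status):
        return "Transient error; retry with backoff."
    return "Unexpected status; inspect the response body."
```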

GET /v1/models

Lists the models available to use with the API.

Example Request

curl https://api.kairosroute.com/v1/models \
  -H "Authorization: Bearer kr-your-key"

Example Response

{
  "object": "list",
  "data": [
    { "id": "auto", "object": "model", "owned_by": "kairosroute" },
    { "id": "gpt-4.1", "object": "model", "owned_by": "openai" },
    { "id": "claude-sonnet-4-20250514", "object": "model", "owned_by": "anthropic" },
    { "id": "grok-4.1-fast", "object": "model", "owned_by": "xai" },
    { "id": "deepseek-chat", "object": "model", "owned_by": "deepseek" },
    { "id": "minimax-m2.5", "object": "model", "owned_by": "minimax" }
  ]
}
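Since every entry carries an owned_by field, the list is easy to group by provider, for example to populate a model picker. A minimal Python sketch, with the hypothetical helper name models_by_owner:

```python
import json
from collections import defaultdict


def models_by_owner(body: str) -> dict:
    """Group model IDs from a /v1/models response by their owned_by value."""
    grouped = defaultdict(list)
    for model in json.loads(body)["data"]:
        grouped[model["owned_by"]].append(model["id"])
    return dict(grouped)
```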

API Key Management

Manage your API keys through the dashboard or these endpoints.

POST /api/auth

Create a new API key

curl -X POST https://kairosroute.com/api/auth \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer kr-your-key" \
  -d '{ "name": "Production Key", "expires_in": 7776000 }'
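The unit of expires_in is not stated above. If it is seconds, which is an assumption, then the value 7776000 in the example corresponds to 90 days (90 × 86,400). The Python sketch below builds the request body from a lifetime in days under that assumption; both helper names are illustrative.

```python
from datetime import timedelta


def expires_in_from_days(days: int) -> int:
    """Convert a key lifetime in days to seconds (assumed unit of expires_in)."""
    return int(timedelta(days=days).total_seconds())


def new_key_payload(name: str, days: int) -> dict:
    """Build the JSON body for POST /api/auth."""
    return {"name": name, "expires_in": expires_in_from_days(days)}
```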

GET /api/auth

List all your API keys

curl https://kairosroute.com/api/auth \
  -H "Authorization: Bearer kr-your-key"

DELETE /api/auth/:key_id

Revoke an API key

curl -X DELETE https://kairosroute.com/api/auth/key_123abc \
  -H "Authorization: Bearer kr-your-key"

Advanced Request Parameters

These optional parameters can be added to the body of any /v1/chat/completions request:

data_collection: "allow" (default) or "deny". Set to "deny" to exclude providers that may train on your data.

zdr: Set to true for Zero Data Retention. Requests are routed only to providers with formal ZDR guarantees.

preferred_max_latency: Maximum P75 latency in milliseconds. Slower providers are deprioritized.

preferred_min_throughput: Minimum P50 throughput in tokens per second. Slower providers are deprioritized.

min_quantization / max_quantization: Filter by model precision. Accepted values: fp32, fp16, bf16, fp8, int8, int4.

response_format: {"type":"json_object"} or {"type":"json_schema"}. Enables auto-healing of malformed JSON.

See Advanced Routing, Guardrails, and kr-compare for full documentation.
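As an illustration of combining these parameters, the Python sketch below merges routing options into an existing request body and rejects values outside the sets listed above. The helper with_routing_options and its client-side checks are assumptions made for this example; the API may validate differently.

```python
ALLOWED_OPTIONS = {
    "data_collection", "zdr", "preferred_max_latency",
    "preferred_min_throughput", "min_quantization",
    "max_quantization", "response_format",
}
QUANTIZATIONS = {"fp32", "fp16", "bf16", "fp8", "int8", "int4"}


def with_routing_options(payload: dict, **options) -> dict:
    """Return a copy of payload with advanced routing options added."""
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    if options.get("data_collection", "allow") not in ("allow", "deny"):
        raise ValueError('data_collection must be "allow" or "deny"')
    for key in ("min_quantization", "max_quantization"):
        if key in options and options[key] not in QUANTIZATIONS:
            raise ValueError(f"{key} must be one of {sorted(QUANTIZATIONS)}")
    # Merge without mutating the caller's dictionary.
    return {**payload, **options}
```

For example, with_routing_options({"model": "auto", "messages": []}, zdr=True, preferred_max_latency=2000) yields a body that asks for ZDR-only providers and a 2,000 ms latency preference.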

Rate Limits

  • Free: 60 requests/minute, 100K tokens/mo via BYOK plus $5 managed-key trial credit
  • Team ($99/mo): 600 requests/minute, 10M tokens included, $0.40 per 1M overage
  • Business ($499/mo): 3,000 requests/minute, 50M tokens included, $0.30 per 1M overage
  • Enterprise: Custom RPM, custom token commitment, volume pricing
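To make the overage pricing concrete, the Python sketch below estimates a monthly bill for the Team and Business tiers from the figures listed above. It assumes overage is charged pro rata per token rather than rounded up to whole millions, which this page does not specify; the names PLANS and estimate_monthly_cost are illustrative.

```python
# (monthly price in USD, included tokens, overage price in USD per 1M tokens)
PLANS = {
    "team": (99.0, 10_000_000, 0.40),
    "business": (499.0, 50_000_000, 0.30),
}


def estimate_monthly_cost(plan: str, tokens_used: int) -> float:
    """Estimate the monthly charge for a plan at a given token volume."""
    base, included, overage_per_million = PLANS[plan]
    overage_tokens = max(0, tokens_used - included)
    # Assumption: overage is billed pro rata, not per started million.
    return base + overage_tokens / 1_000_000 * overage_per_million
```

Under these assumptions, 12.5M tokens on the Team tier comes to $99 plus 2.5 × $0.40, or $100 in total.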

Per-second burst limits scale with tier. When you exceed your quota, the API returns a 429 status code; wait for the time specified in the Retry-After header before retrying. See pricing for full tier details.
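The format of the Retry-After value is not specified here. Under the HTTP standard it may be either a number of seconds or an HTTP date, so the Python sketch below accepts both and falls back to a default delay when the header is missing or unparseable. The helper name retry_delay_seconds and the 1-second default are assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay_seconds(retry_after, default=1.0, now=None):
    """Convert a Retry-After header value into a wait time in seconds."""
    if retry_after is None:
        return default
    value = retry_after.strip()
    if value.isdigit():
        # Delta-seconds form, e.g. "30".
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # Never return a negative delay if the date is already in the past.
    return max(0.0, (when - now).total_seconds())
```

A caller would sleep for the returned number of seconds after a 429 response and then resend the request.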