API Reference
Complete documentation of all KairosRoute API endpoints and schemas.
Authentication
All API requests require authentication using your API key. Include your key in the Authorization header.
curl https://api.kairosroute.com/v1/chat/completions \
  -H "Authorization: Bearer kr-your-key"
Security: Never expose your API key in client-side code. Always use it server-side.
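One way to keep the key out of source code is to read it from the server's environment when building the header. The sketch below assumes the key lives in an environment variable named `KAIROSROUTE_API_KEY`; that name and the `auth_headers` helper are illustrative and not part of the API.

```python
import os


def auth_headers(env_var: str = "KAIROSROUTE_API_KEY") -> dict[str, str]:
    """Build request headers from an API key stored in the environment.

    The variable name is a hypothetical convention, not something the API requires.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    return {
        # Same header format as the curl example: "Bearer <key>".
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Failing fast on a missing key avoids sending an unauthenticated request that would only come back as a 401.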
POST /v1/chat/completions
Creates a model response for the given chat conversation using intelligent routing.
Request Parameters
| Parameter | Type | Description |
|---|---|---|
| model | string | Model to use. Use auto for intelligent routing. |
| messages | array | A list of messages comprising the conversation so far. |
| temperature | number | What sampling temperature to use, between 0 and 2. Default: 1. |
| max_tokens | integer | The maximum number of tokens to generate in the response. |
| stream | boolean | If true, partial message deltas will be streamed. Default: false. |
| plugins | string[] | Plugin IDs to enable for this request. Free: web-search, gutenberg, code-exec. Paid: image-gen, pdf-reader, memory. See Plugins. |
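The parameters in the table can be assembled into a request body before sending. The following is a minimal sketch: `build_chat_request` is a hypothetical helper, and the client-side checks (non-empty `messages`, positive `max_tokens`) are assumptions about sensible input rather than documented server rules. Only the 0 to 2 temperature range and the defaults come from the table.

```python
from typing import Any, Optional


def build_chat_request(
    messages: list[dict[str, str]],
    model: str = "auto",
    temperature: float = 1.0,
    max_tokens: Optional[int] = None,
    stream: bool = False,
    plugins: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Return a JSON-serializable body for POST /v1/chat/completions."""
    if not messages:
        # Assumption: an empty conversation is never useful to send.
        raise ValueError("messages must contain at least one message")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if max_tokens is not None and max_tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")

    payload: dict[str, Any] = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "stream": stream,
    }
    # Optional fields are left out entirely when unset.
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    if plugins:
        payload["plugins"] = list(plugins)
    return payload
```

Passing the result through `json.dumps` gives a body equivalent to the one used with `curl -d`.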
Example Request
curl https://api.kairosroute.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer kr-your-key" \
  -d '{
    "model": "auto",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Explain quantum computing in simple terms."
      }
    ],
    "temperature": 0.7,
    "max_tokens": 500,
    "plugins": ["web-search"]
  }'
Response
{
  "id": "chatcmpl-9xKmP...",
  "object": "chat.completion",
  "created": 1743710400,
  "model": "deepseek-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing is a revolutionary..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 45,
    "completion_tokens": 180,
    "total_tokens": 225
  },
  "x_kairosroute": {
    "provider": "DeepSeek",
    "cost_usd": 0.000027,
    "savings_usd": 0.000423,
    "routed_from": "auto"
  }
}
Custom Response Headers
| Header | Description |
|---|---|
| X-KairosRoute-Provider | Which model provider was used (e.g., "openai", "anthropic") |
| X-KairosRoute-Cost | Cost in USD for this request |
| X-KairosRoute-Savings | Amount saved in USD by using KairosRoute routing |
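Routing metadata is available both in the `x_kairosroute` object of the response body and in the custom headers. The sketch below reads either source into one small record. `RoutingInfo` and both helper functions are hypothetical names, and the header parsing assumes the cost and savings values arrive as plain decimal strings, which the table does not specify.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass
class RoutingInfo:
    provider: str
    cost_usd: float
    savings_usd: float


def routing_info_from_body(body: Mapping[str, Any]) -> RoutingInfo:
    """Read routing metadata from the x_kairosroute object of a response body."""
    meta = body.get("x_kairosroute", {})
    return RoutingInfo(
        provider=str(meta.get("provider", "")),
        cost_usd=float(meta.get("cost_usd", 0.0)),
        savings_usd=float(meta.get("savings_usd", 0.0)),
    )


def routing_info_from_headers(headers: Mapping[str, str]) -> RoutingInfo:
    """Read the same metadata from the X-KairosRoute-* response headers."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RoutingInfo(
        provider=lowered.get("x-kairosroute-provider", ""),
        # Assumption: values are plain decimal strings such as "0.000027".
        cost_usd=float(lowered.get("x-kairosroute-cost", "0")),
        savings_usd=float(lowered.get("x-kairosroute-savings", "0")),
    )
```

The header variant is useful when streaming, where the body is not a single JSON document.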
Error Codes
| Code | Message |
|---|---|
| 401 | Unauthorized - Invalid or missing API key |
| 403 | Forbidden - Premium plugin requires a paid account (PLUGIN_REQUIRES_PAID) |
| 429 | Too Many Requests - Rate limit exceeded |
| 500 | Internal Server Error - Routing service unavailable |
| 503 | Service Unavailable - All models are currently unavailable |
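Client code usually needs to decide which of these errors are worth retrying. The following sketch maps a status code to an exception carrying the message from the table. Treating 429, 500, and 503 as retryable while 401 and 403 are not is an assumption made for this example, and `ApiError` and `error_for_status` are hypothetical names.

```python
from typing import Optional

# Messages copied from the error table.
ERROR_MESSAGES = {
    401: "Unauthorized - Invalid or missing API key",
    403: "Forbidden - Premium plugin requires a paid account (PLUGIN_REQUIRES_PAID)",
    429: "Too Many Requests - Rate limit exceeded",
    500: "Internal Server Error - Routing service unavailable",
    503: "Service Unavailable - All models are currently unavailable",
}

# Assumption: transient conditions may succeed later; 401 and 403 need a
# key or account change, so retrying them unchanged is pointless.
RETRYABLE = frozenset({429, 500, 503})


class ApiError(Exception):
    def __init__(self, status: int, message: str, retryable: bool) -> None:
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message
        self.retryable = retryable


def error_for_status(status: int) -> Optional[ApiError]:
    """Return an ApiError for 4xx/5xx statuses, or None when the call succeeded."""
    if status < 400:
        return None
    message = ERROR_MESSAGES.get(status, "Unexpected error")
    return ApiError(status, message, status in RETRYABLE)
```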
GET /v1/models
Lists the models available to use with the API.
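The response is a list object whose `data` array holds one entry per model, each with `id` and `owned_by` fields, as in the example response in this section. A minimal sketch of grouping model IDs by owner follows; `model_ids_by_owner` is a hypothetical helper, and it assumes every entry carries both fields.

```python
from typing import Any, Mapping


def model_ids_by_owner(listing: Mapping[str, Any]) -> dict[str, list[str]]:
    """Group model IDs from a GET /v1/models response by their owned_by value."""
    grouped: dict[str, list[str]] = {}
    for entry in listing.get("data", []):
        # Assumption: each entry has "id" and "owned_by", as in the example response.
        grouped.setdefault(entry["owned_by"], []).append(entry["id"])
    return grouped
```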
Example Request
curl https://api.kairosroute.com/v1/models \
  -H "Authorization: Bearer kr-your-key"
Example Response
{
  "object": "list",
  "data": [
    {
      "id": "auto",
      "object": "model",
      "owned_by": "kairosroute"
    },
    {
      "id": "gpt-4.1",
      "object": "model",
      "owned_by": "openai"
    },
    {
      "id": "claude-sonnet-4-20250514",
      "object": "model",
      "owned_by": "anthropic"
    },
    {
      "id": "grok-4.1-fast",
      "object": "model",
      "owned_by": "xai"
    },
    {
      "id": "deepseek-chat",
      "object": "model",
      "owned_by": "deepseek"
    },
    {
      "id": "minimax-m2.5",
      "object": "model",
      "owned_by": "minimax"
    }
  ]
}
API Key Management
Manage your API keys through the dashboard or these endpoints.
POST /api/auth
Create a new API key.
curl -X POST https://kairosroute.com/api/auth \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer kr-your-key" \
  -d '{
    "name": "Production Key",
    "expires_in": 7776000
  }'
GET /api/auth
List all your API keys.
curl https://kairosroute.com/api/auth \
  -H "Authorization: Bearer kr-your-key"
DELETE /api/auth/:key_id
Revoke an API key.
curl -X DELETE https://kairosroute.com/api/auth/key_123abc \
  -H "Authorization: Bearer kr-your-key"
Advanced Request Parameters
These optional parameters can be added to any /v1/chat/completions request body:
- data_collection: "allow" (default) or "deny". Setting "deny" excludes providers that may train on your data.
- zdr: set to true for Zero Data Retention. Routes only to providers with formal ZDR guarantees.
- preferred_max_latency: maximum P75 latency in milliseconds. Deprioritizes slower providers.
- preferred_min_throughput: minimum P50 throughput in tokens per second. Deprioritizes slower providers.
- min_quantization / max_quantization: filter by model precision. Values: fp32, fp16, bf16, fp8, int8, int4.
- response_format: {"type":"json_object"} or {"type":"json_schema"}. Enables auto-healing of malformed JSON.
See Advanced Routing, Guardrails, and kr-compare for full documentation.
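Because these fields sit alongside the standard parameters in the request body, they can be layered onto an existing payload. The sketch below covers a subset of them. `with_routing_options` is a hypothetical helper, and the client-side validation reflects only the values listed in this section, not confirmed server behavior.

```python
from typing import Any, Optional

# Precision values listed in this section.
QUANTIZATION_LEVELS = ("fp32", "fp16", "bf16", "fp8", "int8", "int4")


def with_routing_options(
    payload: dict[str, Any],
    *,
    data_collection: str = "allow",
    zdr: bool = False,
    preferred_max_latency: Optional[int] = None,
    min_quantization: Optional[str] = None,
) -> dict[str, Any]:
    """Return a copy of payload with advanced routing fields added."""
    if data_collection not in ("allow", "deny"):
        raise ValueError('data_collection must be "allow" or "deny"')
    if min_quantization is not None and min_quantization not in QUANTIZATION_LEVELS:
        raise ValueError(f"min_quantization must be one of {QUANTIZATION_LEVELS}")

    extended = dict(payload)  # shallow copy so the caller's dict is untouched
    extended["data_collection"] = data_collection
    if zdr:
        extended["zdr"] = True
    if preferred_max_latency is not None:
        extended["preferred_max_latency"] = preferred_max_latency  # milliseconds
    if min_quantization is not None:
        extended["min_quantization"] = min_quantization
    return extended
```

For example, `with_routing_options(body, data_collection="deny", zdr=True)` yields a request restricted to providers that meet both privacy conditions.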
Rate Limits
- Free: 60 requests/minute, 100K tokens/mo via BYOK plus $5 managed-key trial credit
- Team ($99/mo): 600 requests/minute, 10M tokens included, $0.40 per 1M overage
- Business ($499/mo): 3,000 requests/minute, 50M tokens included, $0.30 per 1M overage
- Enterprise: Custom RPM, custom token commitment, volume pricing
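The Team and Business figures above imply a simple monthly cost estimate: the base price plus overage on tokens beyond the included amount. The sketch below assumes overage is billed pro rata per token rather than per started million, which the tier list does not state. `monthly_cost_usd` and the `TIERS` table are hypothetical.

```python
from decimal import Decimal

# (base price USD, included tokens, overage USD per 1M tokens), from the tier list.
TIERS = {
    "team": (Decimal("99"), 10_000_000, Decimal("0.40")),
    "business": (Decimal("499"), 50_000_000, Decimal("0.30")),
}


def monthly_cost_usd(tier: str, tokens_used: int) -> Decimal:
    """Estimate one month's cost for a tier and a token count."""
    base, included, overage_per_million = TIERS[tier]
    extra_tokens = max(0, tokens_used - included)
    # Assumption: overage is charged pro rata, not rounded up to whole millions.
    overage = overage_per_million * Decimal(extra_tokens) / Decimal(1_000_000)
    return base + overage
```

Under that assumption, a Team account using 12.5M tokens would pay $99 plus 2.5 × $0.40, or $100 in total.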
Per-second burst limits scale with tier. When you exceed your quota, the API returns a 429 status code. Retry after the time specified in the Retry-After header. See pricing for full tier details.
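A client can turn the Retry-After value into a wait time before retrying. This page does not say which form the header takes, so the sketch below accepts both forms allowed by HTTP: a number of seconds or an HTTP date. `retry_delay_seconds`, its one-second fallback, and its 60-second cap are assumptions for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_delay_seconds(
    retry_after: Optional[str],
    now: Optional[datetime] = None,
    default: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Convert a Retry-After header value into seconds to wait."""
    if not retry_after:
        return default
    value = retry_after.strip()
    try:
        delay = float(value)  # delta-seconds form, e.g. "5"
    except ValueError:
        try:
            target = parsedate_to_datetime(value)  # HTTP-date form
        except (TypeError, ValueError):
            return default  # unparseable header: fall back to the default
        if target.tzinfo is None:
            target = target.replace(tzinfo=timezone.utc)
        current = now or datetime.now(timezone.utc)
        delay = (target - current).total_seconds()
    # Never wait a negative time, and never longer than the cap.
    return max(0.0, min(delay, cap))
```

Calling `time.sleep(retry_delay_seconds(headers.get("Retry-After")))` before the next attempt keeps retries within the server's stated window.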