Migrate from OpenAI to KairosRoute in 2 Minutes
If your codebase looks like this —
```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
```
— then you're exactly two lines away from 50–85% lower API bills, multi-provider failover, and a real cost dashboard. This guide walks you through it.
The two-line version
KairosRoute is wire-compatible with the OpenAI SDK. All you do is point the client at our base URL and use your KairosRoute key.
```python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.kairosroute.com/v1",  # ← 1
    api_key=os.environ["KAIROSROUTE_API_KEY"],  # ← 2
)

# Everything else stays identical.
response = client.chat.completions.create(
    model="auto",  # or "gpt-5.4", "claude-sonnet-4.6", etc.
    messages=[{"role": "user", "content": "Write a haiku about routing."}],
)
```

That's it. Streaming, tool calls, JSON mode, vision inputs, and seed parameters all work unchanged. The model string can be "auto" (let us pick) or a specific model name from any of the 10 providers we support.
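Streaming in Python mirrors the non-streaming call: pass stream=True and iterate the chunks. A minimal sketch — the collect_stream helper below is ours, not part of the OpenAI SDK, and the network call is left commented since it needs a live key:

```python
def collect_stream(chunks):
    """Join the text deltas from an OpenAI-style chat-completion stream."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)

# Usage with the client configured above:
# stream = client.chat.completions.create(
#     model="auto",
#     messages=[{"role": "user", "content": "Explain routing to a PM."}],
#     stream=True,
# )
# print(collect_stream(stream))
```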
Get an API key
Sign up at kairosroute.com/broker/signup. No credit card required for the free tier (100K tokens/mo + a $5 managed-key trial credit). You'll land in the dashboard; the API key is on the first screen.
TypeScript / Node
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.kairosroute.com/v1",
  apiKey: process.env.KAIROSROUTE_API_KEY!,
});

const res = await client.chat.completions.create({
  model: "auto",
  messages: [{ role: "user", content: "Summarize this thread..." }],
});
```

Streaming in Node
```typescript
const stream = await client.chat.completions.create({
  model: "auto",
  messages: [{ role: "user", content: "Explain routing to a PM." }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```

Go
Most Go OpenAI clients (e.g. sashabaranov/go-openai) expose a BaseURL field on the config.
```go
import (
    "context"
    "os"

    openai "github.com/sashabaranov/go-openai"
)

config := openai.DefaultConfig(os.Getenv("KAIROSROUTE_API_KEY"))
config.BaseURL = "https://api.kairosroute.com/v1"
client := openai.NewClientWithConfig(config)

resp, err := client.CreateChatCompletion(
    context.Background(),
    openai.ChatCompletionRequest{
        Model: "auto",
        Messages: []openai.ChatCompletionMessage{
            {Role: "user", Content: "Route me to the cheap model."},
        },
    },
)
```

curl
```shell
curl https://api.kairosroute.com/v1/chat/completions \
  -H "Authorization: Bearer $KAIROSROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "hi"}]
  }'
```

What changes (and what doesn't)
We aim for 100% wire compatibility. Here's what you should expect:
- Works the same: chat completions, streaming, function / tool calls, JSON mode, response format, vision inputs, seed, temperature, top_p, logprobs, stop sequences.
- Works the same, better: failover (we retry across providers automatically), cost attribution (returned in the response headers).
- New: model="auto" — routes to the cheapest model meeting your quality floor.
- New: extra_body.kr — override the quality floor, latency budget, or provider allowlist per request.
- Not supported (yet): Assistants v2 beta, Realtime voice, fine-tuning job endpoints. If you need any of these, keep them on OpenAI direct and route the rest through us.
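The extra_body.kr override mentioned above can be passed through the OpenAI SDK's extra_body parameter. A sketch — only extra_body.kr itself comes from this guide; the key names quality_floor, latency_budget_ms, and providers are our illustrative guesses, so check the API reference for the documented names:

```python
def kr_overrides(quality_floor=None, latency_budget_ms=None, providers=None):
    """Build an extra_body payload for per-request routing overrides.

    The field names inside "kr" are illustrative, not documented keys.
    """
    kr = {}
    if quality_floor is not None:
        kr["quality_floor"] = quality_floor
    if latency_budget_ms is not None:
        kr["latency_budget_ms"] = latency_budget_ms
    if providers is not None:
        kr["providers"] = providers
    return {"kr": kr}

# Usage with the OpenAI SDK:
# client.chat.completions.create(
#     model="auto",
#     messages=[{"role": "user", "content": "hi"}],
#     extra_body=kr_overrides(quality_floor=0.9, providers=["anthropic", "openai"]),
# )
```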
Model name mapping
You can keep your existing gpt-* model strings — we recognize them. But if you want to let our router pick the cheapest valid model, use auto.
```
# These all work — no code changes needed:
gpt-4             → routes to whatever your quality floor allows
gpt-4-turbo       → same
gpt-5.4           → pinned to OpenAI GPT-5.4
gpt-5.4-mini      → pinned to OpenAI GPT-5.4 Mini
claude-sonnet-4.6 → pinned to Anthropic Sonnet
auto              → let kr-auto pick (recommended)
```
Gotchas
Rate limits are per-plan
We don't inherit OpenAI's rate limits. Free tier is 60 RPM, Team is 600 RPM, Business is 3,000 RPM, Enterprise is custom. If you're pushing OpenAI hard, size your plan accordingly.
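When you cross your plan's RPM ceiling you'll get OpenAI-style 429s, so the usual client-side backoff applies. A minimal, dependency-free sketch — the with_backoff helper is ours, not part of any SDK:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry a callable with exponential backoff plus jitter.

    In real code you'd catch openai.RateLimitError specifically;
    Exception is used here only to keep the sketch dependency-free.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.25))

# Usage:
# response = with_backoff(lambda: client.chat.completions.create(
#     model="auto",
#     messages=[{"role": "user", "content": "hi"}],
# ))
```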
Response headers include routing decisions
Look at the x-kr-* headers on every response:
```
x-kr-routed-to: anthropic/claude-haiku-4.5
x-kr-task-category: summarization
x-kr-cost-usd: 0.00042
x-kr-fallback-chain: anthropic,openai,groq
x-kr-classifier-confidence: 0.94
```
This is cheap observability — log these headers and you have a poor man's routing dashboard before you even open our actual dashboard.
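A sketch of that header logging — routing_info is our helper, and the header names come from the example above. With the OpenAI Python SDK you can reach raw headers via client.chat.completions.with_raw_response.create(...):

```python
def routing_info(headers):
    """Pull the x-kr-* routing headers out of a response header mapping."""
    info = {
        k.lower().removeprefix("x-kr-"): v
        for k, v in headers.items()
        if k.lower().startswith("x-kr-")
    }
    if "cost-usd" in info:
        info["cost-usd"] = float(info["cost-usd"])
    return info

# Usage:
# raw = client.chat.completions.with_raw_response.create(
#     model="auto",
#     messages=[{"role": "user", "content": "hi"}],
# )
# print(routing_info(raw.headers))  # log this per request
```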
Errors use the same shape
OpenAI-style { error: { message, type, code } } payloads. Your existing error handling works.
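If you parse error bodies by hand (for example in a proxy or logging layer), the envelope unpacks the same way it does for OpenAI. A small sketch — parse_error is our helper, built only on the { error: { message, type, code } } shape stated above:

```python
import json

def parse_error(body):
    """Extract (type, code, message) from an OpenAI-style error payload."""
    err = json.loads(body).get("error", {})
    return err.get("type"), err.get("code"), err.get("message")
```

If you use the OpenAI SDK, you don't need this: the SDK raises its usual exception classes and your existing handlers keep working.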
After migration: what to check
- Run 24 hours on model="auto" with your current traffic shape. Expected cost delta: -50% to -85%. If you're not seeing that, open an issue; we'll look at your routing traces with you.
- Look at the cost-per-task-type breakdown in the dashboard. This is where most teams have an "oh no" moment — they realize 40% of their spend was on one task type that didn't need it.
- Set up a quality alert. We do this automatically, but you can tune the threshold per workload.
- Enable BYOK if you have existing provider contracts you want to honor.
Full API reference: /docs/api-reference. Quickstart with copy-pastable code: /docs/quickstart. Any questions, email support@kairosroute.com.
Ready to route smarter?
KairosRoute gives you a single OpenAI-compatible endpoint that routes every request to the cheapest model meeting your quality bar — plus the observability, A/B testing, and cost analytics that turn cheaper infrastructure into a durable margin.
Related Reading
LiteLLM is a great Python library for calling multiple LLM providers from one interface. KairosRoute is a hosted routing-and-observability platform. Here is when you actually want the library vs. when you want the platform, and how they fit together.
LangChain already uses ChatOpenAI as its default LLM wrapper. Point it at KairosRoute, set model="kr-auto", and every chain, agent, and LCEL pipeline in your app starts routing to the cheapest model that meets your quality bar — no refactor required.