Connect to any LLM or provider, cloud or self-hosted, with the same API.
OpenAI
Anthropic
GMI Cloud
Google Vertex
Fireworks
OpenRouter
Azure Foundry
z.ai
Groq
Moonshot
Automatically direct each request to the fastest, most reliable, or most affordable model, with no manual intervention required.
Monitor usage and costs in real time so you can avoid expensive models and stay within budget. Keep your costs predictable and under control.
If a provider is slow or unavailable, instantly route traffic to healthy models so your users never experience downtime.
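To make the failover idea concrete, here is a minimal sketch of ordered fallback: try each provider in turn and move on when one fails. It illustrates the concept only and is not the gateway's actual implementation; `completeWithFailover`, the `providers` shape, and the stand-in `call` functions are all hypothetical names introduced for this example.

```javascript
// Hypothetical sketch: try providers in priority order, falling through on failure.
// Each provider is assumed to be { name, call(prompt) -> Promise<string> }.
async function completeWithFailover(providers, prompt) {
  const errors = [];
  for (const provider of providers) {
    try {
      const text = await provider.call(prompt);
      return { provider: provider.name, text };
    } catch (err) {
      // Record the failure and continue with the next provider.
      errors.push(`${provider.name}: ${err.message}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}

// Stand-in providers: the first simulates an outage, the second succeeds.
const demoProviders = [
  { name: "primary", call: async () => { throw new Error("503 unavailable"); } },
  { name: "backup", call: async (prompt) => `echo: ${prompt}` },
];
```

With these stand-ins, `await completeWithFailover(demoProviders, "hi")` resolves with the result from `backup`, because `primary` throws.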
Cache common prompts and responses to improve speed and reduce unnecessary calls.
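As a rough illustration of response caching, the sketch below keys an in-memory cache on the model name plus the exact message list, so an identical request is served without a second upstream call. `PromptCache` and `getOrCompute` are hypothetical names, and a production cache would also need expiry and size limits, which are omitted here.

```javascript
// Hypothetical in-memory cache keyed by model + messages (no expiry or eviction).
class PromptCache {
  constructor() {
    this.entries = new Map();
    this.hits = 0;
  }

  // Identical model and messages produce an identical key.
  static key(model, messages) {
    return JSON.stringify([model, messages]);
  }

  // Return the cached value, or run `compute` once and store its result.
  getOrCompute(model, messages, compute) {
    const k = PromptCache.key(model, messages);
    if (this.entries.has(k)) {
      this.hits += 1;
      return this.entries.get(k);
    }
    const value = compute();
    this.entries.set(k, value);
    return value;
  }
}
```

If `compute` returns a promise, the promise itself is cached, so concurrent identical requests would share one in-flight call.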
See exactly how requests are being routed, so you can feel confident that your system is working as intended.
Redact sensitive information and choose which providers can access your data, keeping you in control of privacy.
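Redaction can be pictured as a pass over the prompt text before it leaves your network. The sketch below is an assumption-laden example: the two patterns (email addresses and US-style SSNs) and the `redact` helper are hypothetical and far from exhaustive, and the product's real redaction rules are not described in this page.

```javascript
// Hypothetical patterns; real deployments would need a much broader rule set.
const PATTERNS = [
  { label: "[EMAIL]", regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "[SSN]", regex: /\b\d{3}-\d{2}-\d{4}\b/g },
];

// Replace every match of every pattern with its placeholder label.
function redact(text) {
  return PATTERNS.reduce(
    (out, { label, regex }) => out.replace(regex, label),
    text
  );
}
```

For example, `redact("Mail jane@example.com, SSN 123-45-6789.")` yields `"Mail [EMAIL], SSN [SSN]."`.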
Ensure requests and data are only routed to trusted models and approved regions, meeting your privacy and regulatory requirements.
Distribute requests across multiple providers and keys to avoid rate limits and maintain high performance, even as you grow.
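One simple way to spread load across several API keys is round-robin rotation, sketched below. `createKeyRotator` is a hypothetical helper for illustration; the gateway may well use a different balancing strategy.

```javascript
// Hypothetical round-robin rotator: each call returns the next key in the list.
function createKeyRotator(keys) {
  if (keys.length === 0) {
    throw new Error("at least one key is required");
  }
  let next = 0;
  return () => {
    const key = keys[next];
    next = (next + 1) % keys.length; // wrap around after the last key
    return key;
  };
}
```

Calling the returned function repeatedly cycles through the keys in order, so no single key absorbs all the traffic.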
Prevent abuse and unexpected spikes with easy-to-set rate limits, so you can scale safely.
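Rate limits of this kind are commonly implemented with a token bucket, which allows short bursts up to a capacity while capping the sustained rate. The sketch below is a generic version of that technique, not the product's configuration; `TokenBucket`, `tryRemove`, and the injectable `now` parameter are hypothetical names chosen for the example.

```javascript
// Hypothetical token bucket: `capacity` bounds bursts, `refillPerSecond` bounds the sustained rate.
class TokenBucket {
  constructor(capacity, refillPerSecond, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full
    this.last = now;        // timestamp (ms) of the last refill
  }

  // Returns true and consumes a token if the request is allowed, false otherwise.
  tryRemove(now = Date.now()) {
    const elapsedSeconds = (now - this.last) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Passing `now` explicitly makes the bucket easy to test; in normal use the default `Date.now()` applies.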
on_http_request:
  - actions:
      - type: ai-router
        config: {}

import OpenAI from "openai";

const ngrokClient = new OpenAI({
  baseURL: 'https://your_endpoint.ngrok.dev',
  apiKey: 'YOUR_PROVIDER_API_KEY',
});

const completion = await ngrokClient.chat.completions.create({
  model: 'openai/gpt-4o',
  messages: [
    { role: 'system', content: 'Talk like a pirate.' },
    { role: 'user', content: `Are semicolons optional in JavaScript?` },
  ],
  stream: true
});
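Because the request sets `stream: true`, the OpenAI SDK returns an async iterable of chunks rather than a single response, with each chunk carrying a text delta in `choices[0].delta.content`. A minimal sketch of reading it follows; `collectStream` is a hypothetical helper name, not part of the SDK.

```javascript
// Hypothetical helper: accumulate streamed text deltas into one string.
async function collectStream(stream) {
  let text = "";
  for await (const chunk of stream) {
    // Some chunks (for example the final one) carry no content, so default to "".
    text += chunk.choices[0]?.delta?.content ?? "";
  }
  return text;
}
```

With the client above, usage would look like `console.log(await collectStream(completion));`. To display tokens as they arrive, write each delta to the output inside the loop instead of accumulating it.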
Get early access, help shape the platform, and never fight AI traffic headaches again.