AI Orchestration Router

Optimize your AI infrastructure with intelligent, policy-driven routing.

How Routing Works

1. Semantic
Requests are embedded and checked against the Semantic Cache. If a similar request (similarity > 0.95) was settled recently, the cached response is returned instantly for $0.
2. Rank
Providers are ranked based on the requested mode:
  • cost: Cheapest model that meets capability requirements.
  • speed: Lowest TTFT (Time to First Token) provider.
  • quality: Highest Elo benchmark score (e.g., Claude Opus, GPT-4).
3. Execute
The router attempts the top-ranked provider. If it fails (rate limit, outage), it automatically fails over to the next-best option.
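The three steps above can be sketched as a single routing function. This is a minimal illustration, not the platform's implementation: the cache is a list of (embedding, response) pairs, providers are plain dicts, and only the three documented modes (cost, speed, quality) are handled. The 0.95 similarity threshold comes from step 1.

```python
import math

SIM_THRESHOLD = 0.95  # cache hit requires similarity > 0.95 (step 1)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def route(embedding, cache, providers, mode):
    # 1. Semantic: serve near-duplicate requests from cache at $0
    for cached_emb, response in cache:
        if cosine(embedding, cached_emb) > SIM_THRESHOLD:
            return response, 0.0

    # 2. Rank: order providers by the requested mode
    key = {
        "cost": lambda p: p["cost"],       # cheapest first
        "speed": lambda p: p["ttft_ms"],   # lowest time to first token
        "quality": lambda p: -p["elo"],    # highest benchmark score
    }[mode]
    ranked = sorted(providers, key=key)

    # 3. Execute: try providers in order, failing over on errors
    for p in ranked:
        try:
            return p["call"](), p["cost"]
        except Exception:
            continue  # rate limit or outage: fall through to next provider
    raise RuntimeError("all providers failed")
```

Note that the cache check short-circuits ranking entirely, which is why a cache hit costs $0 regardless of mode.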

OpenRouter Meta-Provider

P402 integrates natively with OpenRouter as a primary meta-provider. This enables instant access to 300+ specialized models through a single orchestration layer.

Latest Frontier Models

Access GPT-5.2, Claude 4.5, and Gemini 3.0 the moment they drop, with zero manual adapter updates required.

Unified Settlement

Use a single OPENROUTER_API_KEY to settle requests across hundreds of models while maintaining a transparent 1% platform fee.
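For illustration, OpenRouter exposes an OpenAI-compatible chat completions endpoint, so one key reaches every model behind it. The helper below (`build_request` is a hypothetical name) only constructs the request and performs no network call; the URL and header shape follow OpenRouter's public API.

```python
import json
import os

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model, prompt):
    # One OPENROUTER_API_KEY covers every model behind the meta-provider.
    headers = {
        "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    # The body follows the OpenAI chat completions schema.
    body = json.dumps({
        "model": model,  # e.g. "anthropic/claude-3-opus"
        "messages": [{"role": "user", "content": prompt}],
    })
    return OPENROUTER_URL, headers, body
```

Swapping models means changing only the `model` string; no per-provider adapter or key is needed.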

Configuration

Control routing behavior per-request via the configuration object.

{
  "mode": "balanced",
  "maxCost": 0.05,
  "provider": "anthropic", // Force a provider (optional)
  "model": "claude-3-opus-20240229" // Force a model (optional)
}
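A per-request config like the one above is typically merged onto server-side defaults before routing. The sketch below assumes that pattern; the defaults and validation rules here are illustrative, not the platform's actual schema (the "balanced" mode is taken from the example config).

```python
DEFAULTS = {"mode": "balanced", "maxCost": None, "provider": None, "model": None}
VALID_MODES = {"cost", "speed", "quality", "balanced"}

def resolve_config(overrides):
    # Per-request fields win over defaults; unknown values fail fast
    # rather than silently falling back.
    cfg = {**DEFAULTS, **overrides}
    if cfg["mode"] not in VALID_MODES:
        raise ValueError(f"unknown mode: {cfg['mode']}")
    if cfg["maxCost"] is not None and cfg["maxCost"] <= 0:
        raise ValueError("maxCost must be positive")
    return cfg
```

Forcing `provider` or `model` leaves the other fields intact, so a cost ceiling still applies even when the model is pinned.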