Docs/Tutorials/Quickstart

>_ DOCS / TUTORIAL

YOUR FIRST
ROUTED REQUEST.

Send a chat completion through P402 in under 5 minutes. No SDK, no dependencies. Just curl.

Prerequisites

  • 1.A P402 API key — create one free at p402.io
  • 2.curl installed (comes with macOS and most Linux distros)
  • 3.Optional: jq for pretty-printing JSON responses
1

Verify Your API Key

curl -s https://p402.io/api/v2/health \
  -H "Authorization: Bearer $P402_API_KEY" | jq .

Expected response

{
  "status": "healthy",
  "version": "2.0.0",
  "providers": 13,
  "cache": "connected",
  "timestamp": "2026-04-15T12:00:00.000Z"
}
2

Create a Session

curl -s -X POST https://p402.io/api/v2/sessions \
  -H "Authorization: Bearer $P402_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"budget_usd": 5}' | jq .

This creates a budget-capped session. All requests using this session_id are tracked against the $5 budget. Save the returned session_id for the next step.

3

Send a Chat Completion

curl -s -X POST https://p402.io/api/v2/chat/completions \
  -H "Authorization: Bearer $P402_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "What is the x402 protocol?"}],
    "p402": {
      "mode": "cost",
      "cache": true,
      "session_id": "YOUR_SESSION_ID"
    }
  }' | jq .
4

Read the Response Metadata

Every P402 response includes a p402_metadata object. This tells you which model was selected, how much it cost, what the direct cost would have been without routing, how much you saved, and whether the response was cached.

{
  "choices": [{ "message": { "content": "..." } }],
  "p402_metadata": {
    "provider": "deepseek",       // P402 selected DeepSeek (cheapest for this query)
    "model": "deepseek-v3",
    "cost_usd": 0.0003,           // What you paid
    "direct_cost": 0.0031,        // What GPT-4o would have cost
    "savings": 0.0028,            // You saved 90%
    "input_tokens": 24,
    "output_tokens": 187,
    "cached": false,              // First request; next identical query returns from cache
    "latency_ms": 1240
  }
}
5

Try a Cached Request

Send the exact same request again. P402's semantic cache recognizes it and returns the cached response at zero cost in under 50ms.

curl -s -X POST https://p402.io/api/v2/chat/completions \
  -H "Authorization: Bearer $P402_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "What is the x402 protocol?"}],
    "p402": {
      "mode": "cost",
      "cache": true,
      "session_id": "YOUR_SESSION_ID"
    }
  }' | jq .p402_metadata

Cached response

{
  "provider": "cache",
  "model": null,
  "cost_usd": 0.0000,
  "cached": true,
  "latency_ms": 12
}

What's next

You just routed your first request and saw the cache in action. Here is where to go next: