---
name: p402
description: >
  Build cost-aware AI applications with P402.io, the payment-aware AI orchestration layer combining
  multi-provider LLM routing (300+ models) with x402 stablecoin micropayments on Base. Use this skill
  when the user wants to: route AI requests across providers with cost/speed/quality optimization, add
  spending limits or billing guards to AI agents, implement session-based budget management, integrate
  x402 USDC micropayments for AI services, set up A2A agent communication with payment rails, create
  AP2 spending mandates, migrate from single-provider OpenAI/Anthropic to multi-provider routing,
  implement semantic caching for LLM costs, or compare model pricing. Trigger for any mention of P402,
  x402 payments, AI cost reduction, multi-model routing, agent spending controls, or payment-aware
  orchestration, even without exact terms.
---

# P402: Payment-Aware AI Orchestration

P402.io is an infrastructure layer that solves two problems at once: fragmented AI provider APIs and broken micropayment economics. It routes LLM requests to the optimal provider based on cost, speed, or quality, while settling payments in USDC/USDT on the Base blockchain using the x402 protocol.

The key mental model: P402 sits between your application and AI providers the same way a payment network sits between merchants and banks. It handles routing, authorization, and settlement so your code just makes API calls.

## Why P402 Exists

Traditional payment processors charge ~$0.30 per transaction, which makes micropayments for AI API calls (often $0.001-$0.05) economically impossible. P402 uses stablecoin settlement on Base L2 where transaction costs are fractions of a cent, enabling true pay-per-request economics for AI agents.

Most AI applications are "cost-blind" -- they hardcode a single provider and overpay by 70-85%. P402 makes applications "cost-aware" by exposing real-time pricing across 300+ models and routing intelligently.

## Core Concepts

### The Session Lifecycle

Sessions are the foundational primitive. Every interaction with P402 flows through a session:

1. **Create** a session with a budget cap and optional policies
2. **Fund** the session (USDC via Base Pay, direct transfer, or test credits)
3. **Use** the session for chat completions -- P402 tracks spending per request
4. **Expire** -- sessions auto-close when the budget is exhausted or the time limit is reached

Sessions enforce spending boundaries. An agent with a $10 session physically cannot spend $10.01. This is the core safety mechanism for autonomous agent spending.
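
The budget cap can be pictured with a small local model. This is a minimal sketch, not P402's server-side implementation: the names `BudgetSession`, `createBudgetSession`, and `tryReserve` are hypothetical, and amounts are kept as integer micro-dollars so that a $10.01 charge against a $10 budget is rejected without floating-point drift.

```typescript
// Illustrative only: a local model of a budget-capped session.
// All names here are hypothetical and not part of the P402 API.
type BudgetStatus = "active" | "exhausted";

interface BudgetSession {
  totalMicroUsd: number; // budget cap in micro-dollars (1 USD = 1e6)
  usedMicroUsd: number;  // amount reserved so far
  status: BudgetStatus;
}

function createBudgetSession(budgetUsd: number): BudgetSession {
  return { totalMicroUsd: Math.round(budgetUsd * 1e6), usedMicroUsd: 0, status: "active" };
}

// Reserve the cost before the request runs; refuse if it would exceed the cap.
function tryReserve(session: BudgetSession, costUsd: number): boolean {
  const cost = Math.round(costUsd * 1e6);
  if (session.status !== "active") return false;
  if (session.usedMicroUsd + cost > session.totalMicroUsd) return false;
  session.usedMicroUsd += cost;
  if (session.usedMicroUsd === session.totalMicroUsd) session.status = "exhausted";
  return true;
}
```

Reserving before the request is sent, rather than charging afterwards, is what makes the cap a hard limit in this sketch.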

### Routing Modes

Every chat completion request can specify a routing mode that controls how P402 selects the provider and model:

| Mode | Optimizes For | Best For | Typical Provider |
|------|--------------|----------|-----------------|
| `cost` | Lowest price, acceptable quality | Batch processing, background tasks, high-volume | DeepSeek V3.2, Haiku 4.5, GPT-4o-mini |
| `quality` | Best output, price secondary | Final outputs, complex reasoning, user-facing answers | Claude Opus 4.6, GPT-5.2, Gemini 3.1 Pro |
| `speed` | Lowest latency, price secondary | Real-time chat, interactive UX, streaming | Groq (LPU), Flash models |
| `balanced` | Equal weight across all factors | General purpose, default choice | Sonnet 4.6, GPT-4o-mini, Gemini 3.1 Pro |

The router scores every available model using a proprietary weighted algorithm, filters by capability requirements and policy constraints, and returns the optimal choice. If the selected provider fails, automatic failover retries with the next-best alternative.
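
To make the filter, score, and failover idea concrete, here is a hedged sketch of a weighted ranking. P402's actual scoring algorithm and weights are proprietary and not documented here, so `ModelCandidate`, `ScoreWeights`, `ASSUMED_WEIGHTS`, and `rankCandidates` are all illustrative assumptions, as are the numbers.

```typescript
// Illustrative scoring sketch. Every name and number below is an assumption
// made for this example, not P402's real algorithm.
interface ModelCandidate {
  id: string;
  costScore: number;    // normalized 0..1, higher means cheaper
  qualityScore: number; // normalized 0..1
  speedScore: number;   // normalized 0..1, higher means lower latency
  capabilities: string[];
}

interface ScoreWeights { cost: number; quality: number; speed: number; }

// Hypothetical weights for two of the routing modes.
const ASSUMED_WEIGHTS: { cost: ScoreWeights; quality: ScoreWeights } = {
  cost: { cost: 0.7, quality: 0.2, speed: 0.1 },
  quality: { cost: 0.1, quality: 0.8, speed: 0.1 },
};

// Filter by required capabilities, then sort by weighted score (best first).
function rankCandidates(
  candidates: ModelCandidate[],
  weights: ScoreWeights,
  required: string[]
): string[] {
  return candidates
    .filter((m) => required.every((cap) => m.capabilities.indexOf(cap) !== -1))
    .map((m) => ({
      id: m.id,
      score:
        weights.cost * m.costScore +
        weights.quality * m.qualityScore +
        weights.speed * m.speedScore,
    }))
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.id);
}
```

In this sketch the first id is the routing choice and the remaining ids form the failover order.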

**Decision guidance for developers:** Start with `balanced` as your default. Switch to `cost` for any task where quality above "good enough" does not matter (summarization, classification, extraction). Use `quality` only for tasks where the output is the final product a human will read or where reasoning depth is critical. Use `speed` when time-to-first-byte matters more than cost (real-time chat, autocomplete).
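
The guidance above can be encoded as a small lookup so that call sites do not hardcode modes. This is a sketch: the `TaskKind` labels and the `chooseRoutingMode` helper are hypothetical names chosen for this example; only the four mode strings come from the P402 request schema.

```typescript
type P402RoutingMode = "cost" | "quality" | "speed" | "balanced";

// Hypothetical task labels, taken from the examples in the guidance text.
type TaskKind =
  | "summarization"
  | "classification"
  | "extraction"
  | "final_output"
  | "complex_reasoning"
  | "realtime_chat"
  | "autocomplete"
  | "general";

function chooseRoutingMode(task: TaskKind): P402RoutingMode {
  switch (task) {
    case "summarization":
    case "classification":
    case "extraction":
      return "cost"; // "good enough" quality is sufficient
    case "final_output":
    case "complex_reasoning":
      return "quality"; // a human reads the output, or reasoning depth matters
    case "realtime_chat":
    case "autocomplete":
      return "speed"; // time-to-first-byte matters most
    default:
      return "balanced"; // the recommended default
  }
}
```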

### The Billing Guard

P402 enforces a multi-layer defense system on every request. Default limits are configurable per tenant via the [P402 dashboard](https://p402.io/dashboard). Developers should handle these error codes:

| Layer | Protection | Error Code | What To Do |
|-------|-----------|------------|------------|
| Rate limit | Requests per hour cap | `RATE_LIMIT_EXCEEDED` | Back off, use `retryAfterMs` from error |
| Daily circuit breaker | Daily spend cap | `DAILY_LIMIT_EXCEEDED` | Contact support or wait for daily reset |
| Concurrent requests | Simultaneous request cap | `TOO_MANY_CONCURRENT` | Queue requests, reduce parallelism |
| Anomaly detection | Statistical outlier detection | Logged warning (soft) | Unusual cost pattern flagged, not blocked |
| Per-request cap | Single request cost cap | `REQUEST_TOO_EXPENSIVE` | Use a cheaper model or reduce max_tokens |
| Budget reservation | Atomic budget check | Insufficient budget error | Fund the session with more USDC |

The guard runs a pre-check before every completion, executing all layers in sequence. If any hard layer fails, the request is rejected before it reaches the provider. This fail-closed design means an agent cannot spend money it has not been authorized to spend. View and adjust your limits at [p402.io/dashboard](https://p402.io/dashboard).
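
A client can turn the table above into an explicit recovery plan. The sketch below assumes the error has already been parsed into an object with `code`, `message`, and optional `retryAfterMs`; the `RecoveryAction` type, the `planRecovery` helper, and the 1000 ms fallback delay are assumptions for illustration.

```typescript
interface BillingGuardErrorInfo {
  code?: string;
  message: string;
  retryAfterMs?: number;
}

// Hypothetical action type describing what the caller should do next.
type RecoveryAction =
  | { kind: "retry"; delayMs: number }
  | { kind: "queue" }
  | { kind: "downgrade" }
  | { kind: "stop"; reason: string };

function planRecovery(err: BillingGuardErrorInfo): RecoveryAction {
  switch (err.code) {
    case "RATE_LIMIT_EXCEEDED":
      // Prefer the server's hint; the 1000 ms fallback is an assumed default.
      return { kind: "retry", delayMs: err.retryAfterMs ?? 1000 };
    case "TOO_MANY_CONCURRENT":
      return { kind: "queue" }; // reduce parallelism, then resend
    case "REQUEST_TOO_EXPENSIVE":
      return { kind: "downgrade" }; // cheaper model or lower max_tokens
    case "DAILY_LIMIT_EXCEEDED":
      return { kind: "stop", reason: "daily spend cap reached" };
    default:
      return { kind: "stop", reason: err.message };
  }
}
```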

### The OpenAI-Compatible Interface

P402's chat completions endpoint is a drop-in replacement for OpenAI's API. The only additions are the `p402` extension object and the `p402_metadata` in the response:

```typescript
// Before: direct OpenAI call
const response = await fetch('https://api.openai.com/v1/chat/completions', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'Authorization': `Bearer ${OPENAI_KEY}` },
  body: JSON.stringify({ model: 'gpt-4o', messages })
});

// After: P402 (same shape, smarter routing)
const response = await fetch('https://p402.io/api/v2/chat/completions', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'Authorization': `Bearer ${P402_KEY}` },
  body: JSON.stringify({
    messages,
    // model is optional -- P402 picks the best one
    p402: { mode: 'cost', session_id: 'sess_abc', cache: true }
  })
});
```

The response includes `p402_metadata` with routing and cost telemetry. Key fields:

| Field | Type | Description |
|-------|------|-------------|
| `provider` | string | Provider that served the request |
| `model` | string | Exact model used |
| `cost_usd` | number | Actual cost charged |
| `latency_ms` | number | Total round-trip latency |
| `cached` | boolean | Whether result came from semantic cache |
| `human_verified` | boolean | True if request was World ID-verified |
| `human_usage_remaining` | number \| null | Free-trial uses remaining for this human |
| `reputation_score` | number \| null | Human-anchored reputation [0.0–1.0] |
| `credits_spent` | number \| null | Credits deducted this request (1 credit = $0.01) |
| `credits_balance` | number \| null | Remaining credit balance after this request |

Response headers also expose: `X-P402-Provider`, `X-P402-Cost-USD`, `X-P402-Latency-MS`, `X-P402-Request-ID`.
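
When only the headers are needed (for example, when logging cost for a streamed response), they can be read into a typed object. A minimal sketch: `readP402Headers` and `P402HeaderInfo` are hypothetical names, and the lookup function is injected so the helper works with any HTTP client.

```typescript
interface P402HeaderInfo {
  requestId: string | null;
  provider: string | null;
  costUsd: number | null;
  latencyMs: number | null;
}

// `getHeader` is any header lookup, e.g. (n) => response.headers.get(n) with fetch.
function readP402Headers(getHeader: (name: string) => string | null): P402HeaderInfo {
  // Convert a header value to a number, treating missing or malformed values as null.
  const toNumber = (value: string | null): number | null => {
    if (value === null || value.trim() === "") return null;
    const parsed = Number(value);
    return isFinite(parsed) ? parsed : null;
  };
  return {
    requestId: getHeader("X-P402-Request-ID"),
    provider: getHeader("X-P402-Provider"),
    costUsd: toNumber(getHeader("X-P402-Cost-USD")),
    latencyMs: toNumber(getHeader("X-P402-Latency-MS")),
  };
}
```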

### Semantic Cache

P402 can cache responses for semantically similar prompts. When enabled (`p402.cache: true`), the system:

1. Generates an embedding of the request via OpenRouter
2. Searches existing cache entries for high semantic similarity
3. Returns the cached response instantly if a strong match is found
4. Otherwise, forwards to the provider and caches the result

Caching is scoped by tenant -- no cross-tenant data leakage is possible. The default TTL is 1 hour and the max age is 24 hours. The TTL can be set per request via `p402.cache_ttl` (in seconds). Tune cache settings in your [P402 dashboard](https://p402.io/dashboard).

Cache is most effective for: classification tasks, FAQ-style queries, repeated extractions, and any workload with natural prompt repetition. It is least effective for: creative generation, conversation continuations, and tasks requiring fresh data.
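
The lookup step can be sketched with cosine similarity over embeddings. This is an illustration of the general technique, not P402's implementation: the 0.95 threshold, the `CachedAnswer` shape, and both function names are assumptions, and the embeddings are assumed to have equal length.

```typescript
interface CachedAnswer {
  embedding: number[];
  response: string;
  expiresAtMs: number; // absolute expiry time (TTL already applied)
}

// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the best unexpired match at or above the threshold, or null on a miss.
function lookupSemanticCache(
  entries: CachedAnswer[],
  queryEmbedding: number[],
  nowMs: number,
  threshold: number = 0.95 // assumed value; the real threshold is configurable
): string | null {
  let bestResponse: string | null = null;
  let bestSimilarity = threshold;
  for (const entry of entries) {
    if (entry.expiresAtMs <= nowMs) continue; // expired entries never match
    const similarity = cosineSimilarity(entry.embedding, queryEmbedding);
    if (similarity >= bestSimilarity) {
      bestSimilarity = similarity;
      bestResponse = entry.response;
    }
  }
  return bestResponse;
}
```

On a miss (`null`), the caller would forward the request to the provider and store the new response with its embedding.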

## Getting Started

The fastest way to try P402 is to sign in at [p402.io](https://p402.io) with your **email address** — no MetaMask or seed phrases required. A Base embedded wallet is created automatically via Coinbase CDP Embedded Wallet (email OTP, keys secured in AWS Nitro Enclave). The guided onboarding walks you through: role selection → API key generation → dashboard orientation. Wallet funding is pay-as-you-go — you'll be prompted to send USDC on Base when you make your first routed request.

For programmatic access, grab your API key from the dashboard after signing in. You can also try P402 via the **Base Mini App** at [mini.p402.io](https://mini.p402.io) — connect a Base Account, fund with USDC via Base Pay, and start chatting with real-time cost and savings tracking.

**MCP Server:** If you use Claude Desktop, Cursor, or Windsurf, add `@p402/mcp-server` to your MCP config — no REST integration needed:

```json
{
  "mcpServers": {
    "p402": {
      "command": "npx",
      "args": ["-y", "@p402/mcp-server"],
      "env": { "P402_API_KEY": "p402_live_..." }
    }
  }
}
```

The MCP server is listed on the official registry as `io.github.Z333Q/p402`.

## API Quick Reference

**Base URL:** `https://p402.io`

| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/api/v2/chat/completions` | POST | Main routing endpoint (OpenAI-compatible) |
| `/api/v2/sessions` | POST | Create a new session |
| `/api/v2/sessions` | GET | List active sessions |
| `/api/v2/sessions/:id` | GET | Get session details and budget status |
| `/api/v2/sessions/:id` | DELETE | End a session |
| `/api/v2/providers` | GET | List all 300+ available providers/models |
| `/api/v2/providers/compare` | POST | Compare model pricing for a token count |
| `/api/v2/analytics/spend` | GET | Spending analytics |
| `/api/v2/analytics/recommendations` | GET | Cost optimization suggestions |
| `/api/v2/cache/stats` | GET | Cache hit rate and savings |
| `/api/v2/credits/balance` | GET | Current credit balance (1 credit = $0.01) |
| `/api/v2/credits/purchase` | POST | Buy credits (test or paid mode) |
| `/api/v2/credits/history` | GET | Credit transaction history |
| `/api/a2a` | POST | A2A JSON-RPC endpoint |
| `/.well-known/agent.json` | GET | Agent discovery card |
| `/api/a2a/mandates` | POST | Create AP2 spending mandate |
| `/api/a2a/mandates/:id/use` | POST | Use a mandate for payment |
| `/models` | GET (page) | Live model comparison with cost calculator |

**MCP Tools** (via `@p402/mcp-server`, stdio transport):

| Tool | Purpose |
|------|---------|
| `p402_chat` | Route a prompt (cost/quality/speed/balanced modes) |
| `p402_create_session` | Create a USD-capped budget session |
| `p402_get_session` | Check session balance and spend |
| `p402_list_models` | List all 300+ routable models |
| `p402_compare_providers` | Cost and latency comparison per model |
| `p402_agent_status` | Check World AgentKit registration and credit balance for a wallet |
| `p402_health` | Router and facilitator health status |

For complete request/response schemas and code examples in TypeScript, Python, and curl, read `references/api-reference.md`.

## Integration Patterns

### Pattern 1: Drop-In OpenAI Replacement (Simplest)

Change the base URL and API key. Optionally remove the `model` field to let P402 choose. Add `p402.mode` for routing control. This is a 2-line migration for any OpenAI SDK user.

### Pattern 2: Session-Budgeted Agent

Create a session with a dollar budget, pass `session_id` on every request, and P402 enforces the cap. When the budget runs out, the session status changes to `exhausted` and further requests are rejected. This is the pattern for autonomous agents that need spending guardrails.

### Pattern 3: A2A Agent-to-Agent Communication

Use the JSON-RPC endpoint at `/api/a2a` with Google's A2A protocol. Agents discover each other via `/.well-known/agent.json`, exchange tasks, and settle payments via the x402 extension. AP2 mandates pre-authorize spending. Read `references/a2a-protocol.md` for the full flow.

### Pattern 4: x402 Payment Settlement

For machine-to-machine payments using the HTTP 402 flow: service responds with payment requirements, client submits payment proof (EIP-3009 signature or transaction hash), service verifies and releases the resource. Three schemes are supported: `exact` (gasless EIP-3009), `onchain` (direct tx verification), and `receipt` (reuse prior payments). Read `references/payment-flows.md` for implementation details.

### Pattern 5: Credits-Based Free Trial (World ID)

Human-verified users receive **500 free credits ($5.00)** when they first verify with World ID. Credits are consumed automatically on every chat completion request for verified humans (`credits_spent` in `p402_metadata`). When credits run out, the session falls back to standard USDC billing. Check balance via `GET /api/v2/credits/balance`. Purchase additional credits via `POST /api/v2/credits/purchase` with `{ amount_usd, mode: 'test' | 'paid' }`.

```typescript
// Check credit balance
const res = await fetch('https://p402.io/api/v2/credits/balance', {
  headers: { Authorization: `Bearer ${P402_KEY}` }
});
const { balance } = await res.json();  // e.g. 487 (= $4.87)

// After a chat completion: `response` is the parsed JSON body; read the credit fields from p402_metadata
console.log(response.p402_metadata.credits_spent);   // 3 (= $0.03)
console.log(response.p402_metadata.credits_balance);  // 484
```

### Pattern 6: MCP Server (Claude Desktop / Cursor / Windsurf)

Add `@p402/mcp-server` to your MCP client config. No REST integration is needed: the server runs as a stdio subprocess and natively exposes all 7 tools listed in the MCP Tools table. The client can call `p402_chat` to route and pay for LLM requests, `p402_create_session` to establish a budget, and `p402_list_models` to discover providers. Registry identifier: `io.github.Z333Q/p402`.

### Pattern 7: Cost Intelligence Dashboard

Use `/api/v2/analytics/spend` for spending data and `/api/v2/analytics/recommendations` for optimization suggestions. The recommendations endpoint identifies cheaper model alternatives and estimates potential savings. Use `/api/v2/providers/compare` to show users real-time pricing comparisons.

## Reference Files

Read these when you need deeper detail than this overview provides:

- **`references/api-reference.md`** -- Complete endpoint documentation with TypeScript interfaces, request/response examples, error codes, and authentication details. Read this when generating integration code.

- **`references/routing-guide.md`** -- Deep dive on the routing engine: scoring algorithm, provider capabilities, model tiers, failover logic, and advanced configuration like custom weights, provider exclusion, and capability filtering. Read this when the user needs fine-grained routing control.

- **`references/payment-flows.md`** -- x402 protocol implementation: the 3-step payment flow, EIP-3009 gasless transfers, onchain verification, receipt reuse, USDC contract addresses, and the settlement lifecycle. Read this when the user is integrating payment settlement.

- **`references/a2a-protocol.md`** -- Google A2A protocol integration: JSON-RPC methods, task lifecycle, agent discovery, AP2 mandates, the x402 payment extension, and the Bazaar service marketplace. Read this when the user is building agent-to-agent systems.

## World AgentKit (Proof-of-Human Free Trial)

World AgentKit (`@worldcoin/agentkit`) is integrated into P402 as an identity layer. Agents registered in the AgentBook contract on Base mainnet with a World ID ZK proof-of-human receive free-trial access to `/api/v2/chat/completions` (default: 5 requests per unique human per endpoint), bypassing the standard billing guard.

### Prerequisites for World AgentKit

1. **Server**: `AGENTKIT_ENABLED=true` must be set in the P402 router environment. DB migration `v2_017_agentkit.sql` must be applied.
2. **Agent wallet**: Must be registered in AgentBook on Base mainnet (`0xE1D1D3526A6FAa37eb36bD10B933C1b77f4561a4`). Registration requires World ID verification via the World App.
3. **SIWE challenge signing**: On each request, the agent must sign a CAIP-122 / EIP-4361 SIWE challenge from the server. The signed payload goes in the `agentkit` request header (base64-encoded JSON).

### SDK Integration (transparent retry)

```typescript
const client = new P402Client({
  apiKey: 'p402_live_...',
  worldId: {
    signer: {
      address: '0xYourAgentWallet',
      chainId: 'eip155:8453',
      signMessage: (msg) => walletClient.signMessage({ message: msg }),
    },
  },
});

// SDK automatically signs the SIWE challenge and retries when the server
// signals free-trial availability via agentkit_challenge in a 429 response.
const response = await client.chat({ messages: [...] });
console.log(response.p402_metadata.human_verified);      // true
console.log(response.p402_metadata.human_usage_remaining); // 4 (uses left)
```

### CLI Commands

```bash
# Check if a wallet is registered in AgentBook
p402 agent register 0xYourAgentWallet

# Show AgentKit status for the configured agent
P402_AGENT_ADDRESS=0xYourAgentWallet p402 agent status
```

### MCP Tool

The `p402_agent_status` MCP tool checks AgentBook registration for any wallet address. Set `P402_WORLD_ID_ENABLED=true` in the MCP server environment to enable World AgentKit awareness.

### Cost Optimization with World AgentKit

- Check `human_verified` in `p402_metadata` before deciding whether to use a session budget
- If `human_verified: true`, the request was free — no budget was consumed
- Use `human_usage_remaining` to monitor free-trial exhaustion and switch to paid sessions proactively
- Never assume the free trial is unlimited — it defaults to 5 uses per human per endpoint
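
These checks can be folded into one decision helper that runs after each response. A minimal sketch, assuming the caller passes the `p402_metadata` object from the last response; `shouldSwitchToPaidSession` and its `minRemaining` parameter are hypothetical names for this example.

```typescript
// Subset of p402_metadata relevant to the free trial.
interface TrialMetadata {
  human_verified?: boolean;
  human_usage_remaining?: number | null;
}

// Returns true when the next request should go through a paid session.
function shouldSwitchToPaidSession(meta: TrialMetadata, minRemaining: number = 1): boolean {
  if (meta.human_verified !== true) return true; // not on the free trial
  const remaining = meta.human_usage_remaining;
  // An unknown remaining count is treated conservatively as "no trial left".
  if (remaining === null || remaining === undefined) return true;
  return remaining < minRemaining;
}
```

Raising `minRemaining` switches to a paid session earlier, which avoids a failed request at the exact moment the trial runs out.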

### Human-Anchored Reputation (Phase 2, live)

`reputation_score` in `p402_metadata` is a float [0.0–1.0], Redis-cached per World ID nullifier hash. Starts at 0.5 (neutral) for new verified humans.

Score components (weights): `settlement_score` (0.4), `session_score` (0.3), `dispute_score` (0.2), `sentinel_score` (0.1).
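
One natural reading of these weights is a weighted sum of the four components. The sketch below assumes exactly that, with each component already normalized to [0.0, 1.0]; how P402 actually combines them is not specified here, and `combineReputation` is a hypothetical helper.

```typescript
interface ReputationComponents {
  settlement_score: number;
  session_score: number;
  dispute_score: number;
  sentinel_score: number;
}

// Weights as listed above; they sum to 1.0.
const REPUTATION_WEIGHTS: ReputationComponents = {
  settlement_score: 0.4,
  session_score: 0.3,
  dispute_score: 0.2,
  sentinel_score: 0.1,
};

// Assumed combination rule: weighted sum, clamped to [0, 1].
function combineReputation(c: ReputationComponents): number {
  const raw =
    REPUTATION_WEIGHTS.settlement_score * c.settlement_score +
    REPUTATION_WEIGHTS.session_score * c.session_score +
    REPUTATION_WEIGHTS.dispute_score * c.dispute_score +
    REPUTATION_WEIGHTS.sentinel_score * c.sentinel_score;
  return Math.min(1, Math.max(0, raw));
}
```

Under this reading, a new verified human with every component at 0.5 lands on the neutral 0.5 starting score.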

Events that affect score:
- **+settlement**: `recordSettlement()` — sigmoid increase toward 0.95 over 50+ payments
- **+session**: `recordSession(true/false)` — regularized completion rate
- **–dispute**: `recordDispute()` — mandate violations; each costs ~10%, floor 0.2
- **–sentinel**: `recordAnomalyFlag()` — anomaly flags; each costs ~5%, floor 0.3

Query a wallet's reputation publicly: `GET /api/v2/agents/{address}/reputation`

AP2 mandates: if the grantor sends the `agentkit` header, the mandate row stores `human_id_hash`. Violations then automatically call `recordDispute()` for that human.

## Tech Stack Context

P402 is a Next.js App Router application using TypeScript. The design system follows neo-brutalist principles: primary color #B6FF2E (lime), 2px borders, no rounded corners, IBM Plex Sans + monospace fonts. When generating UI code for P402-related interfaces, follow this aesthetic.


---

# P402 API Reference

Complete endpoint documentation for P402.io V2 API. All endpoints use JSON request/response bodies and require authentication via Bearer token.

## Table of Contents
1. [Authentication](#authentication)
2. [Chat Completions](#chat-completions)
3. [Sessions](#sessions)
4. [Providers](#providers)
5. [Analytics](#analytics)
6. [Cache](#cache)
7. [TypeScript Interfaces](#typescript-interfaces)
8. [Error Handling](#error-handling)
9. [Response Headers](#response-headers)

---

## Authentication

All requests require a Bearer token in the Authorization header:

```
Authorization: Bearer <P402_API_KEY>
```

Multi-tenant requests can specify a tenant via header:

```
X-P402-Tenant: <tenant_id>
```

If no tenant header is provided, the `default` tenant is used.
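
A small helper keeps the two headers consistent across calls. This is a sketch: `buildP402Headers` is a hypothetical name, and it also sets `Content-Type` because the endpoints take JSON bodies.

```typescript
// Build request headers for P402; omit tenantId to use the `default` tenant.
function buildP402Headers(apiKey: string, tenantId?: string): { [name: string]: string } {
  if (apiKey.trim() === "") {
    throw new Error("P402 API key is required");
  }
  const headers: { [name: string]: string } = {
    "Content-Type": "application/json",
    "Authorization": "Bearer " + apiKey,
  };
  if (tenantId) {
    headers["X-P402-Tenant"] = tenantId;
  }
  return headers;
}
```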

---

## Chat Completions

### `POST /api/v2/chat/completions`

OpenAI-compatible chat completion with P402 orchestration. Supports streaming, multi-provider routing, semantic caching, and real-time cost tracking.

**Request Body:**

```typescript
{
  // OpenAI-compatible fields
  model?: string,              // Optional: specific model. Omit to let P402 route.
  messages: ChatMessage[],     // Required: conversation messages
  temperature?: number,        // Default: 0.7
  max_tokens?: number,
  top_p?: number,
  frequency_penalty?: number,
  presence_penalty?: number,
  stop?: string | string[],
  stream?: boolean,            // Enable SSE streaming
  tools?: any[],               // Function calling tools
  tool_choice?: any,
  response_format?: { type: 'text' | 'json_object' },
  user?: string,

  // P402 Extensions
  p402?: {
    mode?: 'cost' | 'quality' | 'speed' | 'balanced',  // Routing mode
    prefer_providers?: string[],    // Preferred providers (ordered)
    exclude_providers?: string[],   // Providers to exclude
    require_capabilities?: string[], // e.g., ['vision', 'code']
    max_cost?: number,              // Max cost per request in USD
    session_id?: string,            // Session for budget tracking
    cache?: boolean,                // Enable semantic caching
    cache_ttl?: number,             // Cache TTL in seconds
    failover?: boolean,             // Enable automatic failover
    tenant_id?: string              // Multi-tenant isolation
  }
}
```

**Response (non-streaming):**

```typescript
{
  id: string,
  object: 'chat.completion',
  created: number,
  model: string,                     // Actual model used (may differ from requested)
  choices: [{
    index: number,
    message: {
      role: 'assistant',
      content: string,
      tool_calls?: any[]
    },
    finish_reason: string
  }],
  usage: {
    prompt_tokens: number,
    completion_tokens: number,
    total_tokens: number
  },
  p402_metadata: {
    request_id: string,
    tenant_id: string,
    provider: string,                    // e.g., 'openai', 'anthropic', 'groq'
    model: string,                       // e.g., 'gpt-4o-mini', 'claude-sonnet-4-6'
    cost_usd: number,                   // Actual cost of this request
    latency_ms: number,                 // Total latency including routing overhead
    provider_latency_ms: number,        // Provider-only latency
    cached: boolean,                    // Whether response came from semantic cache
    routing_mode: string,               // Mode that was applied
    // World AgentKit / Credits fields (present when applicable)
    human_verified?: boolean,           // True if request came from a World ID-verified human
    human_usage_remaining?: number | null, // Remaining free-trial uses for this human
    reputation_score?: number | null,   // Human-anchored reputation [0.0–1.0]
    credits_spent?: number | null,      // Credits deducted this request (1 credit = $0.01)
    credits_balance?: number | null     // Remaining credit balance after this request
  }
}
```

**Response Headers:**

```
X-P402-Request-ID: req_abc123
X-P402-Provider: anthropic
X-P402-Cost-USD: 0.0034
X-P402-Latency-MS: 847
```

**Streaming:** When `stream: true`, the response is Server-Sent Events (SSE) in OpenAI-compatible format. Each chunk is `data: {json}\n\n` with the final chunk being `data: [DONE]\n\n`.
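
For clients that read the stream manually, the framing described above can be parsed incrementally. A minimal sketch, assuming frames are separated by `\n\n` exactly as stated; `parseSseBuffer` and `SseParseResult` are hypothetical names, and the caller is expected to prepend `rest` to the next network chunk.

```typescript
interface SseParseResult {
  events: unknown[]; // parsed JSON payloads, in order
  done: boolean;     // true once `data: [DONE]` was seen
  rest: string;      // trailing partial frame to carry into the next call
}

function parseSseBuffer(buffer: string): SseParseResult {
  const events: unknown[] = [];
  let done = false;
  const frames = buffer.split("\n\n");
  // The last element is either "" or an incomplete frame; keep it for later.
  const rest = frames.pop() ?? "";
  for (const frame of frames) {
    for (const line of frame.split("\n")) {
      if (line.indexOf("data:") !== 0) continue; // ignore comments and other fields
      const payload = line.slice(5).trim();
      if (payload === "[DONE]") {
        done = true;
      } else if (payload !== "") {
        events.push(JSON.parse(payload));
      }
    }
  }
  return { events, done, rest };
}
```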

**Example (TypeScript):**

```typescript
const response = await fetch('https://p402.io/api/v2/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${process.env.P402_API_KEY}`
  },
  body: JSON.stringify({
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'Explain quantum computing briefly.' }
    ],
    p402: {
      mode: 'cost',
      session_id: 'sess_abc123',
      cache: true
    }
  })
});

const data = await response.json();
console.log(`Provider: ${data.p402_metadata.provider}`);
console.log(`Cost: $${data.p402_metadata.cost_usd}`);
console.log(`Cached: ${data.p402_metadata.cached}`);
```

**Example (Python):**

```python
import os

import requests

API_KEY = os.environ["P402_API_KEY"]  # same variable as in the curl example

response = requests.post(
    "https://p402.io/api/v2/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "messages": [{"role": "user", "content": "Explain quantum computing briefly."}],
        "p402": {"mode": "cost", "session_id": "sess_abc123", "cache": True}
    }
)

data = response.json()
print(f"Provider: {data['p402_metadata']['provider']}")
print(f"Cost: ${data['p402_metadata']['cost_usd']}")
```

**Example (curl):**

```bash
curl https://p402.io/api/v2/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $P402_API_KEY" \
  -d '{
    "messages": [{"role": "user", "content": "Hello"}],
    "p402": {"mode": "balanced", "cache": true}
  }'
```

---

## Sessions

### `POST /api/v2/sessions`

Create a new session with budget and optional policies.

**Request Body:**

```typescript
{
  agent_identifier?: string,     // Label for the agent using this session
  budget_usd?: number,           // Total budget in USD (default varies)
  expires_in_hours?: number,     // Session duration (default: 24)
  policy?: {
    type: 'spending' | 'provider' | 'model' | 'task',
    // ... policy-specific config (see V2 spec)
  }
}
```

**Response:**

```typescript
{
  object: 'session',
  id: string,                    // e.g., 'sess_abc123'
  tenant_id: string,
  agent_identifier: string | null,
  budget: {
    total_usd: number,
    used_usd: number,
    remaining_usd: number,
    utilization_percent: number
  },
  status: 'active' | 'exhausted' | 'expired' | 'ended' | 'revoked',
  created_at: string,            // ISO 8601
  expires_at: string
}
```

**Example:**

```typescript
const res = await fetch('https://p402.io/api/v2/sessions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${P402_API_KEY}`,
    'X-P402-Tenant': 'my_tenant'
  },
  body: JSON.stringify({
    agent_identifier: 'research-agent-v1',
    budget_usd: 5.00,
    expires_in_hours: 4
  })
});
const session = await res.json(); // parsed session object; session.id is e.g. 'sess_abc123'
```

### `GET /api/v2/sessions`

List sessions for the authenticated tenant. Accepts `?status=active` query parameter (default: `active`). Returns up to 100 sessions ordered by creation date descending.

### `GET /api/v2/sessions/:id`

Get detailed session info including budget utilization, time remaining, and active/expired status. The endpoint auto-expires sessions that have passed their `expires_at` timestamp.

**Response includes:**

```typescript
{
  // ... standard session fields ...
  meta: {
    is_active: boolean,
    is_expired: boolean,
    time_remaining_seconds: number | null
  }
}
```

### `DELETE /api/v2/sessions/:id`

End a session immediately. Sets status to `ended` and records `ended_at` timestamp. Remaining budget is released.

---

## Providers

### `GET /api/v2/providers`

List all available providers and their models with pricing, capabilities, and health status.

### `POST /api/v2/providers/compare`

Compare model pricing for a specific workload.

**Request Body:**

```typescript
{
  input_tokens: number,          // Estimated input tokens
  output_tokens: number,         // Estimated output tokens
  capabilities?: string[],      // Filter by capability: 'chat', 'code', 'vision', 'reasoning'
  tier?: 'budget' | 'mid' | 'premium',
  exclude_providers?: string[]
}
```

**Response:**

```typescript
{
  object: 'provider_comparison',
  models: [{
    rank: number,
    model: string,               // e.g., 'deepseek/deepseek-r1'
    provider: string,
    tier: string,
    cost: number,                // Total cost for the given token counts
    cost_breakdown: { input: number, output: number },
    quality_score: number,
    context_window: number,
    input_cost_per_1k: number,
    output_cost_per_1k: number
  }],
  picks: {
    cheapest: { model: string, cost: number },
    best_value: { model: string, cost: number, quality_score: number },
    fastest: { model: string, tier: string }
  }
}
```
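
The `cost` and `cost_breakdown` fields can be reproduced client-side from the per-1k prices, which is useful for sanity-checking a comparison. This sketch assumes cost is simply tokens divided by 1000 times the per-1k price; `estimateModelCost` and `cheapestFirst` are hypothetical helpers, and any prices passed in are the caller's own data.

```typescript
interface ModelPricing {
  model: string;
  input_cost_per_1k: number;
  output_cost_per_1k: number;
}

interface CostEstimateRow {
  model: string;
  cost: number;
  cost_breakdown: { input: number; output: number };
}

// Assumed formula: (tokens / 1000) * price per 1k tokens, summed over input and output.
function estimateModelCost(p: ModelPricing, inputTokens: number, outputTokens: number): CostEstimateRow {
  const input = (inputTokens / 1000) * p.input_cost_per_1k;
  const output = (outputTokens / 1000) * p.output_cost_per_1k;
  return { model: p.model, cost: input + output, cost_breakdown: { input, output } };
}

// Rank models from cheapest to most expensive for a given workload.
function cheapestFirst(prices: ModelPricing[], inputTokens: number, outputTokens: number): CostEstimateRow[] {
  return prices
    .map((p) => estimateModelCost(p, inputTokens, outputTokens))
    .sort((a, b) => a.cost - b.cost);
}
```

The ranking depends on the input/output mix: a model with cheap input and expensive output can win on long prompts and lose on long completions.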

---

## Analytics

### `GET /api/v2/analytics/spend`

Returns spending analytics for the authenticated tenant over a configurable time period.

### `GET /api/v2/analytics/recommendations`

Returns personalized cost optimization recommendations based on actual usage patterns. Recommendations include: switching to cheaper models, enabling caching, using cost-optimized routing, and identifying underutilized capabilities.

**Response:**

```typescript
{
  recommendations: [{
    id: string,
    type: 'model_switch' | 'rate_optimization' | 'caching' | 'provider_change',
    priority: 'high' | 'medium' | 'low',
    title: string,
    description: string,
    potential_savings_percent: number,
    implementation: string        // Code-level guidance
  }]
}
```

---

## Cache

### `GET /api/v2/cache/stats`

Cache hit rate, total savings, and entry count.

### `POST /api/v2/cache/invalidate`

Invalidate cache entries by namespace or pattern.

### `PUT /api/v2/cache/config`

Update cache settings (similarity threshold, TTL, max age).

### `GET /api/v2/cache/entries`

Browse cached entries for debugging.

---

## TypeScript Interfaces

These are the core types from P402's codebase that developers should use:

```typescript
// === Session Types ===

interface P402Session {
  session_id: string;
  tenant_id: string;
  agent_identifier?: string;
  balance_usdc: number;
  budget_total: number;
  budget_spent: number;
  budget: {
    total_usd: number;
    used_usd: number;
    remaining_usd: number;
  };
  policy?: Record<string, unknown>;
  status: 'active' | 'exhausted' | 'expired' | 'ended' | 'revoked';
  created_at: string;
  expires_at: string;
  ended_at?: string;
}

interface CreateSessionRequest {
  agent_identifier?: string;
  budget_usd?: number;
  expires_in_hours?: number;
  policy?: Record<string, unknown>;
}

interface FundSessionRequest {
  session_id: string;
  amount: string | number;
  tx_hash?: string;
  source?: 'base_pay' | 'direct' | 'test';
  network?: 'base' | 'base_sepolia';
}

// === Chat Types ===

interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string | Array<{ type: string; text?: string; image_url?: any }>;
  name?: string;
  tool_calls?: any[];
  tool_call_id?: string;
}

interface P402Options {
  mode?: 'cost' | 'quality' | 'speed' | 'balanced';
  prefer_providers?: string[];
  exclude_providers?: string[];
  require_capabilities?: string[];
  max_cost?: number;
  session_id?: string;
  cache?: boolean;
  cache_ttl?: number;
  failover?: boolean;
  tenant_id?: string;
}

interface P402Metadata {
  request_id: string;
  tenant_id: string;
  provider: string;
  model: string;
  cost_usd: number;
  latency_ms: number;
  provider_latency_ms: number;
  cached: boolean;
  routing_mode: string;
}

// === Routing Types ===

type RoutingMode = 'cost' | 'quality' | 'speed' | 'balanced';

interface RoutingWeights {
  cost: number;
  quality: number;
  speed: number;
}

interface RoutingOptions {
  mode: RoutingMode | RoutingWeights;
  task?: string;
  requiredCapabilities?: string[];
  minContextWindow?: number;
  maxCostPerRequest?: number;
  preferProviders?: string[];
  excludeProviders?: string[];
  preferTier?: 'budget' | 'mid' | 'premium';
  maxTier?: 'budget' | 'mid' | 'premium';
  failover?: {
    enabled: boolean;
    maxRetries: number;
    fallbackProviders?: string[];
  };
  rateLimitStrategy?: 'switch' | 'queue' | 'fail';
}

// === Billing Guard Types ===

interface BillingContext {
  userId: string;
  sessionId?: string;
  tenantId?: string;
}

interface CostEstimate {
  estimatedCost: number;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

// BillingGuardError codes:
// 'RATE_LIMIT_EXCEEDED' -- hourly request cap exceeded, retryAfterMs provided
// 'DAILY_LIMIT_EXCEEDED' -- daily spend cap reached
// 'TOO_MANY_CONCURRENT' -- concurrent request limit hit
// 'REQUEST_TOO_EXPENSIVE' -- single request exceeds per-request cost cap
// Default limits are configurable per tenant at https://p402.io/dashboard
```

---

## Error Handling

P402 returns standard HTTP status codes with structured error bodies:

```typescript
{
  error: {
    type: string,       // e.g., 'not_found', 'rate_limit', 'billing_error'
    message: string,    // Human-readable description
    code?: string,      // Machine-readable code (e.g., 'RATE_LIMIT_EXCEEDED')
    retryAfterMs?: number  // When to retry (for rate limits)
  }
}
```

| Status | Meaning |
|--------|---------|
| 200 | Success |
| 400 | Bad request (malformed body, missing required fields) |
| 401 | Invalid or missing API key |
| 402 | Payment required (x402 flow) |
| 403 | Forbidden (mandate budget exceeded, security policy blocked) |
| 404 | Resource not found (session, mandate) |
| 429 | Rate limited (check `retryAfterMs`) |
| 500 | Internal server error |
| 502 | Upstream provider error (failover may retry) |
| 503 | Service temporarily unavailable |
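
A client can turn the status codes and error body above into a retry decision. The following is a minimal sketch, not part of any P402 SDK: `decideRetry`, `MAX_ATTEMPTS`, and `BASE_DELAY_MS` are hypothetical names and values, and treating 502/503 as retryable is an assumption based on the table.

```typescript
// Error body shape as documented above.
interface P402ErrorBody {
  error: { type: string; message: string; code?: string; retryAfterMs?: number };
}

type RetryDecision =
  | { retry: true; delayMs: number }
  | { retry: false; reason: string };

// Hypothetical client-side policy values, not P402 defaults.
const MAX_ATTEMPTS = 3;
const BASE_DELAY_MS = 500;

// Codes that retrying the same request cannot fix.
const NON_RETRYABLE_CODES = ['DAILY_LIMIT_EXCEEDED', 'REQUEST_TOO_EXPENSIVE'];

// `attempt` is zero-based: 0 after the first failure, 1 after the second, and so on.
function decideRetry(
  status: number,
  body: P402ErrorBody | null,
  attempt: number
): RetryDecision {
  if (attempt + 1 >= MAX_ATTEMPTS) {
    return { retry: false, reason: 'max attempts reached' };
  }
  const code = body?.error.code;
  if (code !== undefined && NON_RETRYABLE_CODES.includes(code)) {
    return { retry: false, reason: `${code} is not retryable` };
  }
  const backoffMs = BASE_DELAY_MS * 2 ** attempt; // exponential backoff
  if (status === 429) {
    // Prefer the server's hint when it is present.
    return { retry: true, delayMs: body?.error.retryAfterMs ?? backoffMs };
  }
  if (status === 502 || status === 503) {
    return { retry: true, delayMs: backoffMs };
  }
  return { retry: false, reason: `status ${status} is not retryable` };
}
```

Keeping the decision in a pure function makes it easy to unit test separately from the `fetch` call that consumes it.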

---

## Response Headers

Every chat completion response includes P402-specific headers:

| Header | Value | Example |
|--------|-------|---------|
| `X-P402-Request-ID` | Unique request identifier | `req_abc123` |
| `X-P402-Provider` | Provider that served the request | `anthropic` |
| `X-P402-Cost-USD` | Cost of this request in USD | `0.0034` |
| `X-P402-Latency-MS` | Total latency in milliseconds | `847` |
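
These headers can be read without parsing the response body. The sketch below assumes only the four header names in the table; `HeaderReader`, `P402ResponseInfo`, and `readP402Headers` are illustrative names, not SDK exports.

```typescript
// Minimal shape of a headers object; the Fetch API's `Headers` class satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

interface P402ResponseInfo {
  requestId: string | null;
  provider: string | null;
  costUsd: number | null;
  latencyMs: number | null;
}

// Parse a numeric header, returning null when it is absent or not a number.
function parseNumericHeader(value: string | null): number | null {
  if (value === null || value.trim() === '') return null;
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : null;
}

function readP402Headers(headers: HeaderReader): P402ResponseInfo {
  return {
    requestId: headers.get('X-P402-Request-ID'),
    provider: headers.get('X-P402-Provider'),
    costUsd: parseNumericHeader(headers.get('X-P402-Cost-USD')),
    latencyMs: parseNumericHeader(headers.get('X-P402-Latency-MS')),
  };
}
```

With `fetch`, this would be called as `readP402Headers(response.headers)`.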

---

**Ready to integrate?** Get your API key at [p402.io](https://p402.io). Try the [P402 Mini App](https://mini.p402.io) for an instant demo with USDC on Base. Full API explorer and test playground available in the [P402 dashboard](https://p402.io/dashboard).


---

# P402 Routing Guide

Deep dive on the P402 smart routing engine: how it selects providers, scores models, handles failover, and how developers can fine-tune routing behavior.

## Table of Contents
1. [How Routing Works](#how-routing-works)
2. [Scoring Algorithm](#scoring-algorithm)
3. [Provider Landscape](#provider-landscape)
4. [Model Tiers](#model-tiers)
5. [Advanced Configuration](#advanced-configuration)
6. [Failover and Rate Limit Handling](#failover-and-rate-limit-handling)
7. [Common Routing Patterns](#common-routing-patterns)

---

## How Routing Works

When a request hits `/api/v2/chat/completions`, the routing engine executes this sequence:

1. **Parse** the request and extract routing hints from the `p402` options
2. **Filter** the model catalog by: availability (provider has valid API key or is accessible via OpenRouter), required capabilities (e.g., vision, code), minimum context window, tier restrictions, provider allowlist/blocklist
3. **Score** every remaining candidate model on three dimensions (cost, quality, speed) using the weights from the selected routing mode
4. **Select** the highest-scoring model
5. **Check** the Billing Guard (6-layer pre-check)
6. **Execute** the request against the selected provider
7. **Failover** to the next-best candidate if the selected provider returns 5xx or 429

Routing adds less than 50ms of overhead per request, and each decision is logged with its full scoring breakdown for observability.

## Scoring Algorithm

The router uses a proprietary weighted scoring model. Each candidate model is evaluated on three normalized dimensions:

**Cost Score:** Based on the model's per-token pricing relative to all available candidates. Cheaper models score higher.

**Quality Score:** Derived from model tier (see [Model Tiers](#model-tiers)). Premium models (Claude Opus 4.6, GPT-5.2, Gemini 3.1 Pro) score highest. Mid-tier models (Sonnet 4.6, GPT-4o-mini, Gemini Flash, DeepSeek V3.2) score moderately. Budget models (Haiku 4.5, Llama 3.3, DeepSeek R1) score lower but are often the best value.

**Speed Score:** Derived from expected inference speed. Budget and specialized inference models (Groq LPU) score highest. Premium models with larger parameter counts score lower.

**Routing Modes:**

Each mode applies a different weighting strategy:

| Mode | Strategy |
|------|----------|
| `cost` | Heavily favors price, with quality and speed as tiebreakers |
| `quality` | Heavily favors output quality, with cost and speed as tiebreakers |
| `speed` | Heavily favors low latency, with cost and quality as tiebreakers |
| `balanced` | Roughly equal weight across all three dimensions |

You can also pass custom weights to blend your own strategy:

```typescript
p402: {
  mode: { cost: 0.5, quality: 0.3, speed: 0.2 }  // Custom blend
}
```

The exact weighting values and scoring internals are managed by P402 and tuned continuously based on real-world performance data. Use the [provider comparison endpoint](https://p402.io/api/v2/providers/compare) to see how models rank for your specific workload.
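
Because the real scoring internals are proprietary, the following is only an illustration of how a weighted blend over three normalized dimensions behaves. The `Candidate` shape, the score values, and the normalization by the weight sum are all assumptions, not P402's actual algorithm.

```typescript
interface RoutingWeights {
  cost: number;
  quality: number;
  speed: number;
}

// Hypothetical candidate with each dimension already normalized to 0..1
// (higher is better, so a cheaper model has a higher costScore).
interface Candidate {
  model: string;
  costScore: number;
  qualityScore: number;
  speedScore: number;
}

// Weighted average of the three dimensions; weights need not sum to 1.
function weightedScore(c: Candidate, w: RoutingWeights): number {
  const total = w.cost + w.quality + w.speed;
  if (total <= 0) throw new Error('weights must sum to a positive number');
  return (w.cost * c.costScore + w.quality * c.qualityScore + w.speed * c.speedScore) / total;
}

// Return candidates ordered best-first without mutating the input.
function rankCandidates(candidates: Candidate[], w: RoutingWeights): Candidate[] {
  return [...candidates].sort((a, b) => weightedScore(b, w) - weightedScore(a, w));
}
```

With a cost-heavy blend such as `{ cost: 0.5, quality: 0.3, speed: 0.2 }`, a cheap and fast model outranks a premium one unless the quality gap is large; shifting weight to `quality` reverses the order.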

## Provider Landscape

P402 accesses 300+ models primarily through **OpenRouter** as the upstream aggregator. This means developers do not need individual API keys for each provider; a single P402 API key provides access to the full catalog.

The provider registry includes direct adapters for:

| Provider | Adapter | Specialty |
|----------|---------|-----------|
| OpenAI | `openai` | GPT-5.2, o3, general purpose |
| Anthropic | `anthropic` | Claude family, long context, coding |
| Google | `google` | Gemini family, multimodal, massive context |
| Groq | `groq` | Ultra-low latency via LPU hardware |
| DeepSeek | `deepseek` | Cost-effective reasoning (R1/V3) |
| Mistral | `mistral` | Open-weight, efficient |
| Perplexity | `perplexity` | Real-time web search augmented |
| AI21 | `ai21` | Jamba/Jurassic models |
| Together | `together` | Open-source model hosting |
| Fireworks | `fireworks` | Fast inference for open models |
| Cohere | `cohere` | Enterprise NLP, RAG |

When a provider has a direct API key configured, P402 can route to it directly. When no direct key exists, it routes through OpenRouter transparently. This gives developers the best of both worlds: direct performance when available, broad coverage always.

## Model Tiers

P402 organizes models into three tiers that drive quality and speed scoring:

**Premium:**
Claude Opus 4.6, GPT-5.2, Gemini 3.1 Pro, o3. The most capable models for complex reasoning, nuanced writing, and multi-step analysis. Use when output quality is paramount.

**Mid:**
Claude Sonnet 4.6, GPT-4o-mini, Gemini Flash, Mistral Large, DeepSeek V3.2. Strong general-purpose models with good cost/quality tradeoffs. The sweet spot for most production workloads.

**Budget:**
Claude Haiku 4.5, Llama 3.3, Devstral 2, DeepSeek R1. Fast and cheap. Best for high-volume tasks where "good enough" quality is acceptable: classification, extraction, summarization, simple Q&A.

You can restrict routing to specific tiers via session policies. Configure this in your [P402 dashboard](https://p402.io/dashboard).

## Advanced Configuration

### Capability Filtering

Request models with specific capabilities:

```typescript
p402: {
  require_capabilities: ['vision', 'code']
  // Only models supporting both vision and code generation
}
```

Available capabilities: `chat`, `code`, `vision`, `reasoning`, `function_calling`.

### Provider Preferences

Steer routing toward or away from specific providers:

```typescript
p402: {
  prefer_providers: ['anthropic', 'openai'],  // Try these first
  exclude_providers: ['deepseek']              // Never route here
}
```

Provider preferences are soft hints: if preferred providers have no suitable models, the router falls back to the full catalog. Exclusions are hard constraints.
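
The difference between soft preferences and hard exclusions can be sketched as a filter over a model catalog. `ModelEntry` and `filterCandidates` are hypothetical names; the real router's catalog shape is not documented here.

```typescript
// Hypothetical catalog entry; not the real P402 catalog schema.
interface ModelEntry {
  model: string;
  provider: string;
  capabilities: string[];
}

interface FilterOptions {
  require_capabilities?: string[];
  prefer_providers?: string[];
  exclude_providers?: string[];
}

function filterCandidates(catalog: ModelEntry[], opts: FilterOptions): ModelEntry[] {
  const required = opts.require_capabilities ?? [];
  const excluded = opts.exclude_providers ?? [];
  const preferred = opts.prefer_providers ?? [];

  // Hard constraints: drop excluded providers and models missing any capability.
  const eligible = catalog.filter(
    (m) =>
      !excluded.includes(m.provider) &&
      required.every((cap) => m.capabilities.includes(cap))
  );

  // Soft hint: narrow to preferred providers only if that leaves something.
  const fromPreferred = eligible.filter((m) => preferred.includes(m.provider));
  return fromPreferred.length > 0 ? fromPreferred : eligible;
}
```

Note that the fallback widens only to the eligible set, so an excluded provider is never selected even when every preferred provider is unsuitable.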

### Cost Caps

Prevent expensive requests:

```typescript
p402: {
  max_cost: 0.10  // Reject if estimated cost exceeds $0.10
}
```

This is enforced before the request is sent to the provider, using token count estimation.
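
A pre-flight estimate of this kind can be approximated client-side. The sketch below is an assumption-heavy illustration: the four-characters-per-token heuristic and the `ModelPricing` shape (USD per million tokens) are not P402's estimator, and real tokenizers vary by model.

```typescript
// Hypothetical pricing shape: USD per one million tokens.
interface ModelPricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Rough heuristic: about 4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Worst-case cost: estimated input tokens plus the full output budget.
function estimateCostUsd(
  prompt: string,
  maxOutputTokens: number,
  pricing: ModelPricing
): number {
  const inputTokens = estimateTokens(prompt);
  return (inputTokens * pricing.inputPerMTok + maxOutputTokens * pricing.outputPerMTok) / 1e6;
}

// True when a cap is set and the estimate is above it.
function exceedsCap(estimatedUsd: number, maxCost: number | undefined): boolean {
  return maxCost !== undefined && estimatedUsd > maxCost;
}
```

For example, a 4,000-character prompt (about 1,000 tokens) with a 500-token output budget at illustrative prices of $3 and $15 per million tokens comes to (1000 × 3 + 500 × 15) / 1,000,000 = $0.0105, which would pass `max_cost: 0.10` but fail `max_cost: 0.01`.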

## Failover and Rate Limit Handling

### Automatic Failover

When `p402.failover` is true (or when using session-based requests), the router automatically retries with the next-best scoring model if the primary provider returns:
- HTTP 5xx (server error)
- HTTP 429 (rate limited)
- Network timeout

By default, the router makes up to 3 retries, lowering the minimum acceptable score exponentially with each attempt.

### Rate Limit Strategy

The `rateLimitStrategy` option controls behavior when a provider is rate-limited:

| Strategy | Behavior |
|----------|----------|
| `switch` | Immediately route to the next-best provider (default) |
| `queue` | Queue the request and retry after the rate limit window |
| `fail` | Return the 429 to the caller immediately |

For production workloads, `switch` is almost always the right choice. Use `fail` only when you need deterministic provider selection and prefer errors over alternative models.
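
The three strategies can be pictured as a small decision function. This is a sketch of the behavior described in the table, not the router's implementation; `RateLimitAction` and `onRateLimited` are hypothetical names.

```typescript
type RateLimitStrategy = 'switch' | 'queue' | 'fail';

type RateLimitAction =
  | { kind: 'route'; provider: string }
  | { kind: 'wait'; provider: string; delayMs: number }
  | { kind: 'error'; status: 429 };

// `ranked` is the provider list ordered best-first by routing score.
function onRateLimited(
  strategy: RateLimitStrategy,
  current: string,
  ranked: string[],
  retryAfterMs: number
): RateLimitAction {
  switch (strategy) {
    case 'switch': {
      // Move to the best-scoring provider other than the limited one.
      const next = ranked.find((p) => p !== current);
      return next !== undefined
        ? { kind: 'route', provider: next }
        : { kind: 'error', status: 429 };
    }
    case 'queue':
      // Stay on the same provider and wait out the rate limit window.
      return { kind: 'wait', provider: current, delayMs: retryAfterMs };
    case 'fail':
      return { kind: 'error', status: 429 };
  }
}
```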

## Common Routing Patterns

### High-Volume Batch Processing

```typescript
// Process 10,000 documents at minimal cost
p402: {
  mode: 'cost',
  cache: true,       // Many documents may be similar
  failover: true     // Don't lose work to transient errors
}
```

### Real-Time Chat Application

```typescript
// User-facing chat where latency matters
p402: {
  mode: 'speed',
  session_id: userSessionId,  // Track per-user spending
  cache: false                // Conversations are unique
}
```

### Code Generation Pipeline

```typescript
// Quality matters, budget matters too
p402: {
  mode: { cost: 0.3, quality: 0.5, speed: 0.2 },
  require_capabilities: ['code'],
  prefer_providers: ['anthropic']  // Claude excels at code
}
```

### Classification / Extraction

```typescript
// High volume, low complexity
p402: {
  mode: 'cost',
  cache: true,        // Many inputs will be similar
  max_cost: 0.01      // These should be cheap
}
```

### Multi-Tier Architecture

Use different modes for different stages of a pipeline:

```typescript
// Stage 1: Classify intent (cheap)
const intent = await p402Chat({
  messages: [{ role: 'user', content: userInput }],
  p402: { mode: 'cost' }
});

// Stage 2: Generate response (quality for complex, cost for simple)
const mode = intent.complexity === 'high' ? 'quality' : 'cost';
const response = await p402Chat({
  messages: conversationHistory,
  p402: { mode, session_id: sessionId }
});
```

This pattern can reduce overall costs by 40-60% compared to routing everything through a premium model, depending on how many requests the classifier sends to the cheaper path.

---

**Try it now:** The fastest way to see routing in action is the [P402 Mini App](https://mini.p402.io). Fund with USDC on Base, pick a routing mode, and watch real-time cost tracking as you chat. For API access, get your key at [p402.io](https://p402.io).


---

# P402 Payment Flows

Implementation guide for x402 protocol payment settlement through P402. Covers the 3-step payment lifecycle, all three payment schemes, USDC contract addresses, session funding, and integration with the Billing Guard.

## Table of Contents
1. [The x402 Protocol](#the-x402-protocol)
2. [Payment Schemes](#payment-schemes)
3. [Session-Based Payments](#session-based-payments)
4. [The X-402-Payment Header](#the-x-402-payment-header)
5. [USDC Contract Addresses](#usdc-contract-addresses)
6. [Settlement Verification](#settlement-verification)
7. [Integration Examples](#integration-examples)

---

## The x402 Protocol

x402 is an HTTP-native payment protocol created by Coinbase and now stewarded by the x402 Foundation, which Coinbase and Cloudflare co-founded. It repurposes the HTTP 402 "Payment Required" status code for machine-to-machine payments. P402 acts as a **facilitator** in the x402 ecosystem, verifying payments and releasing AI services.

The core flow has three steps:

### Step 1: Payment Required (402 Response)

When a service requires payment, it responds with HTTP 402 and payment details:

```http
HTTP/1.1 402 Payment Required
Content-Type: application/json

{
  "payment_id": "pay_abc123",
  "schemes": [
    {
      "scheme": "exact",
      "recipient": "0xb23f146251e3816a011e800bcbae704baa5619ec",
      "amount": "50000",
      "asset": "USDC",
      "network": "base",
      "domain": "p402.io",
      "verifyingContract": "0x..."
    },
    {
      "scheme": "onchain",
      "recipient": "0xb23f146251e3816a011e800bcbae704baa5619ec",
      "amount": "50000",
      "asset": "USDC",
      "network": "base"
    }
  ],
  "service_description": "AI inference: claude-sonnet-4-6, ~500 tokens",
  "expires_at": "2026-02-24T12:00:00Z"
}
```

The `amount` is in the token's smallest unit (USDC has 6 decimals, so 50000 = $0.05).
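
Converting between dollar strings and base units is safest with string arithmetic, since floating-point multiplication can introduce rounding errors. The helpers below are a sketch with hypothetical names (`usdToBaseUnits`, `baseUnitsToUsd`); only the 6-decimal precision comes from the text above.

```typescript
const USDC_DECIMALS = 6;

// "0.05" -> "50000". Works on digits only, so no float rounding occurs.
function usdToBaseUnits(amount: string): string {
  if (!/^\d+(\.\d+)?$/.test(amount)) {
    throw new Error(`invalid decimal amount: ${amount}`);
  }
  const [whole, fraction = ''] = amount.split('.');
  if (fraction.length > USDC_DECIMALS) {
    throw new Error(`more than ${USDC_DECIMALS} decimal places: ${amount}`);
  }
  const digits = whole + fraction.padEnd(USDC_DECIMALS, '0');
  // Strip leading zeros but keep a single "0" for a zero amount.
  return digits.replace(/^0+(?=\d)/, '');
}

// "50000" -> "0.05". Trailing zeros in the fraction are dropped.
function baseUnitsToUsd(units: string): string {
  if (!/^\d+$/.test(units)) {
    throw new Error(`invalid base-unit amount: ${units}`);
  }
  const padded = units.padStart(USDC_DECIMALS + 1, '0');
  const whole = padded.slice(0, -USDC_DECIMALS);
  const fraction = padded.slice(-USDC_DECIMALS).replace(/0+$/, '');
  return fraction ? `${whole}.${fraction}` : whole;
}
```

The `amount` field in the 402 response is already a string, so it can be passed to `baseUnitsToUsd` directly for display.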

### Step 2: Payment Submitted

The client submits payment proof. For the A2A protocol, this is a JSON-RPC call:

```json
{
  "jsonrpc": "2.0",
  "method": "x402/payment-submitted",
  "params": {
    "payment_id": "pay_abc123",
    "scheme": "onchain",
    "tx_hash": "0xabc..."
  },
  "id": 1
}
```

For direct HTTP, the client includes the `X-402-Payment` header on a retry of the original request.

### Step 3: Payment Completed

P402 verifies the payment on-chain and releases the resource:

```json
{
  "jsonrpc": "2.0",
  "result": {
    "payment_id": "pay_abc123",
    "status": "completed",
    "settlement": {
      "tx_hash": "0xabc...",
      "block_number": 12345678,
      "amount_settled": "50000",
      "fee_usd": 0.001
    },
    "receipt": {
      "receipt_id": "rec_xyz",
      "signature": "0x...",
      "valid_until": "2026-02-25T12:00:00Z"
    }
  },
  "id": 1
}
```

The receipt can be reused for subsequent requests within its validity window.

---

## Payment Schemes

### `exact` (EIP-3009 Gasless)

The primary x402 scheme and the one Coinbase's hosted facilitator uses. The client signs a `transferWithAuthorization` message (EIP-3009) without broadcasting a transaction. The facilitator (P402) submits the transaction on behalf of the client, making it gasless for the payer.

```typescript
// Client-side: sign the EIP-3009 authorization (ethers v6).
// USDC_ADDRESS, P402_TREASURY, wallet, and walletAddress are defined elsewhere.
import { hexlify, parseUnits, randomBytes } from 'ethers';

const domain = {
  name: 'USD Coin',
  version: '2',
  chainId: 8453,               // Base mainnet
  verifyingContract: USDC_ADDRESS
};

const types = {
  TransferWithAuthorization: [
    { name: 'from', type: 'address' },
    { name: 'to', type: 'address' },
    { name: 'value', type: 'uint256' },
    { name: 'validAfter', type: 'uint256' },
    { name: 'validBefore', type: 'uint256' },
    { name: 'nonce', type: 'bytes32' }
  ]
};

// The signed fields. The nonce is a random 32-byte value, hex-encoded so it can be serialized.
const authorization = {
  from: walletAddress,
  to: P402_TREASURY,
  value: parseUnits('0.05', 6),   // $0.05 USDC
  validAfter: 0,
  validBefore: Math.floor(Date.now() / 1000) + 3600,  // valid for one hour
  nonce: hexlify(randomBytes(32))
};

const signature = await wallet.signTypedData(domain, types, authorization);

// Submit to P402
await fetch('https://p402.io/api/a2a', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    jsonrpc: '2.0',
    method: 'x402/payment-submitted',
    params: {
      payment_id: 'pay_abc123',
      scheme: 'exact',
      signature: signature
    },
    id: 1
  })
});
```

### `onchain` (Direct Transaction)

The client broadcasts a USDC transfer to the P402 treasury address, then submits the transaction hash for verification.

```typescript
// Client-side: send USDC on Base
const tx = await usdcContract.transfer(
  P402_TREASURY,
  parseUnits('0.05', 6)
);
await tx.wait();

// Submit tx hash for verification
const header = `x402-v1;network=8453;token=${USDC_ADDRESS};tx=${tx.hash}`;

// Option A: via X-402-Payment header
const response = await fetch('https://p402.io/api/v2/chat/completions', {
  method: 'POST',  // fetch defaults to GET, which cannot carry a body
  headers: {
    'Content-Type': 'application/json',
    'X-402-Payment': header,
    'Authorization': `Bearer ${P402_KEY}`
  },
  body: JSON.stringify({ messages, p402: { mode: 'cost' } })
});

// Option B: via A2A JSON-RPC
await fetch('https://p402.io/api/a2a', {
  method: 'POST',
  body: JSON.stringify({
    jsonrpc: '2.0',
    method: 'x402/payment-submitted',
    params: { payment_id: 'pay_abc', scheme: 'onchain', tx_hash: tx.hash },
    id: 1
  })
});
```

P402 verifies the transaction by:
1. Fetching the transaction receipt from Base RPC
2. Confirming receipt status is `success`
3. Finding a USDC Transfer event log to the expected treasury address
4. Checking the transfer amount meets the minimum
5. Extracting the sender address from the event log
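
The five checks above can be sketched as a pure function over a transaction receipt. This is an illustration, not P402's verifier: the `Receipt` and `Log` shapes follow the common JSON-RPC client convention (for example viem's `status: 'success' | 'reverted'`), and `verifyUsdcPayment` is a hypothetical name. The Transfer topic is the standard keccak256 hash of `Transfer(address,address,uint256)`.

```typescript
interface Log {
  address: string;   // contract that emitted the event
  topics: string[];  // [event signature, indexed from, indexed to]
  data: string;      // 32-byte hex-encoded transfer value
}

interface Receipt {
  status: 'success' | 'reverted';
  logs: Log[];
}

type Verification =
  | { ok: true; sender: string; amount: bigint }
  | { ok: false; reason: string };

// keccak256("Transfer(address,address,uint256)")
const TRANSFER_TOPIC =
  '0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef';

// Indexed addresses are left-padded to 32 bytes; keep the last 20.
function topicToAddress(topic: string): string {
  return '0x' + topic.slice(-40).toLowerCase();
}

function verifyUsdcPayment(
  receipt: Receipt,
  usdcAddress: string,
  treasury: string,
  minAmount: bigint
): Verification {
  if (receipt.status !== 'success') {
    return { ok: false, reason: 'transaction reverted' };
  }
  for (const log of receipt.logs) {
    const [signature, from, to] = log.topics;
    if (!signature || !from || !to) continue;
    if (log.address.toLowerCase() !== usdcAddress.toLowerCase()) continue;
    if (signature.toLowerCase() !== TRANSFER_TOPIC) continue;
    if (topicToAddress(to) !== treasury.toLowerCase()) continue;
    const amount = BigInt(log.data);
    if (amount < minAmount) continue;
    return { ok: true, sender: topicToAddress(from), amount };
  }
  return { ok: false, reason: 'no qualifying USDC transfer to treasury' };
}
```

Checking `log.address` matters: without it, a Transfer event from any other ERC-20 token sent to the treasury would pass as a USDC payment.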

### `receipt` (Reuse Prior Payments)

After a successful payment, the `payment-completed` response includes a receipt with a `receipt_id` and `signature`. This receipt can be reused for subsequent requests within its validity window, avoiding repeated on-chain transactions for high-frequency micropayments.

```typescript
// After initial payment, save the receipt
const receipt = paymentResponse.result.receipt;

// Reuse for next request
await fetch('https://p402.io/api/a2a', {
  method: 'POST',
  body: JSON.stringify({
    jsonrpc: '2.0',
    method: 'x402/payment-submitted',
    params: {
      payment_id: 'pay_new_456',
      scheme: 'receipt',
      receipt_id: receipt.receipt_id
    },
    id: 2
  })
});
```

This amortizes gas costs across many requests, which is essential for true micropayment economics.
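
Before reusing a receipt, a client would typically check that it is still inside its validity window. The sketch below assumes only the receipt fields shown in the `payment-completed` response; `isReceiptUsable` and the safety margin are hypothetical.

```typescript
interface PaymentReceipt {
  receipt_id: string;
  signature: string;
  valid_until: string; // ISO 8601 timestamp
}

// The margin guards against the receipt expiring while the request is in flight.
function isReceiptUsable(
  receipt: PaymentReceipt,
  now: Date = new Date(),
  safetyMarginMs: number = 5000
): boolean {
  const expiry = Date.parse(receipt.valid_until);
  // An unparseable timestamp yields NaN and is treated as unusable.
  return Number.isFinite(expiry) && now.getTime() + safetyMarginMs < expiry;
}
```

When this returns `false`, the client falls back to a fresh `exact` or `onchain` payment and stores the new receipt.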

---

## Session-Based Payments

For most developers, session-based payments are simpler than the raw x402 flow. The pattern:

1. **Fund a session** with USDC (one transaction)
2. **Use the session** for many API calls (P402 deducts from the session balance per request)
3. **The session is exhausted** when the budget runs out

```typescript
// 1. Create session
const session = await createSession({ budget_usd: 10.00 });

// 2. Fund via Base Pay or direct USDC transfer
await fundSession({
  session_id: session.id,
  amount: '10.00',
  tx_hash: '0xabc...',
  source: 'direct',
  network: 'base'
});

// 3. Use session for requests (budget tracked automatically)
const response = await fetch('https://p402.io/api/v2/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${P402_KEY}`
  },
  body: JSON.stringify({
    messages: [{ role: 'user', content: 'Hello' }],
    p402: { session_id: session.id, mode: 'cost' }
  })
});

// 4. Check remaining budget (parse the JSON body before reading fields)
const status = await fetch(`https://p402.io/api/v2/sessions/${session.id}`, {
  headers: { 'Authorization': `Bearer ${P402_KEY}` }
}).then(r => r.json());
// status.budget.remaining_usd shows what's left
```

Session funding accepts:
- **`base_pay`**: USDC via Coinbase Base Pay (used in the P402 mini app)
- **`direct`**: Direct USDC transfer to P402 treasury, with tx_hash for verification
- **`test`**: Test credits for development (no real funds required)

---

## The X-402-Payment Header

The canonical header format for inline payment submission:

```
X-402-Payment: x402-v1;network=8453;token=0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913;tx=0xabc...
```

Fields are semicolon-delimited key=value pairs:

| Field | Required | Description |
|-------|----------|-------------|
| `x402-v1` | Yes | Protocol version (always first) |
| `network` | Yes | Chain ID (8453 for Base mainnet, 84532 for Base Sepolia) |
| `token` | Yes | Token contract address (USDC) |
| `tx` | Conditional | Transaction hash (for `onchain` scheme) |
| `sig` | Conditional | EIP-3009 signature (for `exact` scheme) |
| `receipt` | Conditional | Receipt ID (for `receipt` scheme) |
| `amount` | Optional | Payment amount in token units |
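
Building and parsing this header is plain string handling. The helpers below are a sketch based only on the field table; `PaymentProof`, `buildPaymentHeader`, and `parsePaymentHeader` are hypothetical names.

```typescript
// Exactly one proof field is sent, depending on the payment scheme.
type PaymentProof =
  | { scheme: 'onchain'; tx: string }
  | { scheme: 'exact'; sig: string }
  | { scheme: 'receipt'; receipt: string };

function buildPaymentHeader(
  chainId: number,
  token: string,
  proof: PaymentProof,
  amount?: string
): string {
  const fields = ['x402-v1', `network=${chainId}`, `token=${token}`];
  if (proof.scheme === 'onchain') fields.push(`tx=${proof.tx}`);
  else if (proof.scheme === 'exact') fields.push(`sig=${proof.sig}`);
  else fields.push(`receipt=${proof.receipt}`);
  if (amount !== undefined) fields.push(`amount=${amount}`);
  return fields.join(';');
}

function parsePaymentHeader(header: string): Record<string, string> {
  const [version, ...pairs] = header.split(';').map((part) => part.trim());
  if (version !== 'x402-v1') {
    throw new Error('unsupported or missing protocol version');
  }
  const fields: Record<string, string> = {};
  for (const pair of pairs) {
    const eq = pair.indexOf('=');
    if (eq <= 0) throw new Error(`malformed field: ${pair}`);
    fields[pair.slice(0, eq)] = pair.slice(eq + 1);
  }
  for (const required of ['network', 'token']) {
    if (!(required in fields)) throw new Error(`missing required field: ${required}`);
  }
  return fields;
}
```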

---

## USDC Contract Addresses

| Network | Chain ID | USDC Address |
|---------|----------|-------------|
| Base Mainnet | 8453 | `0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913` |
| Base Sepolia | 84532 | `0x036CbD53842c5426634e7929541eC2318f3dCF7e` |

P402 Treasury Address: `0xb23f146251e3816a011e800bcbae704baa5619ec`

---

## Settlement Verification

P402 handles all on-chain verification server-side. When you submit a payment (transaction hash, EIP-3009 signature, or receipt ID), P402 verifies it automatically before releasing the resource. The verification process checks:

1. Transaction receipt confirms success
2. USDC Transfer event to the correct treasury address
3. Transfer amount meets the minimum requirement
4. Sender address is extracted and recorded

Developers using the session-based flow do not need to implement any verification logic. P402 handles it automatically when funding a session with a transaction hash.

For the raw x402 flow, P402 verifies payment proof server-side and returns the result in the `payment-completed` response. See the [P402 dashboard](https://p402.io/dashboard) for settlement logs and transaction history.

---

## Integration Examples

### Minimal: Session-Based (Recommended for Most Developers)

```typescript
// setup.ts
const P402_BASE = 'https://p402.io';
const headers = {
  'Content-Type': 'application/json',
  'Authorization': `Bearer ${process.env.P402_API_KEY}`
};

// Create and fund a session once
async function setupAgent(budgetUsd: number) {
  const session = await fetch(`${P402_BASE}/api/v2/sessions`, {
    method: 'POST', headers,
    body: JSON.stringify({ budget_usd: budgetUsd, agent_identifier: 'my-agent' })
  }).then(r => r.json());

  return session.id;
}

// Use for every AI call
async function chat(sessionId: string, messages: any[], mode = 'balanced') {
  const response = await fetch(`${P402_BASE}/api/v2/chat/completions`, {
    method: 'POST', headers,
    body: JSON.stringify({
      messages,
      p402: { session_id: sessionId, mode, cache: true }
    })
  }).then(r => r.json());

  return {
    content: response.choices[0].message.content,
    cost: response.p402_metadata.cost_usd,
    provider: response.p402_metadata.provider
  };
}
```

### Advanced: x402 Pay-Per-Request

```typescript
// For agents that pay per request via x402 headers
async function payAndChat(messages: any[], walletClient: any) {
  // First request gets 402
  const initial = await fetch(`${P402_BASE}/api/v2/chat/completions`, {
    method: 'POST', headers,
    body: JSON.stringify({ messages, p402: { mode: 'cost' } })
  });

  if (initial.status === 402) {
    const invoice = await initial.json();
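    // Pick the onchain scheme, since this example pays with a direct transfer
    const scheme = invoice.schemes.find((s: any) => s.scheme === 'onchain');
    if (!scheme) throw new Error('Service does not offer the onchain scheme');

    // Pay on-chain: an ERC-20 transfer is a call to the USDC contract,
    // with the recipient and amount encoded in the calldata
    const tx = await walletClient.sendTransaction({
      to: USDC_ADDRESS,
      data: encodeUsdcTransfer(scheme.recipient, scheme.amount)
    });
    await tx.wait();

    // Retry with payment proof (the header expects the chain ID: 8453 for Base mainnet)
    const paymentHeader = `x402-v1;network=8453;token=${USDC_ADDRESS};tx=${tx.hash}`;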
    const paid = await fetch(`${P402_BASE}/api/v2/chat/completions`, {
      method: 'POST',
      headers: { ...headers, 'X-402-Payment': paymentHeader },
      body: JSON.stringify({ messages, p402: { mode: 'cost' } })
    });

    return paid.json();
  }

  return initial.json();
}
```

---

**Start with sessions:** For most developers, session-based payments are the fastest path to production. Create a session, fund it with USDC, and start making API calls. Try it instantly via the [P402 Mini App](https://mini.p402.io) -- connect a Base Account, fund via Base Pay, and chat with real-time cost tracking. For the full x402 payment flow, get your API key at [p402.io](https://p402.io).


---

# P402 A2A Protocol Guide

Integration guide for Google's Agent-to-Agent (A2A) protocol as implemented by P402, including JSON-RPC messaging, task lifecycle, agent discovery, AP2 mandates, the x402 payment extension, and the Bazaar service marketplace.

## Table of Contents
1. [Overview](#overview)
2. [Agent Discovery](#agent-discovery)
3. [JSON-RPC Messaging](#json-rpc-messaging)
4. [Task Lifecycle](#task-lifecycle)
5. [AP2 Mandates](#ap2-mandates)
6. [x402 Payment Extension](#x402-payment-extension)
7. [The Bazaar](#the-bazaar)
8. [TypeScript Types](#typescript-types)
9. [Complete Integration Example](#complete-integration-example)

---

## Overview

Google's A2A protocol standardizes how AI agents discover and communicate with each other. P402 implements A2A with a payment extension, enabling agents to not just exchange tasks but also settle payments for services rendered.

The protocol uses JSON-RPC 2.0 over HTTPS. Agents discover each other via a well-known endpoint, exchange messages containing tasks and artifacts, and optionally settle payments using the x402 extension.

P402's A2A endpoint: `https://p402.io/api/a2a`

---

## Agent Discovery

Every A2A-compliant agent publishes a discovery document:

```bash
curl https://p402.io/.well-known/agent.json
```

**Response:**

```json
{
  "protocolVersion": "0.1",
  "name": "P402 Router Agent",
  "description": "Payment-aware AI orchestration agent. Routes requests to 300+ models with cost optimization and x402 settlement.",
  "url": "https://p402.io",
  "iconUrl": "https://p402.io/icon.png",
  "version": "2.1.0",
  "capabilities": {
    "streaming": true,
    "pushNotifications": true
  },
  "skills": [
    {
      "id": "ai-routing",
      "name": "AI Model Routing",
      "description": "Route AI requests to optimal provider by cost, speed, or quality",
      "tags": ["ai", "routing", "llm"]
    },
    {
      "id": "cost-intelligence",
      "name": "Cost Intelligence",
      "description": "Compare model pricing and optimize AI spending",
      "tags": ["cost", "analytics", "optimization"]
    }
  ],
  "defaultInputModes": ["text"],
  "defaultOutputModes": ["text"],
  "extensions": [
    {
      "uri": "tag:x402.org,2025:x402-payment",
      "config": {
        "schemes": ["exact", "onchain", "receipt"],
        "assets": ["USDC", "USDT"],
        "networks": ["base"]
      }
    }
  ],
  "endpoints": {
    "a2a": {
      "jsonrpc": "https://p402.io/api/a2a",
      "stream": "https://p402.io/api/a2a/stream"
    }
  }
}
```

To discover other agents, check their `/.well-known/agent.json`. The `skills` array tells you what the agent can do. The `extensions` array tells you what protocols it supports (including x402 for payments).
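
Once a discovery document is fetched, checking for payment support is a lookup in the `extensions` array. The sketch below types only the fields it reads; `findX402Config` and `supportsPayment` are hypothetical helper names.

```typescript
interface AgentExtension {
  uri: string;
  config?: { schemes?: string[]; assets?: string[]; networks?: string[] };
}

// Only the fields this sketch reads; the full document has more.
interface AgentCard {
  name: string;
  skills: Array<{ id: string; tags?: string[] }>;
  extensions?: AgentExtension[];
}

const X402_EXTENSION_URI = 'tag:x402.org,2025:x402-payment';

// Returns the x402 config, or null if the agent does not advertise the extension.
function findX402Config(card: AgentCard): AgentExtension['config'] | null {
  const extension = card.extensions?.find((e) => e.uri === X402_EXTENSION_URI);
  return extension?.config ?? null;
}

function supportsPayment(
  card: AgentCard,
  scheme: string,
  asset: string,
  network: string
): boolean {
  const config = findX402Config(card);
  if (!config) return false;
  return (
    (config.schemes ?? []).includes(scheme) &&
    (config.assets ?? []).includes(asset) &&
    (config.networks ?? []).includes(network)
  );
}
```

An agent would call `supportsPayment(card, 'exact', 'USDC', 'base')` after fetching `/.well-known/agent.json` and before attempting a paid request.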

---

## JSON-RPC Messaging

All A2A communication uses JSON-RPC 2.0:

### `message/send`

Send a message to the P402 agent:

```json
{
  "jsonrpc": "2.0",
  "method": "message/send",
  "params": {
    "message": {
      "role": "user",
      "parts": [
        { "type": "text", "text": "Summarize this document in 3 bullet points." }
      ]
    },
    "configuration": {
      "mode": "cost",
      "maxCost": 0.10,
      "provider": "anthropic",
      "model": "claude-sonnet-4-6"
    }
  },
  "id": 1
}
```

**Response:**

```json
{
  "jsonrpc": "2.0",
  "result": {
    "task": {
      "id": "task_abc123",
      "contextId": "ctx_xyz",
      "status": {
        "state": "completed",
        "message": {
          "role": "agent",
          "parts": [
            { "type": "text", "text": "Here are the three key points..." }
          ]
        },
        "timestamp": "2026-02-24T10:30:00Z"
      },
      "metadata": {
        "cost_usd": 0.003,
        "latency_ms": 1200
      }
    }
  },
  "id": 1
}
```

The `configuration` object maps directly to P402 routing options. The `contextId` groups related messages into a conversation for multi-turn interactions.

### Message Structure

Messages contain `parts` that can be text or structured data:

```typescript
interface A2AMessage {
  role: 'user' | 'agent' | 'system';
  parts: Array<
    | { type: 'text'; text: string }
    | { type: 'data'; data: any }
  >;
}
```

---

## Task Lifecycle

Every A2A interaction creates a task. Tasks flow through defined states:

```
pending -> processing -> completed
                     \-> failed
                     \-> cancelled
```

| State | Meaning |
|-------|---------|
| `pending` | Task received, not yet started |
| `processing` | Task is being worked on (streaming may be active) |
| `completed` | Task finished successfully, artifacts available |
| `failed` | Task failed (error in metadata) |
| `cancelled` | Task was cancelled by the requester |
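
Client code that tracks tasks can encode the lifecycle as a transition table. The sketch below follows the diagram literally (terminal states branch from `processing`); whether P402 also allows cancelling a `pending` task is not stated here, so that transition is left out as an assumption.

```typescript
type TaskState = 'pending' | 'processing' | 'completed' | 'failed' | 'cancelled';

// Allowed next states for each state, taken from the diagram above.
const TRANSITIONS: Record<TaskState, TaskState[]> = {
  pending: ['processing'],
  processing: ['completed', 'failed', 'cancelled'],
  completed: [],
  failed: [],
  cancelled: [],
};

function canTransition(from: TaskState, to: TaskState): boolean {
  return TRANSITIONS[from].includes(to);
}

// A terminal state has no outgoing transitions, so polling can stop.
function isTerminal(state: TaskState): boolean {
  return TRANSITIONS[state].length === 0;
}
```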

Tasks can produce **artifacts** -- discrete outputs attached to the task:

```typescript
interface A2ATask {
  id: string;
  contextId?: string;
  status: {
    state: 'pending' | 'processing' | 'completed' | 'failed' | 'cancelled';
    message?: A2AMessage;
    timestamp: string;
  };
  artifacts?: Array<{
    id: string;
    name: string;
    parts: A2APart[];
  }>;
  metadata?: {
    cost_usd?: number;
    latency_ms?: number;
  };
}
```

---

## AP2 Mandates

AP2 (Agent Payments Protocol) mandates are the authorization primitive for agent spending. A mandate is a pre-signed permission granting an agent the right to spend up to a specified amount.

### CDP Session Auto-Provisioning (Recommended)

When you create a session with `wallet_source: "cdp"` and an `agent_id`, P402 automatically issues a `payment` mandate scoped to that agent, so no separate mandate API call is needed:

```bash
curl -X POST https://p402.io/api/v2/sessions \
  -H "Content-Type: application/json" \
  -H "x-p402-session: YOUR_SESSION_KEY" \
  -d '{
    "wallet_source": "cdp",
    "agent_id": "my-autonomous-agent",
    "budget_usd": 10.00,
    "expires_in_hours": 24
  }'
```

The response `policy.ap2_mandate_id` field contains the auto-issued mandate ID. All subsequent auto-pay calls through this session are pre-checked against the mandate budget:
- **Budget check:** `amount_spent_usd + request_cost <= max_amount_usd`; returns 403 with `{ error: { type: "mandate_error", code: "MANDATE_BUDGET_EXCEEDED" } }` on failure
- **Expiry check:** mandate `valid_until` matches the session `expires_at`
- **ERC-8004 trust gate:** if `ERC8004_ENABLE_VALIDATION=true`, the agent's on-chain reputation must be ≥ 50 (returns 403 with `SECURITY_PACK_BLOCKED` on failure)

After each successful auto-pay, `budget_spent_usd` on the session increments atomically and ERC-8004 reputation feedback is queued for the agent.

Pre-existing sessions (created before this feature) have no `ap2_mandate_id` in their `policy` object and skip the mandate pre-check, so the feature is fully backwards compatible.

### Manual Mandate Creation

For non-CDP sessions or custom mandate constraints:

```bash
curl -X POST https://p402.io/api/a2a/mandates \
  -H "Content-Type: application/json" \
  -d '{
    "mandate": {
      "type": "intent",
      "user_did": "did:key:user123",
      "agent_did": "did:key:agent456",
      "constraints": {
        "max_amount_usd": 10.00,
        "allowed_categories": ["ai-inference", "web-search"],
        "valid_until": "2026-03-01T00:00:00Z"
      }
    }
  }'
```

**Response:**

```json
{
  "mandate": {
    "id": "mandate_abc",
    "tenant_id": "tenant_default",
    "type": "intent",
    "user_did": "did:key:user123",
    "agent_did": "did:key:agent456",
    "constraints": {
      "max_amount_usd": 10.00,
      "allowed_categories": ["ai-inference", "web-search"],
      "valid_until": "2026-03-01T00:00:00Z"
    },
    "amount_spent_usd": 0,
    "status": "active"
  }
}
```

### Use a Mandate

When an agent makes a request, it references its mandate. P402 checks:
1. Mandate exists and is active
2. Agent DID matches
3. `amount_spent_usd + request_cost <= max_amount_usd`
4. Category is allowed
5. Current time is before `valid_until`

If all checks pass, the request proceeds and `amount_spent_usd` is atomically incremented.

```bash
curl -X POST https://p402.io/api/a2a/mandates/mandate_abc/use \
  -H "Content-Type: application/json" \
  -d '{
    "amount_usd": 0.05,
    "category": "ai-inference",
    "description": "Claude Sonnet 4.6 completion, 500 tokens"
  }'
```
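
The five mandate checks listed above can be sketched as a pure function. This mirrors the documented rules but is not P402's server code: `checkMandate` and `MandateDecision` are hypothetical names, and comparing amounts in integer micro-dollars is a design choice made here to avoid floating-point drift near the limit.

```typescript
interface Mandate {
  id: string;
  agent_did: string;
  constraints: {
    max_amount_usd: number;
    allowed_categories: string[];
    valid_until: string; // ISO 8601 timestamp
  };
  amount_spent_usd: number;
  status: 'active' | 'exhausted' | 'expired' | 'revoked';
}

type MandateDecision = { allowed: true } | { allowed: false; reason: string };

// Convert dollars to integer micro-dollars so sums compare exactly.
const toMicros = (usd: number): number => Math.round(usd * 1e6);

const deny = (reason: string): MandateDecision => ({ allowed: false, reason });

function checkMandate(
  mandate: Mandate | undefined,
  agentDid: string,
  amountUsd: number,
  category: string,
  now: Date = new Date()
): MandateDecision {
  // Fail closed: anything missing or inactive is a denial.
  if (!mandate || mandate.status !== 'active') return deny('mandate missing or not active');
  if (mandate.agent_did !== agentDid) return deny('agent DID mismatch');

  const { max_amount_usd, allowed_categories, valid_until } = mandate.constraints;
  if (toMicros(mandate.amount_spent_usd) + toMicros(amountUsd) > toMicros(max_amount_usd)) {
    return deny('budget exceeded');
  }
  if (!allowed_categories.includes(category)) return deny('category not allowed');

  const expiry = Date.parse(valid_until);
  if (!Number.isFinite(expiry) || now.getTime() >= expiry) return deny('mandate expired');

  return { allowed: true };
}
```

The micro-dollar conversion matters at the boundary: in floating point, `9.95 + 0.05` may not equal `10` exactly, which could wrongly reject a request that exactly exhausts the budget.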

### Mandate Types

| Type | Use Case |
|------|----------|
| `intent` | Pre-authorize a spending budget for a specific agent |
| `cart` | Authorize a batch of specific services at fixed prices |
| `payment` | One-time payment authorization |

### Mandate Statuses

| Status | Meaning |
|--------|---------|
| `active` | Mandate is valid and has remaining budget |
| `exhausted` | `amount_spent_usd >= max_amount_usd` |
| `expired` | Current time is past `valid_until` |
| `revoked` | Manually revoked by the user |

The design is fail-closed: an agent cannot spend a single cent without a valid, active mandate with available budget.

---

## x402 Payment Extension

P402 implements the A2A x402 extension for cryptographic payment negotiation within agent conversations. The extension URI is:

```
tag:x402.org,2025:x402-payment
```

### Three-Message Payment Flow

Within an A2A conversation, payments follow a 3-message flow using dedicated JSON-RPC methods:

**1. `x402/payment-required`** (Agent to Client)

```json
{
  "method": "x402/payment-required",
  "params": {
    "payment_id": "pay_123",
    "schemes": [
      {
        "scheme": "exact",
        "recipient": "0xb23f...",
        "amount": "50000",
        "asset": "USDC",
        "network": "base"
      }
    ],
    "service_description": "AI inference via Claude Sonnet 4.6",
    "expires_at": "2026-02-24T12:00:00Z"
  }
}
```

**2. `x402/payment-submitted`** (Client to Agent)

```json
{
  "method": "x402/payment-submitted",
  "params": {
    "payment_id": "pay_123",
    "scheme": "exact",
    "signature": "0x..."
  }
}
```

**3. `x402/payment-completed`** (Agent to Client)

```json
{
  "result": {
    "payment_id": "pay_123",
    "status": "completed",
    "settlement": {
      "tx_hash": "0xabc...",
      "block_number": 12345678,
      "amount_settled": "50000"
    },
    "receipt": {
      "receipt_id": "rec_456",
      "signature": "0x...",
      "valid_until": "2026-02-25T12:00:00Z"
    }
  }
}
```

For detailed implementation of each payment scheme (exact, onchain, receipt), see `references/payment-flows.md`.
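
The `amount` and `amount_settled` fields in the messages above are strings. The sketch below assumes they are denominated in the token's smallest unit; the source does not state this. USDC uses 6 decimals, so under that assumption `"50000"` reads as 0.05 USDC. The helper name `atomicToDecimalString` is hypothetical. It works on strings to avoid floating-point rounding.

```typescript
const USDC_DECIMALS = 6; // USDC token decimals

// Converts an unsigned integer string in smallest units to a decimal string.
function atomicToDecimalString(amount: string, decimals: number = USDC_DECIMALS): string {
  if (!/^\d+$/.test(amount)) {
    throw new Error(`not an unsigned integer string: ${amount}`);
  }
  // Drop leading zeros but keep at least one digit.
  const digits = amount.replace(/^0+(?=\d)/, '');
  // Left-pad so there is always at least one digit before the decimal point.
  const padded = digits.padStart(decimals + 1, '0');
  const whole = padded.slice(0, padded.length - decimals);
  const frac = padded.slice(padded.length - decimals).replace(/0+$/, '');
  return frac ? `${whole}.${frac}` : whole;
}
```

A client could use this to show the requested amount to a user, or compare it with a mandate's remaining budget, before signing.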

---

## The Bazaar

The Bazaar is P402's service discovery marketplace where agents publish capabilities and their prices. Other agents discover available services, compare pricing, and invoke them with automatic payment settlement.

### Discover Services

```bash
curl "https://p402.io/api/v2/bazaar/discover?capability=web-search"
```

**Response:**

```json
{
  "resources": [
    {
      "id": "baz_search_001",
      "name": "Web Search Agent",
      "description": "Real-time web search with summarization",
      "endpoint": "https://search-agent.example.com/api/a2a",
      "price_per_call_usd": 0.01,
      "capabilities": ["web-search", "summarization"],
      "health": "healthy",
      "avg_latency_ms": 2500
    }
  ]
}
```
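
To compare pricing across discovered services, a caller might filter the `resources` array client-side. The sketch below is one possible selection policy, not part of the P402 API: `BazaarResource` mirrors the fields in the response above, and `pickCheapest` is a hypothetical helper that keeps healthy services with the requested capability and prefers the lowest price, then the lowest latency.

```typescript
// Mirrors the fields used from the discover response shown above.
interface BazaarResource {
  id: string;
  name: string;
  price_per_call_usd: number;
  capabilities: string[];
  health: string;
  avg_latency_ms: number;
}

// Returns the cheapest healthy service offering `capability`, or undefined.
function pickCheapest(
  resources: BazaarResource[],
  capability: string,
  maxLatencyMs: number = Infinity
): BazaarResource | undefined {
  const candidates = resources.filter(
    r =>
      r.health === 'healthy' &&
      r.capabilities.includes(capability) &&
      r.avg_latency_ms <= maxLatencyMs
  );
  // Sort by price first; break ties by latency.
  candidates.sort(
    (a, b) =>
      a.price_per_call_usd - b.price_per_call_usd ||
      a.avg_latency_ms - b.avg_latency_ms
  );
  return candidates[0];
}
```

The returned `id` is what would be passed as `resource_id` when calling the service.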

### List All Services

```bash
curl https://p402.io/api/v2/bazaar/listings
```

### Call a Service

```bash
curl -X POST https://p402.io/api/v2/bazaar/call \
  -H "Content-Type: application/json" \
  -d '{
    "resource_id": "baz_search_001",
    "message": {
      "role": "user",
      "parts": [{ "type": "text", "text": "Search for latest AI news" }]
    },
    "mandate_id": "mandate_abc"
  }'
```

P402 handles the payment settlement between the caller and the service provider.

---

## TypeScript Types

Core types for A2A integration:

```typescript
// Message types
type A2ARole = 'user' | 'agent' | 'system';

interface A2APart {
  type: 'text' | 'data';
  text?: string;
  data?: any;
}

interface A2AMessage {
  role: A2ARole;
  parts: A2APart[];
}

// Task types
type A2ATaskState = 'pending' | 'processing' | 'completed' | 'failed' | 'cancelled';

interface A2ATask {
  id: string;
  contextId?: string;
  status: {
    state: A2ATaskState;
    message?: A2AMessage;
    timestamp: string;
  };
  artifacts?: Array<{
    id: string;
    name: string;
    parts: A2APart[];
  }>;
  metadata?: {
    cost_usd?: number;
    latency_ms?: number;
  };
}

// Mandate types
type MandateType = 'intent' | 'cart' | 'payment';

interface AP2Mandate {
  id: string;
  tenant_id: string;
  type: MandateType;
  user_did: string;
  agent_did: string;
  constraints: {
    max_amount_usd: number;
    allowed_categories?: string[];
    valid_until?: string;
  };
  signature?: string;
  public_key?: string;
  amount_spent_usd: number;
  status: 'active' | 'exhausted' | 'expired' | 'revoked';
}

// x402 Extension types
const X402_EXTENSION_URI = 'tag:x402.org,2025:x402-payment';

type X402PaymentScheme = 'exact' | 'onchain' | 'receipt';

interface X402PaymentRequired {
  payment_id: string;
  schemes: Array<{
    scheme: X402PaymentScheme;
    recipient: string;
    amount: string;
    asset: string;
    network: string;
    domain?: string;
    verifyingContract?: string;
    nonce?: string;
    valid_until?: string;
  }>;
  service_description: string;
  expires_at: string;
}

interface X402PaymentSubmitted {
  payment_id: string;
  scheme: X402PaymentScheme;
  signature?: string;  // For exact (EIP-3009)
  tx_hash?: string;    // For onchain
  receipt_id?: string; // For receipt reuse
}

interface X402PaymentCompleted {
  payment_id: string;
  settlement?: {
    tx_hash: string;
    block_number?: number;
    amount_settled: string;
    fee_usd?: number;
  };
  receipt?: {
    receipt_id: string;
    signature: string;
    valid_until?: string;
  };
  status: 'completed' | 'failed';
}

// Agent Card (discovery)
interface AgentCard {
  protocolVersion: string;
  name: string;
  description: string;
  url: string;
  iconUrl?: string;
  version?: string;
  capabilities?: Record<string, any>;
  skills?: Array<{
    id: string;
    name: string;
    description: string;
    tags?: string[];
  }>;
  defaultInputModes?: string[];
  defaultOutputModes?: string[];
  extensions?: Array<{
    uri: string;
    config?: any;
  }>;
  endpoints?: Record<string, any>;
}
```
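
In `X402PaymentSubmitted`, the proof fields are all optional at the type level, while the comments tie each one to a scheme. A runtime check along the following lines can catch a submission that names a scheme but omits its proof. The types are repeated so the sketch is self-contained; `missingProofField` is a hypothetical helper, and the scheme-to-field mapping is read off the comments in the type definition rather than from a specification.

```typescript
type X402PaymentScheme = 'exact' | 'onchain' | 'receipt';

interface X402PaymentSubmitted {
  payment_id: string;
  scheme: X402PaymentScheme;
  signature?: string;  // For exact (EIP-3009)
  tx_hash?: string;    // For onchain
  receipt_id?: string; // For receipt reuse
}

// Which proof field each scheme is expected to carry.
const REQUIRED_PROOF = {
  exact: 'signature',
  onchain: 'tx_hash',
  receipt: 'receipt_id'
} as const;

// Returns the name of the missing proof field, or null if it is present.
function missingProofField(msg: X402PaymentSubmitted): string | null {
  const field = REQUIRED_PROOF[msg.scheme];
  return msg[field] ? null : field;
}
```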

---

## Complete Integration Example

An agent that discovers P402, creates a mandate, and makes a paid AI request:

```typescript
// 1. Discover P402's capabilities
const agentCard = await fetch('https://p402.io/.well-known/agent.json')
  .then(r => r.json());

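const supportsPayments = agentCard.extensions?.some(
  (e: { uri: string }) => e.uri === 'tag:x402.org,2025:x402-payment'
);
// Stop early if the agent card does not advertise the x402 payment extension
if (!supportsPayments) {
  throw new Error('P402 agent card does not list the x402 payment extension');
}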

// 2. Create a spending mandate
const mandate = await fetch('https://p402.io/api/a2a/mandates', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    mandate: {
      type: 'intent',
      user_did: 'did:key:myuser',
      agent_did: 'did:key:myagent',
      constraints: {
        max_amount_usd: 5.00,
        allowed_categories: ['ai-inference'],
        valid_until: new Date(Date.now() + 86400000).toISOString()
      }
    }
  })
}).then(r => r.json());

// 3. Send a task via A2A
const task = await fetch('https://p402.io/api/a2a', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    jsonrpc: '2.0',
    method: 'message/send',
    params: {
      message: {
        role: 'user',
        parts: [{ type: 'text', text: 'What are the top 3 trends in AI this week?' }]
      },
      configuration: { mode: 'balanced' }
    },
    id: 1
  })
}).then(r => r.json());

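// A failed JSON-RPC call returns `error` instead of `result`
if (task.error) {
  throw new Error(`A2A request failed: ${task.error.message}`);
}

console.log('Task:', task.result.task.id);
// `status.message` and `metadata` are optional on A2ATask, so read them defensively
console.log('Response:', task.result.task.status.message?.parts[0]?.text);
console.log('Cost:', task.result.task.metadata?.cost_usd);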
```

---

**Build your first agent integration:** Start by discovering P402's capabilities at [p402.io/.well-known/agent.json](https://p402.io/.well-known/agent.json), then create a mandate and send your first A2A message. The [P402 Mini App](https://mini.p402.io) demonstrates the full session and payment flow in a consumer-friendly interface. Get your API key at [p402.io](https://p402.io).
