BUILD A
BUDGET AGENT.

An AI agent that tracks its own spending, enforces hard budget caps, and automatically uses cached responses to stretch every dollar.

What you'll build

✓ A Python agent with a $5 hard spending cap
✓ Automatic cost-optimised routing on every request
✓ Semantic cache that makes repeated queries free
✓ Budget exhaustion handling with graceful shutdown
✓ Real-time spend tracking via session stats

Prerequisites

1. A P402 API key (free to create)
2. Python 3.9+ with the dependencies installed: pip install openai requests

Create a Session

A session is a budget-capped container. Every LLM call made with a session's ID is charged against its budget. When the budget is exhausted, the session rejects further requests — no surprise bills.

python

import os, requests

P402_API_KEY = os.environ["P402_API_KEY"]

def create_session(budget_usd: float) -> str:
    resp = requests.post(
        "https://p402.io/api/v2/sessions",
        headers={"Authorization": f"Bearer {P402_API_KEY}"},
        json={"budget_usd": budget_usd},
        timeout=10,
    )
    resp.raise_for_status()
    data = resp.json()
    print(f"Session {data['id']} — budget ${data['budget_usd']:.2f}")
    return data["id"]

# Example call (main() below creates the session the agent actually uses):
# session_id = create_session(5.00)   # Hard cap: $5
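
To make the "budget-capped container" idea concrete, here is a minimal local sketch of the behaviour described above. It is an in-memory illustration only: LocalSession and BudgetExceeded are hypothetical names, and whether P402 rejects a request before or after it crosses the cap is an assumption, not something this tutorial specifies.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a charge would push spend past the cap (hypothetical name)."""


@dataclass
class LocalSession:
    """Toy in-memory stand-in for a budget-capped session (not the P402 implementation)."""
    budget_usd: float
    spent_usd: float = 0.0

    def charge(self, cost_usd: float) -> float:
        """Record a charge and return the remaining budget."""
        # Assumption: a charge that would exceed the cap is rejected outright.
        if self.spent_usd + cost_usd > self.budget_usd:
            raise BudgetExceeded(f"cap of ${self.budget_usd:.2f} reached")
        self.spent_usd += cost_usd
        return self.budget_usd - self.spent_usd


session = LocalSession(budget_usd=5.00)
print(session.charge(1.50))        # 3.5
try:
    session.charge(4.00)           # 1.50 + 4.00 would exceed the $5 cap
except BudgetExceeded as exc:
    print(f"Rejected: {exc}")      # Rejected: cap of $5.00 reached
```

The real session lives server-side, so your agent never has to do this bookkeeping itself. The sketch only shows why a hard cap rules out surprise bills.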

Wire Up the Agent

P402 is OpenAI-compatible. Replace the base URL with P402's and pass your session ID through the SDK's extra_body argument. No other SDK changes are needed.

python

from openai import OpenAI

client = OpenAI(
    api_key=P402_API_KEY,
    base_url="https://p402.io/api/v2",
)

def ask(question: str, session_id: str) -> str:
    """Send a question and return the answer text."""
    response = client.chat.completions.create(
        model="auto",          # P402 picks the cheapest model that answers well
        messages=[{"role": "user", "content": question}],
        extra_body={
            "p402": {
                "session_id": session_id,
                "mode": "cost",    # Optimise for lowest cost
                "cache": True,     # Return cached answer if identical query seen before
            }
        },
    )

    # P402 metadata is attached to every response; fall back to {} if it is missing or None
    meta = getattr(response, "p402_metadata", None) or {}
    provider = meta.get("provider", "unknown")
    cost     = meta.get("cost_usd", 0)
    cached   = meta.get("cached", False)

    label = "CACHED (free)" if cached else f"${cost:.4f} via {provider}"
    print(f"  [{label}]")

    return response.choices[0].message.content or ""
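
If the idea of a cache that matches similar (not just byte-identical) queries is new to you, the toy below may help. It is a sketch under loud assumptions: ToyCache is a hypothetical class, and it scores similarity by word overlap (Jaccard), whereas a production semantic cache would more likely compare embeddings. Nothing here describes how P402's cache works internally.

```python
import re
from typing import Optional


def _tokens(text: str) -> frozenset:
    """Lowercase the text and reduce it to a set of alphanumeric words."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


class ToyCache:
    """Toy similarity cache using Jaccard word overlap instead of real embeddings."""

    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold
        self.entries = []  # list of (token set, stored answer) pairs

    def get(self, query: str) -> Optional[str]:
        """Return the first stored answer whose query is similar enough, else None."""
        q = _tokens(query)
        for tokens, answer in self.entries:
            union = q | tokens
            if union and len(q & tokens) / len(union) >= self.threshold:
                return answer
        return None

    def put(self, query: str, answer: str) -> None:
        """Store an answer under the token set of its query."""
        self.entries.append((_tokens(query), answer))


cache = ToyCache()
cache.put("What is the x402 payment protocol?", "x402 is ...")
print(cache.get("what is the x402 payment protocol"))   # x402 is ...
print(cache.get("How does semantic caching work?"))     # None
```

The first lookup hits even though its casing and punctuation differ from the stored query. The second shares no words with it, so it misses and would be sent to a model.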

Track Spend in Real Time

Poll the session stats endpoint before each request. Warn the user once less than 10% of the cap remains ($0.50 of a $5 budget), and exit cleanly when nothing is left.

python

def get_session_stats(session_id: str) -> dict:
    resp = requests.get(
        f"https://p402.io/api/v2/sessions/{session_id}/stats",
        headers={"Authorization": f"Bearer {P402_API_KEY}"},
        timeout=5,
    )
    resp.raise_for_status()
    return resp.json()

def budget_remaining(session_id: str) -> float:
    stats = get_session_stats(session_id)
    spent  = stats.get("budget_spent_usd", 0)
    budget = stats.get("budget_usd", 0)
    return budget - spent
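
The 10% rule can also be expressed as a fraction of the cap, so it keeps working if you change the budget. This is a minimal sketch: budget_status and its warn_fraction parameter are hypothetical names, not part of the P402 API.

```python
def budget_status(spent_usd: float, budget_usd: float, warn_fraction: float = 0.10) -> str:
    """Classify spend as 'ok', 'warn', or 'exhausted' relative to the cap."""
    remaining = budget_usd - spent_usd
    if remaining <= 0:
        return "exhausted"
    # Warn once less than warn_fraction of the budget is left
    if remaining < budget_usd * warn_fraction:
        return "warn"
    return "ok"


print(budget_status(1.00, 5.00))   # ok
print(budget_status(4.60, 5.00))   # warn
print(budget_status(5.00, 5.00))   # exhausted
```

You could feed it the budget_spent_usd and budget_usd fields returned by get_session_stats.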

Handle Budget Exhaustion

When the session is exhausted, P402 returns HTTP 402 with error code SESSION_BUDGET_EXCEEDED. Catch it and gracefully shut the agent down or provision a new session.

python

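from typing import Optional

import openai

# Optional[str] rather than "str | None" so the annotation also works on Python 3.9
def safe_ask(question: str, session_id: str) -> Optional[str]: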
    remaining = budget_remaining(session_id)

    if remaining <= 0:
        print("Budget exhausted. Shutting down.")
        return None

    if remaining < 0.50:
        print(f"Warning: only ${remaining:.2f} remaining.")

    try:
        return ask(question, session_id)
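    except openai.APIStatusError as e:
        # BadRequestError only covers HTTP 400; a 402 surfaces as the generic APIStatusError
        if e.status_code == 402 and "SESSION_BUDGET_EXCEEDED" in str(e):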
            print("Session budget exhausted mid-run.")
            return None
        raise
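
If you would rather provision a new session than shut down, one possible shape is sketched below. ask_with_rollover is a hypothetical helper. It takes the ask and session-creation functions as arguments so that it can be exercised here with fakes; in the agent you would pass safe_ask and a small wrapper around create_session.

```python
from typing import Callable, Optional, Tuple


def ask_with_rollover(
    question: str,
    session_id: str,
    ask_fn: Callable[[str, str], Optional[str]],
    new_session_fn: Callable[[], str],
    max_rollovers: int = 1,
) -> Tuple[Optional[str], str]:
    """Return (answer, session_id), opening a fresh session if the current one is spent."""
    for attempt in range(max_rollovers + 1):
        answer = ask_fn(question, session_id)
        if answer is not None:
            return answer, session_id
        # None means the session is exhausted; only roll over while attempts remain
        if attempt < max_rollovers:
            session_id = new_session_fn()
    return None, session_id


# Demo with fakes: the session named "spent" always reports exhaustion
def fake_ask(question: str, session_id: str) -> Optional[str]:
    return None if session_id == "spent" else f"answer from {session_id}"


print(ask_with_rollover("hi", "spent", fake_ask, lambda: "fresh"))
# ('answer from fresh', 'fresh')
```

Every rollover raises your effective cap by the new session's budget, so keep max_rollovers small if the point of the agent is a hard limit.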

Run the Agent

Put it together. The agent processes a queue of questions, tracks spend, and stops when the budget is gone.

python

QUESTIONS = [
    "What is the x402 payment protocol?",
    "Explain EIP-3009 transferWithAuthorization.",
    "What is the difference between cost and quality routing?",
    "What is the x402 payment protocol?",   # ← identical — will be served from cache
    "How does semantic caching work?",
]

def main():
    session_id = create_session(5.00)
    print(f"\nStarting agent with $5.00 budget\n{'─'*45}")

    for i, question in enumerate(QUESTIONS, 1):
        print(f"\nQ{i}: {question[:60]}...")
        answer = safe_ask(question, session_id)
        if answer is None:
            break
        print(f"A: {answer[:120]}...")

    stats = get_session_stats(session_id)
    print(f"\n{'─'*45}")
    print(f"Total spent:  ${stats['budget_spent_usd']:.4f}")
    print(f"Requests:     {stats['request_count']}")
    print(f"Cache hits:   {stats.get('cache_hits', 0)}")

if __name__ == "__main__":
    main()

Expected output

Starting agent with $5.00 budget
─────────────────────────────────────────────

Q1: What is the x402 payment protocol?...
  [$0.0003 via deepseek]
A: x402 is a machine-native payment standard...

Q2: Explain EIP-3009 transferWithAuthorization....
  [$0.0004 via deepseek]
A: EIP-3009 defines a way for token holders to...

Q3: What is the difference between cost and quality routing?...
  [$0.0002 via deepseek]

Q4: What is the x402 payment protocol?...
  [CACHED (free)]                       ← identical query, zero cost

Q5: How does semantic caching work?...
  [$0.0003 via deepseek]

─────────────────────────────────────────────
Total spent:  $0.0012
Requests:     5
Cache hits:   1
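
As a sanity check on the numbers above, the short calculation below re-adds the per-request costs and estimates how far a $5 cap would stretch at that rate. The prices are the illustrative figures from the sample output, not a pricing guarantee, and Decimal is used to avoid floating-point rounding noise.

```python
from decimal import Decimal

# Per-request costs from the sample run (Q4 was a cache hit, so it cost nothing)
costs = [Decimal("0.0003"), Decimal("0.0004"), Decimal("0.0002"),
         Decimal("0"), Decimal("0.0003")]

total = sum(costs, Decimal("0"))
average = total / len(costs)
budget = Decimal("5.00")

print(total)                    # 0.0012
print(average)                  # 0.00024
print(int(budget // average))   # 20833 requests at this average cost
```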

TypeScript Variant

Same pattern, zero extra dependencies beyond the official OpenAI SDK.

typescript

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.P402_API_KEY,
  baseURL: 'https://p402.io/api/v2',
});

// Create session
const session = await fetch('https://p402.io/api/v2/sessions', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.P402_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ budget_usd: 5 }),
}).then((r) => r.json());

// Ask with budget cap
const response = await client.chat.completions.create({
  model: 'auto',
  messages: [{ role: 'user', content: 'Explain EIP-3009.' }],
  // @ts-expect-error — P402 extension field
  p402: { session_id: session.id, mode: 'cost', cache: true },
});

const meta = (response as unknown as Record<string, unknown>).p402_metadata as {
  cost_usd: number;
  cached: boolean;
  provider: string;
} | undefined;

console.log(`Cost: $${meta?.cost_usd ?? 0} via ${meta?.provider}`);
console.log(response.choices[0]?.message.content);

What's next

You have a working budget agent. Here's how to go deeper:
