GitHub - rickcrawford/tokenomics [proxy]

  _____     _                            _
 |_   _|__ | | _____ _ __   ___  _ __ __(_) ___ ___
   | |/ _ \| |/ / _ \ '_ \ / _ \| '_ ` _ \| |/ __/ __|
   | | (_) |   <  __/ | | | (_) | | | | | | | (__\__ \
   |_|\___/|_|\_\___|_| |_|\___/|_| |_| |_|_|\___|___/

Personal Guardrails for Token Usage - Safety first (PII, prompts, rules), then scoped tokens with request and cost controls.

Tokenomics is an OpenAI-compatible reverse proxy you run yourself. It gives you the features of an AI gateway (guardrails, budgets, rate limits, multi-provider routing) but under your control from your client. No vendor lock-in, no sending traffic through a third party. Issue scoped wrapper tokens instead of raw API keys; each token enforces what models, content, and spend are allowed.

One binary. Zero code changes. Drop it in front of any agent that speaks the OpenAI protocol.

Created by Rick Crawford • LinkedIn • MIT License

Keywords

ai-proxy • llm-gateway • token-budgeting • cost-control • safety-guardrails • prompt-injection-detection • multi-provider-routing • rate-limiting • pii-masking • api-gateway • openai-compatible • agent-security

What It Does
Installation
Quick Start
Features
Use Cases
Documentation
OpenClaw Integration

What It Does

🔒 Safety Guardrails (First)

Content inspection runs on every request before it hits the provider. PII, prompts, and rules stay under your control.

PII masking auto-redacts SSNs, credit cards, emails, API keys, private keys, and 6 more types
Content rules regex, keyword, and PII rules with fail/warn/log/mask on input, output, or both
System prompts injected server-side so agents always run under the right instructions
Jailbreak detection blocks prompt injection attempts that try to override instructions
Retry and fallback chains recover from provider failures with cheaper models

🛑 PATs with Request and Cost Controls

Create scoped tokens (PATs) instead of handing out API keys. Each token has budgets, rate limits, and model restrictions. When limits are hit, the proxy blocks requests.

Token budgets daily, hourly, and monthly caps per token
Rate limiting requests/min, tokens/hour, max parallel; sliding or fixed window
Model allowlists so not every task burns your most expensive models
Token expiration durations (24h, 7d) or exact timestamps for temporary access
Multi-provider routing send requests to the provider that fits your constraints

📋 Usage Tracking

Every conversation that flows through the proxy is optionally recorded. Session logs capture the full request/response exchange with cost details. Store on disk, in Redis, or both for team access.

Per-token conversation logs in markdown, grouped by date and session
File or Redis backends for local development or shared team deployments
Configurable per policy, so sensitive tokens skip recording while others capture everything
Pattern-based file naming with {token_hash}, {date}, and {token_hash} placeholders

📊 Cost Attribution

The ledger writes a JSON summary for every proxy session into a .tokenomics/ directory. Commit it alongside your code. Over time, you get a complete record of token consumption per feature, per branch, per agent, per team.

Per-session JSON with request-level detail and rollups by model, provider, and token
Git context captures branch, start commit, and end commit for cost-per-feature analysis
Provider metadata normalizes cached tokens, reasoning tokens, actual model served, and rate limits
CLI commands (ledger summary, ledger sessions, ledger show) for usage analysis
Cost-per-feature attribution by committing .tokenomics/ and querying by branch

🔄 Multi-Provider Support

One wrapper token can route to any provider. The policy decides which API key to use based on model and constraints. Switch providers without changing your agent code.

Supported providers include OpenAI, Anthropic, Azure OpenAI, Google Gemini, Groq, Mistral, Cohere, Perplexity, DeepSeek, Together AI, Fireworks AI, Replicate, AWS Bedrock, and any OpenAI-compatible endpoint.

🔍 Observability

Every request produces structured JSON logs with token counts, latency, upstream IDs, rule matches, retry counts, and provider details. Webhooks fire on token events, violations, budget alerts, rate limit hits, and completions.

Installation

curl -fsSL https://github.com/rickcrawford/tokenomics/releases/latest/download/install.sh | bash

Or build from source:

git clone https://github.com/rickcrawford/tokenomics.git
cd tokenomics && make build
sudo cp bin/tokenomics /usr/local/bin/

Verify: tokenomics --help

Quick Start

Full guide: Quick Start.

1. Set environment variables

export OPENAI_PAT="<your-openai-api-key>"
export TOKENOMICS_HASH_KEY="<any-random-secret-string>"

2. Create a wrapper token

tokenomics token create --policy '{"base_key_env":"OPENAI_PAT"}'

3. Run

export TOKENOMICS_KEY="tkn_<paste-your-token-here>"
tokenomics run python my_script.py

The run command starts the proxy, configures environment variables, runs your command, and cleans up. No separate server setup needed. Admin is disabled for this ephemeral proxy unless you pass --admin and admin.enabled is true in config.

Default directory: Tokenomics stores data (tokens, ledger, certs) in ~/.tokenomics/ by default. Use --dir .tokenomics to use the current directory, or --dir /path for a custom location.

Embedded admin UI: Start the proxy and open http://localhost:8080 or https://localhost:8443 to view analytics, keys, sessions, and memory dashboards.

See examples/ for provider configs, sample policies, and an end-to-end walkthrough.

Features

Guardrail Type	Feature	Description
Safety	PII masking	Auto-redact SSNs, credit cards, emails, API keys, and 7 more types
Safety	Content rules	Regex, keyword, and PII rules with fail/warn/log/mask actions
Safety	System prompts	Server-side instruction injection on every request
Safety	Jailbreak detection	Detect prompt injection attempts that override instructions
Safety	Retry and fallback	Auto-recover from failures with model fallback chains
Cost Control	Token budgets	Per-token daily/monthly spending caps
Cost Control	Rate limiting	Requests/min, tokens/hour, max parallel; sliding or fixed window
Cost Control	Model allowlists	Exact match or regex-based model filtering
Cost Control	Token expiration	Temporary access with durations (24h, 7d) or timestamps
Tracking	Conversation logs	Per-token markdown logs of user/assistant exchanges
Tracking	Redis backend	Shared memory across distributed agents
Tracking	Session JSON	Per-session logs with request-level detail and rollups
Tracking	Git context	Branch, commit start/end for cost-per-feature analysis
Tracking	Provider metadata	Cached tokens, reasoning tokens, actual model, rate limits
Tracking	CLI commands	`ledger summary`, `ledger sessions`, `ledger show`
Routing	Multi-provider	Route to 17+ providers with model-based selection
Routing	Remote sync	Load tokens from a central config server via webhooks
Observability	Structured logging	JSON logs with rule matches, upstream IDs, and costs
Observability	Webhooks	Events for violations, budget alerts, rate limits
Security	Encryption	AES-256-GCM at-rest encryption for policies
Security	Token isolation	Scoped wrapper tokens instead of raw API keys

Use Cases

AI Safety & Compliance Teams Enforce content policies, detect prompt injection, mask PII, and maintain audit logs for regulated environments.

Multi-Tenant SaaS Platforms Issue scoped tokens to customers with per-customer budgets, rate limits, and model restrictions. Track costs per tenant.

Agent Fleet Operators Deploy 100+ autonomous agents with unified cost controls, safety guardrails, and usage tracking across your entire fleet.

Cost-Conscious Development Teams Set monthly budgets, prevent runaway spend with fallback providers, and analyze cost-per-feature using git context.

LLM Experimentation Switch between providers (OpenAI, Anthropic, Groq) without code changes. A/B test models and track costs.

Enterprise API Management Replace expensive third-party gateways with a single binary you control. No vendor lock-in, no traffic routing through third parties.

Documentation

Topic	Description
Features	Complete feature reference organized by category
Quick Start	Fast setup and first request in minutes
Examples	Provider configs, sample policies, webhook collector, env template
Configuration	config.yaml fields, environment variables, CLI flags
Secrets & Environment	API key management, .env file handling, secret rotation
Policies	Policy JSON schema, model filtering, rules, prompts, memory
Token Management	Creating, inspecting, updating, and deleting tokens
Agent Integration	Connecting agents via `run`, `init`, or manual proxy setup
TLS	Auto-generated certificates, CA trust, custom certs
Stats & Logging	Request logging, /stats endpoint, usage tracking
Events & Webhooks	Webhook events for token CRUD, rule violations, budget alerts
Multi-Model Routing	Provider routing, model matching, auth schemes, fallback chains
Session Ledger	Per-session token tracking, CLI commands, session JSON format
Web Admin	Embedded admin UI routes, APIs, auth, and architecture
Admin UI Guide	Admin tabs, policy editor workflow, embedded docs, and maintenance
Distribution	Installation methods, pre-built binaries, release process
OpenClaw Integration	Connect OpenClaw agents to Tokenomics guardrails

OpenClaw Integration

Tokenomics provides personal guardrails for OpenClaw autonomous agents. Set budgets, enforce safety policies, and track costs across distributed agent fleets, all without modifying agent code.

Example: Run a Slack bot with:

Daily budget: 1M tokens
Safety rules: Block jailbreaks, mask PII, detect injection attempts
Fallback providers: Try Anthropic if OpenAI is over capacity
Usage tracking: Record conversations and cost attribution

See examples/openclaw for complete examples (Slack, Discord, personal assistant) and docs/OPENCLAW_INTEGRATION.md for the integration guide.

GitHub - rickcrawford/tokenomics