AI GATEWAY

One SDK. Any model. Any provider.

Twelve OpenAI endpoints, native Anthropic /v1/messages, and native Gemini /v1beta. Swap SDKs, swap providers, swap models — without editing your app.

01

Two-line migration

Keep your SDK. Change the base URL.

import openai

client = openai.OpenAI(
    api_key="vsk_...",                    # Verosek virtual key
    base_url="http://your-gateway/v1",   # ← only line that changed
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "How many users are in the database?"}],
)
# Tools handled internally. Policy enforced. Every step audited.
print(response.choices[0].message.content)
02

Cross-SDK routing matrix

Run Claude through the OpenAI SDK. GPT through Anthropic. Any combination, any time.

SDK \\ ProviderOpenAIAnthropicGemini
OpenAI SDK
Anthropic SDK
Gemini SDK
03

API surface

Fifteen endpoints. Three SDKs. One gateway.

/v1/chat/completions
/v1/completions
/v1/responses
/v1/embeddings
/v1/images/generations
/v1/images/edits
/v1/images/variations
/v1/audio/speech
/v1/audio/transcriptions
/v1/audio/translations
/v1/moderations
GET /v1/models
POST /v1/messages (Anthropic)
/v1beta/models/{model}:generateContent (Gemini)
/v1beta/models/{model}:embedContent
04

Virtual keys

Identity for non-human actors.

Budget caps

Daily / weekly / monthly spend limits per key. Stops on breach.

TTL

Every key expires. No forgotten service accounts.

Rotate

One-click rotate with grace period for in-flight calls.

Revoke

Instant kill. Background audit continues.

Audit binding

Every trace carries the key_id. Permanent forensic link.

05

Routing

Weighted routing. Priority fallback. Automatic cooldown.

Deployments cool off after N consecutive failures. Traffic drains to the next priority tier until health returns. No per-request timeout wait.

06

Performance contract

< 30 ms P99 overhead. Measured on baseline profile.

StageBudgetWhat it does
Virtual-key check~1 msRedis cache hit on key_id
Request translation~1 msPure Python, no I/O
Shield pre-LLM enforce< 30 msLocal PII + secrets. ML classifiers run async.
Shield post-LLM enforce< 30 msSame pattern as pre-LLM.
Audit enqueue~0.5 msRedis RPUSH, never blocks.

Two lines to migrate. Every SDK to every provider.

Point your client at the gateway. Keep shipping.