Claude Code's ANTHROPIC_BASE_URL env var lets you redirect every request anywhere you want — but the 2-line setup hides a real production checklist. What breaks, what to watch for, when it's worth it.
---
title: Migrating Claude Code to a custom backend in 2 lines (and what to actually watch for)
published: True
description: Claude Code's ANTHROPIC_BASE_URL env var lets you redirect every request anywhere you want — but the 2-line setup hides a real production checklist. What breaks, what to watch for, when it's worth it.
tags: ai, claudeai, anthropic, devops
cover_image:
---
**TL;DR:** Claude Code's `ANTHROPIC_BASE_URL` env var lets you redirect every request to any compatible backend in one shell line. Almost no one does it, but it unlocks caching, rate-limit pooling, multi-vendor routing, audit logging, and billing routing without rewriting any code. Here's the 2-line setup, the things that quietly break, and the production checklist for actually running it.
---
export ANTHROPIC_BASE_URL=https://your-proxy.example.com export ANTHROPIC_API_KEY=your-key claude
That's it. The Anthropic SDK (which Claude Code is built on) doesn't care where it's sending requests, as long as the response comes back in Anthropic Messages format. It'll append `/v1/messages` automatically, forward all your `anthropic-version` and `anthropic-beta` headers, and stream SSE the same way.
Same trick works with Cursor, Cline, Continue, and anything else that builds on the Anthropic SDK.
---
I've found five patterns that justify a custom backend:
Anthropic's prompt cache feature requires manual `cache_control` markers that most teams skip. A proxy can inject them server-side on every system message. For a Claude Code session with a 20-30K-token system prompt and 50+ turns, this cuts your bill by 60-80%.
Most teams have 3-5 API keys spread across projects. Without a proxy, each is rate-limited independently — you can hit 429 on one while another sits idle. A proxy can round-robin across keys, exposing one virtual key with the combined rate budget.
Want to route `claude-haiku-4-5` calls to one provider and `claude-opus-4-7` calls to another (maybe one offers Haiku throughput discounts)? A proxy inspects the `model` field and routes accordingly. Your Claude Code config doesn't change.
Claude Code makes a *lot* of requests. Without instrumentation, you have no idea what each session cost. A proxy can log per-request metadata (model, tokens, latency, session id) to a warehouse.
Different team members hitting different cost centers? A proxy can map API keys to billing tags and forward usage to finance.
---
The 2-line setup works for the happy path. In practice, here's what bites you.
Anthropic returns chunked SSE for `stream: true`. If your proxy buffers responses or doesn't flush each chunk immediately, you'll see massive perceived latency — Claude Code will spin while waiting for the full response.
Fix: pass thro
// related articles
Twitter/X: 🚀 While you’re sleeping, the market is printing. I’m a futures trader, sharing my personal high…
Twitter/X: anthropic is literally crushing the new model benchmark https://t.co/Mhyy2SZtUQ
Twitter/X: EU regulator evaluating implications of Anthropic Mythos curbs after US directive https://t.co/FE…