LiteLLM proxy — one endpoint, rotated keys
Running LiteLLM as a proxy between my applications and the various LLM APIs I use. One endpoint, standard OpenAI-compatible API, backend switching in the proxy config.
The motivation: I use several LLMs (Anthropic, OpenAI, local llama.cpp) from several applications. Without a proxy, each application has its own API key, its own model selection logic, its own retry handling. When a key rotates or a model changes, I update multiple places.
With the proxy, applications talk to a single endpoint (http://homelab:8000 in my setup). The proxy handles routing, key management, and fallback.
# litellm_config.yaml
model_list:
  - model_name: default                         # primary route: Anthropic Haiku
    litellm_params:
      model: anthropic/claude-3-haiku-20240307
      api_key: os.environ/ANTHROPIC_API_KEY     # read from the proxy's environment
  - model_name: fast
    litellm_params:
      model: openai/gpt-4o-mini
      api_key: os.environ/OPENAI_API_KEY
  - model_name: local                           # llama.cpp server, OpenAI-compatible endpoint
    litellm_params:
      model: openai/mistral-7b
      api_base: http://localhost:8080
      api_key: none                             # placeholder; no real key supplied for the local server
router_settings:
  routing_strategy: simple-shuffle
  fallbacks: [{default: [fast]}]                # if "default" fails, retry against "fast"
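Starting the proxy is just pointing the LiteLLM CLI at this file; in my case something like litellm --config litellm_config.yaml --port 8000 (exact flags can vary by LiteLLM version).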
The proxy runs as a systemd service on the homelab. Applications set OPENAI_BASE_URL=http://homelab:8000 and OPENAI_API_KEY=litellm-virtual-key; the actual provider keys live only in the proxy’s environment.
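For reference, a minimal client sketch using the OpenAI Python SDK looks like this; the base URL, virtual key, and prompt are placeholders matching the setup above, and the model name simply selects a route from model_list:

# Any OpenAI-compatible client works unchanged; only base_url and the model name matter.
from openai import OpenAI

client = OpenAI(
    base_url="http://homelab:8000",   # the LiteLLM proxy, not a provider API
    api_key="litellm-virtual-key",    # virtual key; real provider keys stay on the proxy
)

resp = client.chat.completions.create(
    model="default",                  # or "fast" / "local", per model_list above
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)

Pointing an application at a different backend is then a one-word change to the model name, or no change at all when the proxy's fallback kicks in.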
Key rotation: update the env var on the proxy, restart the service. No application changes.
The fallback config has saved me twice — once when the Anthropic API had a brief outage, once when I hit a rate limit.
The dashboard is useful for tracking token usage by application. Not essential but nice.