
LiteLLM proxy — one endpoint, rotated keys

Running LiteLLM as a proxy between my applications and the various LLM APIs I use. One endpoint, standard OpenAI-compatible API, backend switching in the proxy config.

The motivation: I use several LLMs (Anthropic, OpenAI, local llama.cpp) from several applications. Without a proxy, each application has its own API key, its own model selection logic, its own retry handling. When a key rotates or a model changes, I update multiple places.

With the proxy: applications talk to http://localhost:8000. The proxy handles routing, key management, and fallback.
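
From the application side, it's just the standard OpenAI client pointed somewhere else. A minimal sketch with the OpenAI Python SDK, using one of the model aliases from the config below:

from openai import OpenAI

# Point the standard client at the proxy instead of api.openai.com.
client = OpenAI(base_url="http://localhost:8000", api_key="litellm-virtual-key")

resp = client.chat.completions.create(
    model="default",  # resolved by the proxy's model_list, not by OpenAI
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)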

# litellm_config.yaml
model_list:
  # Alias "default" -> Anthropic; applications only ever see the alias.
  - model_name: default
    litellm_params:
      model: anthropic/claude-3-haiku-20240307
      api_key: os.environ/ANTHROPIC_API_KEY  # read from the proxy's environment

  - model_name: fast
    litellm_params:
      model: openai/gpt-4o-mini
      api_key: os.environ/OPENAI_API_KEY

  # Local llama.cpp server via its OpenAI-compatible endpoint.
  - model_name: local
    litellm_params:
      model: openai/mistral-7b
      api_base: http://localhost:8080
      api_key: none  # llama.cpp isn't checking keys, but LiteLLM wants a value

router_settings:
  routing_strategy: simple-shuffle  # shuffle across deployments sharing an alias
  fallbacks: [{default: [fast]}]    # if "default" fails, retry the call on "fast"
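
One gap worth flagging: nothing in the config above checks the key that clients send. For the virtual key mentioned below to actually be enforced, LiteLLM takes a master key in general_settings. A sketch, assuming the key is supplied via the environment like the provider keys:

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY  # requests without a valid key are rejected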

Running as a systemd service on the homelab. Applications configure OPENAI_BASE_URL=http://homelab:8000 and OPENAI_API_KEY=litellm-virtual-key. The actual API keys live only in the proxy’s environment.
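
The unit file is nothing special. Roughly what mine looks like; the binary and env file paths are illustrative:

[Unit]
Description=LiteLLM proxy
After=network-online.target

[Service]
# The real provider keys live in this file, readable by root only.
EnvironmentFile=/etc/litellm/env
ExecStart=/usr/local/bin/litellm --config /etc/litellm/litellm_config.yaml --port 8000
Restart=on-failure

[Install]
WantedBy=multi-user.target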

Key rotation: update the env var on the proxy, restart the service. No application changes.
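
Concretely, assuming the unit and env file above:

# put the new provider key into /etc/litellm/env, then:
sudo systemctl restart litellm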

The fallback config has saved me twice — once when the Anthropic API had a brief outage, once when I hit a rate limit.

The dashboard is useful for tracking token usage by application. Not essential but nice.