Control Plane for Multi-region Architecture (Enterprise)
Learn how to deploy LiteLLM across multiple regions while maintaining centralized administration and avoiding duplication of management overhead.
Overview
When scaling LiteLLM for production use, you may want to deploy multiple instances across different regions or availability zones while maintaining a single point of administration. This guide covers how to set up a distributed LiteLLM deployment with:
- Regional Worker Instances: Handle LLM requests for users in specific regions
- Centralized Admin Instance: Manages configuration, users, keys, and monitoring
Architecture Pattern: Regional + Admin Instances
Typical Deployment Scenario
Benefits of This Architecture
- Reduced Management Overhead: Only one instance needs admin capabilities
- Regional Performance: Users get low-latency access from their region
- Centralized Control: All administration happens from a single interface
- Security: Limit admin access to designated instances only
- Cost Efficiency: Avoid duplicating admin infrastructure
Configuration
Admin Instance Configuration
The admin instance handles all management operations and provides the UI.
Environment Variables for Admin Instance:
# Keep admin capabilities enabled (default behavior)
# DISABLE_ADMIN_UI=false # Admin UI available
# DISABLE_ADMIN_ENDPOINTS=false # Management APIs available
DISABLE_LLM_API_ENDPOINTS=true # LLM APIs disabled
DATABASE_URL=postgresql://user:pass@global-db:5432/litellm
LITELLM_MASTER_KEY=your-master-key
Worker Instance Configuration
Worker instances handle LLM requests but have admin capabilities disabled.
Environment Variables for Worker Instances:
# Disable admin capabilities
DISABLE_ADMIN_UI=true # No admin UI
DISABLE_ADMIN_ENDPOINTS=true # No management endpoints
DATABASE_URL=postgresql://user:pass@global-db:5432/litellm
LITELLM_MASTER_KEY=your-master-key
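The two variable sets above differ only in which DISABLE_* flags are set, so they can be generated from a single role name. The following is a minimal sketch, not part of LiteLLM: `build_env` and `to_dotenv` are hypothetical helpers, and only the variable names and values come from the configuration shown above.

```python
from typing import Dict


def build_env(role: str, database_url: str, master_key: str) -> Dict[str, str]:
    """Return the environment variables for an 'admin' or 'worker' instance.

    Hypothetical helper: only the variable names come from this guide.
    """
    # Both roles share the same database and master key.
    env = {"DATABASE_URL": database_url, "LITELLM_MASTER_KEY": master_key}
    if role == "admin":
        # Admin keeps the UI and management APIs (defaults) and drops LLM APIs.
        env["DISABLE_LLM_API_ENDPOINTS"] = "true"
    elif role == "worker":
        # Workers serve LLM traffic only.
        env["DISABLE_ADMIN_UI"] = "true"
        env["DISABLE_ADMIN_ENDPOINTS"] = "true"
    else:
        raise ValueError(f"unknown role: {role!r}")
    return env


def to_dotenv(env: Dict[str, str]) -> str:
    """Render the variables as KEY=value lines, one per line."""
    return "\n".join(f"{key}={value}" for key, value in env.items())


if __name__ == "__main__":
    db = "postgresql://user:pass@global-db:5432/litellm"
    print(to_dotenv(build_env("worker", db, "your-master-key")))
```

Generating both files from one function keeps DATABASE_URL and LITELLM_MASTER_KEY identical across instances, which is what the configurations above rely on.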
Environment Variables Reference
DISABLE_ADMIN_UI
Disables the LiteLLM Admin UI interface.
- Default: false
- Worker Instances: Set to true
- Admin Instance: Leave as false (or don't set)
# Worker instances
DISABLE_ADMIN_UI=true
Effect: When set to true, the web UI at /ui is unavailable.
DISABLE_ADMIN_ENDPOINTS
Disables all management/admin API endpoints.
- Default: false
- Worker Instances: Set to true
- Admin Instance: Leave as false (or don't set)
# Worker instances
DISABLE_ADMIN_ENDPOINTS=true
Disabled Endpoints Include:
- /key/* - Key management
- /user/* - User management
- /team/* - Team management
- /config/* - Configuration updates
- All other administrative endpoints
Endpoints That Remain Available (when DISABLE_ADMIN_ENDPOINTS=true):
- /chat/completions - LLM requests
- /v1/* - OpenAI-compatible APIs
- /vertex_ai/* - Vertex AI pass-through APIs
- /bedrock/* - Bedrock pass-through APIs
- /health - Basic health check
- /metrics - Prometheus metrics
- All other LLM API endpoints
DISABLE_LLM_API_ENDPOINTS
Disables all LLM API endpoints.
- Default: false
- Worker Instances: Leave as false (or don't set)
- Admin Instance: Set to true
# Admin instance
DISABLE_LLM_API_ENDPOINTS=true
Disabled Endpoints Include:
- /chat/completions - LLM requests
- /v1/* - OpenAI-compatible APIs
- /vertex_ai/* - Vertex AI pass-through APIs
- /bedrock/* - Bedrock pass-through APIs
- All other LLM API endpoints
Endpoints That Remain Available (when DISABLE_LLM_API_ENDPOINTS=true):
- /key/* - Key management
- /user/* - User management
- /team/* - Team management
- /config/* - Configuration updates
- All other administrative endpoints
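The endpoint lists above can be turned into a small lookup for smoke tests or load balancer rules. The sketch below is an approximation, not LiteLLM's own routing logic: `expected_to_be_served` is a hypothetical helper, and the prefix tuples cover only the paths named in this guide, not the "all other" entries.

```python
from typing import Mapping

# Path prefixes copied from the endpoint lists in this guide. The lists end
# with "all other ..." entries, so these tuples are knowingly incomplete.
ADMIN_PREFIXES = ("/key/", "/user/", "/team/", "/config/")
LLM_PREFIXES = ("/chat/completions", "/v1/", "/vertex_ai/", "/bedrock/")


def _flag(env: Mapping[str, str], name: str) -> bool:
    """Treat an unset variable as false, matching the documented defaults."""
    return env.get(name, "false").strip().lower() == "true"


def expected_to_be_served(path: str, env: Mapping[str, str]) -> bool:
    """Guess whether an instance with these flags should serve `path`."""
    if path.startswith(ADMIN_PREFIXES) and _flag(env, "DISABLE_ADMIN_ENDPOINTS"):
        return False
    if path.startswith(LLM_PREFIXES) and _flag(env, "DISABLE_LLM_API_ENDPOINTS"):
        return False
    if path.startswith("/ui") and _flag(env, "DISABLE_ADMIN_UI"):
        return False
    # Everything else (for example /health and /metrics) is assumed available.
    return True


if __name__ == "__main__":
    worker = {"DISABLE_ADMIN_UI": "true", "DISABLE_ADMIN_ENDPOINTS": "true"}
    for path in ("/chat/completions", "/key/generate", "/ui", "/health"):
        print(path, expected_to_be_served(path, worker))
```

A table like this is only a planning aid; the authoritative check is to request each path against a running instance.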
LITELLM_UI_API_DOC_BASE_URL
Optional override for the API Reference base URL (used in sample code/docs) when the admin UI runs on a different host than the proxy.
Usage Patterns
Client Usage
For LLM Requests (use regional endpoints):
import openai

# US users
client_us = openai.OpenAI(
    base_url="https://us.company.com/v1",
    api_key="your-litellm-key",
)

# EU users
client_eu = openai.OpenAI(
    base_url="https://eu.company.com/v1",
    api_key="your-litellm-key",
)

# Send a chat completion through the US regional worker
response = client_us.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
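Rather than hard-coding one client per region, an application can look up the regional base URL at runtime. This is a minimal sketch under stated assumptions: `REGION_BASE_URLS` and `base_url_for` are hypothetical names, the hostnames are the placeholders from the example above, and the fallback to the US endpoint is an arbitrary choice for illustration.

```python
# Placeholder hostnames taken from the example above; replace with real ones.
REGION_BASE_URLS = {
    "us": "https://us.company.com/v1",
    "eu": "https://eu.company.com/v1",
}


def base_url_for(region: str, default: str = "us") -> str:
    """Return the worker base URL for a region code, falling back to `default`."""
    key = region.strip().lower()
    return REGION_BASE_URLS.get(key, REGION_BASE_URLS[default])


if __name__ == "__main__":
    print(base_url_for("EU"))
    print(base_url_for("apac"))  # unknown region, falls back to the default
```

The returned string can be passed as `base_url` to `openai.OpenAI(...)` exactly as in the example above.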
For Administration (use admin endpoint):
import requests

# Create a new API key that expires after 30 days.
# "sk-1234" is a placeholder; use your LITELLM_MASTER_KEY here.
response = requests.post(
    "https://admin.company.com/key/generate",
    headers={"Authorization": "Bearer sk-1234"},
    json={"duration": "30d"},
)
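The same call can be prepared with only the standard library, which is convenient in scripts where `requests` is not installed. This is a sketch rather than an official client: `build_key_request` is a hypothetical helper, and the URL, header, and body simply mirror the example above. Nothing is sent until the request object is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request


def build_key_request(
    admin_base_url: str, master_key: str, duration: str = "30d"
) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the admin instance's /key/generate."""
    body = json.dumps({"duration": duration}).encode("utf-8")
    return urllib.request.Request(
        url=f"{admin_base_url.rstrip('/')}/key/generate",
        data=body,
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder host and key from the example above.
    req = build_key_request("https://admin.company.com", "sk-1234")
    print(req.get_method(), req.full_url)
```

Because the admin and worker configurations in this guide use the same DATABASE_URL, a key created this way is expected to be usable against the regional endpoints as well.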
Related Documentation
- Virtual Keys - Managing API keys and users
- Health Checks - Monitoring instance health
- Prometheus Metrics - Collecting metrics
- Production Deployment - Production best practices