Managing LLMs & Customer LLM Endpoints
This guide explains how to manage Large Language Models (LLMs), token pricing, and customer-provided LLM endpoints in Deepdesk. For technical architecture, see LLM Gateway.
1. Managing LLMs
Viewing LLMs
Navigate to:
Admin → LLM Configs → LLMs
You can view:
- LLM code
- Name
- Model type
- Current token costs (per 1M tokens)
Use filters to narrow by model type or realtime support.
Creating or Editing an LLM
- Click Add LLM or select an existing LLM
- Configure:
- Code (immutable identifier, e.g. gpt-4)
- Name (human-readable label)
- Model type (Chat completion, Realtime, or Embeddings)
- Save changes
Changing an LLM code may impact references across pricing and configurations.
2. Managing Token Costs
Token pricing is versioned and time-based.
Viewing Token Costs
Navigate to:
Admin → LLM Token Costs
Each entry defines:
- LLM
- Input token price (per 1M)
- Output token price (per 1M)
- Currency
- Start date
- Optional end date
Adding or Editing Token Costs
- Click Add LLM Token Cost or select an existing entry
- Configure:
- LLM
- Start date
- End date (optional)
- Text input tokens
- Text output tokens
- Audio input/output tokens (if applicable)
- Currency
The system automatically selects the pricing valid at request time.
- Do not overlap date ranges for the same LLM
- Always add a new pricing entry for changes
- Avoid editing historical prices
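The time-based selection described above can be sketched as follows. The field names and data shapes are illustrative, not Deepdesk's actual schema:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class TokenCost:
    llm_code: str
    input_per_1m: float   # price per 1M input tokens
    output_per_1m: float  # price per 1M output tokens
    currency: str
    start: date
    end: Optional[date] = None  # open-ended entry when None

def cost_at(entries, llm_code, on):
    """Return the pricing entry valid for `llm_code` on date `on`.

    Assumes non-overlapping date ranges per LLM, as the guide requires.
    """
    for e in entries:
        if e.llm_code == llm_code and e.start <= on and (e.end is None or on <= e.end):
            return e
    raise LookupError(f"no pricing for {llm_code} on {on}")

def request_cost(entry, input_tokens, output_tokens):
    # Prices are quoted per 1M tokens, so divide by 1_000_000.
    return (input_tokens * entry.input_per_1m
            + output_tokens * entry.output_per_1m) / 1_000_000
```

For example, adding a new entry with a future start date (and leaving the old entry's end date in place) means `cost_at` picks the old price before the cutover and the new price after it.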
3. Managing LLM Configs (Endpoints)
Navigate to:
Admin → LLM Configs → Add LLM Config
The default Deepdesk LLM endpoints are provisioned automatically by Deepdesk and are not visible in the Admin interface. Only customer-managed endpoints are shown here.
Supported Models
Each LLM Config explicitly lists which models it supports. Only selected models can be routed through the endpoint.
Provider Configuration
Azure (Deepdesk-managed)
Fields:
- Base URL
- API key
Used for Deepdesk-provisioned Azure OpenAI deployments.
Azure (Customer-managed)
Customer endpoints authenticate using OAuth.
Required fields:
- Base URL
- OAuth token URL
- OAuth client ID
- OAuth client secret
- OAuth scopes
Optional:
- Deployment prefix
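The OAuth fields above map onto a standard client-credentials token request. The sketch below only builds the request; the gateway performs this exchange internally, and the URL and scope values are placeholders:

```python
from urllib.parse import urlencode

def build_token_request(token_url, client_id, client_secret, scopes):
    """Build an OAuth2 client-credentials POST for the customer's token URL."""
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": " ".join(scopes),  # OAuth2 scopes are space-delimited
    })
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return token_url, headers, body
```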
Deployment Naming
By default, Deepdesk assumes that deployment names match the model code.
For example, when a request is made for the model gpt-4, the corresponding endpoint is https://customer-base-url/openai/deployments/gpt-4/chat/completions.
Customers can override this behavior by specifying a deployment prefix, for example when they have configured dedicated deployments for Deepdesk alongside other ones.
If a deployment prefix is set (e.g. custom-), the gateway looks for deployments named custom-gpt-4, i.e. https://customer-base-url/openai/deployments/custom-gpt-4/chat/completions.
Note that the deployment name must still match the model code, apart from the optional prefix.
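The naming convention amounts to a one-line URL builder; `base_url` and the prefix value below are illustrative:

```python
def deployment_url(base_url, model_code, prefix=""):
    """Deployment name = optional prefix + model code, per the convention above."""
    return f"{base_url}/openai/deployments/{prefix}{model_code}/chat/completions"
```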
Secrets are stored in Secret Manager, loaded at runtime, and synced every 10 minutes.
Config propagation may therefore take several minutes after saving.
4. Customer-Provided Endpoints
Customer endpoints allow clients to:
- Use their own Azure OpenAI subscription
- Retain compliance and data locality
- Control quotas and models
Deepdesk acts as a secure proxy via the LLM Gateway.
Request Flow
- Backend sends request to LLM Gateway
- Gateway resolves:
- LLM Config
- Endpoint
- Authentication is applied
- Request forwarded to Azure OpenAI
- Response returned to Deepdesk
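The resolution steps above can be sketched as follows. The config shape and field names are assumptions for illustration, not the gateway's actual data model:

```python
def resolve_endpoint(configs, model_code):
    """Pick the first LLM Config whose supported-models list includes the model.

    Each config is a dict with 'supported_models' and 'base_url' keys
    (illustrative shape). Only explicitly selected models can be routed.
    """
    for cfg in configs:
        if model_code in cfg["supported_models"]:
            return cfg
    raise LookupError(f"no LLM Config routes model {model_code}")
```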
5. Load Balancing & Failover
For Deepdesk-managed endpoints:
- Primary and secondary Azure regions are configured
- Automatic failover is handled by the LLM Gateway
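Failover between the two regions can be sketched like this; the retry logic is deliberately simplified and the region callables are placeholders:

```python
def call_with_failover(primary, secondary, request):
    """Try the primary region first; fall back to the secondary on failure."""
    try:
        return primary(request)
    except Exception:
        # The real gateway's retry policy is more selective (e.g. timeouts,
        # 429/5xx responses); a bare except keeps the sketch short.
        return secondary(request)
```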
Customer-managed endpoints are responsible for their own redundancy.
6. Common Workflows
Add a New Model
- Create LLM
- Add token pricing
- Enable model in LLM Configs
Update Pricing
- Add new token cost entry
- Set a future start date
- Leave previous pricing unchanged
Onboard Customer Endpoint
- Create LLM Config
- Select provider = Azure
- Configure OAuth credentials
- Select supported models
- Save