Large Language Models (LLM)

This page provides an overview of how Large Language Models (LLMs), their configurations, and pricing are managed in Deepdesk. For technical details on routing and integration, see LLM Gateway.

What is an LLM?

An LLM in Deepdesk is a logical model definition (e.g., gpt-4, gpt-4o-mini, text-embedding-ada-002). Each LLM defines:

  • Code (unique identifier)
  • Display name
  • Model type (Chat, Realtime, Embeddings)
  • Associated token pricing (time-based)
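The fields above can be sketched as a small data model. This is an illustrative sketch only; the class and field names are assumptions, not Deepdesk's actual schema. It also shows how time-based pricing can be resolved: each price record carries a start date, and the record in effect on a given day is the most recent one at or before that day.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class TokenPrice:
    valid_from: date        # time-based pricing: each price record has a start date
    input_per_1k: float     # price per 1,000 input tokens
    output_per_1k: float    # price per 1,000 output tokens

@dataclass(frozen=True)
class LLM:
    code: str               # unique identifier, e.g. "gpt-4o-mini"
    display_name: str
    model_type: str         # "chat", "realtime", or "embeddings"
    prices: tuple[TokenPrice, ...]

    def price_on(self, day: date) -> TokenPrice:
        """Return the price record in effect on a given date."""
        applicable = [p for p in self.prices if p.valid_from <= day]
        if not applicable:
            raise ValueError(f"no price for {self.code} on {day}")
        return max(applicable, key=lambda p: p.valid_from)
```

Keeping prices as dated records (rather than overwriting a single price) is what makes historical billing reproducible: a usage event from March can always be costed at March's rate.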

LLM Config

An LLM Config specifies how Deepdesk connects to a real provider endpoint (Azure OpenAI or customer-managed). It includes:

  • Provider
  • Authentication method
  • Base URL
  • Supported models
  • Optional deployment prefix

Only customer endpoints are shown in Admin

The default Deepdesk LLM endpoints are provisioned automatically by Deepdesk and are not visible in the Admin interface; only customer-managed endpoints are shown there.
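An LLM Config's fields could be modeled as below. This is a hedged sketch under assumed names (`LLMConfig`, `deployment_name`, the example values); it is not Deepdesk's actual configuration format. It illustrates one plausible role of the optional deployment prefix: mapping a logical model code to a provider-side deployment name.

```python
from dataclasses import dataclass

@dataclass
class LLMConfig:
    provider: str                # e.g. "azure-openai" or a customer-managed provider
    auth_method: str             # e.g. "api-key" or "managed-identity"
    base_url: str                # provider endpoint base URL
    supported_models: list[str]  # LLM codes this endpoint can serve
    deployment_prefix: str = ""  # optional prefix for deployment names

    def deployment_name(self, model_code: str) -> str:
        """Map a logical model code to a deployment name on this endpoint."""
        if model_code not in self.supported_models:
            raise ValueError(f"{model_code} is not supported by this endpoint")
        return f"{self.deployment_prefix}{model_code}"
```

Separating the logical model (`code`) from the endpoint-specific deployment name lets the same model definition be served by different deployments on different endpoints.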

LLM Gateway

The LLM Gateway is the runtime routing layer that:

  • Receives LLM requests from Deepdesk services
  • Resolves the correct endpoint
  • Applies authentication
  • Handles failover and routing
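The resolution and failover steps above can be sketched as a single function. This is a minimal illustration, assuming configs are ordered by priority and carry a health flag; the names and shape are assumptions, not the gateway's actual implementation, and authentication is omitted.

```python
# Pick the first healthy endpoint config that supports the requested model.
# Skipping unhealthy configs is the failover step: if the primary endpoint
# is down, the request falls through to the next one in priority order.
def resolve_endpoint(model_code: str, configs: list[dict]) -> dict:
    for cfg in configs:
        if model_code in cfg["supported_models"] and cfg.get("healthy", True):
            return cfg
    raise LookupError(f"no available endpoint supports {model_code}")
```

For example, if the highest-priority endpoint for a model is marked unhealthy, the gateway would route the request to the next endpoint that lists that model.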

Design Principles

  • Separation of models, pricing, and endpoints: Each is managed independently for flexibility and clarity.
  • Time-based pricing: Ensures accurate billing and historical tracking.
  • Provider abstraction: Supports both Deepdesk-managed and customer-managed endpoints.
  • Secure secret handling: All credentials are stored securely and loaded at runtime.

For step-by-step instructions, see the LLM User Guide.