Large Language Models (LLM)

This page provides an overview of how Large Language Models (LLMs), their configurations, and pricing are managed in Deepdesk. For technical details on routing and integration, see LLM Gateway.

What is an LLM?

An LLM in Deepdesk is a logical model definition (e.g., gpt-4, gpt-4o-mini, text-embedding-ada-002). Each LLM defines:

  • Code (unique identifier)
  • Display name
  • Model type (Chat, Realtime, Embeddings)
  • Associated token pricing (time-based)
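The fields above can be sketched as a small data model. This is an illustrative sketch only; the class and field names are assumptions, not Deepdesk's actual schema. It also shows how time-based pricing can be resolved: each price record carries a start date, and the record in effect on a given day is the most recent one at or before that day.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class TokenPrice:
    valid_from: date        # time-based pricing: each price record has a start date
    input_per_1k: float     # price per 1,000 input tokens
    output_per_1k: float    # price per 1,000 output tokens

@dataclass(frozen=True)
class LLM:
    code: str               # unique identifier, e.g. "gpt-4o-mini"
    display_name: str
    model_type: str         # "chat", "realtime", or "embeddings"
    prices: tuple[TokenPrice, ...]

    def price_on(self, day: date) -> TokenPrice:
        """Return the price record in effect on a given date."""
        applicable = [p for p in self.prices if p.valid_from <= day]
        if not applicable:
            raise ValueError(f"no price for {self.code} on {day}")
        return max(applicable, key=lambda p: p.valid_from)
```

Keeping prices as dated records (rather than overwriting a single price) is what makes historical billing reproducible: a usage event from March can always be costed at March's rate.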

LLM Config

An LLM Config specifies how Deepdesk connects to a real provider endpoint (Azure OpenAI or customer-managed). It includes:

  • Provider
  • Authentication method
  • Base URL
  • Supported models
  • Optional deployment prefix

Only customer endpoints are shown in Admin

The default Deepdesk LLM endpoints are provisioned automatically by Deepdesk and are not visible in the Admin interface; only customer-managed endpoints are shown there.
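An LLM Config's fields could be modeled as below. This is a hedged sketch under assumed names (`LLMConfig`, `deployment_name`, the example values); it is not Deepdesk's actual configuration format. It illustrates one plausible role of the optional deployment prefix: mapping a logical model code to a provider-side deployment name.

```python
from dataclasses import dataclass

@dataclass
class LLMConfig:
    provider: str                # e.g. "azure-openai" or a customer-managed provider
    auth_method: str             # e.g. "api-key" or "managed-identity"
    base_url: str                # provider endpoint base URL
    supported_models: list[str]  # LLM codes this endpoint can serve
    deployment_prefix: str = ""  # optional prefix for deployment names

    def deployment_name(self, model_code: str) -> str:
        """Map a logical model code to a deployment name on this endpoint."""
        if model_code not in self.supported_models:
            raise ValueError(f"{model_code} is not supported by this endpoint")
        return f"{self.deployment_prefix}{model_code}"
```

Separating the logical model (`code`) from the endpoint-specific deployment name lets the same model definition be served by different deployments on different endpoints.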

LLM Gateway

The LLM Gateway is the runtime routing layer that:

  • Receives LLM requests from Deepdesk services
  • Resolves the correct endpoint
  • Applies authentication
  • Handles failover and routing
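The resolution and failover steps above can be sketched as a single function. This is a minimal illustration, assuming configs are ordered by priority and carry a health flag; the names and shape are assumptions, not the gateway's actual implementation, and authentication is omitted.

```python
# Pick the first healthy endpoint config that supports the requested model.
# Skipping unhealthy configs is the failover step: if the primary endpoint
# is down, the request falls through to the next one in priority order.
def resolve_endpoint(model_code: str, configs: list[dict]) -> dict:
    for cfg in configs:
        if model_code in cfg["supported_models"] and cfg.get("healthy", True):
            return cfg
    raise LookupError(f"no available endpoint supports {model_code}")
```

For example, if the highest-priority endpoint for a model is marked unhealthy, the gateway would route the request to the next endpoint that lists that model.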

Design Principles

  • Separation of models, pricing, and endpoints: Each is managed independently for flexibility and clarity.
  • Time-based pricing: Ensures accurate billing and historical tracking.
  • Provider abstraction: Supports both Deepdesk-managed and customer-managed endpoints.
  • Secure secret handling: All credentials are stored securely and loaded at runtime.

For step-by-step instructions, see the LLM User Guide.