Skip to content
DeepTokenInference Gateway
HomeDashboardModelsLeaderboardDocsPricingEnterpriseBlog

    Introduction

    • Getting started
    • Quickstart
    • Integrations

    API

    • Authentication
    • Chat Completions
    • Models
    • Errors

    Billing

    • Billing
    • Organizations

    Authentication

    Every /v1/* call is authenticated by an API key sent as a Bearer token:

    Authorization: Bearer dtk_...
    

    [!WARNING] There is no second-factor on the gateway itself β€” API keys are the absolute security boundary. Rotate keys immediately if they may have leaked, and prefer short-lived, environment-specific keys over a single shared one.

    Key lifecycle

    • Create β€” Dashboard β†’ API Keys β†’ New key.

      [!IMPORTANT] Copy the secret key immediately. We only show it once upon creation and never store the plaintext key on our servers (only the hash). It cannot be retrieved later if lost.

    • Revoke β€” Revoke flips the key to a revoked state. Calls immediately return 401 invalid_api_key. Revoked keys never reactivate.

    • Delete β€” Deleting a revoked key removes the row entirely. Usage history attributed to the key remains.

    Scopes & limits

    Each key supports:

    • Model allowlist β€” comma-separated model ids. Empty means "all models". Calls to disallowed models return 403 model_not_allowed.
    • IP allowlist β€” CIDR list. Calls from outside the list return 403 ip_not_allowed.
    • Rolling USD ceilings β€” 5h / 1d / 7d windows. Breaching any window returns 429 budget_exceeded until the window rolls.

    Per-call cost attribution

    The gateway writes one row to the usage ledger per request, capturing the API key id, organization (if X-DeepToken-Org is set), model, tokens, credits, and time-to-first-token for streaming calls. The dashboard's Usage page, the wallet's recent-debits list, and admin's analytics all read from the same ledger.

    Previous

    Integrations

    Next

    Chat Completions

    On this page

    • Key lifecycle
    • Scopes & limits
    • Per-call cost attribution