Chat Completions
The POST /v1/chat/completions endpoint is the primary method for generating text, code, or structured JSON responses from supported language models. It is 100% compatible with the OpenAI Chat Completions API schema.
[!NOTE] DeepToken acts as an intelligent proxy. You call this endpoint, and DeepToken handles routing, prioritization, provider failovers, token counting, and direct credit deductions from your balance.
Endpoint Details
- URL: `https://api.deeptoken.app/v1/chat/completions`
- Method: `POST`
- Headers:
  - `Authorization: Bearer <DEEPTOKEN_API_KEY>` (Required)
  - `Content-Type: application/json` (Required)
  - `X-DeepToken-Org: <ORG_SLUG>` (Optional, to attribute costs to a specific organization wallet)
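As a minimal sketch of how the headers above fit together, the helper below builds the header set for a request made with any HTTP client. The function name `build_headers` and the sample values are hypothetical and are not part of the DeepToken API.

```python
from typing import Optional


def build_headers(api_key: str, org_slug: Optional[str] = None) -> dict:
    """Return the HTTP headers for a chat completions request.

    `build_headers` is an illustrative helper, not a DeepToken function.
    """
    if not api_key:
        raise ValueError("api_key is required")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # X-DeepToken-Org is optional: send it only when the cost should be
    # attributed to a specific organization wallet.
    if org_slug:
        headers["X-DeepToken-Org"] = org_slug
    return headers
```

When using the OpenAI Python SDK instead of a raw HTTP client, the same optional header can likely be supplied through the client's `default_headers` argument, since the SDK sets `Authorization` and `Content-Type` itself.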
Code Examples
Sample requests using cURL, the OpenAI Python SDK, and the OpenAI Node.js SDK:

cURL:

```bash
curl https://api.deeptoken.app/v1/chat/completions \
  -H "Authorization: Bearer $DEEPTOKEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Why is the sky blue?"
      }
    ],
    "temperature": 0.7
  }'
```

Python:

```python
import os

from openai import OpenAI

# Read the API key from the environment rather than hard-coding it.
client = OpenAI(
    api_key=os.environ["DEEPTOKEN_API_KEY"],
    base_url="https://api.deeptoken.app/v1",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Why is the sky blue?"}
    ],
    temperature=0.7,
)

print(response.choices[0].message.content)
```

Node.js:

```javascript
import OpenAI from "openai"

const client = new OpenAI({
  apiKey: process.env.DEEPTOKEN_API_KEY,
  baseURL: "https://api.deeptoken.app/v1"
})

const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [
    { role: "user", content: "Why is the sky blue?" }
  ],
  temperature: 0.7
})

console.log(response.choices[0].message.content)
```
Request Parameters
The request body must be a JSON object containing the following parameters:
| Parameter | Type | Required? | Description |
|---|---|---|---|
| `model` | string | Yes | The ID of the model to use. See the catalog for all supported model IDs. |
| `messages` | array | Yes | A list of message objects representing the conversation history. See Message Object below. |
| `temperature` | number | No (default: 1) | Sampling temperature between 0 and 2. Higher values make output more random; lower values make it more focused. |
| `top_p` | number | No (default: 1) | Nucleus sampling factor. 0.1 means only the tokens comprising the top 10% of probability mass are considered. |
| `stream` | boolean | No (default: false) | If true, tokens are sent as Server-Sent Events (SSE) as they become available. |
| `max_tokens` | integer | No | The maximum number of tokens to generate in the completion. |
| `stop` | string or array | No | Up to 4 sequences at which the API stops generating further tokens. |
| `response_format` | object | No | Specify `{ "type": "json_object" }` or a schema definition to enforce JSON output. |
| `tools` | array | No | A list of tools (functions) the model may call. |
| `tool_choice` | string or object | No | Controls which tool the model calls (`none`, `auto`, `required`, or an object). |
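The constraints in the table above can be checked client-side before a request is sent. The sketch below is one way to assemble a request body; `build_request_body` is a hypothetical helper, and rejecting an empty `messages` list is an assumption rather than documented server behavior.

```python
from typing import Any


def build_request_body(model: str, messages: list, **options: Any) -> dict:
    """Assemble a chat completions request body (illustrative helper)."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        # Assumption: an empty conversation is treated as invalid here.
        raise ValueError("messages must contain at least one message")

    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    stop = options.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("stop accepts at most 4 sequences")

    # Omit options left as None so the documented defaults apply.
    body = {"model": model, "messages": messages}
    body.update({key: value for key, value in options.items() if value is not None})
    return body
```

The resulting dictionary can be serialized with `json.dumps` and sent as the POST body.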
Message Object
Each object in the messages array has the following structure:
| Field | Type | Required? | Description |
|---|---|---|---|
| `role` | string | Yes | The role of the message's author: `system`, `user`, `assistant`, or `tool`. |
| `content` | string or array | Yes | The contents of the message (text, or an array of content parts for multimodal input). |
| `name` | string | No | An optional name for the participant, useful for distinguishing multiple users. |
| `tool_call_id` | string | No (used with the `tool` role) | The ID of the tool call this message responds to. |
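To make the message structure concrete, here is a small sketch that builds message objects for the different roles. The helper names and the `call_123` identifier are hypothetical; in practice the tool call ID would come from an earlier model response.

```python
from typing import Optional


def text_message(role: str, content: str, name: Optional[str] = None) -> dict:
    """Build a system, user, or assistant message (illustrative helper)."""
    if role not in ("system", "user", "assistant"):
        raise ValueError(f"unsupported role for a text message: {role}")
    message = {"role": role, "content": content}
    if name:
        # name is optional and helps distinguish multiple participants.
        message["name"] = name
    return message


def tool_message(content: str, tool_call_id: str) -> dict:
    """Build a tool-role message that answers a specific tool call."""
    return {"role": "tool", "content": content, "tool_call_id": tool_call_id}


messages = [
    text_message("system", "You are a concise assistant."),
    text_message("user", "Why is the sky blue?", name="alice"),
]
# Hypothetical tool result; "call_123" stands in for a real tool call ID.
tool_result = tool_message('{"answer": "Rayleigh scattering"}', "call_123")
```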
Response Schema
A successful non-streaming response returns a JSON object with the following fields:
```json
{
  "id": "chatcmpl-9A8b9C...",
  "object": "chat.completion",
  "created": 1718029562,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The sky is blue because of Rayleigh scattering..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 85,
    "total_tokens": 98
  }
}
```
Response Fields
- `id`: A unique identifier for the chat completion.
- `object`: The object type, always `chat.completion`.
- `created`: The Unix timestamp (in seconds) of when the chat completion was created.
- `model`: The model used for generating the completion.
- `choices`: A list of completion choices. Each choice contains:
  - `message`: The generated message object.
  - `finish_reason`: Why the model stopped generating (`stop`, `length`, `tool_calls`, etc.).
- `usage`: Token usage statistics for the request.
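A minimal sketch of reading these fields from a parsed response follows. It assumes the response is already a Python dictionary (for example, from `json.loads`), and it treats a `finish_reason` of `length` as a sign that the output was cut off by the token limit, which follows the OpenAI convention rather than anything stated here. `extract_reply` is a hypothetical helper name.

```python
from typing import Tuple


def extract_reply(response: dict) -> Tuple[str, bool]:
    """Return (text, truncated) for the first choice of a parsed response."""
    choice = response["choices"][0]
    # content may be absent or null, e.g. when the model calls a tool instead.
    text = choice["message"].get("content") or ""
    # Assumption: "length" means generation stopped at the token limit.
    truncated = choice["finish_reason"] == "length"
    return text, truncated
```

A caller might retry with a larger `max_tokens` when `truncated` is true.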
[!IMPORTANT] DeepToken calculates billing based on the token counts returned in `usage`. Ensure your code handles this block if you are tracking usage client-side.
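For client-side tracking, one possible approach is to accumulate the `usage` block across responses. The sketch below assumes parsed responses as dictionaries; `UsageTotals` is a hypothetical class, and it counts tokens only, since credit pricing is not described in this section.

```python
from dataclasses import dataclass


@dataclass
class UsageTotals:
    """Running token totals across several responses (illustrative)."""

    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0
    requests: int = 0

    def add(self, response: dict) -> None:
        # Tolerate a missing usage block instead of failing the caller.
        usage = response.get("usage") or {}
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.completion_tokens += usage.get("completion_tokens", 0)
        self.total_tokens += usage.get("total_tokens", 0)
        self.requests += 1


totals = UsageTotals()
# Token counts taken from the sample response in this section.
totals.add({"usage": {"prompt_tokens": 13, "completion_tokens": 85, "total_tokens": 98}})
print(totals.total_tokens)  # prints 98
```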