Chat

The Faces API exposes three chat endpoints. Pass a face alias in the model field to load that face’s compiled persona as the system context.
Endpoint              Format
/v1/chat/completions  OpenAI chat completions
/v1/messages          Anthropic messages
/v1/responses         OpenAI responses
Auto-routing: You can send any model to any endpoint. The API automatically converts the request format and routes to the correct upstream. For example, sending gpt-5.4 to /v1/chat/completions auto-routes to the Responses API, and sending claude-sonnet-4-6 auto-routes to Anthropic Messages.

Basic usage

curl -X POST https://api.faces.sh/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "alice",
    "messages": [
      {"role": "user", "content": "What matters most to you in your work?"}
    ]
  }'
The face’s compiled psychological primitives and basic_facts are injected automatically into the system prompt. You do not need to manage this context yourself.

Model override syntax

By default, a face uses its configured default_model, or the system default if none is set. You can override this per-request using the alias@llm-model format:
"model": "alice@gpt-4o-mini"
"model": "alice@claude-sonnet-4-6"
"model": "alice@accounts/fireworks/models/llama-v3p1-8b-instruct"
The override model must be in the supported models list (see Supported models below). You can also pass a bare model name with no face prefix, e.g. "model": "gpt-4o-mini"; this proxies the request directly to the LLM with no persona injected. See Face templates for how to use face profiles without a primary persona.
Model overrides always use the system API key — no user-stored credentials are required or used.

Streaming

Add "stream": true for SSE streaming:
curl -X POST https://api.faces.sh/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "alice",
    "messages": [{"role": "user", "content": "Tell me about your childhood."}],
    "stream": true
  }'
The response is a standard OpenAI-format SSE stream (data: {"choices":[{"delta":{"content":"..."}}]}\n\n).
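A minimal parser for that stream shape might look like the sketch below. It is illustrative only; real clients should prefer the OpenAI SDK or a proper SSE library, which also handle multi-line events and reconnects:

```python
import json

def iter_content(sse_lines):
    """Yield content deltas from OpenAI-format SSE data lines.

    Each event line looks like:
        data: {"choices":[{"delta":{"content":"..."}}]}
    and the stream ends with:
        data: [DONE]
    """
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

lines = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_content(lines)))  # Hello
```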

Multi-turn conversations

Pass the full message history as you would with any OpenAI-compatible client:
{
  "model": "alice",
  "messages": [
    {"role": "user", "content": "What city do you live in?"},
    {"role": "assistant", "content": "I live in Berlin."},
    {"role": "user", "content": "What neighborhood?"}
  ]
}

Face templates

You can reference any face in your messages using ${face-alias} syntax. Each ${...} token is replaced with the face’s display name in your message text, and the referenced face’s profile is provided to the model as context. This lets you bring multiple faces into a single prompt.
curl -X POST https://api.faces.sh/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "alice@gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "You are debating ${bob}. Argue your position on space exploration."}
    ]
  }'
Here Alice is the primary persona (system prompt) and ${bob} is replaced with “Bob” in the message. Bob’s profile is automatically injected as additional context so the model knows who Alice is debating. You can reference as many faces as you need:
{
  "model": "alice@gpt-4o-mini",
  "messages": [
    {"role": "user", "content": "You are moderating a debate between ${bob} and ${carol}. Summarize their positions."}
  ]
}
If you reference the primary face (the one in the model field) with a template — e.g. model: "alice@gpt-4o-mini" and ${alice} in a message — the token is replaced with Alice’s display name but no duplicate profile is added, since Alice’s full persona is already injected.
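The expansion behaves roughly like the sketch below. The display_names dict stands in for the faces the API actually looks up; the regex, helper name, and KeyError are illustrative, not the server implementation (the real API returns 422 for unknown aliases and also handles the backslash escape described under Escaping):

```python
import re

# ${alias} tokens; an optional leading backslash escapes the token.
TOKEN = re.compile(r"(\\?)\$\{([A-Za-z0-9_-]+)\}")

def expand(text, display_names, primary=None):
    """Replace ${alias} with display names; collect profiles to inject.

    Returns (expanded_text, aliases_needing_profiles). The primary face
    is replaced in the text but excluded from the profile list, since
    its full persona is already in the system prompt.
    """
    needed = []
    def repl(m):
        escaped, alias = m.groups()
        if escaped:                      # "\${name}" stays literal
            return "${" + alias + "}"
        if alias not in display_names:
            raise KeyError(alias)        # the API answers 422 here
        if alias != primary and alias not in needed:
            needed.append(alias)
        return display_names[alias]
    # re.sub is a single pass: replacement text is never re-scanned,
    # matching the "expanded once" rule below.
    return TOKEN.sub(repl, text), needed

text, profiles = expand(
    "You are debating ${bob} and ${carol}.",
    {"alice": "Alice", "bob": "Bob", "carol": "Carol"},
    primary="alice",
)
# text == "You are debating Bob and Carol."; profiles == ["bob", "carol"]
```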

Templates without a primary face

You can pass a bare model name instead of alias@model. No persona is injected — the model acts as a standard assistant. Templates still work, so you can inject face profiles as context:
curl -X POST https://api.faces.sh/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Compare the worldviews of ${alice} and ${bob}. Where do they agree?"}
    ]
  }'
This works on all three endpoints: /v1/chat/completions, /v1/messages, and /v1/responses.

Supported endpoints

Templates work in all endpoints:
Endpoint              Where templates are scanned
/v1/chat/completions  messages[*].content (string or content blocks)
/v1/messages          messages[*].content and the system field
/v1/responses         input (string or message list) and the instructions field
Only type: "text" content blocks are scanned — image, tool, and other block types are left untouched.
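In client terms, the scanning rule for one message looks roughly like this sketch (illustrative; `expand` stands for any str-to-str template expander):

```python
def scan_message(message, expand):
    """Apply template expansion to one chat message in place.

    String content is expanded directly; in a content-block list only
    type: "text" blocks are touched, so image, tool, and other block
    types pass through untouched.
    """
    content = message["content"]
    if isinstance(content, str):
        message["content"] = expand(content)
    else:
        for block in content:
            if block.get("type") == "text":
                block["text"] = expand(block["text"])
    return message

msg = {"role": "user", "content": [
    {"type": "text", "text": "Ask ${bob}."},
    {"type": "image_url", "image_url": {"url": "https://example.com/x.png"}},
]}
scan_message(msg, lambda s: s.replace("${bob}", "Bob"))
# msg["content"][0]["text"] == "Ask Bob."; the image block is unchanged
```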

Escaping

To include a literal ${...} without triggering expansion, prefix it with a backslash:
{"role": "user", "content": "Use the syntax \\${name} to reference a face."}

Rules

  • Referenced aliases must be faces you own.
  • Synthetic faces (those defined by a formula) work in templates.
  • Unknown aliases return 422 with a list of the unrecognized names.
  • Templates are expanded once — profile text is never re-scanned for nested ${...} tokens.

User preferences

Set account-level defaults via the preferences endpoint:
curl -X PATCH https://api.faces.sh/v1/user/preferences \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "default_model": "gpt-5.4",
    "api_fallback": true
  }'
Field          Type    Description
default_model  string  Default LLM model inherited by newly created faces. Must be a valid model from /v1/models.
api_fallback   bool    Allow fallback to paid system keys when OAuth fails (default: false).
Read current preferences:
curl https://api.faces.sh/v1/user/preferences \
  -H "Authorization: Bearer YOUR_API_KEY"

Supported models

curl https://api.faces.sh/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
Returns all models that can be used in the model field, including the alias@llm-model override syntax.

Using with OpenAI SDK

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.faces.sh/v1"
)

response = client.chat.completions.create(
    model="alice",
    messages=[{"role": "user", "content": "What matters most to you?"}]
)
print(response.choices[0].message.content)

OAuth routing (Connect plan)

Connect-plan users with a linked OpenAI account get certain models routed through their ChatGPT subscription at no cost. When you request a model like gpt-5.4 via /v1/responses, the API automatically tries the Codex OAuth endpoint first. If OAuth is unavailable or fails, it falls back to the system API key (which charges your credit balance). To prevent unexpected charges, pass "oauth_only": true in the request body:
curl -X POST https://api.faces.sh/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "alice@gpt-5.4",
    "input": [{"role": "user", "content": "Hi!"}],
    "oauth_only": true
  }'
When oauth_only is true and OAuth is unavailable, the request fails with an error instead of falling back to the paid path.

Response headers

All proxy responses include these headers:
Header            Example                 Description
X-Faces-Provider  openai-codex or openai  Which provider actually served the request
X-Faces-Cost-USD  0.00 or 0.0035          Cost charged to your account for this request
These let your client know whether OAuth was used (cost = 0) or the system key was used (cost > 0).
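A client can check the cost header to confirm which path served a request. The helper below is an illustrative sketch built on the documented X-Faces-Cost-USD header (HTTP header names are case-insensitive, so it normalizes before lookup):

```python
def served_by_oauth(headers: dict[str, str]) -> bool:
    """True if the request rode the free OAuth path (zero cost)."""
    h = {k.lower(): v for k, v in headers.items()}
    return float(h.get("x-faces-cost-usd", "0")) == 0.0

served_by_oauth({"X-Faces-Provider": "openai-codex", "X-Faces-Cost-USD": "0.00"})   # True
served_by_oauth({"X-Faces-Provider": "openai", "X-Faces-Cost-USD": "0.0035"})      # False
```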

Error codes

Code  Meaning
402   Insufficient credits
404   Face not found or not owned by you
422   Invalid request body, or unknown face alias in a ${...} template