Chat
The Faces API exposes three chat endpoints. Pass a face alias in the model field to load that face’s compiled persona as the system context.
| Endpoint | Format |
|---|
---|
| /v1/chat/completions | OpenAI chat completions |
| /v1/messages | Anthropic messages |
| /v1/responses | OpenAI responses |
Auto-routing: You can send any model to any endpoint. The API automatically converts the request format and routes to the correct upstream. For example, sending gpt-5.4 to /v1/chat/completions auto-routes to the Responses API, and sending claude-sonnet-4-6 auto-routes to Anthropic Messages.
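The routing decision above can be sketched as a small function. This is an illustrative approximation only: the prefix-to-upstream mapping below is inferred from the two examples in this section and is not the API's actual routing table.

```python
def route_upstream(model: str) -> str:
    """Pick an upstream format for a model name (illustrative sketch).

    Mirrors the documented examples: gpt-5.4 routes to the Responses API,
    claude-sonnet-4-6 routes to Anthropic Messages. Everything else is
    assumed to stay on chat completions.
    """
    llm = model.split("@", 1)[-1]  # strip any face alias prefix
    if llm.startswith("claude"):
        return "anthropic-messages"
    if llm.startswith("gpt-5"):
        return "openai-responses"
    return "openai-chat-completions"

print(route_upstream("alice@claude-sonnet-4-6"))  # anthropic-messages
print(route_upstream("gpt-5.4"))                  # openai-responses
```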
Basic usage
curl -X POST https://api.faces.sh/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "alice",
"messages": [
{"role": "user", "content": "What matters most to you in your work?"}
]
}'
The face’s compiled psychological primitives and basic_facts are injected automatically into the system prompt. You do not need to manage this context yourself.
Model override syntax
By default, a face uses its configured default_model, or the system default if none is set. You can override this per-request using the alias@llm-model format:
"model": "alice@gpt-4o-mini"
"model": "alice@claude-sonnet-4-6"
"model": "alice@accounts/fireworks/models/llama-v3p1-8b-instruct"
The format is alias@llm-model. The model must be in the supported models list.
You can also pass a bare model name with no face prefix — e.g. "model": "gpt-4o-mini". This proxies directly to the LLM with no persona injected. See Face templates for how to use face profiles without a primary persona.
Model overrides always use the system API key — no user-stored credentials are required or used.
Streaming
Add "stream": true for SSE streaming:
curl -X POST https://api.faces.sh/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "alice",
"messages": [{"role": "user", "content": "Tell me about your childhood."}],
"stream": true
}'
The response is a standard OpenAI-format SSE stream (data: {"choices":[{"delta":{"content":"..."}}]}\n\n).
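A client can consume that stream by accumulating the delta fragments. The sketch below parses the documented data: ... line format; it assumes each event fits on one line and the stream terminates with the standard data: [DONE] sentinel.

```python
import json

def iter_deltas(sse_lines):
    """Yield content fragments from an OpenAI-format SSE stream (sketch).

    Assumes one 'data: ...' payload per line and a 'data: [DONE]' terminator.
    """
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: [DONE]',
]
print("".join(iter_deltas(sample)))  # Hello
```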
Multi-turn conversations
Pass the full message history as you would with any OpenAI-compatible client:
{
"model": "alice",
"messages": [
{"role": "user", "content": "What city do you live in?"},
{"role": "assistant", "content": "I live in Berlin."},
{"role": "user", "content": "What neighborhood?"}
]
}
Face templates
You can reference any face in your messages using ${face-alias} syntax. Each ${...} token is replaced with the face’s display name in your message text, and the referenced face’s profile is provided to the model as context. This lets you bring multiple faces into a single prompt.
curl -X POST https://api.faces.sh/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "alice@gpt-4o-mini",
"messages": [
{"role": "user", "content": "You are debating ${bob}. Argue your position on space exploration."}
]
}'
Here Alice is the primary persona (system prompt) and ${bob} is replaced with “Bob” in the message. Bob’s profile is automatically injected as additional context so the model knows who Alice is debating.
You can reference as many faces as you need:
{
"model": "alice@gpt-4o-mini",
"messages": [
{"role": "user", "content": "You are moderating a debate between ${bob} and ${carol}. Summarize their positions."}
]
}
If you reference the primary face (the one in the model field) with a template — e.g. model: "alice@gpt-4o-mini" and ${alice} in a message — the token is replaced with Alice’s display name but no duplicate profile is added, since Alice’s full persona is already injected.
Templates without a primary face
You can pass a bare model name instead of alias@model. No persona is injected — the model acts as a standard assistant. Templates still work, so you can inject face profiles as context:
curl -X POST https://api.faces.sh/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [
{"role": "user", "content": "Compare the worldviews of ${alice} and ${bob}. Where do they agree?"}
]
}'
This works on all three endpoints: /v1/chat/completions, /v1/messages, and /v1/responses.
Supported endpoints
Templates work in all endpoints:
| Endpoint | Where templates are scanned |
|---|
---|
| /v1/chat/completions | messages[*].content (string or content blocks) |
| /v1/messages | messages[*].content and system field |
| /v1/responses | input (string or message list) and instructions field |
Only type: "text" content blocks are scanned — image, tool, and other block types are left untouched.
Escaping
To include a literal ${...} without triggering expansion, prefix it with a backslash:
{"role": "user", "content": "Use the syntax \\${name} to reference a face."}
Rules
- Referenced aliases must be faces you own.
- Synthetic faces (those defined by a formula) work in templates.
- Unknown aliases return 422 with a list of the unrecognized names.
- Templates are expanded once: profile text is never re-scanned for nested ${...} tokens.
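The expansion and escaping rules above can be sketched with a regular expression. This is a client-side approximation only: the FACES mapping is stand-in data, the alias character set is an assumption, and the real API additionally injects each referenced face's profile as model context, which a string substitution cannot show.

```python
import re

FACES = {"alice": "Alice", "bob": "Bob"}  # alias -> display name (assumed data)

# Optional leading backslash captures the documented escape form \${...}.
TOKEN = re.compile(r"(\\)?\$\{([a-z0-9-]+)\}")

def expand(text: str) -> str:
    """Replace ${alias} with display names; honor \\${...} as a literal."""
    unknown = []

    def repl(m: re.Match) -> str:
        if m.group(1):  # escaped: drop the backslash, keep literal ${...}
            return "${" + m.group(2) + "}"
        alias = m.group(2)
        if alias not in FACES:
            unknown.append(alias)
            return m.group(0)
        return FACES[alias]

    out = TOKEN.sub(repl, text)
    if unknown:
        raise ValueError(f"422: unknown face aliases: {unknown}")
    return out

print(expand("You are debating ${bob}."))            # You are debating Bob.
print(expand(r"Use \${name} to reference a face."))  # Use ${name} to reference a face.
```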
User preferences
Set account-level defaults via the preferences endpoint:
curl -X PATCH https://api.faces.sh/v1/user/preferences \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"default_model": "gpt-5.4",
"api_fallback": true
}'
| Field | Type | Description |
|---|
---|---|
| default_model | string | Default LLM model inherited by newly created faces. Must be a valid model from /v1/models. |
| api_fallback | bool | Allow fallback to paid system keys when OAuth fails (default: false). |
Read current preferences:
curl https://api.faces.sh/v1/user/preferences \
-H "Authorization: Bearer YOUR_API_KEY"
Supported models
curl https://api.faces.sh/v1/models \
-H "Authorization: Bearer YOUR_API_KEY"
Returns all models available for use in the model field override syntax.
Using with OpenAI SDK
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://api.faces.sh/v1"
)
response = client.chat.completions.create(
model="alice",
messages=[{"role": "user", "content": "What matters most to you?"}]
)
print(response.choices[0].message.content)
OAuth routing (Connect plan)
Connect-plan users with a linked OpenAI account get certain models routed through their ChatGPT subscription at no cost. When you request a model like gpt-5.4 via /v1/responses, the API automatically tries the Codex OAuth endpoint first. If OAuth is unavailable or fails, it falls back to the system API key (which charges your credit balance).
To prevent unexpected charges, pass "oauth_only": true in the request body:
curl -X POST https://api.faces.sh/v1/responses \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "alice@gpt-5.4",
"input": [{"role": "user", "content": "Hi!"}],
"oauth_only": true
}'
When oauth_only is true and OAuth is unavailable, the request fails with an error instead of falling back to the paid path.
All proxy responses include these headers:
| Header | Example | Description |
|---|
---|---|
| X-Faces-Provider | openai-codex or openai | Which provider actually served the request |
| X-Faces-Cost-USD | 0.00 or 0.0035 | Cost charged to your account for this request |
These let your client know whether OAuth was used (cost = 0) or the system key was used (cost > 0).
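That check can be sketched as a small helper. The header names come from the table above; the decision logic (zero cost plus the openai-codex provider means OAuth served the request) is our reading of this section, not an official client method.

```python
def used_oauth(headers: dict) -> bool:
    """Return True if the response was served via the OAuth (free) path.

    Sketch based on the documented X-Faces-* headers: OAuth requests report
    provider 'openai-codex' and a cost of 0.
    """
    provider = headers.get("X-Faces-Provider", "")
    cost = float(headers.get("X-Faces-Cost-USD", "0"))
    return provider == "openai-codex" and cost == 0.0

print(used_oauth({"X-Faces-Provider": "openai-codex", "X-Faces-Cost-USD": "0.00"}))  # True
print(used_oauth({"X-Faces-Provider": "openai", "X-Faces-Cost-USD": "0.0035"}))      # False
```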
Error codes
| Code | Meaning |
|---|
---|
| 402 | Insufficient credits |
| 404 | Face not found or not owned by you |
| 422 | Invalid request body or unknown face alias in ${...} template |