Chat
The Faces API exposes three chat endpoints. Pass a face alias in the model field to load that face’s compiled persona as the system context.
| Endpoint | Format |
|---|
---|
| /v1/chat/completions | OpenAI chat completions |
| /v1/messages | Anthropic messages |
| /v1/responses | OpenAI responses |
Auto-routing: You can send any model to any endpoint. The API automatically converts the request format and routes to the correct upstream. For example, sending gpt-5.4 to /v1/chat/completions auto-routes to the Responses API, and sending claude-sonnet-4-6 auto-routes to Anthropic Messages.
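The routing decision above can be sketched as a small function. This is an illustrative approximation only: the prefix-to-upstream mapping below is inferred from the two examples in this section and is not the API's actual routing table.

```python
def route_upstream(model: str) -> str:
    """Pick an upstream format for a model name (illustrative sketch).

    Mirrors the documented examples: gpt-5.4 routes to the Responses API,
    claude-sonnet-4-6 routes to Anthropic Messages. Everything else is
    assumed to stay on chat completions.
    """
    llm = model.split("@", 1)[-1]  # strip any face alias prefix
    if llm.startswith("claude"):
        return "anthropic-messages"
    if llm.startswith("gpt-5"):
        return "openai-responses"
    return "openai-chat-completions"

print(route_upstream("alice@claude-sonnet-4-6"))  # anthropic-messages
print(route_upstream("gpt-5.4"))                  # openai-responses
```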
Basic usage
curl -X POST https://api.faces.sh/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "alice",
"messages": [
{"role": "user", "content": "What matters most to you in your work?"}
]
}'
The face’s compiled psychological primitives and basic_facts are injected automatically into the system prompt. You do not need to manage this context yourself.
Model override syntax
By default, a face uses its configured default_model, or the system default if none is set. You can override this per-request using the alias@llm-model format:
"model": "alice@gpt-4o-mini"
"model": "alice@claude-sonnet-4-6"
"model": "alice@accounts/fireworks/models/llama-v3p1-8b-instruct"
The format is alias@llm-model. The model must be in the supported models list.
You can also pass a bare model name with no face prefix — e.g. "model": "gpt-4o-mini". This proxies directly to the LLM with no persona injected. See Face templates for how to use face profiles without a primary persona.
Model overrides always use the system API key — no user-stored credentials are required or used.
Streaming
Add "stream": true for SSE streaming:
curl -X POST https://api.faces.sh/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "alice",
"messages": [{"role": "user", "content": "Tell me about your childhood."}],
"stream": true
}'
The response is a standard OpenAI-format SSE stream (data: {"choices":[{"delta":{"content":"..."}}]}\n\n).
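A client can consume that stream by accumulating the delta fragments. The sketch below parses the documented data: ... line format; it assumes each event fits on one line and the stream terminates with the standard data: [DONE] sentinel.

```python
import json

def iter_deltas(sse_lines):
    """Yield content fragments from an OpenAI-format SSE stream (sketch).

    Assumes one 'data: ...' payload per line and a 'data: [DONE]' terminator.
    """
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: [DONE]',
]
print("".join(iter_deltas(sample)))  # Hello
```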
Multi-turn conversations
Pass the full message history as you would with any OpenAI-compatible client:
{
"model": "alice",
"messages": [
{"role": "user", "content": "What city do you live in?"},
{"role": "assistant", "content": "I live in Berlin."},
{"role": "user", "content": "What neighborhood?"}
]
}
Face templates
You can reference any face in your messages using ${face-alias} syntax. Each ${...} token is replaced with the face’s display name in your message text, and the referenced face’s profile is provided to the model as context. This lets you bring multiple faces into a single prompt.
curl -X POST https://api.faces.sh/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "alice@gpt-4o-mini",
"messages": [
{"role": "user", "content": "You are debating ${bob}. Argue your position on space exploration."}
]
}'
Here Alice is the primary persona (system prompt) and ${bob} is replaced with “Bob” in the message. Bob’s profile is automatically injected as additional context so the model knows who Alice is debating.
You can reference as many faces as you need:
{
"model": "alice@gpt-4o-mini",
"messages": [
{"role": "user", "content": "You are moderating a debate between ${bob} and ${carol}. Summarize their positions."}
]
}
If you reference the primary face (the one in the model field) with a template — e.g. model: "alice@gpt-4o-mini" and ${alice} in a message — the token is replaced with Alice’s display name but no duplicate profile is added, since Alice’s full persona is already injected.
Templates without a primary face
You can pass a bare model name instead of alias@model. No persona is injected — the model acts as a standard assistant. Templates still work, so you can inject face profiles as context:
curl -X POST https://api.faces.sh/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [
{"role": "user", "content": "Compare the worldviews of ${alice} and ${bob}. Where do they agree?"}
]
}'
This works on all three endpoints: /v1/chat/completions, /v1/messages, and /v1/responses.
Supported endpoints
Templates work in all endpoints:
| Endpoint | Where templates are scanned |
|---|
---|
| /v1/chat/completions | messages[*].content (string or content blocks) |
| /v1/messages | messages[*].content and system field |
| /v1/responses | input (string or message list) and instructions field |
Only type: "text" content blocks are scanned — image, tool, and other block types are left untouched.
Escaping
To include a literal ${...} without triggering expansion, prefix it with a backslash:
{"role": "user", "content": "Use the syntax \\${name} to reference a face."}
Rules
- Referenced aliases must be faces you own.
- Synthetic faces (those defined by a formula) work in templates.
- Unknown aliases return 422 with a list of the unrecognized names.
- Templates are expanded once: profile text is never re-scanned for nested ${...} tokens.
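The expansion and escaping rules above can be sketched with a regular expression. This is a client-side approximation only: the FACES mapping is stand-in data, the alias character set is an assumption, and the real API additionally injects each referenced face's profile as model context, which a string substitution cannot show.

```python
import re

FACES = {"alice": "Alice", "bob": "Bob"}  # alias -> display name (assumed data)

# Optional leading backslash captures the documented escape form \${...}.
TOKEN = re.compile(r"(\\)?\$\{([a-z0-9-]+)\}")

def expand(text: str) -> str:
    """Replace ${alias} with display names; honor \\${...} as a literal."""
    unknown = []

    def repl(m: re.Match) -> str:
        if m.group(1):  # escaped: drop the backslash, keep literal ${...}
            return "${" + m.group(2) + "}"
        alias = m.group(2)
        if alias not in FACES:
            unknown.append(alias)
            return m.group(0)
        return FACES[alias]

    out = TOKEN.sub(repl, text)
    if unknown:
        raise ValueError(f"422: unknown face aliases: {unknown}")
    return out

print(expand("You are debating ${bob}."))            # You are debating Bob.
print(expand(r"Use \${name} to reference a face."))  # Use ${name} to reference a face.
```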
User preferences
Set account-level defaults via the preferences endpoint:
curl -X PATCH https://api.faces.sh/v1/user/preferences \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"default_model": "gpt-5.4",
"api_fallback": true
}'
| Field | Type | Description |
|---|
---|---|
| default_model | string | Default LLM model inherited by newly created faces. Must be a valid model from /v1/models. |
| api_fallback | bool | Allow fallback to paid system keys when OAuth fails (default: false). |
Read current preferences:
curl https://api.faces.sh/v1/user/preferences \
-H "Authorization: Bearer YOUR_API_KEY"
Supported models
curl https://api.faces.sh/v1/models \
-H "Authorization: Bearer YOUR_API_KEY"
Returns all models available for use in the model field override syntax.
Using with OpenAI SDK
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://api.faces.sh/v1"
)
response = client.chat.completions.create(
model="alice",
messages=[{"role": "user", "content": "What matters most to you?"}]
)
print(response.choices[0].message.content)
OAuth routing (Connect plan)
Connect-plan users with a linked OpenAI account get certain models routed through their ChatGPT subscription at no cost. When you request a model like gpt-5.4 via /v1/responses, the API automatically tries the Codex OAuth endpoint first. If OAuth is unavailable or fails, it falls back to the system API key (which charges your credit balance).
To prevent unexpected charges, pass "oauth_only": true in the request body:
curl -X POST https://api.faces.sh/v1/responses \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "alice@gpt-5.4",
"input": [{"role": "user", "content": "Hi!"}],
"oauth_only": true
}'
When oauth_only is true and OAuth is unavailable, the request fails with an error instead of falling back to the paid path.
All proxy responses include these headers:
| Header | Example | Description |
|---|
---|---|
| X-Faces-Provider | openai-codex or openai | Which provider actually served the request |
| X-Faces-Cost-USD | 0.00 or 0.0035 | Cost charged to your account for this request |
These let your client know whether OAuth was used (cost = 0) or the system key was used (cost > 0).
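That check can be sketched as a small helper. The header names come from the table above; the decision logic (zero cost plus the openai-codex provider means OAuth served the request) is our reading of this section, not an official client method.

```python
def used_oauth(headers: dict) -> bool:
    """Return True if the response was served via the OAuth (free) path.

    Sketch based on the documented X-Faces-* headers: OAuth requests report
    provider 'openai-codex' and a cost of 0.
    """
    provider = headers.get("X-Faces-Provider", "")
    cost = float(headers.get("X-Faces-Cost-USD", "0"))
    return provider == "openai-codex" and cost == 0.0

print(used_oauth({"X-Faces-Provider": "openai-codex", "X-Faces-Cost-USD": "0.00"}))  # True
print(used_oauth({"X-Faces-Provider": "openai", "X-Faces-Cost-USD": "0.0035"}))      # False
```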
Error codes
| Code | Meaning |
|---|
---|
| 402 | Insufficient credits |
| 404 | Face not found or not owned by you |
| 422 | Invalid request body or unknown face alias in ${...} template |