Error Handling
Understand error responses and implement robust error handling
When making API requests through OpenModel, you may encounter two categories of errors: gateway errors returned by OpenModel itself, and upstream errors forwarded from the underlying AI provider.
Error Categories
Gateway Errors
Gateway errors originate from OpenModel when a request fails before reaching the upstream provider (e.g. authentication failure, rate limiting, insufficient balance).
For proxy endpoints (/v1/responses, /v1/messages, /v1beta/models/*), gateway errors are returned in the provider-specific format matching the API you are using, so that SDK error handling works seamlessly:
OpenAI format (/v1/responses):
{
  "error": {
    "message": "API key is invalid or has been revoked",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
Anthropic format (/v1/messages):
{
  "type": "error",
  "error": {
    "type": "authentication_error",
    "message": "API key is invalid or has been revoked"
  }
}
Gemini format (/v1beta/models/*):
{
  "error": {
    "code": 401,
    "message": "API key is invalid or has been revoked",
    "status": "UNAUTHENTICATED"
  }
}
For management API endpoints (/web/v1/*), errors use the standard response format:
{
  "success": false,
  "data": null,
  "error": {
    "code": "UNAUTHORIZED",
    "msg": "API key is invalid or has been revoked"
  }
}
Messages protocol providers: All providers using the Messages protocol (Anthropic, DeepSeek, Xiaomi, Kimi, MiniMax, Zai) return errors in the Anthropic format shown above. DashScope returns OpenAI-format errors when accessed via /v1/responses, or Anthropic-format errors when accessed via /v1/messages. Your existing SDK error handling works for all providers on the same protocol without changes.
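If you call more than one endpoint family, it can help to map the four gateway error bodies shown above onto a single shape before logging or branching on them. The following is a minimal sketch based only on the JSON samples in this section; `NormalizedError` and `normalize_error` are hypothetical names, not part of OpenModel or any SDK, and real responses may carry extra fields.

```python
from typing import Any, NamedTuple, Optional


class NormalizedError(NamedTuple):
    code: Optional[str]  # machine-readable code, status, or type, if present
    message: str         # human-readable message


def normalize_error(body: dict[str, Any]) -> NormalizedError:
    """Map any of the four error bodies shown above onto one shape."""
    err = body.get("error")
    if not isinstance(err, dict):
        return NormalizedError(None, "unrecognized error body")

    # Management API: the message lives under "msg" instead of "message".
    if "msg" in err:
        code = err.get("code")
        return NormalizedError(None if code is None else str(code), err["msg"])

    # Anthropic format: top-level "type": "error", details under error.type.
    if body.get("type") == "error":
        return NormalizedError(err.get("type"), err.get("message", ""))

    # Gemini format: numeric "code" plus a string "status".
    if "status" in err:
        return NormalizedError(err["status"], err.get("message", ""))

    # OpenAI format: string "code" when present, otherwise fall back to "type".
    return NormalizedError(err.get("code") or err.get("type"), err.get("message", ""))
```

For example, passing the Gemini sample above yields the code `UNAUTHENTICATED` together with the original message, so downstream code can compare strings instead of inspecting provider-specific fields.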
Upstream Errors
When the upstream AI provider returns an error, OpenModel forwards it in the provider's original format. This means the error structure depends on which API format you are using:
OpenAI format:
{
  "error": {
    "message": "The model `gpt-5` does not exist",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}
Anthropic format:
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "max_tokens must be less than 8192"
  }
}
Gemini format:
{
  "error": {
    "code": 400,
    "message": "Invalid value at 'contents'",
    "status": "INVALID_ARGUMENT"
  }
}
HTTP Status Codes
| Status | Meaning | Action |
|---|---|---|
| 400 | Bad Request: malformed request or invalid parameters | Fix the request and do not retry |
| 401 | Unauthorized: invalid or missing API key | Check your API key |
| 402 | Payment Required: insufficient balance | Top up your balance |
| 403 | Forbidden: insufficient permissions | Verify your account has access |
| 404 | Not Found: invalid endpoint or model | Check the URL and model name |
| 429 | Too Many Requests: rate limit exceeded | Retry with exponential backoff |
| 500 | Internal Server Error: gateway issue | Retry with backoff |
| 502 | Bad Gateway: upstream provider unreachable | Retry with backoff |
| 503 | Service Unavailable: temporary overload | Retry with backoff |
| 504 | Gateway Timeout: upstream provider timeout | Retry with backoff |
Retry Strategy
Not all errors should be retried. Here is a recommended approach:
Retryable Errors
Retry these status codes with exponential backoff and jitter:
- 429 — Rate limit exceeded. Wait before retrying.
- 500 — Internal server error. Likely transient.
- 502 — Bad gateway. The upstream provider may be temporarily down.
- 503 — Service unavailable. Temporary overload.
- 504 — Gateway timeout. The upstream provider took too long to respond.
Non-Retryable Errors
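Do not retry these until the underlying problem is fixed; resending the same request will fail again. The same applies to 402 (insufficient balance): top up your balance before sending the request again.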
- 400 — Fix the request body or parameters.
- 401 — Check your API key.
- 403 — Verify permissions.
- 404 — Check the endpoint URL or model name.
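The two lists above can be encoded as a small lookup so that retry decisions live in one place. This is a sketch, not an OpenModel API: `RETRYABLE_STATUS` and `is_retryable` are hypothetical names, and treating 402 and any unlisted status code as non-retryable is an assumption.

```python
# Status codes this section lists as safe to retry with backoff.
RETRYABLE_STATUS = frozenset({429, 500, 502, 503, 504})


def is_retryable(status_code: int) -> bool:
    """Return True only for status codes listed as retryable above.

    Assumption: 402 and any status code not listed in this section are
    treated as non-retryable.
    """
    return status_code in RETRYABLE_STATUS
```

A caller would check `is_retryable(e.status_code)` in its exception handler and re-raise immediately when it returns `False`.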
Exponential Backoff with Jitter
A good retry pattern uses exponential backoff with random jitter to avoid thundering herd problems:
import random
import time
from openai import OpenAI, APIStatusError
client = OpenAI(
    base_url="https://api.openmodel.ai/v1",
    api_key="your-api-key",
    max_retries=0,  # Disable the SDK's built-in retries so they do not stack with the loop below
)
# Status codes worth retrying (see "Retryable Errors" above)
RETRYABLE_STATUS_CODES = (429, 500, 502, 503, 504)
def make_request_with_retry(max_retries=5):
    for attempt in range(max_retries):
        try:
            return client.responses.create(
                model="gpt-4o",
                input="Hello!",
            )
        except APIStatusError as e:
            if e.status_code not in RETRYABLE_STATUS_CODES:
                raise  # Don't retry client errors
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the last error
            # Exponential backoff: 1s, 2s, 4s, 8s, plus up to 1s of random jitter
            delay = (2 ** attempt) + random.random()
            print(f"Retrying in {delay:.1f}s (attempt {attempt + 1})")
            time.sleep(delay)
Timeout Handling
Non-Streaming Requests
Set a reasonable timeout for standard requests. Most completions finish within 30-60 seconds, but complex prompts or large outputs may take longer:
import httpx
from openai import OpenAI
client = OpenAI(
    base_url="https://api.openmodel.ai/v1",
    api_key="your-api-key",
    timeout=httpx.Timeout(60.0, connect=10.0),
)
Streaming Requests
Streaming requests should use much longer timeouts since the connection stays open for the entire generation process:
import httpx
from openai import OpenAI
client = OpenAI(
    base_url="https://api.openmodel.ai/v1",
    api_key="your-api-key",
    timeout=httpx.Timeout(300.0, connect=10.0),
)
Error Handling Patterns
Python: Comprehensive Error Handling
from openai import OpenAI, APIConnectionError, APIStatusError, APITimeoutError
client = OpenAI(
    base_url="https://api.openmodel.ai/v1",
    api_key="your-api-key",
)
try:
    response = client.responses.create(
        model="gpt-4o",
        input="Hello!",
    )
    print(response.output_text)
# APITimeoutError is a subclass of APIConnectionError, so it must be caught first
except APITimeoutError:
    print("Request timed out. Try again or increase the timeout.")
except APIConnectionError:
    print("Failed to connect to the API. Check your network.")
except APIStatusError as e:
    if e.status_code == 401:
        print("Invalid API key. Check your credentials.")
    elif e.status_code == 429:
        print("Rate limited. Slow down your requests.")
    elif e.status_code >= 500:
        print(f"Server error ({e.status_code}). Retry later.")
    else:
        print(f"API error {e.status_code}: {e.message}")
Node.js: Comprehensive Error Handling
import OpenAI from "openai"
const client = new OpenAI({
  baseURL: "https://api.openmodel.ai/v1",
  apiKey: "your-api-key",
})
try {
  const response = await client.responses.create({
    model: "gpt-4o",
    input: "Hello!",
  })
  console.log(response.output_text)
} catch (error) {
  // APIConnectionTimeoutError extends APIConnectionError, so check it first
  if (error instanceof OpenAI.APIConnectionTimeoutError) {
    console.error("Request timed out. Try again or increase the timeout.")
  } else if (error instanceof OpenAI.APIConnectionError) {
    console.error("Failed to connect to the API. Check your network.")
  } else if (error instanceof OpenAI.APIError) {
    switch (error.status) {
      case 401:
        console.error("Invalid API key. Check your credentials.")
        break
      case 429:
        console.error("Rate limited. Slow down your requests.")
        break
      default:
        if (error.status >= 500) {
          console.error(`Server error (${error.status}). Retry later.`)
        } else {
          console.error(`API error ${error.status}: ${error.message}`)
        }
    }
  } else {
    throw error // Not an SDK error: let it propagate
  }
}
Best Practices
- Always handle errors explicitly — Don't let API errors crash your application. Wrap calls in try/catch blocks.
- Log error details — Record the status code, error message, and request ID for debugging.
- Set timeouts — Use shorter timeouts for non-streaming requests (~60s) and longer ones for streaming (~5min).
- Implement circuit breakers — If you see repeated 5xx errors, back off for a longer period before retrying.
- Distinguish error types — Handle gateway errors and upstream errors differently in your code if needed.
- Monitor error rates — Track error rates in your application metrics to detect issues early.
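The circuit-breaker practice above can be sketched in a few lines. This is an illustrative, single-threaded example under stated assumptions: the `CircuitBreaker` class, its `threshold` and `cooldown` defaults, and its method names are hypothetical and not part of OpenModel or the OpenAI SDK.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Open after `threshold` consecutive failures; allow a trial call after `cooldown` seconds."""

    def __init__(
        self,
        threshold: int = 5,
        cooldown: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.threshold = threshold
        self.cooldown = cooldown
        self._clock = clock  # injectable so the breaker can be tested without sleeping
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        """Return True if a request may be sent right now."""
        if self._opened_at is None:
            return True
        # Once the cooldown has elapsed, let trial requests through (half-open).
        return self._clock() - self._opened_at >= self.cooldown

    def record_success(self) -> None:
        """Close the breaker and reset the failure count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure; open (or re-open) the breaker at the threshold."""
        self._failures += 1
        if self._failures >= self.threshold:
            self._opened_at = self._clock()
```

In use, you would check `allow()` before each API call, call `record_failure()` when the call ends in a 5xx error, and call `record_success()` when it completes normally. A failed trial request re-opens the breaker for another full cooldown.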