Error Handling

When making API requests through OpenModel, you may encounter two categories of errors: gateway errors returned by OpenModel itself, and upstream errors forwarded from the underlying AI provider.

Error Categories

Gateway Errors

Gateway errors originate from OpenModel when a request fails before reaching the upstream provider (e.g. authentication failure, rate limiting, insufficient balance).

For proxy endpoints (/v1/responses, /v1/messages, /v1beta/models/*), gateway errors are returned in the provider-specific format matching the API you are using, so that SDK error handling works seamlessly:

OpenAI format (/v1/responses):

{
  "error": {
    "message": "API key is invalid or has been revoked",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}

Anthropic format (/v1/messages):

{
  "type": "error",
  "error": {
    "type": "authentication_error",
    "message": "API key is invalid or has been revoked"
  }
}

Gemini format (/v1beta/models/*):

{
  "error": {
    "code": 401,
    "message": "API key is invalid or has been revoked",
    "status": "UNAUTHENTICATED"
  }
}

For management API endpoints (/web/v1/*), errors use the standard response format:

{
  "success": false,
  "data": null,
  "error": {
    "code": "UNAUTHORIZED",
    "msg": "API key is invalid or has been revoked"
  }
}

Messages protocol providers: All providers using the Messages protocol (Anthropic, DeepSeek, Xiaomi, Kimi, MiniMax, Zai) return errors in the Anthropic format shown above. DashScope returns OpenAI-format errors when accessed via /v1/responses, or Anthropic-format errors when accessed via /v1/messages. Your existing SDK error handling works for all providers on the same protocol without changes.

Upstream Errors

When the upstream AI provider returns an error, OpenModel forwards it in the provider's original format. This means the error structure depends on which API format you are using:

OpenAI format:

{
  "error": {
    "message": "The model `gpt-5` does not exist",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}

Anthropic format:

{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "max_tokens must be less than 8192"
  }
}

Gemini format:

{
  "error": {
    "code": 400,
    "message": "Invalid value at 'contents'",
    "status": "INVALID_ARGUMENT"
  }
}
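Because the body shape differs per format, it can help to normalize errors before logging or branching on them. The following is a minimal sketch, not part of any SDK: `ParsedError` and `parse_error_body` are hypothetical names, and the detection rules are assumptions based only on the sample bodies shown on this page.

```python
from typing import Any, NamedTuple, Optional


class ParsedError(NamedTuple):
    """Hypothetical common shape for the error formats shown on this page."""
    kind: Optional[str]      # e.g. "invalid_request_error", "INVALID_ARGUMENT", "UNAUTHORIZED"
    message: Optional[str]


def parse_error_body(body: dict[str, Any]) -> ParsedError:
    """Extract an error type/status and message from a decoded JSON error body."""
    error = body.get("error")
    if not isinstance(error, dict):
        return ParsedError(None, None)

    # Management API: {"success": false, "error": {"code": ..., "msg": ...}}
    if "msg" in error:
        return ParsedError(error.get("code"), error.get("msg"))

    # Gemini: {"error": {"code": 400, "message": ..., "status": ...}}
    if "status" in error:
        return ParsedError(error.get("status"), error.get("message"))

    # OpenAI and Anthropic both carry error.type and error.message.
    return ParsedError(error.get("type"), error.get("message"))
```

For example, passing the Gemini body above gives a `kind` of `"INVALID_ARGUMENT"` and the original message, so downstream logging code does not need to know which protocol produced the error.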

HTTP Status Codes

Status	Meaning	Action
`400`	Bad Request — malformed request or invalid parameters	Fix the request and do not retry
`401`	Unauthorized — invalid or missing API key	Check your API key
`402`	Payment Required — insufficient balance	Top up your balance
`403`	Forbidden — insufficient permissions	Verify your account has access
`404`	Not Found — invalid endpoint or model	Check the URL and model name
`429`	Too Many Requests — rate limit exceeded	Retry with exponential backoff
`500`	Internal Server Error — gateway issue	Retry with backoff
`502`	Bad Gateway — upstream provider unreachable	Retry with backoff
`503`	Service Unavailable — temporary overload	Retry with backoff
`504`	Gateway Timeout — upstream provider timeout	Retry with backoff

Retry Strategy

Not all errors should be retried. Here is a recommended approach:

Retryable Errors

Retry these status codes with exponential backoff and jitter:

429 — Rate limit exceeded. Wait before retrying.
500 — Internal server error. Likely transient.
502 — Bad gateway. The upstream provider may be temporarily down.
503 — Service unavailable. Temporary overload.
504 — Gateway timeout. The upstream provider took too long to respond.

Non-Retryable Errors

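Do not retry these. The same request will fail again until you change it. This also applies to 402 (insufficient balance): top up your balance before resending.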

400 — Fix the request body or parameters.
401 — Check your API key.
403 — Verify permissions.
404 — Check the endpoint URL or model name.
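The two lists above, together with the status table, can be encoded as a small lookup so that retry decisions live in one place. This is a sketch only: `is_retryable`, `describe`, and the constant names are hypothetical, and the fallback message for unlisted codes is an assumption.

```python
# Status codes this page recommends retrying with backoff.
RETRYABLE_STATUS_CODES = frozenset({429, 500, 502, 503, 504})

# Suggested actions for codes that should not be retried, taken from the status table.
NON_RETRYABLE_ACTIONS = {
    400: "Fix the request body or parameters.",
    401: "Check your API key.",
    402: "Top up your balance.",
    403: "Verify permissions.",
    404: "Check the endpoint URL or model name.",
}


def is_retryable(status_code: int) -> bool:
    """Return True if the status code is in the retryable list."""
    return status_code in RETRYABLE_STATUS_CODES


def describe(status_code: int) -> str:
    """Return the recommended action for a status code."""
    if is_retryable(status_code):
        return "Retry with exponential backoff and jitter."
    # Assumption: anything not listed on this page is treated as non-retryable.
    return NON_RETRYABLE_ACTIONS.get(status_code, "Do not retry; inspect the error body.")
```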

Exponential Backoff with Jitter

A good retry pattern uses exponential backoff with random jitter to avoid thundering herd problems:

import time
import random
from openai import OpenAI, APIStatusError

client = OpenAI(
    base_url="https://api.openmodel.ai/v1",
    api_key="your-api-key",
    max_retries=0,  # Disable the SDK's built-in retries so the loop below is the only retry layer
)

def make_request_with_retry(max_retries=5):
    for attempt in range(max_retries):
        try:
            return client.responses.create(
                model="gpt-4o",
                input="Hello!",
            )
        except APIStatusError as e:
            if e.status_code in (429, 500, 502, 503, 504):
                if attempt == max_retries - 1:
                    raise
                # Exponential backoff: 1s, 2s, 4s, 8s (with max_retries=5), plus up to 1s of jitter
                delay = (2 ** attempt) + random.random()
                print(f"Retrying in {delay:.1f}s (attempt {attempt + 1})")
                time.sleep(delay)
            else:
                raise  # Don't retry client errors
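
The delay calculation can also be pulled into its own function, which makes it easy to cap the wait when the retry count is raised. A minimal sketch follows; `backoff_delay` is a hypothetical helper, and the 30-second cap is an assumed value rather than an OpenModel recommendation.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before the retry that follows 0-based attempt `attempt`.

    The exponential part doubles each attempt (base, 2*base, 4*base, ...) and is
    clamped to `cap`; up to `base` seconds of random jitter is added on top.
    """
    exponential = min(cap, base * (2 ** attempt))
    return exponential + random.uniform(0, base)
```

With the defaults, attempts 0 through 4 wait roughly 1, 2, 4, 8, and 16 seconds, and every later attempt waits about 30 seconds, each plus up to one second of jitter.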

Timeout Handling

Non-Streaming Requests

Set a reasonable timeout for standard requests. Most completions finish within 30-60 seconds, but complex prompts or large outputs may take longer:

import httpx
from openai import OpenAI

client = OpenAI(
    base_url="https://api.openmodel.ai/v1",
    api_key="your-api-key",
    timeout=httpx.Timeout(60.0, connect=10.0),
)

Streaming Requests

Streaming requests should use much longer timeouts since the connection stays open for the entire generation process:

import httpx
from openai import OpenAI

client = OpenAI(
    base_url="https://api.openmodel.ai/v1",
    api_key="your-api-key",
    timeout=httpx.Timeout(300.0, connect=10.0),
)

Error Handling Patterns

Python — Comprehensive Error Handling

from openai import OpenAI, APIConnectionError, APIStatusError, APITimeoutError

client = OpenAI(
    base_url="https://api.openmodel.ai/v1",
    api_key="your-api-key",
)

try:
    response = client.responses.create(
        model="gpt-4o",
        input="Hello!",
    )
    print(response.output_text)

except APITimeoutError:
    # APITimeoutError subclasses APIConnectionError, so it must be caught first.
    print("Request timed out. Try again or increase the timeout.")

except APIConnectionError:
    print("Failed to connect to the API. Check your network.")

except APIStatusError as e:
    if e.status_code == 401:
        print("Invalid API key. Check your credentials.")
    elif e.status_code == 429:
        print("Rate limited. Slow down your requests.")
    elif e.status_code >= 500:
        print(f"Server error ({e.status_code}). Retry later.")
    else:
        print(f"API error {e.status_code}: {e.message}")

Node.js — Comprehensive Error Handling

import OpenAI from "openai"

const client = new OpenAI({
  baseURL: "https://api.openmodel.ai/v1",
  apiKey: "your-api-key",
})

try {
  const response = await client.responses.create({
    model: "gpt-4o",
    input: "Hello!",
  })
  console.log(response.output_text)
} catch (error) {
  if (error instanceof OpenAI.APIConnectionTimeoutError) {
    // APIConnectionTimeoutError extends APIConnectionError, so check it first.
    console.error("Request timed out. Try again or increase the timeout.")
  } else if (error instanceof OpenAI.APIConnectionError) {
    console.error("Failed to connect to the API. Check your network.")
  } else if (error instanceof OpenAI.APIError) {
    switch (error.status) {
      case 401:
        console.error("Invalid API key. Check your credentials.")
        break
      case 429:
        console.error("Rate limited. Slow down your requests.")
        break
      default:
        if (error.status >= 500) {
          console.error(`Server error (${error.status}). Retry later.`)
        } else {
          console.error(`API error ${error.status}: ${error.message}`)
        }
    }
  } else {
    throw error
  }
}

Best Practices

Always handle errors explicitly — Don't let API errors crash your application. Wrap calls in try/catch blocks.
Log error details — Record the status code, error message, and request ID for debugging.
Set timeouts — Use shorter timeouts for non-streaming requests (~60s) and longer ones for streaming (~5min).
Implement circuit breakers — If you see repeated 5xx errors, back off for a longer period before retrying.
Distinguish error types — Handle gateway errors and upstream errors differently in your code if needed.
Monitor error rates — Track error rates in your application metrics to detect issues early.
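
The circuit breaker practice above can be sketched in a few lines. This is an illustrative outline under stated assumptions, not an OpenModel feature: the `CircuitBreaker` class, its threshold of 3 consecutive failures, and its 30-second cooldown are all hypothetical choices.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Minimal sketch: open after N consecutive failures, allow a trial after a cooldown."""

    def __init__(
        self,
        failure_threshold: int = 3,
        cooldown: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self._clock = clock  # injectable so the breaker can be tested without sleeping
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        """Return False while the breaker is open and the cooldown has not elapsed."""
        if self._opened_at is None:
            return True
        # After the cooldown, let a trial request through (half-open state).
        return self._clock() - self._opened_at >= self.cooldown

    def record_success(self) -> None:
        """Close the breaker and reset the failure count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a 5xx failure; open (or re-open) the breaker at the threshold."""
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

In use, the caller would check `allow_request()` before each API call, call `record_failure()` on a 5xx response, and call `record_success()` on any successful response.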
