Vision

Send images through the /v1/chat/completions endpoint so the model can interpret and describe their content. Images can be passed either as URL references or as Base64-encoded data.

Supported Models

gpt-4o, gpt-4o-mini, claude-sonnet-4-20250514, gemini-2.5-flash, gemini-2.5-pro, and other vision-capable models.

Send Image via URL

curl https://crazyrouter.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe the content of this image"},
          {
            "type": "image_url",
            "image_url": {
              "url": "https://example.com/photo.jpg",
              "detail": "high"
            }
          }
        ]
      }
    ],
    "max_tokens": 500
  }'

Send Image via Base64

Encode the image as a Base64 string and pass it using the data: URI format.
curl https://crazyrouter.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "What is in this image?"},
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUg..."
            }
          }
        ]
      }
    ],
    "max_tokens": 500
  }'
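
The data: URI in the request above can be produced from a local file with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `image_to_data_uri` is hypothetical and not part of any SDK, and the MIME type is guessed from the file extension.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_uri(path: str) -> str:
    """Return a data: URI for the image file at `path` (hypothetical helper)."""
    # Guess the MIME type from the file extension, e.g. ".png" -> "image/png"
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {path!r}")
    # Base64-encode the raw bytes and decode to ASCII so it can be embedded in JSON
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string can be placed in the "url" field of an "image_url" content part, exactly where the truncated Base64 string appears in the curl example.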

Streaming Vision

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://crazyrouter.com/v1"
)

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in detail"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"}
                }
            ]
        }
    ],
    max_tokens=1000,
    stream=True
)

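for chunk in stream:
    # Some chunks may carry no choices (for example a trailing usage chunk), so guard before indexing
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()  # end the streamed output with a newline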
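
If the full description is needed after streaming finishes, the deltas can be accumulated instead of only printed. The sketch below assumes each chunk exposes `choices[0].delta.content` as in the loop above; the function name `collect_stream_text` is hypothetical.

```python
def collect_stream_text(stream) -> str:
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in stream:
        # Skip chunks that carry no choices at all
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        # Role-only or final chunks have no text content
        if delta:
            parts.append(delta)
    return "".join(parts)
```

Note that a stream can be consumed only once, so use either the printing loop or this collector for a given request, not both.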

Multiple Image Understanding

You can send multiple images in a single message:
# Assumes `client` is the OpenAI client created in the Streaming Vision example
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Compare the differences between these two images"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image1.jpg"}
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image2.jpg"}
                }
            ]
        }
    ],
    max_tokens=1000
)

# Print the model's comparison of the two images
print(response.choices[0].message.content)
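
When the number of images varies, the content list can be built programmatically rather than written out by hand. The following is a sketch only: `build_vision_content` is a hypothetical helper, and it simply reproduces the message structure shown in the examples on this page, optionally attaching the same detail value to every image.

```python
from typing import Any, Dict, List, Optional

# The detail values listed in this document
ALLOWED_DETAIL = {"low", "high", "auto"}


def build_vision_content(
    prompt: str,
    image_urls: List[str],
    detail: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Build a `content` list with one text part followed by one part per image."""
    if detail is not None and detail not in ALLOWED_DETAIL:
        raise ValueError(f"detail must be one of {sorted(ALLOWED_DETAIL)}")
    content: List[Dict[str, Any]] = [{"type": "text", "text": prompt}]
    for url in image_urls:
        image: Dict[str, Any] = {"url": url}
        # Only include `detail` when the caller asked for a specific level
        if detail is not None:
            image["detail"] = detail
        content.append({"type": "image_url", "image_url": image})
    return content
```

The result can be passed as the "content" value of a user message, for example `messages=[{"role": "user", "content": build_vision_content("Compare these images", urls)}]`.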

detail Parameter

Value   Description
low     Low-resolution mode; consumes fewer tokens; suitable for quick recognition
high    High-resolution mode; consumes more tokens; suitable for detailed analysis
auto    Default; the model chooses automatically
Images consume additional tokens, and higher-resolution images consume more. Choose the detail level that matches what the task actually requires.
Base64 encoding increases image size by approximately 33%. For large images, prefer the URL method to keep the request small.
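
The 33% figure follows from how Base64 works: every 3 bytes of input become 4 output characters, with the final group padded. A short check using only the standard library is sketched below; `base64_encoded_size` is a hypothetical helper name.

```python
import base64


def base64_encoded_size(raw_size: int) -> int:
    """Size in bytes of the padded Base64 encoding of `raw_size` bytes."""
    # Each group of up to 3 input bytes maps to exactly 4 output characters
    return 4 * ((raw_size + 2) // 3)


raw = bytes(3000)                # stand-in for 3000 bytes of image data
encoded = base64.b64encode(raw)  # 4000 bytes after encoding
assert len(encoded) == base64_encoded_size(len(raw))

overhead = len(encoded) / len(raw) - 1
print(f"Base64 overhead: {overhead:.1%}")  # prints "Base64 overhead: 33.3%"
```

The same formula can be used to estimate request size before uploading, for example to decide whether a large image should be hosted and sent by URL instead.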