Vision
Send images through the /v1/chat/completions endpoint to let the model understand and describe image content. Images can be supplied either as URL references or as Base64-encoded data.
Supported Models
gpt-4o, gpt-4o-mini, claude-sonnet-4-20250514, gemini-2.5-flash, gemini-2.5-pro, and other vision-capable models.
Send Image via URL
curl https://crazyrouter.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe the content of this image"},
          {
            "type": "image_url",
            "image_url": {
              "url": "https://example.com/photo.jpg",
              "detail": "high"
            }
          }
        ]
      }
    ],
    "max_tokens": 500
  }'
Send Image via Base64
Encode the image as a Base64 string and pass it in the url field as a data: URI of the form data:<media-type>;base64,<data>.
curl https://crazyrouter.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "What is in this image?"},
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUg..."
            }
          }
        ]
      }
    ],
    "max_tokens": 500
  }'
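For a local file, the data: URI can be built with Python's standard library. The following is a minimal sketch, not an official helper: the function name image_to_data_uri is hypothetical, and it assumes the media type can be inferred from the file extension.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_uri(path: str) -> str:
    """Return a data: URI (data:<media-type>;base64,<data>) for a local image."""
    # Infer the media type from the file extension, e.g. ".png" -> "image/png".
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image media type for {path!r}")

    # Read the raw bytes and Base64-encode them as ASCII text.
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string can be used as the url value of an image_url content part, for example {"type": "image_url", "image_url": {"url": image_to_data_uri("photo.png")}}, where photo.png is an assumed local file.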
Streaming Vision
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://crazyrouter.com/v1"
)

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in detail"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"}
                }
            ]
        }
    ],
    max_tokens=1000,
    stream=True
)

for chunk in stream:
    # Some providers emit chunks with no choices (for example, a final usage chunk); skip them.
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
Multiple Image Understanding
You can send multiple images in a single message:
# Reuses the client created in the Streaming Vision example.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Compare the differences between these two images"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image1.jpg"}
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image2.jpg"}
                }
            ]
        }
    ],
    max_tokens=1000
)

print(response.choices[0].message.content)
detail Parameter
| Value | Description |
|---|---|
| low | Low-resolution mode; consumes fewer tokens; suitable for quick recognition |
| high | High-resolution mode; consumes more tokens; suitable for detailed analysis |
| auto | Default; the model chooses automatically |
Images consume additional tokens, and higher-resolution images consume more. Choose the detail level that matches the task: low for quick recognition, high for detailed analysis.
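As a minimal sketch of setting the detail level from Python, the helper below builds an image_url content part and rejects any value other than the three documented ones. The name image_part and the client-side validation are assumptions for illustration; they are not part of the API or the SDK.

```python
# The three detail values documented for the image_url content part.
VALID_DETAIL = ("low", "high", "auto")


def image_part(url: str, detail: str = "auto") -> dict:
    """Build an image_url content part with an explicit detail level."""
    if detail not in VALID_DETAIL:
        raise ValueError(f"detail must be one of {VALID_DETAIL}, got {detail!r}")
    return {"type": "image_url", "image_url": {"url": url, "detail": detail}}


# Example: request the cheaper low-resolution mode for a quick classification.
part = image_part("https://example.com/photo.jpg", detail="low")
```

The resulting dictionary goes into the content list of a user message, next to the text part.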
Base64 encoding increases the size of the image data by approximately 33%, because every 3 bytes of input become 4 characters of output. For large images, use the URL method to keep the request small.
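The overhead follows directly from how Base64 works: each group of 3 input bytes is encoded as 4 output characters, with padding for a final partial group. The sketch below computes the encoded length for a given raw size; base64_size is a hypothetical helper name, and the 3,000,000-byte figure is only an example.

```python
def base64_size(raw_bytes: int) -> int:
    """Length in characters of the padded Base64 encoding of raw_bytes bytes."""
    # Round up to whole 3-byte groups; each group becomes 4 characters.
    return 4 * ((raw_bytes + 2) // 3)


raw = 3_000_000                 # example: an image of about 3 MB
encoded = base64_size(raw)      # 4_000_000 characters
overhead = encoded / raw - 1    # roughly one third extra
print(f"{raw} bytes -> {encoded} Base64 characters (+{overhead:.0%})")
```

This counts only the Base64 payload; the data: prefix and the surrounding JSON add a small fixed amount on top.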