
Gemini Image Editing

Gemini image models can edit and modify existing images. Send the original image together with a text instruction describing the edit, and the model returns the modified image.
POST /v1beta/models/{model}:generateContent

Basic Image Editing

curl "https://crazyrouter.com/v1beta/models/gemini-2-5-flash-image:generateContent?key=YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [
          {"text": "Change the background of this image to a starry sky"},
          {
            "inlineData": {
              "mimeType": "image/jpeg",
              "data": "/9j/4AAQSkZJRgABAQAA..."
            }
          }
        ]
      }
    ],
    "generationConfig": {
      "responseModalities": ["TEXT", "IMAGE"]
    }
  }'
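
The same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the endpoint, model name, and request body are copied from the curl example above, while the function names (`build_edit_payload`, `extract_images`, `edit_image`) are hypothetical, and the response parsing assumes the gateway returns the usual Gemini `generateContent` shape (`candidates` → `content` → `parts` → `inlineData`).

```python
import base64
import json
import urllib.request

# Endpoint and model name are copied from the curl example above.
BASE_URL = "https://crazyrouter.com/v1beta/models"
MODEL = "gemini-2-5-flash-image"


def build_edit_payload(instruction, image_bytes, mime_type="image/jpeg"):
    """Build the generateContent request body for a single-image edit."""
    return {
        "contents": [
            {
                "role": "user",
                "parts": [
                    {"text": instruction},
                    {
                        "inlineData": {
                            "mimeType": mime_type,
                            # The JSON body carries base64 text, not raw bytes.
                            "data": base64.b64encode(image_bytes).decode("ascii"),
                        }
                    },
                ],
            }
        ],
        "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]},
    }


def extract_images(response_json):
    """Return decoded bytes for every inlineData part in the response.

    Assumption: the response follows the Gemini generateContent shape
    (candidates -> content -> parts -> inlineData).
    """
    images = []
    for candidate in response_json.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            inline = part.get("inlineData")
            if inline and "data" in inline:
                images.append(base64.b64decode(inline["data"]))
    return images


def edit_image(api_key, instruction, image_bytes, mime_type="image/jpeg"):
    """POST the edit request and return the edited images as bytes."""
    url = f"{BASE_URL}/{MODEL}:generateContent?key={api_key}"
    body = json.dumps(build_edit_payload(instruction, image_bytes, mime_type))
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return extract_images(json.load(response))
```

A call might look like `edit_image("YOUR_API_KEY", "Change the background of this image to a starry sky", open("photo.jpg", "rb").read())`, where `photo.jpg` is a placeholder file name. Each returned item can be written to disk in binary mode.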

Style Transfer

Python
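# Assumes `genai` is the imported Gemini Python SDK, `model` is an already
# configured model object, and `image_data` holds the bytes of the source JPEG.
response = model.generate_content(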
    [
        "Convert this photo to watercolor painting style, keeping the original composition",
        {"mime_type": "image/jpeg", "data": image_data}
    ],
    generation_config=genai.GenerationConfig(
        response_modalities=["TEXT", "IMAGE"]
    )
)

Local Editing

To edit only part of an image, describe the target region and the desired change in detail in the text instruction:
Python
response = model.generate_content(
    [
        "Change the person's clothing color from red to blue, keep everything else unchanged",
        {"mime_type": "image/jpeg", "data": image_data}
    ],
    generation_config=genai.GenerationConfig(
        response_modalities=["TEXT", "IMAGE"]
    )
)

Multi-Image Reference Editing

You can send several images in one request and refer to them by position in the instruction:
Python
response = model.generate_content(
    [
        "Redraw the first image using the style of the second image",
        {"mime_type": "image/jpeg", "data": content_image},
        {"mime_type": "image/jpeg", "data": style_image}
    ],
    generation_config=genai.GenerationConfig(
        response_modalities=["TEXT", "IMAGE"]
    )
)
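
For callers using the REST endpoint directly, a multi-image request body might be assembled as follows. This is a sketch under assumptions: it reuses the `inlineData` part format from the curl example in Basic Image Editing, and the helper name `build_multi_image_payload` is hypothetical. Images are appended in list order so that phrases such as "the first image" in the instruction match the order of the parts.

```python
import base64


def build_multi_image_payload(instruction, images):
    """Build a request body with one text part followed by several images.

    `images` is a list of (mime_type, raw_bytes) tuples. List order is kept,
    so positional references in the instruction stay meaningful.
    """
    parts = [{"text": instruction}]
    for mime_type, data in images:
        parts.append(
            {
                "inlineData": {
                    "mimeType": mime_type,
                    # Base64-encode each image for the JSON body.
                    "data": base64.b64encode(data).decode("ascii"),
                }
            }
        )
    return {
        "contents": [{"role": "user", "parts": parts}],
        "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]},
    }
```

The resulting dictionary can be serialized with `json.dumps` and sent to the same `generateContent` endpoint as a single-image request.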

Edit quality depends on how clear the instruction is. Use specific descriptions that state exactly what should change and what should stay the same.

Both input and output images consume tokens, so large images significantly increase token consumption.
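
To see what a given edit actually cost, the token counts can be read from the JSON response. The sketch below assumes the gateway passes through the `usageMetadata` object that the Gemini `generateContent` response normally carries. If it does not, every value comes back as `None`. The helper name `token_usage` is hypothetical.

```python
def token_usage(response_json):
    """Pull token counts out of a generateContent response, if present.

    Assumption: the gateway forwards Gemini's `usageMetadata` object.
    Missing fields are returned as None instead of raising.
    """
    usage = response_json.get("usageMetadata", {})
    return {
        "prompt": usage.get("promptTokenCount"),
        "output": usage.get("candidatesTokenCount"),
        "total": usage.get("totalTokenCount"),
    }
```

Comparing these numbers for the same edit at two input resolutions is a simple way to measure how much image size affects consumption.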