Skill Listing

Review the skill and install it into your account in one click.

Image Generation

Generate and edit images from text prompts using diffusion-style models.

Free

Category: creative

Provider: Computer Agents

Code Files: 1

AI review

Quality: 65, Trust: 45, Discovery: 39

Competent, useful image-generation CLI skill with clear examples, but missing critical setup, provenance, and safety details that limit trust and launch-readiness.

This listing describes a practical CLI-based image generation and editing skill with clear usage examples, options, and typical workflows (text-to-image, editing, combining images). The documentation is concise and gives useful defaults and output conventions, but it lacks installation/setup steps, dependency and authentication instructions, provenance for the claimed model (Gemini 3 Pro Image), and any safety or content-moderation guidance. Because of those omissions and the absence of verifiable domain or model access details, the listing needs additional implementation and trust-related information before it’s ready for broad discovery.

Strengths

  • Clear CLI usage and concrete examples (text-to-image, edit, combine).
  • Supports multiple inputs, aspect ratios, and multiple resolutions.

Considerations

  • No setup, dependency, or authentication instructions, so it is unclear how to access the claimed model.
  • No safety, copyright, or content-moderation guidance and no provenance/domain verification.

Why this ranks

Agent List ranks listings using quality, trust, traction, and freshness instead of follower count alone. Paid Computer Agents badges are identity signals only and do not raise discovery score.

  • The listing is under additional review, which can affect discoverability.

Trust signals

AI review

Review pending.

Badge guide

Paid account badge

Automatic for active Computer Agents Individual, Team, and Enterprise subscriptions. It confirms account status only and does not increase discovery ranking.

Agent List reviewed

Granted after Agent List reviews the creator profile and marketplace presence. This is separate from the blue paid account badge.

name: image-generation
description: Generate and edit images using AI. Use when asked to create, generate, draw, or edit images, illustrations, diagrams, comics, pictures, artwork, logos, or any visual content. Also use for image editing, style transfer, adding elements to photos, or combining multiple images.

Image Generation & Editing

Generate and edit images using Gemini 3 Pro Image, a state-of-the-art model for professional image creation.

When to Use

Use this skill when you need to:

  • Generate images from text descriptions
  • Edit existing images (add/remove/modify elements)
  • Combine multiple images into new compositions
  • Apply style transfers to images
  • Create visual assets, illustrations, or diagrams
  • Generate images with text/logos (high-fidelity text rendering)

Usage

Generate from Text (Text-to-Image)

[terminal example not loaded]

Edit an Existing Image

[terminal example not loaded]

Combine Multiple Images

[terminal example not loaded]

Specify Output Options

[terminal example not loaded]

Options

Option          Short  Default                                         Description
--output        -o     Auto-generated in /workspace/generated_images/  Output file path
--input         -i     None                                            Input image for editing (can specify multiple)
--aspect-ratio  -a     1:1                                             Output aspect ratio
--resolution    -r     1K                                              Output resolution (1K, 2K, or 4K)
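
The options above could be declared with Python's standard argparse module roughly as follows. This is a minimal sketch, not the skill's actual script: the program name generate_image, the positional prompt argument, and the function name build_parser are assumptions, since the listing does not show the real command.

```python
import argparse

# Values copied from the Options, Aspect Ratios, and Resolutions sections.
ASPECT_RATIOS = ["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"]
RESOLUTIONS = ["1K", "2K", "4K"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; the real script name and internals are not shown in the listing."""
    parser = argparse.ArgumentParser(
        prog="generate_image",  # assumed name, for illustration only
        description="Generate or edit an image from a text prompt.",
    )
    # Assumption: the prompt is passed as a single positional argument.
    parser.add_argument("prompt", help="Text prompt describing the image or the edit")
    parser.add_argument("-o", "--output", default=None,
                        help="Output file path (auto-generated when omitted)")
    parser.add_argument("-i", "--input", action="append", default=[],
                        help="Input image for editing; repeat the flag for multiple images")
    parser.add_argument("-a", "--aspect-ratio", choices=ASPECT_RATIOS, default="1:1",
                        help="Output aspect ratio")
    parser.add_argument("-r", "--resolution", choices=RESOLUTIONS, default="1K",
                        help="Output resolution")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is self-contained.
args = build_parser().parse_args(["a red bicycle", "-i", "a.png", "-i", "b.png", "-r", "2K"])
print(args.input, args.aspect_ratio, args.resolution)  # ['a.png', 'b.png'] 1:1 2K
```

Using choices for the aspect ratio and resolution rejects unsupported values before any request is made.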

By default, images are saved to /workspace/generated_images/ with timestamped filenames like image_20250120_143052_your_prompt.png.
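
A filename in that shape could be produced along these lines. The sketch assumes the prompt is lowercased, reduced to alphanumeric words, and truncated to a few words; the listing shows only one example filename, so the exact rule and the function name default_output_path are hypothetical.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional

DEFAULT_DIR = Path("/workspace/generated_images")


def default_output_path(prompt: str, now: Optional[datetime] = None, max_words: int = 5) -> Path:
    """Build a path like image_20250120_143052_your_prompt.png (naming rule is an assumption)."""
    now = now or datetime.now()
    # Keep lowercase alphanumeric words only and join them with underscores.
    words = re.findall(r"[a-z0-9]+", prompt.lower())[:max_words]
    slug = "_".join(words) or "image"
    return DEFAULT_DIR / f"image_{now:%Y%m%d_%H%M%S}_{slug}.png"


# A fixed timestamp makes the result reproducible.
print(default_output_path("Your prompt", datetime(2025, 1, 20, 14, 30, 52)).name)
```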

Aspect Ratios

1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9

Resolutions

  • 1K - ~1024px (default, fastest)
  • 2K - ~2048px (higher quality)
  • 4K - ~4096px (highest quality, slower)
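
To get a rough feel for output size, the aspect ratio and resolution can be combined as below. This sketch assumes the nominal size applies to the longer edge; the listing gives only approximate figures, so the model's real pixel dimensions may differ. The function name approx_dimensions is hypothetical.

```python
# Nominal sizes from the Resolutions list, treated here as the longer edge (an assumption).
LONG_EDGE = {"1K": 1024, "2K": 2048, "4K": 4096}


def approx_dimensions(aspect_ratio: str, resolution: str = "1K") -> tuple[int, int]:
    """Estimate (width, height) in pixels for a ratio such as '16:9'."""
    w, h = (int(part) for part in aspect_ratio.split(":"))
    long_edge = LONG_EDGE[resolution]
    if w >= h:
        # Landscape or square: width is the longer edge.
        return long_edge, round(long_edge * h / w)
    # Portrait: height is the longer edge.
    return round(long_edge * w / h), long_edge


print(approx_dimensions("16:9", "2K"))  # (2048, 1152)
print(approx_dimensions("9:16"))        # (576, 1024)
```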

Examples

Text-to-Image Generation

[terminal example not loaded]

Image Editing

[terminal example not loaded]

Multi-Image Composition

[terminal example not loaded]

Capabilities

Gemini 3 Pro Image Features

  • High-resolution output: 1K, 2K, and 4K generation
  • Advanced text rendering: Legible, stylized text for logos, diagrams, marketing
  • Thinking mode: Model reasons through complex prompts for better results
  • Up to 14 reference images: Mix images for composition (up to 5 people kept at high fidelity)
  • Semantic masking: Edit specific parts without explicit masks
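
Because the reference-image limit is stated as 14, a caller might check the input list before sending a request. The helper below is a sketch under that assumption; validate_reference_images is a hypothetical name and is not part of the skill.

```python
from pathlib import Path

MAX_REFERENCE_IMAGES = 14  # limit stated in the capabilities list


def validate_reference_images(paths: list[str]) -> list[Path]:
    """Return the inputs as Path objects, or raise if there are too many or one is missing."""
    if len(paths) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"{len(paths)} reference images given; at most {MAX_REFERENCE_IMAGES} are supported"
        )
    resolved = [Path(p) for p in paths]
    for path in resolved:
        if not path.is_file():
            raise FileNotFoundError(f"Input image not found: {path}")
    return resolved
```

Checking the count first gives a clear error even when the paths themselves are valid.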

Requirements

  • GEMINI_API_KEY environment variable must be set
  • Python 3.10+ with google-genai and Pillow packages installed

Tips for Better Results

For Generation

  • Be descriptive: "A photorealistic close-up portrait with soft golden hour lighting" beats "a portrait"
  • Specify style: Include art style references (minimalist, photorealistic, watercolor, etc.)
  • Add camera details: Mention lens type, lighting setup, camera angle for photorealistic images
  • Work step by step: For complex scenes, describe the background first, then the foreground elements
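
One way to apply these tips consistently is to assemble the prompt from named parts. The helper below is an illustrative sketch, not part of the skill: the function name and the sentence templates are assumptions.

```python
def build_generation_prompt(subject: str, background: str = "", style: str = "", camera: str = "") -> str:
    """Order the parts as suggested above: background, foreground, then style and camera."""
    sentences = []
    if background:
        sentences.append(f"In the background, {background}.")
    sentences.append(f"In the foreground, {subject}.")
    if style:
        sentences.append(f"Style: {style}.")
    if camera:
        sentences.append(f"Camera: {camera}.")
    return " ".join(sentences)


print(build_generation_prompt(
    "a red fox looking at the viewer",
    background="a snowy pine forest at dusk",
    style="photorealistic",
    camera="85mm lens, soft golden hour lighting, eye-level angle",
))
```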

For Editing

  • Be specific about what to preserve: "Keep the woman's face unchanged, only add..."
  • Describe the integration: "The hat should look naturally placed, matching the lighting"
  • Use semantic descriptions: Instead of "mask the sofa", say "change only the sofa"
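
The same idea works for edit instructions: state the change, what to preserve, and how the result should blend in. This is a sketch with a hypothetical helper name and wording, shown only to make the three tips concrete.

```python
def build_edit_prompt(change: str, preserve: str = "", integration: str = "") -> str:
    """Combine a semantic change description with preservation and integration hints."""
    sentences = [f"{change.rstrip('.')}."]
    if preserve:
        sentences.append(f"Keep {preserve} unchanged.")
    if integration:
        sentences.append(f"{integration.rstrip('.')}.")
    return " ".join(sentences)


print(build_edit_prompt(
    "Change only the sofa to a dark green velvet sofa",
    preserve="the rest of the room",
    integration="The new sofa should match the existing lighting and shadows",
))
```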

For Text in Images

  • Specify font style descriptively: "clean, bold, sans-serif" or "elegant script"
  • Place text explicitly: "text at the top center of the image"