Skill Listing

Review the skill and install it into your account in one click.

Image Generation

Generate and edit images from text prompts using diffusion-style models.

Free

Category

creative

Provider

computer agents

AI review

Quality 62, Trust 45, Discovery 38

Clear, practical CLI skill for text-to-image and image-editing workflows with useful options, but key integration and provenance details are missing and the claim of using 'Gemini 3 Pro Image' is unverified.

This listing provides a straightforward, well-organized CLI script and examples for text-to-image generation, editing, and composing multiple inputs, with sensible options for aspect ratios and resolutions. However, it lacks essential implementation details such as dependency installation, authentication or API integration steps, licensing, and how the proprietary 'Gemini 3 Pro Image' model is accessed, which reduces confidence the tool will work out of the box. Improve the README with setup instructions, dependency list, and verified model/API access to raise trust and polish.

Strengths

Clear CLI usage examples for generation, editing, and composition
Supports multiple aspect ratios, resolutions, and input combinations

Considerations

No instructions for authentication, model/API access, or required dependencies
Claims use of 'Gemini 3 Pro Image' without verification or integration details

Why this ranks

Agent List ranks listings using quality, trust, traction, and freshness instead of follower count alone. Paid Computer Agents badges are identity signals only and do not raise discovery score.

The listing is under additional review, which can affect discoverability.

Trust signals

AI review

Review pending.

Badge guide

Paid account badge

Automatic for active Computer Agents Individual, Team, and Enterprise subscriptions. It confirms account status only and does not increase discovery ranking.

Agent List reviewed

Agent List review

Granted after Agent List reviews the creator profile and marketplace presence. This is separate from the blue paid account badge.

name:

image-generation

description:

Generate and edit images using AI. Use when asked to create, generate, draw, or edit images, illustrations, diagrams, comics, pictures, artwork, logos, or any visual content. Also use for image editing, style transfer, adding elements to photos, or combining multiple images.

Image Generation & Editing

Generate and edit images using Gemini 3 Pro Image, a state-of-the-art model for professional image creation.

When to Use

Use this skill when you need to:

Generate images from text descriptions
Edit existing images (add/remove/modify elements)
Combine multiple images into new compositions
Apply style transfers to images
Create visual assets, illustrations, or diagrams
Generate images with text/logos (high-fidelity text rendering)

Usage

Generate from Text (Text-to-Image)

terminal

Edit an Existing Image

terminal

Combine Multiple Images

terminal

Specify Output Options

terminal

Options

| Option | Short | Default | Description |
|---|---|---|---|
| `--output` | `-o` | Auto-generated in `/workspace/generated_images/` | Output file path |
| `--input` | `-i` | None | Input image for editing (multiple inputs allowed) |
| `--aspect-ratio` | `-a` | `1:1` | Output aspect ratio |
| `--resolution` | `-r` | `1K` | Output resolution (`1K`, `2K`, or `4K`) |

By default, images are saved to /workspace/generated_images/ with timestamped filenames like image_20250120_143052_your_prompt.png.
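
The options and default filename described above could be wired up roughly as follows. This is a minimal sketch, not the skill's actual script: the function names `build_parser` and `default_output_path`, the positional `prompt` argument, the repeat-the-flag behavior of `--input`, and the slug rule for the filename are all assumptions made for illustration.

```python
import argparse
import re
from datetime import datetime
from pathlib import Path

ASPECT_RATIOS = ["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"]
RESOLUTIONS = ["1K", "2K", "4K"]
DEFAULT_DIR = Path("/workspace/generated_images")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (hypothetical layout)."""
    parser = argparse.ArgumentParser(description="Generate or edit an image from a prompt.")
    parser.add_argument("prompt", help="Text description or edit instruction")
    parser.add_argument("-o", "--output", default=None, help="Output file path")
    # Assumption: repeating -i collects several inputs; the real script may differ.
    parser.add_argument("-i", "--input", action="append", default=[], help="Input image")
    parser.add_argument("-a", "--aspect-ratio", default="1:1", choices=ASPECT_RATIOS)
    parser.add_argument("-r", "--resolution", default="1K", choices=RESOLUTIONS)
    return parser


def default_output_path(prompt: str, now: datetime) -> Path:
    """Return a timestamped path such as image_20250120_143052_your_prompt.png."""
    # Hypothetical slug rule: lowercase, runs of non-alphanumerics become "_", truncated.
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")[:40]
    return DEFAULT_DIR / f"image_{now:%Y%m%d_%H%M%S}_{slug}.png"
```

When `--output` is omitted, a caller would fall back to `default_output_path(args.prompt, datetime.now())`.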

Aspect Ratios

1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9

Resolutions

1K - ~1024px (default, fastest)
2K - ~2048px (higher quality)
4K - ~4096px (highest quality, slower)
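
To get a feel for how a resolution tier and an aspect ratio combine, the sketch below estimates pixel dimensions. It assumes the nominal size (1024, 2048, or 4096) applies to the longer edge. That is an assumption for illustration only: the model picks the actual output dimensions, and they may differ. `approx_dimensions` is a hypothetical helper, not part of the skill.

```python
# Nominal sizes taken from the list above; treated here as the longer edge (assumption).
NOMINAL_LONG_EDGE = {"1K": 1024, "2K": 2048, "4K": 4096}


def approx_dimensions(aspect_ratio: str, resolution: str = "1K") -> tuple[int, int]:
    """Estimate (width, height) in pixels for an aspect ratio like "16:9"."""
    w_ratio, h_ratio = (int(part) for part in aspect_ratio.split(":"))
    long_edge = NOMINAL_LONG_EDGE[resolution]
    if w_ratio >= h_ratio:
        # Landscape or square: width is the long edge.
        return long_edge, round(long_edge * h_ratio / w_ratio)
    # Portrait: height is the long edge.
    return round(long_edge * w_ratio / h_ratio), long_edge
```

For example, `approx_dimensions("16:9", "2K")` returns `(2048, 1152)` under this assumption.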

Examples

Text-to-Image Generation

terminal

Image Editing

terminal

Multi-Image Composition

terminal

Capabilities

Gemini 3 Pro Image Features

High-resolution output: 1K, 2K, and 4K generation
Advanced text rendering: Legible, stylized text for logos, diagrams, marketing
Thinking mode: Model reasons through complex prompts for better results
Up to 14 reference images: Mix images for composition (up to 5 of them can be people rendered with high fidelity)
Semantic masking: Edit specific parts without explicit masks

Requirements

GEMINI_API_KEY environment variable must be set
Python 3.10+ with google-genai and Pillow packages installed

Tips for Better Results

For Generation

Be descriptive: "A photorealistic close-up portrait with soft golden hour lighting" beats "a portrait"
Specify style: Include art style references (minimalist, photorealistic, watercolor, etc.)
Add camera details: Mention lens type, lighting setup, camera angle for photorealistic images
Work step by step: For complex scenes, describe the background first, then the foreground elements
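
The generation tips above can be applied consistently by assembling prompts from labeled parts. The helper below is a hypothetical sketch: the function name, the field names, and the "Label: value" layout are illustrative choices, not something the skill or the model requires.

```python
def build_prompt(
    subject: str,
    *,
    style: str = "",
    camera: str = "",
    background: str = "",
    foreground: str = "",
) -> str:
    """Join a subject with optional style, camera, and scene details."""
    sentences = [subject]
    if style:
        sentences.append(f"Style: {style}")
    if camera:
        sentences.append(f"Camera: {camera}")
    # Background is listed before foreground, following the step-by-step tip.
    if background:
        sentences.append(f"Background: {background}")
    if foreground:
        sentences.append(f"Foreground: {foreground}")
    return ". ".join(s.strip().rstrip(".") for s in sentences) + "."
```

For example, `build_prompt("A close-up portrait", style="photorealistic", camera="85mm lens, soft golden hour lighting")` yields a single prompt string with each detail in its own sentence.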

For Editing

Be specific about what to preserve: "Keep the woman's face unchanged, only add..."
Describe the integration: "The hat should look naturally placed, matching the lighting"
Use semantic descriptions: Instead of "mask the sofa", say "change only the sofa"

For Text in Images

Specify font style descriptively: "clean, bold, sans-serif" or "elegant script"
Place text explicitly: "text at the top center of the image"