AI Thumbnail Generator | YouTube Podcast Thumbnails with Claude & Gemini

Name: AI Thumbnail Generator
Author: Abhinav Sinha

The Problem

YouTube thumbnails make or break click-through rates, but creating good ones is:

Time-consuming: 30-60 minutes per thumbnail with design tools
Skill-dependent: Requires knowing design principles, color theory, expressions
Hit-or-miss: Hard to predict what will perform well
Inconsistent: Maintaining brand identity across 70+ episodes is tough

I needed a system that could generate multiple professional options quickly for A/B testing.

My Approach

I built a multi-stage generation pipeline:

Hook Generation (Claude): Analyzes the episode and generates 5 hook options, each using a different psychological approach
Hook Selection (Claude): Evaluates all hooks and picks the best 3 for testing
Expression Mapping: Maps hook mood to facial expressions (revelation, authority, controversy)
Image Generation (Gemini): Renders thumbnails with guest images for face consistency

The key insight was separating conceptual work (what message?) from visual work (how to render it?).
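
The four stages above can be sketched as a small pipeline in which the conceptual stages never touch rendering details. This is only an illustration under assumptions, not the project's code: the `Hook` record, `run_pipeline`, and the stand-in `fake_*` stages are hypothetical names, and no real model API is called.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Hook:
    """A thumbnail concept: pure text, no rendering details (hypothetical shape)."""
    text: str      # overlay text shown on the thumbnail
    approach: str  # label for the psychological approach
    mood: str      # e.g. "revelation", "authority", "controversy"


def run_pipeline(
    episode_summary: str,
    generate_hooks: Callable[[str], List[Hook]],       # stage 1: text model
    select_hooks: Callable[[List[Hook]], List[Hook]],  # stage 2: text model
    map_expression: Callable[[str], str],              # stage 3: plain lookup
    render: Callable[[Hook, str], str],                # stage 4: image model
) -> List[str]:
    """Run the conceptual stages first, then hand each chosen hook to the renderer."""
    candidates = generate_hooks(episode_summary)
    chosen = select_hooks(candidates)
    return [render(hook, map_expression(hook.mood)) for hook in chosen]


# Stand-in stages so the sketch runs without any API calls.
def fake_generate(summary: str) -> List[Hook]:
    moods = ["revelation", "authority", "controversy", "revelation", "authority"]
    return [Hook(f"Hook {i + 1}: {summary}", f"approach-{i + 1}", mood)
            for i, mood in enumerate(moods)]


def fake_select(hooks: List[Hook]) -> List[Hook]:
    return hooks[:3]  # a real selector would ask the model to rank them


def fake_expression(mood: str) -> str:
    return f"{mood} expression"


def fake_render(hook: Hook, expression: str) -> str:
    return f"thumbnail[{hook.text} | {expression}]"


outputs = run_pipeline("AI in hiring", fake_generate, fake_select,
                       fake_expression, fake_render)
```

Because each stage is passed in as a callable, the text-model stages and the image-model stage can be swapped or tested independently, which is the point of keeping the "what message?" and "how to render it?" questions apart.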

Architecture

[Figure: AI Thumbnail Generator architecture diagram]

Key Features

5 Hook Approaches: Each uses a different psychological trigger
Objective Selection: Claude evaluates without bias toward its own outputs
Expression Consistency: Mood maps to specific facial expressions
Character Persistence: Reference images maintain face identity
Brand Colors: Saved and reused across episodes
Session Management: Save/restore incomplete workflows
Prompts-Only Mode: Generate hooks without image rendering
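
Saving and restoring an incomplete workflow can be as simple as writing the session state to JSON after each stage. The sketch below is a guess at the general shape, not the project's schema: the `Session` fields, the file name, and the example color value are all hypothetical, and stdlib dataclasses are used here to keep the example self-contained.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Dict, List


@dataclass
class Session:
    """Hypothetical snapshot of an in-progress thumbnail workflow."""
    episode_title: str
    stage: str = "hooks"  # last completed stage
    hooks: List[str] = field(default_factory=list)
    brand_colors: Dict[str, str] = field(default_factory=dict)


def save_session(session: Session, path: Path) -> None:
    """Write the session to disk as human-readable JSON."""
    path.write_text(json.dumps(asdict(session), indent=2), encoding="utf-8")


def load_session(path: Path) -> Session:
    """Rebuild a Session from a file written by save_session."""
    return Session(**json.loads(path.read_text(encoding="utf-8")))


# Round trip through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    session_path = Path(tmp) / "session.json"
    original = Session("Example episode", stage="selection",
                       hooks=["Hook A", "Hook B"],
                       brand_colors={"primary": "#FF5A1F"})
    save_session(original, session_path)
    restored = load_session(session_path)
```

Storing brand colors in the same record is one way to reuse them across episodes; a separate per-show config file would work equally well.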

Results & Metrics

Metric	Value
Hooks Generated	5 per episode
Thumbnails Output	3 per session
Image Resolution	2048x2048
Reference Images	Up to 5 guests + host
Rate Limits	Claude: 50/min, Gemini: 10/min
Output Files	thumbnails + prompts.txt + metadata.json

What I Learned

The hardest part was character consistency. Early versions generated great compositions but the guest's face looked different in each thumbnail. I solved this by:

Reference image feeding: Pass up to 5 guest photos to Gemini
Explicit face instructions: "Maintain exact facial features from reference"
Expression guidance: Specific descriptions like "widened eyes, slight forward lean"
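
The three fixes above meet in the text prompt that accompanies the reference photos. Here is a sketch of how such a prompt might be assembled. `EXPRESSIONS`, `build_image_prompt`, and most of the wording are assumptions for illustration; only the face instruction and the "widened eyes" description are taken from the text above, and the pairing of descriptions with moods is a guess. Attaching the images themselves is omitted because it depends on the image API client.

```python
MAX_REFERENCE_IMAGES = 5  # up to 5 guest photos are passed as references

# Mood -> expression guidance. Which description belongs to which mood is
# an assumption; only the first description is quoted from the write-up.
EXPRESSIONS = {
    "revelation": "widened eyes, slight forward lean",
    "authority": "steady gaze, chin level, composed posture",
    "controversy": "raised eyebrow, skeptical half-smile",
}

FACE_INSTRUCTION = "Maintain exact facial features from reference"


def build_image_prompt(hook_text: str, mood: str, reference_count: int) -> str:
    """Combine overlay text, the face instruction, and expression guidance."""
    if not 1 <= reference_count <= MAX_REFERENCE_IMAGES:
        raise ValueError(f"expected 1-{MAX_REFERENCE_IMAGES} reference images")
    if mood not in EXPRESSIONS:
        raise ValueError(f"unknown mood: {mood!r}")
    return "\n".join([
        f'Podcast thumbnail with overlay text: "{hook_text}"',
        f"{FACE_INSTRUCTION} ({reference_count} reference image(s) attached).",
        f"Guest expression: {EXPRESSIONS[mood]}.",
    ])


prompt = build_image_prompt("The Hiring Myth", "revelation", reference_count=3)
```

Rejecting unknown moods up front keeps a typo in the hook metadata from silently producing a thumbnail with no expression guidance.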

Another challenge was rate limiting. Gemini's image API has strict limits (10/minute), so I added exponential backoff:

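# Automatic retry with exponential backoff (tenacity)
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(5),  # give up after 5 attempts
    wait=wait_exponential(multiplier=1, min=4, max=60),  # each wait clamped to 4-60 seconds
)
async def generate_image(prompt: str):
    ...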
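
To make the mechanics concrete, here is a dependency-free sketch of roughly what a retry-with-backoff wrapper does. It borrows the parameter values from the decorator above but is not the library's implementation, and the library's exact schedule may differ by version. `backoff_delay`, `call_with_backoff`, and the `flaky` demo function are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, multiplier: float = 1.0,
                  minimum: float = 4.0, maximum: float = 60.0) -> float:
    """Delay after failed attempt `attempt` (1-based): doubles, then is clamped."""
    return max(minimum, min(multiplier * 2 ** (attempt - 1), maximum))


async def call_with_backoff(
    func: Callable[[], Awaitable[T]],
    attempts: int = 5,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Await func(), retrying on any exception with growing delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts):
        try:
            return await func()
        except Exception:
            await sleep(backoff_delay(attempt))
    return await func()  # final attempt: let any error propagate


# Demo: fail three times, then succeed; record delays instead of sleeping.
delays: List[float] = []
call_count = {"n": 0}


async def flaky() -> str:
    call_count["n"] += 1
    if call_count["n"] <= 3:
        raise RuntimeError("rate limited")
    return "image-bytes"


async def record_sleep(seconds: float) -> None:
    delays.append(seconds)


result = asyncio.run(call_with_backoff(flaky, sleep=record_sleep))
```

With a 4-second floor, the first few waits are all the same length and only later attempts grow, so under these assumptions the floor matters more than the doubling for a 5-attempt budget.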

Using Claude for conceptual work and Gemini for visual work was key: each model excels at different tasks.

Frequently Asked Questions

What problem does this generator solve?

It reduces thumbnail creation from 30-60 minutes to 5 minutes per episode. Instead of manually designing, you get 3 A/B-testable options with psychological hooks and consistent character rendering.

What technologies power this project?

Claude API for hook generation and selection, Gemini API for image synthesis, Pydantic for data validation, and an interactive Python CLI for the workflow.

How good are the generated thumbnails?

Quality is high for podcast-style thumbnails with text overlays and host/guest faces. Complex scenes or multiple elements may require manual refinement. The psychological hooks are based on proven CTR frameworks.



Built by Abhinav Sinha

AI-First Product Manager who builds production-grade tools. Passionate about turning complex problems into elegant solutions using AI, automation, and modern web technologies.
