This repository is a collection of explorations and examples of how to use Genkit with the Gemini CLI. The goal is to provide a hands-on guide for developers who want to get started with Genkit and Gemini for building AI-powered features.
Recommendation: Always use the latest versions of Genkit and its plugins for best compatibility.
- macOS (Intel or Apple Silicon)
- Node.js 20+ installed
- Gemini CLI installed (npm install -g @google/gemini-cli)
This repository includes a convenient script to quickly generate any of the documented examples:
# Make the script executable (only needed once)
chmod +x run-example.sh
# Run the script (interactive mode - will ask for confirmations)
./run-example.sh
# Run in YOLO mode (auto-approves all tool calls)
./run-example.sh --yolo
# Alternative: source the script to stay in the generated directory
# source run-example.sh

The script will:
- ✅ Check if Gemini CLI is installed (and provide install instructions if not)
- 🎯 Present a menu of all available examples from this README
- 📁 Create a new folder for your selected example (auto-increments if folder exists)
- 📄 Copy the gemini.md documentation to the project folder
- 🤖 Run Gemini CLI with the appropriate prompt to generate the code
Each generated example will be placed in its own folder like genkit-learn-text-generation-001, genkit-learn-tts-002, etc.
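The folder naming can be pictured with a short TypeScript sketch. It is not the logic of run-example.sh itself: the exact numbering rule is not spelled out in this README, so the sketch assumes the three-digit suffix is bumped until the name is unused. `nextFolderName` and its parameters are hypothetical names chosen for illustration.

```typescript
// Hypothetical helper: picks the first unused "genkit-learn-<slug>-NNN" name.
// Assumption: the suffix starts at 001 and increases until no folder with that name exists.
function nextFolderName(slug: string, existing: string[]): string {
  for (let n = 1; n <= 999; n++) {
    const suffix = ("000" + n).slice(-3); // zero-pad to three digits
    const candidate = "genkit-learn-" + slug + "-" + suffix;
    if (existing.indexOf(candidate) === -1) {
      return candidate;
    }
  }
  throw new Error("No free folder name left for " + slug);
}

// With genkit-learn-tts-001 already present, the next free name ends in 002.
const exampleFolder = nextFolderName("tts", ["genkit-learn-tts-001"]);
```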
Warning: With --yolo, Gemini CLI automatically approves all actions, including file creation, package installation, and command execution. Use with caution!
The repository is organized by language, with each language having its own directory under the docs folder:
docs/
└── nodejs/
    └── gemini.md    # Comprehensive Node.js guide for Genkit with Gemini
Status: Tested ✓
Simple text generation using Gemini models through Genkit flows.
Build a creative story generation service that:
- Takes a topic and generates stories of varying lengths
- Uses Gemini 2.5 Flash for fast, creative responses
- Returns word count along with the story
I need a story writing app using Genkit. Users should be able to enter any topic
and get back a creative story about it. Give them options for story length
(short, medium, or long). Use Gemini AI to generate the stories.
- ✓ Fully functional Genkit project with story generation flow
- ✓ Node.js 20+ and TypeScript environment configured automatically
- ✓ All dependencies installed (genkit, @genkit-ai/core, @genkit-ai/googleai, zod)
- ✓ Input validation for topic and length selection
- ✓ Word count tracking in output
- ✓ TypeScript code that compiles successfully
- ✓ DevUI integration for testing
- ✓ Requires a Gemini API key set as the GEMINI_API_KEY environment variable
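The input validation and word count listed above can be sketched without any dependencies. The generated project lists zod among its dependencies and presumably validates with it; this sketch hand-rolls the checks instead so it runs on its own. The names `parseStoryInput`, `countWords`, and `toStoryOutput` are illustrative assumptions, not taken from the generated code.

```typescript
type StoryLength = "short" | "medium" | "long";

interface StoryInput {
  topic: string;
  length: StoryLength;
}

interface StoryOutput {
  story: string;
  wordCount: number;
}

const STORY_LENGTHS: StoryLength[] = ["short", "medium", "long"];

// Rejects a missing or empty topic and any length outside short/medium/long.
function parseStoryInput(raw: { topic?: unknown; length?: unknown }): StoryInput {
  if (typeof raw.topic !== "string" || raw.topic.trim() === "") {
    throw new Error("topic must be a non-empty string");
  }
  const length = raw.length;
  if (typeof length !== "string" || STORY_LENGTHS.indexOf(length as StoryLength) === -1) {
    throw new Error("length must be one of: " + STORY_LENGTHS.join(", "));
  }
  return { topic: raw.topic.trim(), length: length as StoryLength };
}

// Counts whitespace-separated words; an empty or blank string has zero words.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Pairs the generated story with its word count, matching the output shape shown in this section.
function toStoryOutput(story: string): StoryOutput {
  return { story: story, wordCount: countWords(story) };
}
```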
Sample input:
{
  "topic": "a robot learning to paint",
  "length": "medium"
}

Sample output:
{
  "story": "In a small workshop filled with canvases and paint...",
  "wordCount": 427
}

# Set your API key
export GEMINI_API_KEY="your-api-key-here"
# Start Genkit DevUI
genkit start -- npx tsx --watch src/index.ts
# Access the UI at http://localhost:4000

Status: Tested ✓
Transform text into natural-sounding speech using Gemini's TTS capabilities.
Build a translation and text-to-speech service with two flows:
- translateText: Translates English text to Spanish using Gemini
- textToSpeech: Generates audio from any text input
- Returns playable WAV files in DevUI
Build me a project that uses Genkit on the backend to take input from the user,
translate it to Spanish from English and create an audio file using text-to-speech
to read back the spanish text by generating the audio file. Use Gemini models for both tasks.
- ✓ Two separate flows: translateText and textToSpeech
- ✓ Node.js 20+ environment with all dependencies auto-installed
- ✓ Translation flow converts English to Spanish
- ✓ TTS flow generates WAV audio from any text
- ✓ Audio output playable directly in DevUI
- ✓ TypeScript code that compiles successfully
- ✓ Includes wav package for PCM to WAV conversion
- ✓ Requires Gemini API key as environment variable
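To show what the PCM to WAV conversion involves, here is a minimal sketch that writes the 44-byte RIFF header by hand. The generated project uses the wav package for this step; the hand-rolled version below is a substitute used only to make the format visible. The defaults (24 kHz, mono, 16-bit) are an assumption about the TTS output format, and `pcmToWav` is a hypothetical name.

```typescript
// Wraps raw little-endian PCM samples in a canonical 44-byte WAV (RIFF) header.
// Assumed defaults: 24000 Hz sample rate, 1 channel, 16 bits per sample.
function pcmToWav(
  pcm: Uint8Array,
  sampleRate: number = 24000,
  channels: number = 1,
  bitsPerSample: number = 16
): Uint8Array {
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign;          // bytes per second
  const wav = new Uint8Array(44 + pcm.length);
  const view = new DataView(wav.buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);             // size of the fmt chunk
  view.setUint16(20, 1, true);              // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, "data");
  view.setUint32(40, pcm.length, true);     // size of the sample data
  wav.set(pcm, 44);                         // samples follow the header
  return wav;
}
```

If the model returns audio at a different sample rate or bit depth, the header values must be changed to match, otherwise playback speed or pitch will be wrong.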
# Set your API key
export GEMINI_API_KEY="your-api-key-here"
# Start Genkit DevUI
genkit start -- npx tsx --watch src/index.ts
# Access the UI at http://localhost:4000

- Gemini API key is required (the project will error without it, by design)
- Genkit DevUI must be run interactively (cannot be automated from CLI)
- Audio output is accessible directly through the DevUI interface
Status: Tested ✓
Generate conversational audio with multiple distinct voices for realistic dialogues.
Build a dialogue generation service that:
- Uses the speaker tag format: <speaker="Speaker1">text</speaker>
- Supports two different voices in a single audio file
- Generates conversational audio with proper voice switching
- Returns a single WAV file with both speakers
Build a Genkit app that creates podcast-style audio conversations. I want to
write dialogue between two people and have the app generate audio where each
person sounds different. For example: "Host: Welcome!" and "Guest: Thanks!"
should have distinct voices. Use Gemini's text-to-speech.
- ✓ Single audio file with multiple speakers
- ✓ Complete project with Node.js 20+ and all dependencies
- ✓ Voice switching based on speaker tags
- ✓ Speaker1 uses 'algenib' voice, Speaker2 uses 'kore' voice
- ✓ WAV audio playable directly in DevUI
- ✓ Requires Gemini API key with TTS model access
Sample input:
{
  "text": "<speaker=\"Speaker1\">Welcome to our podcast about AI!</speaker> <speaker=\"Speaker2\">Thanks for having me. I'm excited to discuss the future.</speaker> <speaker=\"Speaker1\">Let's start with Genkit. What makes it special?</speaker>"
}

Sample output:
{
  "audioDataUri": "data:audio/wav;base64,UklGRi..."
}

- Must include responseModalities: ['AUDIO'] in config
- Speaker tags must match exactly: <speaker="Speaker1">, not <speaker=Speaker1>
- Only two speakers supported per request
- Audio is returned as base64 PCM data (converted to WAV)
- Default voices: 'algenib' for Speaker1, 'kore' for Speaker2
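The tag rules in these notes can be checked before any audio request is made. The sketch below is one possible validator, not the generated project's code: `parseSpeakerTags`, `SpeakerSegment`, and `DEFAULT_VOICES` are hypothetical names, and the voice mapping simply mirrors the defaults listed above.

```typescript
interface SpeakerSegment {
  speaker: string;
  text: string;
}

// Default voice per speaker, as listed in the notes above.
const DEFAULT_VOICES: { [speaker: string]: string } = {
  Speaker1: "algenib",
  Speaker2: "kore",
};

// Extracts <speaker="Name">text</speaker> segments in order.
// Unquoted tags such as <speaker=Speaker1> do not match and are therefore not accepted.
function parseSpeakerTags(input: string): SpeakerSegment[] {
  const tag = /<speaker="([^"]+)">([\s\S]*?)<\/speaker>/g;
  const segments: SpeakerSegment[] = [];
  let match: RegExpExecArray | null;
  while ((match = tag.exec(input)) !== null) {
    segments.push({ speaker: match[1], text: match[2].trim() });
  }
  if (segments.length === 0) {
    throw new Error('No well-formed <speaker="..."> tags found');
  }

  // Enforce the two-speaker limit per request.
  const speakers: string[] = [];
  for (const segment of segments) {
    if (speakers.indexOf(segment.speaker) === -1) {
      speakers.push(segment.speaker);
    }
  }
  if (speakers.length > 2) {
    throw new Error("Only two speakers are supported per request");
  }
  return segments;
}
```

Running the validator first gives a clear error for malformed input instead of a failed or single-voice audio generation.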
Status: Ready for Testing
Fast, accessible image generation using Gemini's built-in image generation capabilities.
Build a quick image generation service that:
- Uses Gemini Flash for fast image generation
- Supports creative prompts and descriptions
- Returns base64 images for immediate display
- Ideal for prototyping and general use cases
I need an AI image generator using Genkit with Gemini Flash. Users should be
able to type what they want to see (like "a cat wearing a superhero cape")
and get back a generated image quickly. Use the Gemini Flash model for fast
image generation.
- ✓ Complete Genkit project with Gemini Flash image generation
- ✓ Single flow: generateImageWithGemini
- ✓ Base64 image output for DevUI display
- ✓ Fast generation times
- ✓ Uses Google AI plugin (not Vertex AI)
- ✓ Requires only basic Gemini API key
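The base64 image output can be illustrated with a small conversion sketch. It assumes the model call hands the image back as a base64 data: URI, which is an assumption about the response shape rather than something this README states; `parseImageDataUri` and `toDataUri` are hypothetical helper names.

```typescript
interface InlineImage {
  base64: string;
  mimeType: string;
}

// Splits "data:<mimeType>;base64,<payload>" into its two parts.
function parseImageDataUri(dataUri: string): InlineImage {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(dataUri);
  if (match === null) {
    throw new Error("Expected a base64 data URI");
  }
  return { mimeType: match[1], base64: match[2] };
}

// Rebuilds a data URI, for example to use as the src of an <img> element.
function toDataUri(image: InlineImage): string {
  return "data:" + image.mimeType + ";base64," + image.base64;
}
```

Keeping the MIME type next to the payload lets a client display the image without inspecting the bytes.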
Sample input:
{
  "prompt": "A serene Japanese garden with cherry blossoms"
}

Sample output:
{
  "image": {
    "base64": "iVBORw0KGgoAAAANS...",
    "mimeType": "image/png"
  },
  "model": "gemini-2.0-flash-preview-image-generation"
}

Status: Ready for Testing
Premium image generation using Google's dedicated Imagen models for superior quality.
Build a high-quality image generation service that:
- Uses Imagen 4 preview for superior image quality
- Supports advanced style controls
- Can generate multiple images (via multiple API calls)
- Ideal for professional or artistic use cases
Build an image generator with Genkit using Google's Imagen 4 preview model. I want
high-quality, professional images with style controls (like "photorealistic"
or "watercolor"). Users should be able to generate multiple variations of
their prompt.
- ✓ Complete Genkit project with Imagen integration
- ✓ Single flow: generateImageWithImagen
- ✓ Style customization options
- ✓ Multiple image generation (via looped API calls)
- ✓ Base64 image output for DevUI display
- ✓ Higher quality than Gemini Flash
- ✓ Uses Google AI plugin (not Vertex AI)
- ✓ Requires paid tier Gemini API key with Imagen access
Sample input:
{
  "prompt": "A futuristic cityscape at sunset",
  "style": "photorealistic",
  "numberOfImages": 2
}

Sample output:
{
  "images": [
    {
      "base64": "iVBORw0KGgoAAAANS...",
      "mimeType": "image/png"
    },
    {
      "base64": "jWCPRw1LKgoAAAANS...",
      "mimeType": "image/png"
    }
  ],
  "model": "imagen-4.0-generate-preview-06-06"
}

- Imagen 4 preview offers the latest features and best quality
- Requires paid tier API key (no free tier for Imagen)
- Multiple images are generated via sequential API calls
- Best for professional or artistic applications where quality matters most
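The sequential-call approach to multiple images can be sketched as a plain loop. The actual model call is injected as `generateOne`, a hypothetical stand-in for whatever function the flow uses to request a single image; how the style is folded into the prompt is also an assumption.

```typescript
interface GeneratedImage {
  base64: string;
  mimeType: string;
}

// Hypothetical signature for a single image request.
type GenerateOne = (prompt: string) => Promise<GeneratedImage>;

// Requests numberOfImages images one after another, each with the same styled prompt.
async function generateVariations(
  prompt: string,
  style: string,
  numberOfImages: number,
  generateOne: GenerateOne
): Promise<GeneratedImage[]> {
  if (numberOfImages < 1 || Math.floor(numberOfImages) !== numberOfImages) {
    throw new Error("numberOfImages must be a positive integer");
  }
  // Assumption: the style is appended to the prompt as plain text.
  const styledPrompt = style ? prompt + ", " + style + " style" : prompt;
  const images: GeneratedImage[] = [];
  for (let i = 0; i < numberOfImages; i++) {
    // Awaiting inside the loop keeps the calls sequential, as described in the notes.
    images.push(await generateOne(styledPrompt));
  }
  return images;
}
```

Sequential calls are slower than parallel ones but are less likely to run into rate limits, and each call is billed separately either way.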
Status: Ready for Testing
Generate video content from text prompts (Veo 3) or starting images (Veo 2).
Video generation has NO FREE TIER in the Gemini API. Each video generation request incurs charges.
- Fixed parameters (Veo 3 text-to-video): 8 seconds duration, 720p resolution, 24fps
- Current pricing: Check Google AI Studio Pricing for latest rates
Build a video generation service with two flows:
- generateVideoFromText: Creates videos from text descriptions (Veo 3)
- generateVideoFromImage: Creates videos from a starting image (Veo 2)
- Supports aspect ratio selection (16:9, 9:16)
- Includes negative prompts to exclude unwanted elements (Veo 3 only)
- Returns video URL after asynchronous generation
Build a video creation app with Genkit where users describe scenes in text
and get AI-generated videos. Support both text-to-video and image-to-video
generation. Include options for aspect ratio and negative prompts. Use
Google's Veo 3 preview model for text-to-video and Veo 2 for image-to-video.
- ✓ Complete project with Node.js 20+ and all dependencies
- ✓ Two flows: text-to-video (Veo 3) and image-to-video (Veo 2)
- ✓ Asynchronous video generation with polling via ai.checkOperation()
- ✓ Aspect ratio options (16:9, 9:16)
- ✓ Negative prompt support (Veo 3 only)
- ✓ Videos automatically downloaded locally
- ✓ Text-to-video: 8 seconds fixed, 720p, 24fps (Veo 3)
- ✓ Image-to-video: 5-8 seconds variable, 720p, 30fps (Veo 2)
- ✓ Cost warnings prominently displayed
- ✓ Requires Gemini API key with Veo model access AND billing enabled
Sample input (text-to-video):
{
  "prompt": "A time-lapse of a flower blooming in a garden, with soft morning light",
  "aspectRatio": "16:9",
  "negativePrompt": "cartoon, animated, low quality"
}

Sample input (image-to-video):
{
  "prompt": "Camera slowly zooms in while petals gently sway in the breeze",
  "imageBase64": "iVBORw0KGgoAAAANS...",
  "imageMimeType": "image/png",
  "aspectRatio": "16:9",
  "negativePrompt": "blurry, distorted"
}

Sample output:
{
  "videoUrl": "https://storage.googleapis.com/...",
  "videoPath": "./output-1737123456789.mp4",
  "metadata": {
    "duration": 8,
    "resolution": "720p",
    "fps": 24,
    "aspectRatio": "16:9"
  }
}

# Set your API key
export GEMINI_API_KEY="your-api-key-here"
# Start Genkit DevUI
genkit start -- npx tsx --watch src/index.ts
# Access the UI at http://localhost:4000

- BILLING REQUIRED: Video generation is NOT free; every request costs money
- Video generation is asynchronous and requires polling (may take several minutes)
- Text-to-video (Veo 3): Fixed 8 seconds, 720p, 24fps
- Image-to-video (Veo 2): Variable 5-8 seconds, 720p, 30fps
- Video URLs are temporary and expire after 24 hours
- Not all API keys have access to Veo models by default
- Image-to-video requires a starting image in base64 format
- Veo 3 does NOT support image-to-video generation
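Because video generation is asynchronous, the flow has to poll until the operation finishes. The loop below is a generic sketch of that pattern: the `Operation` shape is a simplified assumption, and the injected `check` function stands in for ai.checkOperation(), so nothing here should be read as Genkit's exact types or signatures.

```typescript
// Simplified, assumed shape of a long-running operation.
interface Operation<T> {
  done: boolean;
  output?: T;
  error?: { message: string };
}

// Polls until the operation is done, waiting intervalMs between checks
// and giving up after maxAttempts checks.
async function pollUntilDone<T>(
  initial: Operation<T>,
  check: (op: Operation<T>) => Promise<Operation<T>>,
  intervalMs: number = 5000,
  maxAttempts: number = 120
): Promise<T> {
  let op = initial;
  for (let attempt = 0; !op.done; attempt++) {
    if (attempt >= maxAttempts) {
      throw new Error("Timed out waiting for video generation");
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    op = await check(op); // in a Genkit flow this would delegate to ai.checkOperation()
  }
  if (op.error) {
    throw new Error("Video generation failed: " + op.error.message);
  }
  if (op.output === undefined) {
    throw new Error("Operation finished without output");
  }
  return op.output;
}
```

With the defaults, the loop waits up to ten minutes (120 checks, 5 seconds apart), which leaves room for generations that take several minutes. Since every request is billed, a timeout should be treated as a failure to investigate rather than a reason to resubmit automatically.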
- Clone this repository
- Navigate to your language-specific documentation (e.g., docs/nodejs/gemini.md)
- Follow the detailed setup and implementation guides
- Test scenarios using the provided prompts and examples
When adding new scenarios:
- Update this README with the scenario details
- Add comprehensive documentation to the appropriate language guide
- Include working code examples and test results
- Mark the status appropriately