PromptLab: Convert Images & Videos to AI Prompts

Yesterday, a small build landed on r/PromptEngineering that does something most prompt tools don’t attempt. PromptLab is a free, open-source Chrome extension built by u/Puzzleheaded-End2493 that converts visual references into AI generation prompts. Feed it an image or a video file, and it writes the prompt for you. No manual description required. No staring at a clip trying to articulate camera angles, lighting conditions, and motion style in plain text while the words come out wrong anyway.

Here’s the part worth paying attention to: for local video files, it doesn’t guess from a single screenshot. It extracts key frames across the clip and builds a Seedance 2.0-style video generation prompt from what it finds. That’s reverse engineering a video’s visual DNA into something reusable. If you’ve ever tried to recreate a specific visual style and ended up with something that only vaguely resembles the reference, this is the tool that addresses exactly that gap.
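To make the key-frame idea concrete, here is a minimal sketch of one way to spread frames across a clip. PromptLab's actual sampling strategy isn't described in the post, so the even spacing, the function name `key_frame_timestamps`, and the frame count are all assumptions for illustration.

```python
def key_frame_timestamps(duration_s: float, frame_count: int) -> list[float]:
    """Return frame_count timestamps (in seconds) spread evenly across a clip.

    Assumption: even spacing. Each timestamp is the midpoint of an
    equal-length segment, which skips the very first and last instants
    of the clip (often fades or black frames).
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    if frame_count <= 0:
        raise ValueError("frame_count must be positive")
    segment = duration_s / frame_count
    return [round(segment * (i + 0.5), 3) for i in range(frame_count)]


if __name__ == "__main__":
    # A 12-second clip sampled at 4 points.
    print(key_frame_timestamps(12.0, 4))  # [1.5, 4.5, 7.5, 10.5]
```

A real extractor might instead pick frames at scene changes; even spacing is just the simplest approach that still covers the whole timeline rather than one frozen moment.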

What the Extension Actually Does

Three input types work right now: web images pulled from the browser, local image files, and local video files. Full online video extraction (think YouTube links) isn’t supported yet, so you’re working with files already on your machine. That’s a reasonable constraint for a first release, and local files are where most serious creative work lives anyway. Most reference material that actually matters to your workflow is already saved somewhere on your drive.

The extension runs entirely on your own Gemini API key, stored locally in the extension settings. Your frames still go to Google’s Gemini API for analysis, but according to the author nothing passes through any other third-party server along the way. That’s a genuine privacy win compared to most tools in this space, where your reference material quietly ends up on an intermediary’s infrastructure. For anyone working with unreleased footage, client content, or competitive research, that distinction matters more than most feature comparisons.

The creative loop this enables is the interesting part. Most people work forward: write a prompt, generate something, adjust, repeat. That process puts all the descriptive burden on you upfront. PromptLab runs that backwards. Start from a finished visual reference, extract what its underlying prompt logic might look like, then generate from there. You’re not guessing at language anymore. You’re reading what a model would have needed to produce something close to what you’re holding. It’s a completely different way into the workflow, and it collapses the gap between ‘I know what I want’ and ‘I can communicate that to a model.’

Step-by-Step Workflow

Install PromptLab directly from GitHub (free, open source, setup takes under 2 minutes). Clone the repo, load it as an unpacked extension in Chrome developer mode, and you’re running.
🎬 Load your source file: a web image, local image, or local video. For video, shorter clips with clear visual variety tend to produce richer prompts since there’s more for the frame extractor to work with across the timeline.
🔑 Enter your Gemini API key in the extension settings. The key stays in your browser, and the only account you need is the Google account behind the API key.
⚙️ For videos, the extension pulls key frames and sends them to Gemini for analysis. The model looks at composition, color grading, motion cues, and subject detail across multiple moments in the clip, not just one frozen frame.
📝 Get back a structured Seedance 2.0-style prompt, ready to paste into a video model. The output covers camera movement, visual style, subject description, and pacing, a structure that carries over to other video generators with light edits.
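The "send frames to Gemini" step in the workflow above can be sketched roughly like this. It is not PromptLab's code: it assumes the public Gemini REST `generateContent` request shape (a text part plus base64 `inline_data` image parts), and the helper name `build_gemini_request` and the instruction wording are hypothetical. The sketch only builds the request body and makes no network call.

```python
import base64
import json

# Assumption: the public Gemini REST endpoint pattern. The model name is left
# as a placeholder for whichever Gemini model your key has access to.
GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent"
)


def build_gemini_request(
    frames: list[bytes], instruction: str, mime_type: str = "image/jpeg"
) -> dict:
    """Build a generateContent body: one text part, then one part per frame."""
    parts: list[dict] = [{"text": instruction}]
    for frame in frames:
        parts.append(
            {
                "inline_data": {
                    "mime_type": mime_type,
                    # Image bytes travel as base64 text inside the JSON body.
                    "data": base64.b64encode(frame).decode("ascii"),
                }
            }
        )
    return {"contents": [{"parts": parts}]}


if __name__ == "__main__":
    # Stand-in bytes; real frames would be JPEG data from the extracted frames.
    fake_frames = [b"frame-one", b"frame-two"]
    body = build_gemini_request(
        fake_frames,
        # Hypothetical instruction, not the extension's actual wording.
        "Describe camera movement, visual style, subject, and pacing "
        "across these frames as a single video generation prompt.",
    )
    print(json.dumps(body)[:120])
```

To actually send it, you would POST the JSON to the endpoint with your key in the `x-goog-api-key` header. Putting several frames in one request is what lets the model compare moments across the clip instead of describing a single still.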

Pro Tips

The output is framed for Seedance 2.0, but the author confirmed the prompts are adjustable for other video generators like Kling or Hailuo. Think of it as a starting draft, not a locked format. A quick pass to add your specific subject or tweak the motion language and it slots cleanly into most generators without much rework.
The strongest use case is reference analysis. Drop a competitor’s ad, a viral clip, or a scene you want to recreate. Instead of describing it from memory, let the extension extract the visual structure for you. That alone cuts a lot of guesswork out of the creative process. For ad teams specifically, this turns every strong-performing reference video into a reusable prompt template, which is a workflow that previously required a dedicated prompt engineer or a painful amount of trial and error to get right.
The free Gemini API tier covers light usage and is generous for single clips. Each video sends multiple frames, though, so batch runs burn through quota faster than you’d expect. Check your usage before a heavy session turns into a surprise.
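The "starting draft" idea from the first tip can be sketched as a small template: keep the extracted camera, style, and pacing, and swap in your own subject. The four fields mirror what the article says the output covers; the `VideoPrompt` class, the labeled-line layout, and the example values are assumptions for illustration, not the extension's real output format.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VideoPrompt:
    """Hypothetical container for the four aspects the output is said to cover."""

    subject: str
    camera_movement: str
    visual_style: str
    pacing: str

    def render(self) -> str:
        # Assumed layout: one labeled line per aspect.
        return "\n".join(
            [
                f"Subject: {self.subject}",
                f"Camera: {self.camera_movement}",
                f"Style: {self.visual_style}",
                f"Pacing: {self.pacing}",
            ]
        )


if __name__ == "__main__":
    # Illustrative values standing in for a prompt extracted from a reference clip.
    extracted = VideoPrompt(
        subject="a runner on a beach",
        camera_movement="slow dolly-in",
        visual_style="warm golden-hour grade",
        pacing="steady, unhurried",
    )
    # Keep the reference's look and motion, change only the subject.
    adapted = replace(extracted, subject="a cyclist on a coastal road")
    print(adapted.render())
```

Keeping the aspects separate makes it easy to reuse one reference's camera and style language across many subjects, which is the reusable-template workflow described above.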

For anyone doing video ad work or content repurposing, the reverse-prompt approach is genuinely underexplored. You’re not describing what you want from scratch. You’re showing the model what you want and asking it to translate. The difference in precision is significant, especially when you’re trying to match a specific visual style or recreate a particular aesthetic without access to the original project files. Working backwards from a reference is almost always faster than working forwards from a blank prompt.

🔗 The full project is on GitHub and the author posted a short demo video. Check both before you decide if it fits your workflow.

Source: “I built a free image video to prompt Chrome extension (open source)” by u/Puzzleheaded-End2493 in r/PromptEngineering
