Most AI tools wait for you to ask. You type a prompt, you get an answer, and then you type another one. This walkthrough from Jeff Su, a creator who’s really good at making messy tech feel simple, flips that whole idea on its head with Google’s new Gemini Spark. Instead of a chatbot that reacts, Spark is an agent that goes and does the work on its own. I was genuinely impressed by how low the barrier to entry is.
Here’s the pattern break: the regular Gemini chat is reactive, while Gemini Spark is proactive. Su frames it well. Gemini chat can find and summarize a file in your Drive, but it fumbles on basic stuff like renaming files or moving them to the right folder. Spark can watch a folder, spot a new upload, rename it based on what’s inside, and drop it in the correct place. Your Drive basically cleans itself.
Old way vs new way
The contrast shows up everywhere in his demo, and it’s the clearest way to understand why Spark matters.
- Reports: Gemini chat writes a great report, but you have to ask every single time. Spark builds that same report on a daily or weekly schedule without you lifting a finger.
- Email: Chat can summarize your inbox, but the formatting changes each time and it can’t archive or label anything. Spark runs a triage workflow on its own, formats summaries the way you like, suggests labels, and archives the junk.
- Context: Chat answers what’s in front of it. Spark pulls the right context from across Gmail, Calendar, Drive, and Search, then takes action.
Su uses a mental model I loved. Imagine the search bars from Drive, Calendar, Gmail, and Google Search all merged into one box, with a virtual assistant behind it who can actually act on your behalf. That’s Spark.
The four capabilities that matter
Su breaks Spark down into four jobs, and each one builds on the last.
- Connect the dots. He asks Spark to build a spreadsheet of every World Cup match, then pull predictions from a recent meeting transcript and add them in. Spark searches the web for the schedule, digs into his calendar for the transcript, and stitches it all together without him opening a single app. In another example, he says “review my latest email thread with Austin” and Spark figures out which Austin, reads the thread, reuses the time slot from their last meeting, and drafts a calendar invite for approval.
- Follow your templates. Build a template once, point Spark at it, and it produces a consistent deliverable every time. He drops a messy sales-team update into Spark, tells it to follow his weekly report template, and gets back a clean leadership brief. No re-explaining.
- Create repeating playbooks (Skills). This is where it clicks. A Skill is a fixed set of instructions that turns the same kind of input into the same kind of output. His waffle-iron analogy nails it: your input is the batter, the Skill is the iron, the output is the waffle. Once he saves a “weekly report” Skill, he just types a slash command in a new chat to run it. Skills can even be exported and shared with teammates as a simple file.
- Automate your routines (scheduled tasks). A Skill still needs you to run it. A scheduled task runs itself, either on a timer or when something happens. He sets a weekly report to fire every Monday at 9:30 a.m. For an event trigger, he tells Spark that whenever an email lands from the meeting-notes address, analyze the transcript, brief him, and draft a recap. Then Spark just sits and waits.
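To make the Skill vs. scheduled-task split concrete, here is a minimal Python sketch. It is not Spark's API or how Spark works internally. The template fields, the function names, and the `meeting-notes@example.com` address are all assumptions made up for illustration. It only shows the three ideas from the list above: a fixed template (the iron), a timer that finds the next Monday at 9:30 a.m., and an event check on the sender address.

```python
from datetime import datetime, timedelta
from string import Template

# Hypothetical "Skill": a fixed template that turns the same kind of input
# into the same kind of output (the waffle iron in the analogy).
WEEKLY_REPORT = Template("Weekly report: $team\nWins: $wins\nRisks: $risks\n")


def run_skill(skill: Template, batter: dict) -> str:
    """Pour the input (batter) into the Skill (iron) and return the output."""
    return skill.substitute(batter)


def next_run(now: datetime, weekday: int = 0, hour: int = 9, minute: int = 30) -> datetime:
    """Timer trigger: next occurrence of weekday at hour:minute after now (Monday is 0)."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    if candidate <= now:
        # This week's slot has already passed, so move to next week.
        candidate += timedelta(days=7)
    return candidate


def matches_trigger(sender: str, watched_address: str) -> bool:
    """Event trigger: fire only when the email comes from the watched address."""
    return sender.strip().lower() == watched_address.strip().lower()


if __name__ == "__main__":
    print(run_skill(WEEKLY_REPORT, {"team": "Sales", "wins": "2 deals", "risks": "none"}))
    print(next_run(datetime.now()))
    # The address below is a made-up placeholder, not one from the video.
    print(matches_trigger("Meeting-Notes@example.com", "meeting-notes@example.com"))
```

The point of the sketch is the division of labor: the Skill never changes, only the input does, and the trigger decides when it runs so you don't have to.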
Set it up in two moves
Before any of this works, Su says to do two things:
- Turn on memory. Go to Gemini settings, open personal intelligence, and switch on memory so Spark learns how you work. Then check connected apps and make sure Google Workspace and Search are enabled. (Enterprise accounts see only the connected apps tab.)
- Create a home base. Make a folder called “Spark OS” in the root of your Drive. He builds sub-folders inside it for templates and temp outputs, so everything Spark generates has a place to live.
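If you keep a locally synced copy of your Drive, the folder layout can be sketched in a few lines of Python. This is only an illustration under assumptions: the video creates the folders in Drive itself, the sub-folder names `templates` and `temp-outputs` are my guesses at a reasonable naming, and `create_home_base` is a hypothetical helper, not anything Spark provides.

```python
from pathlib import Path


def create_home_base(drive_root: Path) -> list[Path]:
    """Create a "Spark OS" folder with two sub-folders under drive_root."""
    base = drive_root / "Spark OS"
    # Sub-folder names are assumptions; pick whatever naming suits you.
    subfolders = [base / "templates", base / "temp-outputs"]
    for folder in subfolders:
        # parents=True also creates "Spark OS"; exist_ok=True makes reruns safe.
        folder.mkdir(parents=True, exist_ok=True)
    return subfolders


if __name__ == "__main__":
    # Placeholder path for illustration; point this at your own synced Drive root.
    for path in create_home_base(Path("drive-root")):
        print(path)
```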
The honest trade-offs
What I appreciated is that Su doesn’t oversell it. His verdict:
- Spark is the most beginner-friendly option because it’s already baked into the Google tools you use, so setup is minimal.
- It runs fully in the cloud, meaning scheduled tasks fire even when your laptop is off. That’s a real edge over Claude Cowork, Claude Code, and Codex.
- The weaknesses: not many external tools connect yet, the memory is a bit of a black box, and you can’t pick the model, so output can be inconsistent. He also shows a power-user move: create a “Spark.md” rules file you control and tell Spark to read it before responding, which forces your preferences instead of trusting the black box.
He even asked the head of Gemini at Google whether power users will get more control, and the answer was a clear yes, both on the markdown side and the Skills side. So this is only going to get deeper.
If you’ve been curious about AI agents but found Codex or Claude Code too intimidating, this is the gentlest on-ramp I’ve seen. Watch the full video for the exact prompts and the step-by-step Skill setup, then go build your own Spark OS folder and try one workflow today. 🚀