Forget vague music prompts. This 6-bracket structure actually works

TL;DR: A 6-part bracket formula turns AI music prompts from vague into cinematic. Name, mood, instruments, BPM, vocals, scene. Five tested examples, all surprisingly good.

The Formula

Six brackets. That’s the whole structure:

[Track Name], [Mood], [Instruments + Sound Design], [BPM], [Vocals], [Scene/Context]

Each bracket does a specific job. Name gives the AI an identity to build around. Mood sets emotional direction. Instruments get specific so you avoid the generic “epic orchestral” slop. BPM anchors energy. Vocals on or off is stated outright, so the model doesn’t fall back on whatever it defaults to. Scene locks in narrative context.

The instrument bracket is where most people leave the most on the table. Saying “orchestral” tells the model almost nothing. Saying “deep tympani, low brass stabs, distant choir” gives it a sonic palette to actually work from. Think of it like the difference between telling a chef “make something savory” versus “sear a ribeye with garlic butter and fresh thyme.” One of those gets a meal worth eating.

The scene bracket is the hidden multiplier. It reframes the whole prompt from a musical description into a narrative moment. “Mountain summit drama” tells the model where this music lives in a story. That context bleeds into every decision the model makes about dynamics, build, and release. You’re not asking for music. You’re asking for a specific feeling at a specific point in time.
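As a rough sketch, the six brackets can be treated as fields that get joined into one comma-separated prompt. The `TrackPrompt` class and its field names below are hypothetical, chosen only for illustration, and the mood value is an assumption, since the examples in this post fold mood into the other brackets.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrackPrompt:
    """Hypothetical container for the six brackets (names are illustrative)."""
    name: str
    mood: str
    instruments: List[str]
    bpm: int
    vocals: str
    scene: str

    def render(self) -> str:
        # Join the six brackets in formula order, separated by commas.
        parts = [
            self.name,
            self.mood,
            " + ".join(self.instruments),
            f"{self.bpm} BPM",
            self.vocals,
            self.scene,
        ]
        return ", ".join(parts)


prompt = TrackPrompt(
    name="Dragon's Peak",
    mood="epic and towering",  # assumed mood, not taken from the post
    instruments=["brass", "deep tympani", "choir"],
    bpm=80,
    vocals="no vocals",
    scene="mountain summit drama",
)
print(prompt.render())
```

Keeping instruments as a list makes it easy to swap one texture at a time and see what changes in the result.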

Why It Works

AI music generators need constraints. The vaguer your prompt, the more the model guesses. Guessing means generic.

This structure forces you to think cinematically before you hit generate. You’re not describing music. You’re describing a moment in a story.

There’s also a cognitive benefit for you as the creator. Filling out six specific brackets forces you to make decisions you would have otherwise skipped. Most people sit down with “I want something dramatic and intense” and that’s it. By the time you’ve answered all six brackets, you’ve already done creative direction work. The model is just executing your vision, not inventing one for you.

The BPM bracket in particular does something subtle: it commits you to energy level before the model has a chance to guess wrong. 120 BPM and 46 BPM are both valid. But “dramatic” could mean either one depending on the model’s training distribution. You make the call, not the algorithm.
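One way to hold yourself to all six decisions is a small check that lists which brackets are still empty. This is a minimal sketch, assuming the draft is kept in a plain dictionary and that BPM is written as a bare number; the `unfilled_brackets` helper and the key names are hypothetical.

```python
BRACKETS = ("name", "mood", "instruments", "bpm", "vocals", "scene")


def unfilled_brackets(fields: dict) -> list:
    """Return the brackets that are missing, blank, or (for BPM) not a number."""
    problems = []
    for key in BRACKETS:
        value = fields.get(key)
        text = "" if value is None else str(value).strip()
        if not text:
            problems.append(key)
        elif key == "bpm" and not text.isdigit():
            # "fast" or "dramatic" leaves the tempo for the model to guess.
            problems.append(key)
    return problems


# The typical starting point: a mood and nothing else.
draft = {"mood": "dramatic and intense"}
print(unfilled_brackets(draft))
# → ['name', 'instruments', 'bpm', 'vocals', 'scene']
```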

🎵 Five Real Examples

  • Dragon’s Peak, epic brass + deep tympani + choir, 80 BPM, no vocals, mountain summit drama
  • Spirit Guardian, shakuhachi + taiko + resonant gong, 120 BPM, no vocals, temple guardian combat
  • Time Witch, reverse orchestra + clock motif + glitching choir, 100 BPM, no vocals, time-manipulation boss fight
  • Enchanted World Restored, harp arpeggios + strings + light choir, 80 BPM, no vocals, world-saving victory
  • The Price of Power, solo piano over dissonant strings, 46 BPM, no vocals, anti-hero realization cutscene

That last one is 46 BPM for regret. That level of specificity is what separates interesting AI music from elevator background noise.

Notice how “Spirit Guardian” and “Dragon’s Peak” are both combat-adjacent, but the instrument choices pull them into completely different cultural and emotional spaces. One is ancient East Asian temple combat at 120 BPM. The other is Western fantasy conquest at a heavier 80 BPM. Similar stakes, totally different worlds. That distinction came mostly from the instrument bracket, not from anything the model invented on its own.

“Time Witch” shows you can push into abstract territory. “Reverse orchestra” and “glitching choir” are sound design descriptors as much as instrument choices. AI music tools often respond well to this, plausibly because their training data spans a wide range of production styles. You can describe textures, not just instruments. Granular synth pads. Tape saturation warmth. Lo-fi vinyl crackle. These are worth trying.

Use Cases

  • Game devs prototyping soundtracks before hiring a composer. You can test a dozen moods in an afternoon and walk into that composer conversation with actual references instead of vibes.
  • Content creators who need custom background music without licensing headaches. One prompt, one track. Whether you own it outright and avoid royalty risk depends on your platform’s terms, so check them before you publish.
  • Writers building mood playlists to write to. Different chapters, different emotional registers, all generated to match the specific scene you’re working on.
  • Podcast intro and transition music. You can iterate fast enough to match the exact tone of your show instead of settling for stock library tracks everyone else is already using.

Prompt of the Day

Copy this. Fill in the brackets. Run it in Suno, Udio, or whatever AI music tool you’re testing.

[Your Track Name], [Mood in 2-4 words], [3 specific instruments + sound effects], [BPM number], [no vocals], [scene description in 5-8 words]

If you want to go further, try adding a 7th bracket for era. [1980s synthwave] versus [2010s orchestral hybrid] completely changes how the model interprets everything else in the prompt. A 46 BPM melancholy piano piece hits differently with [late 1990s neo-classical] tagged on the end than it does without it. Era adds historical emotional context, and these models have absorbed enough of that history to use it.
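A minimal sketch of the optional era bracket, assuming the prompt is assembled from a list of already-written bracket strings. The `build_prompt` helper is hypothetical, and the base prompt reuses “The Price of Power” exactly as listed in the examples.

```python
from typing import Optional, Sequence


def build_prompt(brackets: Sequence[str], era: Optional[str] = None) -> str:
    """Join bracket strings with commas, appending an era tag when one is given."""
    parts = [b.strip() for b in brackets]
    if era and era.strip():
        parts.append(era.strip())
    return ", ".join(parts)


# "The Price of Power" as written in the examples list.
base = [
    "The Price of Power",
    "solo piano over dissonant strings",
    "46 BPM",
    "no vocals",
    "anti-hero realization cutscene",
]

print(build_prompt(base))
print(build_prompt(base, era="late 1990s neo-classical"))
```

Generating both versions side by side is the quickest way to hear what the era tag actually contributes.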

One more tweak worth testing: run the same six-bracket prompt twice without changing anything. AI music generation has enough randomness that you’ll often get two meaningfully different interpretations of the same prompt. The formula is a filter, not a lock. Sometimes the second or third generation is the one that actually lands.
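If your tool can be scripted, the repeat-the-prompt idea can be sketched as a loop. Everything here is an assumption: `generate` stands in for whatever call your music tool exposes (no real Suno or Udio API is used), and `fake_generate` exists only so the sketch runs on its own.

```python
import itertools
from typing import Callable, List


def generate_takes(prompt: str, generate: Callable[[str], str], takes: int = 3) -> List[str]:
    """Send the same unchanged prompt `takes` times and collect every result."""
    if takes < 1:
        raise ValueError("takes must be at least 1")
    return [generate(prompt) for _ in range(takes)]


# Stand-in generator (hypothetical): a real tool would return audio or a link.
_counter = itertools.count(1)


def fake_generate(prompt: str) -> str:
    return f"take {next(_counter)}: {prompt}"


for result in generate_takes("Time Witch, 100 BPM, no vocals", fake_generate, takes=2):
    print(result)
```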

Build something worth listening to and drop it in the comments below.

This 6-part bracket structure produces surprisingly good AI music — here are 5 tested examples
by u/Excellent-Way-8707 in ChatGPTPromptGenius
