Picture a Sims player dropping furniture into an empty lot. They do not start with the throw pillows. They start with the foundation, then the walls, then the room layout, and only after the space makes spatial sense do they layer in decoration. Skip that order and the bedroom ends up inside the kitchen. The couch floats. The lighting points at nothing. The whole build feels off even if each individual piece looks fine in isolation.
That same logic applies directly to prompts. u/withAuxly on r/PromptEngineering picked the idea up from an AI researcher who framed it as a layered hierarchy, and reports that it has been killing drift in long generations ever since. The reason it works is not magic. It is sequence. Models tend to treat what comes first as the interpretive anchor for everything that follows. Feed a model garnish first and it builds the world around throw pillows.
🏗️ The Sims Framework, Mapped to Your Prompt
Think of your prompt like a Sims build: you are not describing a scene, you are architecting a world the AI can inhabit before it starts generating. Here are the five layers, in order:
- 🌍 Domain (the physics and vibe): “Cinematic 35mm, high-contrast lighting, brutalist architecture.” This is the terrain and atmosphere. The rules of the world before any object exists. Think of it as the physics engine running in the background. Every creative decision downstream gets constrained and shaped by what you define here.
- Building (the core object): “A lone concrete tower in a desert.” Your main subject, placed before any relationships are defined. One subject. One sentence. Resist the urge to describe it yet. Just plant it.
- Relations (how things interact): “Sand piling against the north wall. Shadows stretching toward the camera.” This is where drift dies. The AI now understands spatial logic before it starts painting. Without this layer, the model invents its own spatial relationships, and those invented relationships are where inconsistency lives.
- Camera (the observer): “Low-angle shot, wide lens, looking up.” The viewpoint from which everything gets rendered. This layer tells the model where the reader or viewer is standing relative to the scene, which determines scale, emphasis, and what detail matters.
- ✨ Garnish (the tiny finishing touches): “Dust motes in the light, a single cracked window.” The throw pillow. Always last, never first. This is the layer most people write first, which is exactly why most prompts drift.
When these five layers appear in sequence, the AI processes spatial relationships before visual details. Color bleed between subjects drops sharply. The “desert bleeding into the concrete” problem that ruins image generation outputs mostly disappears. The model is not guessing at structure anymore. You gave it a blueprint before you asked it to build.
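The layer order above can be sketched as a small Python data structure. This is a minimal illustration, not something from the original post: the `LayeredPrompt` class and its `render` method are hypothetical names, and the example strings are the ones quoted in the list above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class LayeredPrompt:
    """Five prompt layers; field order is the order they are emitted in."""
    domain: str      # world rules: style, lighting, atmosphere
    building: str    # the single core subject
    relations: str   # where things sit relative to each other
    camera: str      # the observer's viewpoint
    garnish: str     # small finishing details, always last

    def render(self) -> str:
        # dataclasses.fields() preserves declaration order, so Domain is
        # always emitted first and Garnish always last.
        return "\n".join(
            f"{f.name.capitalize()}: {getattr(self, f.name)}"
            for f in fields(self)
        )


prompt = LayeredPrompt(
    domain="Cinematic 35mm, high-contrast lighting, brutalist architecture.",
    building="A lone concrete tower in a desert.",
    relations="Sand piling against the north wall. Shadows stretching toward the camera.",
    camera="Low-angle shot, wide lens, looking up.",
    garnish="Dust motes in the light, a single cracked window.",
)
print(prompt.render())
```

Because the order lives in the class definition rather than in the call site, a prompt built this way cannot accidentally lead with Garnish.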
🛠️ How to Apply It
Replace your one-paragraph prompt with five labeled sections in this exact order. You do not need a wall of text in each layer. One sharp sentence per section is enough. The key is sequence: world rules before the object, spatial logic before visual texture, perspective before fine detail.
Label them explicitly: Domain, Building, Relations, Camera, Garnish. Most AI models respond well to clear structural labels. You are giving the model a construction blueprint, not a description to interpret. If you want to test this, take an existing prompt that has been producing inconsistent outputs and split it into these five buckets without changing a single word. Just reorder. The difference in output consistency is often immediate and significant enough to change how you write every prompt going forward.
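The "just reorder" test can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: `reorder_by_layer` is a hypothetical helper, and the layer tag on each sentence is assigned by hand, since deciding which layer a sentence belongs to is the human part of the exercise.

```python
LAYER_ORDER = ["Domain", "Building", "Relations", "Camera", "Garnish"]


def reorder_by_layer(tagged):
    """Group (layer, sentence) pairs under labels, in framework order.

    Sentences are copied verbatim; only their position changes.
    """
    buckets = {layer: [] for layer in LAYER_ORDER}
    for layer, sentence in tagged:
        if layer not in buckets:
            raise ValueError(f"unknown layer: {layer!r}")
        buckets[layer].append(sentence)
    # Skip layers with no sentences rather than emitting empty labels.
    return "\n".join(
        f"{layer}: {' '.join(buckets[layer])}"
        for layer in LAYER_ORDER
        if buckets[layer]
    )


# A garnish-first prompt, tagged by hand.
tagged_prompt = [
    ("Garnish", "Dust motes in the light, a single cracked window."),
    ("Camera", "Low-angle shot, wide lens, looking up."),
    ("Building", "A lone concrete tower in a desert."),
    ("Relations", "Sand piling against the north wall."),
    ("Domain", "Cinematic 35mm, high-contrast lighting, brutalist architecture."),
    ("Relations", "Shadows stretching toward the camera."),
]
print(reorder_by_layer(tagged_prompt))
```

Running the original and the reordered version against the same model, with the same settings, is the fairest way to see whether the ordering alone changes consistency.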
One common mistake: stuffing the Relations layer with adjectives instead of spatial logic. “Beautiful golden sand” belongs in Domain or Garnish. “Sand drifting against the base of the north wall and thinning toward the east face” belongs in Relations. The question to ask yourself is: does this sentence define where something is relative to something else? If yes, it goes in Relations. If it describes how something looks in isolation, it goes somewhere else.
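The "where is it relative to something else?" question can be roughly approximated in Python with a keyword check. This is a crude heuristic, not a real classifier: the `SPATIAL_CUES` word list and the `looks_relational` function are assumptions made for illustration, the list is deliberately incomplete, and words like "over" or "right" will produce false positives.

```python
import re

# Hypothetical, deliberately incomplete list of spatial cue words.
SPATIAL_CUES = {
    "against", "toward", "towards", "behind", "beneath", "below", "above",
    "between", "beside", "under", "over", "near", "inside", "outside",
    "left", "right", "north", "south", "east", "west", "base",
}


def looks_relational(sentence: str) -> bool:
    """Return True if the sentence contains at least one spatial cue word."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return bool(words & SPATIAL_CUES)


print(looks_relational("Beautiful golden sand"))  # False
print(looks_relational(
    "Sand drifting against the base of the north wall "
    "and thinning toward the east face"
))  # True
```

A sentence that fails the check is a candidate for Domain or Garnish; the final call still belongs to whoever is writing the prompt.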
💡 Use Cases
- Image generation: Stop color bleed between subjects. Use the Relations layer to anchor each element spatially before the diffusion model starts rendering pixels. If you have two characters in a scene, define their position relative to each other and to the background before you describe clothing or expression.
- Long-form writing: Map genre and tone to Domain, the protagonist to Building, conflict dynamics to Relations, narrative voice to Camera, and sensory texture to Garnish. A thriller chapter prompt built this way produces tighter pacing because the AI knows the tension architecture before it writes a single line of dialogue.
- Code generation: Tech stack and constraints become Domain, the core module is Building, data flow and dependencies are Relations, the API surface is Camera, and edge cases are Garnish. This stops the model from solving edge cases before it understands the data flow, a common source of subtle bugs in AI-generated code.
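The mappings above can be collected into one lookup table in Python that produces a labeled skeleton to fill in. This is a sketch only: `LAYER_MAP` and `skeleton` are hypothetical names, and the "image" row reuses the layer descriptions from the five-layer list, since the image-generation use case does not spell out its own mapping.

```python
LAYERS = ("Domain", "Building", "Relations", "Camera", "Garnish")

# What each layer stands for in each use case, as described in the text.
LAYER_MAP = {
    "image": {
        "Domain": "the physics and vibe",
        "Building": "the core object",
        "Relations": "how things interact",
        "Camera": "the observer",
        "Garnish": "the tiny finishing touches",
    },
    "writing": {
        "Domain": "genre and tone",
        "Building": "the protagonist",
        "Relations": "conflict dynamics",
        "Camera": "narrative voice",
        "Garnish": "sensory texture",
    },
    "code": {
        "Domain": "tech stack and constraints",
        "Building": "the core module",
        "Relations": "data flow and dependencies",
        "Camera": "the API surface",
        "Garnish": "edge cases",
    },
}


def skeleton(use_case: str) -> str:
    """Return five labeled, empty sections for the given use case."""
    meanings = LAYER_MAP[use_case]
    return "\n".join(f"{layer} ({meanings[layer]}):" for layer in LAYERS)


print(skeleton("code"))
```

The labels stay identical across use cases; only the hint in parentheses changes, which keeps the sequence the same whatever is being generated.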
🎯 Build Your Next Prompt From the Foundation Up
Take a prompt you have already written. Paste it into a doc and tag each sentence with one of the five layers. You will likely find most of your sentences are Garnish sitting on top of a missing foundation. Reorder them. The outputs will change noticeably.
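Once each sentence carries a tag, the audit itself is simple counting. The Python sketch below assumes the tags have been assigned by hand; the `audit` function and the sample tags are hypothetical and only illustrate the check.

```python
from collections import Counter

LAYER_ORDER = ["Domain", "Building", "Relations", "Camera", "Garnish"]


def audit(tags):
    """Return (missing layers, share of sentences tagged Garnish)."""
    counts = Counter(tags)
    # Counter returns 0 for keys it has never seen.
    missing = [layer for layer in LAYER_ORDER if counts[layer] == 0]
    garnish_share = counts["Garnish"] / len(tags) if tags else 0.0
    return missing, garnish_share


# One tag per sentence of an existing prompt, in its original order.
tags = ["Garnish", "Garnish", "Building", "Garnish", "Camera"]
missing, share = audit(tags)
print(missing)  # ['Domain', 'Relations']
print(share)    # 0.6
```

A high Garnish share combined with a missing Domain or Relations layer is the pattern the exercise is meant to surface.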
Do this exercise with three prompts in a row and the layered hierarchy stops feeling like a framework and starts feeling like the only logical way to write a prompt. You will catch yourself instinctively reaching for Domain before you write anything else. That instinct is the whole point.
The shift from “describing a picture” to “architecting a scene” is small. The consistency gain is not.
Source: “prompting like a ‘sims’ player: a framework for zero-drift outputs” by u/withAuxly in r/PromptEngineering