Giving AI Agents Long-Term Memory for Large Repos

Managing a large codebase with an AI agent often feels like the movie 50 First Dates: the agent forgets the project structure the moment the session ends, then wastes time and tokens rediscovering where files live. u/K_Kolomeitsev recently shared a methodology for breaking this cycle of amnesia.

The author found that for a platform with over 100 microservices, the context window alone isn’t enough. Instead of forcing the model to re-read the map in every session, the author moved “project memory” out of the context window and onto disk.

The Twist: The “Why” Matters More Than the “What”

The core of this project, called the Data Structure Protocol (DSP), involves placing a small .dsp/ folder in the repo. It acts as a structural index. While dependency graphs are common, the creator found that knowing what imports what is only half the battle. The real value comes from documenting why a dependency exists, which allows the agent to perform safe impact analysis without breaking downstream code.
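To make the "why" idea concrete, here is a minimal sketch of impact analysis over dependency edges that carry a reason. It is an illustration only, not the DSP implementation: the `Dependency` record, the `impact_of_change` helper, and every service and symbol name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    """One edge in the graph. All field names here are hypothetical."""
    consumer: str  # entity that uses the symbol
    provider: str  # entity that exposes the symbol
    symbol: str    # what is being used
    why: str       # short note on why the consumer needs it


def impact_of_change(edges, provider, symbol):
    """Return (consumer, why) pairs for everything that uses provider.symbol."""
    return [
        (edge.consumer, edge.why)
        for edge in edges
        if edge.provider == provider and edge.symbol == symbol
    ]


# Made-up example data, not taken from the original post.
edges = [
    Dependency("billing-service", "auth-lib", "verify_token",
               "validates the session before charging a card"),
    Dependency("report-service", "auth-lib", "verify_token",
               "checks the admin role before exporting data"),
    Dependency("signup-service", "auth-lib", "hash_password",
               "stores credentials for new accounts"),
]

# Before changing verify_token, list who depends on it and for what purpose.
for consumer, why in impact_of_change(edges, "auth-lib", "verify_token"):
    print(f"{consumer}: {why}")
```

A plain import graph would only say that two services depend on `auth-lib`. With the reason attached, an agent can judge whether a planned change to `verify_token` would break what each consumer actually relies on.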

How It Works 📂

The system models the repository as a graph of entities. For each important file or module, the protocol uses a few text files to create a map:

  1. Description: A file detailing where the entity lives, its function, and its purpose.
  2. Imports: A list of dependencies the entity relies on.
  3. Shared/Exports: This is the secret sauce. It lists what is public and who uses it, but adds a short note explaining why the consumer needs it.
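The three files above could be written and read back roughly as follows. This is a sketch under stated assumptions: the post does not give the exact file names or line format, so the `description`, `imports`, and `shared` file names, the `symbol <- consumer: why` line format, and both helper functions are hypothetical.

```python
import tempfile
from pathlib import Path


def write_entity(dsp_root, entity_id, description, imports, shared):
    """Write one entity's three text files under dsp_root/entity_id.

    File names and the line format are assumptions made for this sketch.
    `shared` maps each public symbol to {consumer: reason}.
    """
    entity_dir = Path(dsp_root) / entity_id
    entity_dir.mkdir(parents=True, exist_ok=True)
    (entity_dir / "description").write_text(description + "\n", encoding="utf-8")
    (entity_dir / "imports").write_text(
        "".join(f"{dep}\n" for dep in imports), encoding="utf-8"
    )
    lines = [
        f"{symbol} <- {consumer}: {why}\n"
        for symbol, consumers in shared.items()
        for consumer, why in consumers.items()
    ]
    (entity_dir / "shared").write_text("".join(lines), encoding="utf-8")
    return entity_dir


def read_shared(entity_dir):
    """Parse the shared file back into (symbol, consumer, why) tuples."""
    records = []
    text = (Path(entity_dir) / "shared").read_text(encoding="utf-8")
    for line in text.splitlines():
        symbol, rest = line.split(" <- ", 1)
        consumer, why = rest.split(": ", 1)
        records.append((symbol, consumer, why))
    return records


# Demo in a throwaway directory standing in for a repo root.
with tempfile.TemporaryDirectory() as repo:
    entity = write_entity(
        Path(repo) / ".dsp",
        "auth-lib",
        description="libs/auth: verifies tokens for internal services",
        imports=["crypto-utils", "config-loader"],
        shared={"verify_token": {"billing-service": "validates sessions before charging"}},
    )
    for record in read_shared(entity):
        print(record)
```

Plain text files keep the index easy to diff in code review and cheap for an agent to read one entity at a time, instead of loading the whole map into the context window.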

The Reality Check

I appreciate the author’s honesty about the setup cost. Bootstrapping this on a massive system is neither cheap nor fast. The author suggests starting with the services you touch the most and expanding the map gradually. Once that map is in place, though, navigation becomes significantly faster and cheaper because the agent stops asking “Where am I?” and starts coding.

Get Started

If you want to stop your agents from getting lost, the creator has open-sourced the skeleton of this protocol (including the folder layout and a CLI script). You can find the link to the GitHub repo in the original Reddit discussion.

Source: “How I stopped an AI agent from getting lost in a 100+ microservice repo” by u/K_Kolomeitsev in r/ChatGPTPromptGenius