Open Source AI Memory Tool: Multi-Hop Reasoning, Connected Facts

A developer just shipped a tool that targets one of the most frustrating gaps in modern AI assistants: the inability to connect related facts across conversations. The project landed on Hacker News as a “Show HN” post with 161 points, and the launch pairs a working memory system with a benchmark designed specifically to measure how well AI tools handle connected facts.

The pitch is straightforward. Most AI memory layers store snippets of information but fail when you ask the assistant to reason across them. You tell it on Monday that your team uses Postgres. On Tuesday you mention you’re hiring a backend engineer. By Friday, when you ask “what database experience should the new hire have,” the assistant draws a blank. This tool aims to fix exactly that.

What’s in the box

The installer runs a four-step setup that takes the guesswork out of plugging memory into an existing AI client:

1. spaCy model download. It pulls en_core_web_sm, the lightweight English NLP model spaCy uses for entity recognition and parsing. That’s the engine that identifies which facts are worth linking.
2. Database initialization. A local store gets provisioned to hold the memory graph. No cloud account required.
3. MCP config injection. The setup writes Model Context Protocol settings into ~/.claude/settings.json, wiring the tool directly into Claude Code.
4. Memory rules. It appends behavior rules to ~/.claude/CLAUDE.md so the assistant actually uses the new memory layer instead of ignoring it.

Restart your AI client and you’re done.
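To make the last two setup steps concrete, here is a minimal Python sketch of what merging a server entry into a JSON settings file and appending a rules block might look like. It is not the project's installer. The "mcpServers" key, the server name "memory-tool", the module path "memory_tool.server", and the marker comment are all assumptions chosen for illustration, and the demo writes to a temporary directory rather than to ~/.claude.

```python
import json
import tempfile
from pathlib import Path


def inject_mcp_server(settings_path, name, command, args):
    """Merge one MCP server entry into a JSON settings file, keeping other keys."""
    path = Path(settings_path)
    text = path.read_text() if path.exists() else ""
    settings = json.loads(text) if text.strip() else {}
    # "mcpServers" is an assumed key name; check your client's documentation.
    servers = settings.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings


def append_rules_once(rules_path, rules, marker):
    """Append a rules block to a text file unless the marker is already there."""
    path = Path(rules_path)
    existing = path.read_text() if path.exists() else ""
    if marker in existing:
        return False  # already installed, so a second run changes nothing
    separator = "\n\n" if existing else ""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(existing.rstrip("\n") + separator + marker + "\n" + rules + "\n")
    return True


# Demo against a throwaway directory (hypothetical names throughout).
with tempfile.TemporaryDirectory() as tmp:
    settings_file = Path(tmp) / "settings.json"
    settings_file.write_text(json.dumps({"theme": "dark"}))
    result = inject_mcp_server(
        settings_file, "memory-tool", "python", ["-m", "memory_tool.server"]
    )

    rules_file = Path(tmp) / "CLAUDE.md"
    marker = "<!-- memory-tool rules -->"
    first_run = append_rules_once(rules_file, "Consult stored memory first.", marker)
    second_run = append_rules_once(rules_file, "Consult stored memory first.", marker)
    rules_text = rules_file.read_text()

print(json.dumps(result, indent=2))
```

Both helpers are written to be safe to run twice: the settings merge preserves unrelated keys, and the rules append is skipped when its marker is already present.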

Why the benchmark matters

Shipping a memory tool is one thing. Shipping a benchmark to prove it works is the move that caught the Hacker News crowd’s attention. Existing memory systems for LLMs, including OpenAI’s built-in ChatGPT memory and tools like Mem0 or Letta, mostly get evaluated on retrieval accuracy: can the model recall a fact you told it earlier? That’s the easy case.

The hard case is multi-hop reasoning over stored facts. If you stored A and B separately, can the assistant infer C? Most memory systems quietly fail here, and there hasn’t been a clean way to measure how badly. A purpose-built benchmark gives developers something to optimize against and gives users a way to compare tools honestly.
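The gap between the easy case and the hard case can be shown with a toy example. The sketch below is not the project's implementation or its benchmark. It assumes facts are stored as (subject, relation, object) triples, and the fact list, function names, and hop limit are hypothetical. A lookup that needs one stored fact to mention both entities comes back empty, while a breadth-first search over shared entities finds the chain that links them.

```python
from collections import defaultdict, deque

# Hypothetical facts captured on different days, as (subject, relation, object).
FACTS = [
    ("team", "uses", "Postgres"),
    ("team", "is_hiring", "backend engineer"),
]


def direct_lookup(facts, a, b):
    """Single-fact retrieval: succeeds only if one fact mentions both entities."""
    return [f for f in facts if a in (f[0], f[2]) and b in (f[0], f[2])]


def build_graph(facts):
    """Index facts as an undirected graph so shared entities connect them."""
    graph = defaultdict(list)
    for subj, rel, obj in facts:
        graph[subj].append((rel, obj))
        graph[obj].append((rel, subj))
    return graph


def find_path(graph, start, goal, max_hops=3):
    """Breadth-first search; returns the entities linking start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        if len(path) - 1 >= max_hops:
            continue  # do not expand beyond the hop limit
        for _rel, neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(FACTS)
single_hop = direct_lookup(FACTS, "backend engineer", "Postgres")
multi_hop = find_path(graph, "backend engineer", "Postgres")
print(single_hop)  # no single stored fact links the two
print(multi_hop)   # the chain runs through the shared entity "team"
```

A benchmark for connected facts would, in this framing, score an assistant on questions whose answer requires the second kind of lookup rather than the first.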

How it stacks up

The AI memory space is getting crowded fast. ChatGPT’s memory feature is closed and limited to one platform. Mem0 and Letta offer open frameworks but lean on vector similarity, which is good for retrieval and weak on connection. Anthropic’s MCP standard, which this tool plugs into, is becoming the default way to add capabilities to Claude and a growing list of compatible clients.

The focus on connected facts puts this project in a different lane. It’s not trying to remember more. It’s trying to reason better across what’s already remembered.

Who can use it

The tool ships open source with a one-command install. The default integration targets Claude Code, but because it speaks MCP, any MCP-compatible client should work with minor config changes. There’s no pricing tier mentioned, no signup, no waitlist. Clone, install, restart, and the memory layer is live locally on your machine.

What stands out here is the combination: a working tool plus a benchmark that lets the community actually measure progress on a problem everyone has been hand-waving about. Expect other memory frameworks to start publishing scores against it within weeks. Full setup notes and the benchmark methodology are at the original Hacker News post.
