Sem: A New Primitive for Code Understanding & AI Devs

A developer tool called Sem is making the rounds on Hacker News, where it climbed to 174 points by pitching a different way for machines to understand code. According to the submission, Sem bills itself as a “new primitive for code understanding” that ditches the Language Server Protocol model and instead tracks code as entities layered on top of Git.

That framing is the whole story. Most code intelligence today runs through language servers that speak the Language Server Protocol (LSP), loosely called “LSPs,” the same machinery your editor uses for jump-to-definition, autocomplete, and error squiggles. LSPs are powerful, but they’re built around a live view of your current files. Sem flips the reference point. Instead of asking “what does this file say right now,” it asks “what is this entity, and how has it changed across the project’s history?”

What Sem actually does

Based on the Hacker News post, the core idea breaks down like this:

Entities, not files. Sem treats functions, classes, and other code units as first-class objects you can track, rather than text living inside a document.
Git as the foundation. It builds on your existing commit history instead of standing up a separate language server per project.
History-aware understanding. Because it sits on Git, it can follow how a given entity was added, modified, or removed over time.
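
To make the history-aware idea concrete, here is a minimal sketch of what an entity timeline could look like. The post does not describe Sem’s data model or API, so everything here is an assumption made for illustration: the EntityEvent record, the build_timelines and is_live helpers, the commit ids, and the entity names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityEvent:
    """One change to one code entity in one commit (hypothetical model)."""
    commit: str  # commit id; the short ids below are made up
    entity: str  # entity name, e.g. a function or class
    kind: str    # "added", "modified", or "removed"


def build_timelines(events):
    """Group events by entity, keeping the order in which commits were given."""
    timelines = defaultdict(list)
    for event in events:
        timelines[event.entity].append((event.commit, event.kind))
    return dict(timelines)


def is_live(timeline):
    """An entity still exists if its most recent event is not a removal."""
    return bool(timeline) and timeline[-1][1] != "removed"


# Hypothetical history, oldest commit first.
EVENTS = [
    EntityEvent("a1", "load_config", "added"),
    EntityEvent("a1", "parse_args", "added"),
    EntityEvent("b2", "load_config", "modified"),
    EntityEvent("c3", "parse_args", "removed"),
]

timelines = build_timelines(EVENTS)
for name, history in timelines.items():
    status = "live" if is_live(history) else "retired"
    print(name, status, history)
```

The point of the sketch is the shape of the question: the lookup key is the entity, not the file, and the answer is a sequence of events rather than a single snapshot.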

The demo shared on Hacker News drives the point home with a simple auth refactor diff. In one commit, a function called validateToken gets added, authenticateUser swaps silent null returns for thrown errors and adds rate limiting, and an old legacyAuth function with a raw SQL query gets deleted. To an LSP, that’s just the new state of a file. To Sem, it’s three distinct entity events: one born, one changed, one retired.
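
The “three entity events” reading of that diff can be sketched in a few lines. This is not Sem’s implementation, which the post does not describe. It is a rough illustration that assumes entities are top-level functions and classes, uses Python’s standard ast module to find them, and treats any change in a definition’s syntax tree as a modification. The function names come from the demo; their bodies, and the helpers extract_entities and entity_events, are invented for the example.

```python
import ast


def extract_entities(source):
    """Map each top-level function or class name to a dump of its syntax tree."""
    entities = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # ast.dump omits line numbers by default, so merely moving a
            # definition within the file does not count as a change.
            entities[node.name] = ast.dump(node)
    return entities


def entity_events(before, after):
    """Classify entities as added, modified, or removed between two versions."""
    old, new = extract_entities(before), extract_entities(after)
    events = {}
    for name in sorted(old.keys() | new.keys()):
        if name not in old:
            events[name] = "added"
        elif name not in new:
            events[name] = "removed"
        elif old[name] != new[name]:
            events[name] = "modified"
    return events


# Invented stand-ins for the two sides of the auth refactor.
BEFORE = '''
def authenticateUser(token):
    if not token:
        return None
    return token

def legacyAuth(db, name):
    return db.execute("SELECT * FROM users WHERE name = '" + name + "'")
'''

AFTER = '''
def validateToken(token):
    return bool(token)

def authenticateUser(token):
    if not validateToken(token):
        raise ValueError("invalid token")
    return token
'''

for name, kind in entity_events(BEFORE, AFTER).items():
    print(kind, name)
```

Run on these two versions, the sketch reports validateToken as added, authenticateUser as modified, and legacyAuth as removed. It also shows the limit of a purely syntactic approach: a renamed function would appear as one removal plus one addition, which is the kind of case a real tool would have to handle more carefully.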

Why this matters

What stands out here is the timing. AI coding assistants are everywhere, and they’re hungry for context about a codebase. The trouble is that feeding a model the current file rarely tells the whole story. Why was a function written this way? What got tried and abandoned? An entity-on-Git model is a natural fit for that problem, because it captures the why and the when, not just the what.

There’s also a practical angle. LSPs can be heavy to run and have to be configured per language. Building on Git, a tool every developer already uses, lowers the setup cost and travels across languages more easily. If Sem delivers on that promise, it could slot in as a context layer underneath AI agents, code review bots, and search tools.

The caveats

Here’s where I’ll be straight with you: the Hacker News submission is light on the details that decide whether a tool like this sticks. The post leads with the concept and a diff, not a spec sheet. There’s no clear word on pricing, on which languages are supported at launch, on whether it’s open source, or on how it performs against a real LSP for the things LSPs are genuinely good at, like precise type resolution.

That gap matters. “Entities on top of Git” is a clean idea, but Git history alone doesn’t understand types, scope, or call graphs the way a language server does. The interesting question, and one the post leaves open, is whether Sem complements LSPs or actually tries to replace them for AI use cases.

What comes next

The strong Hacker News reception suggests developers are receptive to rethinking how code gets understood, especially as AI tools push for richer context. Watch for whether Sem ships concrete benchmarks, language coverage, and an access model. Those will tell you if this is a genuine new primitive or a sharp idea still looking for proof. For now, the concept is worth a look, and you can find the full discussion and demo at the original source.
