Some engineering students from KAIST shipped a transcription app this week. The reason it can be unlimited and free is the part worth paying attention to.
What they built
Alt is a speech-to-text and note-taking app for Apple Silicon (M1-M4 Mac, iPhone, iPad). Unlimited transcription. Completely free. Not “300 minutes and then surprise paywall” free. Actually unlimited.
And it is not barebones either. The app handles real-time transcription while you speak, not just file uploads. It identifies who is speaking. It can summarize what was said. You get a clean note at the end instead of a wall of raw transcript you have to clean up yourself. Every major transcription service charges you for at least one of those features, usually more than one.
The twist
The model runs on your device. No cloud. No server costs. No server costs means there is no reason to charge you by the minute. That is the whole trick.
This matters more than it sounds. Cloud transcription services pay for GPUs every single time you hit record. Those costs add up, so they pass them to you. Some charge per minute. Some charge per seat. Some give you a free tier designed to run out right when you actually need it. The pricing is not arbitrary greed; it reflects real infrastructure expenses. Alt just sidesteps that entirely.
This is how they pulled it off without wrecking battery life:
- Quantized a 1.6GB voice model to run efficiently on Apple Silicon. Quantization compresses the model weights so the model runs faster and uses less memory without losing much accuracy. You get most of the quality at a fraction of the compute cost.
- Rebuilt the engine with GGML + CoreML: 12ms per audio chunk vs the 46ms benchmark. That is nearly a 4x speedup (46 / 12 ≈ 3.8), and it is the reason live transcription actually keeps up with normal speech instead of lagging behind and losing words.
- Pyannote runs locally for real-time speaker identification. So when your teammate interrupts you mid-sentence, the transcript labels it correctly instead of treating the whole meeting as one undifferentiated blob of text.
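To make the quantization idea above concrete, here is a minimal sketch of symmetric 8-bit quantization in plain Python. The post does not say which quantization scheme or bit width Alt uses, so treat this as an illustration of the general technique, not their implementation. The names `quantize_int8` and `dequantize` and the sample weights are made up for the example.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> small integers plus one scale."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale maps the largest magnitude onto 127.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integers."""
    return [v * scale for v in q]


# Hypothetical weights, only for illustration.
weights = [0.12, -0.5, 0.33, 0.0, -0.07]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is at most half of one quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored in one byte instead of four (for float32), which is where the memory saving comes from. The cost is the small rounding error the last lines measure.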
Because the AI lives on your machine, it works completely offline too. Flights, bad hotel wifi, that conference room where the connection somehow gets worse the more people join. Medical appointments where you want a record but do not want your health information sitting on someone else’s server. Interviews. Sensitive client calls. Any situation where you would normally hesitate before hitting record because you are not sure where the audio actually goes.
Getting started 🎙️
- Go to altalt.io (M-chip Mac, iPhone, or iPad required)
- Download and install Alt
- Start recording. No account is needed for core transcription. The app does not ask you to sign up before it becomes useful, which is already better than most tools in this category.
- Want AI summaries? Hook it up to a local LLM on the free tier. Ollama is the easiest path here; it takes about ten minutes to set up if you have not already.
- Need translations or hosted API calls? The $4/mo pro plan handles that. At that price it is cheaper than one month of most competing services, and you are only paying for the features that require external processing.
Pro tip 💡
Pair it with Ollama running locally and you get fully private AI summaries on top of the transcription. Zero data leaves your machine at any point. If you are in a field where confidentiality matters, that is not a nice-to-have. It is the whole point. A lawyer transcribing a client meeting, a therapist taking session notes, a founder discussing something before an announcement, all of these become much simpler when you know the audio never traveled anywhere.
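If you are curious what a local summary call looks like outside the app, here is a rough sketch against Ollama's local REST API. It assumes Ollama is running on its default port (11434), that you have pulled a model (`llama3.2` here is just an example name), and that you have the transcript as plain text. How Alt itself talks to Ollama is not described in the post, so `build_summary_request` and `summarize` are hypothetical helpers, not Alt's code.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing here leaves your machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_summary_request(transcript, model="llama3.2"):
    """Build the HTTP request for a one-shot, non-streaming summary."""
    prompt = (
        "Summarize this meeting transcript as short bullet points:\n\n"
        + transcript
    )
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(transcript, model="llama3.2"):
    """Send the request to the local Ollama server and return the text."""
    with urllib.request.urlopen(build_summary_request(transcript, model)) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Requires a running Ollama instance with the model already pulled.
    print(summarize("Alice: we ship Friday.\nBob: I will update the docs."))
```

Setting `"stream": False` makes Ollama return a single JSON object, which keeps the example short. For long meetings you would probably want streaming instead.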
If you want to go further, you can also feed the transcripts into any local RAG setup. Build a searchable archive of every meeting you have ever had. Query it later. “What did we agree on in the Q1 planning call” becomes a question you can actually answer.
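The retrieval half of that idea can be sketched in a few lines. This is a toy keyword ranker using a simple TF-IDF score, not a full RAG pipeline. A real setup would likely use embeddings and then hand the top results to a local LLM. The file names and transcript text below are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query, documents):
    """Return document names sorted by a simple TF-IDF overlap score."""
    tokens = {name: Counter(tokenize(text)) for name, text in documents.items()}
    n = len(documents)

    def idf(term):
        # Terms that appear in fewer transcripts get a higher weight.
        df = sum(1 for counts in tokens.values() if term in counts)
        return math.log((n + 1) / (df + 1)) + 1

    scores = {
        name: sum(counts[t] * idf(t) for t in tokenize(query))
        for name, counts in tokens.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical exported transcripts.
transcripts = {
    "q1_planning.txt": "We agreed to ship the billing rewrite in Q1 and defer the mobile app.",
    "standup.txt": "Build is green. Nobody is blocked today.",
    "client_call.txt": "The client asked about pricing and a Q3 renewal.",
}

best = rank("what did we agree on in the Q1 planning call", transcripts)[0]
print(best)
```

Once the right transcript is found, you would pass it to the local model along with the question, so the whole loop stays on your machine.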
This is what happens when the developers are also the users. Not an API wrapper with a landing page. Real engineering for a real problem. They optimized for the thing that mattered, which was making unlimited transcription technically possible without a server, and then they gave it away because there was no reason not to.
Worth grabbing if you are on Apple Silicon. 🔖
Frequently Asked Questions
Q: What’s the actual advantage of running this locally instead of using cloud-based transcription apps?
Local processing means zero internet dependency: it works offline on a flight or in a dead-zone conference room. More importantly, your audio never leaves your device, so it never sits on someone else’s server, and there are no data-center costs to cover. That’s the magic: because Alt runs on your machine, there are no server bills to recoup, which is why they can offer unlimited transcription for free.
Q: Can I really use Alt with a free local LLM instead of paying for summaries?
Yep, that’s the whole design philosophy. You can connect any free, open-source LLM running locally for summaries instead of relying on paid APIs. If you don’t want to manage your own LLM setup, the $4/mo pro plan handles Claude/GPT/Gemini calls for you, but the transcription itself stays free and unlimited either way.
Q: What devices can actually run Alt?
Currently it’s M-chip Apple only: Mac (M1-M4), iPhone, or iPad. The quantized voice model is optimized specifically for Apple Silicon’s Neural Engine, which is partly why they hit 12ms per audio chunk without draining the battery.
Q: Is the transcription really unlimited, or are there hidden monthly caps?
The transcription is genuinely unlimited on the free tier, no “you’ve hit your 300-minute cap” nonsense. The paywall only covers translations and summaries that go through hosted APIs; summaries from your own local LLM stay free. Speaker diarization (who said what) and offline transcription are completely free.
AI note-taking apps charging by the minute is getting ridiculous. Found one built by some students that runs 100% locally and is completely free.
by u/Exact_Pen_8973 in PromptEngineering