Python Sandbox: Safe Code Execution with MicroPython

Simon Willison has shipped an alpha package called micropython-wasm, a new way to run untrusted Python code safely inside your own Python applications. Willison reports he’s been chasing a good code sandbox for years, and this latest attempt finally hits all the marks he’s been looking for. He’s already using it in a plugin called datasette-agent-micropython. This is significant because safe code execution is one of the hardest problems for anyone building plugin systems or AI agents that run generated code.

What follows is a practical walkthrough of his approach, plus the reasoning behind each decision.

Quick Start

You’ll learn why a sandbox matters, what a good one needs, and how Willison combined MicroPython and WebAssembly to build one. To follow along you’ll need basic familiarity with Python, the concept of WebAssembly (WASM), and a willingness to compile a custom build. No prior sandbox experience required.

Why You’d Want a Sandbox

Willison’s core projects (Datasette, LLM, sqlite-utils) all support plugins through Python and Pluggy. Plugins let software grow new features overnight without touching the core. The catch: plugin code runs with full privileges. A buggy or malicious plugin could break everything or leak private data.

A sandbox solves this. It lets you run plugin-style code that can’t read unapproved files, hit the network, or harm the host machine. Willison also wants it for features like scheduled jobs that fetch JSON, reshape it with a little code, and insert rows into SQLite.
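The scheduled-job idea above can be sketched with the standard library alone. This is a minimal illustration, not Willison's implementation: the payload, the table name, and the reshape and run_job functions are all hypothetical, and reshape runs as ordinary Python here rather than inside a sandbox. The point is the division of labor: the host owns parsing and database writes, while the user-supplied step only turns plain data into plain data.

```python
import json
import sqlite3

# Hypothetical payload. A real job would fetch this through a network
# layer that the host controls.
RAW_JSON = (
    '{"items": [{"id": 1, "name": "alpha", "extra": "x"},'
    ' {"id": 2, "name": "beta", "extra": "y"}]}'
)


def reshape(payload):
    """Stand-in for the user-supplied step a sandbox would run.

    It receives plain data and returns plain data, nothing else.
    """
    return [(item["id"], item["name"].upper()) for item in payload["items"]]


def run_job(raw_json, conn):
    """Host-side job: parse the JSON, call the reshape step, insert the rows."""
    rows = reshape(json.loads(raw_json))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.executemany("INSERT INTO items (id, name) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
inserted = run_job(RAW_JSON, conn)
stored = conn.execute("SELECT id, name FROM items ORDER BY id").fetchall()
print(inserted, stored)
```

In a sandboxed version, only the body of reshape would cross the boundary. The database connection never would.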

What a Good Sandbox Needs

Before building, Willison defined his requirements. Use these as your own checklist:

Clean install from PyPI, including binary wheels across platforms, so users take no extra steps.
Memory and CPU limits, so something like while True: s += "longer string" can’t crash the app or machine.
Strict file access control: either no filesystem access, or you decide exactly what’s readable and writable.
Controlled network access, so sandboxed code can’t talk to anything without going through a layer you control.
Host function support, so you can carefully expose selected platform features to the running code.
Robust, supported, and documented. Willison notes he’s lost count of abandoned sandbox projects that carry warnings saying they are no longer maintained.
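One way to work with the checklist above is to write it down as a deny-by-default policy object. The sketch below is purely illustrative: SandboxPolicy, its field names, and its default limits are assumptions made for this example and are not part of micropython-wasm or wasmtime. It only models the decisions a host would make. It does not enforce anything on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical checklist-as-config: every capability is off by default."""

    max_memory_bytes: int = 16 * 1024 * 1024  # assumed default, tune per app
    max_cpu_steps: int = 1_000_000            # assumed abstract CPU budget
    readable_paths: FrozenSet[str] = frozenset()
    writable_paths: FrozenSet[str] = frozenset()
    allowed_hosts: FrozenSet[str] = frozenset()
    host_functions: Dict[str, Callable[..., object]] = field(default_factory=dict)

    def can_read(self, path: str) -> bool:
        return path in self.readable_paths

    def can_write(self, path: str) -> bool:
        return path in self.writable_paths

    def can_connect(self, host: str) -> bool:
        return host in self.allowed_hosts

    def call_host(self, name: str, *args):
        """Dispatch to an explicitly exposed host function, or refuse."""
        if name not in self.host_functions:
            raise PermissionError(f"host function not exposed: {name}")
        return self.host_functions[name](*args)


# Nothing is allowed unless the host opts in.
locked_down = SandboxPolicy()

# Opt in to one network host and one carefully chosen host function.
policy = SandboxPolicy(
    allowed_hosts=frozenset({"api.example.com"}),
    host_functions={"add": lambda a, b: a + b},
)
print(policy.can_connect("api.example.com"), policy.call_host("add", 2, 3))
```

Starting from an empty policy and adding capabilities one at a time keeps the audit surface small: anything the sandboxed code can reach is listed in one place.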

Step 1: Pick WebAssembly Over JavaScript Engines

Browsers run untrusted code on every page load, so JavaScript engines look like natural sandboxes. The problem: engines like V8 are hugely complex and weren’t built for embedding. Most V8-in-Python projects are poorly maintained and warn against running fully untrusted code.

WebAssembly is the better fit. It was designed from the start for the isolation properties Willison wants, and it’s been battle-tested in browsers for nearly a decade. Crucially, the wasmtime Python library is actively maintained and ships binary wheels.

Step 2: Choose MicroPython, Not Pyodide

WASM engines run WASM binaries. Static languages like Rust compile cleanly. Dynamic languages like Python are harder because features like eval() need a full interpreter at runtime. So you need a Python interpreter compiled to WASM.
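A small example shows why eval() rules out compiling ahead of time. The expression below does not exist until the program is already running, so something has to parse and execute it at that moment. The build_expression helper is hypothetical and exists only for illustration. Calling eval() on untrusted input outside a sandbox is unsafe.

```python
def build_expression(op, left, right):
    """Assemble Python source text at runtime (hypothetical helper)."""
    return f"{left} {op} {right}"


# This string is only known once the program is running, so no
# ahead-of-time compiler could have translated it in advance.
source = build_expression("*", 6, 7)
result = eval(source)

# exec() goes further: it can define brand-new functions from text.
namespace = {}
exec("def double(n):\n    return n * 2", namespace)
doubled = namespace["double"](result)

print(source, result, doubled)
```

Because any Python program may do this, a WASM build of Python has to ship the parser, compiler, and evaluator inside the binary.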

Pyodide is excellent for running Python in the browser, but server-side use isn’t supported. Recent guidance Willison found states Pyodide “can only run in a browser or Node.js.” That pushed him to MicroPython, which describes itself as a “lean and efficient” Python 3 built for “constrained environments.” As Willison puts it, WebAssembly sure feels like a constrained environment.

Step 3: Build a Custom MicroPython WASM Binary

Willison had GPT-5.5 Pro research the problem. It surfaced a pull request against MicroPython by Yamamoto Takahashi titled “Experimental WASI support for ports/unix,” then produced a research.md document. He handed that to Codex Desktop and GPT-5.5 with this instruction:

“read the research.md document and build this. You will probably need to write a script that compiles a custom WASM version of MicroPython as part of this project – fetch the MicroPython code to a /tmp directory for this as part of that script.”

It worked. He had a prototype library running Python inside a WASM sandbox.

Step 4: Solve Persistent Interpreter State

The trickiest part was keeping state alive between runs. The WASM build exposes a single entry point that starts the interpreter, runs the code, then shuts it down. That’s fine for one-off scripts. But for an agent, you want variables defined in one execution to survive into the next, which means keeping the interpreter alive across calls rather than tearing it down each time.
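The difference between the two models can be shown by analogy in plain CPython. This sketch uses exec() with namespace dictionaries, which is not a sandbox and is not how micropython-wasm keeps its interpreter alive. The run_once function and the Session class are hypothetical names chosen for this example. It only illustrates why a start, run, shut down entry point loses variables between calls.

```python
def run_once(source):
    """One-shot model: a fresh namespace per call, so nothing survives."""
    namespace = {}
    exec(source, namespace)
    return namespace


class Session:
    """Persistent model: one namespace reused across every call."""

    def __init__(self):
        self._namespace = {}

    def run(self, source):
        exec(source, self._namespace)

    def get(self, name):
        return self._namespace.get(name)


# Persistent: the second run sees the variable set by the first.
session = Session()
session.run("counter = 1")
session.run("counter += 1")
persistent_value = session.get("counter")

# One-shot: the second run starts from scratch and fails.
run_once("counter = 1")
try:
    run_once("counter += 1")
    one_shot_error = None
except NameError as exc:
    one_shot_error = type(exc).__name__

print(persistent_value, one_shot_error)
```

An agent that builds up work over several tool calls needs the Session behavior, which is why keeping the interpreter alive across calls matters.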

Why This Matters

What stands out here is the combination. AI agents increasingly generate and run code, and doing that safely on a server has been a real gap. MicroPython plus wasmtime gives a pip-installable path with no browser dependency, real resource limits, and controllable I/O. It’s still alpha, so treat it accordingly, but it’s a promising template for anyone building agentic or plugin-heavy tools.

Next Steps

If you want to go further, try installing the micropython-wasm alpha and running a throwaway script through it. Map your own version of Willison’s requirements checklist against your project. Experiment with exposing one safe host function. And watch the datasette-agent-micropython plugin as a real-world reference. Full technical details are available at the original source.
