Orchestrating coding agents across projects

Trust boundaries

You still want good code. But you cannot read all the code you run. An agent pulls in dependencies, runs build scripts, makes tool calls, and most of that is code you never wrote and never reviewed. Quality review does not save you here, because the dangerous case is code that is working exactly as written, and written to harm you.

So the question that matters is not "is this code good?" but "if this code is hostile, what can it reach?" Reach is something you control directly. A trust boundary is the line a piece of code cannot cross: the keys it cannot read, the projects it cannot touch, the machine it cannot own. Draw those lines well and a hostile dependency is a contained nuisance. Draw them badly, or not at all, and it is a breach. The rest of this post is about where to put those boundaries when you are running agents across many projects, and the one thing you should let cross them.

Isolation came first

I built github.com/f0i/dev before I was thinking about AI agents at all. It gives each project its own isolated environment, so that bad code in one place cannot take over the system and cannot reach across into another project.

The problem it solves predates agents by years. Pull in hundreds or thousands of npm dependencies, or use any package manager where anyone can publish, and you are already running a large amount of code that nobody on your team has read. A supply-chain compromise, malicious code arriving quietly through a dependency you never audited, has always been able to ride in that way. Agents do not introduce this problem. They inherit it, and they enlarge it, because an agent will happily add the dependency and run its install script for you, at a pace no human reviewer keeps up with.

Isolation shrinks the blast radius to one box. A compromised package runs in an environment that does not hold the keys to anything else. It cannot read the SSH keys that talk to your servers. It cannot exfiltrate the API tokens for your other work. A bad postinstall script is, at worst, a bad day for one container. The damage is real but local, and that locality is the security property.
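
To make "one box" concrete, here is a minimal sketch of how such a box could be launched. It assumes Docker as the container runtime, which this post does not specify, and it is not how github.com/f0i/dev is implemented. The function name sandbox_command, the /work mount point, and the allowlist parameter are all hypothetical. The only point it illustrates is that the container sees one project directory and only the environment variables you name.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Sequence


def sandbox_command(
    project_dir: Path,
    image: str,
    command: Sequence[str],
    env_allowlist: Sequence[str] = (),
    host_env: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Build a `docker run` argv that exposes one project and nothing else.

    Hypothetical helper: the names and the /work mount point are assumptions.
    """
    host_env = os.environ if host_env is None else host_env
    argv = [
        "docker", "run", "--rm",
        # The project directory is the only host path the container can see.
        # No home directory is mounted, so ~/.ssh never enters the box.
        "--volume", f"{project_dir.absolute()}:/work",
        "--workdir", "/work",
    ]
    # Pass through only the variables this project was explicitly granted.
    for name in env_allowlist:
        if name in host_env:
            argv += ["--env", f"{name}={host_env[name]}"]
    argv += [image, *command]
    return argv
```

The returned list can be handed to subprocess.run. A postinstall script running inside that container can still wreck /work, but it has no path to the keys and tokens that were never mounted or passed in.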

Why not just isolate everything?

If isolation is this good, the obvious move is to take it all the way: every project fully sealed, no channel between them, done. That would be safe. It would also defeat the purpose of building any of this, because it throws away the thing that makes working across many projects worthwhile.

I do not want to repeat every rule and every lesson in each project. A convention I settle once, a mistake I make once and learn from, a standing decision about how things are done here: I want all of it to reach every project without carrying it across by hand. Seal the boxes completely and each one starts from zero. Ten projects become ten beginners, each relearning what the others already know.

So the same wall that stops an attack from spreading is the wall that stops a lesson from spreading. That is the real tension, and it is why full isolation is not the answer.

The asymmetry

The way through is to notice that damage and understanding are not the same kind of thing, and should not obey the same rule at a boundary.

Damage should stay local. Understanding should travel.

A boundary that blocks both is too strict. A boundary that allows both is no boundary at all. What you want is a membrane that is asymmetric on purpose: opaque to harm, transparent to context. Compromise stops at the wall. Knowledge crosses it. Put that way, it stops being a contradiction and becomes a design target: build something that can see across the boundaries without dissolving them.

The orchestrator above the walls

That something is the orchestrator. It sits one level above the isolated environments and is the only component allowed to look across all of them. It connects into each project's container and starts tasks there. It injects the shared context that the sub-agents inside those containers need. It watches how every session in every project unfolds. The agents stay sealed. The orchestrator is the channel through which context, and only context, moves between them.

It never lowers a wall to do this. It does not merge the projects or hand one container another's keys. It carries context down into each box and pulls observations back up, while the boxes stay boxes. The membrane stays asymmetric: it passes through what is safe to share and nothing else.
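
The shape of that flow can be sketched in a few lines. This is an illustrative model, not the real orchestrator: Sandbox stands in for a project's container and runs in-process, where a real one would be reached over whatever connection the orchestrator uses. All class and field names here are assumptions. What the sketch shows is the direction of travel: shared context goes down as a copy, an observation comes back up, and secrets stay in the box that owns them.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Observation:
    """What a sandbox reports upward after a task."""
    project: str
    task: str
    summary: str


@dataclass
class Sandbox:
    """Stand-in for one project's container. Its secrets never leave it."""
    name: str
    secrets: dict[str, str] = field(default_factory=dict, repr=False)
    received_context: list[str] = field(default_factory=list)

    def run(self, task: str, context: list[str]) -> Observation:
        # A real sub-agent would do its work here, using self.secrets locally.
        self.received_context = list(context)
        summary = f"finished '{task}' with {len(context)} shared rules"
        return Observation(self.name, task, summary)


@dataclass
class Orchestrator:
    """The only component that holds a view across all sandboxes."""
    sandboxes: dict[str, Sandbox]
    shared_context: list[str] = field(default_factory=list)
    observations: list[Observation] = field(default_factory=list)

    def start_task(self, project: str, task: str) -> Observation:
        sandbox = self.sandboxes[project]
        # Down: a copy of the shared context, never another project's secrets.
        observation = sandbox.run(task, list(self.shared_context))
        # Up: only the observation the sandbox reported.
        self.observations.append(observation)
        return observation
```

Note that the orchestrator never reads or forwards the secrets field. Nothing in its interface offers a way to move a key from one sandbox to another, which is the property the prose above describes.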

Security as a stack of levels

Layered this way, an attack no longer has a single wall to beat. It has a staircase. A malicious dependency is contained at the level of its project, and to reach anything that matters it has to climb out of the container first, which is already the hard part. A subtler threat, a prompt injection that turns a sub-agent against you, is contained the same way: it can mislead the agent inside one project, but to reach the orchestrator and the context it holds across all projects it would have to cross another boundary entirely, one the orchestrator exists to guard.

The security of the whole is not any single perfect defence. It is the number of levels an attacker has to defeat in sequence, each one built to stop a different class of thing. Bad scripts are contained per project. Injections are contained per agent. The orchestrator is reachable only by climbing past both.

The orchestrator has to curate, not just carry

The catch is that the upward channel, the one good thing crossing the boundary, is also the one an attacker would most like to use. A lesson learned inside a compromised project must not become a rule the orchestrator quietly injects into every other project. If shared context flowed up untouched, "understanding travels" would just be a longer name for "compromise travels".

So the orchestrator's job is not plumbing. It is curation: deciding what earns promotion to a shared lesson, and being deliberate about it, the way you would not merge a stranger's pull request without reading it. That discipline is what makes the learning safe enough to be worth having. It is also why the orchestrator is the part you have to guard hardest. It holds context from everywhere, so a breach of it is the one breach that is not contained. The design earns its keep only if that layer stays small, scrutable, and slow to trust.
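
One way to picture curation is as a gate with an explicit review step. The sketch below is an assumption about how such a gate might look, not a description of an existing implementation, and every name in it (Lesson, LessonCurator, quarantine) is hypothetical. A lesson proposed by a project only waits in a pending queue. It reaches the shared set only when a named reviewer approves it, and anything from a project marked as compromised is refused or retracted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Lesson:
    project: str
    text: str


@dataclass
class LessonCurator:
    """Gate between raw observations and context shared with every project."""
    quarantined: set[str] = field(default_factory=set)
    pending: list[Lesson] = field(default_factory=list)
    shared: list[Lesson] = field(default_factory=list)
    approvals: dict[Lesson, str] = field(default_factory=dict)

    def propose(self, lesson: Lesson) -> bool:
        """Queue a lesson for review. Nothing is shared at this point."""
        if lesson.project in self.quarantined:
            return False
        self.pending.append(lesson)
        return True

    def approve(self, lesson: Lesson, reviewer: str) -> None:
        """Promote a pending lesson, recording who vouched for it."""
        if lesson not in self.pending:
            raise ValueError("only pending lessons can be approved")
        self.pending.remove(lesson)
        self.shared.append(lesson)
        self.approvals[lesson] = reviewer

    def quarantine(self, project: str) -> None:
        """Distrust a project: drop its pending and already shared lessons."""
        self.quarantined.add(project)
        self.pending = [l for l in self.pending if l.project != project]
        self.shared = [l for l in self.shared if l.project != project]
```

Retracting already shared lessons on quarantine is a deliberately conservative choice in this sketch. A gentler variant could flag them for another review instead of dropping them.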

What you get in return is the only component that can actually accumulate anything. It sees every session in every project, so it can notice what recurs and get better at it, while a sealed agent stays sharp about its one task and forgets everything the moment its session closes. Over time that is what lets some routine decisions move from you to the machine: not because the machine is trusted blindly, but because it is the layer with the most context, which is the only place that authority is safe to sit.

The principle underneath

The rule I keep coming back to is small. Trust nothing fully, bound everything, and let only vetted lessons cross between the boxes. Containment flows down; understanding flows up, curated rather than raw. Damage stays local; knowledge spreads on purpose.

That is a security architecture, but it is also a way to work with code you did not write and cannot fully vouch for. You do not earn safety by trusting harder. You earn it by being deliberate about what you let through, and by reserving the view from above for the one part of the system careful enough to deserve it.