Stop Babysitting Your AI Agent

AI coding agents are autonomous enough to work unsupervised. The bottleneck is no longer the model. It is the notification layer.

The permission prompt problem

You start a task in Claude Code, switch to another window, and come back twenty minutes later. The agent finished in three minutes and has been waiting for a permission prompt ever since. Or it completed the whole task and you could have moved on to the next one fifteen minutes ago.

This is not an edge case. GitHub issue #11380 on the Claude Code repo has 62 upvotes and 68 comments. One commenter called permission prompts "the single largest flaw in Claude Code." Cursor's forum has eight separate threads about agents stuck waiting for invisible approval buttons. "Stop babysitting Claude Code" has become its own blog post genre.

The problem is the same everywhere: an agent that needs human input is visually indistinguishable from an agent that is actively working. The only way to know is to check. And checking means context-switching, which is exactly what you were trying to avoid by using an agent in the first place.

Why software notifications fall short

The obvious solution is desktop notifications. Several tools exist: CCNotify for macOS, code-notify for cross-platform, Telegram bridges that forward permission requests to your phone. They work. I have used them.

The problem is that notifications are interruptive or ignorable. There is no middle ground. A toast notification demands attention for two seconds and then vanishes. A sound alert interrupts whatever you are doing. Both compete with every other notification on your machine. After a few hours you either disable them or stop noticing them.

What I actually wanted was something closer to a wall clock. You do not "check" a wall clock. You glance at it. It is always there, always current, and it never interrupts you. The information is ambient.

Ambient awareness

This is the idea behind Halo Terminal: a small USB device with 12 RGB LEDs that sits on your desk and shows what your AI agent is doing. Yellow means thinking. Red means waiting for input. Blue means idle.

Three states, visible from across the room, no screen required.

The distinction matters because of how developers actually work with agents. The workflow is increasingly fire-and-forget: start a task, go do something else, come back when needed. That is supervision, not collaboration. And supervision needs different UX than collaboration.

When you are collaborating, you are in the IDE, watching output, reading diffs. A status indicator in the terminal is fine. When you are supervising, you might be in a different app, on a different monitor, or making coffee. The status needs to come to you.

The hook convergence

The technical side turned out simpler than expected. Claude Code introduced lifecycle hooks in late 2025: shell commands that fire on events like PreToolUse, Stop, and permission requests. You register them in .claude/settings.json.

Then something interesting happened. Cursor added hooks in v1.7. Windsurf added Cascade Hooks. GitHub Copilot CLI added them. Kiro added them. The implementations differ in detail, but the pattern is identical: a JSON config file where you register a shell command that receives a JSON payload on stdin when a lifecycle event fires.
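As a concrete illustration, a Claude Code registration might look roughly like this. The schema follows Claude Code's hooks format as I understand it, and the `hat event` subcommand is a hypothetical name (the article only shows `hat hookup`, which writes this file for you):

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "*",
        "hooks": [{ "type": "command", "command": "hat event" }]
      }
    ],
    "Stop": [
      {
        "hooks": [{ "type": "command", "command": "hat event" }]
      }
    ]
  }
}
```

The other tools' files differ in event names and nesting, but the shape is the same: event, matcher, shell command.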

This means a single CLI tool that reads JSON from stdin and maps events to LED states works across all of them. One device, one integration pattern, every major AI coding tool.
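A minimal sketch of that pattern in Go. The payload field and event names here follow Claude Code's hook payloads as I understand them; other tools use different names, and the real CLI surely handles more than this:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// hookEvent is the assumed payload shape. Real tools differ in field
// names, but each sends a JSON object with the event name on stdin.
type hookEvent struct {
	HookEventName string `json:"hook_event_name"`
}

// mapEventToState reduces a lifecycle event to one of the three LED
// states the article describes.
func mapEventToState(event string) string {
	switch event {
	case "PreToolUse", "PostToolUse":
		return "thinking" // yellow: the agent is actively working
	case "Notification":
		return "waiting" // red: permission prompt or question
	case "Stop":
		return "idle" // blue: task finished
	default:
		return "idle" // unknown events are treated as idle
	}
}

func main() {
	var ev hookEvent
	if err := json.NewDecoder(os.Stdin).Decode(&ev); err != nil {
		os.Exit(0) // never block the agent on a malformed payload
	}
	fmt.Println(mapEventToState(ev.HookEventName))
}
```

Per-tool differences then collapse into a small translation table in front of one shared state machine.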

$ hat hookup claude   # writes hooks to .claude/settings.json
$ hat hookup cursor   # writes hooks to .cursor/hooks.json
$ hat hookup all      # all supported tools at once

The hard part: where is the device?

The LED mapping is straightforward. The hard part is connecting the agent (which might run in a Docker container, a cloud VM, or an SSH session) to the physical device (which is plugged into your desk machine).

The tool has a three-tier discovery system:

Local: Unix socket, default gateway, localhost. Covers same-machine and Docker setups.

LAN: UDP broadcast and multicast. Covers multi-machine setups on the same network.

Internet: Nostr relay-based discovery. Both sides connect outbound to public relays, so neither needs to open ports. This handles the case where your agent runs on a cloud VM but your device is at home.

All three tiers run in parallel and short-circuit on first success. After initial discovery, the connection is saved, so subsequent calls skip the search entirely.

Building this was more work than the LED control itself. But getting it wrong means the tool only works in the simplest setup, which is exactly where you need it least.

Multi-agent aggregation

One pattern I did not anticipate: running multiple agents simultaneously. Three Claude Code sessions, a Cursor agent, maybe an Aider task. Each reports its own state. The device needs to show one color.

The aggregation rule is simple: waiting beats thinking beats idle. If any agent is waiting for input, the light is red. If no agent is waiting but at least one is still working, yellow. If every agent is idle, blue. Priority-based state reduction.
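The reduction is a one-pass maximum over a priority order. A sketch in Go, with illustrative state names:

```go
package main

import "fmt"

// Priority order for state reduction: waiting beats thinking beats idle.
var priority = map[string]int{"idle": 0, "thinking": 1, "waiting": 2}

// aggregate reduces per-agent states to the single color the device shows.
func aggregate(states []string) string {
	result := "idle" // no agents reads as idle
	for _, s := range states {
		if priority[s] > priority[result] {
			result = s
		}
	}
	return result
}

func main() {
	// Three Claude Code sessions and a Cursor agent, say:
	fmt.Println(aggregate([]string{"thinking", "idle", "thinking", "waiting"}))
}
```

Adding a fourth agent, or a fourth tool, changes nothing here: states go in, one color comes out.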

This turns out to be the natural way to supervise multiple agents. You do not need to know which one needs attention. You need to know whether any of them do.

The deeper trend

AI coding agents are getting better fast. Six months ago, running Claude Code unattended was risky. Today, with proper tool permissions and a good system prompt, I routinely start tasks and walk away. The agent handles file reads, writes, test runs, and git operations without intervention. It only needs me for decisions that fall outside its permission scope.

As agents get more autonomous, the human's role shifts from collaborator to supervisor. You spend less time watching the agent work and more time reviewing its output. The bottleneck moves from "is the model good enough" to "how do I know when it needs me."

Screen-based interfaces are optimized for collaboration. You sit in front of them and interact. Supervision is a different activity. You check in periodically. You respond to signals. You need awareness, not attention.

The vocabulary the community uses reveals the friction: "babysitting," "polling the interface," "can't context-switch." These are words for a supervision task forced through a collaboration interface.

Stack

Go, ESP32-S, WS2812 LEDs, USB serial. The CLI is a single statically-linked binary distributed via npm. The hardware is a 3D-printed case (designed with AI and OpenSCAD) housing a microcontroller and an LED ring. Total BOM cost under 10 euros.

Written by Martin Sigloch with Tachikoma.