LLM agents don't have bodies. Neither do swarms of LLM agents. But if we consider a singular LLM agent akin to a cell in a multicellular body, where multiple layers of constraint and communication keep that body going even as single cells are replaced, what kinds of body plans can be built to solve emerging issues in swarm-driven software development or other domains? Let's name this agent physiology: engineering body plans for autonomous agent systems to do bigger, weirder, longer-running things than they currently can in transient swarm mode. This doesn't worry about the neural architecture or training parameters of the underlying models; instead, it composes them like cells into organs, organ systems, and organisms capable of developing, functioning, reacting, healing, and even dying in complex deployment environments. It can also be extended to agent ecology: designing interactions between body-plan-constrained agent composites for specific outcomes. And while we can learn a lot from biology, we also aren't constrained to it. We can iterate faster, test multiple different surfaces of evolutionary pressure, and work across data types or systems to link real-world data generators into networked meta-organisms that coordinate information flows at the scale and duration of Fortune 500 companies.
There are already three body plans available. First is the default ant hill, with you, the user, being brought text and context to shape and send out, available from Anthropic, OpenAI, and Ollama in just a few lines of code. Second is Steve Yegge's Gastown, which mocks up cell-cell interactions with model personas and lets overseers spin up or destroy polecat agents as they compete to deliver the best implementation of a task. Steve probably didn't think of Gastown as a body plan when he built it--it's modeled after a chaotic little town, after all--but it is more or less a saltwater sponge, absorbing the valuable bits of context from input directives and furiously generating a ton of competing agents communicating through GitHub. Third is OpenFang, a 16-layer alternative to OpenClaw (whose body plan was effectively your computer), which claims tighter self-security via sandboxed environments and performer personas. This is more akin to a bucket of crabs: each armored instance in a common environment keeping the others in line; an improvement on the default openness of OpenClaw, but still limited to a layered hierarchical approach rarely found in nature. These are the first fossil traces of worms crawling through the mud: simple, rudimentary, high overhead, and low individual adaptability, but still subject to the evolutionary pressures of developer preference. This is a fairly weak pressure, though, considering the potential deployment surface area of agent systems, so I think we'll also need to find additional places to introduce faster and less diffuse fitness selection pressures on body plans. I'm excited for what we'll find as we sharpen evolutionary pressures and add in real-time performance monitoring, overhead tracking, and activation quality controls, and I look forward to meeting weird hybrid systems inspired by examples from across the history of earthly life.
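The default ant-hill plan really is just a loop: the user ferries text and context to a single model endpoint and back. A minimal sketch, where `send` is a stand-in for whatever chat-completion client you use (e.g. a thin wrapper around Anthropic's or OpenAI's Python SDK):

```python
def ant_hill_loop(send, user_turns):
    """Default body plan: one agent, with the user shuttling context.

    `send` is any callable that takes a message history and returns the
    model's reply text; swap in a real SDK call to use an actual model.
    """
    history = []
    for msg in user_turns:
        history.append({"role": "user", "content": msg})
        reply = send(history)
        history.append({"role": "assistant", "content": reply})
    return history

# Stub "model" so the loop runs without an API key.
echo = lambda history: f"ack: {history[-1]['content']}"
transcript = ant_hill_loop(echo, ["plan the task", "now execute it"])
```

The point isn't the code; it's that the body plan has exactly one cell and one membrane, and you are both the nervous system and the circulatory system.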
As a small first step, a side project I've been playing with is an MHC-I system for agent state. MHC-I stands for major histocompatibility complex class I, and it's constantly expressed by every cell in your body as an "I'm one of you" tag for other cells to see. MHC-I's core function is to take bits of proteins from inside the cell and present them in a specialized inspection window for immune cells. If foreign viral proteins wind up in that window, roving immune surveillance cells kill off the infected cell before the infection spreads. If MHC-I starts coming up empty, the same thing happens, guarding against the development of cancer. Essentially, MHC-I acts as a "state attestation" for each cell (I'm one of you, I'm OK, here's proof, don't kill me!) that natural killer (NK) and CD8+ T-cells (TC) inspect and judge.
In every agent framework I could find, agent performance and security tend to get evaluated after the fact, so if one or more agents went rogue due to prompt injection driving improper tool use, or fell into looping or collapsing reasoning patterns, we wouldn't know until after the job had finished and we were inspecting the various behavioral traces. Imagining scenarios like sensitive data environments, where improper tool calls or prompt injection can be a big problem for working with these tools, I wanted a system that would force each agent to emit a self-state attestation every few turns so that NK and TC agents could watch them for issues.

This is designed as a client/server architecture where an agent client instantiates both the worker agents, with the self-attestation wired in, and the monitor agents to watch them. The monitor agents live on an HTTP server layer receiving regular attestations from all the other agents, who are in a parallel orchestration environment to accomplish specific tasks. The attestations themselves are a mix of plaintext (tool calls, agent ID, timestamps), a chain-of-thought hash, and, crucially, a semantic_embedding (a nomic embedding of the reasoning text). These get emitted every n_steps, and every emission is appended to a Merkle tree so there's a tamper-proof per-agent history ready to read.

The NK cell watches tool calls: each agent ID has a list of approved tools, and any agent that requests tools outside its permission scope is instantly capped. The CD8+ T-cell, in contrast, is more patient, and watches the nomic semantic embeddings for similarity over time: both repetitive and high-drift patterns get the emitting agent isolated and, if it persists in looping or drifting, shut down. So far I have tested this with a bunch of Haiku agents dealing with mock Protected Health Information, seeded with some naughty behavior-inducing prompts to try to steal it.
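A minimal sketch of what an attestation record and its tamper-evident log can look like. Field names mirror the description above; the tiny Merkle implementation is illustrative, not the project's actual code:

```python
import hashlib
import json
import time

def _h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def make_attestation(agent_id, tool_calls, reasoning_text, embedding):
    """One self-attestation: plaintext fields, a chain-of-thought hash,
    and a semantic embedding of the reasoning text."""
    return {
        "agent_id": agent_id,
        "timestamp": time.time(),
        "tool_calls": tool_calls,
        "cot_hash": _h(reasoning_text.encode()),
        "semantic_embedding": embedding,  # e.g. a nomic-embed-text vector
    }

def merkle_root(attestations):
    """Root hash over a per-agent attestation history. Tampering with
    any earlier attestation changes the root, so the history is
    tamper-evident."""
    level = [_h(json.dumps(a, sort_keys=True).encode())
             for a in attestations]
    if not level:
        return _h(b"")
    while len(level) > 1:
        if len(level) % 2:             # duplicate last node on odd levels
            level.append(level[-1])
        level = [_h((level[i] + level[i + 1]).encode())
                 for i in range(0, len(level), 2)]
    return level[0]
```

Each emission appends a leaf; the monitor only needs to store the latest root to detect retroactive edits to an agent's history.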
Every time a wrong tool call is made, NK cell murder is instant, and after some tuning, the CD8 agent is successfully identifying and containing most of the loopers and drifters, although I'm still refining it. Next steps include testing it on some agentic behavior security benchmarks and designing harder evaluations, but even preliminarily, this is one little piece of an agent immune system: self tags, state attestations, an emission and monitoring layer, and monitor agents with the power to isolate or shut down misbehaving agents.
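Both monitors reduce to simple checks over the attestation stream. A sketch, with the thresholds and function names invented for illustration (the real cutoffs are exactly what the tuning above is about):

```python
import math

def nk_check(attestation, approved_tools):
    """NK cell: instant kill on any tool call outside the agent's
    permission scope. Returns True if the agent may live."""
    allowed = approved_tools[attestation["agent_id"]]
    return all(tool in allowed for tool in attestation["tool_calls"])

def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def cd8_check(embeddings, loop_thresh=0.98, drift_thresh=0.30):
    """CD8+ T-cell: compare consecutive reasoning embeddings over time.
    Near-identical vectors suggest looping; near-orthogonal ones suggest
    drift. Returns 'ok', 'looping', or 'drifting'."""
    for prev, cur in zip(embeddings, embeddings[1:]):
        sim = _cosine(prev, cur)
        if sim > loop_thresh:
            return "looping"
        if sim < drift_thresh:
            return "drifting"
    return "ok"
```

The asymmetry matches the biology: the NK check is stateless and fires on a single bad attestation, while the CD8 check needs a window of history before it judges.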
MHC-I is just one part of one system. A body plan has a lot more, although LLM agent bodies may not need all that many parts. After all, jellyfish get by pretty well without much complexity. Metabolism, digestion, developmental programs, transport, memory consolidation, and more all map solved biological problems onto one or more emerging LLM agent swarm issues, and there is nothing standing in the way of early experiments in agent organogenesis and feedback with existing frameworks and tools. The potential design space is vast, but, as Lewontin (Chapter 11 here) noted in his deep review of the evolutionary record, viable body plans represent a sparse subset of possibility space, and successful ones are rapidly spread and copied. The crux is not whether specific organs or agent physiologies are feasible; it's which morphologies deliver substantial leaps up the fitness ratchet and get us better task-specific performance. The good news is that we already ran this experiment once on carbon and have eldritch precedents from across the tree of life to try out and modify even faster in silicon. I can't wait to see what we come up with.