Architecting AI Agents: Why the Harness Belongs Outside the Sandbox for Multi-User Security
This analysis explores the critical architectural decision of where to place the 'agent harness'—the essential loop that drives Large Language Model (LLM) interactions. By comparing the 'inside the sandbox' model, where the harness and code share a container, with the 'outside the sandbox' model, where the harness resides on a backend and interacts via API, the article highlights significant differences in security, failure modes, and operational complexity. While internal harnesses offer simplicity for single-user developer setups, external harnesses provide superior protection for sensitive credentials, such as LLM API keys and user tokens. This distinction is particularly vital for multi-user organizational environments where shared resources and security boundaries are paramount. The analysis weighs these tradeoffs and explains why credential isolation tips the balance toward external placement as agents scale beyond a single user.
Key Takeaways
- The Agent Harness Defined: Every production agent relies on a harness—a continuous loop that manages LLM prompts, tool execution, and feedback cycles.
- Inside the Sandbox Architecture: This model places the harness and the code it acts upon in the same container, offering simplicity and local filesystem access for skills and memories.
- Outside the Sandbox Architecture: This model runs the harness on a backend, calling into a sandbox via API for tool execution, which keeps sensitive credentials isolated from the execution environment.
- Single-User vs. Multi-User Needs: While single-user setups (like an engineer on a laptop) can benefit from the simplicity of internal harnesses, multi-user organizations face unique security and scaling challenges that favor external placement.
- Credential Security: Placing the harness outside the sandbox ensures that LLM API keys, user tokens, and database access remain protected from the environment where code is executed.
In-Depth Analysis
Defining the Agent Harness and Its Core Function
At the heart of every production-grade AI agent lies the "agent harness." This component serves as the operational engine or the "loop" that drives the Large Language Model (LLM). The process managed by the harness is cyclical and iterative: it begins by sending a prompt to the model, receiving a response, and then identifying any tool calls requested by the model. These tool calls—which might include actions like executing bash commands or reading and writing files—are then performed. The results of these actions are fed back into the model, and the cycle repeats until the LLM signals that the task is complete.
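The cycle described above can be sketched as a short loop. This is a minimal illustration, not any specific SDK: the message shapes, the `llm_complete` callable, and the tool-call fields are assumptions made for the example.

```python
# A minimal sketch of an agent harness loop: prompt the model, run any tools
# it requests, feed results back, and repeat until it stops asking for tools.
# The message and tool-call dict shapes here are illustrative assumptions.

def run_agent(llm_complete, tools, task, max_turns=10):
    """Drive the model until it stops requesting tools, then return its answer."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        reply = llm_complete(messages)              # 1. send prompt, get response
        messages.append({"role": "assistant", "content": reply.get("content", "")})
        tool_calls = reply.get("tool_calls", [])
        if not tool_calls:                          # 2. no tool calls: task is done
            return reply["content"]
        for call in tool_calls:                     # 3. execute each requested tool
            result = tools[call["name"]](**call["args"])
            messages.append({"role": "tool", "content": str(result)})
    raise RuntimeError("agent did not finish within max_turns")
```

Note that nothing in the loop itself says where the tool functions run; that is exactly the placement decision the rest of this analysis examines.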
Because every production agent requires this driving mechanism, the fundamental architectural question is not whether to have a harness, but where that harness should reside. The placement of the harness dictates the agent's security profile, how it handles failures, and the scope of its capabilities. This decision is further complicated by the scale of the deployment, specifically whether the agent is intended for a single user or a multi-user organizational environment.
The "Inside the Sandbox" Model: Simplicity and Local Integration
In the "inside the sandbox" architecture, the harness loop lives within the same container as the code it is actively working on. In this setup, LLM calls originate from within the container itself. Tool calls, such as bash commands or file system operations, are executed locally within that same environment.
One of the primary advantages of this model is its simplicity. It operates under a unified execution model characterized by one container, one process tree, one filesystem, and one lifetime. This allows developers to reuse off-the-shelf harnesses without modification. Furthermore, features like "skills" and "memories"—which often rely on a local filesystem to track state—work seamlessly because they are provided with the local environment they expect. This is the architecture typically seen when running tools like Claude Code on a local laptop or within a remote container using a standard SDK. For a single engineer working in isolation, this model provides a straightforward path to shipping functional agentic tools.
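In this model, tool implementations are trivially direct, because the harness shares the container's process tree and filesystem. A hypothetical pair of tools might look like the following sketch; the function names and signatures are assumptions for illustration.

```python
# "Inside the sandbox" tool execution: the harness and its tools share one
# container, so a bash tool is a local subprocess and a file tool is a local
# read. Tool names and signatures here are illustrative, not a real SDK.
import subprocess

def bash_tool(command, timeout=30):
    """Run a shell command in the same container the harness lives in."""
    proc = subprocess.run(command, shell=True, capture_output=True,
                          text=True, timeout=timeout)
    return proc.stdout + proc.stderr

def read_file_tool(path):
    """Read from the shared local filesystem (where skills and memories live)."""
    with open(path, "r") as f:
        return f.read()
```

The simplicity is visible here: no network hop, no serialization, and state such as memories can be plain files. The tradeoff, as the next section shows, is that anything else in the container (including the harness's credentials) lives in the same blast radius.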
The "Outside the Sandbox" Model: Security and Credential Isolation
Conversely, the "outside the sandbox" architecture moves the harness loop to a backend environment, separate from the execution sandbox. When the harness needs to execute a tool, it does not do so locally; instead, it makes an API call into a sandbox. The sandbox performs the requested tool execution and returns the result to the harness. Crucially, the harness loop itself never enters the sandbox.
This separation creates a robust security boundary. By keeping the harness on the backend, sensitive credentials remain isolated from the environment where potentially untrusted code or tool outputs are processed. The harness maintains control over LLM API keys, user tokens, and database access. In a multi-user setting—where dozens of engineers within the same organization share the same agent—this architecture addresses problems that single-user builders rarely encounter. The external model ensures that the "brain" of the operation (the harness) is not exposed to the same risks as the "hands" of the operation (the sandbox tool execution).
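The boundary can be made concrete with a small sketch of the backend side. The endpoint path, payload shape, and class name below are assumptions for illustration; the essential point is that only the tool name and its arguments cross into the sandbox, never the harness's credentials.

```python
# A sketch of the "outside the sandbox" split: the harness keeps LLM API keys
# and user tokens on the backend, and only a serialized tool request crosses
# the boundary. Endpoint path and payload shape are illustrative assumptions.
import json
import urllib.request

class SandboxClient:
    """Backend-side proxy that forwards tool calls to a sandbox over HTTP."""

    def __init__(self, base_url, transport=None):
        self.base_url = base_url
        # Credentials (LLM keys, user tokens, database access) are deliberately
        # NOT held here: the sandbox only ever sees tool names and arguments.
        self._transport = transport or self._http_post

    def _http_post(self, url, payload):
        req = urllib.request.Request(url, data=json.dumps(payload).encode(),
                                     headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())

    def execute_tool(self, name, args):
        """Ask the sandbox to run one tool and return its result."""
        return self._transport(f"{self.base_url}/tools/execute",
                               {"tool": name, "args": args})
```

A harness using this client would plug `client.execute_tool` in where the inside-the-sandbox model calls `subprocess.run` directly; the loop logic is unchanged, but compromised code in the sandbox can no longer read the keys that drive it.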
Industry Impact
The choice between internal and external agent harnesses represents a significant shift in how AI infrastructure is built for the enterprise. As AI agents move from experimental local tools to shared organizational assets, the "outside the sandbox" model is likely to become the standard for production environments. This transition highlights a growing focus on security-first architecture in the AI industry. By decoupling the logic of the agent from the execution of its tools, organizations can better manage permissions and protect sensitive data. This architectural evolution also suggests that future AI development kits and SDKs will need to offer more flexible deployment options to accommodate the complex security requirements of multi-user environments, moving beyond the simple "one container" approach that characterized early agent development.
Frequently Asked Questions
Question: What exactly is an agent harness in the context of LLMs?
An agent harness is the driving loop of an AI agent. It is responsible for sending prompts to the LLM, interpreting the model's requests for tool use (like running code or accessing files), executing those tools, and feeding the results back to the model until the task is finished.
Question: Why is the "outside the sandbox" model considered more secure for organizations?
This model is more secure because it keeps sensitive information—such as API keys, user authentication tokens, and database access credentials—on a secure backend. The harness never enters the sandbox where tools are executed, preventing the execution environment from potentially accessing or compromising these critical credentials.
Question: When should a developer choose to run a harness inside the sandbox?
Running a harness inside the sandbox is often ideal for single-user scenarios, such as an engineer working on a local laptop. It offers a simpler execution model with a single process tree and filesystem, making it easier to use off-the-shelf tools and manage local state like memories and skills without complex API configurations.

