OpenAI has upgraded its Responses API with a hosted computer environment, giving developers the ability to run autonomous agents that execute shell commands, handle files, and maintain state across multi-step tasks — without managing their own infrastructure.

The move marks a significant architectural shift for OpenAI's developer platform. Where the Responses API previously functioned primarily as a structured interface for querying models, it now incorporates a sandboxed runtime layer: secure, hosted containers that persist between agent steps and expose a shell tool for direct system interaction. According to OpenAI, the goal is to close the gap between a model that can reason about tasks and one that can actually complete them end-to-end.

Shell Tool and Hosted Containers: What's Actually New

The core additions are a shell tool and hosted containers. The shell tool allows agents to run terminal commands — installing packages, executing scripts, reading and writing files — inside a controlled environment. Hosted containers provide the stateful substrate: a sandboxed compute layer that retains context, files, and process state between API calls, so an agent can pick up mid-task rather than starting from scratch on each request.
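In practice, enabling these capabilities likely amounts to adding a tool entry to a Responses API request. The sketch below shows one plausible shape of such a request as a plain payload dict; the "shell" tool-type string and the model name are illustrative assumptions, not confirmed identifiers from OpenAI's documentation.

```python
# Sketch of a Responses API request that grants the agent a shell tool
# inside a hosted container. The "shell" tool type and the model name
# are assumptions for illustration.
payload = {
    "model": "gpt-5.1",  # illustrative model name
    "tools": [
        {"type": "shell"},  # assumed identifier for the hosted shell tool
    ],
    "input": (
        "Install the project's dependencies, run the test suite, "
        "and summarize any failures."
    ),
}

# With the official SDK, a payload like this would be passed to
# client.responses.create(**payload); the server provisions the sandboxed
# container and executes the commands the model decides to run.
```

The key point is what is absent: no sandbox setup, no filesystem provisioning, no orchestration code on the developer's side.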

This combination is what transforms a model call into an agent runtime. Previously, developers who wanted an agent to, say, download a dataset, process it with a Python script, and return a result had to stitch together their own orchestration layer, manage sandboxing, and handle state themselves. The new setup lets the Responses API handle that scaffolding natively.

The update moves the Responses API from a model-querying interface toward a complete agent runtime, reducing the infrastructure burden on developers building multi-step AI workflows.

OpenAI has positioned security and scalability as central design principles. The containers are isolated by default, limiting the blast radius if an agent takes an unexpected action. Scalability is handled on OpenAI's infrastructure, meaning developers don't need to provision or monitor compute themselves.

Why This Matters for Developer Workflows

For developers building agentic applications, infrastructure complexity has been one of the biggest practical barriers. Running an agent that interacts with a real computer environment — browsing the web, writing and testing code, managing files — typically requires assembling a stack of tools: a sandboxing layer, a persistent filesystem, an orchestration framework, and careful state management. Each layer adds latency, cost, and failure points.

By hosting the environment alongside the model, OpenAI is making a deliberate bet that developers will prefer vertical integration over composability. Developers trade some flexibility for significantly reduced setup time. For teams prototyping agentic workflows, that trade-off is often worth it. For teams with existing infrastructure or specific compliance requirements, the calculus may differ.

The shell tool in particular opens up a class of tasks that were previously awkward to implement via API alone: running tests on generated code, processing large files without loading them into context windows, and executing system commands as part of a longer reasoning chain. These are exactly the kinds of operations that distinguish a true software agent from sophisticated autocomplete.

Responses API in OpenAI's Broader Agent Strategy

This update doesn't exist in isolation. OpenAI has been steadily building out an agent-oriented product layer over the past year — the Assistants API introduced persistent threads and built-in tools; the Responses API consolidated and extended that model; and now the computer environment layer pushes further toward a full execution runtime. The direction is consistent: reduce the number of external services a developer needs to build a working agent.

The Agents SDK, released earlier in 2025, sits alongside these API-level capabilities and provides higher-level abstractions for orchestrating multi-agent systems. The Responses API with computer environment occupies the layer below — raw enough to give developers control, managed enough to abstract away infrastructure.

Pricing and availability details were not fully specified in OpenAI's initial announcement, which is a meaningful gap for teams evaluating whether to build on the platform. Container compute will almost certainly carry additional cost beyond standard token pricing, and the economics of running long-horizon agents — where containers might be active for extended periods — will matter considerably to commercial users. Developers should expect clarification as the feature moves through availability tiers.

Integration complexity appears low for developers already using the Responses API. The shell tool and containers are designed to slot into existing Responses API workflows rather than requiring a separate integration path, according to OpenAI's documentation.

What This Means

OpenAI is betting that owning the full stack — from model inference to compute environment — will become the default choice for developers building production agents, and this release is the clearest evidence yet of that strategy in action.