OpenAI has started previewing a new tool called Operator that can navigate within a web browser. According to a blog post published on Thursday, the software is powered by what the company calls a Computer-Using Agent (CUA) model. “CUA is trained to interact with graphical user interfaces (GUIs)—the buttons, menus, and text fields that people see on a screen—just as humans do,” OpenAI says of the model. “This gives you the flexibility to perform digital tasks without using specific operating system or web APIs.”
The current version of Operator is based on OpenAI's GPT-4o model. It combines that model's vision capabilities with “advanced reasoning” trained through reinforcement learning. Operator can “break down tasks into multi-step plans and adaptively self-correct when challenges arise.” According to OpenAI, that capability represents the next stage in AI development.
As with previous research advancements, OpenAI warns that Operator is “still early and has limitations” and will not “work reliably in all scenarios yet.” For example, depending on the complexity of the task and the interface involved, the agent benefits greatly if the user takes a few extra minutes to write a more detailed prompt. According to The Verge, Operator will hand control back to the user if it ever gets stuck on a task. It will also hand over control when a website requests sensitive information, including login credentials. The company says it designed the tool to “reject harmful requests and block disallowed content.”
OpenAI will first make Operator available to users of its $200-per-month ChatGPT Pro subscription. It's also partnering with companies like Instacart to offer the agent on their platforms, though you'll still need a ChatGPT Pro subscription to try out those integrations.
Operator joins a growing list of AI agents that can navigate a web browser or an entire operating system. Anthropic was the first to offer this capability with the launch of its Claude 3.5 Sonnet model in October, followed more recently by Google with its Gemini 2.0 model and Project Mariner.