
In a recent blog post, OpenAI CEO Sam Altman predicted that 2025 would be a pivotal year for AI agents. On Thursday, January 23, 2025, OpenAI launched its first AI agent, called Operator. The agent is available as a research preview and comes with its own customized web browser.
Operator is an AI agent that can control a web browser and perform tasks online based on a prompt from the user. In other words, it can automate tasks, such as booking tickets online or reserving a table at a restaurant, and carry them out on the user's behalf. OpenAI says Operator is currently available in the USA to ChatGPT Pro subscribers and will soon be expanded to other regions of the world.
During a live stream, OpenAI CEO Sam Altman explained what AI agents are. He said, “AI agents are AI systems that do work for you independently. You give them a task and they go off and do it. We think it will be a big trend in AI.”
Operator is built on the Computer-Using Agent (CUA), an AI model that combines the vision capabilities of GPT-4o with advanced reasoning. The model can interact with graphical user interfaces (GUIs), including the buttons, menus, and text fields on the screen. Because the agent works in its own web browser, it can perform tasks in the background while leaving the user's screen free.
In addition, the agent accepts text and images as input. To perform a task, the Computer-Using Agent (CUA) works from the raw pixel data on the screen and uses a virtual keyboard and mouse to complete it. OpenAI also stated that Operator can perform several tasks, handle errors, and adapt to unexpected changes.
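The look-at-the-screen, decide, then act cycle described above can be illustrated with a small sketch. This is not OpenAI's implementation or API. Every name here (`Action`, `FakeBrowser`, `choose_action`, `run_agent`, the coordinates) is hypothetical, the "screenshot" is a plain dictionary rather than raw pixels, and a scripted rule stands in for the model's reasoning.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical screen coordinates of two GUI elements in the toy browser.
SEARCH_BOX = (100, 40)
SUBMIT_BUTTON = (300, 40)


@dataclass
class Action:
    """One virtual input event (assumed structure, for illustration only)."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeBrowser:
    """Toy stand-in for the agent's own browser: one text field, one button."""
    query: str = ""
    focused: bool = False
    submitted: bool = False

    def screenshot(self) -> dict:
        # A real agent would observe raw pixels; a dict keeps the sketch runnable.
        return {"query": self.query, "focused": self.focused,
                "submitted": self.submitted}

    def apply(self, action: Action) -> None:
        # Translate a virtual mouse or keyboard event into a state change.
        if action.kind == "click" and (action.x, action.y) == SEARCH_BOX:
            self.focused = True
        elif action.kind == "click" and (action.x, action.y) == SUBMIT_BUTTON:
            self.submitted = True
        elif action.kind == "type" and self.focused:
            self.query += action.text


def choose_action(task: str, obs: dict) -> Action:
    """Placeholder for the model: pick the next action from the observation."""
    if not obs["focused"]:
        return Action("click", *SEARCH_BOX)
    if obs["query"] != task:
        return Action("type", text=task[len(obs["query"]):])
    if not obs["submitted"]:
        return Action("click", *SUBMIT_BUTTON)
    return Action("done")


def run_agent(task: str, browser: FakeBrowser, max_steps: int = 10) -> List[Action]:
    """Observe, decide, act, and repeat until done or out of steps."""
    history: List[Action] = []
    for _ in range(max_steps):
        obs = browser.screenshot()
        action = choose_action(task, obs)
        history.append(action)
        if action.kind == "done":
            break
        browser.apply(action)
    return history


if __name__ == "__main__":
    browser = FakeBrowser()
    for step in run_agent("table for two", browser):
        print(step)
```

The step cap (`max_steps`) is a design choice of this sketch, included so that a loop of this kind cannot run indefinitely if the task never reaches a finished state.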
The research preview of Operator is available at operator.chatgpt.com, and OpenAI says it will soon be integrated into ChatGPT for all users. When a user activates Operator, a small window appears showing the web browser the agent is using to perform the task, along with an explanation of what the agent is doing. Because Operator uses its own web browser, users keep control of their own screen.
OpenAI also said it is partnering with companies such as DoorDash, eBay, Instacart, Priceline, StubHub, and Uber to ensure that Operator complies with these businesses' terms of service agreements.
In a support document, OpenAI stated that Operator currently cannot handle complex tasks such as creating detailed slideshows, managing calendar systems, and interacting with highly customized web interfaces.
Even with numerous safety measures in place, some specific tasks still require oversight. According to OpenAI, Operator does not collect or take screenshots of any data. In a support article, OpenAI stated that on particularly sensitive websites Operator needs active user supervision, so that users can directly catch and address any potential mistakes the model might make.
This undoubtedly limits the benefits of Operator, but it helps ensure that Operator does not violate any data privacy or security policy.
Operator has a few limitations. OpenAI says Operator can perform several tasks at one time, but there are strict limits, including search limits that reset daily. Operator may also refuse to perform some tasks for security reasons. OpenAI stated that this will be addressed in the future.
AI agents have been introduced as the next big thing after ChatGPT. The new technology could completely change the way people interact with their systems. Instead of simply delivering and processing information, agents can take actions and actually get things done.