OpenAI gave me one week to test its new AI agent, Operator, a system that can independently do tasks for you on the web.
Operator is the closest thing I’ve seen to the tech industry’s vision of AI agents: systems that can automate the boring parts of life, freeing us up to do the things we really love. However, judging from my experience with OpenAI’s agent, truly “autonomous” AI systems are still just out of reach.
OpenAI trained a new model to power Operator, which combines the visual understanding of GPT-4o with the reasoning capabilities of o1.
That model seems to work well for basic tasks; I watched Operator click buttons, navigate menus on websites, and fill out forms. The AI was sometimes successful at independently taking actions, and it works much faster than web-based agents I’ve seen from Anthropic and Google.
But during my trial, I found myself assisting OpenAI’s agent more than I’d like. It felt like I was coaching Operator through each problem, whereas I wanted to push certain tasks off my plate altogether.
Too often during my test, I had to answer several questions, grant permissions, fill out personal information, and help the agent when it got stuck.
In car terms, Operator is like driving a car with cruise control, occasionally taking your foot off the pedals and letting the car drive itself, but it’s far from full-blown autopilot.
In fact, OpenAI says Operator’s frequent pauses are by design.
The AI powering Operator, much like the AI powering chatbots like OpenAI’s ChatGPT, can’t reliably work independently for long periods of time, and it’s prone to the same kind of hallucinating. Because of that, OpenAI doesn’t want to give the system too much decision-making power or sensitive user information. Maybe that’s a safe choice by OpenAI, but it reduces Operator’s practicality.
That said, OpenAI’s first agent is an impressive proof of concept (and interface) for an AI that can use the front end of any website. But to create truly independent AI systems, tech companies will need to build more reliable AI models that don’t require this much guidance.
A little too ‘hands on’
My Operator trial coincided with the week I was moving apartments, so I had OpenAI’s agent help with moving logistics.
I asked Operator to help me buy a new parking permit. OpenAI’s agent told me, “Sure,” then opened a window into its browser on my PC’s screen.
Operator then ran a search for a San Francisco parking permit in the browser and took me to the correct city website, and even the right page.
Operator still lets you use the rest of your computer while it’s working, something that can’t be said for Google’s Project Mariner. That’s because OpenAI’s agent isn’t really working on your computer, but rather off in the cloud somewhere.

For my parking permit, I had to grant Operator permission to start different processes a few too many times. It also stopped to ask me to fill out forms with personal information, such as my name, phone number, and email address. At times, Operator also got lost, forcing me to take control of the browser and get the agent back on track.
In another test, I asked Operator to make me a reservation at a Greek restaurant. To its credit, Operator found me a nice place in my area with reasonable prices. But I had to answer more than half a dozen questions throughout the flow.

If you have to intervene six or more times just to book a reservation through an AI agent, at what point is it easier to just do it yourself? That’s a question I asked myself a lot while testing Operator.
Agent-as-a-platform
In a few of my tests, I ran into websites that blocked Operator for whatever reason. For example, I tried booking an electrician using TaskRabbit, but OpenAI’s agent told me that it ran into an error, and asked if it could use an alternative service instead. Expedia, Reddit, and YouTube also blocked the AI agent from accessing their platforms.
However, other services are embracing Operator with open arms. Instacart, Uber, and eBay collaborated with OpenAI for the launch of Operator, allowing the agent to navigate their websites on behalf of humans.
These businesses are preparing for a future where a subset of user interactions are facilitated by an AI agent.
“Customers are using Instacart through a variety of different entry points,” said Daniel Danker, chief product officer at Instacart, in an interview with TechCrunch. “We see Operator as, potentially, another one of those entry points.”
Letting OpenAI’s agent use Instacart’s website on behalf of a person seems like it would separate Instacart from its customers. However, Danker says Instacart wants to meet customers wherever they are.
“We really are bullish about our belief, similar to OpenAI, that agentic systems can have a major impact on how customers interact with digital properties,” said eBay’s chief AI officer, Nitzan Mekel-Bobrov, in an interview with TechCrunch.
Even if AI agents rise in popularity, Mekel-Bobrov says he expects users will always come to eBay’s website, noting that “online destinations are not going anywhere.”
Trust issues
I had some issues trusting Operator after it hallucinated a few times and nearly cost me several hundred dollars.
For instance, I asked the agent to find me a parking garage near my new apartment. It ended up suggesting two garages that it said would take just a few minutes to walk to.

Aside from being way out of my price range, the garages were actually really far from my apartment. One was a 20-minute walk away, and the other was a 30-minute walk. Turns out, Operator had put in the wrong address.
This is exactly why OpenAI doesn’t give its agent your credit card number, passwords, or access to email. If OpenAI didn’t let me intervene here, Operator would have wasted hundreds of dollars on a parking spot I didn’t need.
Hallucinations like this are a key roadblock to actually useful autonomous agents, ones that can take bothersome tasks off your plate. No one will trust agents if they’re prone to making basic mistakes, especially mistakes with real-world consequences.
With Operator, OpenAI seems to have built some impressive tools to let AI systems browse the web. But these tools won’t amount to much until the underlying AI can reliably do what users ask it to do. Until then, humans will be stuck assisting agents, not the other way around. And that kind of defeats the purpose.