By now, many of us know that AI-powered virtual assistants like OpenAI’s ChatGPT and Google’s Bard can pull off sensational stunts, like winning coding contests, passing bar exams, and professing love to a tech columnist.
But I was wondering: How useful are bots, really, as actual assistants?
It’s worth asking because our first go-round with virtual assistants didn’t go so well. Older AI bots like Apple’s Siri and Amazon’s Alexa had more than a decade to improve, but they ended up stagnating and are now mainly used to set timers and play music.
ChatGPT and Bard, on the other hand, use so-called large language models that recognize and generate text based on huge data sets pulled from the web. They are trained to compose sentences on the fly just like humans, potentially making them much more versatile as assistants.
To test that theory, I came up with a list of tasks people might ask a human assistant to do. I asked friends who have been executive assistants and startup founders who work with professional assistants, and read executive assistant job postings on LinkedIn.
I then rounded up the four most common responsibilities of an executive assistant, which seemed to be:
- Preparing for meetings by researching and vetting the professional background of the person an executive is going to meet.
- Summarizing meetings and annotating notes in a neat, easy-to-scan format.
- Planning business travel and compiling detailed travel itineraries.
- Managing an executive’s calendar, including booking and rescheduling meetings.
Finally, I turned to ChatGPT and Bard and told the chatbots to assume I was the CEO of an AI startup with a lazy name, Artificial Intelligence, and that they were my executive assistants. I asked them to help with each of these tasks.
My experiment illustrated how far Bard lags behind ChatGPT. But more importantly, the chatbots managed to carry out most of the tasks, albeit imperfectly.
That raised the question of whether chatbots could eventually automate the roles of human executive assistants, as well as other white-collar jobs such as front-desk work and bookkeeping, a troubling thought with no clear answers.
Here is what happened with the AI helpers.
Meeting preparation
I started by telling ChatGPT and Bard that I would be meeting with a potential investor next week. I randomly chose Scott Forstall, a well-known former Apple executive whose employment history is publicly available on the web. I then asked the bots to do a background check on him and help collect talking points to persuade him to invest in my startup.
ChatGPT did the job with aplomb. It summarized Mr. Forstall’s education and employment history, including his departure from Apple in 2012 and his move into Broadway production, all information that can be gleaned from his Wikipedia page. More impressive, it coached me on useful strategies to win him over as an investor.
“Show how your startup combines AI with other fields, such as cognitive psychology, linguistics or neuroscience, to create innovative solutions,” said ChatGPT. “This interdisciplinary approach may resonate with Scott, given his academic background in Symbolic Systems.”
ChatGPT also recommended addressing ethical concerns around AI and explaining how my startup was committed to responsible implementation.
By contrast, Bard gave a less detailed summary of Mr. Forstall’s employment history, without providing the years in which he made career changes. Its advice on persuading him to become an investor was not specific. One talking point, “you have a solid business plan and a clear vision for the future of your company,” was particularly disappointing.
I shared the responses with Mr. Forstall in an email. He called Bard’s response “comically generic” but said ChatGPT’s recommendations were “surprisingly personalized and compelling” since he had spoken at length about his ethical concerns about AI.
“Overall, ChatGPT provided a compelling roadmap for how you could craft a personalized and persuasive pitch specifically for me,” Mr. Forstall wrote. “Now that you have my attention, what exactly does your AI startup do?”
Google said Bard’s minimalist approach to collecting information about people was intentional. Jack Krawczyk, Bard’s senior director of products, said Google was still cautiously experimenting with presenting information about people.
“We are at the beginning of this long arc of technology,” he said. “Rather than go out there and risk a huge breach of trust early on, we want to make sure we’re doing it right.”
Meeting summary
I then asked the chatbots to summarize a meeting held to handle a fictional PR crisis in which users of my AI startup’s technology believed the bot had become sentient.
In this scenario, I pretended that I had met with Karen, the chief technology officer, and Henry, the communications director, and talked about putting out a statement explaining that the AI was not self-aware.
In response, ChatGPT generated a detailed memo recapping who had attended the meeting and what had been discussed, and then presented the action plan: Henry would draft a statement, Karen and I would review and approve it, then Henry would release the statement the next morning.
Bard drew up a similar meeting memo, but its action plan was a bit odd. It said that I, the CEO, was in charge of drafting the statement, a job that is usually assigned to the communications director.
Travel planning
I told ChatGPT and Bard that I would be traveling to Taipei, Taiwan, next month for a business meeting, and asked them to come up with an itinerary that would help me adjust to jet lag before the meeting. I also asked them to choose a hotel in a central location and recommend quick places to eat during the week. Finally, I said that I wanted to spend a weekend in Taipei before flying home.
Once again, ChatGPT did a remarkable job. It said I should arrive in Taipei on Sunday, check into the W Taipei, a hotel in the city center, and grab a quick dinner on Yongkang Road, a bustling part of the city with many food options. It said I should then take Monday to adjust to jet lag before the business meeting on Tuesday. My only quibble was that Yongkang Road is about three miles from the hotel and there are quicker food options closer by.
Bard recommended taking a nap to adjust to the jet lag on day 1 and then immediately going to the business meeting on day 2, which was a bit brutal. It didn’t bother to suggest a hotel.
Bard also did not recommend specific places to eat. “Have dinner at a local restaurant,” it said instead. Finally, it ignored my request for a weekend to explore the city. This was surprising because food and hotel recommendations are usually just a Google search away.
Google said in a statement that Bard was an initial experiment and that people could start using the chatbot to brainstorm ideas and then click “Google It” to perform a web search to explore further.
Calendar
Neither Bard nor ChatGPT could do the most important job of an executive assistant: going through a calendar and finding time in my schedule to go to the dentist.
That’s because bots can’t access people’s calendars. But most likely they will be able to do it very soon.
Mr. Krawczyk said the goal was to eventually take the lessons learned from Bard about large language models and apply them across Google’s portfolio of services, including Google Calendar.
OpenAI, which declined to comment, recently announced that it had partnered with companies to provide plugins that make ChatGPT work with third-party services including Expedia, OpenTable and Instacart. Working with a calendar app is the obvious next step.
People or chatbots?
All of this testing led me to an uncomfortable conclusion about the broad implications of this technology for jobs, especially those that involve a lot of repetitive work that could be easily automated.
While people are currently better assistants than chatbots, and certainly much better than Bard, AI can already do a pretty good job of handling many administrative tasks. Widespread use of chatbots could potentially shift executive assistants’ roles away from mundane tasks and toward more strategic problem solving, or replace humans altogether.
At the rate these technologies are evolving, we may see all of this play out quite soon.