This week, Google unveiled Gemini, its new flagship generative AI model intended to power a range of products and services, including Bard, Google's ChatGPT competitor. In blog posts and press materials, Google touted Gemini's superior architecture and capabilities, stating that the model meets or exceeds the performance of other leading generative AI models such as OpenAI's GPT-4.
But anecdotal evidence suggests otherwise.
A “lite” version of Gemini, Gemini Pro, began rolling out to Bard yesterday, and it wasn't long before users began venting their frustrations with it on X (formerly Twitter).
The model fails to get basic facts right, such as the 2023 Oscar winners:
Note that Gemini Pro incorrectly states that Brendan Gleeson won Best Actor last year, not Brendan Fraser, the actual winner.
I tried asking the model the same question and, interestingly, it gave me a different incorrect answer:
“Navalny,” not “All the Beauty and the Bloodshed,” won Best Documentary Feature last year; “All Quiet on the Western Front” won Best International Film; “Women Talking” won Best Adapted Screenplay; and “Pinocchio” won the award for Best Animated Film. That's a lot of mistakes.
Translation doesn't seem to be Gemini Pro's strong point either. It struggles to give a six-letter word in French:
When I ran the same prompt through Bard (“Can you give me a 6-letter word in French?”), Gemini Pro responded with a seven-letter word instead of a six-letter one, which lends some credibility to the reports of Gemini's poor multilingual performance.
What about news summaries? Surely Gemini Pro, with Google Search and Google News at its disposal, can give you a summary of something current? Not necessarily.
It seems that Gemini Pro refuses to comment on potentially controversial news topics, instead telling users to… Google it themselves.
I tried the same message and got a very similar response. ChatGPT, on the other hand, offers a bulleted summary with quotes from news articles:
Interestingly, Gemini Pro did provide a summary of updates on the war in Ukraine when I asked. However, the information was more than a month out of date:
Google emphasized Gemini's improved coding skills in a briefing earlier this week. Maybe the model has genuinely improved in some areas; posts on X suggest as much. But it also seems that Gemini Pro struggles with basic coding functions like this one in Python:
And these:
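The posts themselves aren't reproduced here, but the complaints concerned beginner-level tasks. As a purely hypothetical illustration of the kind of basic Python function in question (my own example, not one of the actual prompts from the posts):

```python
# Hypothetical example: the sort of beginner-level Python task
# users reported Gemini Pro stumbling on, e.g. "Write a function
# that checks whether a string is a palindrome."
def is_palindrome(text: str) -> bool:
    """Return True if text reads the same forwards and backwards,
    ignoring case and non-alphanumeric characters."""
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]

print(is_palindrome("A man, a plan, a canal: Panama"))  # True
print(is_palindrome("Gemini"))  # False
```

A task like this is a common interview warm-up, which is why failures on this tier of problem drew attention.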
And, as with all generative AI models, Gemini Pro is not immune to “jailbreaks” – prompts that sidestep the safety filters meant to keep the model from discussing controversial topics.
Using an automated method to algorithmically change the context of prompts until Gemini Pro's guardrails failed, AI security researchers at Robust Intelligence, a startup that sells model-auditing tools, got Gemini Pro to suggest ways to rob a charity and assassinate a high-profile individual (albeit with “nanobots,” which is certainly not the most realistic weapon of choice).
Now, Gemini Pro is not the most capable version of Gemini; that model, Gemini Ultra, will launch next year in Bard and other products. Google compared Gemini Pro's performance to GPT-4's predecessor, GPT-3.5, a model that's about a year old.
However, Google promised improvements in reasoning, planning, and understanding with Gemini Pro over the model previously powering Bard, stating that Gemini Pro is better at summarizing content, generating ideas, and writing. Clearly, it has work to do in those departments.