Image by author
For a while now, ChatGPT has been in the spotlight. Everyone talks about it and many people use it. What could go wrong?
Google has always aimed to maintain its reputation as an AI-first company, and it has done well so far. Over the last year, however, it has become clear that OpenAI took the lead with ChatGPT, and it was only a matter of time before Google tried to take that lead back.
CEO Sundar Pichai stated that:
"One of the reasons we were interested in AI from the beginning is that we always saw our mission as a timeless mission."
Introducing Google's Gemini.
If you haven't had a chance to watch the trailer yet, I highly recommend watching it here.
Gemini is Google's largest language model, which CEO Pichai first teased at a conference in June, and it is now officially launching to the public. So what's so good about Gemini, and why is ChatGPT shaking in its boots?
Gemini is not just a single AI model. It comes in different variants to meet different demands. For example, there is a lighter version called Gemini Nano, which is designed to run on Android devices. There is also Gemini Pro, which serves as the backbone of Bard and will be used to power many Google AI services.
But it does not end there. There is also Gemini Ultra, which Google describes as its most capable model and its most powerful LLM yet. Gemini Ultra appears to be designed for data centers and enterprise applications.
A quick breakdown:
- Gemini Ultra – the largest and most capable model, for highly complex tasks.
- Gemini Pro – the best model for scaling across a wide range of tasks.
- Gemini Nano – the most efficient model, for on-device tasks.
This family of three large language model variants has been built to understand and operate on different types of information, including text, code, images, audio and video. Multimodality at its finest.
So how good is it?
Google says it has rigorously tested and evaluated the Gemini models on a variety of tasks. Gemini Ultra is said to outperform current state-of-the-art results on 30 of the 32 academic benchmarks widely used in LLM research, including a whopping score of 90.0% on the MMLU benchmark.
Picture of Google Gemini
Gemini Ultra is reported to be the first model to outperform human experts on MMLU (Massive Multitask Language Understanding). MMLU combines 57 subjects, including mathematics, history, law, medicine, physics and more, to test world knowledge and problem-solving ability.
Looking at these benchmarks, we can see that Gemini's biggest advantage is its ability to understand and interact with video and audio.
We've seen OpenAI try to achieve this with the creation of DALL-E and Whisper. However, Google went a step further by building a multimodal model from the beginning. Google also highlighted improvements in coding with a new code generation system called AlphaCode 2, which is said to perform better than an estimated 85% of coding competition participants.
That said, benchmarks are just benchmarks. We will be able to fully understand all of Gemini's capabilities when everyday users interact with it.
If you want to learn more about Gemini's capabilities, watch this video:
If you are a Pixel 8 Pro user, you may have already seen some new features, such as the auto-summary feature in the Recorder app and Smart Reply in the Gboard keyboard, both thanks to Gemini Nano.
If you are eager to try Gemini Pro, you can do so now with Bard. Developers and enterprise customers will also be able to access Gemini Pro through Google AI Studio or Vertex AI on Google Cloud starting December 13.
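For developers wondering what that access might look like, here is a minimal sketch in Python using only the standard library. It is an illustration, not an official client: the endpoint URL, the `gemini-pro` model name and the response shape are assumptions based on the public Gemini REST API and may change, and `GEMINI_API_KEY`, `build_request` and `extract_text` are hypothetical names chosen for this example.

```python
import json
import os
import urllib.request

# Assumed endpoint and model name; check the current Gemini API docs before use.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-pro"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a text-generation request for a single prompt."""
    url = f"{API_BASE}/models/{MODEL}:generateContent?key={api_key}"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_text(response_json: dict) -> str:
    """Join the text parts of the first candidate in an assumed response shape."""
    parts = response_json["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)


if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")  # hypothetical variable name
    if key:
        req = build_request("Explain multimodality in one sentence.", key)
        with urllib.request.urlopen(req) as resp:
            print(extract_text(json.load(resp)))
    else:
        print("Set GEMINI_API_KEY to send a real request.")
```

Separating request building from sending keeps the sketch easy to inspect without an API key. In practice, Google's own SDKs are likely the more convenient route.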
If you're intrigued by Gemini Ultra, you may have to wait a little longer, as it will be available next year.
Note that Gemini is currently available only in English, with more languages to follow. CEO Pichai also stated that the company aims to integrate the model into Google's search engine, advertising products, the Chrome browser, and more.
This seems to be the moment for Google to take back the crown and show us why it has been at the forefront of AI innovation. What do you think will come next?
Nisha Arya is a data scientist and freelance technical writer. She is particularly interested in providing professional data science advice, tutorials and theory-based data science insights. She also wants to explore the different ways in which artificial intelligence can benefit the longevity of human life. A keen learner, she is looking to expand her technological knowledge and writing skills while helping to guide others.