Gemini AI: 3 New Developments

Waseem Abbas | February 16, 2024

5 minutes read

Google is trying to make a splash with Gemini, a new generative AI platform that has made its public debut. Gemini looks promising in some respects but falls short in others. So what is Gemini? How can you use it? And how does it compare to the competition?

We put together this guide to help you keep up with the latest Gemini developments. It will be updated as new Gemini models and features are introduced.

What is Gemini?

Gemini is Google’s long-awaited next-generation generative AI model family, created by Google’s AI research teams DeepMind and Google Research. It comes in three flavors:

Gemini Ultra is the flagship model, while Gemini Pro is a “lite” variant. Gemini Nano is a tiny, “distilled” model that works on mobile devices like the Pixel 8 Pro.

All Gemini models were trained to be “natively multimodal”, that is, able to work with and use more than just text. They were pre-trained and fine-tuned on a variety of audio, images, and videos, as well as a large set of codebases and text in several languages.

What can Gemini do?

Gemini models can handle a range of tasks, such as generating artwork, captioning images and videos, and transcribing speech. Some of these capabilities are not yet available, but Google says it will add them soon.

It’s hard to take the company at its word. Google underdelivered with the original Bard launch, and a video meant to show Gemini’s abilities turned out to be heavily edited. Still, to Google’s credit, Gemini is available now, albeit in limited form.

Gemini Ultra:

Gemini Ultra is the foundation model on which the others are built, and only a few people have tried it. So far, only a select set of customers have used it in certain Google apps and services, though Google plans to make it more widely available later this year. Most of what we know about Ultra comes from Google demos, so it should be treated with some caution.

According to Google, Gemini Ultra can help with physics homework by solving problems step by step and spotting errors in filled-in answers. It can also find scientific papers related to a problem, pull out information from those papers, and update a chart with new data.

Gemini Ultra can technically generate images, but this feature won’t be available at launch. Google says that is because the mechanism is more complex than the way apps like ChatGPT generate images. Rather than passing prompts to a separate image generator, Gemini outputs images natively, without an intermediate step.

Gemini Pro:

Gemini Pro is publicly available today, but what it can do depends on where you use it. Google says that Gemini Pro, which first launched in Bard in text-only form, is better than LaMDA at reasoning, planning, and understanding. A study by Carnegie Mellon and BerriAI researchers agrees that Gemini Pro is better than OpenAI’s GPT-3.5 at handling longer and more complex chains of reasoning.

However, the study also found that, like other large language models, Gemini Pro struggles with math problems involving many digits. Users have also found plenty of flaws in its reasoning, even on simple questions such as who won the latest Oscars. Google has promised improvements, but it has not said when they will arrive.

Gemini Pro is also available in Vertex AI, Google’s AI platform, where it takes text as input and produces text as output. A related model, Gemini Pro Vision, can process text and imagery, including photos and videos, and output text, similar to OpenAI’s GPT-4 with Vision model.

Gemini Nano:

Gemini Nano is a much smaller version of the Gemini Pro and Ultra models. It’s small enough to run directly on some phones instead of sending tasks to a remote server. Right now, it powers two features on the Pixel 8 Pro: Summarize in Recorder and Smart Reply in Gboard.

The Recorder app lets you record audio and transcribe it into text. Gemini creates summaries of recorded conversations or presentations. These summaries are available even without Wi-Fi or a cellular signal, and your data stays on your phone.

Gemini Nano is also in Gboard, Google’s keyboard app, though developers are still testing it there. It powers a feature called Smart Reply, which suggests the next thing you might want to say when chatting in a messaging app. Right now, the feature only works with WhatsApp, but Google says it will come to more apps in 2024.

Is Gemini better than OpenAI’s GPT-4?

We won’t know for sure until Google releases Ultra later this year, but Google claims that Gemini improves on the current state of the art, which is usually OpenAI’s GPT-4.

Google claims that Gemini Ultra beats current records on the benchmarks used to evaluate large language models. Gemini Pro is also said to be better than GPT-3.5 at tasks like summarizing and writing.

But even if Gemini scores well on these tests, some people are unimpressed. They say Gemini Pro gets basic facts wrong, struggles with translations, and gives poor coding suggestions.

How much will Gemini cost?

Gemini Pro is free to use in Bard and, during the preview period, in AI Studio. However, you’ll have to pay for it in Vertex AI once it leaves preview.

Gemini Pro will cost $0.0025 per input character and $0.00005 per output character. Vertex customers are billed per 1,000 characters (about 140 to 250 words) and, for models like Gemini Pro Vision, per image ($0.0025).

For example, assuming a 500-word article contains about 2,000 characters, summarizing it with Gemini Pro would cost $5 (2,000 × $0.0025). Generating an article of similar length would cost $0.10 (2,000 × $0.00005).
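
The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration, not an official pricing calculator: the per-character rates are the ones quoted above, while the 2,000-character length for a 500-word article (roughly four characters per word) and the function name `estimate_cost` are assumptions made for this example.

```python
from decimal import Decimal

# Rates quoted in this article for Gemini Pro in Vertex AI (USD per character).
INPUT_RATE = Decimal("0.0025")
OUTPUT_RATE = Decimal("0.00005")

# Assumption for illustration: a 500-word article is about 2,000 characters.
ARTICLE_CHARS = 2000


def estimate_cost(input_chars: int = 0, output_chars: int = 0) -> Decimal:
    """Return the estimated cost in USD for the given character counts."""
    return input_chars * INPUT_RATE + output_chars * OUTPUT_RATE


# Summarizing: the article is the input (the short summary's own
# output cost is ignored here, as in the example in the text).
summarize_cost = estimate_cost(input_chars=ARTICLE_CHARS)

# Generating: the article is the output.
generate_cost = estimate_cost(output_chars=ARTICLE_CHARS)

print(f"Summarize: ${summarize_cost:.2f}")
print(f"Generate:  ${generate_cost:.2f}")
```

`Decimal` is used instead of floats so that very small per-character rates multiply out without rounding noise. If your text averages more or fewer characters per word, change `ARTICLE_CHARS` and the estimates scale linearly.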

Where can you try Gemini?

Gemini Pro:

The easiest place to try Gemini Pro is Bard. Right now, a special version of Pro answers text queries in English in the U.S. More languages and countries will be added later.

Gemini Pro is also available in preview in Vertex AI through an API. The API is free to use for now, within limits. It supports 38 languages and certain regions, including Europe, and offers features like chat and filtering.

Elsewhere, Gemini Pro is available in AI Studio. Developers can use the service to develop and test prompts and Gemini-based chatbots, then get API keys to use them in their apps. Alternatively, they can export the code to a more fully featured IDE.

Google’s Duet AI for Developers, a set of tools to help with coding, will soon use a Gemini model. Around the same time, in early 2024, Google plans to add Gemini models to development tools for Chrome and Firebase, its mobile development platform.

Gemini Nano:

Gemini Nano is currently available on the Pixel 8 Pro and will be available on other devices later. Developers who want this model in their Android apps can sign up for an early look.

Frequently Asked Questions (FAQs)

Can you use Google Gemini AI?

Today, Gemini Advanced is available in English in more than 150 countries and territories, and Google says it will add more languages later. Gemini Advanced comes with the new Google One AI Premium Plan for $19.99 per month, which starts with a two-month free trial.

Is Gemini AI better than ChatGPT?

From my experience using both platforms, Gemini is better than ChatGPT at searching online and incorporating the information it finds into its answers.

Can I use Gemini AI for free?

In addition to the free Gemini version, Google will offer a more advanced service through the new app for $20 per month.

Are Gemini and Bard the same?

On February 8, 2024, Google renamed Bard to Gemini. The change reflects the advanced technology that powers the AI chatbot. Sundar Pichai, Google’s CEO, said in the announcement that Bard would now be known as Gemini.

Can Gemini generate images?

Yes. Gemini can now generate images from text prompts. It uses the Imagen 2 model to produce images that look good and match what the user asks for. Like other AI image tools, Gemini needs clear instructions to produce good results.
