During the Cloud Next conference on Tuesday, Google announced the availability of Gemini 1.5 Pro. This next-generation Gemini 1.5 Pro model is available in Google AI Studio for developers. As per the Google Blog, Gemini 1.5 Pro will now be available in 180+ countries with native audio understanding, system instructions, JSON mode, and more.
The Google blog further mentioned, “We’re also launching new features like system instructions and JSON mode to give developers more control over the model’s output. Lastly, we’re releasing our next-generation text embedding model that outperforms comparable models. Go to Google AI Studio to create or access your API key, and start building.”
Read this article to explore more about Gemini 1.5 Pro, its features, price and accessibility.
What is Gemini 1.5 Pro?
Gemini 1.5 Pro is Google’s most capable generative AI model. Available in public preview on Vertex AI, it is optimized to scale across a wide range of tasks involving text, images, videos, audio, and even code.
This mid-size multi-modal can process between 128,000 and 1 million tokens, where “tokens” refers to subdivided bits of raw data. It is roughly eight times higher than OpenAI’s GPT-4 Turbo Max context and about four times more than Anthropic’s flagship model, Claude 3, can handle as input.
Features of Gemini 1.5 Pro
- Gemini 1.5 Pro is a multilingual and multimodal model. This means that it’s able to understand images and videos.
- The model can also analyze and compare content in media like TV shows, movies, radio broadcasts, conference call recordings, and more across different languages.
- This new version of Gemini Pro, which is supposed to be the middle-weight model of the Gemini family, already surpasses the biggest and most powerful model, Gemini Ultra, in performance.
- Gemini 1.5 Pro can generate transcriptions for video clips as well, although the jury’s out on the quality of those transcriptions.
- Google has added native audio or speech support, and Gemini 1.5 Pro can understand verbal prompts. Alongside this, a file API for handling files, system instructions, and JSON mode has also been added for developers to have better control over the model.
What is Gemini 1.5? All you need to know
What is the price of the Gemini 1.5 Pro?
According to a trending Reddit thread, Gemini 1.5 Pro is accessible to everyone, with audio, for free.
How to use the Gemini 1.5 Pro model?
Gemini 1.5 Pro is not available to people without access to Vertex AI or AI Studio. The AI model is currently available in more than 180 countries, including India.
Replying to a user on Google Cloud Community, Google staff member, Poala_Tenorio wrote steps to gain access to Gemini 1.5 Pro. The steps you need to follow are:
- Sign Up a Gemini Pro Account: Ensure you have a Gemini Pro account. If you’re already using MakerSuite, you might be halfway there since MakerSuite likely provides integration with Gemini. If you’re not sure, contact the MakerSuite support team for clarification.
- Request API Access: Once you have a Pro account, you’ll need to request access to the Gemini API. This often involves filling out a form on their website or contacting their support team directly.
- Provide Necessary Information: You may need to provide certain information, such as your account details, intended use of the API, and any specific requirements you have. Be prepared to explain why you need API access and what you plan to do with it.
- Receive API Credentials: Once your request is approved, you should receive API credentials, such as an API key and secret. Keep these credentials secure, and don’t share them with anyone unauthorized.
- Integrate API into Your Application: With your API credentials in hand, you can now integrate the Gemini API into your application or platform. Follow the documentation provided by Gemini to understand how to authenticate your requests and utilize the various endpoints available through the API.
Before deploying your application or platform with the Gemini API integration, thoroughly test it to ensure everything is working as expected. This helps identify and resolve any issues before they impact users.
Rowan Cheung, a user of X (formerly known as Twitter), was granted early access to the Gemini AI model and shared his observations about using it on social media. He took his observation to Twitter and wrote, “I uploaded the entire NBA dunk contest from last night and asked which dunk had the highest score. Gemini 1.5 was incredibly able to find the specific perfect 50 dunk and details from just its long context video understanding!”
According to Google, early users of Gemini 1.5 Pro are enabling the large context window for tasks like creating, debugging, and transforming code and automating the tagging of media archives’ metadata. Also, the multinational tech company previously said that latency is an area of focus and that it’s working to ‘optimize’ Gemini 1.5 Pro.
Get started today in Google AI Studio with Gemini 1.5 Pro to explore code examples and quickstarts in a new Gemini API Cookbook.
Google Gemini vs OpenAI’s ChatGPT: A Battle of AI Titans Compared