News

Google Unveils Gemini 1.5 Flash-8B: Faster, Cheaper AI for Developers at Half the Cost

Google has launched Gemini 1.5 Flash-8B, a faster and more affordable version of its popular AI model. Designed for developers, it offers a 50% cost reduction while delivering faster processing speeds and enhanced performance. Ideal for low-power devices like sensors and smartphones, Gemini 1.5 Flash-8B aims to make AI more accessible and efficient.

Google LLC is releasing a quicker and more compact version of its well-liked Gemini 1.5 Flash artificial intelligence model.

At half the price, it is substantially cheaper and goes by the name Gemini 1.5 Flash-8B. Google’s Gemini 1.5 Flash big language model is a lightweight variant geared for speed and efficiency that may be used on low-power devices like sensors and smartphones.

A few weeks after the company’s May announcement at Google I/O 2024, Gemini 1.5 Flash was made available to a select group of paying clients. A few weeks later, the Gemini mobile app made the device available for free, but with certain usage limitations.

At the end of June, it became generally available and offered high-speed processing, a competitive price, and a context window with a million tokens. When it was first released, Google claimed that its input size was 60% larger and 40% faster on average than OpenAI’s GPT-3.5 Turbo.

Also Read: Google Develops AI with Human-Like Reasoning, Rivals OpenAI’s O1 Model

The initial version, which powers the Eats AI assistant in Uber Technologies Inc.’s UberEats food delivery service, was created to offer a very low token input fee, making it price-competitive for developers. Customers like Uber Technologies Inc. embraced the original version.

Google is releasing the Gemini 1.5 Flash-8B, which is 50% less expensive and has double the rate limits of the 1.5 Flash, making it one of the lightest LLMs on the market. According to the business, it also provides reduced latency on little prompts.

Gemini 1.5 Flash-8B is available to developers via Google AI Studio and the Gemini API at no cost.

Gemini API Senior Product Manager Logan Kilpatrick stated in a blog post that the business has improved 1.5 Flash “considerably,” listened to developer feedback, and is “testing the limits” of what can be achieved with such lightweight LLMs.

He clarified that last month, the business had announced the release of an experimental Gemini 1.5 Flash-8B version. Since then, it has undergone more refinement, and it is currently widely accessible for usage in production.

Also Read: Google Strengthens India’s Healthcare with Free AI-Powered Screenings for Cancer and TB

Kilpatrick claims that the 8-B version can nearly match the 1.5 Flash model’s performance on some important benchmarks and that it performs particularly well on jobs like chat, transcription, and extended context language translation.

Kilpatrick continued, “Developer feedback and our testing of what is possible with these models continue to inform our release of best-in-class small models.” “We believe this model has the greatest potential for tasks ranging from lengthy context summarization tasks to high volume multimodal use cases.”

Kilpatrick continued, “Of all the Gemini models released to date, the Gemini 1.5 Flash-8B offers the lowest cost per intelligence.”

Also Read: How to Use Google Gemini Live on Android App Users?

The pricing is in line with comparable models from Anthropic PBC and OpenAI. Regarding OpenAI, the most affordable model remains GPT-4o small, with an input cost of $0.15 per million; however, this decreases by 50% when using batched queries and reusable prompt prefixes. The Claude 3 Haiku model from Anthropic is the most economical at $0.25/M, while cached tokens are just $0.03/M.

Kilpatrick further stated that the corporation is attempting to increase the use of 1.5 Flash-8B for straightforward, high-volume activities by increasing its rate limits. As a result, 4,000 queries can now be sent by developers every minute, according to him.

Also Read: Google AI Skills House to Empower 10 Million Indians with Free AI Courses, Launched at Google India 2024

This post was last modified on October 6, 2024 12:04 am

Kumud Sahni Pruthi

A postgraduate in Science with an inclination towards education and technology. She always looks for ways to help people improve their lives by putting complex things into simple words through her writing.

Recent Posts

Google is moving Android news to a virtual event before I/O

Google is launching The Android Show: I/O Edition, featuring Android ecosystem president Sameer Samat, to…

April 29, 2025

Top Generative AI Companies of the World 2025

The top 11 generative AI companies in the world are listed below. These companies have…

April 28, 2025

Veo 2 extends access to more Gemini Advanced Users

Google has integrated Veo 2 video generation into the Gemini app for Advanced subscribers, enabling…

April 25, 2025

Perplexity launches the iPhone voice assistant

Perplexity's iOS app now makes its conversational AI voice assistant compatible with Apple devices, enabling…

April 24, 2025

Ola’s AI arm Krutrim intends to raise $300 million

Bhavish Aggarwal is in talks to raise $300 million for his AI company, Krutrim AI…

April 22, 2025

World’s first humanoid half-marathon pits people against robots

The Beijing Humanoid Robot Innovation Center won the Yizhuang Half-Marathon with the "Tiangong Ultra," a…

April 22, 2025