News

Google Unveils Gemini 1.5 Flash-8B: Faster, Cheaper AI for Developers at Half the Cost

Google has launched Gemini 1.5 Flash-8B, a faster and more affordable version of its popular AI model. Designed for developers, it offers a 50% cost reduction while delivering faster processing speeds and enhanced performance. Ideal for low-power devices like sensors and smartphones, Gemini 1.5 Flash-8B aims to make AI more accessible and efficient.

Google LLC is releasing a quicker and more compact version of its well-liked Gemini 1.5 Flash artificial intelligence model.

At half the price, it is substantially cheaper and goes by the name Gemini 1.5 Flash-8B. Google’s Gemini 1.5 Flash big language model is a lightweight variant geared for speed and efficiency that may be used on low-power devices like sensors and smartphones.

A few weeks after the company’s May announcement at Google I/O 2024, Gemini 1.5 Flash was made available to a select group of paying clients. A few weeks later, the Gemini mobile app made the device available for free, but with certain usage limitations.

At the end of June, it became generally available and offered high-speed processing, a competitive price, and a context window with a million tokens. When it was first released, Google claimed that its input size was 60% larger and 40% faster on average than OpenAI’s GPT-3.5 Turbo.

Also Read: Google Develops AI with Human-Like Reasoning, Rivals OpenAI’s O1 Model

The initial version, which powers the Eats AI assistant in Uber Technologies Inc.’s UberEats food delivery service, was created to offer a very low token input fee, making it price-competitive for developers. Customers like Uber Technologies Inc. embraced the original version.

Google is releasing the Gemini 1.5 Flash-8B, which is 50% less expensive and has double the rate limits of the 1.5 Flash, making it one of the lightest LLMs on the market. According to the business, it also provides reduced latency on little prompts.

Gemini 1.5 Flash-8B is available to developers via Google AI Studio and the Gemini API at no cost.

Gemini API Senior Product Manager Logan Kilpatrick stated in a blog post that the business has improved 1.5 Flash “considerably,” listened to developer feedback, and is “testing the limits” of what can be achieved with such lightweight LLMs.

He clarified that last month, the business had announced the release of an experimental Gemini 1.5 Flash-8B version. Since then, it has undergone more refinement, and it is currently widely accessible for usage in production.

Also Read: Google Strengthens India’s Healthcare with Free AI-Powered Screenings for Cancer and TB

Kilpatrick claims that the 8-B version can nearly match the 1.5 Flash model’s performance on some important benchmarks and that it performs particularly well on jobs like chat, transcription, and extended context language translation.

Kilpatrick continued, “Developer feedback and our testing of what is possible with these models continue to inform our release of best-in-class small models.” “We believe this model has the greatest potential for tasks ranging from lengthy context summarization tasks to high volume multimodal use cases.”

Kilpatrick continued, “Of all the Gemini models released to date, the Gemini 1.5 Flash-8B offers the lowest cost per intelligence.”

Also Read: How to Use Google Gemini Live on Android App Users?

The pricing is in line with comparable models from Anthropic PBC and OpenAI. Regarding OpenAI, the most affordable model remains GPT-4o small, with an input cost of $0.15 per million; however, this decreases by 50% when using batched queries and reusable prompt prefixes. The Claude 3 Haiku model from Anthropic is the most economical at $0.25/M, while cached tokens are just $0.03/M.

Kilpatrick further stated that the corporation is attempting to increase the use of 1.5 Flash-8B for straightforward, high-volume activities by increasing its rate limits. As a result, 4,000 queries can now be sent by developers every minute, according to him.

Also Read: Google AI Skills House to Empower 10 Million Indians with Free AI Courses, Launched at Google India 2024

This post was last modified on October 6, 2024 12:04 am

Kumud Sahni Pruthi

A postgraduate in Science with an inclination towards education and technology. She always looks for ways to help people improve their lives by putting complex things into simple words through her writing.

Recent Posts

Best AI Model for Every Task: Image, Video, PPT and More

Pick your task, get the best AI model for it — images, video, slides, research,…

June 17, 2026

What is Agentic AI? Check How it Works with Real-Life Agentic AI Automation Examples

Learn what Agentic AI is, how it works, and how it differs from Generative AI.…

June 14, 2026

13 Best Free Online Vocal Remover AI Tools in 2026

Discover the 13 best free online vocal remover AI tools for 2026, designed to isolate…

January 4, 2026

Top 13 Yield Farming Platforms in 2026: Maximize APY with Secure and Trusted Crypto Tools

Explore the top 13 yield farming platforms for 2026, featuring secure, trusted, and high-APY crypto…

January 4, 2026

Top AI Learning Platforms for 2026: Master AI Skills with Coursera, edX, and Udacity

Explore the best AI learning platforms for 2026, including Coursera, edX, Udacity, and more. Learn…

January 4, 2026

13 Best Polygon Wallets in 2026 You Need to Checkout

Explore the 13 best Polygon wallets in 2026, comparing security, DeFi access, hardware and mobile…

January 1, 2026