Google has launched Gemini 1.5 Flash-8B, a faster and more affordable version of its popular AI model. Designed for developers, it offers a 50% cost reduction while delivering faster processing speeds and enhanced performance. Ideal for low-power devices like sensors and smartphones, Gemini 1.5 Flash-8B aims to make AI more accessible and efficient.
Gemini 1.5 Flash-8B
Google LLC is releasing a quicker and more compact version of its well-liked Gemini 1.5 Flash artificial intelligence model.
Called Gemini 1.5 Flash-8B, it costs half as much as its predecessor. Google’s Gemini 1.5 Flash large language model is a lightweight variant geared for speed and efficiency that can be used on low-power devices like sensors and smartphones.
Gemini 1.5 Flash was announced in May at Google I/O 2024 and was made available to a select group of paying customers a few weeks later. A few weeks after that, the model became available for free in the Gemini mobile app, with certain usage limitations.
It became generally available at the end of June, offering high-speed processing, competitive pricing, and a context window of one million tokens. At launch, Google claimed that it was 40% faster on average than OpenAI’s GPT-3.5 Turbo and supported an input size 60% larger.
The initial version was created to offer a very low input token price, making it cost-competitive for developers. Customers such as Uber Technologies Inc. adopted it; the model powers the Eats AI assistant in the company’s UberEats food delivery service.
Google is now releasing Gemini 1.5 Flash-8B, which is 50% cheaper than 1.5 Flash and has double its rate limits, making it one of the lightest LLMs on the market. According to the company, it also offers lower latency on small prompts.
Gemini 1.5 Flash-8B is available to developers via Google AI Studio and the Gemini API at no cost.
Logan Kilpatrick, senior product manager for the Gemini API, stated in a blog post that the company has improved 1.5 Flash “considerably,” listened to developer feedback, and is “testing the limits” of what can be achieved with such lightweight LLMs.
He explained that the company had announced an experimental version of Gemini 1.5 Flash-8B last month. Since then, it has undergone further refinement, and it is now generally available for production use.
Kilpatrick claims that the 8B version nearly matches the 1.5 Flash model’s performance on several important benchmarks and that it performs particularly well on tasks such as chat, transcription, and long-context language translation.
Kilpatrick continued, “Developer feedback and our testing of what is possible with these models continue to inform our release of best-in-class small models. We believe this model has the greatest potential for tasks ranging from lengthy context summarization tasks to high volume multimodal use cases.”
He added, “Of all the Gemini models released to date, the Gemini 1.5 Flash-8B offers the lowest cost per intelligence.”
The pricing is in line with comparable models from OpenAI and Anthropic PBC. OpenAI’s most affordable model remains GPT-4o mini, with an input cost of $0.15 per million tokens, though that falls by 50% with batched queries and reusable prompt prefixes. Anthropic’s most economical model is Claude 3 Haiku at $0.25 per million tokens, with cached tokens costing just $0.03 per million.
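To make the per-million-token figures above more concrete, here is a minimal Python sketch that turns them into a dollar cost for a given number of input tokens. The dictionary keys and the helper names `input_cost` and `cost_table` are illustrative choices rather than official identifiers, and the table covers only the competitor prices quoted in this article, since no per-token price for Gemini 1.5 Flash-8B is given here.

```python
# Input prices in USD per million tokens, as quoted in the article.
# The labels are illustrative, not official model identifiers.
PRICE_PER_MILLION = {
    "GPT-4o mini": 0.15,
    "GPT-4o mini (50% discount)": 0.15 * 0.5,
    "Claude 3 Haiku": 0.25,
    "Claude 3 Haiku (cached tokens)": 0.03,
}


def input_cost(tokens: int, price_per_million: float) -> float:
    """Return the USD cost of sending `tokens` input tokens."""
    return tokens / 1_000_000 * price_per_million


def cost_table(tokens: int) -> dict:
    """Return the input cost for every price in the table, rounded to 4 decimals."""
    return {
        name: round(input_cost(tokens, price), 4)
        for name, price in PRICE_PER_MILLION.items()
    }


if __name__ == "__main__":
    # Example: a workload of 2 million input tokens.
    for name, cost in cost_table(2_000_000).items():
        print(f"{name}: ${cost:.4f}")
```

Output tokens are usually billed separately, so this covers only the input side of a bill.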
Kilpatrick added that the company is raising the model’s rate limits to make 1.5 Flash-8B more useful for simple, high-volume tasks. As a result, he said, developers can now send up to 4,000 requests per minute.
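For developers planning high-volume workloads, a client-side throttle is one way to stay under a per-minute cap like the 4,000 requests mentioned above. The following is a minimal sketch of a sliding-window limiter in Python. The class `MinuteRateLimiter` and its parameters are hypothetical and not part of the Gemini API or any Google SDK, and how the service itself counts requests is not described in this article.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Hypothetical client-side limiter: at most `limit` calls per `window` seconds."""

    def __init__(self, limit=4000, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock        # injectable so the limiter can be tested without sleeping
        self._stamps = deque()    # timestamps of the calls still inside the window

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the window, else return False."""
        now = self.clock()
        # Forget calls that have aged out of the sliding window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False


if __name__ == "__main__":
    limiter = MinuteRateLimiter()
    if limiter.try_acquire():
        print("OK to send a request")
    else:
        print("Rate limit reached; wait before retrying")
```

A caller would check `try_acquire()` before each request and back off when it returns `False`. Server-side limits can still reject requests, so this complements error handling and does not replace it.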
This post was last modified on October 6, 2024 12:04 am