IIT Gandhinagar Unveils Ganga-1B: A Powerful Pre-Trained Hindi Language Model

IIT Gandhinagar releases Ganga-1B, a pre-trained large language model for Hindi, outperforming all open-source models up to 7B in size. Developed from scratch with a highly curated Hindi dataset, Ganga-1B is a significant advancement in Indic language AI.

As part of its Unity project, IIT Gandhinagar has released Ganga-1B, a pre-trained large language model (LLM) for Hindi. Built from the ground up on a carefully curated Hindi dataset, Ganga-1B outperforms all open-source Hindi LLMs of up to 7B parameters.

According to Mayank Singh, assistant professor of computer science and engineering at IIT Gandhinagar, this is the first open-source Indic model from an academic research lab in India. He also revealed that it was built from scratch for less than INR 10 lakh. That is far less than the $5 million (roughly INR 41 crore, or about 4,100 lakh rupees) that Tech Mahindra spent on Project Indus.
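For readers who want to check the cost comparison, here is a minimal sketch of the arithmetic. The exchange rate of INR 83 per US dollar is an assumption (an approximate mid-2024 figure, not stated in the article), and the INR 10 lakh figure is treated as an upper bound because Singh said the cost was below it.

```python
# Rough cost comparison between Ganga-1B and Project Indus.
# INR_PER_USD is an ASSUMED exchange rate, not a figure from the article.
LAKH = 100_000
CRORE = 10_000_000
INR_PER_USD = 83.0


def usd_to_inr(amount_usd, rate=INR_PER_USD):
    """Convert a US dollar amount to rupees at the assumed rate."""
    return amount_usd * rate


ganga_cost_inr = 10 * LAKH               # upper bound quoted by Singh
indus_cost_inr = usd_to_inr(5_000_000)   # Project Indus figure from the article

indus_in_crore = indus_cost_inr / CRORE
indus_in_lakh = indus_cost_inr / LAKH
cost_ratio = indus_cost_inr / ganga_cost_inr

print(f"Project Indus: about INR {indus_in_crore:.1f} crore ({indus_in_lakh:.0f} lakh)")
print(f"Project Indus cost at least {cost_ratio:.0f}x the Ganga-1B budget")
```

Under that assumed rate, the Project Indus budget works out to several hundred times the Ganga-1B figure; a different exchange rate shifts the numbers but not the order of magnitude.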

The name honors the Ganga, the longest river flowing through the Indian subcontinent. The group's upcoming line of models will follow similar nomenclature, according to Singh.

Two MTech students, Hitesh Lodwal and Siddhesh Dosi, started the project. As Singh clarified, “Hitesh was mainly involved in data curation for Hindi, while Siddhesh concentrated on modeling, architecture, and system design.”

He mentioned that CDAC servers were used for training the model. “After receiving funds from multiple sponsored organizations, we paid CDAC to obtain a few dedicated nodes on the CDAC servers.” 

Singh also noted how much less expensive this computing is for educational institutions than Azure or Google Cloud. “We started the training almost six months ago, and Ganga was trained on one NVIDIA DGX A100, which has eight NVIDIA A100 Tensor Core GPUs,” Singh stated.

Singh said that Lingo, his research group at IIT Gandhinagar, has obtained more funding and intends to use Yotta’s infrastructure to train upcoming iterations of the model. In the meantime, Professor Ganesh Ramakrishnan of IIT Bombay is leading BharatGPT, an ecosystem that is developing Indic LLMs and is expected to release its models shortly.

Singh stated, “Our short-term goal is to produce high-quality Indian datasets and make them open-source for the general public.” He also discussed their next project, which is to create Indic LLM benchmarks.

Lingo will eventually provide models via APIs as well. “Like other companies, we will develop APIs for these models so that users can access them easily,” Singh stated.

Currently, the Lingo team is developing a mechanism dubbed “model editing” that would spare developers the conventional RAG and fine-tuning procedures. Fine-tuning adjusts all of a model’s parameters, whereas model editing alters only certain ones.
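The distinction between the two approaches can be illustrated with a toy sketch. This is not the Lingo team's mechanism, which the article does not describe; it only shows the general idea of updating every parameter versus a chosen subset. The parameter names, values, and the `apply_update` helper are all hypothetical.

```python
# Toy illustration only: a "model" is a dict of named parameter lists.
# All names and numbers are hypothetical; real LLMs have billions of parameters.

def apply_update(params, grads, lr, trainable=None):
    """Return new parameters after one gradient-style step.

    trainable=None updates every parameter (fine-tuning style);
    otherwise only the named parameters change (editing style).
    """
    updated = {}
    for name, values in params.items():
        if trainable is None or name in trainable:
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
        else:
            updated[name] = list(values)  # copied unchanged
    return updated


def changed_names(before, after):
    """List the parameter names whose values differ between two dicts."""
    return [name for name in before if before[name] != after[name]]


params = {
    "embed": [0.5, -0.25],
    "layer1.mlp": [1.0, 2.0],
    "layer2.mlp": [-1.0, 0.5],
}
grads = {
    "embed": [0.5, 0.5],
    "layer1.mlp": [1.0, -1.0],
    "layer2.mlp": [0.5, 0.25],
}

fine_tuned = apply_update(params, grads, lr=0.5)
edited = apply_update(params, grads, lr=0.5, trainable={"layer2.mlp"})

print("fine-tuning changed:", changed_names(params, fine_tuned))
print("editing changed:", changed_names(params, edited))
```

In the fine-tuning call every named parameter moves, while the editing call touches only `layer2.mlp` and leaves the rest of the toy model as it was.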

Singh added that they are working on a system that will allow updates to the LLM in one language to automatically update all supported languages.

This post was last modified on July 9, 2024 11:15 pm

Kumud Sahni Pruthi

A postgraduate in Science with an inclination towards education and technology. She always looks for ways to help people improve their lives by putting complex things into simple words through her writing.
