Sarvam AI's launch of Sarvam-1, India's first indigenous multilingual LLM, is set to redefine AI for Indian languages, supporting ten major Indian languages on domestic infrastructure. The model's improved token efficiency enables faster, more inclusive applications.
On October 24, 2024, Sarvam AI introduced Sarvam-1, a Large Language Model (LLM) built specifically for Indian languages. The model stands out because it was trained entirely on domestic infrastructure, and it is described as India's first indigenous multilingual LLM.
With roughly 2 billion parameters and support for ten major Indian languages in addition to English, Sarvam-1 promises to improve AI's capabilities in a linguistically diverse nation like India.
Sarvam-1 is a significant development in artificial intelligence for Indian languages, supporting Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, and Telugu.
The model was trained on the Sarvam-2T dataset, which comprises around 2 trillion tokens and was curated specifically to raise the quality of training data for Indic languages. Training ran on domestic AI infrastructure powered by NVIDIA's H100 GPUs.
One of Sarvam-1's most impressive features is its token efficiency. Many existing models break words in Indian languages into 4 to 8 tokens for processing; Sarvam-1, by contrast, averages 1.4 to 2.1 tokens per word, so it can represent and process the same text with far fewer tokens than its predecessors. Sarvam AI also claims the model outperforms larger models such as Meta's Llama-3.2-3B on several benchmarks while maintaining competitive performance.
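To make the tokens-per-word figure concrete, here is a minimal, self-contained Python sketch of how such a "fertility" metric can be computed for any tokenizer. The `chunk_tokenizer` below is a purely illustrative stand-in that fragments words into fixed-size pieces, not Sarvam-1's actual tokenizer:

```python
def fertility(tokenize, text):
    """Average number of tokens the tokenizer produces per whitespace-separated word."""
    words = text.split()
    tokens = tokenize(text)
    return len(tokens) / len(words)

def chunk_tokenizer(text, chunk=4):
    """Illustrative tokenizer: splits each word into chunks of up to `chunk`
    characters, mimicking how subword tokenizers fragment unfamiliar scripts."""
    return [w[i:i + chunk] for w in text.split() for i in range(0, len(w), chunk)]

text = "namaste duniya"
print(fertility(chunk_tokenizer, text))  # 2 words -> 4 tokens -> 2.0
```

A lower fertility value means fewer tokens per word, which translates directly into shorter sequences and faster generation for the same text.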
The model can be downloaded from the Hugging Face Hub.
The development of Sarvam-1 involved addressing two major challenges: token inefficiency and poor data quality in Indic languages. By using synthetic data generation techniques, Sarvam AI built a robust training corpus that improves performance on tasks such as cross-lingual translation and question answering. The model's architecture lets it process language efficiently enough to be practical across a range of devices.
Sarvam-1 has demonstrated superior performance on industry benchmarks such as MMLU, Arc-Challenge, and IndicGenBench. It achieved an accuracy score of 86.11 on the TriviaQA benchmark across Indic languages, significantly higher than the scores of larger models like Llama-3.1 8B. Moreover, its inference speed is reported to be 4 to 6 times faster than that of larger models, making it particularly effective for real-time applications.
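The reported 4-to-6-times speedup compounds two effects: a smaller model decodes tokens faster, and a lower tokens-per-word ratio means fewer tokens are needed for the same text. A simple sketch of that second effect, with purely illustrative numbers (not measured figures for Sarvam-1):

```python
def effective_words_per_second(tokens_per_second, tokens_per_word):
    """Words of output per second, given decoder throughput and tokenizer fertility."""
    return tokens_per_second / tokens_per_word

# Illustrative only: a model decoding 100 tokens/s at 1.7 tokens/word
# delivers far more words/s than one decoding 120 tokens/s at 6 tokens/word.
print(effective_words_per_second(100, 1.7))  # ~58.8 words/s
print(effective_words_per_second(120, 6.0))  # 20.0 words/s
```

In other words, even at a lower raw token throughput, an efficient tokenizer can yield much faster real-world output.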
The launch of Sarvam-1 is significant for several reasons, chief among them that it aligns with India's ambition to become a leader in AI innovation tailored to its unique linguistic landscape.
The launch of Sarvam-1 may revolutionise the way AI interacts with Indian languages. Its open availability on platforms such as Hugging Face will encourage researchers and developers to explore potential applications and improve the model, and its success may inspire similar initiatives in other linguistically diverse regions of the world.
This post was last modified on October 26, 2024 4:25 am