Sarvam AI aims to take on Google, OpenAI and others with its new Edge model: What it is and how it works
Domestic AI startup Sarvam AI recently announced the Sarvam Edge models. This suite of on-device AI models puts it in direct competition with offerings from Google and OpenAI in the Indian-language AI space. Unlike cloud-based models from these global players, Sarvam Edge runs entirely on consumer devices, covering speech recognition, speech synthesis, and translation, without requiring an internet connection. The company's pitch is straightforward: AI that works anywhere, costs nothing per query, and keeps user data on the device. Here’s everything that we know about Sarvam AI’s Edge models:
In a blog post, the Bengaluru-based company explains that Sarvam Edge is a collection of compact AI models built to run directly on consumer hardware rather than on remote servers. The initiative aims to bring AI functionality to users in India, including those in areas with unreliable internet connectivity. The company says it is developing Edge in collaboration with global device manufacturers.
The speech recognition model supports 10 Indian languages within a single 74-million-parameter model that occupies approximately 294MB on a device. It can automatically identify the language being spoken, without requiring the user to select it.
The model processes speech at about 8.5 times real-time speed and delivers its first token in under 300 milliseconds on a Qualcomm Snapdragon 8 Gen 3 chip. In benchmarks on the Vistaar dataset, which covers 59 test environments across domains like news and education, the Edge model outperformed Google Cloud STT in languages including Hindi, Gujarati, Kannada, Punjabi, and Telugu.
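To put the speed figure in concrete terms, the short sketch below converts a real-time multiple into processing time for a clip. The function name and the 60-second clip are illustrative assumptions, not something from Sarvam's announcement; only the 8.5x figure comes from the company.

```python
def processing_seconds(audio_seconds: float, speedup: float) -> float:
    """Time to process a clip when a model runs `speedup` times faster than real time."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


# Hypothetical 60-second clip at the reported 8.5x real-time speed.
print(round(processing_seconds(60.0, 8.5), 2))  # 7.06
```

At that rate, a minute of speech would take roughly seven seconds to transcribe, assuming the speed holds for longer recordings.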
The speech synthesis model has 24 million parameters and a device footprint of about 60MB. A single model supports eight speakers and ten languages, and each speaker's voice identity remains constant across languages. On a Samsung Galaxy S25 Ultra, the model generates its first audio output in 260 milliseconds and synthesises speech roughly 5.2 times faster than real time.
The model achieves a mean character error rate of 0.0173 on a standard benchmark, indicating that synthesised speech closely matches the intended text across languages. Custom voice cloning is also supported: a new voice can be added using about one hour of audio data and deployed within the same 60MB model file.
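A character error rate of 0.0173 works out to fewer than two wrong characters per 100. The company's exact evaluation setup is not described here, so the following is only a minimal sketch of how the metric is commonly computed: the edit distance between a reference text and a transcript of the synthesised audio, divided by the reference length. The function names are illustrative.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance divided by the number of characters in the reference."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# One substituted character in a four-character reference.
print(character_error_rate("abcd", "abxd"))  # 0.25
```

Python counts Unicode code points, so in Indic scripts a vowel sign or other combining mark is scored as its own character. Benchmarks may normalise text differently before scoring.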
The translation model has 150 million parameters and an on-device footprint of around 334MB. It handles bidirectional translation across 110 language pairs spanning 10 Indian languages and English, translating directly between them without routing through an intermediate language.
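The 110 figure is consistent with counting every translation direction among 11 languages as a separate pair, although the article does not spell out how the pairs are counted. A quick check under that assumption, using placeholder labels because not all 10 Indian languages are named here:

```python
from itertools import permutations

# Placeholder labels standing in for the 10 Indian languages.
languages = ["English"] + [f"Indian language {i}" for i in range(1, 11)]

# Each ordered (source, target) pair is one translation direction.
pairs = list(permutations(languages, 2))
print(len(pairs))  # 110
```

With 11 languages, each can be translated into the 10 others, giving 11 × 10 = 110 directions.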
On a Snapdragon 8 Gen 3 processor, the translation model produces its first token in roughly 200 milliseconds and streams at around 30 tokens per second. On the FLORES benchmark, the model outperforms Meta's NLLB-600M, which has four times as many parameters, across all tested Indian languages.
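Those two latency figures can be combined into a rough estimate of how long a full translation takes to appear. This is a back-of-the-envelope sketch that assumes a constant streaming rate after the first token; the function name and the 61-token example are illustrative, not measured values.

```python
def estimated_seconds(num_tokens: int,
                      first_token_s: float = 0.2,
                      tokens_per_s: float = 30.0) -> float:
    """Rough time to stream `num_tokens`, assuming a steady rate after the first token."""
    if num_tokens < 1:
        raise ValueError("num_tokens must be at least 1")
    return first_token_s + (num_tokens - 1) / tokens_per_s


# Hypothetical 61-token output: 0.2 s for the first token, then 60 more at 30 per second.
print(round(estimated_seconds(61), 2))  # 2.2
```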
Because all processing occurs on the device, no user data is sent to external servers. There is also no per-query cost, which Sarvam says makes AI tools viable for education, small businesses, and assistive applications where cloud pricing would otherwise be a barrier.