IIIT-H working on Telugu voice assistants

The International Institute of Information Technology-Hyderabad (IIIT-H), which has formidable experience in language and speech processing, is now building a data set that is based on daily conversations

By Business Bureau | Published: 7th Jan 2021 9:35 pm

Hyderabad: Many smartphone users have used a voice assistant. However, that experience is mostly available in English and, to a lesser extent, in Hindi, Tamil and Marathi. That leaves several regional languages, including Telugu, untouched. But that will change soon.

The International Institute of Information Technology-Hyderabad (IIIT-H), which has formidable experience in language and speech processing, is now building a data set that is based on daily conversations. This will come in handy for building Telugu-speaking voice assistants, bots and other voice-based applications that will find use in tourism and hospitality, e-commerce, logistics, navigation and maps, banking and other industries that rely on Artificial Intelligence (AI).

IIIT-H has forayed into the automatic speech recognition (ASR) project under the Technology Development for Indian Languages (TDIL) initiative (also known as ‘Bahu Bhashik’) of the Ministry of Electronics and Information Technology. The project aims to overcome language barriers and enable a wider proliferation of information, communication and technology in all Indian languages. This involves automatic speech recognition, speech-to-speech translation and speech-to-text translation.

“We need about 2,000 hours of data sets that are based on conversations. Data sets are easy to obtain in controlled conditions like a laboratory or even a studio. However, they have to be based on daily conversations. They should also have variety: young, old, men, women, kids, different accents and other attributes. Conversations also have background noise,” said Prakash Yalla, Head, Technology Transfer Office at IIIT-H, who is heading the ASR project along with Dr Anil Kumar Vuppala, Associate Professor, Speech Processing Centre.

Right now, only about 50-60 hours of data sets are available. To achieve scale, Yalla said, the project will crowd-source the collection of data sets to connect voice (technology) with vernacular (languages). Attempts have been made to make conversations possible with Alexa in Hindi and with Siri in Indian-accented English, but a large number of regional languages are left out. However, experts in the field say local language searches in Tier 2 and 3 towns will continue to grow.

