Research Projects

Development of Spoken Language Corpora for Under-Resourced Languages

Implementing Organization

Pandit Deendayal Energy University
Principal Investigator
Dr. Tanmay Bhowmik
Pandit Deendayal Energy University

Project Overview

Under-resourced languages are those for which resources such as speech data, language models, and text corpora are limited. They are often spoken by smaller communities and are less well studied than more widely spoken languages. Supporting them requires building speech corpora from spoken language, including its prosodic words. Such corpora can improve the uniformity and robustness of current automatic speech recognition (ASR) systems. A spoken language corpus is a collection of recorded speech that is transcribed and annotated with linguistic information, such as phonetic and prosodic features, and can be used to develop and evaluate speech recognition and language processing systems. These corpora are needed for training speech recognition systems, developing language models, conducting linguistic research, and preserving cultural heritage. Speech recognition systems typically rely on machine learning algorithms, which require large amounts of annotated speech data; a spoken language corpus provides a foundation for training these systems and improving their accuracy and performance. Linguistic research on spoken language corpora can focus on phonetics, prosody, and syntax to deepen our understanding of a language and its structure. In short, spoken language corpora are crucial resources for speech and language technology, linguistic research, and cultural heritage preservation.
Funding Organization
Science and Engineering Research Board (SERB), New Delhi
Anusandhan National Research Foundation (ANRF)
Quick Information
Area of Research
Engineering Sciences
Start Year
2024
End Year
2027
Sanction Amount
₹ 18.30 lakh
Status
Ongoing
Output
No. of Research Paper
00
Technologies (If Any)
00
No. of PhD Produced
N/A
Startup (If Any)
00
No. of Patents
Filed: 00
Granted: 00