Building Speech Corpus in Rapid Manner to Adapt a General Purpose ASR System to Specific Domain

EasyChair Preprint 7181, version 2

7 pages • Date: April 18, 2022

Mukund K Roy, Sunita Arora, Karunesh Arora and Shyam S Agarwal

Abstract

The Covid-19 pandemic has disrupted the traditional speech database collection process, which relies on reaching out to speakers one-to-one. In this paper, we describe an alternative approach adopted for faster construction of a Hindi speech dataset, used to build a domain-adapted Automatic Speech Recognition (ASR) system for the agriculture domain. We used two methods: collecting speech samples through an App, and extracting speech from domain-specific YouTube videos. We outline how the App was built, along with the filtering steps (filtering out signature music, advertisements and cross talk) and the post-processing steps applied to the speech database collected from online videos. The paper also describes a novel idea for making speech segments suitable for training an end-to-end ASR system. To save time, annotation combined the output of existing ASR systems with manual post-correction. Our experiment yielded speech data from 236 speakers through the App and 106 hours of speech data from online videos. Re-training the ASR system with the enhanced data shows that this exercise adapts it to a particular domain in a rapid manner.

Keyphrases: data collection, speech corpus, speech recognition

Links:

https://easychair.org/publications/preprint/PZcd

BibTeX entry

BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:

@booklet{EasyChair:7181,
  author    = {Mukund K Roy and Sunita Arora and Karunesh Arora and Shyam S Agarwal},
  title     = {Building Speech Corpus in Rapid Manner to Adapt a General Purpose ASR System to Specific Domain},
  howpublished = {EasyChair Preprint 7181},
  year      = {EasyChair, 2022}}