Development of a Multilingual Isolated Digit Speech Corpus


The paper was presented at the 2017 Conference of the Oriental Chapter of the International Committee for Coordination and Standardization of Speech Databases and Assessment Techniques (O-COCOSDA).

The purpose of Oriental COCOSDA (the oriental chapter of COCOSDA) is to exchange ideas, share information, and discuss regional matters concerning the creation, utilization, and dissemination of spoken language corpora of oriental languages, as well as methods for assessing speech recognition and synthesis systems. It also aims to promote speech research on oriental languages.


  • Emmanuel Malaay
  • Michael Simora
  • Ronald John Cabatic
  • Nathaniel Oco
  • Rachel Edita Roxas


We present a multilingual speech corpus for isolated digits. As a case study, we focused on languages spoken in the Philippines: English, Filipino, Ilocano, Cebuano, and Spanish. Our isolated-digit speech corpus has a duration of almost nine hours, collected from 262 speakers. The data were annotated at the word level and will be used to train acoustic models with automatic speech recognition (ASR) toolkits. Because the corpus is intended for developing an ASR system, the database must be sufficient for that purpose.
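For a rough sense of scale, the two figures quoted in the abstract (almost nine hours of audio, 262 speakers) can be turned into an average amount of audio per speaker. The short sketch below does only that arithmetic. It treats nine hours as an upper bound, since the abstract says "almost", and the function and constant names are illustrative rather than taken from the paper.

```python
# Back-of-envelope estimate from the figures quoted in the abstract.
# Names here are hypothetical; the paper does not define them.
TOTAL_HOURS_UPPER_BOUND = 9.0  # "almost nine hours", so treat as an upper bound
NUM_SPEAKERS = 262


def average_minutes_per_speaker(total_hours: float, num_speakers: int) -> float:
    """Return the mean recording time per speaker, in minutes."""
    if num_speakers <= 0:
        raise ValueError("num_speakers must be positive")
    return total_hours * 60.0 / num_speakers


avg_minutes = average_minutes_per_speaker(TOTAL_HOURS_UPPER_BOUND, NUM_SPEAKERS)
print(f"At most about {avg_minutes:.1f} minutes of audio per speaker")
```

This average says nothing about how the audio is distributed across the five languages or across individual digits, which the abstract does not report.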
