Malayalam is one of the 22 scheduled languages in India with
more than 130 million speakers. This paper presents a report on
the development of a speaker-independent continuous speech
transcription system for Malayalam. The system employs
Hidden Markov Models (HMMs) for acoustic modeling and Mel
Frequency Cepstral Coefficients (MFCCs) for feature extraction.
It is trained on speech from 21 male and female speakers aged
20 to 40 years. The system obtained a word
recognition accuracy of 87.4% and a sentence recognition
accuracy of 84% when tested with a set of continuous speech
data.
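The two figures above are conventionally computed from an alignment of each recognized sentence against its reference transcript. The following is a minimal sketch under that assumption; the abstract does not state the exact scoring procedure used, so the formula (N - S - D - I) / N for word accuracy and the function names `edit_counts`, `word_accuracy`, and `sentence_accuracy` are illustrative, not the authors' implementation.

```python
def edit_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) for a minimum-error
    alignment of the hypothesis word list against the reference word list."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # cost[i][j] = (total_errors, S, D, I) for aligning ref[:i] with hyp[:j]
    cost = [[None] * cols for _ in range(rows)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        cost[i][0] = (i, 0, i, 0)          # only deletions
    for j in range(1, cols):
        cost[0][j] = (j, 0, 0, j)          # only insertions
    for i in range(1, rows):
        for j in range(1, cols):
            t, s, d, n = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                best = (t, s, d, n)            # match, no new error
            else:
                best = (t + 1, s + 1, d, n)    # substitution
            t, s, d, n = cost[i - 1][j]
            best = min(best, (t + 1, s, d + 1, n))   # deletion
            t, s, d, n = cost[i][j - 1]
            best = min(best, (t + 1, s, d, n + 1))   # insertion
            cost[i][j] = best
    _, s, d, n = cost[-1][-1]
    return s, d, n


def word_accuracy(pairs):
    """Percent word accuracy over (reference, hypothesis) word-list pairs,
    assuming the conventional definition (N - S - D - I) / N."""
    total_words = sum(len(ref) for ref, _ in pairs)
    errors = sum(sum(edit_counts(ref, hyp)) for ref, hyp in pairs)
    return 100.0 * (total_words - errors) / total_words


def sentence_accuracy(pairs):
    """Percent of sentences whose hypothesis matches the reference exactly."""
    correct = sum(1 for ref, hyp in pairs if ref == hyp)
    return 100.0 * correct / len(pairs)


# Placeholder tokens, not data from the paper.
pairs = [
    ("a b c d".split(), "a x c d".split()),  # one substitution
    ("e f".split(), "e f".split()),          # fully correct
]
word_acc = word_accuracy(pairs)
sent_acc = sentence_accuracy(pairs)
```

Under this definition a single wrong word makes the whole sentence count as incorrect, so sentence accuracy is normally lower than word accuracy.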
Description:
International Journal of Computer Applications (0975 – 8887)
Volume 19, No. 5, April 2011
The performance of any continuous speech recognition system depends on the accuracy of its acoustic model. Hence,
preparing a robust and accurate acoustic model leads to satisfactory recognition performance for a speech
recognizer. In acoustic modeling of phonetic units, context information is of prime importance because phonemes are
found to vary according to the place of occurrence in a word. In this paper we compare and evaluate the effect of
context-dependent tied (CD tied), context-dependent (CD), and context-independent (CI) models in the
context of continuous speech recognition for the Malayalam language. The database for the speech recognition
system contains utterances from 21 speakers, 11 female and 10 male. Our evaluation results show that CD tied
models outperform CI models by more than 21%.
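To make the CI versus CD distinction concrete, the sketch below expands a context-independent phone sequence into context-dependent labels. It assumes triphone context (one phone to the left and one to the right) and the common `left-center+right` label notation; the abstract does not specify the context width or the toolkit, and the function name `to_triphones` and the sample phone sequence are hypothetical.

```python
def to_triphones(phones):
    """Expand a context-independent phone sequence into context-dependent
    labels of the form left-center+right (assumed triphone context)."""
    labels = []
    for i, center in enumerate(phones):
        left = phones[i - 1] if i > 0 else None
        right = phones[i + 1] if i + 1 < len(phones) else None
        label = center
        if left is not None:
            label = f"{left}-{label}"
        if right is not None:
            label = f"{label}+{right}"
        labels.append(label)
    return labels


# Hypothetical phone sequence for one word, for illustration only.
phones = ["k", "a", "l", "a"]
triphones = to_triphones(phones)

# CI models need one unit per distinct phone; CD models need one per
# distinct phone-in-context, so the inventory grows.
ci_units = set(phones)
cd_units = set(triphones)
```

Here the two occurrences of the phone `a` become two different CD units because their neighbors differ, which is how a CD model captures position-dependent variation. Tying then lets acoustically similar CD units share parameters, which keeps the enlarged inventory trainable on limited data; that clustering step is not shown here.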