DSpace About DSpace Software
 

Dyuthi @ CUSAT >
Ph.D THESES >
Faculty of Technology >

Please use this identifier to cite or link to this item: http://purl.org/purl/3721

Title: Speaker identification using models for phonemes
Authors: Babu, Anto P
Dr.Sridhar, C S
Keywords: Automatic Speaker Recognition
Speech recognition and speaker recognition
Probablistic Dependence Measure
Issue Date: Oct-1990
Publisher: Cochin University of Science And Technology
Abstract: Motivation for Speaker recognition work is presented in the first part of the thesis. An exhaustive survey of past work in this field is also presented. A low cost system not including complex computation has been chosen for implementation. Towards achieving this a PC based system is designed and developed. A front end analog to digital convertor (12 bit) is built and interfaced to a PC. Software to control the ADC and to perform various analytical functions including feature vector evaluation is developed. It is shown that a fixed set of phrases incorporating evenly balanced phonemes is aptly suited for the speaker recognition work at hand. A set of phrases are chosen for recognition. Two new methods are adopted for the feature evaluation. Some new measurements involving a symmetry check method for pitch period detection and ACE‘ are used as featured. Arguments are provided to show the need for a new model for speech production. Starting from heuristic, a knowledge based (KB) speech production model is presented. In this model, a KB provides impulses to a voice producing mechanism and constant correction is applied via a feedback path. It is this correction that differs from speaker to speaker. Methods of defining measurable parameters for use as features are described. Algorithms for speaker recognition are developed and implemented. Two methods are presented. The first is based on the model postulated. Here the entropy on the utterance of a phoneme is evaluated. The transitions of voiced regions are used as speaker dependent features. The second method presented uses features found in other works, but evaluated differently. A knock—out scheme is used to provide the weightage values for the selection of features. Results of implementation are presented which show on an average of 80% recognition. It is also shown that if there are long gaps between sessions, the performance deteriorates and is speaker dependent. Cross recognition percentages are also presented and this in the worst case rises to 30% while the best case is 0%. Suggestions for further work are given in the concluding chapter.
Description: Department of Electronics, Cochin University of Science And Technology
URI: http://dyuthi.cusat.ac.in/purl/3721
Appears in Collections:Faculty of Technology

Files in This Item:

File Description SizeFormat
Dyuthi-T1677.pdfpdf2.88 MBAdobe PDFView/Open
View Statistics

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

 

Valid XHTML 1.0! DSpace Software Copyright © 2002-2010  Duraspace - Feedback