Please use this identifier to cite or link to this item:
http://purl.org/purl/3721
Title: | Speaker identification using models for phonemes |
Authors: | Babu, Anto P; Dr. Sridhar, C S |
Keywords: | Automatic Speaker Recognition; Speech recognition and speaker recognition; Probabilistic Dependence Measure |
Issue Date: | Oct-1990 |
Publisher: | Cochin University of Science And Technology |
Abstract: | The motivation for speaker recognition work is presented
in the first part of the thesis, along with an exhaustive survey of
past work in this field. A low-cost system that avoids
complex computation has been chosen for implementation.
To achieve this, a PC-based system is
designed and developed. A 12-bit front-end analog-to-digital converter
(ADC) is built and interfaced to a PC. Software is developed to
control the ADC and to perform various analytical functions,
including feature vector evaluation. It is shown
that a fixed set of phrases incorporating evenly balanced
phonemes is well suited to the speaker recognition work
at hand. A set of phrases is chosen for recognition. Two
new methods are adopted for feature evaluation. Some
new measurements, involving a symmetry check method for pitch
period detection and ACE, are used as features.
Arguments are provided to show the need for a new
model of speech production. Starting from heuristics, a knowledge-based
(KB) speech production model is presented. In
this model, a knowledge base provides impulses to a voice-producing
mechanism, and constant correction is applied via a feedback
path. It is this correction that differs from speaker to
speaker. Methods of defining measurable parameters for use as features are described.
Algorithms for speaker recognition
are developed and implemented. Two methods are presented.
The first is based on the postulated model: the entropy
of the utterance of a phoneme is evaluated, and the transitions
of voiced regions are used as speaker-dependent features.
The second method uses features found in other works
but evaluates them differently. A knock-out scheme is used to
provide the weighting values for feature selection.
Results of the implementation are presented, showing
an average recognition rate of 80%. It is also shown that
performance deteriorates when there are long gaps between sessions,
and that this effect is speaker dependent. Cross-recognition percentages
are also presented; these rise to 30% in the worst case
and are 0% in the best case.
Suggestions for further work are given in the concluding
chapter. |
Description: | Department of Electronics, Cochin University of Science And Technology |
URI: | http://dyuthi.cusat.ac.in/purl/3721 |
Appears in Collections: | Faculty of Technology |