SUSTAINED VOWEL RECOGNITOR FOR NORMAL AND DYSPHONIC SPEECH
-
DOI: 10.22533/at.ed.317352310024
-
Keywords: HTK Toolkit; Hidden Markov Models; speech recognition.
-
Abstract:
This work applies speech recognition technology to sustained phonemes of interest, relying on the probabilistic evaluation performed by the recognition tool and analyzing how the hit rate and accuracy vary with the phonemes and with settings inherent to the tool. The study focuses on speakers aged 45 to 60, divided into a group with normal speech and a group with pathological speech; signals from both groups are used to train and test the recognition system. All signals were labeled manually with Praat, and recognition was performed with Hidden Markov Models (HMMs) using HTK (the Hidden Markov Model Toolkit). The system obtained a hit rate of 74.63% and an accuracy of 39.18%. Although these figures are low, they indicate that the method works in principle and could be improved by training with a larger set of signals.
- Mariana Regina Aguiar Catete
- Marcelo de Oliveira Rosa
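
The abstract reports a hit rate and an accuracy that differ widely, but does not state how either figure is computed. One possible reading, offered here only as an assumption, is that they correspond to the two scores printed by HTK's HResults tool: percent correct, which ignores insertion errors, and percent accuracy, which penalizes them. The sketch below shows that arithmetic. The function name and the error counts are hypothetical and are not taken from the paper.

```python
def htk_scores(n_labels, deletions, substitutions, insertions):
    """Return (percent_correct, percent_accuracy) in the style of HTK's HResults.

    n_labels      -- number of labels in the reference transcriptions (N)
    deletions     -- reference labels missing from the recognizer output (D)
    substitutions -- reference labels recognized as a different label (S)
    insertions    -- extra labels in the recognizer output (I)
    """
    if n_labels <= 0:
        raise ValueError("n_labels must be positive")
    hits = n_labels - deletions - substitutions         # H = N - D - S
    correct = 100.0 * hits / n_labels                   # ignores insertions
    accuracy = 100.0 * (hits - insertions) / n_labels   # penalizes insertions
    return correct, accuracy


# Hypothetical counts, chosen only to illustrate the gap between the two scores.
correct, accuracy = htk_scores(n_labels=200, deletions=10,
                               substitutions=40, insertions=70)
print(f"hit rate = {correct:.2f}%, accuracy = {accuracy:.2f}%")
# prints: hit rate = 75.00%, accuracy = 40.00%
```

Under this assumption the two scores differ only by the insertion term, so the 35.45-point gap between the reported 74.63% and 39.18% would correspond to inserted labels amounting to about 35% of the reference labels.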