On the Optimal Set of Features and the Robustness of Classifiers in Radar-based Silent Phoneme Recognition

Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review

Abstract

Silent speech recognition (SSR) is an active area of research with applications ranging from speech restoration to speech enhancement. Radar-based SSR has been proposed and investigated as a non-invasive method to infer vocal tract states and articulatory movements from measured changes in scattering parameters. One of the challenges in developing a radar-based SSR system is determining the optimal set of features from these measurements. In this study, we therefore investigated the following questions: (a) which features play the most significant role in classification; (b) how much each reflection and transmission spectrum contributes, and which frequencies are most important; (c) how the classifiers perform when using fewer features; and (d) how robust the classifiers are against different noise levels. The data used in this study consisted of 230 samples of 25 German phonemes (15 vowels, each in 10 contexts, and 10 consonants, each in 8 contexts) produced by two native German speakers. Using the full feature set, a Linear Discriminant Analysis (LDA) classifier achieved up to 94 % classification accuracy for speaker 1 and 84 % for speaker 2. Using only the most important features as identified by a decision tree, the classification accuracy deteriorated slightly in most conditions, but in one case it improved from 73.5 % to 81 %. Regarding robustness against noise, the accuracy of the LDA classifier dropped sharply with increasing noise levels, while the accuracy of the support vector machine (SVM) decreased less steeply.
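
The evaluation outlined in the abstract (ranking features with a decision tree, then comparing LDA and SVM under increasing noise) could be sketched roughly as follows. This is only an illustrative sketch using scikit-learn on synthetic data, not the authors' pipeline: the data set, the number of retained features (10), the SVM kernel, the noise model (additive Gaussian noise on the test features), and the noise levels are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; NOT the radar measurements from the study.
X, y = make_classification(
    n_samples=600, n_features=40, n_informative=10, n_redundant=5,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Rank features by decision-tree importance and keep the top 10
# (the cut-off of 10 is an arbitrary choice for this sketch).
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
top_k = np.argsort(tree.feature_importances_)[::-1][:10]
all_columns = np.arange(X.shape[1])

# Hypothetical noise levels: standard deviation of additive Gaussian noise.
noise_levels = [0.0, 0.5, 1.0, 2.0]
rng = np.random.default_rng(0)

classifiers = {
    "LDA": LinearDiscriminantAnalysis,
    "SVM": lambda: SVC(kernel="rbf"),  # kernel choice is an assumption
}
feature_sets = {"full": all_columns, "top-10": top_k}

# results[(classifier, feature set, noise level)] = test accuracy
results = {}
for clf_name, make_clf in classifiers.items():
    for set_name, columns in feature_sets.items():
        # Train on clean data restricted to the chosen feature columns.
        model = make_clf().fit(X_train[:, columns], y_train)
        for std in noise_levels:
            # Perturb only the test data, then score on the same columns.
            noisy = X_test + rng.normal(0.0, std, size=X_test.shape)
            results[(clf_name, set_name, std)] = model.score(noisy[:, columns], y_test)

for key, accuracy in results.items():
    print(key, round(accuracy, 3))
```

On synthetic data the resulting accuracies say nothing about the reported results; the sketch only shows how the four questions (a) to (d) map onto a feature-ranking step and a loop over classifiers, feature subsets, and noise levels.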

Details

Original language: English
Title of host publication: Studientexte zur Sprachkommunikation: Elektronische Sprachsignalverarbeitung 2021
Editors: Stefan Hillmann, Benjamin Weiss, Thilo Michael, Sebastian Möller
Publisher: TUDpress, Dresden
Pages: 112-119
Number of pages: 8
ISBN (print): 978-3-959082-27-3
Publication status: Published - 1 Mar 2021
Peer-reviewed: Yes

External IDs

ORCID /0000-0003-0167-8123/work/168716961

Keywords

  • Automatic speech recognition