Glottal Closure Instant Detection using Echo State Networks
Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed
Contributors
Abstract
The times of excitation (Tx) of speech, also widely known as glottal closure instants (GCIs), denote the points in time at which the vocal folds close during the production of voiced speech. In this paper, we extend a previous approach based on a multilayer perceptron (MLP) by using Echo State Networks (ESNs), a variant of recurrent neural networks (RNNs). We show that the MLP and ESN approaches lead to similar results. The ESN model performed better than the MLP when the latter used only a single input sample (0.86 vs. 0.75 area under the ROC curve), whereas the MLP slightly outperformed the ESN (0.98 vs. 0.97 area under the ROC curve) when it was provided with a sufficient number of surrounding speech samples.
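For readers unfamiliar with the technique, the following is a minimal sketch of a generic Echo State Network with a ridge-regression readout, applied sample by sample to a synthetic pulse-train signal and scored by area under the ROC curve. It is not the implementation from the paper: the function names, the reservoir size, the spectral radius, the ridge parameter, and the toy signal are all assumptions chosen for illustration.

```python
import numpy as np

def make_reservoir(n_inputs, n_units, spectral_radius=0.9, seed=0):
    """Random input and recurrent weights (all values are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_units, n_inputs))
    w = rng.uniform(-0.5, 0.5, size=(n_units, n_units))
    # Rescale so the largest absolute eigenvalue equals the requested spectral radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(u, w_in, w):
    """Drive the fixed reservoir with inputs u of shape (T, n_inputs)."""
    x = np.zeros(w.shape[0])
    states = np.empty((len(u), w.shape[0]))
    for t, u_t in enumerate(u):
        x = np.tanh(w_in @ u_t + w @ x)  # only the readout is trained, not this update
        states[t] = x
    return states

def add_bias(states):
    return np.hstack([states, np.ones((len(states), 1))])

def train_readout(states, targets, ridge=1e-6):
    """Linear readout fitted by ridge regression."""
    s = add_bias(states)
    return np.linalg.solve(s.T @ s + ridge * np.eye(s.shape[1]), s.T @ targets)

def roc_auc(scores, labels):
    """Area under the ROC curve as the fraction of correctly ordered pos/neg pairs."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

def toy_demo(seed=0):
    """Synthetic stand-in for voiced speech: an impulse train through a decaying resonance."""
    n_samples, period = 4000, 80
    labels = np.zeros(n_samples, dtype=int)
    labels[::period] = 1  # hypothetical closure instants
    n = np.arange(64)
    kernel = np.exp(-n / 10.0) * np.cos(2 * np.pi * n / 16.0)
    speech = np.convolve(labels.astype(float), kernel)[:n_samples]
    w_in, w = make_reservoir(n_inputs=1, n_units=100, seed=seed)
    states = run_reservoir(speech[:, None], w_in, w)  # one input sample per time step
    half = n_samples // 2
    w_out = train_readout(states[:half], labels[:half].astype(float))
    scores = add_bias(states[half:]) @ w_out
    return roc_auc(scores, labels[half:])
```

Calling `toy_demo()` returns the held-out area under the ROC curve for this toy signal. The recurrent state is what lets a network of this kind see context while receiving only one input sample per step, which is the setting the abstract compares against the single-sample MLP.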
Details
| Original language | English |
|---|---|
| Title | Studientexte zur Sprachkommunikation: Elektronische Sprachsignalverarbeitung 2021 |
| Editors | Stefan Hillmann, Benjamin Weiss, Thilo Michael, Sebastian Möller |
| Publisher | Dresden : TUDpress |
| Pages | 161-168 |
| Number of pages | 8 |
| ISBN (Print) | 978-3-959082-27-3 |
| Publication status | Published - 1 March 2021 |
| Peer-review status | Yes |
External IDs
| ORCID | /0000-0003-0167-8123/work/168716963 |
|---|---|
| ORCID | /0000-0002-8149-2275/work/168719849 |
Keywords
- Phonetics and articulation