An Investigation of Acoustic Features of the Lower Vocal Tract for Speaker Recognition
Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review
Contributors
Abstract
Speaker recognition systems often use mel-scaled cepstral coefficients (MFCCs) as their main features. In contrast to MFCCs, Godoy et al. (2015) proposed a different type of short-term spectral analysis that provides features related to the lower vocal tract (LVT). These features are calculated as the ratio of the acoustic short-time spectra during the closed and open phases of the glottal oscillation cycles, based on a pitch-synchronous analysis. They were suggested to be particularly speaker-specific and might therefore be suitable to substitute or complement MFCCs in speaker recognition systems. The present study investigated the benefit of these features in an i-vector-based speaker recognition system. Using the LVT features alone, the system achieved a speaker recognition rate of 92.3% with 63 enrolled speakers. When the LVT features were fused with conventional MFCC features, the recognition rate was about equal to the recognition rate using MFCC features alone (> 98%).
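The closed-to-open spectral ratio described in the abstract can be illustrated with a minimal sketch. It is not the procedure of Godoy et al. (2015) or of the present study, whose details the abstract does not give. It assumes that glottal closure and opening instants are already known as sample indices (for example from a separate detector), that each phase is Hann-windowed, and that the ratio is taken between log-magnitude spectra. The names `log_spectrum`, `lvt_features`, `cycles`, `n_fft` and `min_len` are hypothetical.

```python
import numpy as np

def log_spectrum(frame, n_fft=512, eps=1e-12):
    """Log-magnitude spectrum of one Hann-windowed frame (hypothetical helper).

    Frames longer than n_fft are truncated by np.fft.rfft; shorter ones are zero-padded.
    """
    windowed = frame * np.hanning(len(frame))
    return np.log(np.abs(np.fft.rfft(windowed, n_fft)) + eps)

def lvt_features(signal, cycles, n_fft=512, min_len=4):
    """One log spectral ratio (closed phase minus open phase) per glottal cycle.

    `cycles` is an assumed input: tuples (closure, opening, next_closure)
    of sample indices marking the closed and open phase of each cycle.
    """
    feats = []
    for gci, goi, next_gci in cycles:
        closed = signal[gci:goi]        # closed phase of the cycle
        open_ = signal[goi:next_gci]    # open phase of the cycle
        if len(closed) < min_len or len(open_) < min_len:
            continue                    # skip phases too short to analyse
        # A ratio of magnitude spectra is a difference of log spectra.
        feats.append(log_spectrum(closed, n_fft) - log_spectrum(open_, n_fft))
    return np.array(feats)

# Usage on synthetic noise with made-up cycle boundaries (illustration only).
rng = np.random.default_rng(0)
demo_signal = rng.standard_normal(400)
demo_cycles = [(0, 60, 160), (160, 220, 320)]
demo_feats = lvt_features(demo_signal, demo_cycles, n_fft=256)
print(demo_feats.shape)
```

Each row of the result is one pitch-synchronous feature vector. In a full system such vectors would typically be reduced in dimension before being passed to an i-vector extractor, which is outside the scope of this sketch.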
Details
Original language | English |
---|---|
Title of host publication | Elektronische Sprachsignalverarbeitung 2024 |
Editors | Timo Baumann |
Publisher | Dresden : TUDpress |
Pages | 108-115 |
Number of pages | 8 |
ISBN (print) | 978-3-95908-325-6 |
Publication status | Published - 1 Mar 2024 |
Peer-reviewed | Yes |
Publication series
Series | Studientexte zur Sprachkommunikation |
---|---|
Volume | 107 |
ISSN | 0940-6832 |
External IDs
ORCID | /0000-0003-0167-8123/work/168716968 |
---|
Keywords
- Paralinguistic analyses