Perceptual identification of oral and nasalized vowels across American English and British English listeners and TTS voices

Gwizdzinski, Jakub and Barreda, Santiago and Carignan, Christopher and Zellou, Georgia (2023) Perceptual identification of oral and nasalized vowels across American English and British English listeners and TTS voices. Frontiers in Communication, 8. ISSN 2297-900X



Official URL: https://doi.org/10.3389/fcomm.2023.1307547

Abstract

Nasal coarticulation occurs when the lowering of the velum for a nasal consonant co-occurs with the production of an adjacent vowel, causing the vowel to become (at least partially) nasalized. In the case of anticipatory nasal coarticulation, enhanced coarticulatory magnitude on the vowel facilitates the identification of an upcoming nasal coda consonant. However, nasalization also affects the acoustic properties of the vowel, including formant frequencies. Thus, while anticipatory nasalization may help facilitate perception of a nasal coda consonant, it may at the same time cause difficulty in the correct identification of preceding vowels. Prior work suggests that the temporal degree of nasal coarticulation is greater in American English (US) than British English (UK), yet the perceptual consequences of these differences have not been explored. The current study investigates perceptual confusions for oral and nasalized vowels in US and UK TTS voices by US and UK listeners. We use TTS voices, in particular, to explore these perceptual consequences during human-computer interaction, which is increasing due to the rise of speech-enabled devices. Listeners heard words with oral and nasal codas produced by US and UK voices, masked with noise, and made lexical identifications from a set of options varying in vowel and coda contrasts. We find the strongest effect of speaker dialect on accurate word selection: overall accuracy is highest for UK Oral Coda words (83%) and lower for US Oral Coda words (67%); the lowest accuracy is for words with Nasal Codas in both dialects (UK Nasal = 61%; US Nasal = 60%). Error patterns differed across dialects: both listener groups made more errors in identifying nasal codas in words produced in UK English than those produced in US English. Yet, the rate of errors in identifying the quality of nasalized vowels was similarly lower than that of oral vowels across both varieties. We discuss the implications of these results for cross-dialectal coarticulatory variation, human-computer interaction, and perceptually driven sound change.

Item Type:	Article
Subjects:	Research Scholar Guardian > Social Sciences and Humanities
Depositing User:	Unnamed user with email support@scholarguardian.com
Date Deposited:	13 Dec 2023 08:27
Last Modified:	13 Dec 2023 08:27
URI:	http://science.sdpublishers.org/id/eprint/2432
