Voice clones sound realistic but not (yet) hyperrealistic

Nadine Lavan¹, Mairi Irvine¹, Victor Rosi², Carolyn McGettigan²

Affiliations

¹ Centre for Brain and Behaviour, School of Biological and Behavioural Sciences, Queen Mary University of London, London, United Kingdom.
² UCL Speech, Hearing and Phonetic Sciences, Division of Psychology and Language Sciences, London, United Kingdom.

PMID: 40991627
PMCID: PMC12459763
DOI: 10.1371/journal.pone.0332692

Nadine Lavan et al. PLoS One. 2025 Sep 24;20(9):e0332692. doi: 10.1371/journal.pone.0332692. eCollection 2025.

Abstract

AI-generated voices are increasingly prevalent in our lives, via virtual assistants, automated customer service, and voice-overs. With increased availability and affordability of AI-generated voices, we need to examine how humans perceive them. Recently, an intriguing effect was reported in AI-generated faces, where such face images were perceived as more human than images of real humans - a "hyperrealism effect." Here, we tested whether a "hyperrealism effect" also exists for AI-generated voices. We investigated the extent to which AI-generated voices sound real to human listeners, and whether listeners can accurately distinguish between human and AI-generated voices. We also examined perceived social trait characteristics (trustworthiness and dominance) of human and AI-generated voices. We tested these questions using AI-generated voices generated with and without a specific human counterpart (i.e., voice clones, and voices generated from the latent space of a large voice model). We find that voice clones can sound as real as human voices, making it difficult for listeners to distinguish between them. However, we did not observe a hyperrealism effect. Both types of AI-generated voices were evaluated as more dominant than human voices, with some AI-generated voices also being perceived as more trustworthy. These findings raise questions for future research: Can hyperrealistic voices be created with more advanced technology, or is the lack of a hyperrealism effect due to differences between voice and face (image) perception? Our findings also highlight the potential for AI-generated voices to misinform and defraud, alongside opportunities to use realistic AI-generated voices for beneficial purposes.

Copyright: © 2025 Lavan et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Violin plots showing the results for the forced choice “Human or AI” classification task and Realness rating task for Experiment 1a (panels a, c) and Experiment 1b (panels b, d).**
Violin plots show the distribution of the data with boxplots. * indicates p < .05 for the effect of Voice Type in a (G)LMM.

**Fig 2. Violin plots showing the results for sensitivity analyses conducted on the “Human or AI” classification task.**
Panel a) shows D Prime, Panel b) shows Criterion C for Experiment 1a and 1b. Violin plots show the distribution of the data with boxplots. * indicates p < .05 in a one-sample t-test against 0.

**Fig 3. Violin plots showing the results for the trait ratings tasks for Experiment 1a (panels a, c) and Experiment 1b (panels b, d).**
Violin plots show the distribution of the data with boxplots. * indicates p < .05 for the effect of Voice Type in a (G)LMM.

**Fig 4. Violin plots showing the results for the forced choice Human or AI classification task (panel a) and Realness rating task (panel b) for Experiment 2.**
Violin plots show the distribution of the data with boxplots. * indicates p < .05 for the effect of Voice Type (relative to “Human Voice” as a reference condition) in a (G)LMM.

**Fig 5. Violin plots showing the results for sensitivity analyses conducted on the forced choice Human or AI judgements task.**
Panel a) shows D Prime, Panel b) shows Criterion C for Experiment 2. Violin plots show the distribution of the data with boxplots. * indicates p < .05 in a one-sample t-test against 0.

**Fig 6. Mean realness ratings for the Human Voices (yellow dots) and their corresponding Voice Clones (turquoise dots) from Experiments 1b and 2.**
Grey bars illustrate the difference in ratings between a Human Voice and its corresponding Voice Clone.
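
The sensitivity analyses in Figs 2 and 5 report d′ and criterion C for the "Human or AI" task. As a rough illustration of how these two signal detection measures are commonly computed from one listener's response counts, the sketch below uses the standard textbook formulas. It is not the authors' analysis code: the function name `sdt_measures`, the choice to treat "AI" responses to AI-generated voices as hits, and the log-linear correction are all assumptions made for this example.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion_c) for one listener.

    Assumed coding (not taken from the paper): an "AI" response to an
    AI-generated voice is a hit; an "AI" response to a human voice is a
    false alarm.
    """
    # Log-linear correction (an assumption): add 0.5 to each count so that
    # rates of exactly 0 or 1 do not produce infinite z-scores.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF

    d_prime = z(hit_rate) - z(fa_rate)              # sensitivity
    criterion_c = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion_c


if __name__ == "__main__":
    # Hypothetical counts for one listener, for illustration only.
    d, c = sdt_measures(hits=30, misses=10, false_alarms=10, correct_rejections=30)
    print(f"d' = {d:.2f}, C = {c:.2f}")
```

Under this coding, a d′ near 0 means the listener cannot tell human from AI-generated voices, and a criterion C near 0 means no overall bias towards either response. This is why the figure captions describe one-sample t-tests against 0.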
