New research reveals that while people struggle to distinguish between human and AI-generated voices, their brain responses differ significantly.
People tend to assume happy voices are human and ‘neutral’ voices are AI-generated.
Although listeners struggle to tell AI voices from human ones, their brain activity differs between the two, a finding with significant implications for technology and ethics.
AI Voice Recognition
People are not very good at distinguishing between human voices and voices generated by artificial intelligence (AI), but our brains do respond differently to human and AI voices. This is according to research to be presented on June 25 at the Federation of European Neuroscience Societies (FENS) Forum 2024.[1]
The study will be presented by doctoral researcher Christine Skjegstad and was carried out by Ms Skjegstad and Professor Sascha Frühholz, both from the Department of Psychology at the University of Oslo (UiO), Norway.
Spectrograms to demonstrate the similarity between human and AI voices. Credit: FENS Forum / Christine Skjegstad
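The spectrograms in the figure are time–frequency maps of acoustic energy, the standard way to visualize how similar two voices sound. As a rough illustration of how such a map is computed, here is a generic short-time Fourier sketch in NumPy; the sample rate, window length, and test tone are illustrative assumptions, not details of the study’s analysis pipeline:

```python
import numpy as np

# Synthesize a 1-second test tone as a stand-in for a voice recording
# (a real analysis would load audio from file instead).
fs = 16000                               # sample rate in Hz (assumed)
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 220 * t)     # 220 Hz tone

# Short-time Fourier transform: slice the signal into overlapping
# windows, taper each with a Hann window, and FFT it.
win, hop = 512, 256
frames = np.stack([signal[i:i + win] * np.hanning(win)
                   for i in range(0, len(signal) - win + 1, hop)])
power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # frames x freq bins

print(power.shape)  # (61, 257): 61 time frames x 257 frequency bins
```

Plotting `power` on a log scale (time on one axis, frequency on the other) yields an image like those in the figure; energy concentrates in the bin nearest 220 Hz for this test tone.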
Advancements and Challenges in AI Voice Technology
Ms Skjegstad said: “We already know that AI-generated voices have become so advanced that they are nearly indistinguishable from real human voices. It’s now possible to clone a person’s voice from just a few seconds of recording, and scammers have used this technology to mimic a loved one in distress and trick victims into transferring money. While machine learning experts have been developing technological solutions to detect AI voices, much less is known about the human brain’s response to these voices.”
The research involved 43 people who were asked to listen to human and AI-generated voices expressing five different emotions: neutral, anger, fear, happiness, and pleasure.[2] They were asked to identify the voices as synthetic or natural while their brains were scanned using functional magnetic resonance imaging (fMRI), which detects changes in blood flow within the brain, indicating which areas are active. The participants were also asked to rate the voices they heard for naturalness, trustworthiness, and authenticity.
Researcher Christine Skjegstad. Credit: FENS Forum / Christine Skjegstad
Participant Performance in Voice Identification
Participants correctly identified human voices only 56% of the time and AI voices 50.5% of the time, meaning they were about equally poor, and close to chance, at identifying both types of voice.
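The 50.5% figure for AI voices is essentially a coin flip. As an illustration of how one might check that an accuracy is indistinguishable from chance, here is an exact two-sided binomial test in plain Python; the per-listener trial count (20 clips each) is an assumption made for the example, not a figure from the study:

```python
# Sketch: is a 50.5% hit rate distinguishable from 50% guessing?
# The trial counts below are illustrative assumptions, not study data.
from math import comb

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial test: sum all outcomes whose
    probability is no greater than that of the observed count."""
    pmf = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    return min(1.0, sum(q for q in pmf if q <= pmf[k] * (1 + 1e-9)))

# Suppose each of the 43 listeners judged 20 AI clips: 860 trials in
# total, of which 50.5% (about 434) were correctly labelled as AI.
p_value = binom_two_sided_p(434, 860)
print(round(p_value, 2))  # well above 0.05: consistent with chance
```

Under these assumed counts the p-value is large, so 50.5% correct cannot be distinguished from guessing; the 56% figure for human voices sits only slightly higher.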
People were more likely to correctly identify a ‘neutral’ AI voice as AI (75%, compared with 23% who correctly identified a neutral human voice as human), suggesting that people assume neutral voices sound more AI-like. Neutral female AI voices were identified correctly more often than neutral male AI voices. For happy human voices, the correct identification rate was 78%, compared with only 32% for happy AI voices, suggesting that people perceive happiness as more human-like.
Both AI and human neutral voices were perceived as least natural, trustworthy, and authentic while human happy voices were perceived as most natural, trustworthy, and authentic.
Brain Response Differences to Human and AI Voices
However, looking at the brain imaging, researchers found that human voices elicited stronger responses in areas of the brain associated with memory (right hippocampus) and empathy (right inferior frontal gyrus). AI voices elicited stronger responses in areas related to error detection (right anterior mid cingulate cortex) and attention regulation (right dorsolateral prefrontal cortex).
Ms Skjegstad said: “My research indicates that we are not very accurate in identifying whether a voice is human or AI-generated. The participants also often expressed how difficult it was for them to tell the difference between the voices. This suggests that current AI voice technology can mimic human voices to a point where it is difficult for people to reliably tell them apart.
“The results also indicate a bias in perception where neutral voices were more likely to be identified as AI-generated and happy voices were more likely to be identified as more human, regardless of whether they actually were. This was especially the case for neutral female AI voices, which may be because we are familiar with female voice assistants such as Siri and Alexa.
“While we are not very good at identifying human from AI voices, there does seem to be a difference in the brain’s response. AI voices may elicit heightened alertness while human voices may elicit a sense of relatedness.”
The researchers now plan to study whether personality traits, for example extraversion or empathy, make people more or less sensitive to noticing the differences between human and AI voices.
Expert Opinion and Broader Implications
Professor Richard Roche is chair of the FENS Forum communication committee and Deputy Head of the Department of Psychology at Maynooth University, Maynooth, County Kildare, Ireland, and was not involved in the research. He said: “Investigating the brain’s responses to AI voices is crucial as this technology continues to advance. This research will help us to understand the potential cognitive and social implications of AI voice technology, which may support policies and ethical guidelines.
“The risks of this technology being used to scam and fool people are obvious. However, there are potential benefits as well, such as providing voice replacements for people who have lost their natural voice. AI voices could also be used in therapy for some mental health conditions.”
Since submitting their abstract, the researchers have included additional data in their analyses.
Funding: The Department of Psychology at the University of Oslo (UiO)