OpenAI has been making rapid progress with its ChatGPT generative AI chatbot and Sora AI video creator over the past year, and now it has introduced a new artificial intelligence tool: Voice Engine. This tool can create synthetic voices from just a 15-second audio clip.
In a recent blog post (via The Verge), OpenAI announced that it has been conducting a “small-scale preview” of Voice Engine, which has been in development since late 2022. Voice Engine already powers the Read Aloud feature of the ChatGPT app, which reads responses out to users.
Once a voice has been created from a 15-second sample, users can have it read out any text in an “emotive and realistic” manner. OpenAI suggests that this tool could be used for educational purposes, translating podcasts into different languages, reaching out to remote communities, and supporting non-verbal individuals.
While Voice Engine is not yet widely available, OpenAI has shared samples created by the tool for users to listen to. The samples sound impressive, although there is a slightly robotic and unnatural quality to them.
Safety first
Concerns about potential misuse are the primary reason why Voice Engine is only available in a limited preview. OpenAI intends to conduct further research on how to prevent misinformation and unauthorized voice copying using tools like this.
“We aim to initiate a discussion on the responsible use of synthetic voices and how society can adapt to this new technology,” states OpenAI. “After analyzing the outcomes of these small-scale tests and engaging in these discussions, we will make an informed decision on whether and how to deploy this technology on a larger scale.”
Given the upcoming major elections in the US and UK and the continuous advancement of generative AI tools, concerns about trustworthiness in all forms of AI content – audio, text, and video – are becoming more prevalent. It is increasingly challenging to discern what content can be trusted.
As OpenAI highlights, this technology could undermine voice authentication systems and enable impersonation scams over the phone or through voicemail. These are complex issues that require careful consideration and solutions.