ChatGPT finds its voice

By Chris Pash | 27 September 2023
 
Credit: Andy Kelly via Unsplash

ChatGPT, the artificial intelligence (AI) disrupting creative industries, can now see, hear and speak.

Have a conversation, request a bedtime story, settle a debate or tell it to work out what’s for dinner.

No typing required.

The AI will reply in one of five different voices, sounding like real people and very unlike Apple’s Siri or Amazon’s Alexa.

OpenAI, the company behind the natural language processing tool, has started rolling out voice and image capabilities in ChatGPT.

“They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about,” the company says.

“Voice and image give you more ways to use ChatGPT in your life.

“Snap a picture of a landmark while travelling and have a live conversation about what’s interesting about it.

“When you’re home, snap pictures of your fridge and pantry to figure out what’s for dinner (and ask follow up questions for a step by step recipe).

“After dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you.”

The voice capability is powered by a text-to-speech model that can generate human-like audio from just text and a few seconds of sample speech.

Each of the voices was created with professional voice actors. Whisper, an open-source speech recognition system, transcribes spoken words into text.
