New York Tech Media
Listen to an AI voice actor try and flirt with you

By New York Tech Editorial Team
February 17, 2022

The quality of AI-generated voices has improved rapidly in recent years, but there are still aspects of human speech that escape synthetic imitation. Sure, AI actors can deliver smooth corporate voiceovers for presentations and adverts, but more complex performances — a convincing rendition of Hamlet, for example — remain out of reach.

Sonantic, an AI voice startup, says it’s made a minor breakthrough in its development of audio deepfakes, creating a synthetic voice that can express subtleties like teasing and flirtation. The company says the key to its advance is the incorporation of non-speech sounds into its audio: it trains its AI models to recreate those small intakes of breath (tiny scoffs and half-hidden chuckles) that give real speech its stamp of biological authenticity.

“We chose love as a general theme,” Sonantic co-founder and CTO John Flynn tells The Verge. “But our research goal was to see if we could model subtle emotions. Bigger emotions are a little easier to capture.”

In the video below, you can hear the company’s attempt at a flirtatious AI — though whether or not you think it captures the nuances of human speech is a subjective question. On a first listen, I thought the voice was near-indistinguishable from that of a real person, but colleagues at The Verge say they instantly clocked it as a robot, pointing to the uncanny spaces left between certain words, and a slight synthetic crinkle in the pronunciation.

Sonantic CEO Zeena Qureshi describes the company’s software as “Photoshop for voice.” Its interface lets users type out the speech they want to synthesize, specify the mood of the delivery, and then select from a cast of AI voices, most of which are copied from real human actors. This is by no means a unique offering (rivals like Descript sell similar packages), but Sonantic says its customization goes deeper than its competitors’.

Emotional choices for delivery include anger, fear, sadness, happiness, and joy, and, with this week’s update, flirtatious, coy, teasing, and boasting. A “director mode” allows for even more tweaking: the pitch of a voice can be adjusted, the intensity of delivery dialed up or down, and those little non-speech vocalizations like laughs and breaths inserted.

Sonantic’s software lets you adjust the delivery of AI-generated speech.
Image: Sonantic

“I think that’s the main difference — our ability to direct and control and edit and sculpt a performance,” says Flynn. “Our clients are mostly triple-A game studios, entertainment studios, and we’re branching out into other industries. We recently did a partnership with Mercedes [to customize its in-car digital assistant] earlier this year.”

As is often the case with such technology, though, the real benchmark for Sonantic’s achievement is the audio that comes fresh out of its machine learning models, rather than what’s used in polished, PR-ready demos. Flynn says the speech synthesized for its flirty video required “very little manual adjustment,” but the company did cycle through a few different renderings to find the very best output.

To try to get a raw and representative sample of Sonantic’s technology, I asked the company to render the same line (directed to you, dear Verge reader) using a handful of different moods. You can listen to them yourself to compare.

First, here’s “flirty”:

Then “teasing”:

“Pleased”:

“Cheerful”:

And finally, “casual”:

To my ears, at least, these clips are a lot rougher than the demo. This suggests a few things. First, manual polishing is needed to get the most out of AI voices. This is true of many AI endeavors, like self-driving cars, which have successfully automated very basic driving but still struggle with that last and all-important 5 percent that defines human competence. It means that fully automated, totally convincing AI voice synthesis is still a way off.

Second, I think it shows that the psychological concept of priming can do a lot to trick your senses. The video demo — with its footage of a real human actor being unsettlingly intimate towards the camera — may cue your brain to hear the accompanying voice as real. The best synthetic media, then, might be that which combines real and fake outputs.

Apart from the question of how convincing the technology is, Sonantic’s demo raises other issues — like, what are the ethics of deploying a flirtatious AI? Is it fair to manipulate listeners in this way? And why did Sonantic choose to make its flirting figure female? (It’s a choice that arguably perpetuates a subtle form of sexism in the male-dominated tech industry, where companies tend to code AI assistants as pliant — even flirty — secretaries.)

On the question of gender, the company said its choice of a female voice was simply inspired by Spike Jonze’s 2013 film Her, where the protagonist falls in love with a female AI assistant named Samantha. On the ethical questions, Sonantic said it recognizes the quandaries that accompany the development of new technology, and that it’s careful in how and where it uses its AI voices.

“That’s one of the biggest reasons we’ve stuck to entertainment,” says CEO Qureshi. “CGI isn’t used for just anything — it’s used for the best entertainment products and simulations. We see this [technology] the same way.” She adds that all of the company’s demos include a disclosure that the voice is, indeed, synthetic (though this doesn’t mean much if clients want to use the company’s software to generate voices for more deceitful purposes).

Comparing AI voice synthesis to other entertainment products makes sense. After all, being manipulated by film and TV is arguably the reason we make those things in the first place. But there is also something to be said about the fact that AI will allow such manipulation to be deployed at scale, with less attention to its impact in individual cases. Around the world, for example, people are already forming relationships — even falling in love — with AI chatbots. Adding AI-generated voices to these bots will surely make them more potent, raising questions about how these and other systems should be engineered. If AI voices can convincingly flirt, what might they persuade you to do?

© 2024 All Rights Reserved - New York Tech Media
