Internet Gold | Can You Trust Your Ears? Two Minute Papers (Pt. 2 of 2)

By MoreGainStrategies | Internet Gold | 5 Dec 2020


The wait is over!

In part 1 I kind of left you on a cliffhanger. In this part, we will conclude this 2-part series about voice audio manipulation. 

In part 1 I discussed several methods of faking voice recordings. However, part 1 was just the appetizer for our main course -- which is this article. In part 1 I intentionally left out one particular voice manipulation method. It is the most powerful of them all -- and to be honest, it is the scariest one, too.

AI -- Artificial Intelligence

Do you remember the scene from the movie Terminator 2 that I gave you as a hint at the end of part 1? Here is a slightly longer version of that clip (less than 3 minutes long):

So what did we see here?

Arnold Schwarzenegger played a reprogrammed terminator in that movie. A terminator is a humanoid robot that is usually used for terminating people in a dystopian future. The terminator portrayed by Arnold was reprogrammed and traveled back in time to save John, a young boy, from a more advanced T-1000 terminator robot which was sent back in time to kill that same boy.

What happens in that scene is that the more advanced T-1000 has shape-shifted into the boy's foster mother. The foster father does not realize that, and after he says something that annoys the shape-shifted terminator, it terminates him... The "Arnold terminator" is with the boy in the phone booth. The shape-shifting terminator is not aware that Arnold traveled back in time as well. Arnold speaks in the boy's voice and uses a fake name for the boy's dog. If Arnold were talking to John's real foster mother, she would notice that discrepancy. However, the T-1000 apparently does not know the dog's name and is unaware of the trap Arnold laid out for it.

So much for the backstory.

Now we get to the part that is the main topic of this article:

The two terminator robots have a conversation over the phone. The T-1000 pretends to be the boy's foster mother, Arnold pretends to be the boy. How do they try to fool each other? By imitating the corresponding human voice. In fact, they do it so perfectly that nobody can tell whether they are speaking to the real human or to a robot.

The shape-shifting T-1000 is not only capable of taking on the appearance of pretty much any solid object of a certain size, it can also generate any human voice. It not only matches the tone of the voice but also imitates the little peculiarities of how a person speaks. Arnold's terminator is an earlier model and is not capable of shape-shifting, but the voice generation part was already mastered at the time of his creation (or maybe that feature was added via an over-the-air update -- who knows 🤷‍♂️).

The point is: In 1991, when the movie Terminator 2 came out, realistic computer voice generation was just as much science-fiction as shape-shifting.

-- Well, not anymore...

Introducing "Two Minute Papers"

What do the terminators need in order to imitate a person's voice perfectly? A few seconds of hearing that person speak is enough. In that science-fiction movie from 1991.

Let's compare to real science from 2019.

The result: About 5 seconds of voice recording is enough to imitate a person's voice and way of speaking almost perfectly.

How crazy is that? 1991's science fiction became pretty much a reality 28 years later!

And keep in mind that this result is the worst that computer-generated voice audio will ever be. (It can only get better in the future.) And this paper is almost one year old, so there are probably even more capable AIs available that run circles around this one...

The take-away message is:

No, we cannot trust our ears.

Is there a way to detect manipulated or manufactured voice recordings?

There might be, but it has already become very hard for a human to decide what is real and what is not. Our best bet is probably to use an AI to distinguish between artificial voice output and a real person speaking.

But there is a catch: With every improvement in detection, an improvement in creation usually follows closely. The reason is that you can use a detection AI to "train" (improve) a generating AI: whatever the detector has learned to spot is exactly the feedback the generator needs to stop making that mistake. It's a vicious circle. In the future, it will just be even harder for humans to decide what is real and what is not... Maybe it will be impossible to tell real and fake voices apart at some point -- just like in the movie.
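For the tech-savvy readers, here is a tiny toy sketch of that arms race in Python. It is not the method from the paper or from any real detection tool; everything in it is made up for illustration. I assume a voice can be boiled down to a single number (a hypothetical "voice feature"), that real voices sit around the value 4.0, and that the fakes start far away at 0.0. The "detector" simply draws a line halfway between the average real and the average fake sample, and the "generator" then moves its fakes right up to that line. Real systems use neural networks instead, but the back-and-forth is the same idea.

```python
from statistics import mean

# Hypothetical 1-D "voice feature"; real systems use far richer features.
OFFSETS = [-1.5, -0.5, 0.5, 1.5]   # fixed spread instead of random noise
REAL_CENTER = 4.0                  # where real voices sit (made-up value)


def samples(center):
    """Return deterministic sample points spread around a center."""
    return [center + o for o in OFFSETS]


def train_detector(real, fake):
    """'Train' the detector: put the decision boundary halfway between
    the average real sample and the average fake sample."""
    return (mean(real) + mean(fake)) / 2


def accuracy(threshold, real, fake):
    """Fraction of samples the detector labels correctly.
    Values above the threshold count as 'real' (this toy assumes the
    fakes start below the real voices)."""
    correct = sum(x > threshold for x in real) + sum(x < threshold for x in fake)
    return correct / (len(real) + len(fake))


def arms_race(rounds=6, fake_center=0.0):
    """Alternate detector and generator updates; return accuracy per round."""
    history = []
    for _ in range(rounds):
        real, fake = samples(REAL_CENTER), samples(fake_center)
        threshold = train_detector(real, fake)          # detection improves
        history.append(accuracy(threshold, real, fake))
        # Creation improves: the generator moves its fakes to the boundary
        # that the detector has just revealed.
        fake_center = threshold
    return history


if __name__ == "__main__":
    print(arms_race())
```

In this toy, the detector starts out with an accuracy of 1.0 (it catches every fake), drops to 0.75 after one round, and is stuck at 0.5 from the third round on, which is no better than flipping a coin.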

But you might think:

If I see the person speaking in a video, I will be able to tell whether the audio is fake or real, right? If it is fake, the audio won't match the movement of the lips!

Good idea, but have you heard of deepfakes? That is the topic for the next installment of this Internet Gold series... And what we discuss there might shock you. You might want to sit down for that one. But don't worry, not today. I will probably finish writing the article tomorrow. So stay tuned!

What do you think?

Will we ever find a reliable way to check the authenticity of videos or audio recordings? Could blockchain technology help with that (I am thinking of NFTs -- non-fungible tokens)?

Let me know in the comment section!

Bonus Content -- Make your computer sound like you!

If you are tech-savvy, navigate to this GitHub repository where you can download an unofficial implementation of the AI that was demonstrated in the above video. You can install the software by following the steps listed there (it is not super hard, but also not beginner-friendly). Once it is installed, you can feed the AI a short recording of your own voice. If everything works out, your computer will sound a lot like you. How cool is that?!

 



I also published this article on my blog at read.cash
