Voice Manipulation by AI Will Cause Terrible Disruption

September 11, 2019

I have always been a fan of technological innovation. And still am.

Recently I have developed a special interest in artificial creativity, and more specifically in the use of algorithms to produce creative works such as music and paintings.

Furthermore, I have been discussing how it can affect the music industry and other creative industries. Undoubtedly there are many positive applications of AI (artificial intelligence) in such industries, both for human talent augmentation and also for consumers (in terms of the amount and customization of offers).

However, technological innovation often also raises concerns about its misuse. My main current concern lies in the advances in artificial voice manipulation and deepfakes.

And how does it all work?

In a nutshell, artificial voice manipulation involves recording a voice (or using existing voice files), then editing and transforming it through algorithms. As a consequence of this process, it is possible to generate output in which any human voice speaks any given text, with changes in sentiment (e.g. sad, happy, irritated). One example of a company developing such technologies is Lyrebird.

One practical example was released by a Canadian start-up named Dessa. They published a video featuring a manipulated voice of Joe Rogan, a UFC commentator, comedian and podcast host.

Although it is clearly not a finished product, the accuracy of the technology is incredible and extremely scary. And please, always keep in mind the exponential nature of such developments. In other words, what now seems like an unfinished product will very soon be simply perfect and indistinguishable from reality.

(PS: See THE FAKENING for many more shocking examples of deepfakes.)

And what will happen then?

I have to admit: this worries me greatly. It is therefore extremely important to discuss this issue and act on it. It does not require much imagination to picture chaotic scenarios arising as a consequence of this practice.

To exemplify, I would like to simply focus on three potential outcomes.

1. Mistrust of Evidence in the Legal Environment

What will happen when lawyers and judges can no longer use voice recordings as evidence in cases? Anyone will be able to challenge audio evidence in court by claiming it may have been manipulated. This takes “fake news” to a completely new level. Recently, The Intercept broke an incredible story of corruption at the highest level of the Brazilian government through information obtained from messaging apps. Most of it came in the form of audio files. Can you imagine if these files had no legal validity simply because they were audio files? This could have terrible implications, and the entire legal environment would have to adapt. I am truly not sure we are ready for such an adaptation, and this breach could open the door to an immense number of new scandals, sustained by a lack of evidence.

2. Distortion of History, Historical Characters and Current State

History as we know it is based on the evidence collected by historians through years of investigative work following clear scientific procedures. All of this may be affected by the appearance and spread of fake audio files designed to distort history and historical characters and to induce audiences to lean in any given direction. Can you foresee the terrible scenarios this could create? It may affect relationships among countries, create cultural conflict and political divides (as if we hadn’t had enough of them), raise religious tensions and much more. Plus, we should not forget the immense inequality in many countries. We currently live in a connected world where millions of users have little educational background and are, unfortunately, unable to reliably distinguish fact from fabrication, which makes them easy to manipulate.

3. Legal Rights and Ownership

Elvis Presley, Michael Jackson and John Lennon, for example, have passed away and, obviously, are no longer able to record new songs. However, they have left behind an incredible number of recordings. This means that their voices can be edited into songs that haven’t even been written yet. This would cause an incredible disruption in authorship, legal rights, authenticity and more.

In sum, when considering the fast advances in voice manipulation, I predict immense confusion and a general questioning of reality.

This will cause an unprecedented disruption to society, given the volume and speed at which information now flows globally.

And yes, I fully understand that I am sounding very pessimistic and that this may simply sound like another author overreacting while trying to imagine and predict the future.

But my concerns are based on the speed of development and our current inability, for example, to deal with the spread of “fake news” online. If funny memes and other forms of written and visual content have recently had a great impact on national elections, what could voice manipulation cause?

The need for clear and strong regulation of technological development and its applications is a pivotal discussion that must not be ignored.

So to conclude, my main fear is that voice manipulation will trigger the “mistrust in truth”.

“If you can’t convince them, confuse them.” This has been one of the most effective tactics we have witnessed over the past decade, carried out through information overload online.

And things are just about to take on a whole new dimension.

Let us all please act on it.
