The past decade has witnessed the appearance of a series of start-ups focused on using artificial intelligence for music composition. These include, for example, AIVA, Jukedeck, Endel and many others (here is an extended list of companies producing music with AI).
Recently though, OpenAI has launched a powerful project: Jukebox. In some ways, it takes the discussion a few steps forward. Essentially, Jukebox is a neural network capable of automatically generating music in different styles and genres. Importantly, it also generates novel human voices (which can, if intended, resemble specific artists) and even automated lyrics (using GPT-2).
The Original Paper
The original paper of the project, titled “Jukebox: A Generative Model for Music”, was submitted in April 2020 by Prafulla Dhariwal, Heewoo Jun, Christine Payne, Jong Wook Kim, Alec Radford and Ilya Sutskever.
According to the authors, Jukebox should contribute to novelty in music. The paper discusses this novelty from several angles: for example, novelty of styles, voices, lyrics and riffs.
The abstract of the paper reads as follows:
“We introduce Jukebox, a model that generates music with singing in the raw audio domain. We tackle the long context of raw audio using a multi-scale VQ-VAE to compress it to discrete codes, and modeling those using autoregressive Transformers. We show that the combined model at scale can generate high-fidelity and diverse songs with coherence up to multiple minutes. We can condition on artist and genre to steer the musical and vocal style, and on unaligned lyrics to make the singing more controllable. We are releasing thousands of non cherry-picked samples, along with model weights and code”.
You can access the original paper here.
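To give a feel for what “compress it to discrete codes” means in the abstract, here is a toy sketch of the vector quantization idea behind a VQ-VAE. It is not Jukebox's code: the function names, the tiny hand-written codebook and the two-dimensional “frames” are all assumptions made for illustration. In the real model, the frames come from a learned encoder and the codebook is learned as well.

```python
# Toy vector quantization: each frame is replaced by the index of its
# nearest codebook vector. All names and values here are hypothetical.

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantize(frames, codebook):
    """Map each frame to the index of the closest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(frame, codebook[k]))
        for frame in frames
    ]

def dequantize(codes, codebook):
    """Turn discrete codes back into (approximate) frames."""
    return [codebook[k] for k in codes]

# A made-up codebook with three entries and four made-up frames.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
frames = [[0.1, -0.2], [0.9, 1.2], [-0.8, 0.7], [0.4, 0.3]]

codes = quantize(frames, codebook)
reconstruction = dequantize(codes, codebook)
print(codes)  # [0, 1, 2, 0]
```

The sequence of integer codes is much shorter and simpler than the raw audio it stands for, which is what makes it feasible, as the abstract describes, to model those codes with autoregressive Transformers.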
In short, Jukebox covers all elements of a typical music composition: harmonies, melodies, lyrics and even voice, all while resembling an original artist. As with most AI advances, Jukebox is equally fascinating and scary.
Are you curious to hear how it sounds? The researchers have kindly released an extraordinary number of samples at OpenAI. According to them, the samples were not “cherry-picked”, in order to give an authentic picture of what the model produces.
You can access them simply by clicking here or on the image below.
Jukebox is another important piece in a fast-growing puzzle that is already changing the music industry and the music composition process. It represents a major step forward by covering all elements of the composition process in a highly efficient way.
Personally, I believe the next step is simply to perfect each element, such as vocal pitch and emotion. As we have witnessed, this is simply a matter of time. Of very little time.
In the next 5 to 8 years, music composition may actually become as simple as pressing a button. Will that mean the end of human composition as we know it? Of course not. The only problem is that we will not know the difference. It will come down to a matter of trust in the narrative of the artist.