Abdul Raheem


Generative AI: Shaping the Future of Music Industry using AWS DeepComposer

What is Generative AI?

Generative AI, also known as generative models, is a subfield of artificial intelligence that focuses on creating new and unique outputs, such as text, images, and music, based on a set of inputs or training data.

Example
Suppose we train a model on data from cats; a generative model then uses the patterns it learned during training to create a new, artificial cat. Generative models are usually unsupervised machine-learning models: they generate new data by following the patterns learned during training. This is in contrast to discriminative models, which focus on classifying or identifying inputs based on previously learned information.
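To make the contrast concrete, here is a minimal sketch in pure Python with made-up whisker-length numbers: the generative side fits a distribution to the "cat" data and samples a brand-new example from it, while the discriminative side only learns a boundary that separates cats from dogs.

```python
import random
import statistics

# Hypothetical training data: whisker lengths (cm) -- invented for illustration
cat_whiskers = [6.1, 5.8, 6.4, 6.0, 5.9, 6.2]
dog_whiskers = [2.0, 2.4, 1.9, 2.2, 2.1, 2.3]

# Generative: learn the distribution of the cat data, then sample a NEW cat
mu = statistics.mean(cat_whiskers)
sigma = statistics.stdev(cat_whiskers)
new_cat = random.gauss(mu, sigma)  # a brand-new, artificial "cat"

# Discriminative: learn a boundary that separates the two classes
boundary = (statistics.mean(cat_whiskers) + statistics.mean(dog_whiskers)) / 2

def classify(length):
    return "cat" if length > boundary else "dog"

print(round(new_cat, 2), classify(6.0), classify(2.0))
```

The generative model can keep producing novel samples forever; the discriminative model can only answer "cat or dog?" about samples it is given.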

Famous platforms that use Generative AI:

  1. ChatGPT
  2. DALL·E 2
  3. Amper Music
  4. AIVA

Types of Generative AI

There are several types of generative AI:

Generative Adversarial Networks (GANs) - These consist of two neural networks, a generator and a discriminator, that work together to generate new data that is similar to a given dataset.

Variational Autoencoders (VAEs) - These consist of an encoder network that maps input data to a latent space, and a decoder network that maps the latent space back to the original data space. They are used to generate new data by sampling from the latent space.

Autoregressive Models - These models predict the next value in a sequence based on the previous values. Examples include Autoregressive Integrated Moving Average (ARIMA) and Recurrent Neural Networks (RNNs).
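The autoregressive idea fits in a few lines. Below is a toy AR(1) model, x_t = a * x_{t-1}, fitted by least squares to an invented sequence; a real ARIMA or RNN is far more capable, but the "predict the next value from the previous ones" principle is the same.

```python
# Toy autoregressive model: predict the next value from the previous one
series = [1.0, 2.0, 4.0, 8.0, 16.0]  # hypothetical training sequence

# Least-squares fit of the coefficient a in x_t = a * x_{t-1}
pairs = list(zip(series[:-1], series[1:]))
a = sum(prev * nxt for prev, nxt in pairs) / sum(prev * prev for prev, _ in pairs)

next_value = a * series[-1]
print(a, next_value)  # a = 2.0, so the forecast is 32.0
```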

Transformer-based Models - These models are commonly used for natural language processing tasks and are known for their ability to handle sequential data with long-term dependencies. Examples include BERT and GPT-2.

Deep Convolutional Generative Adversarial Networks (DCGANs) - These models use convolutional layers in both generator and discriminator networks, and are used for generating images.


Generative AI with AWS DeepComposer

AWS DeepComposer is an Amazon Web Services (AWS) tool that allows users to generate and compose music using generative artificial intelligence (AI) models.

It consists of the following parts:

1. USB Keyboard - Connects to your computer to input a melody.
2. Accompanying Console - Includes the AWS DeepComposer Music studio to generate music.
3. Chartbusters - A challenge where you can show off your machine-learning skills.

The keyboard is not required, however: you can import your own MIDI file, use one of the provided sample melodies, or use the virtual keyboard in the AWS DeepComposer Music studio.
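If you want to try the import route, MIDI files are simple enough to create by hand. This is a minimal sketch (pure Python stdlib, not any DeepComposer API): it writes a single-track, one-note MIDI file that a MIDI-aware tool can load.

```python
import struct

# Minimal sketch: write a one-note, single-track (format 0) MIDI file.
def write_midi(path, note=60, ticks=96):
    # Track events (delta times < 128 fit in a single variable-length byte)
    events = bytes([
        0x00, 0x90, note, 64,    # delta 0, note on (channel 0), velocity 64
        ticks, 0x80, note, 64,   # delta `ticks`, note off
        0x00, 0xFF, 0x2F, 0x00,  # delta 0, end-of-track meta event
    ])
    with open(path, "wb") as f:
        # Header chunk: format 0, one track, 96 ticks per quarter note
        f.write(struct.pack(">4sIHHH", b"MThd", 6, 0, 1, 96))
        # Track chunk: length prefix, then the events
        f.write(struct.pack(">4sI", b"MTrk", len(events)) + events)

write_midi("melody.mid")  # middle C (MIDI note 60)
```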

Working
The AWS DeepComposer Music Studio offers the ability to generate music using three different generative AI techniques: GANs, AR-CNNs, and transformers. The GAN technique can be used to generate accompaniment tracks, the AR-CNN technique can be used to make changes to notes in an input track, and the transformer technique can be used to extend an input track by up to 30 seconds.

GANs on AWS DeepComposer

Generative Adversarial Networks (GANs) are a unique type of machine learning model that utilizes two neural networks to generate new content.

1. The generator is a neural network that learns to create new data resembling the source data on which it was trained.
2. The discriminator is a second neural network trained to assess how closely the generator's output resembles the training dataset.

The generator and discriminator work in a back-and-forth process where the generator improves in creating realistic data and the discriminator becomes more adept at distinguishing between real and generated data.
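This back-and-forth can be sketched in a few lines. The following is a toy illustration only, not a real GAN (no neural networks, no gradients through a discriminator loss): the "discriminator" refines its notion of what real 1-D data looks like, and the "generator" nudges its output toward whatever fools it, so the two updates alternate exactly as described above.

```python
import random

random.seed(0)  # for reproducibility

real_mean = 5.0  # the "real data" lives around 5.0
def real_sample():
    return random.gauss(real_mean, 0.1)

d_estimate = 0.0  # discriminator's belief about what real data looks like
g_param = -3.0    # generator's parameter (it outputs g_param + noise)
lr = 0.1

for step in range(500):
    # Discriminator turn: refine its notion of "real" from a real sample
    d_estimate += lr * (real_sample() - d_estimate)
    # Generator turn: a fake is "caught" in proportion to its distance from
    # the discriminator's estimate, so move toward fooling it
    fake = g_param + random.gauss(0, 0.1)
    g_param += lr * (d_estimate - fake)

print(round(g_param, 2))  # converges toward ~5.0
```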

Example To Understand:
A GAN can be compared to a cooking competition, with a chef who creates new recipes and a judge who evaluates the quality of the dishes. The chef, like the generator network in a GAN, produces new dishes; the judge, like the discriminator network, evaluates them and provides feedback on how to improve. As the chef incorporates the feedback, the dishes become more and more refined, just as the generator network in a GAN improves over time. AWS DeepComposer uses GANs in a similar way to create unique and distinctive music compositions.



Let's Generate a New Melody

Step 1:
Create an account on the AWS DeepComposer website.

Step 2:
Click on AWS DeepComposer Music Studio, or search for AWS DeepComposer in the console's search bar.


Step 3:
Click on Start composition.


Step 4:
You will be taken to a page with multiple options:

  • Update the Name of Your Music File
  • Import Music or Select Music
  • Play Or Stop Music
  • Create your own melody

After selecting a music track, hit Continue.


Step 5:

This is the machine-learning phase, where you select a model according to your requirements. For example, you can choose any of the following:

  • AR-CNN can make your music sound more like Bach by analyzing the patterns and characteristics of the training data, such as harmony, melody, rhythm, and structure, and using this knowledge to generate new music similar to the original.

  • GANs can be used to enhance music tracks through a collaborative cycle between the generator and discriminator. The generator takes in a single-track piano roll as input and outputs a multi-track piano roll with added accompaniments. The discriminator then evaluates the output and provides feedback to the generator to improve the realism of the generated music. This process iterates until the music becomes more realistic and enjoyable.

  • Transformers can extend your music by up to 30 seconds. They use an autoregressive architecture: the model predicts what the next note, rhythm, or chord should be based on the pattern of the music that has already been played.
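The autoregressive "predict the next note from what came before" idea behind the transformer option can be sketched with something far simpler: a bigram note model trained on a made-up input track, which then extends the melody one note at a time.

```python
# Toy sketch of autoregressive continuation (a bigram note model, far
# simpler than a real transformer): predict each next note from the last one.
melody = ["C", "E", "G", "C", "E", "G", "C", "E"]  # hypothetical input track

# Count which note follows which in the input track
follows = {}
for prev, nxt in zip(melody[:-1], melody[1:]):
    follows.setdefault(prev, {}).setdefault(nxt, 0)
    follows[prev][nxt] += 1

def next_note(note):
    # Pick the most common follower seen during "training"
    options = follows[note]
    return max(options, key=options.get)

# Extend the track by four more notes
extended = list(melody)
for _ in range(4):
    extended.append(next_note(extended[-1]))

print(extended[-4:])  # ['G', 'C', 'E', 'G']
```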


Step 6
After updating the AR-CNN parameters, click Continue. You can understand these parameters as follows:

Maximum Input Notes To Remove: Controls the percentage of the input melody that can be removed during inference. Increasing this value allows the model to use less of the input melody as a reference. You can set it to 60 (optimal) or another value.

Maximum Notes to Add: Controls the number of notes that can be added to the input melody. Increasing the value might introduce some out-of-place notes. I will set it to 80.

Sampling Iterations: Controls the number of times the input melody is passed through the model. I will set it to 90.

Creative Risk: Controls how much the model can deviate from the music it was trained on. If you set the value low, the model will choose only high-probability notes; higher values allow riskier choices. I will set it to 0.8.
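A setting like Creative Risk is commonly implemented as temperature sampling (this is an illustrative sketch, not DeepComposer's actual code): scores for candidate notes are turned into probabilities, and the temperature controls how concentrated those probabilities are.

```python
import math
import random

# Temperature sampling: lower temperature concentrates probability on the
# most likely notes; higher temperature flattens the distribution.
def note_probs(scores, temperature):
    weights = [math.exp(s / temperature) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

scores = [2.0, 1.0, 0.1]  # hypothetical model scores for three candidate notes

cautious = note_probs(scores, 0.2)  # low risk: the top note dominates
risky = note_probs(scores, 2.0)     # high risk: choices are spread out

print([round(p, 2) for p in cautious])  # [0.99, 0.01, 0.0]
print([round(p, 2) for p in risky])     # [0.5, 0.3, 0.19]

# Sampling an actual note under a given setting:
note = random.choices(["C", "E", "G"], weights=note_probs(scores, 0.8))[0]
```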


Step 7
In this step, you will get the output melody that has been generated using generative AI in AWS DeepComposer.


Step 8
You can edit the melody if you want, removing some of the notes, by clicking on Edit Melody.


Step 9
Once you are satisfied with the new melody, click the Continue button. You will be directed to the Share Composition page.


Step 10
That's it! Your Generative AI melody is ready. If you want to make additional changes, you can. Some of the options in AWS DeepComposer are:

  • Generate accompaniment track using GANs
  • Extend input track with Transformers

Afterward, you can also enter your melody in the AWS DeepComposer Chartbusters challenge.

