Adversarial Synthesis of Drum Sounds
July 26, 2020
Drysdale, J., Tomczak, M. and Hockman, J. 2020. Adversarial synthesis of drum sounds. In Proceedings of the 23rd International Conference on Digital Audio Effects, Vienna, Austria.
[pdf, presentation]
Recent advancements in generative audio synthesis have allowed for the development of creative tools for the generation and manipulation of audio. In this project, a strategy is proposed for the synthesis of drum sounds using generative adversarial networks (GANs). The system is based on a conditional Wasserstein GAN, which learns the underlying probability distribution of a dataset of labelled drum sounds. Each label is an integer value on which the system is conditioned, so that audio with the desired characteristics can be generated by choosing the corresponding label. Synthesis is controlled by an input latent vector that enables continuous exploration and interpolation of generated waveforms.
Audio Examples
Results accompanying the paper “Adversarial Synthesis of Drum Sounds”, presented at the International Conference on Digital Audio Effects 2020.
Training Data
A random selection of 30 examples from the dataset used in training.
Generations
A random selection of 30 examples from the generated data.
Usage demonstration
Example usage within loop-based electronic music compositions. The percussive elements of the following tracks were created using a selection of samples from the generated data. A light amount of post-processing (equalisation and volume envelope shaping) was applied to mix the sounds.
Generating Drum Loops
Below are some examples of the system's capacity to generate one-bar loops. A dataset of one-bar drum loops at 130 bpm was compiled and each loop was sliced into 16th-note segments. The system is conditioned on the position of each segment within the bar (giving a total of 16 classes) and then trained for a number of iterations. A loop can be created by generating a waveform for each of the 16 classes and then concatenating the waveforms in order.

Some more examples can be found here: https://soundcloud.com/beatsbygan
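The slicing and concatenation steps described above can be sketched as follows. This is a minimal illustration, not the code used for the examples: the 44.1 kHz sample rate, the 4/4 time signature and the helper names (segment_length, slice_loop, assemble_loop, dummy_generate) are all assumptions, and dummy_generate is only a noise-producing stand-in for a trained conditional generator.

```python
import numpy as np

SAMPLE_RATE = 44100   # assumed sample rate in Hz (not stated on this page)
BPM = 130             # tempo of the loop dataset
STEPS_PER_BAR = 16    # 16th notes in one bar, assuming 4/4 time


def segment_length(sample_rate=SAMPLE_RATE, bpm=BPM):
    """Number of samples in one 16th note at the given tempo."""
    seconds_per_beat = 60.0 / bpm
    return int(round(sample_rate * seconds_per_beat / 4))


def slice_loop(loop, n_segments=STEPS_PER_BAR):
    """Split a one-bar loop into equal segments; segment i gets class label i."""
    seg_len = len(loop) // n_segments
    return [loop[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


def assemble_loop(generate, n_segments=STEPS_PER_BAR):
    """Generate one waveform per class label and concatenate them in order."""
    return np.concatenate([generate(label) for label in range(n_segments)])


def dummy_generate(label, seg_len=segment_length()):
    """Hypothetical stand-in for a trained generator: returns a noise segment."""
    rng = np.random.default_rng(label)
    return rng.uniform(-1.0, 1.0, seg_len).astype(np.float32)


loop = assemble_loop(dummy_generate)
print(len(loop) / SAMPLE_RATE)  # duration of the assembled bar in seconds
```

With a real model, dummy_generate would be replaced by a function that samples a latent vector and runs the generator conditioned on the given class label.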
Interpolation demonstration
The proposed system learns to map points in the latent space to generated waveforms. The structure of the latent space can be explored by interpolating between points in the space. For the following experiments, the GAN was trained with a three-dimensional latent space.
Figure 2: Interpolation in the latent space for kick drum generation. Kick drums are generated for each point along linear paths through the latent space (left). Paths are colour coded and the corresponding generated audio appears across rows (right).
A to B interpolation
In the following examples, two generated drum samples are selected and their latent vectors are noted. A linear path of 30 steps between the two latent vectors is created and a waveform is generated at each of those 30 steps.
Interpolating between Snare A and Snare B.
Interpolating between Kick A and Kick B.
Interpolating between Cymbal A and Cymbal B.
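The linear path used in these A to B examples can be sketched with NumPy as below. The function name linear_path and the example vectors are hypothetical, and the sketch assumes that the 30 steps include both endpoints, which is not stated here.

```python
import numpy as np


def linear_path(z_a, z_b, n_steps=30):
    """Return n_steps latent vectors evenly spaced on the line from z_a to z_b.

    Assumption: both endpoints are included in the path.
    """
    z_a = np.asarray(z_a, dtype=float)
    z_b = np.asarray(z_b, dtype=float)
    # Column of interpolation weights from 0 to 1, broadcast over dimensions.
    t = np.linspace(0.0, 1.0, n_steps)[:, None]
    return (1.0 - t) * z_a + t * z_b


# Illustrative three-dimensional latent vectors (made-up values).
z_snare_a = [0.2, -1.0, 0.5]
z_snare_b = [-0.7, 0.3, 1.1]
path = linear_path(z_snare_a, z_snare_b)
print(path.shape)  # one latent vector per row
```

Each row of the result would then be passed to the generator, together with the class label, to produce one waveform per step.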
Linear interpolation
More examples of linear interpolation between two random points.
Spherical interpolation
Examples of spherical interpolation between two random points.
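For comparison with the linear case, spherical interpolation can be sketched using the standard slerp formula. This is a generic illustration rather than the exact routine used for these examples; the function name slerp and the fallback to linear interpolation for (anti)parallel vectors are assumptions.

```python
import numpy as np


def slerp(z_a, z_b, t):
    """Spherical interpolation between latent vectors z_a and z_b at t in [0, 1]."""
    z_a = np.asarray(z_a, dtype=float)
    z_b = np.asarray(z_b, dtype=float)
    # Angle between the two vectors.
    cos_omega = np.dot(z_a, z_b) / (np.linalg.norm(z_a) * np.linalg.norm(z_b))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    sin_omega = np.sin(omega)
    if np.isclose(sin_omega, 0.0):
        # Vectors are (anti)parallel: fall back to linear interpolation.
        return (1.0 - t) * z_a + t * z_b
    return (np.sin((1.0 - t) * omega) * z_a + np.sin(t * omega) * z_b) / sin_omega


# 30-step spherical path between two random three-dimensional points.
rng = np.random.default_rng(0)
z_a, z_b = rng.standard_normal(3), rng.standard_normal(3)
path = np.stack([slerp(z_a, z_b, t) for t in np.linspace(0.0, 1.0, 30)])
print(path.shape)
```

Unlike a straight line, this path follows an arc between the two points, so for two vectors of equal norm the intermediate points keep that same norm instead of passing closer to the origin.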
References
[1] Drysdale, J., Tomczak, M. and Hockman, J. Adversarial Synthesis of Drum Sounds. Proceedings of the 23rd International Conference on Digital Audio Effects (DAFx), 2020.
@inproceedings{drysdale2020ads,
title={Adversarial synthesis of drum sounds},
author={Drysdale, Jake and Tomczak, Maciej and Hockman, Jason},
booktitle={Proceedings of the International Conference on Digital Audio Effects (DAFx)},
year={2020}
}
Help
If you have any questions, please feel free to contact me at jake.drysdale@bcu.ac.uk