Interpolation of Synthesizer Presets using Timbre-Regularized Auto-Encoders

Gwendal Le Vaillant; Thierry Dutoit

doi:10.36227/techrxiv.170327281.10174081/v2

Abstract

Sound synthesizers are ubiquitous in modern music production but manipulating their presets, i.e. the sets of synthesis parameters, demands expert skills. This study presents a novel variational auto-encoder model tailored for black-box synthesizer preset interpolation, which enables the intuitive generation of new sounds from pre-existing ones. Leveraging multi-head self-attention networks, the model efficiently learns latent representations of presets, aligning these with perceived timbre dimensions through attribute-based regularization. It is able to smoothly transition between diverse presets, surpassing traditional linear parametric interpolation methods. Furthermore, we introduce an objective and reproducible evaluation method, based on smoothness and linearity metrics computed on a broad set of audio features. The model's efficacy is demonstrated through subjective experiments, whose results also highlight significant correlations with the proposed objective metrics. The model is validated using a widespread frequency modulation synthesizer with a large set of interdependent parameters. It can be adapted to various commercial synthesizers, and can perform other tasks such as modulations and extrapolations.
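
As a rough illustration of the two interpolation strategies contrasted in the abstract, the sketch below compares direct linear interpolation of synthesis parameters with interpolation in a latent space. It is a minimal sketch under stated assumptions, not the authors' implementation: the `encode` and `decode` callables stand in for a trained auto-encoder, presets are simplified to flat lists of continuous values, and `roughness` is a hypothetical second-difference measure used only to show the idea of scoring a feature trajectory. It is not the smoothness or linearity metric defined in the paper.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def lerp(a: Sequence[float], b: Sequence[float], t: float) -> Vector:
    """Linearly interpolate between two equal-length vectors, with t in [0, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def parametric_interpolation(start: Sequence[float], end: Sequence[float],
                             n_steps: int) -> List[Vector]:
    """Baseline: interpolate the synthesis parameters directly."""
    if n_steps < 2:
        raise ValueError("n_steps must be at least 2")
    return [lerp(start, end, i / (n_steps - 1)) for i in range(n_steps)]


def latent_interpolation(start: Sequence[float], end: Sequence[float],
                         n_steps: int,
                         encode: Callable[[Sequence[float]], Vector],
                         decode: Callable[[Sequence[float]], Vector]) -> List[Vector]:
    """Interpolate between latent codes, then decode each step back to a preset.

    `encode` and `decode` are hypothetical placeholders for a trained model.
    """
    if n_steps < 2:
        raise ValueError("n_steps must be at least 2")
    z0, z1 = encode(start), encode(end)
    return [decode(lerp(z0, z1, i / (n_steps - 1))) for i in range(n_steps)]


def roughness(feature_values: Sequence[float]) -> float:
    """Illustrative score (an assumption, not the paper's metric):
    mean absolute second-order difference of one audio feature along
    the interpolation. It is 0.0 for a perfectly linear trajectory."""
    if len(feature_values) < 3:
        raise ValueError("need at least 3 values")
    diffs = [abs(feature_values[i - 1] - 2.0 * feature_values[i] + feature_values[i + 1])
             for i in range(1, len(feature_values) - 1)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Toy stand-ins for a trained encoder/decoder (pure assumptions).
    toy_encode = lambda p: [2.0 * x for x in p]
    toy_decode = lambda z: [x / 2.0 for x in z]

    steps = latent_interpolation([0.0, 1.0], [1.0, 0.0], 3, toy_encode, toy_decode)
    print(steps)
    print(roughness([0.0, 0.5, 1.0]), roughness([0.0, 1.0, 0.0]))
```

In practice the audio features would be computed on sounds rendered from each interpolated preset. With the toy linear encoder above, both strategies give identical results; any difference would only appear with a learned, non-linear model.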

22 Apr 2024: Submitted to TechRxiv

29 Apr 2024: Published in TechRxiv
