Creating Your Own Singing Voice Synthesizer: Overcoming Data Collection Challenges – by @mattricesound – ADCx SF

Tuesday, August 1, 2023

By digitalmedium1

Join Us For ADC23 - London - 13-15 November 2023
More Info: https://audio.dev/
@audiodevcon

Creating Your Own Singing Voice Synthesizer: Overcoming Data Collection Challenges - Matthew Rice - ADCx SF

While singing voice synthesizers have existed for decades, recent deep-learning-based products (Sinsy, Vocaloid) have greatly improved the quality of the results. However, these systems provide only a limited number of pre-trained "voices" based on proprietary datasets. Luckily, open-source systems (NNSVS, OpenUtau, VISinger, DiffSinger) exist that let users build a singing voice synthesizer from their own datasets. Unfortunately, creating the necessary datasets is time-consuming, because it requires collecting phoneme-level timing and other data points. As a result, few public datasets are available, and those that do exist are mostly restricted to Mandarin Chinese and Japanese. In this talk, I will demonstrate several approaches to collecting this data, from manual labeling to fully automated procedures, making it easier for everyone to create their own personalized singing voice synthesizers.

Slides: https://data.audio.dev/talks/ADCxSF/2023/creating-your-own-singing-voice-synthesizer/slides.pdf
_
Matthew Rice

Matthew Rice is a master's student at Queen Mary University of London, studying Sound and Music Computing with a focus on music production applications of deep learning. Previously, Matthew was at startup Mayk as a software engineer, working on both the audio engine and audio research teams. Matthew also has experience in digital hardware and embedded systems, having worked at Qualcomm designing PMICs and audio codec drivers.

Edited by Digital Medium Ltd - online.digital-medium.co.uk
_

Organized and produced by JUCE: https://juce.com/
_

Special thanks to the ADC Team:

Sophie Carus
Derek Heimlich
Andrew Kirk
Bobby Lombardi
Tom Poole
Ralph Richbourg
Jim Roper
Jonathan Roper

#audiodevcon #audiodev #synthesizer
