https://audio.dev/ -- @audiodevcon
Developing an AI-Powered Karaoke Experience - Thomas Hézard & Clément Tabary - ADC23
Karaoke has been popular for decades, from the first karaoke bars in the 1970s to today's karaoke video games, and recent progress in deep learning has opened up new horizons. Audio source separation and voice transcription algorithms now make it possible to create a complete karaoke song, with an instrumental track and synchronised lyrics, from any mixed music track. Real-time stem remixing, pitch and tempo control, and singing quality assessment are further audio features that go beyond the traditional karaoke experience. In this talk, we will discuss the challenges we had to tackle to provide our users with a fully automatic and integrated karaoke system adapted for both mobile and web platforms.
_
Thomas Hézard
Thomas leads the Audio Research & Development team at MWM, working with his team on innovative signal processing algorithms and their optimised implementation on various platforms. Before joining the MWM adventure, Thomas completed a PhD on voice analysis-synthesis at IRCAM in Paris. Fascinated by every aspect of sound and music, both artistic and scientific, Thomas is also a musician, a sound engineer, a passionate teacher, and an amateur photographer.
_
Clément Tabary
Clément is a deep-learning research engineer at MWM. He applies ML algorithms to a wide range of multimedia fields, from music information retrieval to image generation. He's currently working on audio source separation, music transcription, and automatic DJing.
_
Streamed & Edited by Digital Medium Ltd: https://online.digital-medium.co.uk
_
Organized and produced by JUCE: https://juce.com/
_
Special thanks to the ADC23 Team:
Sophie Carus
Derek Heimlich
Andrew Kirk
Bobby Lombardi
Tom Poole
Ralph Richbourg
Jim Roper
Jonathan Roper
Prashant Mishra
#adc #audiodev #ai #karaoke