Developing an AI-Powered Karaoke Experience – Thomas Hézard & Clément Tabary – ADC23

Friday May 10, 2024

By digitalmedium1

https://audio.dev/ -- @audiodevcon

Karaoke has been popular for decades, from the first karaoke bars in the 1970s to the karaoke video games of today, and recent progress in deep learning has opened up new horizons. Audio source separation and voice transcription algorithms now make it possible to create a complete karaoke song, with an instrumental track and synchronised lyrics, from any mixed music track. Real-time stem remixing, pitch and tempo control, and singing quality assessment are further audio features that go beyond the traditional karaoke experience. In this talk we will discuss the challenges we had to tackle to provide our users with a fully automatic, integrated karaoke system adapted for both mobile and web platforms.
_

Thomas Hézard

Thomas leads the Audio Research & Development team at MWM, working with his team on innovative signal processing algorithms and their optimised implementation on various platforms. Before joining the MWM adventure, Thomas completed a PhD on voice analysis-synthesis at IRCAM in Paris. Fascinated by every aspect of sound and music, both artistic and scientific, Thomas is also a musician, a sound engineer, a passionate teacher, and an amateur photographer.
_

Clément Tabary

Clément is a deep-learning research engineer at MWM. He applies ML algorithms to a wide range of multimedia fields, from music information retrieval to image generation. He's currently working on audio source separation, music transcription, and automatic DJing.
_

Streamed & Edited by Digital Medium Ltd: https://online.digital-medium.co.uk
_

Organized and produced by JUCE: https://juce.com/
_

Special thanks to the ADC23 Team:

Sophie Carus
Derek Heimlich
Andrew Kirk
Bobby Lombardi
Tom Poole
Ralph Richbourg
Jim Roper
Jonathan Roper
Prashant Mishra

#adc #audiodev #ai #karaoke
