Pradeep Rengaswamy

Senior Technical Specialist

Sony India Software Center

About Me

Hi everyone, my name is Dr. R. Pradeep, and I'm a Senior Technical Specialist at Sony India Software Center. My passion lies in pushing the boundaries of audio technology, particularly in the areas of speech, audio, and music signal processing.

During my PhD at IIT Kharagpur (2014-2021), my research focused on fundamental frequency estimation – the essence of pitch – across music, speech, and singing voices. This work has been published in prestigious conferences like INTERSPEECH, showcasing my commitment to impactful research.

Since joining Sony, I've worked on a wider range of audio applications: analyzing audio quality, personalizing the user audio experience, and exploring real-time systems for speech voice conversion, singing voice conversion, target speaker separation, and speech enhancement. I'm constantly seeking innovative ways to improve the user audio experience.

In this talk, I'm excited to share my expertise in differentiable digital signal processing applied to the audio domain. We'll explore models that can be trained without a labelled dataset and look at their potential use cases.

Let's embark on this journey together and unlock the potential of deep learning for the future of audio engineering!

Sessions

  • Deep Dive

    Unsupervised Audio Processing with Differentiable Digital Signal Processing (DDSP)
    14:00 - 14:50 UTC | Tuesday 12th November 2024 | Empire
    Beginner | Intermediate | Advanced

    Traditional audio processing often relies on labeled data, hindering its scalability and efficiency. This session delves into Differentiable Digital Signal Processing (DDSP), a revolutionary approach that enables unsupervised learning for various audio tasks. We'll explore the core principles of DDSP and its advantages for audio processing. The presentation will showcase how DDSP can be harnessed for tasks like feature extraction, audio parameterization, and even audio generation – all without the need for extensive labeled datasets. Crucially, DDSP allows us to learn the inherent structural priors of audio data itself. This refers to the underlying patterns and relationships within audio signals […]
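
As a rough illustration of the idea of learning audio parameters without labels, the sketch below fits the harmonic amplitudes of a simple additive synthesizer to a target signal using only a reconstruction loss. It is a simplified stand-in under stated assumptions, not the session's method: the fundamental frequency is assumed known, the gradient is written out by hand with NumPy, and the names `harmonic_basis` and `fit_amplitudes` are hypothetical. In DDSP-style systems, a neural network would typically predict such synthesizer controls and automatic differentiation would supply the gradients.

```python
import numpy as np


def harmonic_basis(f0_hz, n_harmonics, n_samples, sample_rate):
    """Return sinusoids at integer multiples of f0, shape (n_harmonics, n_samples)."""
    t = np.arange(n_samples) / sample_rate
    k = np.arange(1, n_harmonics + 1)[:, None]
    return np.sin(2.0 * np.pi * k * f0_hz * t[None, :])


def fit_amplitudes(target, basis, steps=200, lr=0.5):
    """Gradient descent on mean squared reconstruction error; no labels involved."""
    amps = np.zeros(basis.shape[0])
    for _ in range(steps):
        residual = amps @ basis - target              # synthesized minus target
        grad = 2.0 * (basis @ residual) / target.size  # d(MSE)/d(amps)
        amps -= lr * grad
    return amps


# Assumed toy setup: a 220 Hz tone with four harmonics, 0.1 s at 16 kHz.
sample_rate = 16000
basis = harmonic_basis(220.0, 4, 1600, sample_rate)
true_amps = np.array([1.0, 0.5, 0.25, 0.125])
target = true_amps @ basis  # the "recording" we only observe as audio

est_amps = fit_amplitudes(target, basis)
print(np.round(est_amps, 4))
```

The only supervision here is the audio itself: the synthesizer's structure (a sum of harmonics) acts as the prior, and the loss compares the synthesized signal with the observed one.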