

Scalable, Efficient Processing and Analysis of Large Audio Datasets

18:20 - 18:40 UTC | Friday 1st November 2024 | ADCx Gather
Online Only

The exponential growth of audio data calls for robust, scalable solutions for processing and analysis. This presentation introduces an approach to handling a very large audio dataset (30 TB or more) using a combination of methods and the Ray framework for distributed computing.
When you have terabytes or petabytes of data, it is difficult to process it all in Python and finish in a reasonable time. The Ray framework (ray.io) makes distributed computation much easier to set up. I will show how to use distributed methods in practice, based on my experience analyzing data for, and training, ML audio, speech and language models. Across a wide range of applications we always face the same elementary problem of data preparation, and a dynamically created Ray cluster running computation and optimization pipelines can speed it up many times over. I will cover the basics of the environment, how to navigate it, and how to prepare production-ready applications, with practical tips on managing data to build a scalable, robust and reliable software system.
We will delve into specific use cases, including feature extraction such as Mel-frequency cepstral coefficients (MFCCs) and spectrogram analysis, showing how Ray's flexibility and scalability can transform conventional audio processing workflows. The presentation will conclude with a discussion of aggregating results and deriving meaningful insights from large-scale audio data, giving attendees actionable strategies for managing and analyzing vast audio datasets effectively.
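
To make the map-then-aggregate pattern concrete, here is a minimal sketch and not the speaker's actual pipeline. It assumes the dataset has already been split into shards of decoded clips, and it uses a per-clip RMS level as a stand-in for a real feature; in practice a call such as `librosa.feature.mfcc` would take its place. The names `clip_rms`, `summarize_shard`, `merge_stats` and `run_with_ray` are hypothetical. Each worker reduces its shard to a small `(count, mean, M2)` summary, and the summaries are merged pairwise, so the raw audio never has to be gathered in one place.

```python
import math
from functools import reduce


def clip_rms(samples):
    """RMS level of one non-empty clip (stand-in for a real feature such as MFCCs)."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def summarize_shard(clips):
    """Map step: reduce one shard of clips to (count, mean, M2) of their RMS values."""
    count, mean, m2 = 0, 0.0, 0.0
    for clip in clips:
        value = clip_rms(clip)
        count += 1
        delta = value - mean
        mean += delta / count
        m2 += delta * (value - mean)  # Welford's online update
    return count, mean, m2


def merge_stats(a, b):
    """Reduce step: combine two partial (count, mean, M2) summaries."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    if n == 0:
        return 0, 0.0, 0.0
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return n, mean, m2


def run_with_ray(shards):
    """Run the map step as Ray tasks, then merge the partial summaries locally.

    Assumes Ray is installed. The import is lazy so the pure functions above
    can be tested without Ray or a cluster.
    """
    import ray

    ray.init(ignore_reinit_error=True)  # starts a local instance if none is running
    remote_summarize = ray.remote(summarize_shard)
    futures = [remote_summarize.remote(shard) for shard in shards]
    partials = ray.get(futures)  # blocks until every shard is summarized
    return reduce(merge_stats, partials, (0, 0.0, 0.0))
```

Calling `run_with_ray(shards)` returns the combined count, mean and sum of squared deviations; dividing the last value by the count gives the population variance. Because `merge_stats` is associative, the partial results can be combined in any order, which is what makes this kind of aggregation a good fit for distributed execution.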

Pawel Cyrta