Scalable, Efficient Processing and Analysis of Large Audio Datasets – Pawel Cyrta – ADCx Gather 2024

Wednesday, April 2, 2025

By digitalmedium1

https://audio.dev/ -- @audiodevcon
---

Scalable, Efficient Processing and Analysis of Large Audio Datasets - Pawel Cyrta - ADC 2024
---

The exponential growth of audio data necessitates robust and scalable solutions for processing and analysis. This presentation introduces a novel approach to handling a colossal audio dataset (e.g., 40 TB or more) using various methods and the Ray framework for distributed computing.

When you have terabytes or petabytes of data, it is difficult to process it all in Python and finish on time. Distributed computation is now easy to perform thanks to the Ray framework (ray.io). I will show you how to use distributed methods in practice, based on my experience in analyzing and training ML audio, speech and language models. Across a wide range of applications, we always face the elementary problem of data preparation, and a dynamically created Ray cluster with calculation and optimization pipelines speeds it up many times over. I will show you the basics of the environment, how to navigate it, and how to prepare production-ready applications. In this talk, we provide practical tips on how to manage data to build a scalable, robust and reliable software system. We will delve into specific use cases, including feature extraction such as Mel-frequency cepstral coefficients (MFCCs) and spectrogram analysis, showcasing how Ray’s flexibility and scalability can transform conventional audio processing workflows.
The presentation will conclude with a discussion on aggregating results and deriving meaningful insights from large-scale audio data, providing attendees with actionable strategies to manage and analyze vast audio datasets effectively.
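
As a rough illustration of the workflow described above (per-clip feature extraction fanned out as Ray tasks, followed by aggregation), here is a minimal sketch. It is not the speaker's code. The function names, the synthetic clips and the `use_ray` switch are all assumptions made for illustration, and a simple short-time log energy stands in for MFCC or spectrogram extraction, which in practice would more likely come from an audio library such as librosa. Only basic Ray calls are used (`ray.init`, `ray.remote`, `ray.get`); Ray is treated as an optional dependency so the same code also runs locally.

```python
import math

try:
    import ray  # optional dependency; assumed installed via `pip install ray`
except ImportError:
    ray = None


def sine(freq_hz, amplitude, seconds, sample_rate=16000):
    """Synthetic clip (hypothetical stand-in for decoded audio samples)."""
    n = int(seconds * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


def frame_energies_db(samples, frame_len=400, hop=160):
    """Short-time log energy per frame.

    This is a deliberately simple stand-in for MFCC/spectrogram features.
    """
    feats = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        feats.append(10.0 * math.log10(energy + 1e-12))
    return feats


def summarize(samples):
    """Per-clip 'map' step: extract features and reduce them to a summary."""
    feats = frame_energies_db(samples)
    mean_db = sum(feats) / len(feats) if feats else float("-inf")
    return {"frames": len(feats), "mean_db": mean_db}


def process_dataset(clips, use_ray=False):
    """Run summarize() over every clip, optionally as parallel Ray tasks.

    `clips` maps a clip name to a list of float samples. A real pipeline
    would more likely pass file paths and decode audio inside each task.
    """
    names = sorted(clips)
    if use_ray:
        if ray is None:
            raise RuntimeError("Ray is not installed")
        ray.init(ignore_reinit_error=True)      # start or reuse a local cluster
        remote_summarize = ray.remote(summarize)
        refs = [remote_summarize.remote(clips[n]) for n in names]
        results = ray.get(refs)                 # block until all tasks finish
    else:
        results = [summarize(clips[n]) for n in names]
    return dict(zip(names, results))
```

Calling `process_dataset({"loud": sine(440, 0.5, 0.5), "quiet": sine(440, 0.05, 0.5)})` runs locally; passing `use_ray=True` dispatches one Ray task per clip instead. Keeping the per-clip function free of Ray-specific code makes it easy to test on a laptop before scaling out.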

Join Paweł as he shares invaluable insights and practical tips for mastering distributed parallel processing of massive audio data.
---

Paweł Cyrta

Paweł Cyrta is an Applied Research Scientist and ML Engineer with over 20 years of expertise in audio technology and machine learning.
His innovative work spans the realms of speech recognition, speech synthesis, natural language processing, and generative audio AI.
Currently, Paweł consults on emerging audio technology projects, delivering bespoke on-premise state-of-the-art ML solutions for complex speech and audio tasks, bridging the gap between cutting-edge technology and practical business solutions.

His diverse career spans multiple industries, including work with prominent organizations such as NowThisMedia, Rev.ai and Roche, where he implemented cutting-edge audio ML solutions.
At Samsung, he played a key role in developing speech recognition and synthesis for S-Voice in 24 European languages, a technology now available in Samsung TVs.

Paweł's academic background combines Computer Science and Electroacoustics from the Warsaw University of Technology with Computational Engineering from the HPC center at the University of Warsaw.
He completed a research internship at IRCAM in Paris focused on integrating natural emotions into speech and singing synthesis, bridging the gap between technology and expressive audio content.

He also shares his expertise as a lecturer in Deep Learning postgraduate studies at Warsaw University of Technology,
previously teaching "Interactive Systems" and "Interactive Sound II" at the Fryderyk Chopin University of Music.

As a composer and researcher in music technology, Paweł brings a unique perspective to audio ML, specializing in generative music, interactive systems, and algorithmic composition.
His multifaceted approach combines technical prowess with creative insight, driving innovation in sound analysis and processing. He has also served as a technical curator and artist at many digital art festivals in Poland.
---

ADC is an annual event celebrating all audio development technologies, from music applications and game audio to audio processing and embedded systems. ADC’s mission is to help attendees acquire and develop new audio development skills, and build a network that will support their audio developer career.
Annual ADC Conference - https://audio.dev/
https://www.linkedin.com/company/audiodevcon
https://twitter.com/audiodevcon
https://facebook.com/audiodevcon
https://instagram.com/audiodevcon
https://www.reddit.com/r/audiodevcon/
https://mastodon.social/@audiodevcon
---

Streamed & Edited by Digital Medium Ltd: https://online.digital-medium.co.uk
---

Organized and produced by JUCE: https://juce.com/
---

Special thanks to the ADCxGather Team:

Sophie Carus
Derek Heimlich
Andrew Kirk
Bobby Lombardi
Tom Poole
Ralph Richbourg
Jim Roper
Jonathan Roper
Prashant Mishra

#datasets #dataanalytics #adc #audiodev #audio #machinelearningprojects #audioprocessing #audioproduction #audioprogramming #dataengineering #data #audiotech #audiotechnology
