
GPU Based Audio Processing Platform with AI Audio Effects

15:30 - 15:50 UTC | Friday 1st November 2024 | ADCx Gather
Online Only

GPUs are optimised for parallel processing and can perform some audio processing tasks much more efficiently than traditional DSP- or CPU-based methods. Parallelising real-time audio effects, however, requires complex task management and synchronisation. Furthermore, recent trends advocate using AI audio processing algorithms, which are best run on GPU architectures.
This talk presents an implementation of an embedded GPU-based audio processing framework on an Nvidia Jetson hardware platform. It is capable of combining neural network inference and other audio effects into signal graphs that process within periods as small as 32 frames (0.667 ms). Aside from the real-time limit imposed by the period size, the signal graph has no restrictions on the number and combination of parallel and serial audio effects. As a result, the framework handles large numbers of parallel channels, as found in a mixing console, and the complex routing options available in high-end audio effect processors, such as the Neural DSP Quad Cortex.
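For concreteness, the quoted deadlines follow directly from the sample rate; a minimal sketch, assuming 48 kHz (an inference from the figures above, not stated in the abstract):

```cpp
// Period-deadline arithmetic: a period of n frames must be fully
// processed within n / 48000 seconds at an assumed 48 kHz sample rate.
#include <cstdio>

int main() {
    const double sampleRateHz = 48000.0;  // assumed sample rate
    for (int frames : {32, 128}) {
        double deadlineMs = 1000.0 * frames / sampleRateHz;
        std::printf("%3d frames -> %.3f ms deadline\n", frames, deadlineMs);
    }
    return 0;
}
```

This reproduces both figures used here: 32 frames gives 0.667 ms, and 128 frames gives 2.667 ms.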
Launching GPU work using the CUDA graph API produces better stability and performance than was observed using the CUDA stream API in a 2017 study [audio-processing-in-opencl].
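As a rough illustration of the graph-API launch pattern (a sketch in CUDA C++, not the talk's actual implementation; the kernel and buffer names are hypothetical), a signal chain can be captured once into a CUDA graph and then replayed every period with a single launch call, instead of paying per-kernel launch overhead inside the deadline:

```cpp
#include <cuda_runtime.h>

// Hypothetical effect kernel standing in for one node of a signal graph.
__global__ void gainKernel(float* buf, int n, float gain) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] *= gain;
}

int main() {
    const int frames = 32;  // one audio period
    float* d_buf;
    cudaMalloc(&d_buf, frames * sizeof(float));

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Record the whole signal chain once via stream capture.
    cudaGraph_t graph;
    cudaStreamBeginCapture(stream, cudaStreamCaptureModeGlobal);
    gainKernel<<<1, frames, 0, stream>>>(d_buf, frames, 0.5f);
    // ...further effect kernels would be captured here...
    cudaStreamEndCapture(stream, &graph);

    cudaGraphExec_t exec;
    cudaGraphInstantiate(&exec, graph, nullptr, nullptr, 0);

    // Replay the captured chain each period with one launch call.
    for (int period = 0; period < 1000; ++period) {
        cudaGraphLaunch(exec, stream);
        cudaStreamSynchronize(stream);
    }

    cudaGraphExecDestroy(exec);
    cudaGraphDestroy(graph);
    cudaStreamDestroy(stream);
    cudaFree(d_buf);
    return 0;
}
```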
Processing a signal graph that pushes the Jetson to its limit by mimicking a 64-channel mixing console with a 128-frame (2.67 ms) period achieves a success rate above 99%. However, occasional stalling on the GPU still produces worst-case execution times of up to 20 ms, regardless of the loaded audio effects.
As a result, the framework cannot yet be classified as real-time capable. Further study of the CUDA scheduler, together with improvements to the operating system and audio driver, may achieve real-time capability in the future.
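For reference, a hedged sketch of how per-period success rates and worst-case execution times like those above could be measured from the host side; the deadline value matches the 128-frame period, and the actual GPU processing step is elided:

```cpp
#include <chrono>
#include <cstdio>

int main() {
    const double deadlineMs = 2.667;  // 128 frames at an assumed 48 kHz
    const long periods = 10000;
    double worstMs = 0.0;
    long missed = 0;

    for (long i = 0; i < periods; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        // ... process one period here, e.g. cudaGraphLaunch + stream sync ...
        auto t1 = std::chrono::steady_clock::now();
        double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
        if (ms > worstMs) worstMs = ms;
        if (ms > deadlineMs) ++missed;  // a miss counts against the success rate
    }
    std::printf("success rate: %.2f%%  worst case: %.3f ms\n",
                100.0 * (periods - missed) / periods, worstMs);
    return 0;
}
```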

Simon Schneider