Tag: audio processing

Python Templates for Neural Image Classification and Spectral Audio Processing – Part 2

https://audio.dev/ -- @audiodevcon
ADCx India - 29th March
ADC Bristol - 9th - 11th November

Python Templates for Neural Image Classification and Spectral Audio Processing - Lightning Hydra Template Extended and Neural Spectral Modeling Template - Julius Smith - ADCx Gather 2025

This presentation introduces two open-source research frameworks for neural image classification and spectral audio processing: (1) the Lightning Hydra Template Extended (LHTE) and (2) the Neural Spectral Modeling Template (NSMT). The LHTE extends the widely used PyTorch Lightning + Hydra template with state-of-the-art architectures (CNNs, ConvNeXt, EfficientNet, Vision Transformers) and expanded dataset support, adding CIFAR-10, CIFAR-100, and a new generalized Variable Image Multi-Head (VIMH) format. VIMH accommodates extremely large image/channel dimensions and multi-head tasks, and supports both classification and regression from a single shared backbone. The LHTE also provides reproducible benchmark experiments and systematic workflows for rapid model comparison.

Built upon the LHTE, the NSMT specializes in spectral audio modeling, where stacked spectrograms and other 2D audio representations serve as image-like inputs. By leveraging the perceptual inductive priors of human hearing, the NSMT avoids the computational expense of end-to-end waveform modeling while maintaining high accuracy. Applications include synthesizer parameter estimation (tested on sawtooth oscillators, and Moog VCFs with ADSR envelopes), instrument recognition, and real-time effect control. NSMT emphasizes small, efficient architectures, extended spectral representations, auxiliary conditioning inputs, and enhanced VIMH support for audio-specific datasets.
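As a rough illustration of how a 2D audio representation can serve as an image-like input, the sketch below builds a two-channel "image" from a sawtooth test tone: one channel holds the linear-magnitude spectrogram and the other the log-magnitude (dB) spectrogram. This is not the NSMT code. The function names, the frame and hop sizes, and the choice of which representations to stack are all assumptions made for the example, and a direct DFT stands in for an FFT to keep it dependency-free.

```python
import cmath
import math


def sawtooth(freq_hz, sample_rate, num_samples):
    """Naive (non-band-limited) sawtooth with values in [-1, 1)."""
    return [2.0 * ((freq_hz * n / sample_rate) % 1.0) - 1.0
            for n in range(num_samples)]


def magnitude_spectrogram(signal, frame_size, hop_size):
    """Hann-windowed magnitude STFT, returned as a list of frames x bins."""
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame_size)
              for n in range(frame_size)]
    num_bins = frame_size // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop_size):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        # Direct DFT for clarity; a practical pipeline would use an FFT.
        frames.append([
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                    for n in range(frame_size)))
            for k in range(num_bins)
        ])
    return frames


def stacked_spectrogram_image(signal, frame_size, hop_size, eps=1e-9):
    """Stack linear and dB spectrograms as channels x frames x bins.

    Which representations to stack is an assumption of this sketch.
    """
    linear = magnitude_spectrogram(signal, frame_size, hop_size)
    log_db = [[20.0 * math.log10(m + eps) for m in frame] for frame in linear]
    return [linear, log_db]


if __name__ == "__main__":
    tone = sawtooth(freq_hz=500.0, sample_rate=8000, num_samples=256)
    image = stacked_spectrogram_image(tone, frame_size=64, hop_size=32)
    print(len(image), len(image[0]), len(image[0][0]))  # channels, frames, bins
```

The resulting nested list has the same channels-by-height-by-width layout that image classifiers expect, which is what lets an image backbone be reused for audio tasks such as estimating the oscillator frequency.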

Together, the LHTE and NSMT form robust, reproducible platforms for advancing machine learning research at the intersection of vision and audio. Code, datasets, and other resources are available online for immediate adoption.
---

Julius Smith

Julius O. Smith is a research engineer, educator, and musician devoted primarily to developing new technologies for music and audio signal processing. He received the B.S.E.E. degree from Rice University in 1975 (Control, Circuits, and Communication), and the M.S. and Ph.D. degrees in E.E. from Stanford University, in 1978 and 1983, respectively. For his MS/EE, he focused largely on statistical signal processing. His Ph.D. research was devoted to improved methods for digital filter design and system identification applied to music and audio systems, particularly the violin. From 1975 to 1977 he worked in the Signal Processing Department at ESL, Sunnyvale, CA, on systems for digital communications. From 1982 to 1986 he was with the Adaptive Systems Department at Systems Control Technology, Palo Alto, CA, where he worked in the areas of adaptive filtering and spectral estimation. From 1986 to 1991 he was employed at NeXT Computer, Inc., responsible for sound, music, and signal processing software for the NeXT computer workstation. After NeXT, he became a Professor at the Center for Computer Research in Music and Acoustics (CCRMA) at Stanford, with a courtesy appointment in EE, teaching courses and pursuing/supervising research related to signal processing techniques applied to music and audio systems. At varying part-time levels, he was a founding consultant for Staccato Systems, Shazam Inc., and moForte Inc. He is presently a Professor Emeritus of Music and by courtesy Electrical Engineering at Stanford, and a perennial consultant for moForte Inc. and a few others. For more information, see https://ccrma.stanford.edu/~jos/.

---

ADC is an annual event celebrating all audio development technologies, from music applications and game audio to audio processing and embedded systems. ADC’s mission is to help attendees acquire and develop new audio development skills, and build a network that will support their audio developer career.
Annual ADC Conference - https://audio.dev/
https://www.linkedin.com/company/audiodevcon

https://facebook.com/audiodevcon
https://instagram.com/audiodevcon
https://www.reddit.com/r/audiodevcon/
https://mastodon.social/@audiodevcon
---

Streamed & Edited by Digital Medium Ltd: https://online.digital-medium.co.uk
---

Organized and produced by JUCE: https://juce.com/
---

Special thanks to the ADCxGather Team:

Sophie Carus
Derek Heimlich
Andrew Kirk
Bobby Lombardi
Tom Poole
Ralph Richbourg
Jim Roper
Jonathan Roper
Prashant Mishra

#adc #audiodev #dsp #audio #conferenceaudio #audioprocessing #audioproduction #audioprogramming #sound #music #musictech #soundtech #audiotech #audiotechnology

Filed under: Uncategorized

SRC – Sample Rate Converters in Digital Audio Processing – Theory and Practice – ADC 2024


SRC - Sample Rate Converters in Digital Audio Processing - Theory and Practice - Christian Gilli & Michele Mirabella - ADC 2024
---

Sample Rate Conversion (SRC) is a key component of digital audio processing that changes the number of samples per second used to represent an audio stream. It is essential whenever audio from one system must work with another system that uses a different sample rate. Getting SRC right is crucial in many audio applications, particularly in environments where multiple audio devices coexist, each potentially running at its own clock frequency.
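To make the idea concrete, here is a minimal sketch of a converter that changes the sample rate by linear interpolation between neighboring input samples. It is not the method presented in the talk. Linear interpolation is chosen only because it is short, and practical converters generally use band-limited filtering (for example polyphase or windowed-sinc designs) to suppress imaging and aliasing. The function name and signature are hypothetical.

```python
import math


def resample_linear(samples, rate_in, rate_out):
    """Resample by linear interpolation (illustrative only, not band-limited)."""
    if not samples:
        return []
    step = rate_in / rate_out  # input samples advanced per output sample
    num_out = int(math.floor((len(samples) - 1) / step)) + 1
    last = len(samples) - 1
    out = []
    for m in range(num_out):
        pos = m * step          # fractional read position in the input
        i = int(pos)            # index of the sample at or before pos
        frac = pos - i          # distance past that sample, in [0, 1)
        nxt = samples[min(i + 1, last)]  # clamp at the final sample
        out.append((1.0 - frac) * samples[i] + frac * nxt)
    return out


if __name__ == "__main__":
    ramp = [0.0, 1.0, 2.0, 3.0, 4.0]
    print(resample_linear(ramp, rate_in=2, rate_out=4))  # upsample by 2
    print(resample_linear(ramp, rate_in=4, rate_out=2))  # downsample by 2
```

Upsampling the ramp fills in the midpoints between input samples, while downsampling keeps every second sample. On real audio, downsampling this way without a low-pass filter would let content above the new Nyquist frequency alias into the output.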
The importance of SRC stems from three main factors: (1) preventing pitch distortion, since running an audio stream at an incorrect rate can alter pitch or the relative relationships between pitches; (2) maintaining synchronization, i.e., ensuring that different devices remain in step with one another; and (3) compensating for clock drift by accounting for slight frequency variations between devices nominally operating at the same frequency.
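The first and third factors can be put in numbers with a little arithmetic. The sketch below computes the pitch offset that results from playing a stream at the wrong rate, and how quickly two nominally identical clocks drift apart. The function names, the 50 ppm clock offset, and the 256-sample buffer margin are hypothetical values chosen for illustration, not figures from the talk.

```python
import math


def pitch_shift_semitones(recorded_rate, playback_rate):
    """Pitch offset, in semitones, when audio is played at the wrong rate."""
    return 12.0 * math.log2(playback_rate / recorded_rate)


def drift_samples_per_second(nominal_rate, ppm_offset):
    """Samples gained or lost per second for a clock offset given in ppm."""
    return nominal_rate * ppm_offset * 1e-6


def seconds_until_slip(buffer_margin_samples, nominal_rate, ppm_offset):
    """Time until uncorrected drift uses up a given buffer margin."""
    return buffer_margin_samples / drift_samples_per_second(
        nominal_rate, ppm_offset)


if __name__ == "__main__":
    # A 44.1 kHz recording played back unconverted at 48 kHz.
    print(pitch_shift_semitones(44100, 48000))
    # Two 48 kHz devices whose clocks differ by a hypothetical 50 ppm.
    print(drift_samples_per_second(48000, 50))
    print(seconds_until_slip(256, 48000, 50))
```

Under these assumptions the unconverted playback is sharp by roughly 1.47 semitones, and the two devices drift apart by about 2.4 samples per second, so a 256-sample margin would be used up in under two minutes without some form of rate compensation.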

This presentation will begin with an introduction to the fundamental principles of SRC.

The goal for this talk is to give attendees a comprehensive understanding of SRC's importance in audio processing.
---

Slides: https://data.audio.dev/talks/2024/sample-rate-conversion/slides.pptx
---

Christian Gilli

I'm a software engineer with 6+ years of experience working on high-performance implementations of DSP algorithms for audio applications in C++.

Right now, I spend most of my time writing high-performance numerical software for DSP applications on heterogeneous platforms.
---

Michele Mirabella

Michele Mirabella received the B.S. degree (cum laude) in electronic engineering from the University of Modena and Reggio Emilia, Italy, in 2019, and the M.S. degree (cum laude) in electronic engineering from the University of Bologna in 2021. He is currently pursuing the Ph.D. degree with the University of Modena and Reggio Emilia, Italy. His main research interests lie in the area of joint communication and sensing systems.
---



Special thanks to the ADC24 Team:

Sophie Carus
Derek Heimlich
Andrew Kirk
Bobby Lombardi
Tom Poole
Ralph Richbourg
Jim Roper
Jonathan Roper
Prashant Mishra

#digitalaudio #cpp #adc #audiodev #dsp #audio #cppprogramming #audioprocessing #audioproduction #audioprogramming #musictech #soundtech #audiotech #audiotechnology

Filed under: Uncategorized