Engineering Practices Break Music Interaction (but Can Also Fix It)
Have you ever encountered an interactive system of great technical prowess but with lousy interaction capabilities? And what about one with a big AI sticker on the front?
This situation may be more common than we think. Can we unpack the design process of a technical artefact and understand the people and beliefs that drive it?
The engineer's toolbox is a wide array of tools and tricks to get the job done. But we may be getting more than we bargained for: it also carries a way of modelling reality, one that tries to stabilize the messy world we live in and make sense of it in ways that can fail bizarrely beyond the narrow test scenarios we envision.
- What happens when you apply such a model to the task of real-time audio analysis, trying to make sense of the rich and subjective craft of playing a musical instrument?
- In what ways can you fail when your tidy model of reality crumbles, as you realize your practices and beliefs are worth very little for this task? That was me two years ago, working on instrumental interaction with artificial intelligence.
- And would you believe me if I told you that, in the end, it was through engineering that I finally brought it all together?
There was, indeed, light at the end of the tunnel.
Join me as I share my three-year journey designing a low-latency system for musical instrument transformation. The system, now available, emerged from a challenging process of unlearning traditional engineering approaches, embracing the expressive potential of ambiguity in neural networks, and reflecting on the kind of agency we, as designers and engineers, exercise in shaping the final product.

Franco Caspe
PhD Candidate
Queen Mary University of London
Engineer, maker, hobbyist guitarist and singer, and PhD candidate at the Centre for Digital Music (Queen Mary University of London) and the Augmented Instruments Lab (Imperial College London). Walking a thin line between AI for audio, real-time systems, and human–computer interaction.
Exploring the space between acoustic instruments and synthesizers, using AI as an analysis tool to capture performance from instrument audio, and as a generation tool for synthetic sound rendering.