I am a PhD candidate at Northwestern University in the Interactive Audio Lab, under Prof. Bryan Pardo. My research is in machine learning, music information retrieval, audio source separation, music structure and similarity, and acoustics. I also write music. Contact me at prem [at] u.northwestern.edu.
We approach cover song identification using a novel time-series representation of audio based on the 2D Fourier Transform (2DFT): the audio is represented as a sequence of magnitude 2DFTs. This representation is robust to key changes, timbral changes, and small local tempo deviations. We compute the cross-similarity between these time series and extract a distance measure that is invariant to changes in music structure. Our approach is state-of-the-art on a recent cover song dataset, and it expands on previous work using the 2DFT for music representation and on work on live song recognition.
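As a rough illustration of why a magnitude 2DFT can be robust to key changes, here is a minimal sketch, not the system described above. It assumes the input is a 12-bin pitch-class feature matrix, so that a key change acts as a circular shift along the pitch axis. The function names, patch length, hop size, and the Euclidean cross-distance are all hypothetical choices made for the example.

```python
import numpy as np

def magnitude_2dft_sequence(features, patch_len, hop):
    """Slice a (bins x frames) matrix into patches and take |2DFT| of each patch."""
    n_frames = features.shape[1]
    patches = []
    for start in range(0, n_frames - patch_len + 1, hop):
        patch = features[:, start:start + patch_len]
        # Discarding the phase removes the effect of circular shifts of the patch.
        patches.append(np.abs(np.fft.fft2(patch)))
    return np.stack(patches)

def cross_distance(seq_x, seq_y):
    """Pairwise Euclidean distance between the patches of two sequences."""
    x = seq_x.reshape(len(seq_x), -1)
    y = seq_y.reshape(len(seq_y), -1)
    return np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)

rng = np.random.default_rng(0)
chroma = rng.random((12, 64))             # random stand-in for a real pitch-class feature
transposed = np.roll(chroma, 3, axis=0)   # simulated key change: shift by 3 pitch classes

seq_a = magnitude_2dft_sequence(chroma, patch_len=16, hop=8)
seq_b = magnitude_2dft_sequence(transposed, patch_len=16, hop=8)

shift_invariant = np.allclose(seq_a, seq_b)
distances = cross_distance(seq_a, seq_b)  # near-zero diagonal: matching patches align
```

The sketch stops at the cross-distance matrix. How a structure-invariant distance is extracted from it is not specified here, so that step is left out.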
Audealize is a new way of looking at audio production tools. Instead of the traditional complex interfaces consisting of knobs with hard-to-understand labels, Audealize provides a semantic interface. Simply describe the type of sound you're looking for in the search boxes, or click and drag around the maps to find new effects.
Audio source separation is the isolation of sound-producing sources in an audio scene (e.g., isolating a horn section in a big band).
Nonnegative Matrix Factorization (NMF) is a popular source separation method. It learns a dictionary of spectral templates from the audio. Separation via NMF needs external guidance to group spectral templates by source.
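To make the idea concrete, here is a minimal sketch of NMF using the standard multiplicative updates for the Euclidean cost. It is one common variant, not necessarily the one used in this work. The input is a synthetic nonnegative matrix standing in for a magnitude spectrogram, and the `group` list is a hypothetical, hand-picked assignment of templates to one source, playing the role of the external guidance mentioned above.

```python
import numpy as np

def nmf(V, n_templates, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V (bins x frames) as W @ H.

    W holds spectral templates (columns); H holds their activations over time.
    """
    rng = np.random.default_rng(seed)
    n_bins, n_frames = V.shape
    W = rng.random((n_bins, n_templates)) + eps
    H = rng.random((n_templates, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for a magnitude spectrogram built from 3 templates.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 50))

W, H = nmf(V, n_templates=3)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# Hypothetical grouping: assume template 0 belongs to the source of interest.
group = [0]
soft_mask = (W[:, group] @ H[group, :]) / (W @ H + 1e-9)
source_estimate = V * soft_mask
```

The factorization alone does not say which templates belong to which source. The grouping above is supplied by hand, which is exactly the step that needs outside information.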
SocialReverb is a task designed to collect words that people use to describe reverberation. By collecting this vocabulary, we can map the words people use to describe audio to the actual tools that manipulate it. Using that knowledge, we can develop tools that let non-experts manipulate audio just by describing the sound they want.
ClapIR is an application that allows users to measure the acoustic properties of rooms quickly and easily. From a simple recording of a clap, or other loud noise, the app calculates reverberation time, decay curve, frequency decay, frequency response, and impulse spectra.
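For readers curious how a reverberation time can be derived from a single impulsive recording, here is a minimal sketch using Schroeder backward integration, a standard textbook method. It is not a description of how ClapIR is implemented. The function names, the -5 dB to -25 dB fit range, and the synthetic impulse response (exponentially decaying noise) are assumptions made for the example.

```python
import numpy as np

def schroeder_decay_db(ir):
    """Energy decay curve in dB via backward integration of the squared response."""
    energy = np.asarray(ir, dtype=float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]
    # Floor the ratio to avoid log10(0) if the response ends in silence.
    return 10.0 * np.log10(np.maximum(edc / edc[0], 1e-12))

def estimate_rt60(ir, sample_rate, fit_start_db=-5.0, fit_end_db=-25.0):
    """Fit a line to part of the decay curve and extrapolate to a 60 dB drop."""
    decay_db = schroeder_decay_db(ir)
    t = np.arange(len(decay_db)) / sample_rate
    mask = (decay_db <= fit_start_db) & (decay_db >= fit_end_db)
    slope, _ = np.polyfit(t[mask], decay_db[mask], 1)  # slope in dB per second
    return -60.0 / slope

# Synthetic impulse response: noise with an exponential envelope chosen so that
# the level drops by 60 dB after true_rt60 seconds.
sample_rate = 16000
true_rt60 = 0.5
t = np.arange(int(1.5 * sample_rate)) / sample_rate
envelope = np.exp(-3.0 * np.log(10.0) * t / true_rt60)
rng = np.random.default_rng(0)
ir = rng.standard_normal(len(t)) * envelope

rt60_estimate = estimate_rt60(ir, sample_rate)
```

A real clap recording would also need onset detection and handling of background noise, which this sketch ignores.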