Chapter 8: Musically Informed Audio Decomposition

Chapter 8 of [Müller, FMP, Springer 2015] covers audio decomposition, a challenging research direction that is closely related to source separation. Within this broad area, we consider three subproblems: harmonic–percussive separation, main melody extraction, and score-informed audio decomposition. For these scenarios, we discuss a number of key techniques, including instantaneous frequency estimation, fundamental frequency (F0) estimation, spectrogram inversion, and nonnegative matrix factorization (NMF). Along the way, we revisit a number of acoustic and musical properties of audio recordings that were introduced and discussed in previous chapters.

8.1 Harmonic–Percussive Separation
8.2 Melody Extraction
8.3 NMF-Based Audio Decomposition
8.4 Further Notes


Harmonic–Percussive Separation (HPS) [Section 8.1.1]
Harmonic sound; percussive sound; median filter; binary mask; soft mask; signal reconstruction; HPS experiments; Violin–Castanets example; audio examples (diverse)
Harmonic–Residual–Percussive Separation (HRPS) [Section 8.1.1, Exercise 8.5]
Separation factor; residual component; binary mask; cascaded HRPS; Violin–Applause–Castanets example; Bornemark example (Stop Messing With Me)
Signal Reconstruction [Section 8.1.2]
Inverse DFT; inverse STFT; modified STFT; overlap–add procedure; Griffin–Lim optimization problem
Applications of HPS and HRPS [Section 8.1.3]
Feature enhancement; chroma feature; onset detection; time-scale modification; Violin–Castanets example
Instantaneous Frequency Estimation [Section 8.2.1]
Phase wrapping; principal argument; exponential function; phase prediction; instantaneous frequency (IF); polar coordinates; bin offset; visualization of IF values; dependency on hop size; C4 piano example
Salience Representation [Section 8.2.2]
Log-frequency spectrogram; instantaneous frequency; binning; harmonic summation; salience; Weber example (Freischütz)
Fundamental Frequency Tracking [Section 8.2.3]
Frequency trajectory; sonification; salience representation; continuity constraint; dynamic programming; score-informed constraint; constraint region; Weber example (Freischütz); Bornemark example (Stop Messing With Me)
Melody Extraction and Separation [Section 8.2, Section]
Melody; salience representation; predominant frequency; F0-trajectory; separation; binary mask; harmonics; frequency-dependent tolerance; signal reconstruction; Weber example (Freischütz); Bornemark example (Stop Messing With Me)
Nonnegative Matrix Factorization (NMF) [Section 8.3.1]
Matrix factorization; nonnegative matrix; rank; template vector; activation vector; gradient descent; multiplicative update rule; magnitude spectrogram; Chopin example (Op. 28, No. 4); C-major scale example
NMF-Based Spectrogram Factorization [Section 8.3.2]
Spectrogram factorization; score-informed NMF; initialization; template constraints; pitch information; activation constraints; score information; onset model; Chopin example (Op. 28, No. 4)
NMF-Based Audio Decomposition [Section 8.3.3]
Score-informed NMF; activation matrix; spectral masking; audio decomposition; audio editing; Chopin example (Op. 28, No. 4)
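
To make the keywords of the Instantaneous Frequency Estimation entry (Section 8.2.1) listed above more concrete, the following minimal sketch estimates the instantaneous frequency of a single STFT bin from the phases of two adjacent frames. It combines phase prediction, the principal argument, and the resulting bin offset. The function names, the Hann window, and the test signal (a 443 Hz sinusoid) are assumptions chosen for this illustration and are not taken from the book or its notebooks. Only the Python standard library is used, so the DFT coefficient is computed with an explicit loop rather than an FFT.

```python
import cmath
import math


def principal_argument(v):
    """Wrap a phase value given in cycles (phase / 2*pi) to [-0.5, 0.5)."""
    return (v + 0.5) % 1.0 - 0.5


def stft_bin(x, start, N, k):
    """DFT coefficient of bin k for the Hann-windowed frame x[start:start+N]."""
    coeff = 0j
    for n in range(N):
        w = 0.5 - 0.5 * math.cos(2 * math.pi * n / N)  # Hann window (assumed)
        coeff += x[start + n] * w * cmath.exp(-2j * math.pi * k * n / N)
    return coeff


def instantaneous_frequency(x, Fs, N, H, k, m):
    """Estimate the IF (in Hz) for bin k from frames m and m + 1."""
    # Phases of the two frames, normalized to cycles.
    phi_1 = cmath.phase(stft_bin(x, m * H, N, k)) / (2 * math.pi)
    phi_2 = cmath.phase(stft_bin(x, (m + 1) * H, N, k)) / (2 * math.pi)
    # Phase advance predicted for a sinusoid exactly at the bin center.
    predicted_advance = k * H / N
    # Bin offset: wrapped deviation from the prediction, rescaled to bins.
    kappa = (N / H) * principal_argument(phi_2 - phi_1 - predicted_advance)
    return (k + kappa) * Fs / N


# Illustrative parameters (assumptions): sampling rate, window, and hop size.
Fs, N, H = 8000, 256, 64
x = [math.sin(2 * math.pi * 443.0 * n / Fs) for n in range(N + H)]

k = 14  # bin center at 14 * Fs / N = 437.5 Hz, the bin closest to 443 Hz
f_est = instantaneous_frequency(x, Fs, N, H, k, m=0)
print(f"bin center: {k * Fs / N:.2f} Hz, IF estimate: {f_est:.2f} Hz")
```

The estimate refines the coarse bin center frequency toward the true frequency of the sinusoid. The dependency on the hop size is visible in the factor N / H: because the principal argument is limited to half a cycle, the bin offset can only be resolved unambiguously within N / (2H) bins, so a larger hop size narrows the usable range.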
