In Chapter 3 of [Müller, FMP, Springer 2015], we study the problem of music synchronization. The objective is to temporally align compatible representations of the same piece of music. Considering this scenario, we explain the need for musically informed audio features. In particular, we introduce the concept of chroma-based music features, which capture properties that are related to harmony and melody. Furthermore, we study an alignment technique known as dynamic time warping (DTW), a concept that is applicable to the analysis of general time series. For its efficient computation, we discuss an algorithm based on dynamic programming, a widely used method for solving a complex problem by breaking it down into a collection of simpler subproblems.
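To make the dynamic programming idea concrete, the following is a minimal sketch of a basic DTW variant in plain Python. It is not the implementation used in the notebooks: the function name `dtw`, the default absolute-difference cost, and the step set (horizontal, vertical, diagonal) are assumptions chosen for illustration. Each cell of the accumulated cost matrix is computed from three smaller subproblems, and the warping path is recovered by backtracking.

```python
def dtw(x, y, cost=lambda a, b: abs(a - b)):
    """Return (total_cost, warping_path) for two sequences x and y.

    Illustrative sketch only: steps (1,0), (0,1), (1,1), no weights,
    no global constraints.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    # D[i][j] = minimal accumulated cost of aligning x[:i+1] with y[:j+1]
    D = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            c = cost(x[i], y[j])
            if i == 0 and j == 0:
                D[i][j] = c
            else:
                best = min(
                    D[i - 1][j] if i > 0 else inf,
                    D[i][j - 1] if j > 0 else inf,
                    D[i - 1][j - 1] if i > 0 and j > 0 else inf,
                )
                D[i][j] = c + best
    # Backtrack from the last cell to the first to recover an optimal path
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((D[i - 1][j - 1], (i - 1, j - 1)))
        if i > 0:
            candidates.append((D[i - 1][j], (i - 1, j)))
        if j > 0:
            candidates.append((D[i][j - 1], (i, j - 1)))
        _, (i, j) = min(candidates)
        path.append((i, j))
    path.reverse()
    return D[n - 1][m - 1], path


total, path = dtw([1, 2, 3], [1, 2, 2, 3])
print(total, path)
```

In this toy example the repeated value in the second sequence is absorbed by a horizontal step, so the two sequences align at zero total cost. For music synchronization, the scalar elements would be replaced by chroma vectors and the cost by a suitable distance between them.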

3.1 Audio Features

3.2 Dynamic Time Warping

3.3 Applications

3.4 Further Notes

Topic | Relation to [Müller, FMP, Springer 2015] & Description |
HTML | IPYNB |

Log-Frequency Spectrogram and Chromagram | [Section 3.1.1, Section 3.1.2] STFT; pitch; logarithmic frequency pooling; chroma; chromatic scale example; Burgmüller example (Op. 100 No. 2) |
[html] | [ipynb] |

Logarithmic Compression | [Section 3.1.2.1] Compression function; compressed spectrogram; chromagram; C-major scale example |
[html] | [ipynb] |

Feature Normalization | [Section 3.1.2.1, Section 2.2.3.3] Norm; Euclidean ($\ell^2$) norm; Manhattan ($\ell^1$) norm; maximum ($\ell^\infty$) norm; mean ($\mu$); variance ($\sigma^2$); standard ($z$) score; C-major scale example |
[html] | [ipynb] |

Temporal Smoothing and Downsampling | [Section 3.1.2.3, Section 7.2.1] Smoothing; average filtering; downsampling; median; median filtering; Beethoven example (Fifth Symphony) with six different audio recordings |
[html] | [ipynb] |

Tuning and Transposition | [Section 3.1.2.2, Section 3.1.2.1, Exercise 3.5, Exercise 3.6] Transposition; cyclic shift; chromagram; detuning effects; tuning estimation; C-major scale example; other audio examples |
[html] | [ipynb] |

Dynamic Time Warping (DTW) | [Section 3.2.1] Warping path; cost matrix; optimality; dynamic programming |
[html] | [ipynb] |

DTW Variants | [Section 3.2.1] Step size condition; local weights; global constraints; multiscale DTW |
[html] | [ipynb] |

Music Synchronization | [Chapter 3, Section 3.2.1] Audio recording; synchronization; chroma representation; DTW; visualization; Beethoven example (Fifth Symphony) |
[html] | [ipynb] |

Application: Music Navigation | [Section 3.3.1] Interpretation switcher; score following; videos; Beethoven example (Fifth Symphony) |
[html] | [ipynb] |

Application: Tempo Curves | [Section 3.3.2] Performance analysis; tempo curve; score chromagram; audio chromagram; DTW; warping path; strict alignment path; beat–duration representation; beat–tempo representation; parameter dependence; Schumann example (Träumerei) |
[html] | [ipynb] |
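
Several of the feature-related topics listed above (logarithmic compression, feature normalization, transposition by cyclic shift) operate on single 12-dimensional chroma vectors and can be sketched in a few lines. The following is a hedged illustration in plain Python, not the code from the notebooks: the function names, the parameter `gamma`, the `threshold` value, and the fallback to a uniform unit vector for near-silent frames are assumptions made for this example.

```python
import math


def log_compress(v, gamma=1.0):
    """Apply log(1 + gamma * x) to each non-negative entry of v."""
    return [math.log(1.0 + gamma * x) for x in v]


def normalize_l2(v, threshold=1e-4):
    """Scale v to unit Euclidean norm.

    Assumption: vectors with norm below `threshold` are replaced by a
    uniform unit-norm vector to avoid division by (nearly) zero.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm < threshold:
        return [1.0 / math.sqrt(len(v))] * len(v)
    return [x / norm for x in v]


def transpose(chroma, semitones):
    """Cyclically shift a 12-dimensional chroma vector upward by `semitones`."""
    k = semitones % 12
    return chroma[-k:] + chroma[:-k] if k else list(chroma)


# Chroma vector with all energy in pitch class C (index 0)
c = [1.0] + [0.0] * 11
d = transpose(c, 2)  # energy moves to pitch class D (index 2)
print(d.index(1.0))
print(normalize_l2(log_compress([3.0, 4.0], gamma=10.0)))
```

Applying these steps frame by frame to a chromagram yields features that are less sensitive to dynamics, which is one reason such post-processing is commonly used before DTW-based alignment.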