{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "Following Section 7.1 of [Müller, FMP, Springer 2015], we discuss in this notebook the task of audio identification. In particular, we present the main ideas of a popular fingerprinting system based on spectral peaks, which was originally introduced by Wang and is used in commercial systems such as Shazam.\n", " \n", "
Short name | Type of distortion | Audio |\n",
"---|---|---|\n",
"Original | Original song | |\n",
"Talking | Superposition with other sources (e.g., people talking in the background) | |\n",
"Noise | Superposition with Gaussian noise | |\n",
"Coding | Strong coding artifacts | |\n",
"Faster | Time-scale modification (faster) | |\n",
"Higher | Pitch shifting (higher) | |\n", "
\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |