{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "

Discrete Fourier Transform (DFT)

\n", "
\n", "\n", "
\n", "\n", "

\n", "In this notebook, we introduce the discrete Fourier transform (DFT) and its basic properties. We then study the fast Fourier transform (FFT), which is an efficient algorithm to evaluate the DFT. In most parts, we closely follow Section 2.4 of [Müller, FMP, Springer 2015].\n", "

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Inner Product\n", "\n", "An important concept for understanding the Fourier transform is the **inner product** for a complex vector space $\\mathbb{C}^N$ for $N \\in \\mathbb{N}$. Given two complex vectors $x, y \\in \\mathbb{C}^N$, the inner product between $x$ and $y$ is defined as follows:\n", "\n", "$$ \\langle x | y \\rangle := \\sum_{n=0}^{N-1} x(n) \\overline{y(n)}.$$\n", "\n", "The absolute value of the inner product may be interpreted as a measure of similarity between $x$ and $y$: if $x$ and $y$ point to the same direction (i.e., $x$ and $y$ are similar), the inner product $|\\langle x | y \\rangle|$ is large. If $x$ and $y$ are orthogonal (i.e., $x$ and $y$ are dissimilar), the inner product $|\\langle x | y \\rangle|$ is zero.\n", "\n", "Note that when using the function `np.vdot` to compute the inner product, the complex conjugate is performed on the first argument. Therefore, for computing $\\langle x | y \\rangle$ as defined above, one has to call `np.vdot(y, x)`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from numba import jit\n", "\n", "x = np.array([1, 1j, 1 + 1j])\n", "y = np.array([1.1 , 1j, 0.9 + 1.1j])\n", "print('Vectors of high similarity:', np.abs(np.vdot(y, x)))\n", "\n", "x = np.array([1, 1j, 1 + 1j])\n", "y = np.array([1.1 , -1j, 0.1])\n", "print('Vectors of low similarity:', np.abs(np.vdot(y, x)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Definition of DFT\n", "\n", "Let $x\\in \\mathbb{C}^N$ be a vector of length $N\\in\\mathbb{N}$. In the music signal context, $x$ can be interpreted as a discrete signal with samples $x(0), x(1), ..., x(N-1)$. Note that we start indexing with the index $0$. The **discrete Fourier transform** (DFT) of $x$ is defined by:\n", "\n", "$$ X(k) := \\sum_{n=0}^{N-1} x(n) \\exp(-2 \\pi i k n / N) $$\n", "\n", "for $k \\in [0:N-1]$. The vector $X\\in\\mathbb{C}^N$ can be interpreted as frequency representation of the time-domain signal $x$. To obtain a geometric interpretation of the DFT, we define the a vector $\\mathbf{u}_k\\in\\mathbb{C}^N$ by\n", "\n", "$$\\mathbf{u}_k(n) := \\exp(2 \\pi i k n / N) = \\cos(2 \\pi i k n / N) + i \\sin(2 \\pi i k n / N)$$\n", "\n", "for each $k \\in [0:N-1]$. This vector can be regarded as a sampled version of the exponential function of frequency $k/N$. Then, the DFT can be expressed as inner products\n", "\n", "$$ X(k) := \\sum_{n=0}^{N-1} x(n) \\overline{\\mathbf{u}_k} = \\langle x | \\mathbf{u}_k \\rangle$$\n", "\n", "of the signal $x$ and the sampled exponential functions $\\mathbf{u}_k$. The absolute value $|X(k)|$ indicates the degree of similarity between the signal $x$ and $\\mathbf{u}_k$.\n", "\n", "In the case that $x\\in \\mathbb{R}^N$ is a real-valued vector (which is always the case for our music signal scenario), we obtain:\n", "\n", "$$ X(k) := \\langle x |\\mathrm{Re}(\\mathbf{u}_k) \\rangle - i\\langle x | \\mathrm{Im}(\\mathbf{u}_k) \\rangle $$\n", "\n", "The following plot shows an example signal $x$ compared with functions $\\overline{\\mathbf{u}_k}$ for two different frequency parameters $k$. The real and imaginary part of $\\overline{\\mathbf{u}_k}$ are shown in red and blue, respectively." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from matplotlib import pyplot as plt\n", "%matplotlib inline\n", "\n", "N = 64\n", "n = np.arange(N)\n", "k = 3\n", "x = np.cos(2 * np.pi * (k * n / N) + (1.2*np.random.rand(N) - 0.5))\n", "\n", "plt.figure(figsize=(10, 5))\n", "\n", "plt.subplot(2, 1, 1)\n", "plt.plot(n, x, 'k', marker='.', markersize='10', linewidth=2.0, label='$x$')\n", "plt.xlabel('Time (samples)')\n", "k = 3\n", "u_k_real = np.cos(2 * np.pi * k * n / N)\n", "u_k_imag = -np.sin(2 * np.pi * k * n / N)\n", "u_k = u_k_real + u_k_imag*1j\n", "sim = np.abs(np.vdot(u_k, x))\n", "plt.title(r'Signal $x$ and some $u_k$ (k=3) having a high degree of similarity (similarity = %0.3f)'%sim)\n", "plt.plot(n, u_k_real, 'r', marker='.', markersize='5', \n", " linewidth=1.0, linestyle=':', label='$\\mathrm{Re}(\\overline{\\mathbf{u}}_k)$');\n", "plt.plot(n, u_k_imag, 'b', marker='.', markersize='5', \n", " linewidth=1.0, linestyle=':', label='$\\mathrm{Im}(\\overline{\\mathbf{u}}_k)$');\n", "plt.legend()\n", "\n", "plt.subplot(2, 1, 2)\n", "plt.plot(n, x, 'k', marker='.', markersize='10', linewidth=2.0, label='$x$')\n", "plt.xlabel('Time (samples)')\n", "k = 5\n", "u_k_real = np.cos(2 * np.pi * k * n / N)\n", "u_k_imag = -np.sin(2 * np.pi * k * n / N)\n", "u_k = u_k_real + u_k_imag*1j\n", "sim = np.abs(np.vdot(u_k, x))\n", "plt.title(r'Signal $x$ and some $u_k$ (k=5) having a low degree of similarity (similarity = %0.3f)'%sim)\n", "plt.plot(n, u_k_real, 'r', marker='.', markersize='5', \n", " linewidth=1.0, linestyle=':', label='$\\mathrm{Re}(\\overline{\\mathbf{u}}_k)$');\n", "plt.plot(n, u_k_imag, 'b', marker='.', markersize='5', \n", " linewidth=1.0, linestyle=':', label='$\\mathrm{Im}(\\overline{\\mathbf{u}}_k)$');\n", "plt.legend()\n", "\n", "plt.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## DFT Matrix\n", "\n", "Being a linear operator $\\mathbb{C}^N \\to \\mathbb{C}^N$, the DFT can be expressed by some $N\\times N$-matrix. This leads to the famous DFT matrix $\\mathrm{DFT}_N \\in \\mathbb{C}^{N\\times N}$ matrix, which is given by\n", "\n", "$$\\mathrm{DFT}_N(n, k) = \\mathrm{exp}(-2 \\pi i k n / N)$$\n", "\n", "for $n\\in[0:N-1]$ and $k\\in[0:N-1]$. Let $\\rho_N:=\\exp(2 \\pi i / N)$ be the primitive $N^\\mathrm{th}$ root of unity as defined in the notebook Exponential Function. Then \n", "\n", "$$\\sigma_N:= \\overline{\\rho_N} = \\mathrm{exp}(-2 \\pi i / N)$$\n", "\n", "also defines a primitive $N^\\mathrm{th}$ root of unity. From the properties of exponential functions, one obtains that\n", "\n", "$$ \\sigma_N^{kn} = \\mathrm{exp}(-2 \\pi i / N)^{kn} = \\mathrm{exp}(-2 \\pi i k n / N)$$\n", "\n", "From this, one obtains:\n", "\n", "$$\n", "\\mathrm{DFT}_N =\n", "\\begin{pmatrix}\n", " 1 & 1 & 1 & \\dots & 1 \\\\\n", " 1 & \\sigma_N & \\sigma_N^2 & \\dots & \\sigma_N^{N-1} \\\\\n", " 1 & \\sigma_N^2 & \\sigma_N^4 & \\dots & \\sigma_N^{2(N-1)} \\\\\n", " \\vdots & \\vdots & \\vdots & \\ddots & \\vdots \\\\\n", " 1 & \\sigma_N^{N-1} & \\sigma_N^{2(N-1)} & \\dots & \\sigma_N^{(N-1)(N-1)} \\\\\n", "\\end{pmatrix}\n", "$$\n", "\n", "In the following visualization, the real and imaginary part of $\\mathrm{DFT}_N$ are shown, where the values are encoded by suitable colors. Note that the $k^\\mathrm{th}$ row of $\\mathrm{DFT}_N$ corresponds to the vector $\\mathbf{u}_k$ as defined above." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@jit(nopython=True)\n", "def generate_matrix_dft(N, K):\n", " \"\"\"Generate a DFT (discete Fourier transfrom) matrix\n", " \n", " Notebook: C2/C2_DFT-FFT.ipynb\n", "\n", " Args:\n", " N: Number of samples\n", " K: Number of frequency bins\n", "\n", " Returns:\n", " dft: The DFT matrix\n", " \"\"\"\n", " # your code here\n", " return dft\n", "\n", "N = 32\n", "dft_mat = generate_matrix_dft(N, N)\n", "\n", "plt.figure(figsize=(12, 4))\n", "\n", "plt.subplot(1, 2, 1)\n", "plt.title('$\\mathrm{Re}(\\mathrm{DFT}_N)$')\n", "plt.imshow(np.real(dft_mat), origin='lower', cmap='seismic', aspect='equal')\n", "plt.xlabel('Time index $n$')\n", "plt.ylabel('Frequency index $k$')\n", "plt.colorbar()\n", "\n", "plt.subplot(1, 2, 2)\n", "plt.title('$\\mathrm{Im}(\\mathrm{DFT}_N)$')\n", "plt.imshow(np.imag(dft_mat), origin='lower', cmap='seismic', aspect='equal')\n", "plt.xlabel('Time index $n$')\n", "plt.ylabel('Frequency index $k$')\n", "plt.colorbar()\n", "plt.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now write a function that computes the DFT $X = \\mathrm{DFT}_N \\cdot x$:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@jit(nopython=True)\n", "def dft(x):\n", " \"\"\"Compute the discete Fourier transfrom (DFT)\n", "\n", " Notebook: C2/C2_DFT-FFT.ipynb\n", "\n", " Args:\n", " x: Signal to be transformed\n", "\n", " Returns:\n", " X: Fourier transform of `x`\n", " \"\"\"\n", " # your code here\n", " return X\n", "\n", "\n", "N = 128\n", "n = np.arange(N)\n", "k = 10\n", "x = np.cos(2 * np.pi * (k * n / N) + 2 * (np.random.rand(N) - 0.5)) \n", "X = dft(x)\n", "\n", "plt.figure(figsize=(10, 3))\n", "\n", "plt.subplot(1, 2, 1)\n", "plt.title('$x$')\n", "plt.plot(x, 'k')\n", "plt.xlabel('Time (index $n$)')\n", "\n", "plt.subplot(1, 2, 2)\n", "plt.title('$|X|$')\n", "plt.plot(np.abs(X), 'k')\n", "plt.xlabel('Frequency (index $k$)')\n", "plt.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fast Fourier Transform (FFT)\n", "\n", "Next, we discuss the famous fast Fourier transform (FFT), which is a fast algorithm to compute the DFT. The FFT algorithm was originally found by Gauss in about 1805 and then rediscovered by Cooley and Tukey in 1965. The FFT algorithm is based on the observation that applying a DFT of even size $N=2M$ can be expressed in terms of applying two DFTs of half the size $M$. It exploits the fact that there are algebraic relations between the entries $\\sigma_N^{kn} = \\mathrm{exp}(-2 \\pi i / N)^{kn}$ of DFT matrices. In particular, one has \n", "\n", "$$\\sigma_M = \\sigma_N^2$$\n", "\n", "In the FFT algorithm, one computes the DFT of the even-indexed and the uneven-indexed entries of $x$:\n", "\n", "\\begin{align}\n", "(A(0), \\dots, A(N/2-1)) &= \\mathrm{DFT}_{N/2} \\cdot (x(0), x(2), x(4), \\dots, x(N-2))\\\\\n", "(B(0), \\dots, B(N/2-1)) &= \\mathrm{DFT}_{N/2} \\cdot (x(1), x(3), x(5), \\dots, x(N-1))\n", "\\end{align}\n", "\n", "With these two DFTs of size $N/2$, one can compute the full DFT of size $N$ via:\n", "\n", "\\begin{eqnarray}\n", "C(k) &=& \\sigma_N^k \\cdot B(k)\\\\\n", "X(k) &=& A(k) + C(k)\\\\\n", "X(N/2 + k) &=& A(k) - C(k)\\\\\n", "\\end{eqnarray}\n", "\n", "for $k \\in [0: N/2 - 1]$. The numbers $\\sigma_N^k$ are also called *twiddle factors*. If $N$ is a power of two, this idea can be applied recusively until one reaches the computation of $\\mathrm{DFT}_{1}$ (the case $N=1$), which is simply multiplication by one (i.e. just returning the signal of length $N=1$). For further details, we refer to Section 2.4.3 of [Müller, FMP, Springer 2015]) (see also Table 2.1).\n", "\n", "In the following code, there is a function `twiddle` that computes twiddle factors and a function `fft` that implements the FFT algorithm." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@jit(nopython=True)\n", "def twiddle(N):\n", " \"\"\"Generate the twiddle factors used in the computation of the Fourier transform (FFT)\n", "\n", " Notebook: C2/C2_DFT-FFT.ipynb\n", "\n", " Args:\n", " N: Number of samples\n", "\n", " Returns:\n", " sigma: The twiddle factors\n", " \"\"\"\n", " # your code here\n", " return sigma\n", "\n", "@jit(nopython=True)\n", "def fft(x):\n", " \"\"\"Compute the fast Fourier transform (FFT)\n", "\n", " Notebook: C2/C2_DFT-FFT.ipynb\n", "\n", " Args:\n", " x: Signal to be transformed\n", "\n", " Returns:\n", " X: Fourier transform of `x`\n", " \"\"\"\n", " # your code here\n", "\n", "N = 16\n", "n = np.arange(N)\n", "k = 4\n", "x = np.cos(2 * np.pi * (k * n / N) + 2 * (np.random.rand(N) - 0.5)) \n", "X_via_dft = dft(x)\n", "X_via_fft = fft(x)\n", "\n", "\n", "plt.figure(figsize=(10, 3))\n", "\n", "plt.subplot(1, 2, 1)\n", "plt.title('$x$')\n", "plt.plot(x, 'k', marker='.', markersize=12)\n", "plt.xlabel('Time (index $n$)')\n", "\n", "plt.subplot(1, 2, 2)\n", "plt.title('$|X|$')\n", "plt.plot(np.abs(X_via_dft), 'k', marker='.', markersize=18, label='dft')\n", "plt.plot(np.abs(X_via_fft), linestyle='--', color='pink', marker='.', markersize=6, label='fft')\n", "plt.xlabel('Frequency (index $k$)')\n", "plt.legend()\n", "\n", "plt.tight_layout()\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Computational Complexity\n", "\n", "The FFT reduces the overall number of operations from the order of $N^2$ (needed when computing the usual matrix–vector product $\\mathrm{DFT}_N \\cdot x$) to the order of $N\\log_2N$. The savings are enormous. For example, using $N=2^{10}=1024$, the FFT requires roughly $N\\log_2N=10240$ instead of $N^2=1048576$ operations in the naive approach\n", "\n", "Using the module `timeit`, which provides a simple way to time small bits of Python code, the following code compares the running time when using the naive approach and the FFT:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "N = 512\n", "n = np.arange(N)\n", "x = np.sin(2 * np.pi * 5 * n / N )\n", "\n", "print('Timing for DFT: ', end='')\n", "%timeit dft(x)\n", "print('Timing for FFT: ', end='')\n", "%timeit fft(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us now compare the running times by successively increasing the size $N$ (this may take a couple of seconds):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import timeit\n", "\n", "Ns = [2 ** n for n in range(5, 11)]\n", "times_dft = []\n", "times_fft = []\n", "execuctions = 3\n", "\n", "for N in Ns:\n", " n = np.arange(N)\n", " x = np.sin(2 * np.pi * 5 * n / N )\n", " \n", " time_dft = timeit.timeit(lambda: dft(x), number=execuctions) / execuctions\n", " time_fft = timeit.timeit(lambda: fft(x), number=execuctions) / execuctions\n", " times_dft.append(time_dft)\n", " times_fft.append(time_fft)\n", " \n", "plt.figure(figsize=(10, 4))\n", " \n", "plt.plot(Ns, times_dft, '-xk', label='DFT')\n", "plt.plot(Ns, times_fft, '-xr', label='FFT')\n", "plt.xticks(Ns)\n", "plt.legend()\n", "plt.grid()\n", "plt.xlabel('$N$')\n", "plt.ylabel('Runtime (seconds)');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "Acknowledgment: This notebook was created by Frank Zalkow and Meinard Müller.\n", "
" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 1 }