NMF Toolbox

This is the accompanying website for the paper "NMF Toolbox: Music Processing Applications of Nonnegative Matrix Factorization" by Patricio López-Serrano, Christian Dittmar, Yiğitcan Özer and Meinard Müller.

Abstract

Nonnegative matrix factorization (NMF) is a family of methods widely used for information retrieval across domains including text, images, and audio. Within music processing, NMF has been used for tasks such as transcription, source separation, and structure analysis. Prior work has shown that initialization and constrained update rules can drastically improve the chances of NMF converging to a musically meaningful solution. Along these lines we present the NMF toolbox, containing MATLAB and Python implementations of conceptually distinct NMF variants. In particular, this paper gives an overview of two algorithms. The first variant, called nonnegative matrix factor deconvolution (NMFD), extends the original NMF algorithm to the convolutive case, enforcing the temporal order of spectral templates. The second variant, called diagonal NMF, supports the development of sparse diagonal structures in the activation matrix. Our toolbox contains several demo applications and code examples to illustrate its potential and functionality. By providing MATLAB and Python code on a documentation website under a GNU-GPL license, as well as including illustrative examples, our aim is to foster research and education in the field of music processing.

General Description

The toolbox is available for both MATLAB and Python. We additionally implemented unit tests to ensure that the results in both programming languages are consistent. A small dataset of example audio files is provided for demonstration purposes.

Here is the link to the toolbox: NMFtoolbox.zip

The folder structure is as follows:

Folder name: description
data: Dataset directory
matlab: MATLAB implementation
python: Python implementation
unit_tests: Unit tests that ensure the results in both programming languages are consistent

MATLAB Code

The MATLAB implementation requires MATLAB version 2016a. Please pay attention to the folder structure when using it.


Demo Files

demoAudioMosaicingContinuityNMF
demoDrumSoundSeparationNMF
demoEDMDecompositionFourComp
demoDrumExtractionKAM_NMF_percThreshold
demoDrumExtractionKAM_NMF_scoreInformed


Code Description

Filename: description and main parameters
NMFD.m: Nonnegative Matrix Factor Deconvolution with KLD and fixable components [2]. Main parameters: V, numComp, numIter, numTemplateFrames, initW, initH, paramConstr, fixH.
NMF.m: Nonnegative matrix factorization with KLD as default cost function [3], [4]. Main parameters: V, costFunc, numIter, numComp.
NMFdiag.m: Nonnegative matrix factorization with enhanced diagonal continuity constraints [5]. Main parameters: V, W0, H0, distmeas, numOfIter, fixW, continuity.length, continuity.grid, continuity.sparsen, continuity.polyphony.
NMFconv.m: Convolutive NMF with beta-divergence [6]. Main parameters: V, numComp, numIter, numTemplateFrames, initW, initH, beta, sparsityWeight, uncorrWeight.
convModel.m: Convolutive NMF model implementing Eq. (4) from [7]. Note that it can also be used to compute the standard NMF model in case the number of time frames of the templates equals one. Main parameters: W, H.
shiftOperator.m: Shift operator as described in Eq. (5) from [7]. It shifts the columns of a matrix to the left or the right and fills undefined elements with zeros. Main parameters: A, shiftAmount.
initActivations.m: Initialization strategies for NMF activations, including random and uniform. The pitched strategy places gate-like activations at the frames where certain notes are active in the ground truth [8]. The strategy drums uses decaying impulses at these positions [7]. Main parameters: numComp, numFrames, deltaT, pitches, onsets, durations, drums, decay, onsetOffsetTol, tolerance, strategy.
initTemplates.m: NMF template initialization strategies, including random and uniform. The strategy pitched uses comb-filter templates [8]. The drums strategy uses pre-extracted averaged spectra of typical drum types. Main parameters: numComp, numBins, numTemplateFrames, pitches, drumTypes, strategy.
NEMA.m: Row-wise nonlinear exponential moving average. Used to introduce exponentially decaying slopes according to Eq. (3) from [9]. Main parameters: lambda.
midi2freq.m, freq2midi.m, logFreqLogMag.m: Helper functions to convert between MIDI pitches and frequencies in Hz, as well as log-frequency and log-magnitude representations for visualization. Main parameters: midi, freq, A, deltaF, binsPerOctave, upperFreq, lowerFreq.
LSEE_MSTFTM_GriffinLim.m, forwardSTFT.m, inverseSTFT.m: Reconstruct the time-domain signal by means of the frame-wise inverse FFT and overlap-add method described as least squares error estimation from the modified STFT magnitude (LSEE-MSTFT) in [10]. Main parameters: blockSize, hopSize, anaWinFunc, synWinFunc, reconstMirror, appendFrame, analyticSig, numSamples.
alphaWienerFilter.m: Alpha-related soft masks for extracting sources from mixture. Details in [11] and experiments in [12]. Main parameters: alpha, binarize.
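
To make the idea behind NMF with a KLD cost function more concrete, here is a minimal sketch of the well-known multiplicative update rules for the Kullback-Leibler divergence, written in Python with NumPy. It is an independent illustration and not the code of NMF.m. The function names `nmf_kld` and `kl_divergence`, the random initialization, and the small constant `eps` are assumptions made for this example; only the names V, numComp, and numIter are borrowed (in snake case) from the parameter list above.

```python
import numpy as np


def kl_divergence(V, V_approx, eps=1e-12):
    """Generalized Kullback-Leibler divergence between V and its approximation."""
    return np.sum(V * np.log((V + eps) / (V_approx + eps)) - V + V_approx)


def nmf_kld(V, num_comp, num_iter=50, seed=0, eps=1e-12):
    """Factorize a nonnegative matrix V into W @ H with multiplicative KLD updates.

    Illustrative sketch only: random initialization, no normalization,
    no stopping criterion other than the iteration count.
    """
    rng = np.random.default_rng(seed)
    num_bins, num_frames = V.shape
    W = rng.random((num_bins, num_comp)) + eps   # templates
    H = rng.random((num_comp, num_frames)) + eps  # activations
    ones = np.ones_like(V)
    for _ in range(num_iter):
        # Update activations, then templates; both stay nonnegative because
        # every factor in the updates is nonnegative.
        H *= (W.T @ (V / (W @ H + eps))) / (W.T @ ones + eps)
        W *= ((V / (W @ H + eps)) @ H.T) / (ones @ H.T + eps)
    return W, H


# Toy example: a random nonnegative "spectrogram" with 6 bins and 8 frames.
V = np.random.default_rng(1).random((6, 8)) + 0.1
W, H = nmf_kld(V, num_comp=2, num_iter=50)
print(kl_divergence(V, W @ H))
```

With the same seed, calling `nmf_kld` with `num_iter=0` returns the random initialization, which makes it easy to compare the divergence before and after the updates.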

Python Code

The Python implementation requires Python 3.6.


Installation on Ubuntu 16.04

virtualenv env -p python3.6
source env/bin/activate
cd python
python setup.py develop
jupyter notebook 

Installation on Windows 10

virtualenv env
env\Scripts\activate
cd python
python setup.py develop
jupyter notebook 

Demo Files

demoAudioMosaicingContinuityNMF
demoDrumSoundSeparationNMF
demoEDMDecompositionFourComp
demoDrumExtractionKAM_NMF_percThreshold
demoDrumExtractionKAM_NMF_scoreInformed


NMFtoolbox

We implemented NMFtoolbox as a Python library that contains the above MATLAB scripts with exactly the same names and parameters. Please note that some data structures are implemented as dictionaries in Python. We also wrote an additional helper script, utils.py, to provide functionality that is less straightforward in Python than in MATLAB.
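
As an illustration of the struct-to-dictionary mapping, the following sketch collects the continuity settings of NMFdiag in a Python dictionary. Only the key names are taken from the parameter list of NMFdiag.py; the values are placeholders chosen for this example and should not be read as toolbox defaults.

```python
# Continuity settings as a plain dictionary. The keys mirror the MATLAB
# struct fields continuity.length, continuity.grid, continuity.sparsen,
# and continuity.polyphony. All values below are placeholders.
continuity = {
    'length': 7,
    'grid': 1,
    'sparsen': [1, 7],
    'polyphony': 10,
}

# MATLAB field access (continuity.length) becomes a key lookup in Python.
print(continuity['length'])
```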

Filename: description and main parameters
NMFD.py: Nonnegative Matrix Factor Deconvolution with KLD and fixable components [2]. Main parameters: V, numComp, numIter, numTemplateFrames, initW, initH, paramConstr, fixH.
NMF.py: Nonnegative matrix factorization with KLD as default cost function [3], [4]. Main parameters: V, costFunc, numIter, numComp.
NMFdiag.py: Nonnegative matrix factorization with enhanced diagonal continuity constraints [5]. Main parameters: V, W0, H0, distmeas, numOfIter, fixW, continuity['length'], continuity['grid'], continuity['sparsen'], continuity['polyphony'].
NMFconv.py: Convolutive NMF with beta-divergence [6]. Main parameters: V, numComp, numIter, numTemplateFrames, initW, initH, beta, sparsityWeight, uncorrWeight.
convModel.py: Convolutive NMF model implementing Eq. (4) from [7]. Note that it can also be used to compute the standard NMF model in case the number of time frames of the templates equals one. Main parameters: W, H.
shiftOperator.py: Shift operator as described in Eq. (5) from [7]. It shifts the columns of a matrix to the left or the right and fills undefined elements with zeros. Main parameters: A, shiftAmount.
initActivations.py: Initialization strategies for NMF activations, including random and uniform. The pitched strategy places gate-like activations at the frames where certain notes are active in the ground truth [8]. The strategy drums uses decaying impulses at these positions [7]. Main parameters: numComp, numFrames, deltaT, pitches, onsets, durations, drums, decay, onsetOffsetTol, tolerance, strategy.
initTemplates.py: NMF template initialization strategies, including random and uniform. The strategy pitched uses comb-filter templates [8]. The drums strategy uses pre-extracted averaged spectra of typical drum types. Main parameters: numComp, numBins, numTemplateFrames, pitches, drumTypes, strategy.
NEMA.py: Row-wise nonlinear exponential moving average. Used to introduce exponentially decaying slopes according to Eq. (3) from [9]. Main parameters: lambda.
midi2freq.py, freq2midi.py, logFreqLogMag.py: Helper functions to convert between MIDI pitches and frequencies in Hz, as well as log-frequency and log-magnitude representations for visualization. Main parameters: midi, freq, A, deltaF, binsPerOctave, upperFreq, lowerFreq.
LSEE_MSTFTM_GriffinLim.py, forwardSTFT.py, inverseSTFT.py: Reconstruct the time-domain signal by means of the frame-wise inverse FFT and overlap-add method described as least squares error estimation from the modified STFT magnitude (LSEE-MSTFT) in [10]. Main parameters: blockSize, hopSize, anaWinFunc, synWinFunc, reconstMirror, appendFrame, analyticSig, numSamples.
alphaWienerFilter.py: Alpha-related soft masks for extracting sources from mixture. Details in [11] and experiments in [12]. Main parameters: alpha, binarize.
utils.py: Additional helper functions for Python.
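
The interplay of the shift operator and the convolutive model can be sketched in a few lines of NumPy. This is an independent illustration, not the code of shiftOperator.py or convModel.py. It assumes the common NMFD formulation in which the approximation is the sum over template frames of W_tau times H shifted tau columns to the right. The function names `shift_columns` and `conv_model`, as well as storing the templates in a 3-D array of shape (bins, components, template frames), are assumptions made for this example.

```python
import numpy as np


def shift_columns(A, shift_amount):
    """Shift columns right (positive) or left (negative), filling with zeros."""
    A = np.asarray(A, dtype=float)
    num_cols = A.shape[1]
    shifted = np.zeros_like(A)
    if shift_amount == 0:
        shifted[:] = A
    elif abs(shift_amount) < num_cols:
        if shift_amount > 0:
            shifted[:, shift_amount:] = A[:, :-shift_amount]
        else:
            shifted[:, :shift_amount] = A[:, -shift_amount:]
    # If the shift is at least as large as the number of columns,
    # everything is shifted out and the result stays all zeros.
    return shifted


def conv_model(W, H):
    """Convolutive approximation: sum over tau of W[:, :, tau] @ shift(H, tau).

    W has shape (num_bins, num_comp, num_template_frames) in this sketch,
    H has shape (num_comp, num_frames).
    """
    num_bins, _, num_template_frames = W.shape
    approx = np.zeros((num_bins, H.shape[1]))
    for tau in range(num_template_frames):
        approx += W[:, :, tau] @ shift_columns(H, tau)
    return approx


# Toy example: 4 frequency bins, 2 components, 3 template frames, 5 time frames.
rng = np.random.default_rng(0)
W = rng.random((4, 2, 3))
H = rng.random((2, 5))
print(conv_model(W, H).shape)
```

When the templates have a single time frame, the loop runs once with a zero shift, so the result equals the ordinary matrix product of the templates and activations, that is, the standard NMF model.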

Unit Tests

The test script 'run_test.py' serves as the main workhorse to test different functions. Please note that the unit tests require MATLAB to generate the reference files.

usage: run_test.py [-h] [-f <FunctionName>] [-m <MatlabPath>]

optional arguments:  
  -h, --help            show this help message and exit  
  -f <FunctionName>,  --function_name <FunctionName>  
                        Function name to test 
  -m <MatlabPath>,    --matlab_path <MatlabPath>
                        Path to matlab binary file  

Here is an example call that shows how to run the unit test for the core NMFD function.

python run_test.py -f NMFD -m /usr/local/MATLAB/R2019a/bin/matlab
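
For readers who want to script such calls, the following sketch shows how a command-line interface with the options listed in the usage message could be declared with Python's argparse module. It is an assumption about how such a parser might look, not the actual contents of run_test.py; the helper name `build_parser` is hypothetical.

```python
import argparse


def build_parser():
    """Declare the two options shown in the usage message (illustrative only)."""
    parser = argparse.ArgumentParser(prog='run_test.py')
    parser.add_argument('-f', '--function_name', metavar='<FunctionName>',
                        help='Function name to test')
    parser.add_argument('-m', '--matlab_path', metavar='<MatlabPath>',
                        help='Path to matlab binary file')
    return parser


# Parse the arguments of the example call explicitly instead of reading sys.argv.
args = build_parser().parse_args(
    ['-f', 'NMFD', '-m', '/usr/local/MATLAB/R2019a/bin/matlab'])
print(args.function_name, args.matlab_path)
```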

Literature

This is the accompanying website for [1], where further details on the toolbox, dataset, and the applications are discussed.

  1. Patricio López-Serrano, Christian Dittmar, Yiğitcan Özer, and Meinard Müller
    NMF Toolbox: Music Processing Applications of Nonnegative Matrix Factorization
    In Proceedings of the International Conference on Digital Audio Effects (DAFx), 2019.
    @inproceedings{LopezSerranoDOEM19_NMFToolbox_DAFx,
    author    = {Patricio L{\'o}pez-Serrano and Christian Dittmar and Yi{ğ}itcan {\"O}zer and Meinard M{\"u}ller},
    booktitle = {Proceedings of the International Conference on Digital Audio Effects ({DAFx})},
    title     = {NMF Toolbox: Music Processing Applications of Nonnegative Matrix Factorization},
    year      = {2019},
    month     = {September},
    address   = {Birmingham, UK},
    pages     = {},
    }
  2. Paris Smaragdis
    Non-negative Matrix Factor Deconvolution; Extraction of Multiple Sound Sources from Monophonic Inputs
    In Proceedings of the International Conference on Independent Component Analysis and Blind Signal Separation (ICA): 494–499, 2004.
    @inproceedings{Smaragdis04_NMD,
    author    = {Paris Smaragdis},
    title     = {Non-negative Matrix Factor Deconvolution; Extraction of Multiple Sound Sources from Monophonic Inputs},
    booktitle = {Proceedings of the International Conference on Independent Component Analysis and Blind Signal Separation {(ICA)}},
    pages     = {494--499},
    address   = {Granada, Spain},
    year      = {2004},
    month     = {September},
    }
  3. Meinard Müller
    Fundamentals of Music Processing
    Springer Verlag, ISBN: 978-3-319-21944-8, 2015.
    @book{Mueller15_FMP_SPRINGER,
    author    = {Meinard M{\"u}ller},
    title     = {Fundamentals of Music Processing},
    type      = {Monograph},
    year      = {2015},
    isbn      = {978-3-319-21944-8},
    publisher = {Springer Verlag}
    }
  4. Daniel D. Lee and H. Sebastian Seung
    Learning the parts of objects by non-negative matrix factorization
    Nature, 401(6755): 788–791, 1999.
    @article{LeeS99_LearningPartsNMF_Nature,
    author={Daniel D. Lee and H. Sebastian Seung},
    title={Learning the parts of objects by non-negative matrix factorization},
    volume={401},
    number={6755},
    journal={Nature},
    year={1999},
    pages={788--791}
    }
  5. Jonathan Driedger, Thomas Prätzlich, and Meinard Müller
    Let It Bee — Towards NMF-Inspired Audio Mosaicing
    In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 350–356, 2015.
    @inproceedings{DriedgerPM15_AudioMosaicingNMF_ISMIR,
    author    = {Jonathan Driedger and Thomas Pr{\"a}tzlich and Meinard M{\"u}ller},
    title     = {{L}et {I}t {B}ee -- {T}owards {NMF}-Inspired Audio Mosaicing},
    booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})},
    address   = {M\'{a}laga, Spain},
    year      = {2015},
    pages     = {350--356},
    }
  6. Andrzej Cichocki, Rafal Zdunek, Anh Huy Phan, and Shun-ichi Amari
    Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and Blind Source Separation
    John Wiley and Sons, 2009.
    @Book{CichockiZP_AlternateAlgorithmsNmf_Book,
    author    = {Andrzej Cichocki and Rafal Zdunek and Anh Huy Phan and {Shun-ichi} Amari},
    title     = {Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and Blind Source Separation},
    publisher = {John Wiley and Sons},
    year      = {2009}
    }
  7. Christian Dittmar and Meinard Müller
    Reverse Engineering the Amen Break — Score-Informed Separation and Restoration Applied to Drum Recordings
    IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(9): 1531–1543, 2016.
    @article{DittmarMueller16_DrumSep_IEEE-ACM-TASLP,
    author    = {Christian Dittmar and Meinard M{\"u}ller},
    title     = {Reverse Engineering the {A}men Break -- Score-Informed Separation and Restoration Applied to Drum Recordings},
    journal   = {{IEEE/ACM} Transactions on Audio, Speech, and Language Processing},
    volume    = {24},
    number    = {9},
    pages     = {1531--1543},
    year      = {2016},
    doi       = {10.1109/TASLP.2016.2567645},
    }
  8. Jonathan Driedger, Harald Grohganz, Thomas Prätzlich, Sebastian Ewert, and Meinard Müller
    Score-Informed Audio Decomposition and Applications
    In Proceedings of the ACM International Conference on Multimedia (ACM-MM): 541–544, 2013.
    @inproceedings{DriedgerGPEM13_AudioDecomposition_ACM-MM,
    author    = {Jonathan Driedger and Harald Grohganz and Thomas Pr{\"a}tzlich and Sebastian Ewert and Meinard M{\"u}ller},
    title     = {Score-Informed Audio Decomposition and Applications},
    booktitle = {Proceedings of the {ACM} International Conference on Multimedia ({ACM-MM})},
    address   = {Barcelona, Spain},
    year      = {2013},
    pages     = {541--544},
    url-pdf   = {2013_DriedgerGPEM_SourceSeparationInterface_ACM.pdf},
    url-details = {https://www.audiolabs-erlangen.de/resources/2013-ACMMM-AudioDecomp/}
    }
  9. Christian Dittmar, Patricio López-Serrano, and Meinard Müller
    Unifying Local and Global Methods for Harmonic-Percussive Source Separation
    In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP): 176–180, 2018.
    @inproceedings{DittmarLM18_HPSS_KAM_NMF_ICASSP,
    author    = {Christian Dittmar and Patricio L{\'o}pez-Serrano and Meinard M{\"u}ller},
    title     = {Unifying Local and Global Methods for Harmonic-Percussive Source Separation},
    booktitle = {Proceedings of the {IEEE} International Conference on Acoustics, Speech, and Signal Processing ({ICASSP})},
    address   = {Calgary, Canada},
    month     = {April},
    year      = {2018},
    pages     = {176--180},
    url-demo={https://www.audiolabs-erlangen.de/resources/MIR/2018-ICASSP-HPSS_KAM_NMF},
    }
  10. Daniel W. Griffin and Jae S. Lim
    Signal estimation from modified short-time Fourier transform
    IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2): 236–243, 1984.
    @article{GriffinL84_SpecgramInversion_TASSP,
    author={Daniel W. Griffin and Jae S. Lim},
    title={Signal estimation from modified short-time {F}ourier transform},
    journal={{IEEE} Transactions on Acoustics, Speech, and Signal Processing},
    year={1984},
    volume={32},
    number={2},
    pages={236--243}
    }
  11. Antoine Liutkus and Roland Badeau
    Generalized Wiener filtering with fractional power spectrograms
    In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): 266–270, 2015.
    @inproceedings{LiutkusB15_WienerFilter_ICASSP,
    author = {Antoine Liutkus and Roland Badeau},
    booktitle = {Proceedings of the {IEEE} International Conference on Acoustics, Speech and Signal Processing ({ICASSP})},
    title = {Generalized {W}iener filtering with fractional power spectrograms},
    year = {2015},
    month = {April},
    pages = {266--270},
    address = {Brisbane, Australia},
    }
  12. Christian Dittmar, Jonathan Driedger, Meinard Müller, and Jouni Paulus
    An Experimental Approach to Generalized Wiener Filtering in Music Source Separation
    In Proceedings of the European Signal Processing Conference (EUSIPCO), 2016.
    @inproceedings{DittmarDMP16_WienerFiltering_EUSIPCO,
    author    = {Christian Dittmar and Jonathan Driedger and Meinard M{\"u}ller and Jouni Paulus},
    title     = {An Experimental Approach to Generalized {W}iener Filtering in Music Source Separation},
    booktitle = {Proceedings of the European Signal Processing Conference ({EUSIPCO})},
    address   = {Budapest, Hungary},
    year      = {2016},
    pages     = {},
    month     = {August},
    url-pdf   = {}
    }