AudioLabs - NMF Toolbox

NMF Toolbox

This is the accompanying website for the paper "NMF Toolbox: Music Processing Applications of Nonnegative Matrix Factorization" by Patricio López-Serrano, Christian Dittmar, Yiğitcan Özer and Meinard Müller.

Abstract

Nonnegative matrix factorization (NMF) is a family of methods widely used for information retrieval across domains including text, images, and audio. Within music processing, NMF has been used for tasks such as transcription, source separation, and structure analysis. Prior work has shown that initialization and constrained update rules can drastically improve the chances of NMF converging to a musically meaningful solution. Along these lines we present the NMF toolbox, containing MATLAB and Python implementations of conceptually distinct NMF variants---in particular, this paper gives an overview for two algorithms. The first variant, called nonnegative matrix factor deconvolution (NMFD), extends the original NMF algorithm to the convolutive case, enforcing the temporal order of spectral templates. The second variant, called diagonal NMF, supports the development of sparse diagonal structures in the activation matrix. Our toolbox contains several demo applications and code examples to illustrate its potential and functionality. By providing MATLAB and Python code on a documentation website under a GNU-GPL license, as well as including illustrative examples, our aim is to foster research and education in the field of music processing.

General Description

The toolbox is available for both MATLAB and Python. We additionally implemented unit tests to ensure that the results in both programming languages are consistent. A small dataset of example audio files is provided for demonstration purposes.

Here is the link to the toolbox: NMFtoolbox.zip

The folder structure is as follows:

Folder Name	Description
data	Dataset directory
matlab	MATLAB implementation
python	Python implementation
unit_tests	Unit tests that check that the results in both programming languages are consistent.

MATLAB Code

The MATLAB implementation requires MATLAB version 2016a. Please note the folder structure when using it.

Demo Files

demoAudioMosaicingContinuityNMF
demoDrumSoundSeparationNMF
demoEDMDecompositionFourComp
demoDrumExtractionKAM_NMF_percThreshold
demoDrumExtractionKAM_NMF_scoreInformed

Code Description

Filename	Description and main parameters
`NMFD.m`	Nonnegative Matrix Factor Deconvolution with KLD and fixable components [2]. `V`, `numComp`, `numIter`, `numTemplateFrames`, `initW`, `initH`, `paramConstr`, `fixH`
`NMF.m`	Nonnegative matrix factorization with KLD as default cost function [3], [4]. `V`, `costFunc`, `numIter`, `numComp`.
`NMFdiag.m`	Nonnegative matrix factorization with enhanced diagonal continuity constraints [5]. `V`, `W0`, `H0`, `distmeas`, `numOfIter`, `fixW`, `continuity.length`, `continuity.grid`, `continuity.sparsen`, `continuity.polyphony`
`NMFconv.m`	Convolutive NMF with beta-divergence [6]. `V`, `numComp`, `numIter`, `numTemplateFrames`, `initW`, `initH`, `beta`, `sparsityWeight`, `uncorrWeight`
`convModel.m`	Convolutive NMF model implementing Eq. (4) from [7]. Note that it can also be used to compute the standard NMF model in case the number of time frames of the templates equals one. `W`, `H`
`shiftOperator.m`	Shift operator as described in Eq. (5) from [7]. It shifts the columns of a matrix to the left or the right and fills undefined elements with zeros. `A`, `shiftAmount`
`initActivations.m`	Initialization strategies for NMF activations, including `random` and `uniform`. The `pitched` strategy places gate-like activations at the frames where certain notes are active in the ground truth [8]. The strategy `drums` uses decaying impulses at these positions [7]. `numComp`, `numFrames`, `deltaT`, `pitches`, `onsets`, `durations`, `drums`, `decay`, `onsetOffsetTol`, `tolerance`, `strategy`
`initTemplates.m`	NMF template initialization strategies, including `random` and `uniform`. The strategy `pitched` uses comb-filter templates [8]. The `drums` strategy uses pre-extracted averaged spectra of typical drum types. `numComp`, `numBins`, `numTemplateFrames`, `pitches`, `drumTypes`, `strategy`
`NEMA.m`	Row-wise nonlinear exponential moving average. Used to introduce exponentially decaying slopes according to Eq. (3) from [9]. `lambda`
`midi2freq.m`, `freq2midi.m`, `logFreqLogMag.m`	Helper functions to convert between MIDI pitches and frequencies in Hz, as well as log-frequency and log-magnitude representations for visualization. `midi`, `freq`, `A`, `deltaF`, `binsPerOctave`, `upperFreq`, `lowerFreq`
`LSEE_MSTFTM_GriffinLim.m`, `forwardSTFT.m`, `inverseSTFT.m`	Reconstruct the time-domain signal by means of the frame-wise inverse FFT and overlap-add method, described as least squares error estimation from the modified STFT magnitude (LSEE-MSTFTM) in [10]. `blockSize`, `hopSize`, `anaWinFunc`, `synWinFunc`, `reconstMirror`, `appendFrame`, `analyticSig`, `numSamples`
`alphaWienerFilter.m`	Alpha-related soft masks for extracting sources from mixture. Details in [11] and experiments in [12]. `alpha`, `binarize`
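
As a rough illustration of what the `NMF` entry in the table computes, the following NumPy sketch runs the multiplicative update rules of Lee and Seung for the Kullback-Leibler divergence (KLD). It is written in Python for compactness and is not the toolbox code: the function names `nmf_kl` and `kl_divergence`, the random initialization, and the small constant `eps` are assumptions made for this example.

```python
import numpy as np

def nmf_kl(V, num_comp, num_iter=100, seed=0, eps=1e-12):
    """Sketch of NMF with multiplicative updates for the generalized KL divergence."""
    rng = np.random.default_rng(seed)
    num_bins, num_frames = V.shape
    # Random nonnegative initialization (assumption; other strategies are possible).
    W = rng.random((num_bins, num_comp)) + eps
    H = rng.random((num_comp, num_frames)) + eps
    ones = np.ones_like(V)
    for _ in range(num_iter):
        # Activation update: H <- H * (W^T (V / WH)) / (W^T 1)
        H *= (W.T @ (V / (W @ H + eps))) / (W.T @ ones + eps)
        # Template update: W <- W * ((V / WH) H^T) / (1 H^T)
        W *= ((V / (W @ H + eps)) @ H.T) / (ones @ H.T + eps)
    return W, H

def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized KL divergence between V and its approximation V_hat."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))

# Synthetic nonnegative matrix of rank 3, standing in for a magnitude spectrogram.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 50))
W, H = nmf_kl(V, num_comp=3, num_iter=200)
```

Because the updates are purely multiplicative, `W` and `H` stay nonnegative as long as they are initialized with nonnegative values.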

Python Code

The Python implementation requires Python 3.6.

Installation on Ubuntu 16.04

virtualenv env -p python3.6
source env/bin/activate
cd python
python setup.py develop
jupyter notebook

Installation on Windows 10

virtualenv env
env\Scripts\activate
cd python
python setup.py develop
jupyter notebook

Demo Files

demoAudioMosaicingContinuityNMF
demoDrumSoundSeparationNMF
demoEDMDecompositionFourComp
demoDrumExtractionKAM_NMF_percThreshold
demoDrumExtractionKAM_NMF_scoreInformed

NMFtoolbox

We implemented NMFtoolbox as a Python library that contains the MATLAB scripts listed above, with exactly the same names and parameters. Please note that some data structures are implemented as dictionaries in Python. We also wrote an additional helper script, `utils.py`, to provide some functionality that is less straightforward in Python than in MATLAB.
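
As a small illustration of the struct-to-dictionary mapping, a MATLAB struct field such as `continuity.length` corresponds to the dictionary entry `continuity['length']` in Python. The keys below are the ones listed for `NMFdiag`; the values are hypothetical and chosen only for this example.

```python
# Hypothetical values for illustration only; see the toolbox demos for real settings.
continuity = {
    'length': 7,
    'grid': 5,
    'sparsen': [1, 7],
    'polyphony': 10,
}

# MATLAB: continuity.polyphony  ->  Python: continuity['polyphony']
polyphony = continuity['polyphony']

# Fields are updated by key instead of by dot notation.
continuity['length'] = 9
```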

Filename	Description and main parameters
`NMFD.py`	Nonnegative Matrix Factor Deconvolution with KLD and fixable components [2]. `V`, `numComp`, `numIter`, `numTemplateFrames`, `initW`, `initH`, `paramConstr`, `fixH`
`NMF.py`	Nonnegative matrix factorization with KLD as default cost function [3], [4]. `V`, `costFunc`, `numIter`, `numComp`.
`NMFdiag.py`	Nonnegative matrix factorization with enhanced diagonal continuity constraints [5]. `V`, `W0`, `H0`, `distmeas`, `numOfIter`, `fixW`, `continuity['length']`, `continuity['grid']`, `continuity['sparsen']`, `continuity['polyphony']`
`NMFconv.py`	Convolutive NMF with beta-divergence [6]. `V`, `numComp`, `numIter`, `numTemplateFrames`, `initW`, `initH`, `beta`, `sparsityWeight`, `uncorrWeight`
`convModel.py`	Convolutive NMF model implementing Eq. (4) from [7]. Note that it can also be used to compute the standard NMF model in case the number of time frames of the templates equals one. `W`, `H`
`shiftOperator.py`	Shift operator as described in Eq. (5) from [7]. It shifts the columns of a matrix to the left or the right and fills undefined elements with zeros. `A`, `shiftAmount`
`initActivations.py`	Initialization strategies for NMF activations, including `random` and `uniform`. The `pitched` strategy places gate-like activations at the frames where certain notes are active in the ground truth [8]. The strategy `drums` uses decaying impulses at these positions [7]. `numComp`, `numFrames`, `deltaT`, `pitches`, `onsets`, `durations`, `drums`, `decay`, `onsetOffsetTol`, `tolerance`, `strategy`
`initTemplates.py`	NMF template initialization strategies, including `random` and `uniform`. The strategy `pitched` uses comb-filter templates [8]. The `drums` strategy uses pre-extracted averaged spectra of typical drum types. `numComp`, `numBins`, `numTemplateFrames`, `pitches`, `drumTypes`, `strategy`
`NEMA.py`	Row-wise nonlinear exponential moving average. Used to introduce exponentially decaying slopes according to Eq. (3) from [9]. `lambda`
`midi2freq.py`, `freq2midi.py`, `logFreqLogMag.py`	Helper functions to convert between MIDI pitches and frequencies in Hz, as well as log-frequency and log-magnitude representations for visualization. `midi`, `freq`, `A`, `deltaF`, `binsPerOctave`, `upperFreq`, `lowerFreq`
`LSEE_MSTFTM_GriffinLim.py`, `forwardSTFT.py`, `inverseSTFT.py`	Reconstruct the time-domain signal by means of the frame-wise inverse FFT and overlap-add method, described as least squares error estimation from the modified STFT magnitude (LSEE-MSTFTM) in [10]. `blockSize`, `hopSize`, `anaWinFunc`, `synWinFunc`, `reconstMirror`, `appendFrame`, `analyticSig`, `numSamples`
`alphaWienerFilter.py`	Alpha-related soft masks for extracting sources from mixture. Details in [11] and experiments in [12]. `alpha`, `binarize`
`utils.py`	Additional helper functions for the Python implementation.
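
The shift operator and the convolutive model described in the table can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the toolbox implementation: the names `shift_columns` and `conv_model`, the sign convention (positive shifts move columns to the right), and the storage of the templates as a 3-D array of shape (bins, components, template frames) are all chosen for this example.

```python
import numpy as np

def shift_columns(A, shift_amount):
    """Shift the columns of A right (positive) or left (negative), zero-filling."""
    num_cols = A.shape[1]
    shifted = np.zeros_like(A)
    if shift_amount == 0:
        shifted[:] = A
    elif abs(shift_amount) < num_cols:
        if shift_amount > 0:
            shifted[:, shift_amount:] = A[:, :-shift_amount]
        else:
            shifted[:, :shift_amount] = A[:, -shift_amount:]
    # Shifts of num_cols or more leave only zeros.
    return shifted

def conv_model(W, H):
    """Approximate V as the sum over t of W[:, :, t] @ shift_columns(H, t)."""
    num_bins, num_comp, num_template_frames = W.shape
    V_hat = np.zeros((num_bins, H.shape[1]))
    for t in range(num_template_frames):
        V_hat += W[:, :, t] @ shift_columns(H, t)
    return V_hat

# One component whose template spans two frames, triggered by a single impulse.
W = np.zeros((2, 1, 2))
W[:, 0, 0] = [1.0, 2.0]   # first template frame
W[:, 0, 1] = [3.0, 4.0]   # second template frame
H = np.array([[0.0, 1.0, 0.0, 0.0]])
V_hat = conv_model(W, H)
```

In this example the impulse at frame 1 places the first template frame at column 1 of `V_hat` and the second at column 2, which is how the convolutive model preserves the temporal order of spectral templates. With a single template frame, `conv_model` reduces to the ordinary product `W[:, :, 0] @ H`.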

Unit Tests

The test script `run_test.py` serves as the main entry point for testing the different functions. Please note that the unit tests require MATLAB to generate the reference files.

usage: run_test.py [-h] [-f <FunctionName>] [-m <MatlabPath>]

optional arguments:  
  -h, --help            show this help message and exit  
  -f <FunctionName>,  --function_name <FunctionName>  
                        Function name to test 
  -m <MatlabPath>,    --matlab_path <MatlabPath>
                        Path to matlab binary file

Here is an example call that shows how to run the unit test for the core NMFD function.

python run_test.py -f NMFD -m /usr/local/MATLAB/R2019a/bin/matlab
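
For readers who want to wrap or extend the test runner, a command-line interface with the options shown in the usage message could be declared with `argparse` as follows. This is only a sketch of an equivalent interface; whether `run_test.py` is built this way is an assumption, and `build_parser` is a name made up for this example.

```python
import argparse

def build_parser():
    """Declare the two options shown in the usage message (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog='run_test.py')
    parser.add_argument('-f', '--function_name', metavar='<FunctionName>',
                        help='Function name to test')
    parser.add_argument('-m', '--matlab_path', metavar='<MatlabPath>',
                        help='Path to matlab binary file')
    return parser

# Parse the example call from the text instead of the real command line.
args = build_parser().parse_args(
    ['-f', 'NMFD', '-m', '/usr/local/MATLAB/R2019a/bin/matlab'])
```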

Literature

This is the accompanying website for [1], where further details on the toolbox, dataset, and the applications are discussed.

[1] Patricio López-Serrano, Christian Dittmar, Yiğitcan Özer, and Meinard Müller
NMF Toolbox: Music Processing Applications of Nonnegative Matrix Factorization
In Proceedings of the International Conference on Digital Audio Effects (DAFx), 2019.

@inproceedings{LopezSerranoDOEM19_NMFToolbox_DAFx,
author    = {Patricio L{\'o}pez-Serrano and Christian Dittmar and Yi{ğ}itcan {\"O}zer and Meinard M{\"u}ller},
booktitle = {Proceedings of the International Conference on Digital Audio Effects ({DAFx})},
title     = {NMF Toolbox: Music Processing Applications of Nonnegative Matrix Factorization},
year      = {2019},
month     = {September},
address   = {Birmingham, UK},
pages     = {},
}

[2] Paris Smaragdis
Non-negative Matrix Factor Deconvolution; Extraction of Multiple Sound Sources from Monophonic Inputs
In Proceedings of the International Conference on Independent Component Analysis and Blind Signal Separation (ICA): 494–499, 2004.

@inproceedings{Smaragdis04_NMD,
author    = {Paris Smaragdis},
title     = {Non-negative Matrix Factor Deconvolution; Extraction of Multiple Sound Sources from Monophonic Inputs},
booktitle = {Proceedings of the International Conference on Independent Component Analysis and Blind Signal Separation {(ICA)}},
pages     = {494--499},
address   = {Granada, Spain},
year      = {2004},
month     = {September},
}

[3] Meinard Müller
Fundamentals of Music Processing
Springer Verlag, ISBN: 978-3-319-21944-8, 2015.

@book{Mueller15_FMP_SPRINGER,
author    = {Meinard M{\"u}ller},
title     = {Fundamentals of Music Processing},
type      = {Monograph},
year      = {2015},
isbn      = {978-3-319-21944-8},
publisher = {Springer Verlag}
}

[4] Daniel D. Lee and H. Sebastian Seung
Learning the parts of objects by non-negative matrix factorization
Nature, 401(6755): 788–791, 1999.

@article{LeeS99_LearningPartsNMF_Nature,
author={Daniel D. Lee and H. Sebastian Seung},
title={Learning the parts of objects by non-negative matrix factorization},
volume={401},
number={6755},
journal={Nature},
year={1999},
pages={788--791}
}

[5] Jonathan Driedger, Thomas Prätzlich, and Meinard Müller
Let It Bee — Towards NMF-Inspired Audio Mosaicing
In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 350–356, 2015.

@inproceedings{DriedgerPM15_AudioMosaicingNMF_ISMIR,
author    = {Jonathan Driedger and Thomas Pr{\"a}tzlich and Meinard M{\"u}ller},
title     = {{L}et {I}t {B}ee -- {T}owards {NMF}-Inspired Audio Mosaicing},
booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})},
address   = {M\'{a}laga, Spain},
year      = {2015},
pages     = {350--356},
}

[6] Andrzej Cichocki, Rafal Zdunek, Anh Huy Phan, and Shun-ichi Amari
Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and Blind Source Separation
John Wiley and Sons, 2009.

@Book{CichockiZP_AlternateAlgorithmsNmf_Book,
author    = {Andrzej Cichocki and Rafal Zdunek and Anh Huy Phan and {Shun-ichi} Amari},
title     = {Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and Blind Source Separation},
publisher = {John Wiley and Sons},
year      = {2009}
}

[7] Christian Dittmar and Meinard Müller
Reverse Engineering the Amen Break — Score-Informed Separation and Restoration Applied to Drum Recordings
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(9): 1531–1543, 2016.

@article{DittmarMueller16_DrumSep_IEEE-ACM-TASLP,
author    = {Christian Dittmar and Meinard M{\"u}ller},
title     = {Reverse Engineering the {A}men Break -- Score-Informed Separation and Restoration Applied to Drum Recordings},
journal   = {{IEEE/ACM} Transactions on Audio, Speech, and Language Processing},
volume    = {24},
number    = {9},
pages     = {1531--1543},
year      = {2016},
doi       = {10.1109/TASLP.2016.2567645},
}

[8] Jonathan Driedger, Harald Grohganz, Thomas Prätzlich, Sebastian Ewert, and Meinard Müller
Score-Informed Audio Decomposition and Applications
In Proceedings of the ACM International Conference on Multimedia (ACM-MM): 541–544, 2013.

@inproceedings{DriedgerGPEM13_AudioDecomposition_ACM-MM,
author    = {Jonathan Driedger and Harald Grohganz and Thomas Pr{\"a}tzlich and Sebastian Ewert and Meinard M{\"u}ller},
title     = {Score-Informed Audio Decomposition and Applications},
booktitle = {Proceedings of the {ACM} International Conference on Multimedia ({ACM-MM})},
address   = {Barcelona, Spain},
year      = {2013},
pages     = {541--544},
url-pdf   = {2013_DriedgerGPEM_SourceSeparationInterface_ACM.pdf},
url-details = {https://www.audiolabs-erlangen.de/resources/2013-ACMMM-AudioDecomp/}
}

[9] Christian Dittmar, Patricio López-Serrano, and Meinard Müller
Unifying Local and Global Methods for Harmonic-Percussive Source Separation
In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP): 176–180, 2018.

@inproceedings{DittmarLM18_HPSS_KAM_NMF_ICASSP,
author    = {Christian Dittmar and Patricio L{\'o}pez-Serrano and Meinard M{\"u}ller},
title     = {Unifying Local and Global Methods for Harmonic-Percussive Source Separation},
booktitle = {Proceedings of the {IEEE} International Conference on Acoustics, Speech, and Signal Processing ({ICASSP})},
address   = {Calgary, Canada},
month     = {April},
year      = {2018},
pages     = {176--180},
url-demo={https://www.audiolabs-erlangen.de/resources/MIR/2018-ICASSP-HPSS_KAM_NMF},
}

[10] Daniel W. Griffin and Jae S. Lim
Signal estimation from modified short-time Fourier transform
IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2): 236–243, 1984.

@article{GriffinL84_SpecgramInversion_TASSP,
author={Daniel W. Griffin and Jae S. Lim},
title={Signal estimation from modified short-time {F}ourier transform},
journal={{IEEE} Transactions on Acoustics, Speech, and Signal Processing},
year={1984},
volume={32},
number={2},
pages={236--243}
}

[11] Antoine Liutkus and Roland Badeau
Generalized Wiener filtering with fractional power spectrograms
In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): 266–270, 2015.

@inproceedings{LiutkusB15_WienerFilter_ICASSP,
author = {Antoine Liutkus and Roland Badeau},
booktitle = {Proceedings of the {IEEE} International Conference on Acoustics, Speech and Signal Processing ({ICASSP})},
title = {Generalized {W}iener filtering with fractional power spectrograms},
year = {2015},
month = {April},
pages = {266--270},
address = {Brisbane, Australia},
}

[12] Christian Dittmar, Jonathan Driedger, Meinard Müller, and Jouni Paulus
An Experimental Approach to Generalized Wiener Filtering in Music Source Separation
In Proceedings of the European Signal Processing Conference (EUSIPCO), 2016.

@inproceedings{DittmarDMP16_WienerFiltering_EUSIPCO,
author    = {Christian Dittmar and Jonathan Driedger and Meinard M{\"u}ller and Jouni Paulus},
title     = {An Experimental Approach to Generalized {W}iener Filtering in Music Source Separation},
booktitle = {Proceedings of the European Signal Processing Conference ({EUSIPCO})},
address   = {Budapest, Hungary},
year      = {2016},
pages     = {},
month     = {August},
url-pdf   = {}
}

International Audio Laboratories Erlangen
