This is the accompanying website for the paper "NMF Toolbox: Music Processing Applications of Nonnegative Matrix Factorization" by Patricio López-Serrano, Christian Dittmar, Yiğitcan Özer and Meinard Müller.

Nonnegative matrix factorization (NMF) is a family of methods widely used for information retrieval across domains including text, images, and audio. Within music processing, NMF has been used for tasks such as transcription, source separation, and structure analysis. Prior work has shown that initialization and constrained update rules can drastically improve the chances of NMF converging to a musically meaningful solution. Along these lines, we present the NMF toolbox, containing MATLAB and Python implementations of conceptually distinct NMF variants; in particular, this paper gives an overview of two algorithms. The first variant, called nonnegative matrix factor deconvolution (NMFD), extends the original NMF algorithm to the convolutive case, enforcing the temporal order of spectral templates. The second variant, called diagonal NMF, supports the development of sparse diagonal structures in the activation matrix. Our toolbox contains several demo applications and code examples to illustrate its potential and functionality. By providing MATLAB and Python code on a documentation website under a GNU-GPL license, as well as including illustrative examples, our aim is to foster research and education in the field of music processing.

The toolbox is available for both MATLAB and Python. We additionally implemented unit tests to ensure that the results in both programming languages are consistent. A small dataset of example audio files is provided for demonstration purposes.

Here is the link to the toolbox: `NMFtoolbox.zip`

The folder structure is as follows:

Folder Name | Description |
---|---|
data | Dataset directory |
matlab | MATLAB implementation |
python | Python implementation |
unit_tests | Unit tests that ensure the results in both programming languages are consistent. |

The MATLAB implementation requires MATLAB version **2016a**. Please pay attention to the folder structure when using it.

`demoAudioMosaicingContinuityNMF`

`demoDrumSoundSeparationNMF`

`demoEDMDecompositionFourComp`

`demoDrumExtractionKAM_NMF_percThreshold`

`demoDrumExtractionKAM_NMF_scoreInformed`

Filename | Description and main parameters |
---|---|
`NMFD.m` | Nonnegative Matrix Factor Deconvolution with KLD and fixable components [2]. Parameters: `V`, `numComp`, `numIter`, `numTemplateFrames`, `initW`, `initH`, `paramConstr`, `fixH` |
`NMF.m` | Nonnegative matrix factorization with KLD as default cost function [3], [4]. Parameters: `V`, `costFunc`, `numIter`, `numComp` |
`NMFdiag.m` | Nonnegative matrix factorization with enhanced diagonal continuity constraints [5]. Parameters: `V`, `W0`, `H0`, `distmeas`, `numOfIter`, `fixW`, `continuity.length`, `continuity.grid`, `continuity.sparsen`, `continuity.polyphony` |
`NMFconv.m` | Convolutive NMF with beta-divergence [6]. Parameters: `V`, `numComp`, `numIter`, `numTemplateFrames`, `initW`, `initH`, `beta`, `sparsityWeight`, `uncorrWeight` |
`convModel.m` | Convolutive NMF model implementing Eq. (4) from [7]. Note that it can also be used to compute the standard NMF model in case the number of time frames of the templates equals one. Parameters: `W`, `H` |
`shiftOperator.m` | Shift operator as described in Eq. (5) from [7]. It shifts the columns of a matrix to the left or the right and fills undefined elements with zeros. Parameters: `A`, `shiftAmount` |
`initActivations.m` | Initialization strategies for NMF activations, including `random` and `uniform`. The `pitched` strategy places gate-like activations at the frames where certain notes are active in the ground truth [8]. The strategy `drums` uses decaying impulses at these positions [7]. Parameters: `numComp`, `numFrames`, `deltaT`, `pitches`, `onsets`, `durations`, `drums`, `decay`, `onsetOffsetTol`, `tolerance`, `strategy` |
`initTemplates.m` | NMF template initialization strategies, including `random` and `uniform`. The strategy `pitched` uses comb-filter templates [8]. The `drums` strategy uses pre-extracted averaged spectra of typical drum types. Parameters: `numComp`, `numBins`, `numTemplateFrames`, `pitches`, `drumTypes`, `strategy` |
`NEMA.m` | Row-wise nonlinear exponential moving average. Used to introduce exponentially decaying slopes according to Eq. (3) from [9]. Parameters: `lambda` |
`midi2freq.m`, `freq2midi.m`, `logFreqLogMag.m` | Helper functions to convert between MIDI pitches and frequencies in Hz, as well as log-frequency and log-magnitude representations for visualization. Parameters: `midi`, `freq`, `A`, `deltaF`, `binsPerOctave`, `upperFreq`, `lowerFreq` |
`LSEE_MSTFTM_GriffinLim.m`, `forwardSTFT.m`, `inverseSTFT.m` | Reconstruct the time-domain signal by means of the frame-wise inverse FFT and overlap-add method described as least squares error estimation from the modified STFT magnitude (LSEE-MSTFT) in [10]. Parameters: `blockSize`, `hopSize`, `anaWinFunc`, `synWinFunc`, `reconstMirror`, `appendFrame`, `analyticSig`, `numSamples` |
`alphaWienerFilter.m` | Alpha-related soft masks for extracting sources from mixture. Details in [11] and experiments in [12]. Parameters: `alpha`, `binarize` |
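For orientation, the models behind the NMF and convolutive NMF functions listed above can be summarized as follows. This is a sketch in commonly used notation, not a transcription of the equations in the cited papers; the symbols $K$ (frequency bins), $N$ (time frames), $R$ (components), $T$ (template frames), and the all-ones matrix $J \in \mathbb{R}^{K \times N}$ are assumptions introduced here.

Standard NMF approximates a nonnegative matrix by a product of two nonnegative factors:

$$
V \approx W H, \qquad V \in \mathbb{R}_{\geq 0}^{K \times N},\; W \in \mathbb{R}_{\geq 0}^{K \times R},\; H \in \mathbb{R}_{\geq 0}^{R \times N}
$$

For the Kullback-Leibler divergence (KLD), the widely used multiplicative update rules read, with $\odot$, $\oslash$, and the fraction bars denoting element-wise operations:

$$
H \leftarrow H \odot \frac{W^\top \left( V \oslash (WH) \right)}{W^\top J}, \qquad
W \leftarrow W \odot \frac{\left( V \oslash (WH) \right) H^\top}{J H^\top}
$$

The convolutive model replaces each template column by a short sequence of $T$ spectral frames, where $\overset{m \rightarrow}{H}$ denotes $H$ with its columns shifted $m$ positions to the right and zeros filled in on the left:

$$
V \approx \sum_{m=0}^{T-1} W_m \cdot \overset{m \rightarrow}{H}, \qquad W_m \in \mathbb{R}_{\geq 0}^{K \times R}
$$

For $T = 1$ the sum has a single term and the model reduces to $V \approx W_0 H$, which is the standard NMF case.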

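The Python implementation requires **Python 3.6**. To set up a virtual environment, install the package, and start Jupyter, use the first set of commands below on Linux or macOS and the second set on Windows.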

```
virtualenv env -p python3.6
source env/bin/activate
cd python
python setup.py develop
jupyter notebook
```

```
virtualenv env
env\Scripts\activate
cd python
python setup.py develop
jupyter notebook
```

`demoAudioMosaicingContinuityNMF`

`demoDrumSoundSeparationNMF`

`demoEDMDecompositionFourComp`

`demoDrumExtractionKAM_NMF_percThreshold`

`demoDrumExtractionKAM_NMF_scoreInformed`

We implemented NMFtoolbox as a Python library that contains the above MATLAB scripts with exactly the same names and parameters. Please note that some data structures are implemented as dictionaries in Python. We also wrote an additional `utils.py` helper script to provide some functionality that is not as straightforward in Python as in MATLAB.
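As a small illustration of the dictionary convention, the continuity settings that MATLAB addresses as struct fields (`continuity.length`, `continuity.grid`, ...) become dictionary keys in Python. The key names below are taken from the parameter list of `NMFdiag.py`; the values are hypothetical placeholders chosen only to show the data structure, not recommended settings.

```python
# Continuity settings as a plain Python dictionary.
# Key names follow the NMFdiag.py parameter list; all values are
# hypothetical placeholders, not defaults of the toolbox.
continuity = {
    'length': 7,
    'grid': 5,
    'sparsen': [1, 7],
    'polyphony': 10,
}

# A MATLAB field access such as continuity.length corresponds to a key lookup.
print(continuity['length'])

# Individual entries can be changed in place before passing the dictionary on.
continuity['polyphony'] = 5
print(sorted(continuity))
```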

Filename | Description and main parameters |
---|---|
`NMFD.py` | Nonnegative Matrix Factor Deconvolution with KLD and fixable components [2]. Parameters: `V`, `numComp`, `numIter`, `numTemplateFrames`, `initW`, `initH`, `paramConstr`, `fixH` |
`NMF.py` | Nonnegative matrix factorization with KLD as default cost function [3], [4]. Parameters: `V`, `costFunc`, `numIter`, `numComp` |
`NMFdiag.py` | Nonnegative matrix factorization with enhanced diagonal continuity constraints [5]. Parameters: `V`, `W0`, `H0`, `distmeas`, `numOfIter`, `fixW`, `continuity['length']`, `continuity['grid']`, `continuity['sparsen']`, `continuity['polyphony']` |
`NMFconv.py` | Convolutive NMF with beta-divergence [6]. Parameters: `V`, `numComp`, `numIter`, `numTemplateFrames`, `initW`, `initH`, `beta`, `sparsityWeight`, `uncorrWeight` |
`convModel.py` | Convolutive NMF model implementing Eq. (4) from [7]. Note that it can also be used to compute the standard NMF model in case the number of time frames of the templates equals one. Parameters: `W`, `H` |
`shiftOperator.py` | Shift operator as described in Eq. (5) from [7]. It shifts the columns of a matrix to the left or the right and fills undefined elements with zeros. Parameters: `A`, `shiftAmount` |
`initActivations.py` | Initialization strategies for NMF activations, including `random` and `uniform`. The `pitched` strategy places gate-like activations at the frames where certain notes are active in the ground truth [8]. The strategy `drums` uses decaying impulses at these positions [7]. Parameters: `numComp`, `numFrames`, `deltaT`, `pitches`, `onsets`, `durations`, `drums`, `decay`, `onsetOffsetTol`, `tolerance`, `strategy` |
`initTemplates.py` | NMF template initialization strategies, including `random` and `uniform`. The strategy `pitched` uses comb-filter templates [8]. The `drums` strategy uses pre-extracted averaged spectra of typical drum types. Parameters: `numComp`, `numBins`, `numTemplateFrames`, `pitches`, `drumTypes`, `strategy` |
`NEMA.py` | Row-wise nonlinear exponential moving average. Used to introduce exponentially decaying slopes according to Eq. (3) from [9]. Parameters: `lambda` |
`midi2freq.py`, `freq2midi.py`, `logFreqLogMag.py` | Helper functions to convert between MIDI pitches and frequencies in Hz, as well as log-frequency and log-magnitude representations for visualization. Parameters: `midi`, `freq`, `A`, `deltaF`, `binsPerOctave`, `upperFreq`, `lowerFreq` |
`LSEE_MSTFTM_GriffinLim.py`, `forwardSTFT.py`, `inverseSTFT.py` | Reconstruct the time-domain signal by means of the frame-wise inverse FFT and overlap-add method described as least squares error estimation from the modified STFT magnitude (LSEE-MSTFT) in [10]. Parameters: `blockSize`, `hopSize`, `anaWinFunc`, `synWinFunc`, `reconstMirror`, `appendFrame`, `analyticSig`, `numSamples` |
`alphaWienerFilter.py` | Alpha-related soft masks for extracting sources from mixture. Details in [11] and experiments in [12]. Parameters: `alpha`, `binarize` |
`utils.py` | Additional helper functions for Python. |
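To make the descriptions of the shift operator and the convolutive model more concrete, here is a minimal NumPy sketch. It is not the toolbox code: the function names `shift_columns` and `conv_model`, the convention that a positive shift moves columns to the right, and the layout of `W` as an array of shape (bins, components, template frames) are all assumptions made for this illustration.

```python
import numpy as np


def shift_columns(A, shift_amount):
    """Shift the columns of A right (positive) or left (negative).

    Positions that become undefined are filled with zeros.
    Hypothetical helper, not the toolbox's shiftOperator.
    """
    A = np.asarray(A, dtype=float)
    num_cols = A.shape[1]
    shifted = np.zeros_like(A)
    if shift_amount == 0:
        return A.copy()
    if abs(shift_amount) >= num_cols:
        return shifted  # everything is shifted out of range
    if shift_amount > 0:
        shifted[:, shift_amount:] = A[:, :-shift_amount]
    else:
        shifted[:, :shift_amount] = A[:, -shift_amount:]
    return shifted


def conv_model(W, H):
    """Sum over template frames m of W[:, :, m] @ (H shifted right by m).

    Assumes W has shape (num_bins, num_comp, num_template_frames).
    """
    num_bins, num_comp, num_template_frames = W.shape
    V_approx = np.zeros((num_bins, H.shape[1]))
    for m in range(num_template_frames):
        V_approx += W[:, :, m] @ shift_columns(H, m)
    return V_approx


# Toy example: one component whose template spans two frames.
H = np.array([[1.0, 0.0, 0.0, 2.0]])
W = np.zeros((2, 1, 2))
W[:, 0, 0] = [1.0, 0.0]  # first template frame excites bin 0
W[:, 0, 1] = [0.0, 1.0]  # second template frame excites bin 1
V_approx = conv_model(W, H)
print(V_approx)
```

In the toy example, the activation in the last column triggers only the first template frame, because the second frame would fall outside the matrix and is dropped by the zero-filling shift. With a single template frame, `conv_model(W[:, :, :1], H)` equals the plain matrix product `W[:, :, 0] @ H`, which mirrors the remark that the convolutive model covers standard NMF as a special case.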

The test script `run_test.py` serves as the main workhorse to test different functions. Please note that the unit tests require MATLAB to generate the reference files.

```
usage: run_test.py [-h] [-f <FunctionName>] [-m <MatlabPath>]

optional arguments:
  -h, --help            show this help message and exit
  -f <FunctionName>, --function_name <FunctionName>
                        Function name to test
  -m <MatlabPath>, --matlab_path <MatlabPath>
                        Path to matlab binary file
```

Here is an example call that shows how to run the unit test for the core NMFD function.

`python run_test.py -f NMFD -m /usr/local/MATLAB/R2019a/bin/matlab`
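The usage message above has the shape produced by Python's standard `argparse` module. As a rough sketch of how such an interface could be declared (an assumption for illustration, not the actual contents of `run_test.py`), the following parser accepts the same two options and parses the example call:

```python
import argparse


def build_parser():
    """Build a parser mirroring the options in the usage message.

    Hypothetical reconstruction; the real run_test.py may differ.
    """
    parser = argparse.ArgumentParser(prog='run_test.py')
    parser.add_argument('-f', '--function_name', metavar='<FunctionName>',
                        help='Function name to test')
    parser.add_argument('-m', '--matlab_path', metavar='<MatlabPath>',
                        help='Path to matlab binary file')
    return parser


# Parse the arguments of the example call shown above.
args = build_parser().parse_args(
    ['-f', 'NMFD', '-m', '/usr/local/MATLAB/R2019a/bin/matlab'])
print(args.function_name)
print(args.matlab_path)
```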

**This is the accompanying website for [1], where further details on the toolbox, dataset, and the applications are discussed.**

[1] Patricio López-Serrano, Christian Dittmar, Yiğitcan Özer, and Meinard Müller. **NMF Toolbox: Music Processing Applications of Nonnegative Matrix Factorization**. In Proceedings of the International Conference on Digital Audio Effects (DAFx), Birmingham, UK, 2019.

[2] Paris Smaragdis. **Non-negative Matrix Factor Deconvolution; Extraction of Multiple Sound Sources from Monophonic Inputs**. In Proceedings of the International Conference on Independent Component Analysis and Blind Signal Separation (ICA), pp. 494–499, Granada, Spain, 2004.

[3] Meinard Müller. **Fundamentals of Music Processing**. Springer Verlag, ISBN: 978-3-319-21944-8, 2015.

[4] Daniel D. Lee and H. Sebastian Seung. **Learning the parts of objects by non-negative matrix factorization**. Nature, 401(6755): 788–791, 1999.

[5] Jonathan Driedger, Thomas Prätzlich, and Meinard Müller. **Let It Bee – Towards NMF-Inspired Audio Mosaicing**. In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), pp. 350–356, Málaga, Spain, 2015.

[6] Andrzej Cichocki, Rafal Zdunek, Anh Huy Phan, and Shun-ichi Amari. **Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and Blind Source Separation**. John Wiley and Sons, 2009.

[7] Christian Dittmar and Meinard Müller. **Reverse Engineering the Amen Break – Score-Informed Separation and Restoration Applied to Drum Recordings**. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(9): 1531–1543, 2016. DOI: 10.1109/TASLP.2016.2567645

[8] Jonathan Driedger, Harald Grohganz, Thomas Prätzlich, Sebastian Ewert, and Meinard Müller. **Score-Informed Audio Decomposition and Applications**. In Proceedings of the ACM International Conference on Multimedia (ACM-MM), pp. 541–544, Barcelona, Spain, 2013. Details: https://www.audiolabs-erlangen.de/resources/2013-ACMMM-AudioDecomp/

[9] Christian Dittmar, Patricio López-Serrano, and Meinard Müller. **Unifying Local and Global Methods for Harmonic-Percussive Source Separation**. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 176–180, Calgary, Canada, 2018. Demo: https://www.audiolabs-erlangen.de/resources/MIR/2018-ICASSP-HPSS_KAM_NMF

[10] Daniel W. Griffin and Jae S. Lim. **Signal estimation from modified short-time Fourier transform**. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2): 236–243, 1984.

[11] Antoine Liutkus and Roland Badeau. **Generalized Wiener filtering with fractional power spectrograms**. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 266–270, Brisbane, Australia, 2015.

[12] Christian Dittmar, Jonathan Driedger, Meinard Müller, and Jouni Paulus. **An Experimental Approach to Generalized Wiener Filtering in Music Source Separation**. In Proceedings of the European Signal Processing Conference (EUSIPCO), Budapest, Hungary, 2016.