Let it Bee - Towards NMF-inspired Audio Mosaicing

This is the accompanying website for the paper "Let it Bee - Towards NMF-inspired Audio Mosaicing" by Jonathan Driedger, Thomas Prätzlich, and Meinard Müller.

  1. Jonathan Driedger, Thomas Prätzlich and Meinard Müller
    Let It Bee — Towards NMF-Inspired Audio Mosaicing
    In Proceedings of the International Conference on Music Information Retrieval (ISMIR): 350–356, 2015. PDF
    @inproceedings{DriedgerPM15_AudioMosaicingNMF_ISMIR,
    author    = {Jonathan Driedger and Thomas Pr{\"a}tzlich and Meinard M{\"u}ller},
    title     = {{L}et {I}t {B}ee -- Towards {NMF}-Inspired Audio Mosaicing},
    booktitle = {Proceedings of the International Conference on Music Information Retrieval ({ISMIR})},
    address   = {Malaga, Spain},
    year      = {2015},
    pages     = {350--356},
    url-pdf   = {2015_DriedgerPM_AudioMosaicingNMF_ISMIR.pdf},
    }

Abstract

A swarm of bees buzzing "Let it be" by the Beatles or the wind gently howling the romantic "Gute Nacht" by Schubert - these are examples of audio mosaics as we want to create them. Given a target and a source recording, the goal of audio mosaicing is to generate a mosaic recording that conveys musical aspects (like melody and rhythm) of the target, using sound components taken from the source. In this work, we propose a novel approach for automatically generating audio mosaics with the objective of preserving the timbre of the source in the mosaic. Inspired by algorithms for non-negative matrix factorization (NMF), our idea is to use update rules to learn an activation matrix that, when multiplied with the spectrogram of the source recording, yields a matrix resembling the spectrogram of the target recording. However, when applying the original NMF procedure, the resulting mosaic does not adequately reflect the timbre of the source. As our main technical contribution, we propose an extended set of update rules for the iterative learning procedure that supports the development of sparse diagonal structures in the activation matrix. We show how these structures better retain timbral characteristics of the source in the resulting mosaic.
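To make the baseline idea concrete, here is a minimal sketch of the "original NMF procedure" mentioned above: the source spectrogram is held fixed as a template matrix W, and only the activation matrix H is learned so that W @ H approximates the target spectrogram V. The sketch assumes magnitude spectrograms stored as NumPy arrays and uses the standard multiplicative update for the Kullback-Leibler divergence. The function names, the iteration count, and the choice of divergence are assumptions made for illustration. The extended update rules that encourage sparse diagonal structures are not reproduced here.

```python
import numpy as np

def learn_activations(V, W, n_iter=50, eps=1e-12, seed=0):
    """Learn non-negative activations H such that V ≈ W @ H, with W fixed.

    V: (F, N) target magnitude spectrogram.
    W: (F, K) source magnitude spectrogram, one template per source frame.
    Returns H with shape (K, N).
    """
    rng = np.random.default_rng(seed)
    # Random positive initialization (illustrative choice).
    H = rng.random((W.shape[1], V.shape[1])) + eps
    # Column sums of W, i.e. W^T applied to an all-ones matrix.
    col_sums = W.sum(axis=0)[:, None] + eps
    for _ in range(n_iter):
        # Standard multiplicative KL-divergence update for H only.
        ratio = V / (W @ H + eps)
        H *= (W.T @ ratio) / col_sums
    return H

def kl_divergence(V, A, eps=1e-12):
    """Generalized Kullback-Leibler divergence between V and its approximation A."""
    return float(np.sum(V * np.log((V + eps) / (A + eps)) - V + A))
```

With V and W computed from the target and source recordings, `learn_activations(V, W)` returns an activation matrix whose product with W can serve as the magnitude spectrogram of a mosaic. As the abstract notes, this plain procedure alone does not adequately preserve the timbre of the source.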

4.3 Audio Examples

Below are the audio examples corresponding to Table 1 in the paper. You can also download all files as [.zip].

Mosaic of LetItBe with Bees

[Target] LetItBe
[Source] Bees

Mosaic of GuteNacht with Wind

[Target] GuteNacht
[Source] Wind

Mosaic of FunkJazz with Whales

[Target] FunkJazz
[Source] Whales

Mosaic of Stepdad with Chainsaw

[Target] Stepdad
[Source] Chainsaw

Mosaic of Freischütz with AirRaid

[Target] Freischütz
[Source] AirRaid

Mosaic of Vermont with RaceCars

[Target] Vermont
[Source] RaceCars

HAMR

The core ideas of this work were born at the "Hacking Audio and Music Research" (HAMR) hackathon held at ISMIR 2014 in Taipei. Check out our project from back then here.
