Multi-Layer Analysis and Structuring of Music Signals (METRUM)

In the METRUM project, we developed robust and adaptive algorithms for analyzing and structuring music signals in the presence of acoustical and musical variabilities. The project was funded by the German Research Foundation. On this website, we summarize the project's main outcomes while providing links to project-related resources (data, demonstrators, websites) and publications.

Project Description

Multi-Layer Analysis and Structuring of Music Signals

Because music is so diverse in form and content, the automated processing of music signals poses major challenges. In the METRUM project, we developed robust and adaptive algorithms for analyzing and structuring music signals in the presence of acoustical and musical variabilities. One main innovation of the METRUM project was a multi-layered analysis and structuring approach that simultaneously considers different aspects such as time, rhythm, dynamics, harmony, and timbre. In addition to these aspects, we exploited the fact that a piece of music is often available in numerous interpretations. Considering these aspects and interpretations together stabilized the automatic analysis and segmentation results. To ensure practical relevance and sustainability, we developed user interfaces for multimodal navigation in music databases based on different structuring criteria, in cooperation with the Beethoven-Haus Bonn and the Saar University of Music. One such interface was implemented for the Digital Beethoven House and made accessible to both the general museum public and a specialist audience.

Project-Related Resources and Demonstrators

The following list provides an overview of the most important publicly accessible resources created in the METRUM project:

Project-Related Publications

The following publications reflect the main scientific contributions of the work carried out in the METRUM project.

  1. Balaji Thoshkahna, Meinard Müller, Venkatesh Kulkarni, and Nanzhu Jiang
    Novel Audio Features for Capturing Tempo Salience in Music Recordings
    In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP): 181–185, 2015. PDF
    @inproceedings{ThoshkahnaMKJ15_TempoSalience_ICASSP,
    author    = {Balaji Thoshkahna and Meinard M{\"u}ller and Venkatesh Kulkarni and Nanzhu Jiang},
    title     = {Novel Audio Features for Capturing Tempo Salience in Music Recordings},
    booktitle  = {Proceedings of the {IEEE} International Conference on Acoustics, Speech, and Signal Processing ({ICASSP})},
    address  = {Brisbane, Australia},
    year       = {2015},
    pages    = {181--185},
    url-pdf   = {2015_ThoshkahnaMVJ_TempoClarity_ICASSP.pdf}
    }
  2. Harald Grohganz, Michael Clausen, Nanzhu Jiang, and Meinard Müller
    Converting path structures into block structures using eigenvalue decompositions of self-similarity matrices
    In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 209–214, 2013. PDF
    @inproceedings{GrohganzCJM13_BlockStructureSSM_ISMIR,
    author    = {Harald Grohganz and Michael Clausen and Nanzhu Jiang and Meinard M{\"u}ller},
    title     = {Converting path structures into block structures using eigenvalue decompositions of self-similarity matrices},
    booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})},
    address   = {Curitiba, Brazil},
    year      = {2013},
    pages     = {209--214},
    url-pdf   = {2013_GrohganzCJM_PathBlock_ISMIR.pdf}
    }
  3. Peter Grosche, Meinard Müller, and Joan Serrà
    Towards Cover Group Thumbnailing
    In Proceedings of the ACM International Conference on Multimedia (ACM-MM): 613–616, 2013. PDF
    @inproceedings{GroscheMS13_CoverGroupThumb_ACM-MM,
    author    = {Peter Grosche and Meinard M{\"u}ller and Joan Serr{\`a}},
    title     = {Towards Cover Group Thumbnailing},
    booktitle = {Proceedings of the {ACM} International Conference on Multimedia ({ACM-MM})},
    address   = {Barcelona, Spain},
    year      = {2013},
    pages     = {613--616},
    url-pdf   = {2013_GroscheMuellerSerra_CoverThumbnailing_ACM-MM.pdf}
    }
  4. Nanzhu Jiang and Meinard Müller
    Automated methods for analyzing music recordings in sonata form
    In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 595–600, 2013. PDF
    @inproceedings{JiangMueller13_SonataForm_ISMIR,
    author    = {Nanzhu Jiang and Meinard M{\"u}ller},
    title     = {Automated methods for analyzing music recordings in sonata form},
    booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})},
    address   = {Curitiba, Brazil},
    year      = {2013},
    pages     = {595--600},
    url-pdf   = {2013_JiangMueller_StructureSonataBeethoven_ISMIR.pdf}
    }
  5. Nanzhu Jiang and Meinard Müller
    Towards Efficient Audio Thumbnailing
    In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP): 5192–5196, 2014. PDF
    @inproceedings{JiangM14_ThumbnailEfficient_ICASSP,
    author     = {Nanzhu Jiang and Meinard M{\"u}ller},
    title      = {Towards Efficient Audio Thumbnailing},
    booktitle  = {Proceedings of the {IEEE} International Conference on Acoustics, Speech, and Signal Processing ({ICASSP})},
    address  = {Florence, Italy},
    pages    = {5192--5196},
    year       = {2014},
    url-pdf   = {2014_JiangMueller_ScapePlotMultiRes_ICASSP.pdf}
    }
  6. Nanzhu Jiang and Meinard Müller
    Estimating Double Thumbnails for Music Recordings
    In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP): 146–150, 2015. PDF
    @inproceedings{JiangM15_DoubleThumbnail_ICASSP,
    author    = {Nanzhu Jiang and Meinard M{\"u}ller},
    title     = {Estimating Double Thumbnails for Music Recordings},
    booktitle  = {Proceedings of the {IEEE} International Conference on Acoustics, Speech, and Signal Processing ({ICASSP})},
    address  = {Brisbane, Australia},
    year       = {2015},
    pages    = {146--150},
    url-pdf   = {2015_JiangMueller_JointThumb_ICASSP.pdf}
    }
  7. Meinard Müller, Peter Grosche, and Nanzhu Jiang
    A Segment-Based Fitness Measure for Capturing Repetitive Structures of Music Recordings
    In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 615–620, 2011. PDF Details
    @inproceedings{MuellerGJ11_MusicStructureFitness_ISMIR,
    author    = {Meinard M{\"u}ller and Peter Grosche and Nanzhu Jiang},
    title     = {A Segment-Based Fitness Measure for Capturing Repetitive Structures of Music Recordings},
    booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})},
    address = {Miami, Florida, USA},
    year      = {2011},
    pages     = {615--620},
    url-pdf   = {2011_MuellerGroscheJiang_AudioStructure_ISMIR.pdf},
    url-details = {https://www.audiolabs-erlangen.de/resources/MIR/SMtoolbox}
    }
  8. Meinard Müller and Nanzhu Jiang
    A scape plot representation for visualizing repetitive structures of music recordings
    In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 97–102, 2012. PDF
    @inproceedings{MuellerJiang12_ScapePlot_ISMIR,
    author    = {Meinard M{\"u}ller and Nanzhu Jiang},
    title     = {A scape plot representation for visualizing repetitive structures of music recordings},
    booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})},
    address   = {Porto, Portugal},
    year      = {2012},
    pages     = {97--102},
    url-pdf   = {2012_MuellerJiang_StructureVisualization_ISMIR.pdf}
    }
  9. Meinard Müller, Nanzhu Jiang, and Harald Grohganz
    SM Toolbox: MATLAB implementations for computing and enhancing similarity matrices
    In Proceedings of the 53rd AES Conference on Semantic Audio, 2014. PDF Details
    @inproceedings{MuellerJG14_SM-Toolbox_AES,
    author    = {Meinard M{\"u}ller and Nanzhu Jiang and Harald Grohganz},
    title     = {{SM} {T}oolbox: {MATLAB} implementations for computing and enhancing similarity matrices},
    booktitle = {Proceedings of the 53rd {AES} Conference on Semantic Audio},
    address   = {London, UK},
    year      = {2014},
    url-pdf   = {2014_MuellerJiangGrohganz_ToolboxSM_AES.pdf},
    url-details = {https://www.audiolabs-erlangen.de/resources/MIR/SMtoolbox}
    }
  10. Meinard Müller, Nanzhu Jiang, Harald Grohganz, and Michael Clausen
    Strukturanalyse für Musiksignale
    In Proceedings of the GI Jahrestagung: 2943–2957, 2013. PDF Demo
    @inproceedings{MuellerDE13_Strukturanalyse_GI,
    author    = {Meinard M{\"u}ller and Nanzhu Jiang and Harald Grohganz and Michael Clausen},
    title     = {{S}trukturanalyse f{\"u}r {M}usiksignale},
    booktitle = {Proceedings of the GI Jahrestagung},
    address   = {Koblenz, Germany},
    year      = {2013},
    pages     = {2943--2957},
    url-pdf   = {2013_MuellerJiangGrohganzClausen_AudioStruktur_GI.pdf},
    url-demo = {https://www.audiolabs-erlangen.de/resources/MIR/SMtoolbox}
    }
  11. Meinard Müller, Nanzhu Jiang, and Peter Grosche
    A Robust Fitness Measure for Capturing Repetitions in Music Recordings With Applications to Audio Thumbnailing
    IEEE Transactions on Audio, Speech, and Language Processing, 21(3): 531–543, 2013. PDF Demo
    @article{MuellerJG13_StructureAnaylsis_IEEE-TASLP,
    author    = {Meinard M{\"u}ller and Nanzhu Jiang and Peter Grosche},
    title     = {A Robust Fitness Measure for Capturing Repetitions in Music Recordings With Applications to Audio Thumbnailing},
    journal   = {IEEE Transactions on Audio, Speech, and Language Processing},
    volume    = {21},
    number    = {3},
    year      = {2013},
    pages     = {531--543},
    url-pdf   = {https://ieeexplore.ieee.org/document/6353546},
    url-demo = {https://www.audiolabs-erlangen.de/resources/MIR/SMtoolbox}
    }
  12. Meinard Müller, Thomas Prätzlich, and Jonathan Driedger
    A cross-version approach for stabilizing tempo-based novelty detection
    In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 427–432, 2012. PDF
    @inproceedings{MuellerPD12_TempoCrossVersion_ISMIR,
    author    = {Meinard M{\"u}ller and Thomas Pr{\"a}tzlich and Jonathan Driedger},
    title     = {A cross-version approach for stabilizing tempo-based novelty detection},
    booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})},
    address   = {Porto, Portugal},
    year      = {2012},
    pages     = {427--432},
    url-pdf   = {2012_MuellerPraetzlichDriedger_TempoCrossVersion_ISMIR.pdf}
    }
  13. Joan Serrà, Meinard Müller, Peter Grosche, and Josep Lluís Arcos
    Unsupervised detection of music boundaries by time series structure features
    In Proceedings of the AAAI International Conference on Artificial Intelligence, 2012. PDF
    @inproceedings{SerraMGA12_BoundaryDetection_AAAI,
    author      = {Joan Serr{\`a} and Meinard M{\"u}ller and Peter Grosche and Josep Llu\'{\i}s Arcos},
    title       = {Unsupervised detection of music boundaries by time series structure features},
    booktitle   = {Proceedings of the AAAI International Conference on Artificial Intelligence},
    address     = {Toronto, Ontario, Canada},
    year        = {2012},
    ee        = {http://www.aaai.org/ocs/index.php/AAAI/AAAI12/paper/view/4907},
    url-pdf   = {2012_SerraMGA_TimeSeriesStructureFeature_AAAI.pdf}
    }
  14. Joan Serrà, Meinard Müller, Peter Grosche, and Josep Ll. Arcos
    Unsupervised Music Structure Annotation by Time Series Structure Features and Segment Similarity
    IEEE Transactions on Multimedia, 16(5): 1229–1240, 2014. PDF DOI
    @article{SerraMFA14_AudioStructure_IEEE-TMM,
    author    = {Joan Serr{\`a} and Meinard M{\"u}ller and Peter Grosche and Josep Ll. Arcos},
    title     = {Unsupervised Music Structure Annotation by Time Series Structure Features and Segment Similarity},
    journal   = {IEEE Transactions on Multimedia},
    volume    = {16},
    number    = {5},
    year      = {2014},
    pages     = {1229--1240},
    doi =       {10.1109/TMM.2014.2310701},
    url-pdf   = {https://ieeexplore.ieee.org/document/6763101}
    }

Project-Related Ph.D. Theses

  1. Nanzhu Jiang
    Repetition-based Structure Analysis of Music Recordings
    PhD Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg, 2015. PDF
    @phdthesis{Jiang15_RepetitionStructureAnalysisMusic_PhD,
    author      = {Nanzhu Jiang},
    year        = {2015},
    title       = {Repetition-based Structure Analysis of Music Recordings},
    school      = {Friedrich-Alexander-Universit{\"a}t Erlangen-N{\"u}rnberg},
    url-pdf = {2015_Jiang_StructureAnalysis_PhD-Thesis.pdf}
    }
  2. Harald G. Grohganz
    Algorithmen zur strukturellen Analyse von Musikaufnahmen
    PhD Thesis, Rheinische Friedrich-Wilhelms-Universität Bonn, 2016. PDF Details
    @phdthesis{Grohganz15_StrukturAnalyseMusik_PhD,
    author      = {Harald G. Grohganz},
    year        = {2016},
    title       = {Algorithmen zur strukturellen Analyse von Musikaufnahmen},
    school      = {Rheinische Friedrich-Wilhelms-Universit{\"a}t Bonn},
    url-details = {https://bonndoc.ulb.uni-bonn.de/xmlui/handle/20.500.11811/6434},
    url-pdf = {2015_Grohganz_StructureAnalysis_PhD-Thesis.pdf}
    }