Evaluating Salience Representations for Cross-Modal Retrieval of Western Classical Music Recordings

This is the accompanying website for the following paper:

  1. Frank Zalkow, Stefan Balke, and Meinard Müller
    Evaluating Salience Representations for Cross-Modal Retrieval of Western Classical Music Recordings
    In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP): 311–335, 2019.
    @inproceedings{ZalkowBM19_SalienceRetrieval_ICASSP,
    author      = {Frank Zalkow and Stefan Balke and Meinard M{\"u}ller},
    title       = {Evaluating Salience Representations for Cross-Modal Retrieval of Western Classical Music Recordings},
    booktitle   = {Proceedings of the {IEEE} International Conference on Acoustics, Speech, and Signal Processing ({ICASSP})},
    address     = {Brighton, United Kingdom},
    year        = {2019},
    pages       = {311--335},
    url         = {https://ieeexplore.ieee.org/document/8683609},
    url-pdf     = {https://ieeexplore.ieee.org/document/8683609},
    url-details = {https://www.audiolabs-erlangen.de/resources/MIR/2019-ICASSP-BarlowMorgenstern/},
    url-presentation = {https://www.audiolabs-erlangen.de/content/05-fau/assistant/00-zalkow/01-publications/2019_poster_ZalkowBM_SalienceThemeRetrieval_ICASSP.pdf}
    }

Abstract

In this paper, we consider a cross-modal retrieval scenario of Western classical music. Given a short monophonic musical theme in symbolic notation as query, the objective is to find relevant audio recordings in a database. A major challenge of this retrieval task is the possible difference in the degree of polyphony between the monophonic query and the music recordings. Previous studies for popular music addressed this issue by performing the cross-modal comparison based on predominant melodies extracted from the recordings. For Western classical music, however, this approach is problematic since the underlying assumption of a single predominant melody is often violated. Instead of extracting the melody explicitly, another strategy is to perform the cross-modal comparison directly on the basis of melody-enhanced salience representations. As the main contribution of this paper, we evaluate several conceptually different salience representations for our cross-modal retrieval scenario. Our extensive experimental results, which have been made available on a website, comprise more than 2000 musical themes and 100 hours of audio recordings.

Barlow Morgenstern Overview and Retrieval Results

The following table displays all queries of the set BM-Medium with metadata and the ranks achieved by several feature representations. It contains the following columns:

ComposerID
    Identifier for the composer, usually the surname (without spaces or special characters)
WorkID
    Identifier for the work, usually relating to the composer's catalogue of works
PerformanceID
    Identifier for the performer, usually a chosen “main performer”, e.g., the pianist in the case of a solo piano performance or the conductor in the case of an orchestral performance
BM_Theme_ID
    Identifier of the theme as specified in [1]
\(\mathcal{C}_{\mathrm{IIR}}\), etc.
    Rank achieved with the corresponding feature representation

A detailed view with images of the feature representations is accessible by clicking on the icon. Columns can be sorted by clicking on the corresponding header cell. The search form matches substrings across all columns.
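
The rank columns lend themselves to simple summary measures. The following is a minimal sketch, not the evaluation code of the paper. It assumes that the table has been exported to rows carrying the columns listed above, that each query has one relevant recording, and that rank 1 is the best possible rank. The rows, the identifiers, the rank values, and the column key `C_IIR` (standing in for \(\mathcal{C}_{\mathrm{IIR}}\)) are invented for illustration and are not results from this website.

```python
from statistics import mean

# Hypothetical rows mirroring the table columns described above.
# All identifiers and rank values are invented placeholders.
rows = [
    {"ComposerID": "ComposerA", "WorkID": "Work1", "PerformanceID": "PerfX",
     "BM_Theme_ID": "ThemeA", "C_IIR": 1},
    {"ComposerID": "ComposerA", "WorkID": "Work2", "PerformanceID": "PerfX",
     "BM_Theme_ID": "ThemeB", "C_IIR": 3},
    {"ComposerID": "ComposerB", "WorkID": "Work1", "PerformanceID": "PerfY",
     "BM_Theme_ID": "ThemeC", "C_IIR": 12},
    {"ComposerID": "ComposerC", "WorkID": "Work5", "PerformanceID": "PerfZ",
     "BM_Theme_ID": "ThemeD", "C_IIR": 1},
]


def top_k_rate(ranks, k):
    """Fraction of queries whose relevant recording is ranked within the top k."""
    return sum(r <= k for r in ranks) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries (assumes rank 1 is best)."""
    return mean(1 / r for r in ranks)


# Collect the ranks of one feature representation across all queries.
ranks = [row["C_IIR"] for row in rows]

summary = {f"top-{k}": top_k_rate(ranks, k) for k in (1, 5, 10)}
summary["MRR"] = mean_reciprocal_rank(ranks)
print(summary)
```

Repeating the last step for each rank column would allow the feature representations to be compared on the same set of queries.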

Statistics for BM-Medium

Number of Queries per Composer

Duration of Queries

Duration of Database Documents

Further Links

Two selected examples

References

  1. Harold Barlow and Sam Morgenstern
    A Dictionary of Musical Themes
    Crown Publishers, Inc., 1975.
    @book{BarlowM75_MusicalThemes_BOOK,
    author    = {Harold Barlow and Sam Morgenstern},
    title     = {A Dictionary of Musical Themes},
    edition   = {Revised edition},
    publisher = {Crown Publishers, Inc.},
    year      = {1975}
    }
  2. Stefan Balke, Vlora Arifi-Müller, Lukas Lamprecht, and Meinard Müller
    Retrieving Audio Recordings Using Musical Themes
    In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP): 281–285, 2016.
    @inproceedings{BalkeALM16_BarlowRetrieval_ICASSP,
    author    = {Stefan Balke and Vlora Arifi-M{\"u}ller and Lukas Lamprecht and Meinard M{\"u}ller},
    title     = {Retrieving Audio Recordings Using Musical Themes},
    booktitle = {Proceedings of the {IEEE} International Conference on Acoustics, Speech, and Signal Processing ({ICASSP})},
    address   = {Shanghai, China},
    year      = {2016},
    pages     = {281--285},
    }
  3. Frank Zalkow, Stefan Balke, and Meinard Müller
    Evaluating Salience Representations for Cross-Modal Retrieval of Western Classical Music Recordings
    In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP): 311–335, 2019.
    @inproceedings{ZalkowBM19_SalienceRetrieval_ICASSP,
    author      = {Frank Zalkow and Stefan Balke and Meinard M{\"u}ller},
    title       = {Evaluating Salience Representations for Cross-Modal Retrieval of Western Classical Music Recordings},
    booktitle   = {Proceedings of the {IEEE} International Conference on Acoustics, Speech, and Signal Processing ({ICASSP})},
    address     = {Brighton, United Kingdom},
    year        = {2019},
    pages       = {311--335},
    url         = {https://ieeexplore.ieee.org/document/8683609},
    url-pdf     = {https://ieeexplore.ieee.org/document/8683609},
    url-details = {https://www.audiolabs-erlangen.de/resources/MIR/2019-ICASSP-BarlowMorgenstern/},
    url-presentation = {https://www.audiolabs-erlangen.de/content/05-fau/assistant/00-zalkow/01-publications/2019_poster_ZalkowBM_SalienceThemeRetrieval_ICASSP.pdf}
    }