Source Separation and Restoration of Sound Components in Music Recordings (SeReCo)

The goal of the SeReCo2 project is to develop matrix decomposition and source separation techniques for decomposing a music recording into musically meaningful (e.g., note-related) sound components. The project is funded by the German Research Foundation. On this website, we summarize the project's main objectives and provide links to project-related resources (data, demonstrators, websites) and publications.

Project Description


This is a follow-up project that continues the previous DFG-funded project "Source Separation and Restoration of Drum Sound Components in Music Recordings" [MU 2686/10-1], which aimed at developing techniques for separating and restoring sound events as they occur in complex music recordings. In the first phase ([MU 2686/10-1]), we focused on percussive sound sources, where we decomposed a drum recording into individual drum sound events. Using Non-Negative Matrix Factor Deconvolution (NMFD) as our central methodology, we studied how to generate and integrate audio- and score-based side information to guide the decomposition. We tested our approaches within concrete application scenarios, including audio remixing (redrumming) and swing ratio analysis of jazz music.

In the second phase of the project ([MU 2686/10-2]), our goals are significantly extended. First, we want to go beyond the drum scenario by considering other challenging music scenarios, including piano music (e.g., Beethoven Sonatas, Chopin Mazurkas), piano songs (e.g., Klavierlieder by Schubert), and string music (e.g., Beethoven String Quartets). In these scenarios, our goal is to decompose a music recording into individual note-related sound events. As our central methodology, we develop a unifying audio decomposition framework that combines classical signal processing and machine learning with recent deep learning (DL) approaches. Furthermore, we adopt generative DL techniques for improving the perceptual quality of restored sound events. As a general goal, we investigate how prior knowledge, such as score information, can be integrated into DL-based learning to improve the interpretability of the trained models.
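To illustrate the idea of guiding a non-negative decomposition with score-based side information, the following is a minimal sketch only. It uses plain NMF with multiplicative updates for the KL divergence rather than the NMFD variant named above, and it is not the project's implementation. The function names `nmf_kl` and `kl_divergence`, the toy "spectrogram", and the binary score mask are all hypothetical and chosen for illustration. The sketch relies on one well-known property: activations initialized to zero stay zero under multiplicative updates, so a score that says "this note cannot sound here" is respected throughout.

```python
import numpy as np


def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence D(V || WH) for non-negative matrices."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))


def nmf_kl(V, W_init, H_init, n_iter=200, eps=1e-12):
    """Plain NMF with multiplicative updates (KL divergence).

    Entries of H_init that are exactly zero remain zero, which is how
    the (hypothetical) score mask constrains the decomposition.
    """
    W, H = W_init.copy(), H_init.copy()
    ones = np.ones_like(V)
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T @ ones + eps)   # update activations
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (ones @ H.T + eps)   # update templates
    return W, H


# Synthetic toy data (assumption): 2 note templates, 6 bins, 8 frames.
W_true = np.array([[1.0, 0.0], [0.5, 0.2], [0.0, 1.0],
                   [0.3, 0.0], [0.0, 0.6], [0.1, 0.1]])
H_true = np.array([[1.0, 0.8, 0.6, 0.4, 0.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 0.9, 0.7, 0.5, 0.3, 0.2]])
V = W_true @ H_true

# Score mask (assumption): 1 where the score allows a note to be active.
mask = (H_true > 0).astype(float)

rng = np.random.default_rng(0)
W0 = rng.random(W_true.shape) + 0.1
H0 = (rng.random(H_true.shape) + 0.1) * mask  # zero outside score regions

kl_before = kl_divergence(V, W0 @ H0)
W, H = nmf_kl(V, W0, H0)
kl_after = kl_divergence(V, W @ H)

print(f"KL divergence before: {kl_before:.4f}, after: {kl_after:.4f}")
```

In this toy setting, each row of `H` can be read as the activation of one note, and `np.outer(W[:, k], H[k])` gives the spectrogram-like contribution of component `k`. Real recordings would require a time-frequency transform and, for drum sounds, the convolutive NMFD model instead of plain NMF.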



Source Separation and Restoration of Sound Components in Music Recordings

This project is a continuation of [MU 2686/10-1] and aims to develop techniques for separating and restoring sound events as they occur in complex music recordings. In the first phase ([MU 2686/10-1]), we concentrated on separating drum recordings into individual drum sound components. Using non-negative matrix decomposition techniques, we systematically investigated how audio- and score-based side information can be generated, integrated, and exploited to guide the decomposition. Our methods were tested in concrete application scenarios such as audio remixing (redrumming) and swing analysis of jazz music. In the second project phase ([MU 2686/10-2]), we extend our goals considerably. First, we go beyond the drum scenario by considering other complex music scenarios, including piano music (e.g., Beethoven sonatas, Chopin mazurkas), piano songs (e.g., by Schubert), and string music (e.g., Beethoven string quartets). In these scenarios, our goal is to decompose a music recording into individual note-related sound events. As our central methodology, we combine classical signal processing and machine learning techniques with recent deep learning (DL) approaches. Furthermore, we develop generative DL-based methods to improve the perceptual quality of the separated sound events. As an overarching goal, we address the question of how musical prior knowledge can be integrated into DL-based learning procedures in order to improve the interpretability of the trained models.

Project-Related Resources and Demonstrators

The following list provides an overview of the most important publicly accessible resources created in the SeReCo2 project:

Project-Related Publications

The following publications reflect the main scientific contributions of the work carried out in the SeReCo2 project.

  1. Yigitcan Özer, Hans-Ulrich Berendes, Vlora Arifi-Müller, Fabian-Robert Stöter, and Meinard Müller
    Notewise Evaluation for Music Source Separation: A Case Study for Separated Piano Tracks
    In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR) (to appear), 2024. Demo
  2. Yigitcan Özer, Leo Brütting, Simon Schwär, and Meinard Müller
    libsoni: A Python Toolbox for Sonifying Music Annotations and Feature Representations
    Journal of Open Source Software (JOSS), 9(96): 1–6, 2024. PDF Demo DOI
  3. Yigitcan Özer and Meinard Müller
    Source Separation of Piano Concertos Using Musically Motivated Augmentation Techniques
    IEEE/ACM Transactions on Audio, Speech, and Language Processing, 32: 1214–1225, 2024. PDF Details Demo DOI
  4. Yigitcan Özer, Simon Schwär, Vlora Arifi-Müller, Jeremy Lawrence, Emre Sen, and Meinard Müller
    Piano Concerto Dataset (PCD): A Multitrack Dataset of Piano Concertos
    Transactions of the International Society for Music Information Retrieval (TISMIR), 6(1): 75–88, 2023. PDF Details Demo DOI
  5. Nazif Can Tamer, Yigitcan Özer, Meinard Müller, and Xavier Serra
    High-Resolution Violin Transcription Using Weak Labels
    In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 223–230, 2023. PDF Details DOI
  6. Yigitcan Özer and Meinard Müller
    A Computational Approach for Creating Orchestra Tracks from Piano Concerto Recordings
    In Proceedings of the Deutsche Jahrestagung für Akustik (DAGA): 1370–1373, 2023. PDF
  7. Nazif Can Tamer, Xavier Serra, Yigitcan Özer, and Meinard Müller
    TAPE: An End-to-End Timbre-Aware Pitch Estimator
    In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP): 1–5, 2023. DOI
  8. Meinard Müller, Rachel Bittner, Juhan Nam, Michael Krause, and Yigitcan Özer
    Deep Learning and Knowledge Integration for Music Audio Analysis (Dagstuhl Seminar 22082)
    Dagstuhl Reports, 12(2): 103–133, 2022. PDF Details DOI
  9. Yigitcan Özer and Meinard Müller
    Source Separation of Piano Concertos with Test-Time Adaptation
    In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 493–500, 2022. PDF Demo DOI
  10. Yigitcan Özer, Michael Krause, and Meinard Müller
    Using the Sync Toolbox for an Experiment on High-Resolution Music Alignment
    In Demos and Late Breaking News of the International Society for Music Information Retrieval Conference (ISMIR), 2021. PDF
  11. Yigitcan Özer, Matej Istvanek, Vlora Arifi-Müller, and Meinard Müller
    Using Activation Functions for Improving Measure-Level Audio Synchronization
    In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 749–756, 2022. PDF DOI
  12. Yigitcan Özer, Jonathan Hansen, Tim Zunner, and Meinard Müller
    Investigating Nonnegative Autoencoders for Efficient Audio Decomposition
    In Proceedings of the European Signal Processing Conference (EUSIPCO): 254–258, 2022. Details
  13. Meinard Müller, Yigitcan Özer, Michael Krause, Thomas Prätzlich, and Jonathan Driedger
    Sync Toolbox: A Python Package for Efficient, Robust, and Accurate Music Synchronization
    Journal of Open Source Software (JOSS), 6(64): 1–4, 2021. PDF Demo DOI

Project-Related Ph.D. Theses

  1. Yigitcan Özer
    Source Separation of Piano Music Recordings
    PhD Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), 2024. PDF Details
