libtsm - a Python Library for Time-Scale Modification and Pitch-Shifting

This notebook demonstrates the functionality of libtsm, a Python library for time-scale modification (TSM) and pitch-shifting. libtsm is based on a re-implementation of the Matlab TSM toolbox by Jonathan Driedger and Meinard Müller.

If you use libtsm in your work, please cite:

Sebastian Rosenzweig, Simon Schwär, Jonathan Driedger, and Meinard Müller:
Adaptive Pitch-Shifting with Applications to Intonation Adjustment in A Cappella Recordings
Proceedings of the International Conference on Digital Audio Effects (DAFx), 2021.

Further contributors:

  • Edgar Suarez
  • El Mehdi Lemnaouar
  • Miguel Gonzales

In [1]:
import numpy as np
import librosa
import IPython.display as ipd
import scipy.io as sio

import libtsm
In [2]:
# Choose File
#filename = 'Bongo'
#filename = 'BeethovenOrchestra'
#filename = 'BeethovenPiano'
filename = 'CastanetsViolin'
#filename = 'DrumSolo'
#filename = 'Glockenspiel'
#filename = 'Stepdad'
#filename = 'Jazz'
#filename = 'Pop'
#filename = 'SingingVoice'
#filename = 'SynthMono'
#filename = 'SynthPoly'
#filename = 'Scale_Cmajor_Piano'

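directory = './data/'
audio_file = directory + filename + '.wav'
# librosa.load returns a mono float signal, resampled to 22050 Hz by default
x, Fs = librosa.load(audio_file)
# Alternative test signal: 440 Hz sine with the same duration as x
#x = 0.5 * np.sin(2*np.pi*440*np.arange(0, len(x)/Fs, 1/Fs))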

print('Original signal', flush=True)
ipd.display(ipd.Audio(x, rate=Fs, normalize=True))
Original signal

Overlap-Add (OLA)

In [3]:
alpha = 1.8  # time-stretch factor
# OLA is obtained here by running WSOLA with the tolerance set to zero
y_ola = libtsm.wsola_tsm(x, alpha, tol=0)

print('Original signal', flush=True)
ipd.display(ipd.Audio(x, rate=Fs, normalize=True))

print('Time-Scale modified signal with OLA', flush=True)
ipd.display(ipd.Audio(y_ola[:, 0], rate=Fs, normalize=True))
Original signal
Time-Scale modified signal with OLA
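
The cell above obtains OLA by calling `libtsm.wsola_tsm` with `tol=0`. To make the underlying idea concrete, the following is a minimal NumPy sketch of overlap-add time stretching. It is not libtsm's implementation: the function name `ola_sketch`, the Hann window, the default hop and window sizes, and the normalization by the summed windows are all assumptions chosen for illustration. The sketch also assumes that `alpha > 1` lengthens the signal.

```python
import numpy as np


def ola_sketch(x, alpha, syn_hop=512, win_length=1024):
    """Illustrative overlap-add time stretch (hypothetical helper, not libtsm code)."""
    window = np.hanning(win_length)
    out_len = int(np.ceil(alpha * len(x)))

    # Zero-pad so that every frame can be read and written in full.
    x_pad = np.concatenate([x, np.zeros(win_length)])
    y = np.zeros(out_len + win_length)
    norm = np.zeros(out_len + win_length)

    for syn_pos in range(0, out_len, syn_hop):
        # Frames are written every syn_hop samples but read every syn_hop / alpha samples.
        ana_pos = int(round(syn_pos / alpha))
        y[syn_pos:syn_pos + win_length] += x_pad[ana_pos:ana_pos + win_length] * window
        norm[syn_pos:syn_pos + win_length] += window

    # Compensate for the summed windows; avoid dividing by (almost) zero at the edges.
    norm[norm < 1e-3] = 1.0
    return (y / norm)[:out_len]
```

Because the frames are copied without any alignment, periodic structures in the input are generally not preserved across frame boundaries, which is what the WSOLA variant in the next section addresses.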

Waveform Similarity Overlap-Add (WSOLA)

In [4]:
alpha = 1.8  # time-stretch factor
y_wsola = libtsm.wsola_tsm(x, alpha)

print('Original signal', flush=True)
ipd.display(ipd.Audio(x, rate=Fs, normalize=True))

print('Time-Scale modified signal with WSOLA', flush=True)
ipd.display(ipd.Audio(y_wsola[:, 0], rate=Fs, normalize=True))
Original signal
Time-Scale modified signal with WSOLA
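
WSOLA extends OLA by allowing each analysis frame to be shifted within a tolerance so that it lines up with the natural continuation of the previously copied frame. The sketch below shows only this search step. The helper name `best_shift`, its arguments, and the use of plain (unnormalized) cross-correlation as the similarity measure are assumptions for illustration, not libtsm's actual code.

```python
import numpy as np


def best_shift(x, nominal_pos, natural_pos, win_length, tol):
    """Find the shift in [-tol, tol] whose frame best matches the natural continuation.

    Hypothetical helper: nominal_pos is where OLA would read the next frame,
    natural_pos is where the previously copied frame would continue in x.
    """
    target = x[natural_pos:natural_pos + win_length]
    region = x[nominal_pos - tol:nominal_pos + tol + win_length]
    # 'valid' mode yields one similarity score per candidate shift (2 * tol + 1 in total).
    scores = np.correlate(region, target, mode='valid')
    return int(np.argmax(scores)) - tol
```

With `tol=0` there is only one candidate and the function always returns 0, which matches the way the OLA cell above is expressed as WSOLA with zero tolerance.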

Phase Vocoder TSM

In [5]:
alpha = 1.8  # time-stretch factor
y_pv = libtsm.pv_tsm(x, alpha, phase_locking=False)
y_pvpl = libtsm.pv_tsm(x, alpha, phase_locking=True)

print('Original signal', flush=True)
ipd.display(ipd.Audio(x, rate=Fs, normalize=True))

print('Time-Scale modified signal with Phase Vocoder', flush=True)
ipd.display(ipd.Audio(y_pv[:, 0], rate=Fs, normalize=True))

print('Time-Scale modified signal with Phase Vocoder (phase locking)', flush=True)
ipd.display(ipd.Audio(y_pvpl[:, 0], rate=Fs, normalize=True))
Original signal
Time-Scale modified signal with Phase Vocoder
Time-Scale modified signal with Phase Vocoder (phase locking)
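
A phase vocoder works on the STFT and has to adjust the phases when frames are moved to new time positions. The following sketch shows the textbook phase-propagation step for a single frequency bin: the instantaneous frequency is estimated from the phase advance between two analysis frames and is then used to advance the synthesis phase. The function names and arguments are hypothetical, and the sketch makes no claim about how `libtsm.pv_tsm` or its `phase_locking` option are implemented.

```python
import numpy as np


def instantaneous_frequency(phase_prev, phase_curr, bin_freq, hop, fs):
    """Estimate the frequency (Hz) in one STFT bin from two consecutive phases (radians)."""
    dt = hop / fs
    # Phase advance expected if the sinusoid sat exactly at the bin's center frequency.
    expected_advance = 2 * np.pi * bin_freq * dt
    deviation = phase_curr - phase_prev - expected_advance
    # Wrap the deviation to the principal range [-pi, pi).
    deviation = (deviation + np.pi) % (2 * np.pi) - np.pi
    return bin_freq + deviation / (2 * np.pi * dt)


def next_synthesis_phase(phase_syn_prev, inst_freq, syn_hop, fs):
    """Advance the synthesis phase by the estimated frequency over one synthesis hop."""
    return phase_syn_prev + 2 * np.pi * inst_freq * syn_hop / fs
```

In a time stretch by `alpha`, `hop` would be the analysis hop (the synthesis hop divided by `alpha`, under the convention assumed here), so the synthesis phase advances over a longer interval than the one it was measured on.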

TSM based on Harmonic-Percussive Separation

In [6]:
alpha = 1.8  # time-stretch factor

# Harmonic-Percussive Separation
x_harm, x_perc = libtsm.hps(x)

# Phase Vocoder for harmonic part
y_harm = libtsm.pv_tsm(x_harm, alpha)

# OLA (WSOLA with zero tolerance) for percussive part
y_perc = libtsm.wsola_tsm(x_perc, alpha, tol=0)

# Synthesis: sum the time-scale modified components
y = y_harm + y_perc

print('Original signal', flush=True)
ipd.display(ipd.Audio(x, rate=Fs, normalize=True))

print('Harmonic part', flush=True)
ipd.display(ipd.Audio(x_harm[:, 0], rate=Fs, normalize=True))

print('Percussive part', flush=True)
ipd.display(ipd.Audio(x_perc[:, 0], rate=Fs, normalize=True))

print('Time-Scale modified harmonic part', flush=True)
ipd.display(ipd.Audio(y_harm[:, 0], rate=Fs, normalize=True))

print('Time-Scale modified percussive part', flush=True)
ipd.display(ipd.Audio(y_perc[:, 0], rate=Fs, normalize=True))

print('Time-Scale modified signal (HPS-TSM)', flush=True)
ipd.display(ipd.Audio(y[:, 0], rate=Fs, normalize=True))
Original signal
Harmonic part