Optimized Velvet-Noise Decorrelator

Sebastian J. Schlecht, Benoit Alary, Vesa Välimäki and Emanuël A. P. Habets

21st International Conference on Digital Audio Effects (DAFx-18), Aveiro, Portugal.

Abstract

Decorrelation of audio signals is a critical step for spatial sound reproduction on multichannel configurations. Correlated signals yield a focused phantom source between the reproduction loudspeakers and may produce undesirable comb-filtering artifacts when the signal reaches the listener with small phase differences. Decorrelation techniques reduce such artifacts and extend the spatial auditory image by randomizing the phase of a signal while minimizing the spectral coloration. This paper proposes a method to optimize the decorrelation properties of a sparse noise sequence, called velvet noise, to generate short FIR decorrelation filters. The sparsity allows a highly efficient time-domain convolution. The listening test results demonstrate that the proposed optimization method can yield effective and colorless decorrelation filters. In comparison to a white noise sequence, the filters obtained using the proposed method produce equivalent broadband decorrelation while using 76% fewer operations for the convolution. Satisfactory results can be achieved with an even lower impulse density, which decreases the computational cost by 88%.


Figure 1: Decorrelator sequences in the time domain: white noise $\mathrm{WN}$, exponential velvet noise $\mathrm{EVN}30$, and two optimized velvet-noise sequences $\mathrm{OVN}30$ and $\mathrm{OVN}15$. Positive gains are indicated by $\bullet$ and negative gains by $\circ$ (except for $\mathrm{WN}$).
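The efficiency of sparse sequences comes from the fact that a time-domain convolution only needs one multiply-add per nonzero impulse and input sample. The following Python sketch illustrates this idea. The function names `sparse_convolve` and `dense_convolve`, the toy impulse locations, and the gains are hypothetical and are not taken from the paper.

```python
def sparse_convolve(signal, locations, gains):
    """Convolve `signal` with a sparse FIR filter given as (location, gain) pairs."""
    filter_length = max(locations) + 1
    out = [0.0] * (len(signal) + filter_length - 1)
    # Only the nonzero impulses are visited, so the cost scales with
    # len(locations) instead of filter_length.
    for loc, gain in zip(locations, gains):
        for n, x in enumerate(signal):
            out[n + loc] += gain * x
    return out


def dense_convolve(signal, taps):
    """Reference convolution that visits every tap, including the zeros."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for k, h in enumerate(taps):
        for n, x in enumerate(signal):
            out[n + k] += h * x
    return out


# Toy sparse sequence (assumed values, for illustration only).
locations = [0, 3, 7, 12]
gains = [1.0, -0.8, 0.6, -0.4]

# Equivalent dense filter with explicit zeros between the impulses.
taps = [0.0] * (max(locations) + 1)
for loc, gain in zip(locations, gains):
    taps[loc] = gain

signal = [1.0, 0.5, -0.25, 0.0, 2.0]
y_sparse = sparse_convolve(signal, locations, gains)
y_dense = dense_convolve(signal, taps)
```

Both functions return the same output, but the sparse version here performs 4 multiply-adds per input sample instead of 13.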

Optimization

A central challenge in decorrelation is the coloration caused by a non-flat magnitude response of the decorrelator. A continuous formulation of the velvet-noise sequence is critical to the optimization process because it allows both the impulse locations and the impulse gains to be varied continuously.
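One way to picture a continuous formulation is to evaluate the frequency response directly from real-valued impulse locations and gains, $H(\omega) = \sum_m g_m e^{-\imath \omega \tau_m}$, so that a location $\tau_m$ need not fall on a sample. The Python sketch below does this and measures coloration as the standard deviation of the magnitude in dB. This cost is an assumption chosen for illustration and is not necessarily the cost function used in the paper; the names `frequency_response` and `coloration_cost` are likewise hypothetical.

```python
import cmath
import math


def frequency_response(locations, gains, omega):
    """H(omega) for impulses at real-valued locations (in samples)."""
    return sum(g * cmath.exp(-1j * omega * tau)
               for tau, g in zip(locations, gains))


def coloration_cost(locations, gains, num_freqs=256):
    """Assumed cost: standard deviation of the magnitude in dB over (0, pi)."""
    levels = []
    for k in range(1, num_freqs + 1):
        omega = math.pi * k / (num_freqs + 1)
        magnitude = abs(frequency_response(locations, gains, omega))
        levels.append(20.0 * math.log10(max(magnitude, 1e-12)))
    mean = sum(levels) / len(levels)
    return math.sqrt(sum((v - mean) ** 2 for v in levels) / len(levels))


# A single impulse has a flat magnitude and a linear phase with slope -tau.
flat = coloration_cost([2.5], [0.7])

# Adding a second impulse at a fractional location introduces coloration.
colored = coloration_cost([0.0, 4.3], [1.0, 0.5])
```

Because the locations enter only through the complex exponential, the cost varies smoothly when an impulse is moved by a fraction of a sample.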

Figure 2: Single pulse optimization with corresponding phase slope of the frequency response.

The optimization problem is constrained, non-linear, and non-convex, so the optimal solution, i.e., the global minimum, is generally difficult to find. However, local minima can be attained by various gradient descent algorithms. Here we employ a variant of the interior-point method. The initial point is given by a randomly generated exponential velvet noise (EVN) sequence.
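The idea of searching for a local minimum from a random starting sequence can be sketched in a few lines of Python. Note the assumptions: the sketch swaps in a simple projected gradient descent with finite-difference gradients instead of the interior-point method, the cost is an assumed dB-deviation measure, the box constraints are hypothetical, and the random start only loosely resembles an EVN.

```python
import cmath
import math
import random


def coloration_cost(params, num_freqs=64):
    """Assumed cost: std. deviation (dB) of the magnitude response."""
    half = len(params) // 2
    locations, gains = params[:half], params[half:]
    levels = []
    for k in range(1, num_freqs + 1):
        omega = math.pi * k / (num_freqs + 1)
        h = sum(g * cmath.exp(-1j * omega * t) for t, g in zip(locations, gains))
        levels.append(20.0 * math.log10(max(abs(h), 1e-12)))
    mean = sum(levels) / num_freqs
    return math.sqrt(sum((v - mean) ** 2 for v in levels) / num_freqs)


def project(params, length):
    """Hypothetical box constraints: 0 <= location <= length, |gain| <= 1."""
    half = len(params) // 2
    locations = [min(max(t, 0.0), length) for t in params[:half]]
    gains = [min(max(g, -1.0), 1.0) for g in params[half:]]
    return locations + gains


def local_search(start, length, iterations=30, eps=1e-4):
    """Projected gradient descent; only cost-reducing steps are accepted."""
    x = project(start, length)
    best = coloration_cost(x)
    step = 0.5
    for _ in range(iterations):
        grad = []
        for i in range(len(x)):
            bumped = list(x)
            bumped[i] += eps
            grad.append((coloration_cost(bumped) - best) / eps)
        accepted = False
        while step > 1e-8 and not accepted:
            trial = project([xi - step * gi for xi, gi in zip(x, grad)], length)
            cost = coloration_cost(trial)
            if cost < best:
                x, best, accepted = trial, cost, True
                step *= 2.0  # try a bolder step next time
            else:
                step *= 0.5  # backtrack
        if not accepted:
            break  # no improving step found: treat as a local minimum
    return x, best


# Random start: one impulse per grid cell, random sign, decaying envelope.
rng = random.Random(1)
num_impulses, length = 6, 60.0
start_locations = [(m + rng.random()) * length / num_impulses
                   for m in range(num_impulses)]
start_gains = [rng.choice((-1.0, 1.0)) * math.exp(-3.0 * t / length)
               for t in start_locations]
start = start_locations + start_gains

initial_cost = coloration_cost(start)
optimized, final_cost = local_search(start, length)
```

With only these box constraints, nothing prevents the gains from shrinking toward a single dominant impulse, which is trivially flat. A practical formulation therefore needs further constraints, which are not reproduced here.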

Figure 3: Complete optimization procedure of a velvet noise sequence and the corresponding magnitude response.

Coloration Test

The first listening test evaluated how much the decorrelation filters color the input signal. The input signal was convolved with a single decorrelation filter, and the difference from the unprocessed signal was rated by the participants. In MUSHRA terminology, the unprocessed mono signal was the reference, and the input signal processed with a lowpass filter having a 3.5 kHz cutoff frequency was the anchor. The resulting mono signals were reproduced on both headphone channels. The main coloration was expected to be caused by the change in timbre and smearing of transients.

First set of decorrelators with drum signal
First set of decorrelators with guitar signal
First set of decorrelators with female vocalist
First set of decorrelators with speech signal

Stereo Quality Test

The second listening test evaluated the effectiveness of the decorrelators in extending the auditory source width and the overall spatial quality. The input signal was convolved with a decorrelation filter for each channel (left and right) and the participants were asked to rate the perceived width, localization at the center, and overall quality. In this test, no ideal reference could be defined, so the unprocessed mono signal was provided only for guidance. The lowpass filtered mono signal was given as the anchor. The resulting stereo signal was reproduced on the left and right headphone channels.
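The signal processing behind the stereo condition can be sketched as follows: the mono input is convolved with one filter per channel, and the zero-lag correlation coefficient between the two outputs gives a rough objective indication of decorrelation. This is a minimal Python sketch under assumptions. The filters, the noise input, and the function names are hypothetical, and the correlation coefficient is not the measure rated in the listening test.

```python
import math
import random


def sparse_convolve(signal, locations, gains, filter_length):
    """Convolve `signal` with a sparse FIR filter of the given total length."""
    out = [0.0] * (len(signal) + filter_length - 1)
    for loc, gain in zip(locations, gains):
        for n, x in enumerate(signal):
            out[n + loc] += gain * x
    return out


def correlation_coefficient(left, right):
    """Zero-lag normalized cross-correlation of two equal-length signals."""
    numerator = sum(a * b for a, b in zip(left, right))
    denominator = math.sqrt(sum(a * a for a in left) * sum(b * b for b in right))
    return numerator / denominator


# Mono test input: Gaussian noise with a fixed seed (assumed test signal).
rng = random.Random(0)
mono = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# Two toy sparse filters with different impulse locations (illustrative only).
filter_length = 20
left = sparse_convolve(mono, [2, 5, 11], [0.9, -0.6, 0.3], filter_length)
right = sparse_convolve(mono, [0, 7, 17], [0.8, 0.5, -0.4], filter_length)

rho_stereo = correlation_coefficient(left, right)  # decorrelated channels
rho_mono = correlation_coefficient(mono, mono)     # identical channels
```

Identical channels give a coefficient of one, whereas filters whose impulses do not coincide drive the coefficient toward zero for a noise-like input.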

First set of decorrelators with drum signal
First set of decorrelators with guitar signal
First set of decorrelators with female vocalist
First set of decorrelators with speech signal