An Informed Spatial Filter for Dereverberation in the Spherical Harmonic Domain

S. Braun, D. P. Jarrett, J. Fischer and E. A. P. Habets

Published in Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2013.


In speech communication systems, the received microphone signals are commonly degraded by reverberation and ambient noise, which can decrease the fidelity and intelligibility of the desired speech. Reverberation can be modeled as non-stationary diffuse sound that is not directly observable. In this work, we derive a multichannel Wiener filter in the spherical harmonic domain to reduce both reverberation and noise. The filter depends on the direction-of-arrival of the direct sound of the desired speaker and on an interference power spectral density (PSD) matrix, for which an estimator is developed. The resulting informed spatial filter incorporates instantaneous information about the diffuseness of the sound field into the design of the filter. In addition, it is shown how the proposed filter relates to the well-known robust minimum variance distortionless response (MVDR) filter, which is also used for comparison in the evaluation. Experimental results show that the proposed spatial filter provides a tradeoff between noise reduction and dereverberation depending on the diffuse sound PSD.
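The relation between a multichannel Wiener filter and an MVDR filter mentioned in the abstract can be illustrated with the standard textbook decomposition: for a single desired source, the Wiener filter equals the MVDR filter followed by a single-channel Wiener postfilter. The sketch below is a generic NumPy illustration of that identity, not the spherical-harmonic-domain filter or the PSD matrix estimator from the paper. The function names, the steering vector `d`, the interference PSD matrix `Phi_u`, and the desired-signal PSD `phi_s` are all hypothetical placeholders filled with random values.

```python
import numpy as np


def mvdr_weights(d, Phi_u):
    """MVDR weights: minimize interference power subject to w^H d = 1."""
    Phi_inv_d = np.linalg.solve(Phi_u, d)
    return Phi_inv_d / (d.conj() @ Phi_inv_d)


def mwf_weights(d, Phi_u, phi_s):
    """Multichannel Wiener filter for the model y = d * s + u (output w^H y)."""
    Phi_y = phi_s * np.outer(d, d.conj()) + Phi_u
    return np.linalg.solve(Phi_y, phi_s * d)


def wiener_postfilter(d, Phi_u, phi_s):
    """Single-channel Wiener gain applied at the MVDR output."""
    # Residual interference power at the MVDR output is 1 / (d^H Phi_u^{-1} d).
    residual = 1.0 / (d.conj() @ np.linalg.solve(Phi_u, d)).real
    return phi_s / (phi_s + residual)


# Hypothetical example data: 4 channels, random Hermitian positive definite Phi_u.
rng = np.random.default_rng(0)
M = 4
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Phi_u = A @ A.conj().T + M * np.eye(M)
d = rng.standard_normal(M) + 1j * rng.standard_normal(M)
phi_s = 2.0

w_mwf = mwf_weights(d, Phi_u, phi_s)
w_mvdr = mvdr_weights(d, Phi_u)
gain = wiener_postfilter(d, Phi_u, phi_s)
print(np.allclose(w_mwf, gain * w_mvdr))
```

In this generic form, a larger interference PSD lowers the postfilter gain, which is one way to see why the amount of suppression depends on the estimated interference level.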

Audio Examples

We simulated the signals captured by a rigid spherical array with 32 microphones and a radius of 4.2 cm in a shoebox room, using a room impulse response generator for spherical microphone arrays based on the source-image method. The source-array distance was either 1 m or 2 m, and stationary white noise was added at a signal-to-noise ratio (SNR) of either 25 dB or 60 dB. More details on the simulated audio examples can be found in [1].
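As a minimal sketch of the noise-addition step described above, the following NumPy function scales white Gaussian noise to reach a target SNR. The function name `add_white_noise` is hypothetical, and the SNR is assumed here to be defined on the average power over all microphone signals. The exact SNR definition used for the examples is given in [1] and may differ. The input is a random placeholder rather than a simulated array recording.

```python
import numpy as np


def add_white_noise(x, snr_db, rng=None):
    """Add stationary white Gaussian noise to x at the requested SNR in dB.

    x has shape (num_mics, num_samples). The SNR is computed from the mean
    power over all channels and samples (an assumption of this sketch).
    Returns the noisy signals and the noise that was added.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / 10 ** (snr_db / 10)
    noise = np.sqrt(noise_power) * rng.standard_normal(x.shape)
    return x + noise, noise


# Placeholder input: 32 channels (as in the array above), 16000 samples each.
rng = np.random.default_rng(0)
clean = rng.standard_normal((32, 16000))
noisy, noise = add_white_noise(clean, snr_db=25.0, rng=rng)

measured_snr_db = 10 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))
print(f"measured SNR: {measured_snr_db:.2f} dB")
```

Calling the function with `snr_db=60.0` instead gives the low-noise condition.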


Example 1: SNR = 60 dB, source-array distance = 1 m, T60 = 500 ms

Example 2: SNR = 60 dB, source-array distance = 2 m, T60 = 400 ms

Example 3: SNR = 25 dB, source-array distance = 1 m, T60 = 500 ms

Example 4: SNR = 25 dB, source-array distance = 2 m, T60 = 400 ms

* using the proposed PSD matrix estimator


  1. S. Braun, D. P. Jarrett, J. Fischer, and E. A. P. Habets, "An informed spatial filter for dereverberation in the spherical harmonic domain," in Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2013.