Presented by E.A.P. Habets and Sharon Gannot at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vancouver, Canada, May 26, 2013.
A major challenge in modern acoustic communication systems is the acquisition of the sound of interest. In the last decade, spatial signal processing has been shown to provide useful and viable solutions. Especially in adverse acoustic conditions, encountered in reverberant and noisy environments, the use of multiple microphones has been shown to provide significant advantages compared to the use of a single microphone. This tutorial will briefly review room acoustics in order to explain the properties of different sound fields. It will then outline current and emerging techniques for spatial signal processing. In particular, the problem of acquiring an estimate of a desired sound will be addressed. This problem will be tackled from the perspective of (distributed and non-distributed) linear spatial processing and parametric spatial processing and will be supported by numerous audio examples. The tutorial will close with an outlook, highlighting important open questions and promising research directions.
This tutorial is intended for researchers and development engineers working in acoustic signal processing, optimal estimation and adaptive filtering. It will include a brief summary of the relevant aspects of room acoustics and then present the state of the art, grouping the spatial processing approaches available for acoustic communication into two categories, viz. linear spatial processing and parametric spatial processing. This will break down the algorithms into a manageable number of types, enabling participants to quickly gain an overview and understanding of the subject area and the relevant applications. At appropriate points in the tutorial, recent advances beyond the state of the art will be described and new research results will be presented.
The following categories will be covered:
The first part of this tutorial will summarize the state of the art in spatial speech processing. Several optimization criteria will be presented, highlighting their different attributes. Both closed-form and adaptive solutions will be introduced. We will then explore blind and semi-blind estimation techniques that enable the design of feasible beamformers using only the available microphone signals. Next, we will address the important aspect of single-channel postfilters, designed to further enhance the beamformer's output. We will conclude by discussing robustness and performance issues.
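As a minimal illustration of a closed-form solution, the sketch below computes delay-and-sum weights for a uniform linear array at a single frequency. This is only one simple example of a linear spatial filter and is not necessarily one of the criteria covered in the tutorial. The array geometry, the far-field plane-wave model, the speed of sound and all function names are assumptions made for this example.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_vector(num_mics, spacing_m, freq_hz, doa_deg):
    """Far-field steering vector of a uniform linear array (hypothetical setup).

    doa_deg is measured from the array axis, so 90 degrees is broadside.
    """
    delay_per_mic = spacing_m * math.cos(math.radians(doa_deg)) / SPEED_OF_SOUND
    return [
        cmath.exp(-2j * math.pi * freq_hz * m * delay_per_mic)
        for m in range(num_mics)
    ]


def delay_and_sum_weights(steering):
    """Closed-form weights w = d / (d^H d), which satisfy w^H d = 1."""
    norm = sum(abs(d) ** 2 for d in steering)
    return [d / norm for d in steering]


def response(weights, steering):
    """Beamformer output w^H d for a unit-amplitude plane wave."""
    return sum(w.conjugate() * d for w, d in zip(weights, steering))


# Four microphones, 5 cm apart, evaluated at 1 kHz and steered to broadside.
look = steering_vector(4, 0.05, 1000.0, 90.0)
weights = delay_and_sum_weights(look)

gain_look = abs(response(weights, look))
gain_off = abs(response(weights, steering_vector(4, 0.05, 1000.0, 0.0)))
print(f"gain in look direction: {gain_look:.3f}")
print(f"gain at endfire:        {gain_off:.3f}")
```

The weights pass a plane wave from the look direction undistorted and attenuate a wave arriving along the array axis. Data-dependent designs replace the fixed normalization with statistics estimated from the microphone signals.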
Recent technological advances facilitate the concept of distributed, randomly placed microphone arrays. A distributed microphone array is composed of multiple sub-arrays (nodes), each of which consists of several microphones, a signal processing unit and a wireless communication module. The large spatial distribution of such microphone constellations increases the probability that a subset of the microphones is close to a relevant sound source. These distributed structures also pose new challenges. In this part of the tutorial we will explore performance bounds of such distributed microphone arrays and introduce novel algorithms that optimize the relevant criteria under the limited communication bandwidth available between nodes.
Parametric spatial processing is a promising and emerging technique that is fundamentally different from traditional spatial processing techniques. First, a relatively simple sound field model is adopted and the parameters of the model, i.e., the direction of arrival and diffuseness, are estimated in a time-frequency domain. Second, the estimated parameters are used to process the received microphone signals. The compact and efficient representation of the sound field can be used to develop algorithms for directional and spatial filtering and source localization. In this tutorial, two distinct sound field models and corresponding parameter estimation techniques will be presented. We will then present an algorithm for directional filtering and dereverberation. Finally, we will show how one of the parametric representations can be used to create a virtual microphone signal with a pre-defined pickup pattern.
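The second step, processing the signals with the estimated parameters, can be sketched as a gain applied to each time-frequency bin. The example below assumes that the direction of arrival and the diffuseness have already been estimated. The gain rule (the direct-sound fraction shaped by a first-order pickup pattern) and all names are assumptions chosen for illustration, not the algorithm presented in the tutorial.

```python
import math


def pickup_pattern(doa_deg, look_deg, alpha=0.5):
    """First-order pattern alpha + (1 - alpha) * cos(angle); alpha = 0.5 is a cardioid."""
    angle = math.radians(doa_deg - look_deg)
    return alpha + (1.0 - alpha) * math.cos(angle)


def parametric_gain(doa_deg, diffuseness, look_deg, alpha=0.5):
    """Illustrative gain: sqrt(1 - diffuseness) scaled by the pattern magnitude."""
    if not 0.0 <= diffuseness <= 1.0:
        raise ValueError("diffuseness must lie in [0, 1]")
    direct_fraction = math.sqrt(1.0 - diffuseness)
    return direct_fraction * abs(pickup_pattern(doa_deg, look_deg, alpha))


def apply_gains(bins, look_deg):
    """Scale each bin, given as (complex_value, doa_deg, diffuseness), by its gain."""
    return [
        value * parametric_gain(doa, psi, look_deg)
        for value, doa, psi in bins
    ]


# Three hypothetical bins: direct sound from the front, direct sound from
# the rear, and a fully diffuse bin.
bins = [(1 + 0j, 0.0, 0.0), (1 + 0j, 180.0, 0.0), (1 + 0j, 0.0, 1.0)]
filtered = apply_gains(bins, look_deg=0.0)
print([round(abs(v), 3) for v in filtered])
```

With a cardioid pattern pointed at 0 degrees, the frontal direct-sound bin is kept, while the rear and fully diffuse bins are suppressed. Changing `alpha` or `look_deg` yields a different virtual pickup pattern without changing the microphone signals themselves.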
In the final part, we will show how linear and parametric spatial processing can be combined. This type of processing yields increased robustness compared to purely linear spatial processing techniques and increased performance (e.g., interference reduction) compared to purely parametric spatial processing techniques.
Part 1 - Introduction
Part 2 - Linear Spatial Processing
Part 4 - Parametric Spatial Processing
Part 6 - Wrap-up