Speaker Extraction for Multi-Party Communication Applications

Abstract

Multi-party communication is challenging because of interference from other near-end speakers and from the playback of far-end speakers, especially when pairs of speakers hold independent conversations in the same acoustic space. Guided speaker extraction combined with echo reduction addresses both sources of interference. In this demonstration, we showcase our speaker extraction system for real-time multi-party communication applications. The system uses loudspeakers to play back the far-end signals and a microphone array to record the acoustic scene. It first applies multi-channel acoustic echo reduction to the microphone signals, and then uses a beamformer-guided target speaker extraction neural network to extract each speaker separately. We demonstrate the potential of the system in communication scenarios with up to three parties conversing simultaneously.
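
The processing order described in the abstract (echo reduction first, then beamformer-guided extraction) can be illustrated with a minimal sketch. This is not the demonstrated system: it assumes a single far-end reference signal, substitutes a per-microphone NLMS adaptive filter for the multi-channel echo reduction, uses a delay-and-sum beamformer with known integer sample delays as the guide, and leaves the neural network as a hypothetical placeholder callable. All function and parameter names are illustrative.

```python
import numpy as np

def nlms_echo_reduction(mic, far_end, taps=64, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of far_end from mic (NLMS)."""
    w = np.zeros(taps)  # running estimate of the echo path
    padded = np.concatenate([np.zeros(taps - 1), far_end])
    out = np.zeros(len(mic))
    for n in range(len(mic)):
        x = padded[n:n + taps][::-1]      # most recent far-end sample first
        e = mic[n] - w @ x                # residual after removing the echo estimate
        w += mu * e * x / (x @ x + eps)   # normalized LMS update
        out[n] = e
    return out

def delay_and_sum(signals, delays):
    """Advance each channel by its integer delay and average the channels.

    np.roll wraps around at the edges, which is acceptable for a sketch
    but not for a real-time implementation.
    """
    aligned = [np.roll(s, -d) for s, d in zip(signals, delays)]
    return np.mean(aligned, axis=0)

def extract_speaker(mics, far_end, delays, extractor=None):
    """Echo-reduce every microphone, beamform a guide, then run the extractor.

    `extractor` stands in for the target speaker extraction network
    (hypothetical); when it is None the beamformer output is returned.
    """
    cleaned = [nlms_echo_reduction(m, far_end) for m in mics]
    guide = delay_and_sum(cleaned, delays)
    if extractor is None:
        return guide
    return extractor(cleaned, guide)
```

In this sketch, one call to `extract_speaker` per target, each with the delays that steer toward that speaker, would yield one output per party.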

For more information about these technologies, please contact Prof. Dr. Emanuël Habets (emanuel.habets@audiolabs-erlangen.de).

Audio-Video Example