3.11.2013 - 8.11.2013
Leibniz Center for Informatics, Schloss Dagstuhl
With the rapid growth and omnipresence of digitized multimedia data, the processing, analysis, and understanding of such data by automated methods has become a central issue in computer science and associated areas of research. In the acoustic domain, audio analysis has traditionally focused strongly on speech data, with the goal of recognizing and transcribing the spoken words. In this seminar, we want to consider current and future audio analysis tasks that go beyond the classical speech recognition scenario. On the one hand, we want to look at the computational analysis of speech with regard to the speakers’ traits (e.g., gender, age, height, cultural and social background), physical conditions (e.g., sleepiness, medical and alcohol intoxication, health), or emotion-related and affective states (e.g., stress, interest, confidence, frustration). So, rather than recognizing what is being said, the goal is to find out how and by whom it is being said. On the other hand, there is a rich variety of sounds besides speech, such as music recordings, animal sounds, environmental sounds, and mixtures thereof. Here, as in the speech domain, we want to study how to decompose and classify the content of complex sound mixtures with the objective of inferring semantically meaningful information. For more information, take a look at the Dagstuhl website.