Article

Speech Classification for Acoustic Source Localization and Tracking Applications Using Convolutional Neural Networks

Acoustic Source Localization and Speaker Tracking are continuously gaining importance in fields such as human-computer interaction, hands-free operation of smart home devices, and telecommunication. A set-up using a Steered Response Power approach in combination with high-end professional microphone capsules is described and the initial processing stages for detection angle stabilization are outlined. The resulting localization and tracking can be improved in terms of reactivity and angular stability by introducing a Convolutional Neural Network for signal/noise discrimination tuned to speech detection. Training data augmentation and network architecture are discussed; classification accuracy and the resulting performance boost of the entire system are analyzed.
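
The abstract names a Steered Response Power approach but gives no implementation details. As a rough illustration only, the sketch below estimates the time delay between two microphone signals with GCC-PHAT, a cross-correlation building block commonly combined over microphone pairs in Steered Response Power methods. The function name gcc_phat, its parameters, and the use of PHAT weighting are assumptions for this example and are not taken from the paper.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref` in seconds.

    Hypothetical helper for illustration; not the authors' code.
    """
    # Zero-pad to the combined length so the circular correlation
    # behaves like a linear one.
    n = sig.shape[0] + ref.shape[0]
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)

    # Cross-power spectrum with PHAT weighting (keep phase only).
    R = SIG * np.conj(REF)
    R /= np.abs(R) + 1e-12
    cc = np.fft.irfft(R, n=n)

    # Limit the search range to physically plausible delays.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so negative lags come first, then locate the peak.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / fs
```

In a microphone array, such pairwise delay estimates (or the correlation values themselves) would be evaluated for a grid of candidate directions; the direction with the highest summed response is taken as the source angle. A speech/noise classifier like the one described in the abstract could then gate which frames are allowed to update that angle.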


Published in:

Audio Engineering Society Convention 145
Authors: Koch, Andreas / Schilling, Andreas / Ziegler, Jonathan
Published by: Audio Engineering Society
Year of publication: 2018

Further links:
http://www.aes.org/e-lib/browse.cfm?elib=19827

Authors

Name:
Prof. Dr. Andreas Koch
Research area:
Artificial Intelligence
Position:
Vice Dean
Teaching areas:
Communications Engineering, Electronics, Computer Engineering, Artificial Intelligence
Degree program:
Audiovisual Media (Bachelor, 7 semesters)
Faculty:
Electronic Media
Room:
318, Nobelstraße 10 (lecture hall building)
Phone:
0711 8923-2249
Fax:
0711 8923-2207
Email:
kocha@hdm-stuttgart.de
Homepage:
www.hdm-stuttgart/kocha

Name:
Andreas Schilling
