Article

Speech Classification for Acoustic Source Localization and Tracking Applications Using Convolutional Neural Networks

Acoustic Source Localization and Speaker Tracking are continuously gaining importance in fields such as human-computer interaction, hands-free operation of smart home devices, and telecommunication. A set-up using a Steered Response Power approach in combination with high-end professional microphone capsules is described and the initial processing stages for detection angle stabilization are outlined. The resulting localization and tracking can be improved in terms of reactivity and angular stability by introducing a Convolutional Neural Network for signal/noise discrimination tuned to speech detection. Training data augmentation and network architecture are discussed; classification accuracy and the resulting performance boost of the entire system are analyzed.
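
As a rough illustration of the Steered Response Power idea named in the abstract, the following minimal sketch scans candidate angles for a single microphone pair and picks the angle with the highest steered power. It is not the authors' implementation: the PHAT weighting, the two-microphone far-field geometry, the speed of sound, and the names `gcc_phat` and `srp_phat_angle` are all assumptions made for this example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def gcc_phat(x, y, n_fft):
    """PHAT-weighted circular cross-correlation of x and y.

    Lag 0 is at index 0; negative lags wrap around to the end.
    A peak at lag k means x is delayed by k samples relative to y.
    """
    X = np.fft.rfft(x, n_fft)
    Y = np.fft.rfft(y, n_fft)
    cross = X * np.conj(Y)
    cross /= np.abs(cross) + 1e-12  # PHAT: keep phase, discard magnitude
    return np.fft.irfft(cross, n_fft)


def srp_phat_angle(x, y, mic_distance, fs, angles_deg):
    """Hypothetical helper: steered response power over candidate angles.

    Angle 0 is assumed to lie on the microphone axis on the side of
    microphone y, so the wavefront reaches x later by
    mic_distance * cos(angle) / c (far-field assumption).
    """
    n_fft = 2 * len(x)  # zero-pad so positive and negative lags do not overlap
    cc = gcc_phat(x, y, n_fft)
    powers = np.empty(len(angles_deg))
    for i, angle in enumerate(angles_deg):
        tau = mic_distance * np.cos(np.deg2rad(angle)) / SPEED_OF_SOUND
        lag = int(round(tau * fs))
        powers[i] = cc[lag % n_fft]  # steered power for this candidate angle
    return angles_deg[int(np.argmax(powers))], powers
```

With only one microphone pair and integer-sample lags, several neighbouring angles map to the same lag, so the angular resolution of this sketch is coarse; a real array would sum the steered power over many pairs.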


Published in:

Audio Engineering Society Convention 145
Authors: Koch, Andreas / Schilling, Andreas / Ziegler, Jonathan
Publisher: Audio Engineering Society
Year of publication: 2018

Further links:
http://www.aes.org/e-lib/browse.cfm?elib=19827

Authors

Name:
Andreas Schilling
