Speech Classification for Acoustic Source Localization and Tracking Applications Using Convolutional Neural Networks
Acoustic Source Localization and Speaker Tracking are continuously gaining importance in fields such as human-computer interaction, hands-free operation of smart home devices, and telecommunication. A setup using a Steered Response Power approach in combination with high-end professional microphone capsules is described, and the initial processing stages for detection angle stabilization are outlined. The resulting localization and tracking can be improved in terms of reactivity and angular stability by introducing a Convolutional Neural Network for signal/noise discrimination tuned to speech detection. Training data augmentation and network architecture are discussed; classification accuracy and the resulting performance boost of the entire system are analyzed.
Published in:
Audio Engineering Society Convention 145
Authors: Koch, Andreas / Schilling, Andreas / Ziegler, Jonathan
Publisher: Audio Engineering Society
Year of publication: 2018
Further links:
http://www.aes.org/e-lib/browse.cfm?elib=19827
Authors
- Name: Prof. Dr. Andreas Koch
- Research area: Artificial Intelligence
- Position: Vice Dean
- Teaching area: Communications Engineering, Electronics, Computer Engineering, Artificial Intelligence
- Degree program: Audiovisual Media (Bachelor, 7 semesters)
- Faculty: Electronic Media
- Room: 318, Nobelstraße 10 (lecture hall building)
- Telephone: 0711 8923-2249
- Fax: 0711 8923-2207
- Email: kocha@hdm-stuttgart.de
- Homepage: www.hdm-stuttgart/kocha
- Name: Andreas Schilling