Bachelor Thesis of Schöls, Tobias
This work presents approaches to robust, hands-free, distant-talking speech recognition in the aixCAVE virtual environment. A top-mounted microphone array is used for speech acquisition. The microphone array signals are processed into a mono signal that is fed into speech recognition software. The processing comprises speech enhancement, noise reduction, and minimization of convolutional distortions, and it makes use of the precise speaker position and 3-D view vector acquired by a head-tracking system. For the speech recognition, an external software package is used and embedded into the existing software framework. The speech recognition software has to run in real time, support multiple instances running in parallel, and achieve a good recognition rate even when the speaker changes.
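As an illustration of how a tracked speaker position can be used to combine array channels into a mono signal, the following is a minimal sketch of a delay-and-sum beamformer. Delay-and-sum is chosen here only as a common baseline; the abstract does not state which beamforming method the thesis uses. The function name `delay_and_sum`, its parameters, the array geometry, and the speed-of-sound constant are all assumptions made for this example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def delay_and_sum(signals, mic_positions, speaker_position, sample_rate):
    """Steer the array at the speaker and return a mono signal.

    Hypothetical helper, not taken from the thesis.
    signals:          array of shape (num_mics, num_samples)
    mic_positions:    array of shape (num_mics, 3), in metres
    speaker_position: array of shape (3,), in metres (e.g. from head tracking)
    sample_rate:      sampling rate in Hz
    """
    signals = np.asarray(signals, dtype=float)
    mic_positions = np.asarray(mic_positions, dtype=float)
    speaker_position = np.asarray(speaker_position, dtype=float)

    # Propagation delay of each microphone relative to the closest one.
    distances = np.linalg.norm(mic_positions - speaker_position, axis=1)
    delays = (distances - distances.min()) / SPEED_OF_SOUND  # seconds

    # Compensate each delay in the frequency domain, which also handles
    # fractional-sample delays. Note: this is a circular shift of the block.
    num_samples = signals.shape[1]
    freqs = np.fft.rfftfreq(num_samples, d=1.0 / sample_rate)
    spectra = np.fft.rfft(signals, axis=1)
    advance = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    aligned = np.fft.irfft(spectra * advance, n=num_samples, axis=1)

    # Averaging the aligned channels reinforces sound from the speaker
    # position and attenuates uncorrelated noise.
    return aligned.mean(axis=0)
```

In a real-time setting the same idea would be applied block by block, with the speaker position refreshed from the head tracker for each block; proper overlap handling between blocks is omitted from this sketch.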