Improving Speech Recognition with Embodied Cognition and Behaviour-based Robotics
Jorge Davila-Chacon
University of Hamburg - Knowledge Technology
www.informatik.uni-hamburg.de/WTM/
Spotify ML Meetup – November 3rd 2014
Motivation
Jorge Davila-Chacon Bio-Inspired SSL for Robot ASR 2
• Why is bio-inspired SSL interesting / useful?
Neurobotic Experiments
Virtual Reality Lab
Bauer, J., Davila-Chacon, J., Strahl, E., Wermter, S. Smoke and Mirrors — Virtual Realities for Sensor Fusion Experiments in Biomimetic Robotics. In: Multisensor Fusion and Integration for Intelligent Systems, 2012
Neurobotic Experiments
Bio-Inspired Sound Source Localisation
Spatial cues allow sound source localisation:
• Interaural Time Difference (ITD), from low frequencies
• Interaural Level Difference (ILD), from high frequencies
Figure label: Same frequency component
Interaural Time Differences: Neuroanatomy
ITDs extracted in Medial Superior Olive (MSO)
• AVCN – Anterior Ventral Cochlear Nucleus
• AN – Auditory Nerve
• IC – Inferior Colliculus
Interaural Time Differences: Computational Principle
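As an illustration of the principle, an ITD can be estimated as the lag that maximises the cross-correlation between the two ear signals. The sketch below is a plain signal-processing stand-in, not the spiking MSO model used in the talk; the function name `estimate_itd`, the 16 kHz sampling rate, the 1 ms search window and the synthetic 500 Hz tone are all assumptions made for the example.

```python
import numpy as np

def estimate_itd(left, right, fs, max_itd=1e-3):
    """Return the delay of `right` relative to `left` in seconds.

    A positive value means the right channel lags (sound reached the left ear first).
    """
    max_lag = int(round(max_itd * fs))      # search window in samples
    lags = np.arange(-max_lag, max_lag + 1)
    n = len(left)
    scores = []
    for lag in lags:
        if lag >= 0:
            a, b = left[:n - lag], right[lag:]
        else:
            a, b = left[-lag:], right[:n + lag]
        scores.append(np.dot(a, b))         # correlation at this lag
    return lags[int(np.argmax(scores))] / fs

# Hypothetical test signal: a low-frequency tone delayed by 4 samples on the right.
fs = 16000
t = np.arange(1600) / fs
delay_samples = 4
left = np.sin(2 * np.pi * 500 * t)
right = np.sin(2 * np.pi * 500 * (t - delay_samples / fs))
itd = estimate_itd(left, right, fs)
```

The search window is kept shorter than half the tone period so that the correlation peak is unambiguous, which is one reason ITDs are most useful at low frequencies.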
Interaural Level Differences: Neuroanatomy
ILDs extracted in Lateral Superior Olive (LSO)
• MNTB – Medial Nucleus of the Trapezoid Body
• IC – Inferior Colliculus
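For comparison with the neural extraction above, an ILD can be approximated as the ratio of signal energies at the two ears, expressed in decibels. This is a minimal sketch under stated assumptions, not the LSO model from the talk: the function name `estimate_ild_db`, the sampling rate and the 4 kHz test tone are hypothetical.

```python
import numpy as np

def estimate_ild_db(left, right, eps=1e-12):
    """Level difference in dB; positive when the left channel is louder."""
    e_left = np.mean(np.square(left))    # mean energy, left ear
    e_right = np.mean(np.square(right))  # mean energy, right ear
    # eps guards against division by zero for silent input
    return 10.0 * np.log10((e_left + eps) / (e_right + eps))

# Hypothetical test signal: a high-frequency tone at half amplitude on the right.
fs = 16000
t = np.arange(1600) / fs
left = np.sin(2 * np.pi * 4000 * t)
right = 0.5 * left
ild = estimate_ild_db(left, right)
```

Halving the amplitude quarters the energy, so the result is about 6 dB in favour of the left ear.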
Bio-Inspired Sound Source Localisation
Output of MSO and LSO integrated in IC
J. Dávila-Chacón, S. Heinrich, J. Liu, S. Wermter. Biomimetic Binaural Sound Source Localisation with Ego-Noise Cancellation. International Conference on Artificial Neural Networks, 2012.
Bio-Inspired Sound Source Localisation
Figure labels: MLP, IC
J. Dávila-Chacón, S. Magg, J. Liu, S. Wermter. Neural and Statistical Processing of Spatial Cues for Sound Source Localisation. International Joint Conference on Neural Networks, 2013.
Simple IC output
Complex IC output
Static SSL
Dynamic SSL
Feed-forward neural network
Robotic Automatic Speech Recognition
Platforms used for ASR: iCub and Soundman
Binary measure - Static ASR
J. Dávila-Chacón, J. Twiefel, J. Liu, S. Wermter. Improving Humanoid Robot Speech Recognition with Sound Source Localisation. International Conference on Artificial Neural Networks, 2014.
Continuous measure - Static ASR
J. Dávila-Chacón, J. Twiefel, J. Liu, S. Wermter. Improving Humanoid Robot Speech Recognition with Sound Source Localisation. International Conference on Artificial Neural Networks, 2014.
Conclusion
● Robotics as a “sandbox” for learning ML
● Neuroscience provides clues for computational principles
● Embodiment
• iCub allows computation of spatial cues
• Interaction with environment can reduce noise
● Signal processing with ANN
• Spiking ANNs are an effective representation of spatial cues
• Bayesian integration important for dimensionality reduction
• Softmax neural layer robust to ego-noise and reverberation
Future Work
● Neural SSL
• Integrate GPU version of MSO and LSO
• Propagation of probabilities through time
• From discrete to continuous
● Integration with vision
• From supervised to unsupervised SSL
• Possible extension to sensorimotor contingencies
• Vision to select between multiple sound sources
• Vision for speech segregation
Thank you for your attention.
LinkedIn: Jorge Davila Chacon
• J. Liu, D. Perez-Gonzalez, A. Rees, H. Erwin, S. Wermter. A biologically inspired spiking neural network model of the auditory midbrain for sound source localisation. Neurocomputing (2010)
• J. Dávila-Chacón, S. Heinrich, J. Liu, S. Wermter. Biomimetic Binaural Sound Source Localisation with Ego-Noise Cancellation. International Conference on Artificial Neural Networks (2012)
• J. Bauer, J. Davila-Chacon, E. Strahl, S. Wermter. Smoke and Mirrors — Virtual Realities for Sensor Fusion Experiments in Biomimetic Robotics. Multisensor Fusion and Integration for Intelligent Systems (2012)
• J. Dávila-Chacón, S. Magg, J. Liu, S. Wermter. Neural and Statistical Processing of Spatial Cues for Sound Source Localisation. International Joint Conference on Neural Networks (2013)
• J. Dávila-Chacón, J. Twiefel, J. Liu, S. Wermter. Improving Humanoid Robot Speech Recognition with Sound Source Localisation. International Conference on Artificial Neural Networks (2014)
Appendix
Best performances with clustering layer
Bayesian IC model
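The slide gives only the model's name, so the following is a generic illustration rather than the model from the talk. One common way to integrate cues in a Bayesian fashion is a naive Bayes combination: per-frequency likelihoods for each candidate angle are multiplied (summed in log space) and normalised into a single posterior over angles. The conditional-independence assumption, the uniform default prior, the function name `integrate_cues`, the array shapes and the toy likelihood values are all hypothetical.

```python
import numpy as np

def integrate_cues(mso_lik, lso_lik, prior=None):
    """Combine per-frequency likelihoods into one posterior over angles.

    mso_lik, lso_lik: arrays of shape (n_freq, n_angles) holding P(cue | angle).
    prior: optional array of shape (n_angles,); uniform if omitted.
    """
    eps = 1e-12  # avoids log(0)
    log_post = np.log(mso_lik + eps).sum(axis=0) + np.log(lso_lik + eps).sum(axis=0)
    if prior is not None:
        log_post = log_post + np.log(prior + eps)
    log_post = log_post - log_post.max()   # numerical stability before exp
    post = np.exp(log_post)
    return post / post.sum()

# Toy example: 2 frequency channels, 3 candidate angles (values are made up).
mso = np.array([[0.2, 0.6, 0.2],
                [0.3, 0.5, 0.2]])
lso = np.array([[0.1, 0.5, 0.4],
                [0.3, 0.4, 0.3]])
posterior = integrate_cues(mso, lso)
best_angle_index = int(np.argmax(posterior))
```

Two (n_freq, n_angles) maps collapse into one vector of length n_angles, which is the kind of dimensionality reduction the conclusion slide attributes to Bayesian integration.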
Levenshtein distance
J. Dávila-Chacón, J. Twiefel, J. Liu, S. Wermter. Improving Humanoid Robot Speech Recognition with Sound Source Localisation. International Conference on Artificial Neural Networks, 2014.
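The Levenshtein distance is the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. The slide names the measure without further detail, so the sketch below is the standard dynamic-programming formulation, and the example strings are made up rather than taken from the experiments.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))          # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

distance = levenshtein("kitten", "sitting")  # 3
```

The same function works on lists of words, which gives a word-level edit distance between a recognised sentence and a reference sentence.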