A Fusion Framework for Multimodal Interactive Applications
DESCRIPTION
This research aims to propose a multimodal fusion framework for high-level data fusion between two or more modalities. It takes as input low-level features extracted from different system devices, then analyses and identifies intrinsic meanings in these data. Extracted meanings are mutually compared to identify complementarities, ambiguities and inconsistencies, in order to better understand the user's intention when interacting with the system. The whole fusion life cycle is described and evaluated in an OCE environment scenario, where two co-workers interact by voice and movements, which may reveal their intentions. The fusion in this case focuses on combining modalities to capture a context that enhances the user experience.
TRANSCRIPT
A Fusion Framework for Multimodal Interactive Applications
Presented by: Hildeberto Mendonça
Jean-Yves Lionel Lawson, Olga Vybornova,
Benoit Macq, Jean Vanderdonckt
ICMI-MLMI 2009 – Cambridge MA, USA, November 2-6, 2009
Special Session: Fusion Engines for Multimodal Interfaces
November 3, 2009
04/09/23 ICMI-MLMI 2009 – Cambridge MA, USA 2
Motivations
How to support multimodal fusion in order to maximize reuse and minimize complexity?
If there is complexity in multimodal fusion, it should be about the fusion itself.
What already exists should be reused with minimal adaptation.
A general life cycle can guarantee a standard treatment for each modality.
Research Goal
To define and develop a multipurpose framework for high-level data fusion in multimodal interactive applications.
Fusion Principles
Type: parallel + combined = synergistic; each modality is endowed with meanings.
Level: feature (i.e., pattern extraction) + decision (i.e., recognized task).
Input devices: multiple.
Notation: defined by the developer.
Ambiguity resolution: defined by the developer.
Time representation (quantitative or qualitative): both.
Application type: the domain is defined using ontologies.
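Both time representations can coexist on the same event: a quantitative timestamp, from which a qualitative (Allen-style) relation between two events can be derived. The sketch below is only an illustration of that idea; the class and function names are hypothetical, not part of the framework.

```python
from dataclasses import dataclass

@dataclass
class ModalityEvent:
    modality: str
    start: float  # quantitative time, in seconds
    end: float

def qualitative_relation(a: ModalityEvent, b: ModalityEvent) -> str:
    """Derive a coarse qualitative relation from quantitative times."""
    if a.end < b.start:
        return "before"
    if b.end < a.start:
        return "after"
    return "overlaps"

speech = ModalityEvent("speech", 1.0, 2.5)
motion = ModalityEvent("motion", 2.0, 4.0)
rel = qualitative_relation(speech, motion)  # the two events overlap in time
```

A fusion engine can then reason over the qualitative relations while still keeping the precise timestamps for synchronization.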
Process
Recognition: identification of patterns in input signals.
Segmentation: delimitation of identified areas.
Meaning extraction: deeper analysis to identify meanings and correlations between segments according to specific domains.
Annotation: formal description of segments through domain concepts.
The flow is fixed, but it can start at any point, respecting the sequence.
Not tied to any particular method; the method is "plugged in".
Focus on a good level of analysis, not on real-time processing.
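This life cycle of pluggable stages (recognition, segmentation, meaning extraction, annotation) can be sketched as a chain of callables. This is a minimal illustration under assumed names; none of the functions below belong to the framework's actual API.

```python
from typing import Any, Callable, List

# Each stage is a pluggable callable, mirroring the life cycle:
# recognition -> segmentation -> meaning extraction -> annotation.
Stage = Callable[[Any], Any]

def run_life_cycle(data: Any, stages: List[Stage]) -> Any:
    """Apply the stages in order; processing may start at any stage
    as long as the overall sequence is respected."""
    for stage in stages:
        data = stage(data)
    return data

# Hypothetical stand-ins for concrete recognizers/segmenters.
def recognize(signal):           # identify patterns in the input signal
    return [{"pattern": p} for p in signal]

def segment(patterns):           # delimit the identified areas
    return [{"segment": p["pattern"]} for p in patterns]

def extract_meanings(segments):  # attach domain meanings to segments
    return [{**s, "meaning": f"concept:{s['segment']}"} for s in segments]

def annotate(meanings):          # formal description via domain concepts
    return [{**m, "annotation": m["meaning"].upper()} for m in meanings]

result = run_life_cycle(["wave", "speech"],
                        [recognize, segment, extract_meanings, annotate])
```

Because each stage is just a callable, a different recognizer or segmenter can be swapped in without touching the rest of the chain, which is the "plugged" method idea above.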
OpenInterface
Fusion Mechanism
Define a process for each modality and run them in parallel.
Data from each stage is buffered and processed together for the purpose of fusion.
Agent-oriented: the problem is solved in a distributed fashion.
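The buffering idea can be sketched with one queue per modality: each modality's process pushes its stage output into its own buffer, and a fusion step consumes one item from each. This is an illustrative sketch, not OpenInterface's actual mechanism; the event contents are invented.

```python
import queue
import threading

# One buffer per modality; the fusion agent consumes both together.
speech_buf: queue.Queue = queue.Queue()
motion_buf: queue.Queue = queue.Queue()

def speech_process():
    # Stand-in for the speech modality pipeline output.
    speech_buf.put({"modality": "speech", "meaning": "find a book"})

def motion_process():
    # Stand-in for the movement modality pipeline output.
    motion_buf.put({"modality": "motion", "meaning": "moving to shelves"})

def fuse():
    """Block until one item from each modality is available,
    then combine them into a single interpretation."""
    s = speech_buf.get()
    m = motion_buf.get()
    return {"intention": (s["meaning"], m["meaning"])}

# Run the modality processes in parallel, then fuse their output.
threads = [threading.Thread(target=speech_process),
           threading.Thread(target=motion_process)]
for t in threads:
    t.start()
for t in threads:
    t.join()
fused = fuse()
```

The blocking `get()` calls are what let the fusion step wait until every modality has produced something, regardless of which pipeline finishes first.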
Fusion Mechanism – OpenInterface
OI Modeling Tool
Fusion Mechanism – Instance
Scenario
"Maybe I can find a book about it in the library."
Ronald is moving towards the bookshelves.
Results
Managed spatial relationships based on the fixed objects in the room.
Performed semantic fusion of events not coinciding in time.
Achieved good results in speaker identification: synchronization between image and speech identification.
Created an open framework to manage fusion between two (in our case) or more modalities (in future work).
Designed the system so that each component can run on a separate machine, thanks to the distribution mechanism interchanging data through a TCP/IP network.
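The distribution idea, i.e. components on separate machines interchanging data over TCP/IP, can be sketched with plain sockets: one component publishes a fused event as JSON, another receives it. This is a minimal illustration and not the framework's actual protocol; the event payload is invented.

```python
import json
import queue
import socket
import threading

port_q: queue.Queue = queue.Queue()
received: list = []

def component_server():
    """A receiving component: accepts one connection and decodes one event."""
    with socket.socket() as srv:
        srv.bind(("127.0.0.1", 0))        # let the OS pick a free port
        srv.listen(1)
        port_q.put(srv.getsockname()[1])  # tell the sender where to connect
        conn, _ = srv.accept()
        with conn:
            received.append(json.loads(conn.recv(4096).decode()))

t = threading.Thread(target=component_server)
t.start()

# A sending component: connects and publishes a fused event as JSON.
with socket.socket() as cli:
    cli.connect(("127.0.0.1", port_q.get()))
    cli.sendall(json.dumps({"event": "fused", "intention": "find book"}).encode())
t.join()
```

In a real deployment the two sides would run on different hosts, with the receiving component listening on a known address instead of an OS-assigned port.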
Next Steps
Implement the segmentation and annotation of 3D content
Migrate the framework to a real-time implementation
Evaluate other methods under the rules of the framework
Continuously extend the framework to support other fusion concepts and methods of implementation
Thank you for your attention!