July 13, 2006 © 2006 IBM Corporation Distributed Multimodal Synchronization Protocol (DMSP) Chris Cross IETF 66 July 13, 2006 With Contribution from Gerald McCobb and Les Wilson


July 13, 2006 © 2006 IBM Corporation

Distributed Multimodal Synchronization Protocol (DMSP)

Chris Cross

IETF 66

July 13, 2006

With Contribution from Gerald McCobb and Les Wilson


DMSP Background

• Result of a 4-year IBM and Motorola R&D effort
• OMA Architecture Document:
  – Multimodal and Multi-device Architecture
  – http://member.openmobilealliance.org/ftp/Public_documents/BAC/MAE/Permanent_documents/OMA-AD-MMMD-V1_0-20060612-D.zip
• ID to IETF July 8, 2005 by IBM & Motorola
• Reason for contribution
  – A standard is needed to synchronize network based services implementing distributed modalities in multimodal applications
  – Other protocols may have overlap but do not address all multimodal interaction requirements
  – Other IETF IDs and RFCs:
    • Media Server Control Protocol (MSCP)
    • LRDP: The Lightweight Remote Display Protocol (Remote UI BoF)
    • Media Resource Control Protocol Version 2 (MRCPv2)
    • Widex
    • RFC 1056 Distributed Mail System for Personal Computers (also DMSP)


Why do you need a distributed system, i.e., a Thin Client?

R: Resources: memory and MIPS on the client device

G: Size and complexity of application grammars

R/G = 1: Resources are adequate to perform “real time” recognition and synthesis.

A thick client has speech recognition and synthesis on the device. As resources available on a device shrink or the application requirements increase (larger application grammars) then the performance of the system becomes unacceptable. When that threshold is reached then it is economically feasible to distribute the speech over the network.
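The threshold argument above can be sketched in a few lines. This is only an illustration: the function name `choose_client` and the idea of expressing R and G in the same normalized units are assumptions made for the example, since the slide does not say how either quantity is measured.

```python
def choose_client(resources: float, grammar_load: float) -> str:
    """Return "thick" when R/G >= 1, otherwise "thin".

    resources    -- R: client memory and MIPS, in hypothetical normalized units
    grammar_load -- G: size and complexity of the application grammars,
                    in the same hypothetical units
    """
    if grammar_load <= 0:
        raise ValueError("grammar_load must be positive")
    ratio = resources / grammar_load
    # R/G = 1 is the point where resources are just adequate for
    # "real time" recognition and synthesis on the device.
    return "thick" if ratio >= 1.0 else "thin"
```

A device whose resources shrink, or whose grammars grow, moves from the first result to the second, which is the point at which distributing speech over the network becomes attractive.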

[Figure: Client Resources (R) plotted against Grammar Size and Complexity (G), with the line R/G = 1 and regions labeled Thick Client and Thin Client]


Multimodal Use Cases

• Opera X+V Pizza demo

• X+V

• J+V

• Future W3C multimodal languages (VoiceXML 3, etc.)


DMSP Architecture

• There are 4 DMSP building blocks:
  1. Modalities
  2. Model-View-Controller (MVC) design pattern
  3. View Independent Model
  4. Event-based modality synchronization


DMSP Architecture Building Blocks

1. Modalities are Views in the MVC Pattern
• GUI, Speech, Pen
• Individual browsers for each modality
• Compound browsers for multiple modalities

[Figure: Compound Browser]


DMSP Architecture Building Blocks

2. Model-View-Controller (MVC) design pattern
• A multimodal system can be modeled in terms of the MVC pattern
• Each modality can be decomposed and implemented in its own MVC pattern
• A modality can implement a view independent model and controller locally or use one in the network (e.g., an Interaction Manager, IM)


DMSP Architecture Building Blocks

3. View Independent Model
• Enables a centralized model
• Modality interaction updates view and model
• Local event filters reflect “important” events to view independent model
• A modality listens to view independent model for only the events it cares about
• Compound clients and centralized control through an Interaction Manager as well as distributed modalities all enabled with a single protocol
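The filter-and-listen behavior described above can be sketched as follows. This is a minimal illustration, not the DMSP design itself: the class names `ViewIndependentModel` and `Modality`, their methods, and the sample event payloads are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

Listener = Callable[[str, dict], None]


class ViewIndependentModel:
    """Central model that modalities subscribe to by event type."""

    def __init__(self) -> None:
        self._listeners: Dict[str, List[Listener]] = defaultdict(list)

    def add_listener(self, event_type: str, listener: Listener) -> None:
        self._listeners[event_type].append(listener)

    def dispatch(self, event_type: str, data: dict) -> int:
        """Deliver the event to interested listeners; return how many ran."""
        listeners = list(self._listeners.get(event_type, []))
        for listener in listeners:
            listener(event_type, data)
        return len(listeners)


class Modality:
    """A view whose local filter forwards only "important" events."""

    def __init__(self, name: str, model: ViewIndependentModel,
                 important: Set[str]) -> None:
        self.name = name
        self.model = model
        self.important = important
        self.received: List[Tuple[str, dict]] = []

    def listen_for(self, event_type: str) -> None:
        # Subscribe only to the events this modality cares about.
        self.model.add_listener(event_type, self._on_event)

    def _on_event(self, event_type: str, data: dict) -> None:
        self.received.append((event_type, data))

    def local_event(self, event_type: str, data: dict) -> bool:
        """Handle a local interaction; forward it only if it passes the filter."""
        if event_type not in self.important:
            return False
        self.model.dispatch(event_type, data)
        return True


model = ViewIndependentModel()
gui = Modality("gui", model, important={"DOMFocusIn"})
voice = Modality("voice", model, important=set())
voice.listen_for("DOMFocusIn")

gui.local_event("mousemove", {"x": 3})             # dropped by the local filter
gui.local_event("DOMFocusIn", {"field": "size"})   # reflected to the model
print(voice.received)
```

Only the focus change reaches the voice modality; the mouse movement never leaves the GUI, which is the point of filtering locally before reflecting events to the shared model.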


DMSP Architecture Building Blocks

4. Event-based synchronization
• Compound Client: All modalities rendered in client
• Interactions in one modality are reflected in others through event-based changes to one or more models
• GUI DOM can serve as View Independent Model
• Something about connecting non-DOM UAs to the ones with a DOM


DMSP Architecture Building Blocks

4. Event-based synchronization (CONT’D)
• Distributed Modality: A modality is handled in the infrastructure
• Requires DMSP to distribute the modality
• Event-based synchronization via the View Independent Model gives a modality-independent distribution mechanism
• Enables multiple topologies
  – Compound Client w/ Distributed Modality
  – Interaction Manager

[Figure: Distributed Modality]


DMSP Design

• There are 4 abstract interfaces
  1. Command
  2. Response
  3. Event
  4. Signal
• Each interface defines a set of methods and related data structures exchanged between user agents
• Specified as a set of messages
• XML and Binary message encodings
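To make the idea of one logical message with two encodings concrete, here is a rough sketch. The slides do not give the DMSP wire format, so everything below is assumed for illustration: the `Message` class, the numeric interface codes, the XML element and attribute names, and the binary layout are all hypothetical.

```python
import struct
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from enum import IntEnum


class Interface(IntEnum):
    # The four abstract interfaces named on the slide; the numeric
    # values are placeholders chosen for this example.
    COMMAND = 1
    RESPONSE = 2
    EVENT = 3
    SIGNAL = 4


@dataclass
class Message:
    interface: Interface
    name: str           # e.g. "SIG_INIT"
    payload: str = ""

    def to_xml(self) -> bytes:
        # Hypothetical element and attribute names.
        elem = ET.Element("message",
                          interface=self.interface.name, name=self.name)
        elem.text = self.payload
        return ET.tostring(elem)

    def to_binary(self) -> bytes:
        # Hypothetical layout: 1-byte interface code, then two UTF-8
        # strings, each prefixed with a 2-byte big-endian length.
        name = self.name.encode("utf-8")
        body = self.payload.encode("utf-8")
        return struct.pack(f"!BH{len(name)}sH{len(body)}s",
                           self.interface, len(name), name, len(body), body)

    @classmethod
    def from_binary(cls, data: bytes) -> "Message":
        interface, name_len = struct.unpack_from("!BH", data, 0)
        offset = 3
        name = data[offset:offset + name_len].decode("utf-8")
        offset += name_len
        (body_len,) = struct.unpack_from("!H", data, offset)
        offset += 2
        body = data[offset:offset + body_len].decode("utf-8")
        return cls(Interface(interface), name, body)


msg = Message(Interface.SIGNAL, "SIG_INIT")
assert Message.from_binary(msg.to_binary()) == msg
```

Keeping the logical message separate from its encoding lets a constrained client use the compact form while a server-side tool reads the XML form of the same message.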


DMSP Message Types

1. Signals
• One-way asynchronous messages used to negotiate internal processing states
  – Initialization (SIG_INIT)
  – VXML Start (SIG_VXML_START)
  – Close (SIG_CLOSE)


DMSP Message Types

2. Command and control messages
• Add and remove event listener (CMD_ADD/REMOVE_EVT_LISTENER)
• Can dispatch (CMD_CAN_DISPATCH)
• Dispatch event (CMD_DISPATCH_EVT)
• Load URL (CMD_LOAD_URL)
• Load Source (CMD_LOAD_SRC)
• Get and Set Focus (CMD_GET/SET_FOCUS)
• Get and Set Fields (CMD_GET/SET_FIELDS)
• Cancel (CMD_CANCEL)
• Execute Form (CMD_EXEC_FORM)
• Get and Set Cookies (CMD_GET/SET_COOKIES)


DMSP Message Types

3. Responses
• Response messages to commands
  – OK (RESP_OK)
  – Boolean (RESP_BOOL)
  – String (RESP_STRING)
  – Fields (RESP_FIELDS)
    • Contains 1 or more Field data structures
  – Error (RESP_ERROR)
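The command-response pairing can be sketched with a toy user agent. This is an assumption-laden illustration: the slides list the message names but not which response answers which command, so the pairings below are guesses. The split names `CMD_SET_FIELDS`, `CMD_GET_FIELDS`, `CMD_SET_FOCUS`, and `CMD_GET_FOCUS` are expansions of the slide's `GET/SET` shorthand, and the `UserAgent` class and its argument names are hypothetical.

```python
from typing import Dict, Tuple


class UserAgent:
    """Toy user agent answering a small subset of the listed commands."""

    def __init__(self) -> None:
        self.fields: Dict[str, str] = {}
        self.focus = ""

    def handle(self, command: str, **args) -> Tuple[str, object]:
        """Return a (response type, value) pair for one command."""
        if command == "CMD_SET_FIELDS":
            self.fields.update(args["fields"])
            return ("RESP_OK", None)
        if command == "CMD_GET_FIELDS":
            # Assumed pairing: RESP_FIELDS carries one or more fields.
            return ("RESP_FIELDS", dict(self.fields))
        if command == "CMD_SET_FOCUS":
            if args["field"] not in self.fields:
                return ("RESP_ERROR", f"unknown field: {args['field']}")
            self.focus = args["field"]
            return ("RESP_OK", None)
        if command == "CMD_GET_FOCUS":
            # Assumed pairing: the focused field name comes back as a string.
            return ("RESP_STRING", self.focus)
        return ("RESP_ERROR", f"unsupported command: {command}")


agent = UserAgent()
print(agent.handle("CMD_SET_FIELDS", fields={"size": "large"}))
print(agent.handle("CMD_SET_FOCUS", field="size"))
print(agent.handle("CMD_GET_FOCUS"))
print(agent.handle("CMD_SET_FOCUS", field="topping"))
```

Every command yields exactly one response, and failures come back as `RESP_ERROR` rather than as silence, which keeps the two user agents in step.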


DMSP Message Types

4. Events
• Asynchronous notifications between user agents with a common data structure
• Events correlated with event listeners
• DOM events
  – DOMActivate, DOMFocusIn, and DOMFocusOut
• HTML 4 events
  – Click, Mouse, Key, submit, reset, etc.
• Error and abort
• VXML Done (e.g., VoiceXML form complete)


DMSP Message Types

4. Events (CONT’D)
• Form Data
  – One or more Field data structures (GUI or Voice)
• Recognition Results
  – One or more Result data structures with raw utterance, score, and one or more Field data structures
• Recognition Results EX
  – One or more Result EX data structures with raw utterance, score, grammar, and semantics
• Start and stop play back
  – Play back of audio or TTS prompts has started or stopped
• Start and stop play back mark
  – TTS encounters a mark in the play text
• Custom (i.e., application-defined)
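The event payloads named above could be modeled roughly as follows. The slides name the Field, Result, and Result EX structures and list what each contains, but give no field names or types, so the attribute names, the types, the "higher score is better" convention, and the `best_result` helper are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Field:
    name: str    # hypothetical: form field identifier
    value: str   # hypothetical: value filled in by GUI or voice


@dataclass
class Result:
    utterance: str   # raw utterance
    score: float     # recognition score; the scale is an assumption
    fields: List[Field] = field(default_factory=list)


@dataclass
class ResultEx:
    utterance: str
    score: float
    grammar: str     # the grammar that matched
    semantics: str   # semantic interpretation, kept as an opaque string here


def best_result(results: List[Result]) -> Result:
    """Pick the highest-scoring result, assuming higher scores are better."""
    return max(results, key=lambda r: r.score)


results = [
    Result("large pizza", 0.62, [Field("size", "large")]),
    Result("large pizzas", 0.35, [Field("size", "large")]),
]
top = best_result(results)
print(top.utterance, [(f.name, f.value) for f in top.fields])
```

A GUI user agent receiving a Recognition Results event could take the top result's Field entries and write them into its form.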


DMSP Conclusion

• A protocol dedicated to distributed multimodal interaction
• Based on the Model-View-Controller design pattern
• Enables both Interaction Manager and Client based View Independent Model topologies
• Asynchronous signals and events
• Command-response messages
• Can be generalized for other modalities besides GUI and Voice
• Supports application-specific result protocols (e.g., EMMA) through extension TBD
• Interested in getting more participation


Draft Charter

• The convergence of wireless communications with information technology and the miniaturization of computing platforms have resulted in advanced mobile devices that offer high resolution displays, application programs with graphical user interfaces, and access to the internet through full function web browsers.

• Mobile phones now support most of the functionality of a laptop computer. However the miniaturization that has made the technology possible and commercially successful also puts constraints on the user interface. Tiny displays and keypads significantly reduce the usability of application programs.

• Multimodal user interfaces, UIs that offer multiple modes of interaction, have been developed that greatly improve the usability of mobile devices. In particular multimodal UIs that combine speech and graphical interaction are proving themselves in the marketplace.

• However, not all mobile devices provide the computing resources to perform speech recognition and synthesis locally on the device. For these devices it is necessary to distribute the speech modality to a server in the network.


Draft Charter (cont.)

• The Distributed Multimodal Working Group will develop the protocols necessary to control, coordinate, and synchronize distributed modalities in a distributed Multimodal system. There are several protocols and standards necessary to implement such a system including DSR and AMR speech compression, session control, and media streaming. However, the DM WG will focus exclusively on the synchronization of modalities being rendered across a network, in particular Graphical User Interface and Voice Servers.

• The DM WG will develop an RFC for a Distributed Multimodal Synchronization Protocol that defines the logical message set to effect synchronization between modalities and enough background on the expected multimodal system architecture (or reference architecture defined elsewhere in W3C or OMA) to present a clear understanding of the protocol. It will investigate existing protocols for the transport of the logical synchronization messages and develop an RFC detailing the message format for commercial alternatives, including, possibly, HTTP and SIP.


Draft Charter (cont.)

• While not being limited to these, for simplicity of the scope the protocol will assume RTP for carriage of media, SIP and SDP for session control, and DSR, AMR, QCELP, etc., for speech compression. The working group will not consider the authoring of applications as it will be assumed that this will be done with existing W3C markup standards such as XHTML and VoiceXML and commercial programming languages like Java and C/C++.


Draft Charter (cont.)

• It is expected that we will coordinate our work in the IETF with the W3C Multimodal Interaction Work Group.

• The following are our goals for the Working Group:
  – Date: Milestone
  – TBD: Submit Internet Draft Describing DMSP (standards track)
  – TBD: Submit Drafts to IESG for publication
  – TBD: Submit DMSP specification to IESG