Sisl: Several Interfaces, Single Logic


INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY 3, 93–108, 2000. © 2000 Kluwer Academic Publishers. Manufactured in The Netherlands.


THOMAS BALL∗
Software Production Research Department, Bell Laboratories, Lucent Technologies, 263 Shuman Blvd., Naperville, IL 60566, USA
[email protected]

CHRISTOPHER COLBY
Department of Mathematical and Computer Sciences, Loyola University Chicago, 6525 N. Sheridan Road, Chicago, IL 60626, USA
[email protected]

PETER DANIELSEN AND LALITA JATEGAONKAR JAGADEESAN
Software Production Research Department, Bell Laboratories, Lucent Technologies, 263 Shuman Blvd., Naperville, IL 60566, USA
[email protected]
[email protected]

RADHA JAGADEESAN AND KONSTANTIN LÄUFER
Department of Mathematical and Computer Sciences, Loyola University Chicago, 6525 N. Sheridan Road, Chicago, IL 60626, USA
[email protected]
[email protected]

PETER MATAGA AND KENNETH REHOR
Software Production Research Department, Bell Laboratories, Lucent Technologies, 263 Shuman Blvd., Naperville, IL 60566, USA
[email protected]
[email protected]

Received July 12, 1999; Accepted December 16, 1999

Abstract. Modern interactive services such as information and e-commerce services are becoming increasingly more flexible in the types of user interfaces they support. These interfaces incorporate automatic speech recognition and natural language understanding and include graphical user interfaces on the desktop and web-based interfaces using applets and HTML forms. To what extent can the user interface software be decoupled from the service logic software (the code that defines the essential function of a service)? Decoupling of user interface from service logic directly impacts the flexibility of services, or how easy they are to modify and extend.

∗ Present address: Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA.


To explore these issues, we have developed Sisl, an architecture and domain-specific language for designing and implementing interactive services with multiple user interfaces. A key principle underlying Sisl is that all user interfaces to a service share the same service logic. Sisl provides a clean separation between the service logic and the software for a variety of interfaces, including Java applets, HTML pages, speech-based natural language dialogue, and telephone-based voice access. Sisl uses an event-based model of services that allows service providers to support interchangeable user interfaces (or add new ones) to a single consistent source of service logic and data.

As part of a collaboration between research and development, Sisl is being used to prototype a new generation of call processing services for a Lucent Technologies switching product.

Keywords: interactive services, domain-specific languages, web, automatic speech recognition, dialogue systems, telephony, voice services, user interfaces, Triveni, Java, VoiceXML

1. Introduction

Modern interactive services are becoming increasingly more flexible in the user interfaces they support. These interfaces incorporate automatic speech recognition (ASR) and natural language understanding, and include graphical user interfaces on the desktop and web-based interfaces using applets and HTML forms. Such services include banking systems, airline reservation systems, stock market services, and Internet call centers, to name but a few.

A basic problem in the development of such systems is that the expense associated with the user interfaces can be significant. For example, approximately 50 percent of the software for systems with highly interactive graphical interfaces is dedicated to the user interface (Myers and Rosson, 1992). This difficulty is accentuated by the wide variety of possible interfaces to modern interactive services. In general, the information presented on each device must be tailored to the functionality that is available on that device. Thus, the graphical interface to an application that is appropriate for a workstation may be quite different from an interface that is appropriate for an interactive voice response system or a cellular phone with a very small display. Last but not least, natural language dialogue interfaces based on automatic speech recognition add new flexibility in interaction with the user.

To what extent can the user interface software be decoupled from the service logic software (the code that defines the essential function of a service)? In general, how does one achieve the following modularity principles?

• Reduce the amount of change to the user interface software required due to modifications to the service logic software, and vice versa;

• Enable service providers to provide interchangeable user interfaces (or add new ones) to a single consistent source of service logic and data.

A first step in this direction is a software architecture following standard ideas from component or distributed programming architectures. Such an architecture specifies the connections and communication between the service logic and user interfaces as abstract interfaces, and permits addition of new user interfaces as long as they comply with the interface specifications. Thus, such an architecture allows several different user interfaces to share the same service logic programming code.

However, a general software architecture does not address issues regarding succinct expression of the service logic. Suppose we want to add a new user interface to an existing service logic. The architecture viewpoint certainly allows this, as long as the code for the new user interface satisfies the interface requirements. However, there is no guarantee that the "old" service logic permits the "new" user interface to be used in an unfettered and completely flexible fashion. This issue is especially significant because of the fundamental differences in the nature of interaction across the spectrum of user interfaces. For example, in typical web interaction based on HTML forms, the buttons and fields of the form serve to constrain the type of information that the user can provide to the service logic. In interactive voice response (IVR) systems, such restrictions are imposed by listing out menus to users ("for ..., press 1; for ..., press 2," etc.) and forcing the user to stay within the listed options. However, when users interact with the system through spoken natural language, control is shared between the user and the system. In a free-flowing conversation, the system can take control by requesting required information from the user, and can verify, correct, or clarify information given by the user. Importantly, however, the user can also query or direct the system at any time by providing information that has not been requested by the system. Consequently, the fact that a service logic is adequate for user interaction based on HTML forms and IVR does not guarantee that it will be adequate for a user interface based on automatic speech recognition, since it does not necessarily support the additional flexibility on the part of the user. To allow this, the service logic has to be designed to support multiple user interfaces by providing a high-level abstraction of the service/user interaction.

Figure 1. A high-level description of the Any-Time Teller.

We demonstrate our approach through an example. Consider Fig. 1, which describes an interactive banking service called the Any-Time Teller (a running example in this paper). The transfer capability establishes a number of constraints among the three input events (source account, target account, and transfer amount) required to make a transfer:

• the specified source and target accounts must both be valid accounts for the previously given (login, PIN) pair;

• the dollar amount must be greater than zero and less than or equal to the balance of the source account;

• it must be possible to transfer money between the source and target accounts.

These constraints capture the minimum requirements on the input for a transfer transaction to proceed. Perhaps more important is what these constraints do not specify: for example, they do not specify an ordering on the three inputs, or what to do if a user has repeatedly entered incorrect information. Thus, the basic principle of designing a flexible service logic might be restated:

Introduce a constraint among events only when absolutely necessary for the correct functioning of a service.

This basic principle leads to a number of sub-principles of service specification that directly address the issues of flexibility and decoupling. For example, the service logic should be able to:

• accept input in different orders;

• accept incomplete input;

• allow the user to correct or update previously submitted information; and

• automatically send to the user interfaces information about user prompts, help, and ways to revert back to previous points in the service.
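The first three sub-principles can be sketched as constraint checks evaluated over whatever subset of the transfer events has arrived so far. This is a minimal illustration only, not the Sisl implementation; the account names, balances, and method names here are invented for the example:

```java
import java.util.Optional;

// Hypothetical sketch: each constraint is checked as soon as the
// events it mentions (src, tgt, amt) have all been received, so the
// three inputs may arrive in any order and may be incomplete.
public class TransferConstraints {
    Optional<String> src = Optional.empty();
    Optional<String> tgt = Optional.empty();
    Optional<Double> amt = Optional.empty();

    // Invented stand-ins for the validity checks described in the text.
    static boolean validAccount(String acct) {
        return acct.equals("checking") || acct.equals("savings");
    }
    static double balance(String acct) {
        return acct.equals("savings") ? 800.0 : 100.0;
    }

    // Reports a violation as soon as any constraint whose inputs are
    // all present fails; empty means "no violation yet".
    Optional<String> check() {
        if (src.isPresent() && !validAccount(src.get()))
            return Optional.of("invalid source account");
        if (tgt.isPresent() && !validAccount(tgt.get()))
            return Optional.of("invalid target account");
        if (amt.isPresent() && amt.get() <= 0)
            return Optional.of("amount must be positive");
        if (src.isPresent() && amt.isPresent()
                && amt.get() > balance(src.get()))
            return Optional.of("insufficient funds");
        return Optional.empty();
    }

    public static void main(String[] args) {
        TransferConstraints t = new TransferConstraints();
        // Events may arrive in any order; here amt arrives before tgt.
        t.src = Optional.of("savings");
        t.amt = Optional.of(900.0);
        // The overdraft is reported before the target is even known.
        System.out.println(t.check().orElse("ok so far"));
    }
}
```

Because each constraint fires as soon as its own inputs are present, the overdraft in the example is caught immediately, without waiting for the remaining inputs.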

There are two basic requirements on user interfaces that can be used in conjunction with such service logics. In particular, any such user interface must be able to:

• based on information received from the service logic, prompt the user to provide the appropriate information and respond if the user needs help;

• collect information from the user and transform the information into abstract events to be sent to the service logic.

A wide range of user interfaces satisfies these requirements. For example, web-based interfaces using HTML pages can perform this event transformation and communication via name/value pairs, as can telephone-based voice interfaces based on VoiceXML (VoiceXML, 1999) document interpretation. Desktop automatic speech recognition interfaces based on the Java Speech API support this capability via tags in the speech grammars (the input to a speech recognition engine that permits it to recognize spoken input efficiently and effectively).

Based on the above principles, we have developed Sisl (Several Interfaces, Single Logic), an approach for designing and implementing interactive services with multiple user interfaces. A key idea underlying Sisl is that all user interfaces to a service share the same service logic, which provides a high-level abstraction of the service/user interaction. It thus modularizes the development process by allowing the implementation of the service logic and user interfaces to proceed independently.

There are two pieces to the Sisl approach: (1) a standard language-independent architecture and (2) reactive constraint graphs, a novel domain-specific language (DSL) for designing and implementing interactive services. Programs written in the DSL are used as components in the architecture. The DSL is based on an analysis of the features required of a service logic that is shared across many different user interfaces, including speech-based natural language interfaces. Among other things, our DSL permits the description of service logics that can accept partial and incomplete information from the user interface, and allows user interfaces to deliver events to the service logic in any order.

Sisl is implemented as a library on top of Triveni (Colby et al., 1998a, 1998b), a process-algebraic API for reactive programming. Triveni itself has been implemented as a library in Java, enabling Sisl to be used with any (Personal) Java implementation. Sisl has an XML front-end to describe reactive constraint graphs in a markup language, allowing Sisl to be used easily by application programmers. The Sisl implementation currently supports applet-based interfaces, HTML interfaces using the Java Servlet API, speech-based natural language interfaces using the Java Speech API, and telephone-based voice access via VoiceXML (VoiceXML, 1999) and the TelePortal platform (Atkins et al., 1997). Our implementation of the Sisl language consists of approximately 2300 lines of Java and Triveni code (in addition to the Triveni library, which is about 6000 lines of Java code). The infrastructure for each of the supported interfaces described above is about 500 lines.

The rest of this paper. In Section 2, we present the criteria for designing and implementing service logics that are shared by several interfaces, and discuss how current approaches perform with respect to these criteria. In Section 3, we describe the top-level architecture of Sisl, including the protocols governing the communication between the service logic and its multiple user interfaces. The requirements on the implementations of user interfaces are described in Section 4, while Section 5 presents the language of reactive constraint graphs. In Section 6, we illustrate the concepts with the implementations and execution traces of part of the functionality of the Any-Time Teller service. This service has an applet, web, automatic speech recognition, and telephone-based voice interface, all sharing the same service logic. Section 7 describes additional features of Sisl, including its associated testing/debugging facility. In Section 8, we conclude by briefly describing some applications of Sisl and discussing future work.

2. Criteria for Flexible Interactive Service Creation

We now discuss the principles for flexible interactive services that guided the design of Sisl, and describe how current approaches perform with respect to these criteria.

2.1. Principles for a Domain-Specific Language

As described in the introduction, a basic principle of flexible service logics is that the service logic should introduce a constraint among events only when absolutely necessary for the correct functioning of the service. We now elaborate on this idea and its consequences, using speech and web-based interfaces to illustrate the concepts.

Partial order on events. The service should implement a partial order on events, rather than a total order on events (in the spirit of Hayes et al., 1985).

For example, consider an implementation of the transfer capability that totally orders the interactions of the service with the user interface: namely, the source account is first collected and checked for validity, then the amount is collected and another validity check is performed, and finally, the target account is collected and other validity checks are performed.

Such an implementation suffices in the context of a web-based HTML interface, which serves up forms in a fixed order and thus controls the order of receipt of information from the user interface.

However, such an implementation overly restricts a speech interface, since robust speech interfaces should allow the information to be collected in any order. In particular, the user should be allowed to say:

• "I want to transfer $500 from my savings to my checking account" or

• "I want to transfer $500 to my checking account from my savings account" or

• "From my savings account, transfer $500 to checking"

Thus, the source account, amount, and target account can be given in any order.

By making explicit which events are independent and which are dependent, it becomes clear what events may be reordered without affecting the behavior of the service.

Partial information. The service should be able to accept partial information in the form of "incomplete" input. Consider the following speech recognition scenario for fund transfer:

• The user says, "I want to transfer $500 from my savings account."

• The user interface, after querying the service logic, prompts the user for the target account.

• The user then specifies the target account.

An implementation of the transfer capability that reacts only when all three events (source account, target account, and amount) are presented checks that the savings account has $500 only after the last step. In the scenario where the user's savings account does not contain $500, the user will have gone through several needless interactions before this is reported by the system.

Since a constraint on the input events may refer to any arbitrary subset of these events, it is desirable that the service logic be able to accept arbitrary subsets of events at any time.

Error recovery. Unfortunately, humans often change their minds and/or make mistakes. Whenever possible, services must accommodate these shortcomings of our species by providing a capability to correct or update previously submitted information, back out of a transaction, and back up to previous points in the service.

User prompts. The service logic must automatically send to the user interfaces information about user prompts, help, and ways to revert back to previous points in the service.

This principle is particularly relevant for services that obey the three previously stated principles. Such services allow multiple points of interaction to be potentially enabled at a given instant, to accommodate the different orderings of inputs. In such a context, the prompt and help information serves as an abstraction of the current control point of the service and can be handled in a different manner by different user interfaces. In automatic speech recognition interfaces, the information about currently enabled events is used by the user interface in two ways (JSAPI, 1998):

• to appropriately prompt the user for information, thus compensating for the lack of visual cues, and

• to effectively parse the information provided by the user.

A user interface need not respond to all currently enabled events of the service. Thus, different user interfaces can formulate different queries to the user even though the state of the underlying service logic (as revealed by the current set of enabled events) is the same.

2.2. Prior Approaches

The traditional approach for designing and implementing interactive services is based on finite state machines (FSMs). These state machines may be described explicitly in a graphical notation such as ObjecTime or StateCharts, or described implicitly in imperative languages such as Perl, C, Mawl (Atkins et al., 1997, 1999) or Active Server Pages (Francis et al., 1998). Figure 2 presents a simplified FSM that one might construct to handle the collection of the three inputs needed for the transfer transaction. Each state is uniquely identified by the set of events that has been collected so far, and each state checks the constraints for which all relevant events have been collected. By construction, the FSM is insensitive to the order in which it receives events, since it explicitly specifies every possible sequence of three events. However, we have left many arcs out of this FSM, including arcs needed to handle the violation of constraints, or the user correcting previously submitted information. This example illustrates the two major problems faced by FSM-based approaches:

• As the number of input events and constraints to be checked increases, there is the potential for an exponential blow-up in the number of states. This is especially true if the FSM has been designed to handle different orderings of input events, incomplete inputs, and correction of previous inputs.

• The FSM is fragile with respect to changes in the requirements of the service. For example, if a fourth event were required for a transfer to take place, the changes to the FSM would be non-modular and hence complicated.

Figure 2. A (simplified) finite state machine for collecting the source account (src), target account (tgt), and dollar amount (amt) needed for the transfer transaction.
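The blow-up is easy to quantify: an FSM whose states track which subset of n order-independent inputs has been collected needs one state per subset, i.e. 2^n collection states, before error handling or corrections are even considered. A small sketch of this arithmetic (illustrative only):

```java
// Illustrative only: an FSM that tracks which subset of n
// order-independent inputs has arrived needs one state per subset.
public class StateCount {
    static long collectionStates(int inputs) {
        return 1L << inputs;   // 2^n subsets of collected inputs
    }

    public static void main(String[] args) {
        // 3 inputs (src, tgt, amt) already need 8 collection states;
        // a few more inputs make hand-drawn FSMs impractical.
        for (int n = 3; n <= 10; n++)
            System.out.println(n + " inputs -> " + collectionStates(n) + " states");
    }
}
```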

Reactive constraint graphs in Sisl allow explicit description of constraints, satisfy the requirements that we have described, and do not suffer from these two problems of FSM-based approaches.

Similar issues also arise in spoken dialogue systems, which combine automatic speech recognition with natural language understanding.

Toolkits and application programming interfaces for building spoken dialogue systems include CSLU (Sutton et al., 1996), VIL (Pargellis et al., 1998), and REWARD (Brondsted et al., 1998). In spoken dialogue systems, dialogue flow is controlled by a dialogue manager, which is responsible for eliciting sufficient information from the user, communicating this information to the application, and reporting information back to the user. In mixed-initiative dialogue systems, dialogue control is shared between the user and the system. Approaches to dialogue management can be broadly classified into FSM-based methods, on the one hand, and self-organizing or locally-managed approaches on the other (McTear, 1998).

• FSM-based methods face the obstacles described above, such as inflexibility of the dialogue, difficulty handling extra information provided by the user, difficulty allowing the user to back up to previous points in the service, and combinatorial explosion of the state machine size when adding more flexibility in user input (McTear, 1998; Goddeau et al., 1996).

• The self-organizing approach to dialogue management includes methods based on frames (Abella et al., 1996) and forms (Goddeau et al., 1996; Issar, 1997); the focus is on systems with a single, spoken natural language interface. In contrast, our approach emphasizes services with multiple and varied interfaces.

Sisl extends event-based approaches aimed at multimodal systems, where dialogue control is represented as a variant of AND/OR trees (Wang, 1998), in two ways. We move from events to partial information on sets of events, and from AND nodes to constraint nodes that wait for a constraint to be satisfied. We also augment the control structures with the ability to revert back to earlier points in the service, and provide automatic generation of prompts and user help.

3. The Sisl Architecture

The Sisl architecture is shown in Fig. 3. All communication between the service logic and its multiple user interfaces1 passes through the service monitor and occurs via events. The service monitor mediates communication between the service logic and the user interfaces on an application-specific basis. In particular, the service monitor is responsible for passing events from the user interfaces to the service logic in a priority order that is tailored to the application.

The user interface prompts and collects events from the user and dispatches these to the service monitor. The service monitor interacts with the service logic in rounds. In each round, an individual event is selected from the input set in priority order and sent to the service logic, which processes the event, performs its associated computations, and reports its new enabled set of events to the service monitor. If events in the input set match the enabled set of events, the service monitor selects one in priority order and the process is repeated. If no event in the input set matches the enabled set of events, the service monitor reports the currently enabled set of events of the service logic to the user interface. Based on these enabled events, the user interface prompts the user and collects the next set of events.

Figure 3. The Sisl architecture.
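The round protocol just described can be sketched as a loop over a prioritized input set. This is a simplified model only; the `Logic` interface and class names are invented for illustration and are not Sisl's actual API:

```java
import java.util.*;

// Simplified model of the service-monitor round protocol: pick the
// highest-priority input event that the logic currently enables,
// deliver it, and repeat until nothing in the input set matches.
public class ServiceMonitor {
    // Stand-in for the service logic: consumes one event atomically
    // and reports the newly enabled event names.
    interface Logic {
        Set<String> deliver(String event);
        Set<String> enabled();
    }

    static List<String> runRound(Logic logic, List<String> inputByPriority) {
        List<String> delivered = new ArrayList<>();
        Set<String> enabled = logic.enabled();
        boolean progress = true;
        while (progress) {
            progress = false;
            for (Iterator<String> it = inputByPriority.iterator(); it.hasNext(); ) {
                String e = it.next();
                if (enabled.contains(e)) {        // matches the enabled set
                    it.remove();
                    enabled = logic.deliver(e);   // logic reacts, reports new set
                    delivered.add(e);
                    progress = true;
                    break;                        // restart in priority order
                }
            }
        }
        // No input matches: the monitor would now report 'enabled'
        // back to the user interface for prompting.
        return delivered;
    }

    public static void main(String[] args) {
        Set<String> en = new HashSet<>(List.of("src", "tgt", "amt"));
        Logic logic = new Logic() {
            public Set<String> deliver(String e) { en.remove(e); return en; }
            public Set<String> enabled() { return en; }
        };
        System.out.println(runRound(logic, new ArrayList<>(List.of("amt", "src"))));
    }
}
```

In this toy run the monitor delivers amt and then src, in input-priority order, and would then prompt for the still-enabled tgt.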

There are three kinds of events: prompt events, up events, and notify events.

Prompt events. Prompt events indicate to the user interface what information to communicate to the user and what information the service is ready to accept. There are three kinds of prompt events:

• prompt_choice events are disjunctive choices currently enabled in the service logic. For example, after the user has successfully logged into the Any-Time Teller, a choice among the different transaction types is enabled. The service logic sends a prompt_choice deposit, prompt_choice withdrawal, prompt_choice transfer event, and so forth, to the service monitor.

• prompt_req events are the events currently required by the service logic. For example, suppose the user has chosen to perform a transfer transaction. The Any-Time Teller requires that the user input a source account, target account, and amount, and hence sends a prompt_req src, prompt_req tgt, and prompt_req amt event to the service monitor.

• prompt_opt events are events enabled in the service logic for which the user may correct previously given information. For example, suppose the user is performing a transfer and has already provided his source and target accounts, but not the amount. The service logic sends a prompt_opt src and prompt_opt tgt event, as well as a prompt_req amt event, to the service monitor. This indicates that the user may override the previously given source and target accounts with new information.

Up events. Up events correspond to earlier points in the service logic to which the user may go back. For example, the service logic sends an up MainMenu event to the service monitor. This allows the user to abort any transaction and go back up to the main menu.

Notify events. Notify events are simply notifications that the user interface should give the user, for example, that a transaction has completed successfully or that information provided by the user was incorrect or inconsistent.
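The event kinds above can be modeled as a small tagged hierarchy. This is an illustrative sketch, not Sisl's event classes; the names and the mid-transfer example are taken from the scenario in the text:

```java
// Sketch of the event kinds the service logic sends to the user
// interface (names are illustrative, not Sisl's actual API).
public class SislEvents {
    enum Kind { PROMPT_CHOICE, PROMPT_REQ, PROMPT_OPT, UP, NOTIFY }

    static final class Event {
        final Kind kind;
        final String name;   // e.g. "src", "MainMenu"
        Event(Kind kind, String name) { this.kind = kind; this.name = name; }
        public String toString() { return kind + " " + name; }
    }

    // Example from the text: mid-transfer, src and tgt already given,
    // amt still needed, and the user may always return to the menu.
    static java.util.List<Event> midTransfer() {
        return java.util.List.of(
            new Event(Kind.PROMPT_OPT, "src"),   // may be corrected
            new Event(Kind.PROMPT_OPT, "tgt"),   // may be corrected
            new Event(Kind.PROMPT_REQ, "amt"),   // still required
            new Event(Kind.UP, "MainMenu"));     // abort to main menu
    }

    public static void main(String[] args) {
        midTransfer().forEach(System.out::println);
    }
}
```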

4. Requirements on User Interfaces

User interfaces have two main responsibilities with respect to the Sisl architecture, reflecting the two-way information flow between the user interface and service logic:

• Based on the events received from the service logic (via the service monitor), prompt the user to provide the appropriate information and respond if the user requests help.

• Collect the information from the user and transform the information into events to be sent to the service logic, via the service monitor.

Any user interface that performs these functions can be used in conjunction with a Sisl service logic. For example, web-based interfaces using HTML pages can perform this event transformation and communication via name/value pairs, as can telephone voice interfaces based on VoiceXML document interpretation. Desktop automatic speech recognition interfaces based on the Java Speech API support this capability via tags in the speech grammars (the input to a speech recognition engine that permits it to recognize spoken input efficiently and effectively).

In addition, Sisl provides a convenient framework for designing and implementing web, applet, automatic speech recognition, and telephone-based voice interfaces. To implement such user interfaces, the UI designer need only specify two functions, corresponding to the prompt and help mechanisms. For automatic speech recognition interfaces (on the desktop and via the telephone), a set of speech grammars together with a third function that specifies which grammars to enable is also required.

The required functions are:

• A prompt function that generates the string to be given as the prompt to the user. An example is given in Fig. 4. The Sisl infrastructure automatically causes speech-based interfaces to speak the prompt string. Web-based interfaces automatically display the prompt string, as well as radio buttons corresponding to the possible transaction choices. (For the other prompt events, text fields are automatically displayed, while submit buttons are automatically displayed for enabled up events.) A screen snapshot is given in Fig. 5.

• A help function that generates the string to be given as help to the user. An example is given in Fig. 4.

• A grammar function that enables the correct set of grammar rules. This function is only needed for automatic speech recognition interfaces. An example is given in Fig. 6.

Figure 4. A portion of the web, desktop ASR, and telephone-based voice user interface for the Any-Time Teller.

Figure 4 shows portions of the prompt and help functions shared by a web, automatic speech recognition, and telephone-based voice interface for the Any-Time Teller. Portions of the grammar rules, against which the speech recognition engine will parse spoken input from the user, are depicted in Fig. 6. Figure 7 illustrates a portion of the associated grammar function shared by an automatic speech recognition interface and telephone-based voice interface.
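The designer's side of this contract can be sketched as two functions from the currently enabled event names to strings. The signatures, wording, and event names below are invented for illustration and do not reproduce the functions in Fig. 4:

```java
import java.util.*;

// Sketch: a Sisl-style UI is specified by a prompt function and a
// help function over the currently enabled event names. Signatures
// and wording are hypothetical, not Sisl's actual API.
public class TellerPrompts {
    static String prompt(Set<String> enabled) {
        List<String> wanted = new ArrayList<>();
        if (enabled.contains("src")) wanted.add("the source account");
        if (enabled.contains("tgt")) wanted.add("the target account");
        if (enabled.contains("amt")) wanted.add("the amount");
        return "Please specify " + String.join(", ", wanted) + ".";
    }

    static String help(Set<String> enabled) {
        return "You may provide any of the requested items in any order,"
             + " or say 'main menu' to start over.";
    }

    public static void main(String[] args) {
        // All three transfer inputs are still enabled.
        System.out.println(prompt(Set.of("src", "tgt", "amt")));
    }
}
```

A speech interface would speak these strings, while a web interface would render them alongside automatically generated form fields, as described above.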

From these functions (and grammars), the Sisl infrastructure automatically coordinates the collection and event transformer mechanisms and integrates the user interface with the service logic and the service monitor. For automatic speech recognition-based interfaces, the Sisl infrastructure automatically generates a desktop interface based on the Java Speech API. To enable telephone-based voice access to the service, Sisl automatically generates VoiceXML (VoiceXML, 1999) pages, which specify the voice dialog to be carried out on a telephony platform. For web-based interfaces, the Sisl infrastructure automatically generates HTML pages. (We note that Sisl also provides a mechanism for the UI designer to customize the look and feel of the interface, if so desired.)

Figure 5. A screen shot of the web interface: choice node.

Figure 6. A portion of the speech recognition grammar for the desktop ASR and telephone-based voice user interface for the Any-Time Teller.

Figure 7. A portion of the desktop ASR and telephone-based voice user interface for the Any-Time Teller.

5. Service Logic: Reactive Constraint Graphs

In Sisl, the service logic of an application is specified as a reactive constraint graph, which is a directed acyclic graph with an enriched structure on the nodes. The traversal of reactive constraint graphs is driven by the reception of events from the environment. These events have an associated label (the event name) and may carry associated data. In response, the graph traverses its nodes and executes actions; the reaction of the graph ends when it needs to wait for the next event to be sent by the environment. An important aspect of our computational paradigm is that every reaction of the graph is atomic with respect to events: that is, no event can be received from the environment during a reaction of the graph.2

For example, Fig. 8 depicts a Sisl reactive constraint graph that implements part of the functionality of the Any-Time Teller. As mentioned in the introduction, Sisl has an XML front-end to describe reactive constraint graphs in a markup language, allowing Sisl to be used easily by application programmers.

Figure 8. The reactive constraint graph for a portion of the Any-Time Teller.

Types of nodes. Reactive constraint graphs can have three kinds of nodes.

Choice nodes. These nodes represent a disjunction of information to be received from the user. There are two forms of choice nodes: event-based and data-based. Every event-based choice node has a specified set of events. For every event in this set, the Sisl infrastructure automatically sends out a corresponding prompt_choice event from the service logic to the user interfaces (via the service monitor). The choice node waits for the user interfaces to send (via the service monitor) any event in the specified set. When such an event arrives, the corresponding transition is taken, and control transfers to the child node. To ensure determinism, all outgoing transitions of a choice node must be labeled with distinct event names.

Every data-based choice node has a specified set of preconditions on data. To ensure determinism, these preconditions must be specified so that exactly one of them is true in any state of the system. When control reaches a data-based choice node, the transition associated with the unique true precondition is taken, and control transfers to the child node.
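The two forms of choice node can be modeled abstractly. The following is a minimal Python sketch, not actual Sisl code (all names here are hypothetical): an event-based choice node dispatches on the name of the next received event, while a data-based choice node dispatches on the unique true precondition.

```python
# Hypothetical sketch of Sisl choice-node dispatch (not the real Sisl API).

def event_choice(transitions, event):
    """transitions: dict mapping event name -> child node.
    Distinct event names guarantee deterministic dispatch."""
    if event in transitions:
        return transitions[event]   # take the matching transition
    return None                     # event not enabled here: keep waiting

def data_choice(branches, state):
    """branches: list of (precondition, child) pairs; exactly one
    precondition must hold in any state to ensure determinism."""
    true_branches = [child for pred, child in branches if pred(state)]
    assert len(true_branches) == 1, "preconditions must be mutually exclusive"
    return true_branches[0]

# Example: the Any-Time Teller main menu as an event-based choice node.
menu = {"startDeposit": "deposit", "startWithdrawal": "withdrawal",
        "startTransfer": "transfer", "userQuit": "quit"}
```

For instance, `event_choice(menu, "startTransfer")` transfers control to the transfer subgraph, while an event such as `src` is simply not accepted at this node.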

Constraint nodes. These nodes represent a conjunction of information to be received from the user. Every constraint node has an associated set of constraints on events. The Sisl infrastructure automatically sends out a prompt_req event (from the service logic to the user interfaces, via the service monitor) for every event that is still needed in order to evaluate some constraint. In addition, it automatically sends out a prompt_opt event for all other events mentioned in the constraints. These correspond to information that can be corrected by the user.

In every round of interaction, the constraint node waits for the user interfaces to send (via the service monitor) any event that is mentioned in its associated constraints. Each constraint associated with a constraint node is evaluated as soon as all of its events have arrived. If an event is resent by the user interfaces (i.e., information is corrected), all constraints with that event in their signature are re-evaluated. For example, in Fig. 8, the constraint node for the transfer capability waits for the events src, tgt, and amt and evaluates the corresponding constraints.

For every evaluated constraint, its optional satisfied/violated action is automatically executed, and a notify event is automatically sent to the user interfaces with the specified message. If any constraint is violated, the last received event is automatically erased from all constraints, since it caused an inconsistency.

A constraint node exits when all of its constraints have been evaluated and are satisfied.

Action nodes. These nodes represent some action, not involving user interaction, to be taken. After the action is executed, control transfers to the child node.

Additional structure on nodes. All nodes can have an optional "up-label," which is used to transfer control from some descendant back up to the node, allowing the user to revert back to previous points in the service. In each round of interaction, the Sisl infrastructure automatically sends out an up event (from the service logic to the user interfaces, via the service monitor) corresponding to the up-label of every ancestor of the current node.

Nodes can also have a self-looping arc, with a boolean precondition on data. This indicates that the subgraph from the node will be repeatedly executed until the precondition becomes false.

Details about constraint nodes. We now give more details about the structure of constraint nodes. Constraints have the following components:

• The signature is the set of events occurring in the constraint.
• The evaluation function is a boolean function on the events in the signature.
• The (optional) satisfaction tuple consists of an optional action (not involving user interaction) and an optional notify function that may return a notify event with an associated message. If the constraint evaluates to true, the action is executed, the notify function is executed, and the returned notify event is sent to the user interfaces (via the service monitor).
• The (optional) violation tuple consists of an optional action (not involving user interaction), an optional notify function that may return a notify event with an associated message, an optional up-label function that may return the up-label of an ancestor node, and an optional child node. If the constraint evaluates to false, the action is executed, the notify function is executed, and the returned notify event is sent to the user interfaces (via the service monitor). The up-label function is also executed: if it returns an ancestor's up-label, it is generated, and hence control reverts back to that ancestor. If no ancestor's up-label is returned, and a child node is specified, control is transferred to that node.


Every constraint node has the following components:

• An associated set of constraints. In the current semantics and implementation, this set is totally ordered,3 specifying the priority order in which constraints are evaluated.
• An optional entry action (not involving user interaction).
• An optional finished tuple, consisting of an optional exit action (not involving user interaction), an optional notify function, an optional up-label function, and an optional child node.

A detailed description of constraint node execution is given in Fig. 9, and summarized below.

The constraints are evaluated in the specified priority order (currently given by the totally ordered set). If some constraint is violated and no control flow changes (via an up-label function or child node) occur as a result, the node goes back to waiting for events. If no constraints have been violated, the action and notify functions of every satisfied constraint are executed, and the returned notify events are sent to the user interfaces (via the service monitor). When all the constraints have been evaluated and are satisfied, the exit action and notify function associated with the constraint node are executed, and control is transferred either to an ancestor (via the up-label function) or to the child node.
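The execution described above can be sketched in a few lines of Python. This is an illustrative model under our own naming, not Sisl's implementation: constraints are (signature, predicate) pairs evaluated in priority order over the events received so far, and a violation erases the most recently received event.

```python
# Hypothetical model of constraint-node interaction rounds (not real Sisl code).

class ConstraintNode:
    def __init__(self, constraints):
        # constraints: ordered list of (signature, predicate) pairs, where
        # signature is a set of event names and predicate takes a dict of
        # event values. List order gives the evaluation priority.
        self.constraints = constraints
        self.received = {}               # event name -> latest value

    def receive(self, name, value):
        """Process one event; return 'satisfied', 'violated', or 'waiting'."""
        self.received[name] = value
        for signature, predicate in self.constraints:
            if signature <= self.received.keys():       # all events present
                if not predicate(self.received):
                    del self.received[name]             # erase the offender
                    return "violated"
        # exit only when every constraint is evaluable and satisfied
        if all(sig <= self.received.keys() for sig, _ in self.constraints):
            return "satisfied"
        return "waiting"

# The transfer capability of Fig. 8, with a stubbed balance function.
balance = {"checking": 800, "savings": 2000}
transfer = ConstraintNode([
    ({"amt"}, lambda e: e["amt"] >= 0),
    ({"src", "amt"}, lambda e: e["amt"] <= balance[e["src"]]),
    ({"src", "tgt"}, lambda e: e["src"] != e["tgt"]),
])
```

With this model, sending src = checking and then amt = 1000 triggers a violation (insufficient funds) that erases only the amount, exactly as in the example execution of Section 6.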

6. An Example Execution of the Any-Time Teller

To illustrate the Sisl paradigm, we now describe an example execution of the Sisl Any-Time Teller depicted in Fig. 8, using the web, automatic speech recognition, and telephone-based voice interfaces partially specified in Figs. 4, 6 and 7.

Logging into the Service. The service initially says "Welcome to the Any-Time Teller." The control point is at the root node, a constraint node. For the constraint to be satisfied, the login and the pin values must be identical (i.e., login==pin). The Sisl infrastructure automatically sends out prompt_req login and prompt_req pin events from the service logic to the user interfaces (via the service monitor). The user interfaces (via the prompt function) say "Please specify your login and personal identification number." For the web interface, text fields for the login and pin are automatically generated; for the speech-based interfaces, the grammars specified in the grammar function are automatically enabled.

In this scenario, suppose the user says "My login is Mary Smith and my pin is Mary Smith", and hence a login event with the value "Mary Smith" and a pin event with the value "Mary Smith" are sent to the service logic. Since the login and pin are identical, the constraint is satisfied. The Sisl infrastructure automatically sends out a notify event with the message "Hello Mary Smith. Welcome to the Sisl Any-Time Teller." The user interface says this message to the user.

Login Successful. Control now proceeds to the choice node. The Sisl infrastructure automatically sends out

• prompt_choice startDeposit,
• prompt_choice startWithdrawal,
• prompt_choice startTransfer, and
• prompt_choice userQuit

events from the service logic to the user interfaces (via the service monitor), corresponding to the enabled choices. The user interfaces ask the user "What transaction would you like to do?" Figure 5 shows a screen snapshot of the web-based interface; the possible choices are automatically represented as radio buttons. For an automatic speech recognition interface, if the user says "I need help," the user interface says (via the help function of Fig. 4) "You can make a withdrawal, deposit, transfer, or you can quit the service." Suppose the user now chooses to perform a transfer; the startTransfer event is sent to the service logic.

Transfer. Control now proceeds to the transfer constraint node. The Sisl infrastructure automatically sends out

• prompt_req src,
• prompt_req tgt, and
• prompt_req amt

events from the service logic to the user interfaces (via the service monitor), together with an up MainMenu event (since it is the up-label of an ancestor). Suppose the user responds with "I would like to transfer $1000 from my checking account," or equivalently "From checking, I'd like to transfer $1000." Either order of information is allowed; furthermore, this information is partial, since the target account is unspecified. The user interface sends a src and an amt event, with the corresponding data, to the service monitor, which sends them one at a time to the service logic; suppose the src is sent first, followed by the amt event. The constraints amt >= 0, isValid(src), and amt <= Balance(src) are automatically evaluated.

Figure 9. The execution of constraint nodes.

Suppose the checking account does not have a balance of at least $1000; hence, there is a constraint violation, and the supplied amount information is erased since it was sent last. Note that constraints are evaluated as soon as possible: for example, the user is not required to specify a target account in order for the balance on the source account to be checked. The Sisl infrastructure then automatically sends out prompt_opt src, prompt_req tgt, prompt_req amt, and up MainMenu events from the service logic to the user interfaces (via the service monitor), as well as a notify event with the message, "Your checking account does not have sufficient funds to cover the amount of $1000. Please specify an amount and a target account." The user interface then notifies the user with this message and prompts the user for the information.

Suppose the user now says, "Transfer $500 to savings." The amt and tgt events are sent to the service monitor, and passed to the service logic. The constraints are now all evaluated and satisfied, and the service logic automatically sends a notify event to the user interfaces with the message, "Your transfer of $500 from checking to savings was successful."

Control then goes back up to the choice node. The loop on the incoming arc to the choice node indicates that the corresponding subgraph is repeatedly executed until the condition on the arc becomes false. If the user wants to quit, the userQuit event is sent to the service logic, the hasQuit variable is set to true, and the loop is terminated.

Back up to the Main Menu. If the user would like to abort at any time during a withdrawal, deposit, or transfer transaction, he/she can say, "I would like to go back up to the Main Menu," which results in an up MainMenu event being sent to the service logic. This causes control to return to the choice node, which has an associated MainMenu up-label.

7. Other Sisl Features

We now describe two additional features of Sisl: a form of event priorities and lookahead, and a testing/debugging facility.

Event priorities and lookahead. As described in Section 3, the user interfaces collect a set of events from the user and dispatch them to the service monitor. The service monitor then selects and sends events from this set, one at a time, to the service logic. In particular, the service monitor will only send events that match the currently enabled set of events of the service logic. This behavior automatically supports a form of event priorities, since enabled events have higher priority than others. It also provides a form of lookahead, since additional information given by the user is stored until it becomes relevant. For example, consider a scenario of the Any-Time Teller in which the user has successfully logged in. The service logic's control point is at the choice node, and the enabled events of the service logic are startDeposit, startWithdrawal, startTransfer, and userQuit. Suppose the user now says "I would like to transfer $500 from checking," and hence the user interface passes the startTransfer, src, and amt events to the service monitor. Since only the startTransfer event is enabled at this node, it automatically will be sent first by the service monitor. The service logic now proceeds to the transfer constraint node, and the src, amt, and tgt events now become enabled. Since the user has already given the source and amount information, it will automatically be passed to the service logic, and the user will then be prompted for the remaining information, namely, the target account. In this fashion, Sisl services can automatically perform lookahead in the user input, without requiring any modifications to the service logic.
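The buffering behavior just described can be sketched as follows. This is a hedged Python model (the class and method names are ours, hypothetical, not Sisl's actual service monitor): events that are not yet enabled are held back and delivered once the service logic enables them.

```python
# Hypothetical sketch of the service monitor's priority/lookahead buffering.

from collections import deque

class ServiceMonitor:
    def __init__(self):
        self.pending = deque()           # events collected from the UIs

    def dispatch(self, events, enabled):
        """Queue `events` from a user interface, then deliver (in order)
        every pending event whose name is currently enabled; events that
        are not yet enabled stay buffered, providing lookahead."""
        self.pending.extend(events)
        deliverable = [e for e in self.pending if e[0] in enabled]
        self.pending = deque(e for e in self.pending if e[0] not in enabled)
        return deliverable

monitor = ServiceMonitor()
# At the main menu only the transaction choices are enabled:
first = monitor.dispatch(
    [("startTransfer", None), ("src", "checking"), ("amt", 500)],
    enabled={"startDeposit", "startWithdrawal", "startTransfer", "userQuit"})
# Later, the transfer constraint node enables src/tgt/amt, and the
# buffered information is delivered without re-prompting the user:
second = monitor.dispatch([], enabled={"src", "tgt", "amt"})
```

In this run, only startTransfer is delivered at the menu; the src and amt events remain buffered until the transfer constraint node enables them.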

Specification-based testing. Reactive constraint graphs in the Sisl language can be instrumented with a non-intrusive testing and debugging facility in the flavor of assert statements in traditional languages. In particular, service specifications can be expressed as safety properties, or conditions, on sequences of events in propositional linear-time temporal logic (Manna and Pnueli, 1992).

As a simple example, in the Any-Time Teller, the environment must send a startTransfer event (as represented by the event label on the choice node transition) before the src, tgt, and amt events can become enabled (as represented by the events in the transfer constraint node). In addition, the environment must first send login and pin events in any order to start up the service. This required behavior can be represented as a temporal logic specification; if this specified property is violated at any point during an execution of the service, the assertion fails and the Sisl toolset generates a special event. The service programmer/tester has the option to abort the application, ignore the failed assertion, or ask the system to report entire test traces.
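Safety properties of this kind can be checked by a small monitor that observes the event sequence. The sketch below is an illustrative Python stand-in for Sisl's temporal-logic facility (the encoding is ours, not Sisl's): it checks that login and pin both arrive, in either order, before any transaction event.

```python
# Hypothetical runtime monitor for a simple safety property (not Sisl's
# actual temporal-logic checker): login and pin must both be seen, in
# either order, before any transaction event is accepted.

TRANSACTIONS = {"startDeposit", "startWithdrawal", "startTransfer"}

def check_trace(trace):
    """Return True iff the sequence of event names satisfies the property."""
    seen = set()
    for event in trace:
        if event in TRANSACTIONS and not {"login", "pin"} <= seen:
            return False                 # assertion fails: property violated
        seen.add(event)
    return True
```

A toolset detecting `check_trace(...) == False` mid-execution would then raise its special event, at which point the tester can abort, ignore, or request the full trace.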

8. Conclusions

Our implementation of the Any-Time Teller consists of approximately 500 lines of Sisl code. It currently has applet, HTML, automatic speech recognition, and VoiceXML (VoiceXML, 1999) interfaces, each about 300 lines, all sharing the same service logic.

We are currently using Sisl to develop an interactive service based on a system for visual exploration and analysis of data (Cox et al., 1999). Our service supports multiple interfaces, including web, automatic speech recognition, and text-based interfaces, all with natural language understanding. All interfaces share the same service logic.

Moving further up in application size, Sisl is being used to prototype a new generation of call processing services for a Lucent Technologies switching product, as part of a collaboration between research and development. These services must be accessible via both graphical and speech-based natural language interfaces, and must support the inexperienced as well as the expert user. As part of this collaboration, the research/development team has prototyped the requirements for some call processing features and has integrated these prototyped services into the call processing platform of the actual switch. We have used web, touch-tone, and speech-based interfaces to interact with these services.

We are also exploring the use of Sisl in the development of collaborative applications in which users may interact with the system through a variety of devices, including personal digital assistants, cellular telephones, and traditional desktop devices. For such applications, a richer class of information, including documents and information visualization views, may be sent by the service logic to the user interfaces. We expect that this will lead to interesting extensions of Sisl, in which the transmitted information may be tailored for a variety of devices. For example, a spreadsheet or information visualization view may appear in a distilled form on a personal digital assistant, while a document may be automatically spoken on a cellular phone without a screen.

Detailed user studies and experience reports on these applications will appear in later publications.

Acknowledgments

Christopher Colby's and Radha Jagadeesan's work on Triveni is supported in part by a grant from the National Science Foundation.

Notes

1. There is one instance of the service logic for each user interacting with the service.

2. We note that reactive constraint graphs consume events. In particular, if a parent node and a child node both react to a certain event, then the child node will need to wait for the next occurrence of that event. The event is said to be consumed by the parent node.

3. This may be generalized in the future to support partial orderings.

References

Abella, A., Brown, M., and Buntschuh, B. (1996). Development principles for dialog-based interfaces. In Proceedings of the European Conference on Artificial Intelligence, Vol. W29. Budapest, Hungary: Wiley, pp. 1–6.

Atkins, D.L., Ball, T., Benedikt, M., Cox, K., Ladd, D., Mataga, P., Puchol, C., Ramming, J., Rehor, K., and Tuckey, C. (1997). Integrated web and telephone service creation. Bell Labs Technical Journal, 2(1):19–35.

Atkins, D.L., Ball, T., Bruns, G., and Cox, K. (1999). Mawl: A domain-specific language for form-based services. Institute of Electrical and Electronic Engineers Transactions on Software Engineering, 25(3):334–346.

Brondsted, T., Bai, B., and Olsen, J. (1998). The reward service creation environment, an overview. In Proceedings of the International Conference on Spoken Language Processing, Vol. 4. Sydney, Australia: Australian Speech Science and Technology Association, Incorporated, pp. 1175–1178.

Colby, C., Jagadeesan, L.J., Jagadeesan, R., Läufer, K., and Puchol, C. (1998a). Design and implementation of Triveni: A process-algebraic API for threads + events. In Proceedings of the International Conference on Computer Languages. Chicago, IL: IEEE Computer Society Press, pp. 58–67.

Colby, C., Jagadeesan, L.J., Jagadeesan, R., Läufer, K., and Puchol, C. (1998b). Objects and concurrency in Triveni: A telecommunication case study in Java. In Proceedings of the 4th Conference on Object-Oriented Technologies and Systems (COOTS-98). Santa Fe, NM: USENIX Association, pp. 133–148.

Cox, K., Hibino, S., Hong, L., Mockus, A., and Wills, G. (1999). Infostill: A task-oriented framework for analyzing data through information visualization. In Institute of Electrical and Electronic Engineers Symposium on Information Visualization Late-Breaking Results. San Francisco, CA: IEEE Computer Society, pp. 19–22.

Francis, B., Fedorov, A., Harrison, R., Homer, A., Murphy, S., Smith, R., and Sussman, D. (1998). Professional Active Server Pages 2.0. Chicago, IL: Wrox Press Inc.

Goddeau, D., Meng, H., Polifroni, J., Seneff, S., and Busayapongchai, S. (1996). A form-based dialogue manager for spoken language applications. In Proceedings of the International Conference on Spoken Language Processing, Vol. 2. Philadelphia, PA: IEEE, pp. 701–704.

Hayes, P.J., Szekely, P.A., and Lerner, R.A. (1985). Design alternatives for user interface management systems based on experience with COUSIN. In Proceedings of the Conference on Human Factors in Computing Systems, Interface Tools and Structures. San Francisco, CA: Association for Computing Machinery, pp. 169–175.

Issar, S. (1997). A speech interface for forms on WWW. In Proceedings of the European Conference on Speech Communication and Technology. Rhodes, Greece: European Speech Communication Association, pp. 1343–1346.

JSAPI. (1998). Java Speech API programmer's guide. http://java.sun.com/products/java-media/speech/.

Manna, Z. and Pnueli, A. (1992). The Temporal Logic of Reactive and Concurrent Systems. New York: Springer.

McTear, M. (1998). Modelling spoken dialogues with state transition diagrams: Experiences of the CSLU toolkit. In Proceedings of the International Conference on Spoken Language Processing, Vol. 4. Sydney, Australia: Australian Speech Science and Technology Association, Incorporated, pp. 1223–1226.

Myers, B. and Rosson, M. (1992). Survey on user interface programming. In Proceedings of the Conference on Human Factors in Computing Systems. Monterey, CA: Association for Computing Machinery, pp. 195–202.

Pargellis, A., Zhou, Q., Saad, A., and Lee, C.-H. (1998). A language for creating speech applications. In Proceedings of the International Conference on Spoken Language Processing, Vol. 4. Sydney, Australia: Australian Speech Science and Technology Association, Incorporated, pp. 1619–1622.

Sutton, S., Novick, D.G., Cole, R.A., Vermeulen, P., de Villiers, J., Schalkwyk, J., and Fanty, M. (1996). Building 10,000 spoken dialogue systems. In Proceedings of the International Conference on Spoken Language Processing, Vol. 2. Philadelphia, PA: IEEE, pp. 709–712.

VoiceXML. (1999). http://www.voicexml.org/.

Wang, K. (1998). An event driven model for dialogue systems. In Proceedings of the International Conference on Spoken Language Processing, Vol. 2. Sydney, Australia: Australian Speech Science and Technology Association, Incorporated, pp. 393–396.