
A Scalable Service Architecture for Computer-Telephony Integration

R. Katz, A. Joseph, S. Czerwinski, T. Hodes, B. Hohlt, E. Kiciman, R. Ludwig*, S. Mukkamalla, K. Oden, A. Ordonez, B. Raman, J. Shih, H. Wang, B. Zhao

Computer Science Division
Department of Electrical Engineering and Computer Science
University of California at Berkeley, Berkeley, CA, USA 94720-1776

* affiliated with Ericsson Radio Systems

Abstract

The convergence of traditional voice-oriented telecommunications networks and data-oriented computer communications networks is yielding new challenges for building systems equally adept at handling voice and data applications. While there is much discussion about packetized voice over IP networks, a little explored opportunity is the ability to more easily deploy innovative new services based on the Internet’s client-server paradigm and the ease with which software agents can be introduced and migrated around the network. We discuss our new architecture for middleware services that more effectively enables the integration of telephone and data applications. This horizontally-integrated architecture supports competition between interchangeable service implementations, based upon features, cost, etc. It is characterized by pervasive and seamless access across multiple cascaded networks. We describe our experiences in integrating an Internet-based core with cellular and other access networks, and our analysis of IP performance in this testbed using a graphical multi-layer protocol analysis tool. Based on our architecture, we have developed prototype converged applications for voice-actuated room control and personal “universal in-box” information management.

Key Words and Phrases: Computer-telephony integration; middleware; data and voice convergence; hybrid network architectures

1. Introduction

1.1. A Motivating Scenario

Imagine that you walk into a room with a multi-network communicator device that operates over several wireless networks: IR, Wireless LAN, cellular (alternatively, you might be carrying several devices: a cell phone, pager, PDA, and laptop, each with its own options for network connectivity).

Your communicator (or preferred device) connects to the local infrastructure and dialogs with it to build a custom User Interface (UI) for controlling the room’s environment. This control UI could be device-specific (e.g., a cell phone might use a speech-based interface, while a laptop might use a Graphical UI). Using a cell phone, you could speak the phrase “dim the lights,” and the lights in your room would be dimmed. Step into another room, and utter the phrase “lights on,” and the lights in that room would come on. Location information provided by the end-device or access network would be used to specialize the speech-recognized commands and apply them to the local environment.

Once connectivity and the local control interface are established, your device asks the infrastructure to establish a secure “path” from the local environment to your personal information space of preferences, address book, messages, agents processing on your behalf, etc. The device creates the path by exchanging network- and service-specific authentication information with the local access network (e.g., communicating the user’s Caller ID information from a cellular phone). The gateways along the path transcode the information to the appropriate format for each network.

Agents in your personal information space are processing e-mail, voice-mail, faxes, and paging messages for you. Based on your preferences and user- and service-specified policies, any communications (including two or more way conversations) directed at you can be intercepted, translated to the desired form, and delivered on your preferred device. Using this Universal In-box Service, your e-mails can be processed and summarized, and converted into text for delivery via the Short Message Service to your cell phone. A more sophisticated example is capturing a voice message, directing it to a speech recognition service, passing the resulting text to a natural language service to intelligently extract headers, and delivering the summary to your e-mail in-box.

We call these services and capabilities Potentially Any Network Services (PANS), as they are accessible from multiple networks. These PANS services are enabled by the convergence of telecommunications and data networks. This paper presents an architecture that provides the necessary tools and framework for building the applications in this scenario.

1.2. The Opportunity: An End-to-End Digital Network based on IP

There is much discussion about the convergence of telecommunications and data networks. The fastest growing sectors of the telecommunications market are cellular/mobile access networks on the one hand, and the Internet/World Wide Web on the other (Figure 1). The growth of mobile phone subscribers has been so rapid that the number of mobile phones may soon exceed the number of wireline phones in some parts of the world (Figure 2). These trends suggest an emerging pervasive capability for access to information on the move, enabled by portable access devices that combine some of the capabilities of a telephone with those of a personal digital assistant (PDA) (e.g., communicator devices like the Nokia 9000 and the new Qualcomm PdQ).

That these networks will converge is inevitable. While the core of the public-switched telephone network (PSTN) has been digital for several decades, it is only recently that full end-to-end digitization of the phone network has become realistic. The rapid growth of digital cellular access networks has led the way, providing a pervasive digital infrastructure supporting mobile users, albeit at low data rates (Figure 3). Third generation systems offer the promise of hundreds of Kbit/s to a few Mbit/s in the wide area (Figure 4). These developments are being followed by accelerating deployments of broadband access to the home, via such technologies as cable modem and xDSL, driven by the exploding demand for Internet access (Figure 5).

With the rollout of digital broadband “last mile” extensions, momentum is building for a universal IP-based core network. IP networks have advantages in terms of low deployment cost, support for heterogeneous link, network, and access device technologies, and ease of deploying network-based services around well-understood structuring methods such as Remote Procedure Call [Nelson84] and Active Messages [Culler92]. But IP networks also face challenges in achieving adequate performance for critical real-time applications like packet voice. Nevertheless, the technology for voice over IP is progressing rapidly, and its usage will accelerate. Exploitation of voice, and services that enable more effective use of voice, will be important in the converged network. This is especially true in the context of small, portable access devices with limited user interface support.

In fact, World Wide Web access is overtaking voice as the “killer” application of future networks. In places like the San Francisco Bay Area, data is already the dominant type of traffic on the network. Marketing studies predict that the fastest growing applications of the integrated network will not be voice, but rather web-based transactions [Nelson98]. A network for excellent end-to-end data transaction performance is important, perhaps more so than support for real-time applications like voice and video.

Figure 1. Rapid Growth of Mobile Telephone Subscribers and Internet Users World Wide (millions of users by year, 1993 to 2001; series: Mobile Telephone Users, Internet Users). Source: Ericsson Radio Systems

Figure 2. Growth of Cellular Subscribers in Hong Kong: Mobile Access Reaching Parity with Fixed Services (millions of telephone lines by year, 1996 to 2000; series: Fixed, Mobile). Source: Pyramid Research in The Economist, 31 Oct 98

Figure 3. Second Generation Cellular Networks Represent the Largest Base of Digital Access to the Subscriber (millions of subscribers by year, 1993 to 2001; series: Digital, Analog). Source: Ericsson Radio Systems

Figure 4. Expected Bandwidths in Third Generation Systems (bandwidth in Mbps, 0.01 to 100; diagram labels: Office or Room, Building, Indoors, Outdoors, Stationary, Walking, Vehicle, Wired, Cordless, Cellular, Wireless Local Area Networks, Universal Mobile Telecomms Systems (UMTS), Mobile Broadband Systems, 60 GHz with 100 m range)

Figure 5. Broadband Digital Access to the Home Will Grow Rapidly But Not Completely Displace Narrowband Modem Access in the Next Few Years (forecast American households with Internet connections, in millions, 1998 to 2002; series: Broadband, Narrowband). Source: Forrester Research in The Economist, 7 Nov 98

A by-product of these developments is that the circuit-switched PSTN infrastructure has become overloaded. This marvelously engineered system was never designed for the long duration connections of computer dial-up sessions. Local operators are being driven to integrate a packet switched infrastructure with the circuit switched PSTN, e.g., by integrating an IP router into a switch. Hence, the forces are already driving the operators towards an IP-based core.

Figure 6 illustrates how this migration is taking place. In today’s PSTN, data is converted from digital to analog to digital form several times and transported through the infrastructure as though it were voice traffic (top part of figure). The first step in the migration is to capture the data streams in the access network and route them through an IP-based wide area network to their destination. This could be through a local router/switch for modem terminated traffic or via gateways directly to a corporate LAN. Interworking functions (IWFs) manage the necessary conversions between formats (footnote 1). The second step is to deploy voice over IP gateways in the local area, and use the WAN rather than the PSTN as the wide area transport. Bypassing the local exchange with direct access to the Internet is a major advantage for broadband access networks like cable modems.

While existing access networks for wireline voice, cellular, paging, etc., will not soon disappear, the notion of an integrating “IP Dialtone”--a single network for wireless, Internet, and voice access--is becoming a reality (e.g., Sprint’s Integrated On-demand Network, MCI/Worldcom’s On-Net, Qwest Communications, Level 3 Communications, etc.).

1.3. The Challenge: Services for Multi-Networks and Diverse End Devices

We assume the continuation of diverse access networks integrated around an IP-based core, with deployment of voice over IP. Our focus is on new services and their rapid deployment in the core. The challenge is to develop a service architecture that achieves seamless access across networks (PSTN and IP), devices (telephones, pagers, computers), types of communications media (speech, message, Web), and user-interfaces (hands-free, keyboard-free, keyboard and mouse).

How can I best reach you if you have a mobile phone and a wireless laptop computer? It should be easy for you to receive calls or access your e-mail whether you choose the phone or the laptop. Consider the mobility service available for telephones, via the cellular network, and for portable computers, via the Mobile IP protocols. These are network- and device-specific. There is no existing service that allows you to move easily between telephones and computers, continuing to access your calls and e-mails as you do so. We explore these questions as challenges for the architecture.

What is the “best” network to reach you (network transparency)? Individuals should be “connected” irrespective of the access network they currently choose or their preferences for how to be reached. The service architecture must enable the use of service- and user-specific policies for routing (and transcoding) streams between networks: policies that allow the services layer to determine the “best” network to use.
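One way to picture such a routing policy is as a scoring function over the networks on which a user is currently reachable. The sketch below is illustrative only: the `Network` fields, the weights, and the tariff and latency figures are assumptions made for this example, not part of the ICEBERG design.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Network:
    name: str
    reachable: bool      # is the callee currently attached to this network?
    cost_per_min: float  # hypothetical tariff, arbitrary units
    latency_ms: float    # hypothetical delay estimate

def best_network(networks, max_latency_ms, cost_weight=1.0, latency_weight=0.01):
    """Return the reachable network with the lowest weighted score, or None."""
    candidates = [n for n in networks
                  if n.reachable and n.latency_ms <= max_latency_ms]
    if not candidates:
        return None
    # A user- or service-specific policy is expressed here through the weights.
    return min(candidates,
               key=lambda n: cost_weight * n.cost_per_min
                             + latency_weight * n.latency_ms)

networks = [
    Network("cellular", True, 0.30, 120.0),
    Network("wireless-lan-voip", True, 0.02, 180.0),
    Network("pstn", False, 0.05, 40.0),   # user is away from the desk phone
]

default_choice = best_network(networks, max_latency_ms=200.0)
thrifty_choice = best_network(networks, max_latency_ms=200.0, cost_weight=10.0)
```

Changing only the weights changes the outcome, which is the point of keeping policy separate from the mechanism that carries the stream.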

If I send you an e-mail message, can you access it on your cell phone (device- and type-independence, device-transparency)? Type-independence means an individual can receive any communications medium, irrespective of type. An e-mail can be speech-synthesized for a user on a telephone, while an e-mail can be composed via speech recognition and dictation. Type-independence enables device-transparency: e-mail can be delivered to telephones and voice-streams can be delivered to electronic mailboxes. Device transparency and network transparency yield device independence: the user chooses to accept communications in any form and manner, irrespective of originating device, network, or media type. Likewise, they can access services in other networks, regardless of the format of that service’s data.

Can you easily move from listening to your e-mail on a cell phone to reading it on a laptop (service mobility)? Services, including those in use, should be accessible even as the user moves from one network to another. The ability to pass an active service across network boundaries is service handoff. Providing support for service handoff is challenging. It requires separating control (signaling, semantic information, and service metadata) and data information for each network, and propagating such information across each network boundary. For example, when a VoIP user using a laptop and a wireless LAN dials Enhanced 911 (E911; footnote 2), what location is passed to the emergency services operator?

1. The GSM cellular network deploys IWFs to eliminate the need for modems in end devices, since the GSM digital airlink can support data frames as well as digitized voice frames. Amazingly enough, the IWF converts data back to analog form for transmission over the PSTN.

Figure 6. Migration of the Network Towards an IP-Based Core (diagram labels: Access Network, Local Switch, Local Exch Net (LEC), Interexchange Network (IXC), PSTN, Local Switch IWF + Router, Local Gateway, IP-Based WAN, Core Network; Voice Traffic: Connection-Oriented; Data Traffic: Packet-Oriented)

Multi-modal Interfaces: Telephones are designed for voice, with a limited UI provided by the handset. Computers are designed for data (voice is a kind of data), and are difficult to use without a keyboard and pointing device. How can applications be built once, but used on such different devices? Support for access from any kind of end-device, irrespective of the communications media it accepts, requires pervasive support for strongly-typed device interfaces and data formats and any-to-any format conversion. The architecture must provide application-level tools for describing and constructing interface components, and system-level tools for automatically composing these. Given such tools, UI developers can implement transcoding/transformational operators along with multi-modal interfaces. The environment then dynamically composes operators and interfaces to serve new devices. This work has been examined initially in BARWAN [Brewer98], but not in the context of the huge heterogeneity spanning voice and data networks.

1.4. Technical Capabilities of the ICEBERG and Ninja Service Architectures

Our work is being developed within the context of two interrelated projects at Berkeley: ICEBERG (footnote 3) and Ninja (footnote 4). ICEBERG is developing architectures for combining voice and data services in third generation digital cellular networks. These are used by two types of entities: users (callers and callees) and agents acting on their behalf. Ninja provides the distributed run-time environment that supports service creation and manages its execution. We are constructing a testbed incorporating current and prototype access nets, spanning GSM cellular, wireless LAN (WaveLAN and Bluetooth), two-way pager (ReFLEX), and integrated by a high bandwidth IP-based core (gigabit Ethernet). This testbed provides a rich proving ground for experimenting with ubiquitous access to information, anywhere, anyplace, anytime, using any service and any I/O device.

ICEBERG and Ninja simplify the construction of services by providing plug-and-play wide-area components; automatic discovery, composition, and use of components; powerful data transformation operators; support for e-commerce; support for very diverse devices, sensors, and actuators; and the ability to combine components to provide ubiquitous support for access and mobility. We describe each in more detail:

Plug-and-Play Wide-Area Software Components: A distributed component architecture is essential for the kind of service model we envision. “Plug-and-play” implies that components are horizontally organized, rather than vertically and tightly integrated. This permits the easy replacement of one component with another, perhaps implementing an enhanced functionality or offering a lower cost solution. “Wide-Area” means the components are distributed across the network, and can execute on heterogeneous underlying systems embedded in the switching fabric. For example, computationally-intensive services, such as voice recognition, could be provided by third-party service providers each with different tradeoffs between quality and speed of recognition, with different charging models for their use.

Automatic Discovery, Composition, and Use: We envision an environment where alternative implementations of components exist. An essential ability is to automatically discover or locate these services or components capable of providing the desired functionality. If there are no available components providing the needed functionality, the environment automatically synthesizes it by composing more primitive components. For example, if a speech-to-email conversion component cannot be discovered, then the service environment dynamically constructs one by composing a speech-to-text component with a text-to-email component.

Powerful Operators: Clusters, Databases, and Agents: Components should be based on powerful operators. The run-time environment should exploit a cluster computer base for providing incrementally scalable computation, memory, and storage. It should leverage database technology for the persistent and efficiently organized storage of large amounts of information. It should exploit “agents:” major pieces of software functionality that can move through the network.

Viable Component Economics via Subscription, Pay Per Use: Enabling a component economy is essential to incentivize new services deployment. Users could be charged each time the service is used. Or they could subscribe, offering payment for a usage period. The service environment provides mechanisms for accounting, billing, and payment (just like the existing PSTN service infrastructure!).

Supports Diverse Devices, Sensors, Actuators: The environment must support a wide selection of access and display devices beyond traditional computer and telephone-based artifacts. These may include thin clients like wall displays, devices with speech- or gesture-based interfaces, environmental sensors for detecting heat, light, and motion, and controls like building HVAC. Even the thinnest of devices can leverage the infrastructure’s computational resources to deliver functionality that is beyond their local abilities.

Connects Everything via Ubiquitous Support for Access and Mobility: To achieve “big infrastructure, small clients,” all client devices must be connected, though the connection quality may vary. Thus, the environment must provide capabilities for continuous access and mobility. This includes: providing the necessary interworking functions and the tools for appropriately provisioning the IP transport network and its resources to ensure adequate performance for delay sensitive real-time streams (e.g., soft quality of service support, RSVP, and class-based queuing for bottleneck links).

2. An emergency services operator in an Enhanced 911 system receives information about the caller’s identity (the owner of the telephone line) and the location of the telephone line.

3. http://iceberg.cs.Berkeley.edu/

4. http://ninja.cs.Berkeley.edu/

Structure of this Paper

This paper presents an open services architecture for managing heterogeneity in the converged network. It is a middleware layer of applications building blocks that make it easier to develop new kinds of converged applications that combine elements of data and voice support. The rest of the paper is organized as follows. We develop the set of requirements for the Services architecture in Section 2. Section 3 presents our new Services architecture. The run-time environment is called Ninja while the set of services it supports tailored for computer-telephony integration is called ICEBERG. Section 4 illustrates how an application makes use of the underlying services. We present integration and performance issues in Section 5. Section 6 discusses related work. Finally, Section 7 provides a summary and future work.

2. A New Services Architecture

2.1. Horizontally Integrated Service Architecture for Multiple Networks

In today’s networks, the network, access devices, communications media types, and services are tightly bundled, e.g., the telephone handset is bundled to the voice service and the device’s phone number ties it to a particular wireline or cellular network. One of ICEBERG’s goals is to decouple these components. In doing so, the architecture enables the dynamic composition of devices, media types, and services to create new integrated services. As discussed in Section 1.3, the key challenges are how to create a communications environment that supports transparent and optimized access across multiple networks (network transparency), providing any-to-any format conversion for access to media types (type-independence) from any devices (device transparency and device-independence), while retaining access to services as users move across networks (service mobility).

Implementing user-driven transparency and independence requires a shift from a traditional vertically integrated service architecture to one that is horizontally organized. The architecture must enable interchangeable component services and pervasive access. Dr. George Heilmeier, Chairman Emeritus of Bellcore, is among the first to observe the need for a horizontal service organization [Heilmeier98]:

“Today, the telecommunications sector is beginning to reshape itself, from a vertically to a horizontally structured industry. ... [I]t used to be that new capabilities were driven primarily by the carriers. Now, they are beginning to be driven by the users. ... There’s a universe of people out there who have a much better idea than we do of what key applications are, so why not give those folks the opportunity to realize them. ... The smarts have to be buried in the ‘middleware’ of the network, but that is going to change as more capable user equipment is distributed throughout the network. When it does, the economics of this industry may also change.”

The transparency and independence attributes described in Section 1.3 are enabled by an IP core network, with gateways to other networks. This enhances flexibility and performance because media sources and destinations become software components communicating over high speed networks. The IP core provides a common routing and computational overlay structure to diverse networks that are otherwise closed to external modification control.

Another enabler is the shift in the locus of control from centralized service providers to distributed third-party service providers and end-users. This shift means providing external access to the control flow of information and to computational resources that are located in the core of the network.

2.1.1 Computation in the Infrastructure

The service architecture must support scalable, highly available and customizable services. Exploiting processing in the infrastructure makes it possible to give access to very powerful services, like speech recognition, to even the simplest “thin client” PDA. Thus, processing in the infrastructure is a key ingredient in delivering functionality to thin clients, including encapsulating legacy servers that have been designed for more powerful access devices. Infrastructure processing must also be highly available, reliable, and incrementally scalable. In the San Francisco Bay region, it is not outlandish to think in terms of hundreds of thousands of users simultaneously using the infrastructure. Using an architecture based upon distributed processing greatly simplifies the task of providing high availability and reliability.

2.1.2 Interchangeable Component Services

A key application will be enabling the dynamic creation and composition of software agents in the infrastructure. These will include agents for mobility, redirection, translation, and other forms of automatic adaptation of the network to end-device characteristics and network connectivity. The network uses agents to support emerging information appliances that are neither handsets nor personal computers but combine elements of both in a portable/handheld package.

Dynamic composition is enabled by strongly-typed interfaces for individual service components, data types, and end-devices. With such interfaces, automatically composing a transformation path from a source device or data type to a destination device or data type becomes a simple matter of searching for the shortest path in a graph of operators.
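The shortest-path search over operators can be sketched with a breadth-first search in which nodes are data types and edges are operators. The operator catalogue below is hypothetical and exists only to make the example runnable; the paper does not specify how operators are registered or which search algorithm is used.

```python
from collections import deque

# Hypothetical operator catalogue: (input type, output type, operator name).
OPERATORS = [
    ("speech", "text", "speech_recognizer"),
    ("text", "email", "email_composer"),
    ("text", "sms", "sms_formatter"),
    ("email", "text", "email_summarizer"),
    ("text", "speech", "speech_synthesizer"),
]

def shortest_operator_path(operators, source, target):
    """Breadth-first search over data types.

    Returns the shortest list of operator names that converts `source`
    into `target`, an empty list if they are the same type, or None if
    no chain of operators exists.
    """
    edges = {}
    for src, dst, name in operators:
        edges.setdefault(src, []).append((dst, name))

    queue = deque([(source, [])])
    visited = {source}
    while queue:
        current, path = queue.popleft()
        if current == target:
            return path
        for next_type, name in edges.get(current, []):
            if next_type not in visited:
                visited.add(next_type)
                queue.append((next_type, path + [name]))
    return None

speech_to_email = shortest_operator_path(OPERATORS, "speech", "email")
```

Breadth-first search minimizes the number of operators in the chain. A deployment that cared about per-operator cost or quality would need a weighted search instead, which is outside what this sketch shows.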

2.1.3 Pervasive Access

Access networks are not likely to disappear soon, nor are they likely to become homogeneous. They will continue to differ in coverage, bandwidth, latency, and cost. New access technologies are continuously being brought to market, such as third generation cellular systems (UMTS/IMT2000), next generation wireless LANs (Bluetooth, HomeRF), and broadband wired access networks (xDSL, cable modems). To realize pervasive access, the infrastructure must integrate these via a common IP overlay. The universal overlay was the PSTN: dialing a phone number caused a page to be sent to a pager. Today’s alphanumeric pagers make it compelling to be able to send e-mail to the pager. The data network provides the requisite common overlay, with gateways to the other networks. With the improved ability to handle voice in IP networks, coupled with H.323 gateways, it is conceivable that the PSTN will be replaced as the overlay.

2.1.4 Software Agents

The infrastructure supports software agents. These include redirection agents, intercepting real-time streams to direct them to a new destination. This is for terminal mobility, as well as type-specific redirection based on user policy (type- and device-independence) and forms of service mobility.

Another class of agents performs on-the-fly transformation of data streams. We call these transduction agents. An example is an agent that converts a voice stream to text by invoking speech recognition. Agents can be cascaded. Combining transduction with redirection, for example, makes it possible to redirect a voice message destined for a PSTN end point into a text message delivered to a pager.
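Cascading can be pictured as function composition: each agent takes a message and returns a transformed copy. The sketch below assumes a message is a plain dictionary and stubs out speech recognition with a pre-filled `transcript` field; these names and the addresses are hypothetical, not an ICEBERG interface.

```python
from functools import reduce

def cascade(*agents):
    """Compose agents left to right; each maps a message dict to a new dict."""
    return lambda message: reduce(lambda msg, agent: agent(msg), agents, message)

def transduce_voice_to_text(message):
    # Stand-in for a speech recognition call. The 'transcript' field is a
    # hypothetical placeholder for what a real recognizer would produce.
    return {**message, "media": "text", "body": message["transcript"]}

def redirect_to(network, address):
    """Build a redirection agent that rewrites the delivery target."""
    def agent(message):
        return {**message, "network": network, "destination": address}
    return agent

voice_to_pager = cascade(transduce_voice_to_text,
                         redirect_to("pager", "pager:alice"))

incoming = {
    "media": "voice",
    "network": "pstn",
    "destination": "tel:+1-555-0100",
    "transcript": "Meeting moved to 3pm",
    "body": b"<audio>",
}
delivered = voice_to_pager(incoming)
```

Because each agent returns a new dictionary, the original message is left untouched, so the same stream could be fed to a second cascade (for example, one that also archives the voice message).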

2.1.5 Examples of Services

This section provides examples of the types of multi-network services that the ICEBERG architecture supports.

Cross-network Access to Emergency 911 (E911) Services

E911 services provide the caller’s telephone number and location to emergency personnel. In the PSTN, identification and location information are determined by the user’s “line,” a customer information database that maps lines to users, and a geographic database that maps lines to locations. This is important when a user becomes incapacitated after contacting an E911 service. The information is also useful in dealing with false calls. At a minimum, cross-network access to E911 must include identification and location information.

Consider a user who presses a button on her 2-way pager to send an emergency message to an E911 operator. It is trivial to convert the paging message from a text message into a speech message that can be played for an operator. Providing the location information is not so simple.

Using triangulation, the paging network could determine the user’s location. However, there is no mechanism for passing this service-specific information through a paging gateway to the operator located in a telephony network.

A similar problem occurs if the emergency request is from a user using a VoIP gateway. The appropriate location is not the one associated with the site of the VoIP gateway itself. For a fixed machine, determining the location requires a geographic database that maps callers’ IP addresses to locations. For mobile users, the location of a nearby wireless basestation may be sufficient. However, for dialup users, it would be necessary to use caller identification and propagate the location based upon the dialup user’s telephone number.

For each situation, the architecture must support the propagation of information across each network interface and, in some cases, across more than one cascaded network.

The solution here is to leverage local network support for locating callers and to provide an architecture that supports the passing of service-specific metadata between networks.
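The three VoIP cases above (fixed machine, mobile user, dialup user) can be sketched as a lookup that depends on the caller's access type, with the result attached to the call as metadata for the next network to forward. The tables, field names, and addresses below are hypothetical placeholders; a real deployment would query each network's own location service.

```python
# Hypothetical lookup tables standing in for per-network location services.
IP_LOCATIONS = {"192.0.2.10": "Building A, Room 101"}
BASE_STATION_LOCATIONS = {"bs-17": "near base station 17"}
LINE_LOCATIONS = {"+1-510-555-0100": "12 Example Street"}

def resolve_location(call):
    """Return (location, method) for an emergency call; location may be None."""
    access = call.get("access")
    if access == "fixed-ip":
        # Fixed machine: geographic database keyed by IP address.
        return IP_LOCATIONS.get(call.get("ip")), "ip-address-database"
    if access == "mobile":
        # Mobile user: the nearby wireless basestation may be sufficient.
        return (BASE_STATION_LOCATIONS.get(call.get("base_station")),
                "nearby-base-station")
    if access == "dialup":
        # Dialup user: caller identification mapped to the telephone line.
        return LINE_LOCATIONS.get(call.get("caller_id")), "caller-id-line-database"
    return None, "unknown"

def attach_location_metadata(call):
    """Copy the call and add location metadata to pass across the gateway."""
    location, method = resolve_location(call)
    return {**call, "metadata": {"location": location, "location_method": method}}

forwarded = attach_location_metadata(
    {"access": "dialup", "caller_id": "+1-510-555-0100"}
)
```

Recording the method alongside the location lets the receiving operator judge how precise the information is likely to be.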

Multi-modal Information Access and Smart Spaces Control

These allow any-to-any (multi-modal) access to information services and resources and smart spaces through a variety of end devices. Users can use any device and combination of cascaded networks to reach interfaces designed for the particular device/network combination. A number of issues arise when networks are cascaded.

For example, an Interactive Voice Response (IVR) service may be located within the IP network or in the PSTN network. The user may have a preference for how to reach the IVR service, and the best network (or mode) must be selected to meet the service’s voice quality requirements.

Using a VoIP service in the wide-area is an issue. VoIP may introduce unacceptable jitter into touch tones or speech, making tone detection or speech conversion difficult. A decision must be made for the best route to an IVR service, or where to perform touch tone detection or speech recognition. For example, in the GSM network, touch tones are transported as separate control signals. They are not sent through the lossy voice codec, which would render them useless.

Routing and action location decisions are service- and user-specific, and differ among environments. Acceptable results are achieved by choosing a particular network, or performing an action close to the source. The architecture must provide the means to manage the routing of control and data within cascaded networks.

3. ICEBERG Architecture

ICEBERG (footnote 5) provides a service architecture for integrating voice and data services in many, diverse, interconnected networks: third-generation digital cellular (GSM, CDMA, and UMTS / IMT2000), IP, PSTN, wireless IP, and 2-way pager networks [Goodman97, Kuruppillai97]. These cover a range of transports, input/output interfaces, and user interfaces, yielding many challenges in managing heterogeneity and providing transparency across networks, devices, communications media, and services. ICEBERG is implemented on Ninja, an infrastructure for building reliable, incrementally scalable, highly-available, persistent distributed services on distributed clusters of commodity processors.

5. ICEBERG stands for “Internet-Core nEtwork BEyond the Third Generation.” This captures ICEBERG’s technical approach, which builds a multinetwork capability around an IP-based overlay or core network.

3.1. New Service Architecture

Potentially Any Network Services (PANS)

To illustrate PANS, consider the following. While talking on your mobile phone, you enter your office. Hitting a key sequence on your handset, you redirect the in-progress call to your desktop PSTN telephone (to save money or improve the voice quality). Or perhaps you redirect your call to your LAN’s VoIP gateway, and continue your call while sitting at your computer, checking some account data while speaking with the other party. Service mobility supports movement between networks while maintaining the same service.

Making service mobility a first-class entity means that routing and control information must be passed across network boundaries. Within a single network, information flow can easily be controlled. But complications arise when spanning networks. The challenge is to dynamically change and optimize routing. This is accomplished via a control path associated with the data path. The infrastructure monitors the performance of operators along the path, moves operators when possible, and controls the overall flow of information. Data path changes may be made by the infrastructure, the services along the path, or agents acting on a user’s behalf.

ICEBERG Access Points (IAP) interwork among networks, transcoding the format/signaling used by the control path into compatible protocols for the underlying networks. For example, an IAP between IP and GSM networks adapts the SS7 signaling protocol used by GSM to the MBone routing/session control protocols used by the IP network.

Classes of Services

Cascaded services described above can be generalized as:

• Implementation of an existing service on a new network

Services offering similar functions in different networks are mapped. For example, all networks have directory services: 411 (PSTN), LDAP and DNS (IP), HLR/VLR (digital cellular), and HA/FA (mobile IP [Perkins97]). These would be used as a part of a universal name to network-specific name resolution process.

Other example services are voicemail (PSTN), E-mail (IP), and pages (pager networks and cellular’s Short Message Service). These offer similar messaging capabilities in different networks. Interoperability requires converting to the appropriate format (e.g., performing speech-to-text to map voicemail to pages or e-mail; or performing text-to-speech to convert pages or e-mail to voicemail).

To implement one of these services, the service provider supplies a data IWF, either as transcoders between each pair of data formats or through a single universal format. When there are few formats, implementing all conversions is a better choice. This allows the provider to better map the special features of the data format of one network onto another.

Another task is the control IWF. This functionality is critical because it provides the underlying mechanism for content/operator negotiation and service mobility.

• Mapping an existing service in one network into another

These services exist in multiple networks and need to be mapped or instantiated in a new network using existing services. Consider billing services in the PSTN and digital cellular networks. In the former, billing is on a “line” basis, while in the latter, it is based upon a Subscriber Identity Module (SIM) card. The challenge is to map these services onto IP networks where there are multiple authentication mechanisms (e.g., PGP, X.509, Kerberos, etc.).

Another multinetwork service is E911 (PSTN and digital cellular networks). E911 calls have priority over other traffic. That same priority must be conveyed across gateways and interfaces between networks (e.g., other calls or traffic must be dropped if necessary). Semantic information like the identity and location of the caller must also be conveyed to emergency services operators.

As with the previous class of service, the tasks here are to provide data and control IWFs to handle the transcoding of data and control information.

• New services

Cross-network access enables new services, like services for concierge or assistance, location-dependency, best-mode information routing, and information push. Consider one that allows a user to dial *HOTEL on a phone to have a list of local hotels (customized to the user’s requirements) and rates read back, with the option of being connected to a reservation desk.

Services may be agent-based (e.g., the infrastructure monitoring your location, perhaps by intercepting location update messages in a cellular network, alerting you of changes in traffic conditions for your area). Other agent-based services include: customized news headline clipping, stock market symbol tracking and alerting, etc.

A primary task is the implementation of data and control IWFs, coupled with access network-specific information in the development of these services.
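The choice noted under the first class of service, between transcoders for every pair of formats and a single universal format, can be made concrete by counting converters. The sketch below is illustrative only: it assumes one-directional transcoders and a universal format distinct from the native formats, and the function names are hypothetical.

```python
def pairwise_transcoders(n_formats: int) -> int:
    """Converters needed when every format is mapped directly onto every other."""
    return n_formats * (n_formats - 1)


def universal_format_transcoders(n_formats: int) -> int:
    """Converters needed when each format maps to and from one universal format."""
    return 2 * n_formats


# Compare the two strategies as the number of network data formats grows.
for n in (2, 3, 5, 10):
    print(n, pairwise_transcoders(n), universal_format_transcoders(n))
```

Under these assumptions the two strategies cost the same at three formats; direct conversion is cheaper below that and grows quadratically above it, which is consistent with preferring full pairwise conversion only when formats are few.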

3.2. Problems with Services on Multiple Cascaded Networks

Cascaded networks introduce several problems that complicate service handoff:

Multiple Endpoints in Multiple Networks

The complexity of cascaded networks is that users have multiple end devices, some of which have multiple network interfaces (e.g., a PDA with cellular, pager, and wired network interfaces). The choice of device or interface may be a function of: connectivity (i.e., users and devices are mobile, so connectivity and reachability change dynamically), quality of service, cost, etc. The important consideration is that the user should be able to specify their desired choices for routing (see Section 4.3). Mobility in cascaded networks takes the form of terminal mobility and user/personal mobility.

• Terminal mobility may be intra- and inter-network. Intra-network mobility is movement within a network (e.g., handoff between wireless IP basestations or handover between cellular basestations). The underlying network controls this movement type. It will be internally or externally visible in some networks, while in others it is hidden. Control over handoff may reside with the mobile device (e.g., mobile IP) or with the network (e.g., GSM). Inter-network movement, e.g., handoff between a cellular interface and a wireless IP interface, requires coordination between the two (more details below).

• User/personal mobility involves users switching among multiple devices. Users switch for many reasons (e.g., location, cost, quality of service, etc.). Some movements will be within a network, like moving from one workstation to another or from one office phone to another, while others will be cross-network, like switching between a wireless IP phone and an office phone.

ICEBERG uses indirection to hide both terminal and user/personal mobility. The architecture associates a globally unique, non-domain-specific Universal Name with users and agents. Universal names with an inter-domain naming protocol provide the reference mechanisms (see Section 3.4.1).

Service Handoff Across Cascaded Networks

Service handoff requires passing metadata across network boundaries. Such metadata is service- or network-specific and includes, e.g., the caller’s identity, authentication rights, and billing or cost information. It may include the target service in use, since this affects transcoder choice and placement, the requested QoS, or the chosen network path.

The key to service handoff is to separate control (metadata) from data, and allow each to follow a separate logical path between the source and destinations. The physical paths may be the same or physically distinct, as in the long-distance PSTN network, where call data is sent over a separate network from call set-up and control information (SS7).

The metadata may be passed across the same network interface as the data, or it may be passed to the new network through an alternate route. Once the metadata has been received by the new network, it is used to negotiate and optimize a path through that network, to set up the appropriate routing and transcoding, and to specify billing, authentication, and any other service- or network-specific information.

Service Transformations

End-to-end cross-domain services raise the question of where to perform data conversions, which may be computationally intensive. Gateway choice and transcoder placement will affect the performance of later operations. Alternatively, computation may take place at servers within the network.

With cascaded services, the service may need to negotiate with the underlying network for resources (e.g., computational, call admission, QoS, etc.). It is important that services control where and how conversions occur. Similarly, it is important that services be given control when crossing a network boundary. Here the services negotiate service- and network-specific attributes (e.g., negotiating with the network’s call admission policy or negotiating for a specific QoS).

ICEBERG supports both models, including services that span gateways and the infrastructure’s computational resources, while passing metadata between operators.

Example: Cross-system service handoff

Consider a user with a cordless handset capable of wireless PSTN and wireless IP. As the user walks out of range of the PSTN cradle, the handset negotiates with the VoIP network to admit the call and for the appropriate bandwidth and QoS. A PSTN core network component is contacted to bridge/move the call to the appropriate PSTN-to-VoIP gateway, which then relays the call to the handset’s wireless IP interface. When the user leaves PSTN coverage, the link is dropped and the call seamlessly continues on the wireless IP network. When the user re-enters PSTN coverage, the process is reversed and the call is restored to the handset’s PSTN interface. These actions require that metadata be passed to the appropriate gateways and resources in each network.

3.3. Ninja Distributed Component System

ICEBERG relies upon the Ninja distributed component system to manage computational tasks and resources.

3.3.1 Ninja’s Goals and Objectives

Ninja is a software infrastructure supporting the next generation of Internet-based applications. Central to its approach is the concept of a service, an Internet-accessible application or set of applications that is scalable (supports thousands of concurrent users), fault-tolerant (masks faults in the server or network), and highly-available (resilient to outages). Examples of constructed Ninja services include: a stock trading application using any end-device; a Jukebox, providing real-time streaming audio data from music CDs scattered around the network; and Keiretsu, an ICQ-like, multi-modal, instant messaging service for one-way and two-way pagers.

Ninja services are enabled by a new architecture for service creation and wide-area service deployment that supports the development and evolution of powerful, flexible services across a huge range of computational and networking scale. This is described in Section 4.1.2.

Ninja’s service requirements are like those for ICEBERG. Services must be composable (automatically aggregating multiple services into a single entity), customizable (users injecting code to customize a service’s behavior), and widely accessible (accessing the service from a wide range of devices).

Service composability is the basis for automatic path creation, which supports device-independence, device-transparency, type-independence, and wide accessibility in ICEBERG. Reformatting and translation operations are composed with the conventional application’s processing step to achieve independence and transparency. Similarly, customizability provides the leverage necessary for end-user control.

ICEBERG constructs the components for converged computer and telephony applications, such as those managing cascaded network gateways, speech-enabled applications, any-to-any message format translation, support for mobility and redirection, etc. From ICEBERG’s perspective, Ninja is its service programming environment. The ICEBERG Universal In-Box is an example of an ICEBERG service implemented on Ninja (see Section 4.2).

3.3.2 Ninja Service Model

Ninja provides the execution environment to automatically generate and compose services from strongly typed components distributed across the wide-area. Its key elements are:

• A structured architecture with a careful partitioning of state among Bases, Active Proxies, and Units;

• Wide-area paths, described as operators and connectors, and interconnecting strongly-typed components;

• An execution environment with efficient communications primitives, based on active messages (i.e., moving executable code to the message destination where it can execute on the remote data).

A wide-area service operation consists of a Path through bases, active proxies, and units, with each hop potentially being a separate service (involving resource discovery, transformation, or information gathering).

Ninja provides a clean solution to state management by maintaining persistent state on highly available bases, and soft-state on Active Proxies. It provides automatic resource discovery using a query language for service location, initiation, and automatic creation of wide-area paths.

Wide-area paths are a new concept providing a framework for authentication, resource allocation, privacy, feedback, and dynamic extension and optimization. A path is a sequence of Connectors spanning the network and Operators residing on an architectural component. Paths are realized by a data channel and control channel pair (both described in a scripting language) and are strongly typed, so static semantic analysis is used to ensure that component compositions are meaningful. We describe these next.

Structured Architecture

Units are simple, low cost, usually mobile, network-connected access devices with little computational or storage ability. Connectivity is likely of low quality: low bandwidth, high error rate, etc. Units include sensors, actuators, PDAs, pagers, smartphones, communicators, and palmtop, laptop, and personal computers. Units are heterogeneous in their display, processing, and remote programming capabilities (Figure 7).

Active Proxies are connection points for Units supporting bootstrapping, resource discovery, agents and transformation. Their state is not persistent, though soft-state and caches can improve local performance. Active Proxies might be located on the users’ premises, in wiring cabinets, basements, or car trunks. They have good network connectivity, and can execute localization services (e.g., specialized solutions for improving reliable delivery of information, on-the-fly data reformatting and translation, etc.). A wireless LAN basestation with local computation is an active proxy.

Bases are large, highly available computation and storage centers, providing safe and persistent state. A base might be a cluster of workstations in a machine room managed by an infrastructure provider. It is the place to run certain services that demand high scalability or are processing intensive. A base is where a subscriber account database should reside. Users also have home bases, the locations within the network infrastructure where their preferences and persistent state (e.g., mail archives) are maintained.

Figure 7. Ninja Physical Elements: Units, Active Proxies, Bases

Operator, Connector, and Path Model

Operators are software components that perform operations like transformation and aggregation. All of Microsoft’s COM objects, implementing desktop applications like word processors, spreadsheets, etc., can be operators. An operator can be a simple agent. A more complex agent is constructed by composing multiple operators using connectors.

Connectors are abstract “wires” that logically interconnect operator outputs to inputs. Connectors can have varying semantics. Connectors can be unicast, multicast, or anycast.

Outputs must match the inputs to which they are connected. This is accomplished by interfaces, which are strongly typed, language independent, and mapped to an operator’s access methods. Strongly typed interfaces make it possible for the infrastructure to dynamically compose operators (e.g., composing speech-to-text and text-to-email operators to create a speech-to-email operator).
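The speech-to-email composition above can be illustrated with a small sketch in which a connector is allowed only when the upstream output type equals the downstream input type. This is not Ninja’s interface mechanism: the `Operator` class, the string type tags, and the stand-in transformation functions are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Operator:
    """A hypothetical typed operator: a named function with declared input/output types."""
    name: str
    in_type: str
    out_type: str
    fn: Callable[[object], object]

    def __call__(self, value):
        return self.fn(value)


def compose(first: Operator, second: Operator) -> Operator:
    """Connect first's output to second's input, rejecting mismatched types."""
    if first.out_type != second.in_type:
        raise TypeError(
            f"cannot connect {first.name} ({first.out_type}) "
            f"to {second.name} ({second.in_type})"
        )
    return Operator(
        name=f"{first.name}+{second.name}",
        in_type=first.in_type,
        out_type=second.out_type,
        fn=lambda value: second(first(value)),
    )


# Stand-in operators; real ones would wrap a recognizer and a mail formatter.
speech_to_text = Operator("speech-to-text", "speech", "text",
                          lambda audio: audio["transcript"])
text_to_email = Operator("text-to-email", "text", "email",
                         lambda text: {"body": text})

speech_to_email = compose(speech_to_text, text_to_email)
message = speech_to_email({"transcript": "lights on"})
```

Because the check happens when operators are connected rather than when data flows, a meaningless composition such as text-to-email followed by speech-to-text is rejected before any message is processed.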

iSpace Execution Environment

Ninja’s run-time is based on Java, but it can support operators in other languages. The environment on a single node (e.g., an Active Proxy) is called iSpace. iSpace allows a service developer to concentrate on service functionality without having to worry about the availability, persistence, reliability, or scalability aspects of the service.

The parallel version of iSpace, used on Bases, is multiSpace, a parallel application development framework for intra-Base execution that has evolved from the run-time environment developed for the Berkeley Network of Workstations (NOW) Project (Figure 8). iSpace and multiSpace are based on Ninja Remote Method Invocation, a customizable service Virtual Machine called iS-Box, and a Redirector to balance execution threads among the multiple processors of a Base. iS-Box is the Java Virtual Machine (JVM) extended with a Security Manager and Trusted Services. The latter provide a safe “sandbox” permitting the downloading and execution of foreign code extending the JVM functionality.

3.4. Design of the ICEBERG Architecture

The key concepts in ICEBERG are entities and services. ICEBERG provides services between entities located in the same or different networks. Entities are users, callers/callees, or agents acting on their behalf. Associated with each is a unique, non-domain-specific Universal Name that provides a mechanism for entities to refer to others in an abstract way. Services are the Ninja concept of service as an application or set of applications. ICEBERG’s services are focused on those needed to integrate computer and telephony applications, and to make it easier to deploy new services.

3.4.1 ICEBERG Service Architecture

Finding Services: The Service Discovery Service

A key component of ICEBERG is the Service Discovery Service (SDS). SDS acts as a repository for information about services in the system, and provides to clients directory-style access to this information. The SDS maintains descriptions of services that are available for allocation at Active Proxies or Bases via path instantiation (unpinned services) and services that are already running.

The SDS supports both push- and pull-based access, allowing proactive announcement of the existence of a service as well as queries against cached announcements. The former enables passive discovery of new capabilities (e.g., to discover a local smart space) and simplifies state management. The latter allows clients to explicitly query for services while ignoring any dynamic SDS announcements.

The query model enables clients to search for specific services necessary to complete a path (for automatic path creation), or to allow a user to manually determine path endpoints by browsing available services (for example, to discover a stock trading service or an information dissemination source [Wong98]).

Service descriptions and queries are specified in XML. The query language gives agents the ability to perform searches based on varying criteria, allows the set of attributes to evolve as services evolve while still maintaining backward compatibility, and provides a clean way to mix data and metadata. The latter is especially important, because one service’s metadata is another’s data.
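To make the idea of XML descriptions and queries concrete, the sketch below treats a query as a partial XML tree and matches it against a description by requiring every queried element to appear with the same nesting and text. The element names, the sample description, and the subset-matching rule are assumptions for illustration; the paper does not give the SDS schema or query semantics.

```python
import xml.etree.ElementTree as ET

# Hypothetical service description: a hierarchy of attribute-value pairs.
DESCRIPTION = """
<service name="stock-quotes">
  <location><building>Soda Hall</building></location>
  <interface><input>text</input><output>speech</output></interface>
</service>
"""

# Hypothetical query: only the attributes the client cares about.
QUERY = """
<query>
  <interface><output>speech</output></interface>
</query>
"""


def matches(query: ET.Element, description: ET.Element) -> bool:
    """True if every element under the query also appears under the description."""
    for wanted in query:
        candidates = description.findall(wanted.tag)
        if len(wanted) == 0:
            # Leaf element: compare text values.
            text = (wanted.text or "").strip()
            if not any((c.text or "").strip() == text for c in candidates):
                return False
        elif not any(matches(wanted, c) for c in candidates):
            # Interior element: some candidate must satisfy the whole subtree.
            return False
    return True


description = ET.fromstring(DESCRIPTION.strip())
found = matches(ET.fromstring(QUERY.strip()), description)
```

A subset match of this kind tolerates descriptions that gain new elements over time, which is one way the backward compatibility mentioned above could be obtained.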

As an example of a globally-distributed, wide-area service, the SDS exhibits design challenges beyond those of services living inside a single Base. It must handle network partitions, bandwidth limitations between remote SDS entities, “localization” (i.e., differentiate between services that are “local” to a client and those that are not), and provide application-level query routing between components.

The search problem is more difficult here than in systems like DNS or CORBA’s Globe, because the SDS accepts queries as a complex set of hierarchical attribute-value pairs, rather than in a form where the resolution path is embedded.

Unlike techniques designed only to work in the local-area, such as the IETF Service Location Protocol, the SDS must address scaling discovery to the wide-area. The SDS components are arranged in an adaptive hierarchy managed by the service itself. The hierarchy can be based on service scope, the underlying topology, manually configured administrative domains, or a combination of these.

Figure 8. Ninja Architecture and Run-Time Environment. The elements of the run-time environment are the Service API (RMI), SDS, JAVA RMI, and Processor Resource. (Diagram labels: Service request, service threads, Operators, Caches, Managed RMI++, Physical processor, operator upload, Persistent Storage, JVM, iS-Loader, Trusted-Services, Security MGR, New service.)

For fault-tolerance, the SDS uses soft-state and relies on multicast to propagate service descriptions, like the approach used by the MBone’s Session Announcement Protocol. State is rebuilt by listening for the service-existence multicasts. This approach also enables clients, such as service announcers and client agents, to adjust to changes in the underlying hierarchy.
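The soft-state idea can be sketched as a table whose entries survive only while announcements keep arriving. The class below is a minimal illustration, not the SDS implementation: the class name, the fixed time-to-live, and passing the clock value explicitly (to keep the example deterministic) are all assumptions.

```python
class SoftStateTable:
    """Service entries that expire unless refreshed by a new announcement."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._entries = {}  # service name -> (description, expiry time)

    def announce(self, service: str, description: str, now: float) -> None:
        """Record or refresh an entry, as if a multicast announcement arrived."""
        self._entries[service] = (description, now + self.ttl)

    def live(self, now: float) -> dict:
        """Drop expired entries and return the descriptions still considered alive."""
        self._entries = {s: (d, t) for s, (d, t) in self._entries.items() if t > now}
        return {s: d for s, (d, _) in self._entries.items()}


table = SoftStateTable(ttl_seconds=30.0)
table.announce("room-control", "A/V control", now=0.0)
table.announce("stock-quotes", "quote service", now=20.0)
still_alive = table.live(now=35.0)  # room-control was not re-announced in time
```

A restarted component needs no recovery protocol under this scheme: it starts with an empty table and repopulates it from the announcements it hears.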

For security, the SDS controls which agents can discover services, allowing capability-based access limitations, i.e., hiding the existence of services rather than (or in addition to) disallowing access to a found service.

Service Transformation

IAPs are gateways that provide the interconnection between networks. An IAP may be as simple as an H.323 gateway or it may be more complicated, e.g., providing metadata as well as data transport. IAPs perform the link layer and bit transformations, and perform service-level transcoding. In most cases, these operations take place on Active Proxies, while computationally intensive operations execute on Bases.

The IAP is the point where cross-domain (network-specific) signaling information or metadata is collected from an incoming network and from the outgoing network. Such information includes authentication and identification information (e.g., a PSTN caller’s telephone number).

Cross-domain Name Resolution

Service handoff requires cross-domain name resolution. This allows users to provide a single globally unique name (Universal Name) which is dynamically resolved in any network to a domain-specific name using entity- or service-specific policies. This is handled by servers using the ICEBERG Inter-Domain Naming Protocol (IDNP).

IDNP servers map Universal Names to domain-specific names using: an entity profile that specifies the entity’s domain-specific names, system state (for reachability information about an entity), and the entity’s policy for mapping Universal Names to domain-specific names (Figure 9).

The entity profile is a list of domain-specific names for the entity’s end devices. It changes on the order of weeks to months, whenever the entity acquires a new device.

System state or reachability information is associated with domain-specific names. For example, a PCS phone entry includes network cost and reachability. For a wireless LAN, it includes registration information and available QoS and bandwidth. System state varies from minutes to hours, depending upon the entity’s activities.

The entity’s policy is a difference between ICEBERG and the PSTN. Entities can provide code to evaluate variables to make dynamic routing decisions at call setup or when system state changes. The policy varies on the scale of days to weeks. Typical variables that are used include the caller’s network, the cost for using each of the available end devices, the available QoS or bandwidth, the caller, interactive vs. non-interactive service, and other entity- or service-specific information.

Entity-specific policies provide dynamic control over how an IDNP server maps Universal Names to domain-specific names. Figure 10 provides an example of such a mapping.
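The three inputs to name resolution (profile, system state, and policy) can be sketched as follows. This is an illustrative model rather than the IDNP itself: the class and function names, the cost values, and the "cheapest reachable device" policy are assumptions, while the profile entries are taken from Figure 10.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class DeviceState:
    reachable: bool
    cost: float  # relative cost of using this device (hypothetical units)


Profile = Dict[str, str]               # label -> domain-specific name
SystemState = Dict[str, DeviceState]   # label -> current reachability and cost
Policy = Callable[[dict, Profile, SystemState], Optional[str]]


def cheapest_reachable(call: dict, profile: Profile, state: SystemState) -> Optional[str]:
    """Example entity policy: pick the cheapest device that is currently reachable."""
    candidates = [label for label in profile
                  if label in state and state[label].reachable]
    if not candidates:
        return None
    return min(candidates, key=lambda label: state[label].cost)


@dataclass
class IdnpServer:
    profiles: Dict[str, Profile]
    states: Dict[str, SystemState]
    policies: Dict[str, Policy]

    def resolve(self, universal_name: str, call: dict) -> Optional[str]:
        """Map a Universal Name to one domain-specific name, or None if unreachable."""
        profile = self.profiles[universal_name]
        label = self.policies[universal_name](call, profile, self.states[universal_name])
        return None if label is None else profile[label]


server = IdnpServer(
    profiles={"Randy@Berkeley": {
        "OfficePSTN (Teaching)": "510-642-8778",
        "DeskIP": "dreadnaught.cs.berkeley.edu:555",
        "PCS": "510-555-8778",
    }},
    states={"Randy@Berkeley": {
        "OfficePSTN (Teaching)": DeviceState(reachable=True, cost=1.0),
        "DeskIP": DeviceState(reachable=False, cost=0.0),
        "PCS": DeviceState(reachable=True, cost=5.0),
    }},
    policies={"Randy@Berkeley": cheapest_reachable},
)
name = server.resolve("Randy@Berkeley", {"caller_network": "PSTN", "interactive": True})
```

Keeping the policy as a function supplied by the entity mirrors the statement that entities provide code to make routing decisions; the server only supplies the profile and the current system state.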

Privacy is important. The exposure of domain-specific names is under entity control. Its policy can specify that a domain-specific name should be returned to a caller or it can specify that the name should be hidden, in which case, the IDNP server informs the IAP that it should forward the call or message to the domain-specific name. The IDNP executes on servers located in each network using portals into the network’s IDNP servers. For example, a special telephone number would be used in PSTN and digital cellular networks.

The rate of change depends on the type of information. Entity profiles change on the slowest time scale, followed by entity policies. System state or domain-specific reachability may change rapidly over the course of a few minutes or hours, as a function of user and terminal mobility.

Figure 9. ICEBERG Access Points. IAPs execute on Active Proxies or Bases. They provide inter-network access and policy enforcement. (Diagram labels: Call(Randy@Berkeley, Caller’s network, Interactive, CallerID certificate); IDNP Server; Profile (weeks/months), Policy (days/weeks), System State (minutes/hours); IAP: Iceberg Access Point (one per network), Policy Engine, Routing, Security; Iceberg Domain Name Policy Servers, stored in Bases; if IAPs can’t be embedded in networks, then resides in IP core.)

Figure 10. Multiple Identities for an Individual. An Entity has a universal name and a profile; Entities are people or processes. Universal Names are globally unique IDs (“Randy@Berkeley”). The profile is a set of domain-specific names:
OfficePSTN (Teaching): 510-642-8778
OfficePSTN (Chair): 510-642-0253
DeskIP: dreadnaught.cs.berkeley.edu:555
LaptopIP: polo.cs.berkeley.edu:555
PCS: 510-555-8778
Cellular: 510-555-1998
E-mail: [email protected]: 415-555-5555

IDNP servers cache information elements so that they can reduce decision making latency. The information may be replicated and cached at multiple IDNP servers, and changes need to be propagated in a timely and secure manner. Wherever possible, duplication of effort by IDNP is avoided. In particular, name resolution leverages each network’s existing infrastructure. This means, for example, that the profile should contain a device’s mobile IP home address rather than its current wireless IP address.

4. Testbed and Prototype Service-Enabled Applications

4.1. Campus-Wide Testbed

ICEBERG is deploying a testbed consisting of GSM digital cellular, wired and wireless IP, and PSTN networks to experiment with services across cascaded networks and new kinds of services like smart spaces. The ICEBERG architecture and toolkit allow developers to build services across networks more rapidly. The ICEBERG testbed, and the prototype services we are developing on it, will help demonstrate ease of service construction and deployment.

ICEBERG/Ninja builds on existing research projects which provide the campus-area processing clusters and high bandwidth network interconnect. The campus-wide testbed encompasses the latest networking technologies, including wired and wireless telephony and early prototypes of digital cellular and high-density wireless LAN technologies (see Figure 11).

The testbed is being deployed throughout the Berkeley campus, spanning several “smart” classrooms. The toolkit is being used to build collaborative learning services for student use. These include instant messaging and access to web information, shared class repositories, and personal information management applications.

4.2. Transparent Information Access Service: The Universal In-Box

The Universal In-Box Service is an early ICEBERG service (Figure 12). Users can manage their incoming and outgoing communications modes. The service drives ICEBERG’s support for cascaded networks that span cellular (using the BTS gateway described in Section 4.2.2), the PSTN (via an H.323 gateway), paging (with the deployment of a two-way ReFLEX basestation from Motorola), and Internet-based electronic mail and real-time streams. It also provides a specific context for any-to-any format conversion services.

This illustrates how the infrastructure can be leveraged to route information in a dynamic, timely, and user- or service-specific fashion. The service architecture controls routing, within the infrastructure (e.g., agents acting on behalf of users) and externally (e.g., service- or user-specific control).

The Universal In-Box provides user-specific information collection and aggregation, simplifying the task of data delivery and dissemination. Coupled with the automatic creation of transformational paths, the Universal In-Box allows users and services to disseminate information without regard to its format or end-device capabilities. This attribute is an important decoupling of the source of the data and its destination. A user specifies the format of their choice when accessing information sent to them or in interacting with other users. This is an important distinction between our work and related work in universal messaging applications.

An example of this service is distributing the same information to multiple users across multiple networks (e.g., sending one message to users with cellular phones, pagers, voicemail, E-mail, etc.). Consider being able to type a message and have it delivered to a list of users, where the list not only contains e-mail addresses, but also cellular phone numbers, office numbers, pager numbers, voice mail numbers, etc. The infrastructure transparently converts the message to the appropriate format for each recipient. This simplifies and optimizes information delivery for senders and recipients.

Figure 11. ICEBERG/Ninja Testbed. The Testbed includes GSM wireless, wireless LAN, and ReFLEX two-way pager access networks. It is being built on a gigabit campus-scale network backbone that will be interconnected to Internet-2 as well as xDSL and cable modem “last mile” technologies. The testbed incorporates considerable processing resources and diverse access devices, including telephone handsets. (Diagram labels: Network Infrastructure, GSM BTS, Millennium Cluster, WLAN, Pager, IBM WorkPad, CF788, MC-16, Motorola Pagewriter 2000, Ericsson, Smart Spaces, Personal Information Management.)

Figure 12. Universal In-Box. It provides any-to-any document routing and formatting. (Diagram labels: Policy-based, Location-based, Activity-based; Speech-to-Text; Speech-to-Voice Attached-Email; Call-to-Pager/Email Notification; Email-to-Speech; all compositions of the above.)

4.3. Multi-Modal Interfaces and Smart Spaces

Traditional computer applications provide processing (wordprocessing, spreadsheet), storage (databases), and communi-cations (e-mail, Web browsing). Smart spaces is new kind ofapplication that adds end-user control to the list of computercapabilities. These are environments that contain computer-controllable sensors, actuators, and I/O devices (e.g., cam-eras, microphones, thermostats, etc.). They are an extensionof today’s office environments, which contain a variety ofcomputer-controlled devices (e.g., HVAC: Heating/Ventila-tion and Cooling systems, door locks, elevators, slide projec-tors, TV monitors, A/V devices, etc.), many of which cannotbe directly controlled by users. Cost reductions are making itpossible to extend computer control to the home environ-ment, where users are building spaces that contain cameras,microphones, movable blinds, DVD players, and other A/Vdevices. Users want to be able to access and control thesedevices using their mobile devices and their new communi-cations networks. A complication is that every smart space isunique--no two have the same devices or interfaces.

4.3.1 Interactive Voice Response to Smart Spaces Control Service

We have implemented a complex composed service formulti-modal control of smart spaces through cascaded net-works: the Interactive Voice Response (IVR) to Smart SpacesControl (SSC) service. It implements speaker-independentIVR to control A/V resources in several classrooms and con-ference rooms in Soda Hall.

Figure 13 provides an overview of the SSC environment: anSSC server, remote users of the environment, Java-based ser-vices, and gateways to non-Java services and devices.Remote users, services, and devices are entities, capable of

sending and receiving messages to and from each other. TheSSC server’s operation involves the multi-stage transforma-tion of messages being sent between entities.

Users use graphical, text, or speech-based user interfaces tointeract with one another and to control the devices in thesmart space. For example, Barbara can speak a message intoher computer and send it to Emre. He can choose to receiveit as audio or as a text popup message. The SSC server auto-matically performs the transcodings. The source entity doesnot know the target entity’s output format.

The SSC server is based on the Ninja run-time technology to transcode information from any input to any output format. It combines operators, each performing a single, specific transcoding, into a path to perform more complex transcodings. The SSC server uses SDS to discover the operators and creates the paths automatically and dynamically.

The steps taken when a user sends a command to a smart spaces room are (Figure 14):

• The user provides input by speaking into a local microphone, over wired or wireless IP using Mbone audio conferencing tools (VAT), or over GSM voice, or by typing commands directly.

• Voice input is converted to Pulse Code Modulation (PCM) audio and a speech recognition operator converts it to text. This requires the selection of a language and an N-gram, an application- or service-specific grammar. If the user typed the commands, this step is skipped.

• A Natural Language Processing (NLP) operator converts the text into device commands. This also requires the selection of a grammar specific to the target services. Our current NLP engine uses simple word-spotting.

• The command is delivered to the room entity, which forwards the command to the appropriate device. It then sends a response to the user. This is converted to the user’s preferred format via a reverse path of operators.
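The steps above can be sketched as a short chain of operators. This is only an illustration under stated assumptions, not the SSC implementation: the names `recognize_speech`, `text_to_command`, `RoomEntity`, and `handle_input`, the device and action vocabularies, and the response strings are all hypothetical, and the speech recognizer is a stub that treats its "audio" input as already-decoded text.

```python
from typing import Optional, Tuple

# Hypothetical word-spotting vocabularies for a room control service.
DEVICES = {"lights", "projector"}
ACTIONS = {"on", "off"}


def recognize_speech(pcm_audio: bytes) -> str:
    """Stub for the speech recognition operator (assumption: audio is UTF-8 text)."""
    return pcm_audio.decode("utf-8")


def text_to_command(text: str) -> Optional[Tuple[str, str]]:
    """Word-spotting NLP operator: pick out one device word and one action word."""
    words = text.lower().split()
    device = next((w for w in words if w in DEVICES), None)
    action = next((w for w in words if w in ACTIONS), None)
    if device is None or action is None:
        return None
    return device, action


class RoomEntity:
    """Forwards commands to devices and produces a text response."""

    def __init__(self) -> None:
        self.devices = {name: "off" for name in DEVICES}

    def deliver(self, command: Optional[Tuple[str, str]]) -> str:
        if command is None:
            return "command not understood"
        device, action = command
        self.devices[device] = action
        return f"{device} turned {action}"


def handle_input(room: RoomEntity, payload: bytes, typed: bool) -> str:
    """Run the path; the recognition step is skipped for typed input."""
    text = payload.decode("utf-8") if typed else recognize_speech(payload)
    return room.deliver(text_to_command(text))
```

For example, `handle_input(RoomEntity(), b"please turn the projector on", typed=False)` walks the recognition, word-spotting, and delivery steps in order. The reverse path that converts the response into the user's preferred format is omitted here.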

We have implemented complex operators using wrappers around large programs: audio format conversion tools, speech-to-text recognition based upon the International Computer Science Institute’s speech recognition project [ICSI98], text-to-speech generation using the University of Edinburgh’s Festival project [Festival98], and various existing room control programs [Hodes98].

Figure 13. The Smart Spaces Control (SSC) Environment. This figure shows an SSC server, two local users of the environment (entities), an Mbone RTP gateway to remote users and devices, and a room control gateway. The server provides routing and transformational/transcoding services between the various components.

Figure 14. IVR/Smart Spaces Control Service. Each of the boxes represents an operator in the service, while the ovals are target devices. ICEBERG uses the operator interface specification information passed along the metadata/control channel to dynamically construct the path from an input source to a destination device.

4.3.2 SSC Architectural Issues

“Smart spaces” build on the protocol and middleware underpinnings that allow client devices to act as “universal remote controls” [Hodes97]. One of the key challenges for the SSC service was providing a means for developers to rapidly add new devices and communications networks. The SSC provides strongly-typed transcoding components, automatic path creation (APC), and channels for control or metadata information. We have deployed our control architecture in two seminar rooms in Soda Hall, providing support for a group of participants in a conference room taking part in a lightweight-sessions collaboration [Hodes99].

Strongly-Typed Interfaces

Operator interfaces describe the input and output data types. They also contain information about how to load and run the operator code. These interfaces are used to enable the automatic composition of operators into paths, and to check the correctness of manually created paths. Interfaces are specified using eXtensible Markup Language (XML) documents.

An example data type description in XML for a 44 kHz, 16-bit WAV audio stream is the following:

<stream>
  <datatype>audio</datatype>
  <meta>
    <format>WAV</format>
    <samplerate>44000</samplerate>
    <samplesize>16</samplesize>
  </meta>
</stream>

The Service Discovery Service is the repository for operator interfaces (Section 3.4.1).
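To make the role of typed interfaces concrete, the following sketch parses a data type description and checks whether one operator's output can feed another operator's input. It is an illustration only: the element layout (a `stream` root with a `datatype` element and a `meta` block) is an assumed well-formed rendering of the example above, and the `parse_type` and `compatible` helpers are hypothetical names, not part of Ninja or the SDS.

```python
import xml.etree.ElementTree as ET

# Assumed well-formed layout for a 16-bit WAV audio stream description.
WAV_STREAM = """
<stream>
  <datatype>audio</datatype>
  <meta>
    <format>WAV</format>
    <samplerate>44000</samplerate>
    <samplesize>16</samplesize>
  </meta>
</stream>
"""


def parse_type(xml_text: str) -> dict:
    """Flatten a type description into a dict of field name -> string value."""
    root = ET.fromstring(xml_text)
    description = {"datatype": (root.findtext("datatype") or "").strip()}
    meta = root.find("meta")
    if meta is not None:
        for child in meta:
            description[child.tag] = (child.text or "").strip()
    return description


def compatible(output_type: dict, input_type: dict) -> bool:
    """An output can feed an input if it matches every field the input requires."""
    return all(output_type.get(key) == value for key, value in input_type.items())
```

Under this scheme an operator that accepts any WAV audio would declare only `{"datatype": "audio", "format": "WAV"}` as its input type, and `compatible(parse_type(WAV_STREAM), ...)` would hold for it. Treating unspecified fields as "don't care" is a design choice of this sketch.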

Automatic Path Creation

Given a set of typed input and output devices, operators, and data streams, Ninja automatically creates a path from the input device to the output device. Automatic Path Creation (APC) first chooses the operators for the path, and then instantiates and connects them. It attempts to find the shortest path between inputs and outputs. However, “operator cost” functions other than path length can be considered, such as total computational cost (e.g., a longer sequence of operators may yield a lower number of computations). If the operators are on separate machines, there are additional “communications costs.” ICEBERG will eventually support developer- and user-specified cost functions.
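The operator selection step of APC can be viewed as a shortest-path search over data types, with operators as edges. The sketch below uses Dijkstra's algorithm so that the same code handles both plain path length (every operator costs 1) and other cost functions. It is not Ninja's implementation: the `Operator` record, the type strings, and the `create_path` function are hypothetical, and instantiating and connecting the chosen operators is left out.

```python
import heapq
from typing import List, NamedTuple, Optional


class Operator(NamedTuple):
    name: str
    input_type: str
    output_type: str
    cost: float = 1.0  # unit cost reduces the search to "fewest operators"


def create_path(operators: List[Operator], source_type: str,
                target_type: str) -> Optional[List[Operator]]:
    """Return the cheapest operator chain from source_type to target_type."""
    # Queue entries: (total cost, unique tie-breaker, current type, chain so far).
    queue = [(0.0, 0, source_type, [])]
    best = {source_type: 0.0}
    counter = 1
    while queue:
        cost, _, current, chain = heapq.heappop(queue)
        if current == target_type:
            return chain
        if cost > best.get(current, float("inf")):
            continue  # stale entry; a cheaper route to this type was found
        for op in operators:
            if op.input_type != current:
                continue
            new_cost = cost + op.cost
            if new_cost < best.get(op.output_type, float("inf")):
                best[op.output_type] = new_cost
                heapq.heappush(queue, (new_cost, counter, op.output_type, chain + [op]))
                counter += 1
    return None  # no chain of operators connects the two types
```

With operators for GSM-to-PCM conversion, speech recognition, and text-to-command translation registered, `create_path(ops, "audio/gsm", "command")` returns the three-operator chain. Communications costs between machines could be folded into `cost` in the same way.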

Control/Metadata Channels

The choice of operator is not solely a function of the XML interface, but rather of the input/output devices and users. For example, we have found that generic (large vocabulary), speaker-independent speech recognition can be slow and inaccurate. However, if the recognizer N-gram is constrained, the result is rapid and more accurate speech recognition. The selection information may come from speaker identification (e.g., caller line identification from a phone) or from knowing the target application or device (e.g., controlling the devices in a smart space).

Ninja provides a control/metadata information channel to convey information between operators, streams, and I/O devices during APC. It is also used during operator execution as a mechanism for passing state information. The channel allows a service to specify requirements such as “target service is room control.” The speech recognition and NLP operators monitor the channel for a variable such as “targetService = RoomControl” and use it to select the appropriate N-gram and grammar.
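A minimal sketch of this mechanism follows, assuming the channel behaves like a shared key/value store with change notifications. The `ControlChannel` and `SpeechRecognizer` classes and the grammar names are hypothetical; only the `targetService = RoomControl` variable comes from the text.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical mapping from target service to a constrained grammar.
GRAMMARS = {"RoomControl": "room-control-grammar"}
DEFAULT_GRAMMAR = "large-vocabulary"


class ControlChannel:
    """Key/value metadata shared along a path, with change notification."""

    def __init__(self) -> None:
        self._values: Dict[str, str] = {}
        self._listeners: List[Callable[[str, str], None]] = []

    def subscribe(self, callback: Callable[[str, str], None]) -> None:
        self._listeners.append(callback)

    def set(self, key: str, value: str) -> None:
        self._values[key] = value
        for callback in self._listeners:
            callback(key, value)

    def get(self, key: str) -> Optional[str]:
        return self._values.get(key)


class SpeechRecognizer:
    """Operator that picks its grammar from the targetService variable."""

    def __init__(self, channel: ControlChannel) -> None:
        self.grammar = GRAMMARS.get(channel.get("targetService"), DEFAULT_GRAMMAR)
        channel.subscribe(self._on_change)

    def _on_change(self, key: str, value: str) -> None:
        if key == "targetService":
            self.grammar = GRAMMARS.get(value, DEFAULT_GRAMMAR)
```

When a service calls `channel.set("targetService", "RoomControl")`, the recognizer switches from the large-vocabulary default to the constrained grammar without any change to its typed data interface.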

Rapid Extensibility

Adding support for new devices is simple. We added an interface for GSM cellphones to the room control application within a few hours. We are extending the system to support other services, such as multi-modal access to web-based services like headline news and stock reports. Some of these require different transcoding steps from the ones in the basic room control service. However, the environment allows developers to rapidly add support for such services by simply specifying the interface to the new services and devices.

5. Performance and Integration Issues

5.1. GSM Cellular/IP Interworking

We have implemented an InterWorking Function (IWF) between the GSM basestation (BTS) and an IP core. The IWF converts GSM circuit-switched voice and data calls into IP packet streams. Using our GSM-IP IWF, GSM voice and data calls can be terminated directly by hosts connected to the IP network, including our in-building wireless LAN (WaveLAN).

Design of the GSM-IP IWF

Our IWF operates between an Ericsson RBS 2202 basestation (supporting 15 active mobile handsets) and the IP core (Figure 15). The Ericsson User Part Simulator (UPSim) simulates the GSM network’s control components and provides a high-level control interface to the BTS, such as “establish a call to Mobile Subscriber # on time slot #”. The IP Packet Assembler and Disassembler (IP-Pad) converts the circuit-switched data frames from the BTS into Mbone packets, and vice versa. In the near future, the IP-Pad will include support for H.323-based applications and gateways. The IP-Pad serves as the overall controller for the GSM-IP gateway, controlling call initiation, configuration, and tear-down.

Figure 16 shows the mapping between the GSM time slots and the timing structure expected by the PSTN. The link between the RBS and the IP-Pad carries E1 time slots, each of which provides 8 bits at 8 kHz, or 64 kbit/s. GSM voice requires only a 13 kbit/s data stream, so to save bandwidth, the audio from four mobile subscribers is interleaved, two bits from each call at a time, into a single time slot.
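The arithmetic behind this interleaving is that two bits taken 8000 times per second give a 16 kbit/s sub-stream, and four such sub-streams fill one 64 kbit/s time slot. The sketch below separates and recombines the four calls. The bit positions are an assumption made for illustration (call 0 in the two most significant bits of each octet); the actual ordering on the link is not stated in the text, and the function names are hypothetical.

```python
from typing import List

BITS_PER_SLOT = 8          # one E1 time slot carries 8 bits per frame
SLOT_RATE_HZ = 8000        # E1 frames per second
BITS_PER_CALL = 2          # bits taken from each call per frame
CALLS_PER_SLOT = BITS_PER_SLOT // BITS_PER_CALL      # 4 calls share a slot
SUBSTREAM_BPS = BITS_PER_CALL * SLOT_RATE_HZ         # 16000 bit/s per call


def demultiplex(slot_octets: bytes) -> List[List[int]]:
    """Split time-slot octets into four streams of 2-bit values, one per call."""
    streams: List[List[int]] = [[] for _ in range(CALLS_PER_SLOT)]
    for octet in slot_octets:
        for call in range(CALLS_PER_SLOT):
            # Assumed layout: call 0 occupies bits 7-6, call 3 occupies bits 1-0.
            shift = BITS_PER_SLOT - BITS_PER_CALL * (call + 1)
            streams[call].append((octet >> shift) & 0b11)
    return streams


def multiplex(streams: List[List[int]]) -> bytes:
    """Inverse of demultiplex: interleave four 2-bit streams into octets."""
    octets = bytearray()
    for pairs in zip(*streams):
        octet = 0
        for pair in pairs:
            octet = (octet << BITS_PER_CALL) | (pair & 0b11)
        octets.append(octet)
    return bytes(octets)
```

Each 16 kbit/s sub-stream leaves room above the 13 kbit/s GSM voice rate for framing.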

Figure 17 illustrates the process by which the time-sequenced voice data is mapped into the IP packet format. The IP-Pad extracts GSM audio frames from the bit stream from the BTS, prepends the appropriate headers, and injects the packet into the Internet. The frame format is compatible with standard Mbone conferencing tools (e.g., VAT).
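The header-prepending step can be sketched as follows. This is an illustration rather than the IP-Pad code, and it makes several assumptions: the 260 GSM frame bits have already been extracted from the bit stream and are in the order the payload format expects, the packet uses the 12-byte RTP version 2 fixed header with the static GSM payload type 3 from the RTP audio/video profile, and the 33-byte payload begins with that profile's 4-bit GSM signature (0xD). The UDP and IP headers are left to the operating system's socket layer. Function names are hypothetical.

```python
import struct

RTP_VERSION = 2
PAYLOAD_TYPE_GSM = 3       # static payload type for GSM (RTP audio/video profile)
SAMPLES_PER_FRAME = 160    # one GSM frame covers 20 ms of 8 kHz audio
GSM_FRAME_BITS = 260
GSM_SIGNATURE = 0xD        # 4-bit marker preceding the 260 frame bits
GSM_PAYLOAD_BYTES = 33     # (4 + 260) bits / 8


def gsm_payload(frame_bits: int) -> bytes:
    """Pack a 260-bit GSM frame (given as an integer) into a 33-byte payload."""
    if frame_bits < 0 or frame_bits >> GSM_FRAME_BITS:
        raise ValueError("frame does not fit in 260 bits")
    value = (GSM_SIGNATURE << GSM_FRAME_BITS) | frame_bits
    return value.to_bytes(GSM_PAYLOAD_BYTES, "big")


def rtp_packet(payload: bytes, sequence: int, timestamp: int, ssrc: int) -> bytes:
    """Prepend a 12-byte RTP fixed header (no padding, extension, or CSRCs)."""
    first_octet = RTP_VERSION << 6
    second_octet = PAYLOAD_TYPE_GSM            # marker bit clear
    header = struct.pack("!BBHII", first_octet, second_octet,
                         sequence & 0xFFFF, timestamp & 0xFFFFFFFF,
                         ssrc & 0xFFFFFFFF)
    return header + payload
```

A sender would advance `sequence` by 1 and `timestamp` by `SAMPLES_PER_FRAME` for each frame, then hand the result to a UDP socket, which supplies the remaining headers.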

Mobility Management Interworking

An important consideration for mobile users is providing a single access network address (e.g., phone number or IP address). This is accomplished by mapping a unique address to a local address for each access network cell or segment. Each network implements its own functionality for mobility. IP networks use Mobile IP with its home agent/foreign agent-based approach, while GSM uses a Home Location Register/Visitor Location Register. Both systems provide the same functionality, indirection for location independence, yet they use different mechanisms.
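The indirection common to both schemes can be reduced to a table that maps a permanent address to the current network and local address. The sketch below is deliberately generic and models neither Mobile IP nor the GSM registers in any detail; the `LocationRegistry` class and the example addresses are hypothetical.

```python
from typing import Dict, Optional, Tuple


class LocationRegistry:
    """Home-side table: permanent address -> (access network, local address)."""

    def __init__(self) -> None:
        self._bindings: Dict[str, Tuple[str, str]] = {}

    def register(self, permanent: str, network: str, local: str) -> None:
        """Record where the user can currently be reached, replacing any old binding."""
        self._bindings[permanent] = (network, local)

    def resolve(self, permanent: str) -> Optional[Tuple[str, str]]:
        """Return the current binding, or None if the user is unreachable."""
        return self._bindings.get(permanent)


registry = LocationRegistry()
registry.register("user-1", "gsm", "cell-17")          # user is on the cellular network
registry.register("user-1", "wlan", "192.0.2.7")       # user roams onto the wireless LAN
```

Callers always address `user-1`; only the registry learns that the user moved from the cellular network to the wireless LAN.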

We are investigating how to provide interworking between these, so users can freely roam between these networks. This is a service handoff view of vertical handoff [Stemm98, Wang99]. It involves moving the call-setup state to a new network (i.e., changing the routing of the call) or forwarding the call data to the new network (i.e., similar to the way MSC-to-MSC handover is performed in cellular networks).

Related to mobility management are Generalized Redirection Agents. These are user- or service-specified dynamic policy-based redirections (1-800 service, email-to-pagers, etc.), and are required for service mobility.

5.2. Graphical Multi-Layer Protocol Analysis

Cascaded networks result in multiple communications protocols executing at different layers (e.g., link, network, and transport). Each layer is designed and implemented independently. However, it is important to understand the effects of inter-layer interactions.

This requires tools that allow developers to quickly evaluate protocol performance using experimental measurements and simulation tools. The former are critical because they expose effects that are not visible using simulation (e.g., errors or differences between the implementations used for experiments and simulations). Tracing protocols, even for short periods of time, results in a large amount of trace data (300 bytes/s of trace data for a 10 kbyte/s connection).

To provide a platform for rapid performance analysis, we have developed a graphical, multi-layer protocol tracing tool, MultiTracer. The tool is part of a testbed for collecting trace data at multiple protocol levels. Figure 18 illustrates where trace collection resides within our testbed. MultiTracer post-processes trace data by automatically performing the customized correlations and representation conversions. The resulting data is presented in a comprehensive, interactive graphical manner (Figure 19).

We are examining the performance of TCP over GSM circuit-switched digital cellular links, where the radio link is protected by a reliable link layer protocol (RLP). We are interested in understanding whether retransmissions at the link layer due to high error rates are misinterpreted by TCP as congestion-related losses. Initial measurements show that TCP throughput over GSM is optimal in over 60 percent of the traces. These results are surprising, given that our trace set includes a disproportionate number of results from poor coverage areas. They confirm earlier work [Baucke97, Kojo97]. However, our setup gave us the opportunity to determine the throughput that TCP provided to the application relative to what RLP provided to TCP. Without insight into link layer performance, it is impossible to infer whether low throughput above TCP was due to a bad radio channel or inefficient interactions between TCP and RLP.

Figure 15. ICEBERG GSM-IP InterWorking Function (IWF). Labels in the figure: GSM phone; RBS 2202 (2 TRX); UPSim (uses OM & TRAFFIC to simulate BSC, MSC, and HLR functionality); IP-Pad; GPC board Thor-2 (performs rate adaptation function of ZAK/TRAU); E1 traffic and signaling (voice at 13 kb/s, data at 12 kb/s); control signaling; Ethernet; Internet; PC; VAT; Interactive Voice Response; Infocaster; H.323 gateway; NetMeeting; PSTN.

Figure 16. Time Slot Mapping. Each E1 frame carries 32 time slots of 8 bits each; 2 bits from an E1 time slot form one 16 kbps stream.

Figure 17. Mapping onto IP Packets. 320 bits (40 bytes) from a 16 kbps stream carry a 260-bit (32.5-byte) GSM audio frame between flags; the frame is sent as a 33-byte RTP payload behind the RTP, UDP, and IP headers.

Our results reveal that spurious TCP timeouts are rare and do not significantly affect throughput, contradicting previous claims [DeSimone93, Balakrishnan95, Kojo97]. These mention the problem of competing retransmissions between TCP and a reliable link layer protocol resulting from spurious timeouts at the TCP sender. [Kojo97] even claims that this is the main reason for those cases where the throughput of TCP running over RLP is not optimal. However, we could not confirm this explanation with our measurements.

We are just beginning to use MultiTracer for network performance analysis. However, it already demonstrates a powerful ability to experiment with protocol simulations across layers.

6. Related Work

6.1. Internet Technology

ICEBERG assumes as its foundation several critical aspects of Internet technology that are influencing the evolution of future converged networks. The first is that intelligence and control are migrating to the network edges. Packet audio is enabled, in part, by software-based audio and video codecs in powerful end-devices. This is now migrating to relatively inexpensive hand-held devices. Internet protocols like the Real-time Transport Protocol (RTP) make no assumptions about the quality of the underlying network, but rather adapt the sending rates to what the network can support [Schulzrinne95]. These protocols form the foundation of the H.323 gateway architecture for integrating the PSTN with the Internet [ITU98].

The second is the placement of a computing “utility” in the core of the network. Today, computers are at the network edges, administered by end organizations. We expect to see more processing and storage in the core, deployed by service providers and offering capabilities for the deployment of services valuable for end users. One such service is the transcoders used to adapt web content for display on hand-held devices [Fox98]. An impediment to such service deployment has been an inability to charge for it. This may change in the converged network. Even if it does not, the ability to deploy new services on top of computers embedded in the switching infrastructure could serve to differentiate service providers from their competitors through more comprehensive and sophisticated service offerings. This is also part of the motivation for the Advanced Intelligent Network, but that effort is tightly integrated with the call-oriented structure of the PSTN. It cannot harness the huge developer community familiar with Internet technology.

IP will emerge as the common routing glue to interconnect diverse access networks and access devices. Because of the closed nature of many of the existing access networks, including the PSTN, the IP core is also the most attractive location for deploying new services.

6.2. Integrated Services Packet Network

The Next Generation Internet will likely borrow ideas from more traditional telecommunications networks, such as reservations. It will also include features not well supported in the existing PSTN: multipoint-to-multipoint multicast communications, mobility, and mobile route optimization.

To provide better support for differentiated services, the Internet community has defined mechanisms for reservation-based resource allocation, using protocols such as RSVP [Zhang93]. Reservations are “promises” rather than guarantees, and applications must be written to adapt to varying network performance conditions. The protocols have been designed to scale by being receiver-directed and integrated with the network’s multicast routing protocols (a soft/dynamic form of connection-orientation). They are based on soft state in the network to allow robust recovery from failure.

Figure 18. Multi-layer Tracing Environment. Labels in the figure: fixed host and mobile host (UNIX, BSDi 3.0), each with a traffic source/sink (e.g., sock), TCP, TCPDUMP, and TCPSTATS; RLP with RLPDUMP; GSM basestation; BTS IP backend; MultiTracer; plotting tool (e.g., xgraph); trace replay in simulator (e.g., ns, BONeS).

Figure 19. Example MultiTracer Plot. Bytes versus time of day (sec) for the series TcpSnd_data, TcpRcv_data, TcpSnd_ack, TcpRcv_ack, and RlpSnd_rst. Annotations: 18 segments; 13 segments dropped at TCP receiver; 5 segments lost due to RLP reset.

Software-based codecs for real-time audio and video streams are widely used. Rather than use the PSTN’s 64 kbps pulse code modulation (PCM) coding, numerous software codecs exist for 36 kbps adaptive differential pulse code modulation (ADPCM), 17 kbps GSM (used in the international digital cellular standard), and even 9 kbps linear predictive coding (LPC). “Adequate” conferencing video can be achieved at 28.8 kbps to 128 kbps (entertainment video requires higher data rates).

Another key component is the Real-time Transport Protocol (RTP). In RTP and its associated control protocol RTCP, the ends adapt audio/video streaming rates to what the network can support. This has been integrated into the International Telecommunication Union’s specifications for the H.323 protocols for integrating the Internet and PSTN for real-time streams.

An especially important point is that the Next Generation Internet allows the easy integration of new services like transcoder proxies.

6.3. Voice over IP

Voice over IP is developing rapidly. Quality issues remain, especially in terms of high latencies and packet losses. However, these issues will be mitigated by the deployment of appropriately provisioned (virtual) private networks, faster/scalable hardware to reduce gateway latencies, the pervasive use of RSVP and H.323, more sophisticated techniques for the reconstruction of lost packets using smarter forward error correction as well as interpolation between voice packets, and better voice coding at lower data rates such as 8 kbps.

While it may be inelegant to solve performance problems simply by adding more bandwidth, the fact remains that at least in the wide area, bandwidth is a commodity. Several new generation service providers, such as Qwest (www.qwest.com) and Level 3 (www.level3.com), are deploying SONET-based fiber optic, packet switched, national and international backbones with very high available bandwidth. These support Voice over IP with dial-in gateways and direct conversion into 64 kbps data streams. Note, however, that bandwidth in the local loop is not a commodity, at least at this time, and must still be carefully managed, perhaps through such mechanisms as policy-based queue management across bottleneck links [Floyd95].

6.4. Major Systems Projects

InfoPad

InfoPad was an ambitious research project at Berkeley whose goal was to develop the hardware, software, and mobile network support for ubiquitous, wireless access to real-time multimedia data from high speed networks using an inexpensive, portable terminal. It was an early effort to implement the concept of “big infrastructure, small client,” motivated primarily by moving power- and computing-intensive processing to the wireline infrastructure [Narayanaswamy96]. An InfoPad was a “network terminal,” but with the additional capability of portability.

Xerox PARC Ubiquitous Computing Projects

In the early 1990s, the Xerox Palo Alto Research Center built a variety of access devices, called tabs, pads, and boards, as part of its ubiquitous computing initiative. The concept was that traditional computers would disappear. While the effort did not develop a comprehensive service architecture, it did develop many innovative applications related to smart spaces and collaborative environments [Want95].

Daedalus/GloMop

The Daedalus/GloMop project at Berkeley developed a networking and applications support model for diverse wireless access networks and heterogeneous end devices [Brewer98]. A key element of the approach was the client-proxy-server model, which placed software functionality in the path between server and client to adapt content to the capabilities of the end device. Thus a large-format, full color web image could be transcoded on the fly to a format suitable for display on a PalmPilot PDA screen. Related to proxies is the ability to deploy them on a Network of Workstations computing platform, which yields a service that is both scalable and highly available. The TACC (Transformation, Aggregation, Caching, and Customization) model makes it possible for user-written services to be embedded in the run-time environment. The system supported only limited customization of services and persistence of data, and was not designed for wide-area execution.

6.5. Object Systems for Applications Development

Object-oriented systems are of considerable importance for the next generation of applications. They are often called middleware, because they provide building blocks that bridge between the networking and applications layers.

CORBA

The Common Object Request Broker Architecture, or CORBA, defines a set of mechanisms based on interface specifications (defined in a standardized Interface Definition Language, or IDL) and Applications Programming Interfaces (APIs) that allow location transparency among communicating applications components. An Object Request Broker, or ORB, provides the linkage between clients and servers, by supporting client invocation of server objects across the network. An ORB intercepts the call, finds a suitable object that can service the request based on discovering matching APIs through a process called introspection, passes the found server object the necessary parameters, invokes the appropriate method, and finally returns the results to the client. This is accomplished in a manner that is transparent to object location, implementation language, or operating system on the client or server machines. Client and server objects communicate across a network using CORBA’s Internet InterORB Protocol, or IIOP.

Java and JavaBeans

Java is a strongly typed, C++-like object-oriented programming language that is network-aware, interpretive (with just-in-time compilation and dynamic run-time checking), highly portable, and multithreaded. Its strong typing and interpretive architecture significantly reduce commonly encountered bugs, thereby enhancing the safety of program execution while also enabling applications that are “write once, run anywhere.” Because it was designed to make it easier to develop applications on networks, Java is particularly well-suited as a programming language for developing services in the kind of heterogeneous environment of converged networks. Java is used in ICEBERG and Ninja.

JavaBeans is a platform-independent, portable component model for use with Java. Its design goals are very similar to CORBA’s: software developers provide components that can be composed by end users to form applications. Beans are software components with specified interfaces (discoverable through introspection). They are customizable for the usage at hand, support a source/listener event model for bean interconnection, expose properties such as methods and events, and provide mechanisms for persistence so they can be easily halted and restarted. Beans span the range from simple application building blocks (e.g., special functionality buttons, graphical user interface components, or other small to medium sized control functionality) to full-blown applications (e.g., a word processor or spreadsheet application) that can be composed to form compound documents. JavaBeans bridges to other component models.

Beans run in containers, which may be Java or non-Java applications. Java RMI (Remote Method Invocation) has been defined for communications with Java-based servers. There is also a Java IDL to allow Java clients to communicate with CORBA-based servers.

CORBA and Java are orthogonal in concept. While CORBA addresses network location transparency, Java solves the problem of implementation transparency to allow components to run anywhere in the network. JavaBeans is being designed so as to allow interoperation with CORBA objects.

JINI

JINI is a recent development in distributed computing architectures. Based on Java, JINI’s goal is to support “spontaneous networking,” allowing new devices to discover their network environment and configure themselves to operate in that environment. It provides standard protocols to allow service providers to register their capabilities and service clients to find and use these capabilities. Devices announce the services they can provide, as well as their attributes and capabilities, allowing the services they invoke to be customized for their needs.

In JINI’s terminology, a service is any entity that can be used by another person, program, or service. Services include computation, storage, communications, filtering, hardware devices, or even another user. A service protocol is a set of Java APIs. A JINI Federation is a dynamic composition of services to accomplish a specific task.

JINI’s run-time environment depends on two critical operations. The first is Discovery/Join, which provides the mechanisms for a device to register with the network for the first time, without knowing anything about the network. The device uses broadcast mechanisms to announce its presence.

The second is the Lookup Service. This is a bulletin board for network services. The service provider can deposit its executable invocation specification into the Lookup Service. A client can then obtain this invocation code (in essence, a dynamically downloaded driver) from the Lookup Service, thereby enabling service invocation through a Java Remote Method Invocation (RMI) call. Services can also be implemented through combinations of local and remote processing. Such implementations are called smart proxies.

Other elements of the JINI run-time environment include Leasing (timeouts on registrations), Distributed Transactions (undo/redo mechanisms), and Distributed Events (reliable delivery of events layered onto the event model).
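The interplay between a lookup service and leasing can be illustrated with a small sketch. It is written in Python for brevity and does not use or reproduce JINI's Java APIs; the `LookupService` class, its methods, and the "printer" service are hypothetical. Time is passed in explicitly so that lease expiry is easy to follow.

```python
from typing import Callable, Dict, Optional, Tuple


class LookupService:
    """Bulletin board of service proxies whose registrations expire unless renewed."""

    def __init__(self) -> None:
        # service name -> (proxy callable, time at which the lease expires)
        self._entries: Dict[str, Tuple[Callable, float]] = {}

    def register(self, name: str, proxy: Callable,
                 lease_seconds: float, now: float) -> None:
        """Deposit (or renew) a proxy; the registration lapses after the lease."""
        self._entries[name] = (proxy, now + lease_seconds)

    def lookup(self, name: str, now: float) -> Optional[Callable]:
        """Return the proxy if its lease is still valid, else drop it."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        proxy, expires_at = entry
        if now >= expires_at:
            del self._entries[name]
            return None
        return proxy
```

A provider that crashes simply stops renewing, and its entry disappears once the lease runs out, so clients do not keep finding a dead service.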

6.6. Telecommunications-oriented Intelligent Network

A desire to more rapidly deploy new services in the telecommunications network has driven the development of the Advanced Intelligent Network (AIN). This is achieved by creating a standardized service creation environment independent of the underlying vendor-specific switch platforms. A critical enabling technology for AIN is Signaling System 7 (SS7), an internationally standardized channel signaling system for controlling switches and databases throughout the phone network. Service Switching Points (SSPs) intercept certain patterns of call processing steps to invoke service logic in Service Control Points (SCPs; see footnote 6). The service logic then influences the subsequent call processing steps. It is through such mechanisms that 800 number and call forwarding services are deployed in the PSTN. AIN is intimately coupled to the hierarchical switching structure of the phone network and the logical sequencing of call processing.

6. An SCP is essentially the hardware/software support for a database. SCPs use the database to support such operations as phone number remapping for 800 services and call forwarding.


TINA, the Telecommunications Information Networking Architecture, is a recent research effort to open up the telecommunications service architecture to allow end users to access and customize their own services. Building on CORBA technology, TINA is developing the cooperating object components needed to implement telecommunications services using such advanced techniques as object orientation, distributed processing environments, intelligent agents, and multi-service networks.

6.7. Comparisons with ICEBERG/Ninja

ICEBERG and Ninja, though developed independently, contain many of the same concepts described in the systems above. If TINA is a telephony-oriented service architecture built on top of CORBA’s object-oriented model and distributed execution environment, then ICEBERG is to TINA as Ninja is to CORBA. Ninja provides service discovery services like JINI and an object composition framework like JavaBeans and CORBA. Ninja supports service migration through its foundation on the Java programming model. It also supports wide-area object execution similar to CORBA. A key difference is Ninja’s concept of an operator path, its focus on automated path compilation, and its approach to optimization based on careful placement of services within the network. With respect to the latter, Ninja has full knowledge of how to exploit networks of workstations as a processing base for services, providing a critical element of the solution to scalability in the service architecture. ICEBERG focuses on services to provide network, device, communications-type, and user interface transparency and independence. In addition to Ninja’s support for cluster processing and visibility into the underlying network topology, another advantage is that ICEBERG can exploit Ninja’s ability to compose objects dynamically. Such dynamic composition is not supported in CORBA to our knowledge. The InfoSpheres project at Caltech is researching the composition of distributed active mobile objects that communicate using messages [Caltech98]. This has some similarities with our own approach.

7. Summary and Conclusions

We claim that future telecommunications networks will be founded on a common network core: optimized for data, based on IP, enabling packetized voice, and supporting user, terminal, and most importantly, service mobility. Voice over IP technology is already developing rapidly. The major research challenge will be to develop an open and composable telecommunications services architecture. In many ways, this represents the wide-area “operating system” of the 21st Century. The existing PSTN architecture, even with the Advanced Intelligent Network architecture, is not the best structure for realizing this vision. A better approach is one founded on the client-proxy-server architectures so successfully deployed in the Internet.

This proxy agent approach is particularly crucial, because it is through ubiquitous support for transcoding and translation that we can provide sophisticated services for a broad diversity of access devices. These will go well beyond handsets or computers, to include combinations of both as well as new controllable “smart space” environments.

Our approach, a telecommunications service architecture called ICEBERG, is based on the Ninja execution platform. Ninja provides infrastructure, in the form of Units, Active Proxies, and Bases, and services, in the form of operators, typed connectors, and paths, to provide powerful and dynamic software functionality in the network core. We have used ICEBERG and Ninja to implement innovative applications like Interactive Voice Response to control computer-based devices in a smart room. We are extending these ideas to the next level of device and network independence and transparency by implementing a Universal In-Box service on top of ICEBERG/Ninja. ICEBERG contains several novel ideas:

Services Across Cascaded Networks

ICEBERG supports cascaded networks, where there are multiple paths between networks, and multiple places for services to execute. Core services and resources are exposed to end users. ICEBERG provides secure, authenticated mechanisms for allowing service- and entity-specific policies to be injected into the network infrastructure.

Injecting code into the network requires secure execution and authentication of policies and entities. It requires careful structuring of resource interfaces. ICEBERG gives entities control over the paths of their data as well as resource usage. It supports network-specific resource management, authentication, policies, and billing, to allow policies to be enforced.

Not Just Bit Transformations, but Service Transformation

ICEBERG addresses service transformations involving decision making about bit-level transformations and where to perform them, high-level transformations to address deficiencies, and routing data across multiple networks.

Entities in the system provide contextual information about themselves: who they are, what services they are using, where they are, and the capabilities/limitations of end devices and networks. This affects bit-level transformation choice and placement.

Users may have specific requirements (e.g., cost, delivery latency, or QoS) for cross-network information. The network gateways and end devices may only support limited bit-level transformations. End devices, services, and network gateways may need to negotiate a transformation to use. This process may traverse cascaded networks.
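The negotiation described above can be sketched as a set intersection followed by a constrained choice. This is a minimal illustration, not ICEBERG's actual protocol: the `Transformation` type, the `negotiate` function, the transformation names, and all cost and latency values are assumptions introduced here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Transformation:
    name: str          # hypothetical label for a bit-level transformation
    cost: float        # assumed cost units
    latency_ms: float  # assumed added delivery latency

def negotiate(offers, max_cost, max_latency_ms):
    """Pick the cheapest transformation that every party supports and that
    meets the user's cost and latency limits; return None if there is none."""
    if not offers:
        return None
    # Only transformations supported by all parties are candidates.
    common = set.intersection(*(set(offer) for offer in offers))
    feasible = [t for t in common
                if t.cost <= max_cost and t.latency_ms <= max_latency_ms]
    return min(feasible, key=lambda t: (t.cost, t.latency_ms, t.name),
               default=None)

# Illustrative offers from an end device, a network gateway, and a service.
GSM_TO_PCM = Transformation("gsm-to-pcm", cost=1.0, latency_ms=20.0)
GSM_TO_G723 = Transformation("gsm-to-g723", cost=0.5, latency_ms=60.0)
PCM_TO_TEXT = Transformation("pcm-to-text", cost=3.0, latency_ms=400.0)

device = [GSM_TO_PCM, GSM_TO_G723]
gateway = [GSM_TO_PCM, GSM_TO_G723, PCM_TO_TEXT]
service = [GSM_TO_PCM, GSM_TO_G723]

chosen = negotiate([device, gateway, service], max_cost=2.0, max_latency_ms=50.0)
```

With these assumed values the cheaper `gsm-to-g723` option is rejected because it exceeds the latency limit, so the user requirement rather than cost alone decides the outcome.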

Routing must be considered as a cascaded network function. The locally optimal choice of routing may not be globally optimal. Consider a call being placed between a VoIP user on a computer in San Francisco and a GSM user in London. Routing the call over IP might be optimal locally, but could introduce excessive jitter and delay in the wide area.
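The gap between a locally and a globally optimal route can be shown with a small sketch. The topology, node names, and delay values below are invented for illustration and are not measurements from the testbed; `greedy_route` always takes the cheapest next hop, while `best_route` is a standard Dijkstra shortest-path search over the whole cascaded topology.

```python
import heapq

# Assumed one-way delays in milliseconds (illustrative values only).
LINKS = {
    "sf-voip":    {"ip-core": 5, "pstn-gw-sf": 15},
    "ip-core":    {"gsm-london": 150},
    "pstn-gw-sf": {"gsm-london": 70},
    "gsm-london": {},
}

def greedy_route(links, src, dst):
    """Locally optimal routing: always follow the lowest-delay next hop."""
    path, total, node = [src], 0, src
    while node != dst:
        options = {n: d for n, d in links[node].items() if n not in path}
        if not options:
            raise ValueError("greedy routing reached a dead end")
        nxt = min(options, key=options.get)
        total += options[nxt]
        path.append(nxt)
        node = nxt
    return path, total

def best_route(links, src, dst):
    """Globally optimal routing: Dijkstra over the full topology."""
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == dst:
            return path, total
        if node in seen:
            continue
        seen.add(node)
        for nxt, delay in links[node].items():
            if nxt not in seen:
                heapq.heappush(queue, (total + delay, nxt, path + [nxt]))
    raise ValueError("no route")

local_path, local_delay = greedy_route(LINKS, "sf-voip", "gsm-london")
global_path, global_delay = best_route(LINKS, "sf-voip", "gsm-london")
```

Under these assumed delays the greedy choice enters the IP core first and ends with a higher end-to-end delay than the route through the PSTN gateway, which only a search with knowledge of both networks can find.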

ICEBERG is not reinventing bit-level transformations or routing protocols. The architecture leverages existing services and management functionality in each network. Our concern is with the level of operation and scope of decision-making.

Many networks are multi-modal, carrying different forms of information. This can be used to determine where transformations should be located. Reaching a global optimum in cascaded networks requires cooperation and communication between routing entities across the underlying networks.

Networks contain many of the resources needed by services, but not in a generic form. For example, all networks have some form of naming and authentication. ICEBERG unifies mechanisms across the networks by providing abstract naming and authentication that resolves to and relies upon network-specific mechanisms at the lowest level.
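As a rough sketch of abstract naming that resolves to network-specific mechanisms, consider a registry that maps one abstract name to a per-network identifier. The class `AbstractNameService`, its methods, the address prefixes, and the example identifiers are all hypothetical; the paper does not specify this interface.

```python
from typing import Callable, Dict

class AbstractNameService:
    """Maps one abstract name to network-specific identifiers (hypothetical)."""

    def __init__(self) -> None:
        self._formatters: Dict[str, Callable[[str], str]] = {}
        self._bindings: Dict[str, Dict[str, str]] = {}

    def register_network(self, network: str,
                         formatter: Callable[[str], str]) -> None:
        # Each network supplies its own way of turning a local id into an address.
        self._formatters[network] = formatter

    def bind(self, name: str, network: str, local_id: str) -> None:
        # Record the identifier this entity uses inside one network.
        self._bindings.setdefault(name, {})[network] = local_id

    def resolve(self, name: str, network: str) -> str:
        # Abstract lookup first, then the network-specific mechanism.
        local_id = self._bindings[name][network]
        return self._formatters[network](local_id)

names = AbstractNameService()
names.register_network("pstn", lambda number: "tel:" + number)
names.register_network("ip", lambda host: "host:" + host)  # made-up prefix
names.bind("alice", "pstn", "+15105550100")                # fictional number
names.bind("alice", "ip", "alice-laptop.example.org")      # fictional host

pstn_address = names.resolve("alice", "pstn")
ip_address = names.resolve("alice", "ip")
```

Services address the entity only by its abstract name, and the final step of each lookup is delegated to whatever naming scheme the target network already uses.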

Propagation of Service- and Entity-Specific Metadata Across Cascaded Networks

Service handoff propagates service- and entity-specific metadata across cascaded networks. This propagates to the computational resources in each network carrying the data associated with the particular service. ICEBERG provides these propagation paths by providing logically or physically separate bi-directional metadata and data paths.

ICEBERG provides infrastructure for specifying metadata as XML descriptions. These are used for the creation or modification of a path (e.g., during service handoff) among requestors and services to select, place, and route transcoders.
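To make the idea of XML metadata concrete, the sketch below parses an entity description and extracts what a path-creation step might need. The element and attribute names (`entity`, `device`, `accepts`, `requirement`) are invented for this example; the paper does not give ICEBERG's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical entity description; the schema is an assumption.
METADATA = """
<entity name="alice">
  <device type="gsm-phone">
    <accepts format="gsm-audio"/>
  </device>
  <requirement name="max-latency-ms" value="200"/>
</entity>
"""

def accepted_formats(xml_text):
    """List the data formats the entity's devices can accept."""
    root = ET.fromstring(xml_text)
    return [node.get("format") for node in root.iter("accepts")]

def requirements(xml_text):
    """Collect the entity's stated requirements as a name-to-value mapping."""
    root = ET.fromstring(xml_text)
    return {node.get("name"): node.get("value")
            for node in root.iter("requirement")}

formats = accepted_formats(METADATA)
limits = requirements(METADATA)
```

A path builder could use `formats` to decide which transcoder is required and `limits` to decide where along the path it may be placed.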

Acknowledgments

We wish to acknowledge our colleagues on the Ninja Project, Eric Brewer and David Culler. We especially thank Professor B. R. “Badri” Badrinath, who played a major intellectual role in developing the ICEBERG service architecture while on sabbatical at Berkeley during 1997-1998.

The ICEBERG/Ninja testbed has been enabled by the support of Ericsson, IBM, Intel, Motorola, Nortel Networks, and the financial support of the Defense Advanced Research Projects Agency and the California MICRO Program.

8. References

[Balakrishnan95] Balakrishnan H., Seshan S., Katz R. H., “Improv-ing Reliable Transport and Handoff Performance in Cel-lular Wireless Networks,” Wireless Networks, Vol. 1, No.1, (February 1995), pp. 469-481.

[Baucke97] Baucke S., Leistungsbewertung und Optimierung vonTCP für den Einsatz im Mobilfunknetz GSM, Diploma

Thesis, CS-Dept. 4, Aachen University of Technology,Germany, April 1997.

[Brewer98] Brewer, E., et al., “A Network Architecture for Hetero-geneous Mobile Computing,” IEEE Personal Communi-cations Magazine, V 5, N 5, (October 1998), pp. 8-24.

[Caltech98] Caltech Infospheres Project, California Institute ofTechnology, Pasadena, CA, 1998. http://www.infos-pheres.caltech.edu/.

[Cohen97] Cohen, P. R., M. Johnston, D. McGee, S. Oviatt, J. Pitt-man, I. Smith, L. Chen, J. Chow, “Quickset: MultimodalInteraction for Distributed Applications,” Proc. of ACMMultimedia 97, Seattle, WA, (November 1997), pp. 31-40.

[Culler92] von Eicken, T., D. Culler, D., S. Goldstein, K. Schauser,“Active Messages: A Mechanism for Integrated Commu-nication and Computation,” Proc. 19th Annual Sympo-sium on Computer Architecture, Gold Coast, Australia,(May 1992), pp. 256-266.

[DeSimone93] DeSimone A., Chuah M. C., Yue O.-C., “Through-put Performance of Transport-Layer Protocols over Wire-less LANs,” Proceedings of the IEEE Globecom 93,1993.

[Festival98] Centre for Speech Technology Research, “The FestivalSpeech Synthesis System,” University of Edinburgh,Edinburgh, Scotland, 1998. http://www.cstr.ed.ac.uk/projects/festival/festival.html.

[Floyd95] Floyd, S., V. Jacobson, “Link-sharing and ResourceManagement Models for Packet Networks,” IEEE/ACMTransactions on Networking, Vol. 3 No. 4, (August1995), pp. 365-386.

[Fox98] Fox, A., S. Gribble, Y. Chawathe, E. Brewer, “Adapting toNetwork and Client Variation Using Infrastructural Prox-ies: Lessons and Perspectives,” IEEE Personal Commu-nications Magazine, V 5, N 4, (August 1998), pp. 10-19.

[Goodman97] Goodman, D., Wireless Personal CommunicationSystems, Addison-Wesley Longman, Berkeley, CA, 1997.

[Heilmeier98] Heilmeier, G., “POTS to PANS: Telecommunica-tions in Transition,” Keynote Address, UC BerkeleyEECS Industrial Liason Conference, Berkeley, CA, (Feb-ruary 1998).

[Hodes97] T. D. Hodes, R. H. Katz, E. Servan-Schreiber, L. A.Rowe, “Composable Ad hoc Mobile Services for Univer-sal Interaction,” Proceedings of The Third ACM/IEEEInternational Conference on Mobile Computing. Budap-est, Hungary, (Sept. 1997).

[Hodes98] Hodes, T., R. Katz, “Enabling Smart Spaces: EntityDescription and User Interface Generation for a Hetero-geneous Component-Based Distribution System,” UCBTechnical Report CSD-98-1008, Computer Science Divi-sion, University of California, Berkeley, Berkeley, CA,(July 1998). DARPA/NIST Smart Spaces Workshop,Gaithersburg, MD.

21

[Hodes99] T. Hodes, M. Newman, S. McCanne, R. H. Katz, J. Lan-day, “Shared Remote Control of a VideoconferencingApplication: Motivation, Design, and Implementation,”Proceedings of SPIE Multimedia Computing and Net-working 1999, San Jose, CA, (Jan. 1999).

[ICSI98] International Computer Science Institute, “SpeechResearch in the Realization Group at ICSI,” Berkeley,CA, 1998. http://www.icsi.berkeley.edu/real/speech.html.

[ITU98] International Telecommunications Union, Recommenda-tion H.323, Packet Based Multimedia CommunicationsSystems, (February 1998).

[Kojo97] Kojo M., et. al., “An Efficient Transport Service for SlowWireless Telephone Links,” IEEE JSAC, Vol. 15, No. 7,(September 1997), pp. 1337-1348.

[Kuruppillai97] Kuruppillai, R., M. Dontamsetti, F. Cosentino,Wireless PCS, McGraw Hill, San Francisco, CA, 1997.

[Narayanaswamy96] Narayanaswamy, S., et al., “Application andNetwork Support for InfoPad,” IEEE Personal Commu-nications Magazine, V 3, N 2, (April 1996), pp. 4-17.

[Nelson84] Nelson, B., A. Birrell, “Implementing Remote Proce-dure Calls,” ACM Transactions on Computer Systems,2(1), February 1984.

[Nelson98] Nelson, B., “IP Dialtone, Telco Evolution, and theInternet Economy,” Keynote Address at IEEE Globe-comm Conference, Sydney, Australia, (November 1998).

[Schulzrinne95] Schulzrinne, H., S. Casner, R. Frederick, V. Jacob-son, “RTP: A Transport Protocol for Real-Time Applica-tions,” Internet Engineering Task Force, Audio-VideoTransport Working Group, (March 1995).

[Stemm98] Stemm, M., R. Katz, “Vertical Handoffs in WirelessOverlay Networks,” ACM/Balzer Mobile Networking andApplications (MONET), Special Issue on “Mobile Net-working in the Internet,” (December 1998).

[Wang99] Wang, H., R. Katz, “Policy-Driven Handoffs Across Het-erogeneous Wireless Networks,” 2nd IEEE Workshop onMobile Computing and Applications (WMCSA’99), NewOrleans, LA, (Feb. 1999).

[Want95] Want, R., B. Schilit, N. Adams, R. Gold, K. Petersen, D.Goldberg, J. Ellis, M. Weiser, “An Overview of theParcTab Ubiquitous Computing Experiment,” IEEE Per-sonal Communications Magazine, V. 2, N. 6, (December1995), pp. 28-43.

[Wong98] T. Wong, T. Henderson, S. Raman, A. Costello, andRandy Katz, “Policy-Based Tunable Reliable Multicastfor Periodic Information Dissemination”, WOSBIS'98,Dallas, TX, (October 1998).

[Zhang93] Zhang, L., S. Deering, D. Estrin, S. Shenker, D. Zappala,“RSVP: A New Resource ReSerVation Protocol,” IEEENetwork Magazine, (September 1993), pp 8-18.
