pigeon executive report



.Pigeon: Multimodal Language-Agnostic Communication tool over MSN Networks


Team Members
Thomos Charilaos [email protected]
Kasselakis Georgios [email protected]
Tsompanidis Ilias [email protected]
Simeoforidis Zisis [email protected]

Project Tutor
Athena Vakali, Assistant Professor, [email protected]

In response to the ImagineCup 2005 Software Design Invitational

Project Report

March 2005


Eligibility

Charilaos Thomos is in the 3rd year of his studies for the B.Sc. degree in Computer Science at the Department of Informatics of the Aristotle University of Thessaloniki. He has diverse interests, including programming with cutting-edge technologies such as the .NET Framework, graphics engines with a focus on DirectX, and research prototypes of Artificial Intelligence. He is currently working on mesh landscape generation and TnL shaders. He was his school's representative in the 1997 informatics competition of the Hellenic Computer Society.

Georgios Kasselakis is a 2nd year student for the Bachelor of Science degree of Computer Science from the Department of Informatics in the Aristotle University of Thessaloniki. His interests include theoretical and simulation approaches to the construction of distributed processing and data warehousing systems. He has been working on software development for tools ranging from low level programming to runtime systems. He often occupies himself with algorithm theory and also revels in the implementation of architectural patterns.

Ilias E. Tsompanidis is in the 3rd year of the B.Sc. degree in Computer Science at the Department of Informatics of the Aristotle University of Thessaloniki. He received a scholarship in his first year from the Greek National Scholarships Foundation for his high grades. He has been a part-time network technician at the Network Operation Centre of the Aristotle University of Thessaloniki since April 2004. His research interests include routing over networks, services over IP and wireless networks, as well as alternative network protocols.

Zisis P. Simaioforidis is a 3rd year undergraduate student at the Department of Informatics of the Aristotle University of Thessaloniki. He is interested in programming over TCP/IP protocols and in exploiting runtime libraries and frameworks such as .NET and the .NET Compact Framework using C#, C++.NET and Visual Basic.NET. Zisis also occupies himself with random number generation theory. He is currently working on speech recognition engines and Semantic Context Grammars via XML semantics. He has also taken part in various competitions as his school's representative, such as the 2001 National Greek Competition of Mathematics.

Athena I. Vakali received a B.Sc. degree in Mathematics from the Aristotle University of Thessaloniki, Greece, an M.Sc. degree in Computer Science from Purdue University, USA (with a Fulbright scholarship) and a Ph.D. degree in Computer Science from the Department of Informatics at the Aristotle University of Thessaloniki. Since 1997, she has been a faculty member of the Department of Informatics, Aristotle University of Thessaloniki, Greece (currently an Assistant Professor). Her research interests include Content Delivery Networks and Web data management, focusing on Web (and XML) data storage, caching and clustering. She has published several papers in international journals and conferences, as well as book chapters and review articles. She is a Program Committee member of various International Conferences (e.g. DASFAA'05, ISMIS'05, COOPIS'05, EDBT'04, SACMAT'04, etc.) and she is chair of the ICDE'05 Web Information Retrieval and the EDBT'04 Web Data Clustering Workshops. She has participated in various research programmes and has served as an evaluator in various European and national research programmes. She is a member of the ACM, the IEEE, the IEEE Computer Society and the USENIX Association.


1. Abstract

In a world with ever increasing communication needs, there is great demand for tools that can overcome the barriers posed among people of different cultures, languages and dissimilarities. Despite the evolving technological solutions, linguistic barriers still remain the major obstacle prohibiting communication among people, since the existing translation solutions suffer from long translation response times, poor translation quality (under the various linguistic backgrounds and device platforms) and input format limitations.

.Pigeon is a software solution which offers translation capabilities, exploiting the usability of web translations and the real-time networking of MSN Networks towards supporting translation of various input formats (such as speech and text and, in the immediate future, images depicting text) under a wide range of today's devices (such as PCs, mobile phones, PDAs etc.). The .Pigeon users may exploit the offered functionality either by using a regular MSN messenger client or the .Pigeon messenger. Then, by connecting with the developed .Pigeon MSN Bot (or the .Pigeon WebService), the user may either get the translation result in the language and the format of his preference or send his message to another client. The .Pigeon framework supports zero-deployment and utilizes technologies such as Microsoft SAPI 5 Speech to text, Babel Text-to-Speech and 3rd party translation servers (such as BabelFish and WorldLingo), according to the client's preferences.

The .Pigeon framework aims at dissolving barriers posed by linguistic differences, abilities and technological diversities, since it addresses the following issues :

facilitating communication among different language speaking people, since the provided translation services are easily offered to a wide range of users, by exploiting the diffusion of MSN Networks.

enabling people with disabilities to communicate in foreign language environments, since various formats may be used for translation. For example, visually impaired people may input voice messages and use a voice commands interface.

improving translation quality, since the .Pigeon translation service is based on a (Neural Network inspired) graph which introduces quality and speed criteria in order to resolve translation problems met at particular language translation pairs.

tuning translation under various technological platforms, since .Pigeon operates on a wide range of devices, which may support the .NET Framework runtime (e.g. Windows CE 4.2, Windows 98 and later) or not (e.g. Smartphone 2002 devices).

guaranteeing extensibility of the translation service, since the proposed framework has built-in scalability functionalities such that it may accommodate new translation tools and other services, for which there is a 3rd party translation service available.

2. .Pigeon Framework and User Scenarios

2.1. Organization and Functionalities

The overall .Pigeon framework involves web services that encapsulate Speech to Text, Text to Speech and translation facilities. The text translation service is available through the regular MSN messengers, while a separate client is provided, incorporating translated voice services. Essential to the framework is the MSN messenger bot, which impersonates a regular MSN user in order to provide in-place functionality and content through the use of regular MSN clients. The bot also coordinates all operations and, after applying the requested transformations, redirects the content sent by each user, so that each client receives the content in their preferred language and medium.


The key functionality of .Pigeon is depicted in Figure 1, where the typical flow of a message (originating from a client device) is shown as a blue dashed arrow. The message can be either text or sound (voice) and is received by the bot or the Web service API (as described later in the report). When needed, the message is transcribed before the actual translation process (accomplished by 3rd party translation servers). The message is transformed to voice if a client has requested so, and then (depending on user preferences) it can be either redirected through the bot to the clients or returned directly to the originating client device.
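The flow described above can be sketched as a small pipeline. This is only an illustrative sketch: the `Message` type, the `process` function and the three stand-in services (`fake_stt`, `fake_mt`, `fake_tts`) are hypothetical names introduced here, not part of the .Pigeon code base, and the real services are remote web services rather than local functions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Message:
    kind: str      # "text" or "voice"
    payload: str   # the text itself, or a stand-in for recorded audio
    language: str  # language code of the content


def process(msg: Message, target: str, want_voice: bool,
            transcribe: Callable[[str, str], str],
            translate: Callable[[str, str, str], str],
            synthesize: Callable[[str, str], str]) -> Message:
    # Step 1: transcribe only when the incoming message is voice.
    text = transcribe(msg.payload, msg.language) if msg.kind == "voice" else msg.payload
    # Step 2: translate the text into the recipient's language.
    translated = translate(text, msg.language, target)
    # Step 3: convert to voice only if the recipient asked for it.
    if want_voice:
        return Message("voice", synthesize(translated, target), target)
    return Message("text", translated, target)


# Stand-in services (assumptions), used only to exercise the flow.
def fake_stt(audio: str, language: str) -> str:
    return audio[len("audio:"):]


def fake_mt(text: str, source: str, target: str) -> str:
    return f"[{source}->{target}] {text}"


def fake_tts(text: str, language: str) -> str:
    return "audio:" + text


# Text in, Italian voice out, as in a text-to-speech request.
reply = process(Message("text", "To the Hotel Roma, please", "en"),
                "it", True, fake_stt, fake_mt, fake_tts)
print(reply.kind, reply.payload)
```

Passing the three services as parameters keeps the flow independent of which speech or translation engine is plugged in, which matches the report's emphasis on replaceable components.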

Table 1 summarizes the functionality offered by the application interfaces and services.

An important feature of the .Pigeon framework is the ability to operate over MSN messenger networks. Existing users can enjoy translated instant messaging through the regular MSN client, by communicating with the .Pigeon bot. The bot exposes a simple set of text commands through which tuning tasks of the service are accomplished. For example, all messages originating from users participating in a given chat room are redirected and translated so that all participants receive messages in their native language (in a conversation window).

The preferred way of accessing the service is by using the provided '.Pigeon messenger'. This standalone application encapsulates all .Pigeon services in a uniform interface which still connects to the bot but provides the service in a transparent way. The user does not need to know any bot commands and can take advantage of the voice transfer services (text to speech and speech to text). The .Pigeon messenger also supports a voice interface which may be used when inputting text is not possible, such as in the case of visually impaired users.

Another important feature of the .Pigeon messenger is the ability to provide direct translations without being connected to the MSN networks. This feature can be used to provide quick translation of text or voice recordings, over a wide range of languages.

Connection to .Pigeon may be established by several means of communication, depending on the underlying device. For example, mobile devices have a variety of choices: Smartphones may use GPRS, blade or even Bluetooth (also available for Pocket PCs, which support Wi-Fi connectivity), while desktop PCs can use dial-up, etc.

Interfaces / Services | MSN messenger with .Pigeon bot | .Pigeon messenger
Text translation
Speech recognition
Voice command interface
Text to Speech
Zero deployment
Instant text messaging
Built in support for impaired users

Figure 1 : The .Pigeon Framework


Table 1 : Overview of the .Pigeon Services

2.2. User Scenarios and the .Pigeon solution

.Pigeon appeals to a wide audience: according to [1], it is estimated that over 100 million users have downloaded and installed the MSN messenger, and a large portion of these users are located in different countries. Moreover, translation services are required in almost every aspect of social or professional messaging activity (since nowadays instant messaging is a communication tool ranging from friendly chat to business meetings). Here we focus on four user scenarios, described through the following four personas:

Danai – age 28, Greek, also speaks German: a journalist who often travels to foreign countries for various reports and filming as a freelance reporter collaborating with popular news agencies and TV channels.

Anna Li – age 54, Chinese, speaks moderate English: a senior executive manager who works at a large scale company in China. Her company deals with the organization of conferences and exhibitions and would like to expand to other countries.

Howard – age 42, UK citizen, speaks English and French: a law professor at the University of Sorbonne, who often travels abroad to participate in Conferences and Committee meetings.

Ivan – age 22, Argentinean, is mute. Ivan can communicate proficiently in Spanish and in a little Japanese. He is a student of Asian Literature and is participating in an exchange program with a University in Tokyo.

Scenario #1 : Translating Text to speech

Howard is traveling to Rome for a conference on archaic judicial systems.

Upon exiting the airport he calls a taxi which will take him to his hotel, and realizes that the driver does not speak English or French.

Howard connects with his smartphone to the local GPRS network and launches the .Pigeon messenger for a direct translation.

Being in a noisy environment, he decides to give his input as text. He types the directions, sets the target language to Italian and the return type to Voice.

The message is translated at the .Pigeon servers and transformed to voice. The driver hears the directions in Italian and understands his client's destination.

Scenario #2 : Speech to Speech Translation

Danai has to complete a report on the new legislative measures discussed at the European Union (EU) parliament, where there are currently 21 spoken languages¹ (and more languages will be added as more countries are pending to enter the European Union). Given the need for translation, interpretation booths have been installed throughout the EU Parliament building, enabling language parties to communicate.

Danai has to communicate with a Portuguese representative in order to clarify a specific part of a new legislative measure, but he speaks only Spanish and French, so she uses her Pocket PC, where she has installed .Pigeon.

They set up a .Pigeon translated phone conversation: when a user speaks, the message is recorded and sent to the .Pigeon bot, which in turn transcribes the message, translates it, transforms it back to voice in the corresponding language and sends it to Danai's interviewee, using custom made web services.

¹ At the European Parliament, "people are elected not because of their language skills but to represent their political constituency". The annual cost of interpretation services is currently 550 million € and is expected to become 800 million € after the recent enlargement. Source: http://news.bbc.co.uk/2/hi/europe/3604069.stm


Scenario #3 : Multimodal conferencing and extension of services

Anna Li has undertaken the organization of a head corporate meeting for a multinational company. The meeting will support remote partners, who will be located in different geographical locations and will have access through various mobile devices. Anna Li wants to exploit the infrastructure offered by her company's Content Delivery Network Provider, which offers support for continuous media events [2].

While preparing to host the conference, Anna decides she needs to extend the supported languages. So she leases a translation server for the new languages, and quickly installs it into the framework using the server tools and a description from the service provider.

Anna launches a new .Pigeon bot and inserts registration information for the participants. When each of them connects with the .Pigeon messenger or the legacy MSN messenger, they receive the conference content in voice or text in their native language.

Scenario #4 : Social interaction for impaired users

Ivan is visiting Tokyo, Japan and will be staying there for the duration of the spring semester.

While trying to converse with a frustrated librarian who is responsible for giving him his textbooks, Ivan inputs his personal information and chosen courses on his Pocket PC and connects to the library Wi-Fi hotspot.

The .Pigeon server translates the text and returns the catalog back in Japanese text, which is familiar to the librarian.

3. Architectural and functional overview

Figure 2 depicts a more detailed overview of the .Pigeon architecture, which is designed to offer innovative features together with ease of installation, maintainability and extensibility. The key distinct components (represented in boxes) are analyzed in certain tasks which are easy to use and offer uninterrupted operation.

Figure 2: .Pigeon architectural overview. Boxes shown on the client side: .Pigeon Standalone and MSN Client (wave recording, wave playback, text messages). Boxes shown on the server side: .Pigeon Bot and .Pigeon WS API (user authorization, content distribution, profile management), speech to text (wave to text), text to speech (text to wave) and the translation process (optimal route designation, route traversal, translation servers).


3.1 Server-side : key components and phases

The server-side tasks are encapsulated for plug and play installation: the servers can be upgraded with virtually no downtime, even when components are replaced with previously unknown ones. This is quite important since, as emphasized in [3], the machine translation problem is quite difficult and challenging. More specifically, the key components of the .Pigeon framework are:

the MSN messenger bot, which impersonates a regular user and authenticates user access in collaboration with the MSN messenger servers. After a user is acknowledged, the bot exposes a simple text command interface which is fully accessible through the installed messenger clients. This allows for zero deployment (since no additional software needs to be installed) and ensures easy usage (since users are familiar with the messenger environment). To the team members' knowledge, the .Pigeon bot approach is the first to deliver content through instant messaging networks.

the text to speech engine, which provides natural voices in a variety of languages (supported by Babel technologies) and is encapsulated in a web service that enables the use of mobile devices (such as regular cell phones). This component is tailored for visually impaired users who would be actively involved in instant messaging.

the speech to text engine, another web service, which can support new languages by using standard commercial packages and XML grammars to specialize recognition of speech over specific semantic contexts and at the same time minimize erroneous recognition. This engine provides high recognition accuracy and supports mainstream languages. It is also executed for closed vocabularies on the mobile devices of the framework to provide a voice interface for blind users.

the translation component, which connects with two renowned free translation servers. The first is Babelfish, a popular service hosted by the Altavista group and powered by the Systran translation engine, which provides free translation of short messages through a simple web interface for a variety of languages. The second is Worldlingo, whose servers are oriented to business services and provide translation for the Microsoft Office series of products.

The above components are implemented by introducing specific phases for each of the involved tasks:

Installation Phase: adopting the .Pigeon translation server

The .Pigeon server comes with a range of tools that can be used to regulate the behaviour of the provided services. The installation and removal of translation servers is necessary for the extension of the supported languages and it is supported by a separate tool that wraps the functionality of the developed ServiceNet class. The class provides optimal translation path finding (as will be described next).

Vendors of machine translation who want to expose their services to the .Pigeon framework must simply provide a service that implements a special interface (called ITranslationScheme). An alternative way to install new services is by describing an HTTP POST/GET message. This is a more popular choice with freeware servers that provide web page interfaces. XML Web services can be accessed this way too.

The server tool facilitates the installation of web services by encapsulating the arguments of the service and generating an ITranslationScheme compliant object for consumption by the ServiceNet (the tool is used in Scenario #3 of §2.2). The same tool can also visualize the available translation vectors and alter their ratings, in order to overcome problems that may arise with 3rd party translation components.
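The two installation routes can be illustrated with a minimal sketch. The report does not list the members of ITranslationScheme, so the `TranslationScheme` base class below, its two methods, the `HttpGetScheme` adapter, the parameter names and the `translator.invalid` address are all assumptions made for illustration; the network call is injected as `fetch` so that the sketch runs without any server.

```python
from abc import ABC, abstractmethod
from typing import Callable, Set, Tuple
from urllib.parse import urlencode


class TranslationScheme(ABC):
    """Hypothetical counterpart of the ITranslationScheme interface."""

    @abstractmethod
    def pairs(self) -> Set[Tuple[str, str]]:
        """Directed (source, target) language pairs the service offers."""

    @abstractmethod
    def translate(self, text: str, source: str, target: str) -> str:
        """Translate text from the source to the target language."""


class HttpGetScheme(TranslationScheme):
    """Adapter built from a description of an HTTP GET request."""

    def __init__(self, base_url: str, text_param: str, pair_param: str,
                 pairs: Set[Tuple[str, str]],
                 fetch: Callable[[str], str],
                 extract: Callable[[str], str]) -> None:
        self.base_url = base_url
        self.text_param = text_param
        self.pair_param = pair_param
        self._pairs = set(pairs)
        self.fetch = fetch      # performs the request, returns the page body
        self.extract = extract  # pulls the translated text out of the page

    def pairs(self) -> Set[Tuple[str, str]]:
        return set(self._pairs)

    def build_url(self, text: str, source: str, target: str) -> str:
        # Encode the service arguments as a query string.
        query = urlencode({self.text_param: text,
                           self.pair_param: f"{source}_{target}"})
        return f"{self.base_url}?{query}"

    def translate(self, text: str, source: str, target: str) -> str:
        if (source, target) not in self._pairs:
            raise ValueError(f"pair {source}->{target} is not offered")
        return self.extract(self.fetch(self.build_url(text, source, target)))


# A made-up service description, exercised with a canned response.
scheme = HttpGetScheme("http://translator.invalid/tr", "q", "lp",
                       {("en", "fr")},
                       fetch=lambda url: "<b>bonjour</b>",
                       extract=lambda page: page[3:-4])
print(scheme.build_url("good morning", "en", "fr"))
print(scheme.translate("good morning", "en", "fr"))
```

Because both routes end in an object with the same two methods, the path-finding component only ever needs to ask a scheme which pairs it offers and to call `translate` along the chosen route.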

Translation Phase: translating between language pairs

As mentioned above, the translation process utilizes 3rd party servers to accomplish the translation task. It is commonly observed that no single such server provides services of adequate quality or speed in every language pair [4]. Also, many servers do not provide two-way translation for all their supported languages. To build on users' judgments of quality and speed, and to improve translation pair coverage of less spoken languages, .Pigeon introduces the translation graph.


The translation graph is defined such that each node corresponds to one of the supported languages and each directed edge (between two nodes) corresponds to the translation availability for the specific language pair. In order to address the aforementioned problems, we introduce the notion of translation routes (hereafter referred to as 'routes'). Each route is the actual path of successive edges which need to be followed in order to provide a quality translation, i.e. if the direct edge between two languages is of poor quality, the translation will pass through intermediate language(s) before reaching the targeted translation language². The routes are dynamically built based on speed measurements performed on the .Pigeon server, and satisfaction feedback provided by the users. Each time, users are given a slider control to specify their choices for speed ($s \in [1,35]$³ is the preferred translation latency, i.e. smaller $s$ values refer to faster services) and quality ($c \in [0,1]$ is the quality to speed ratio), which are the criteria that guide the route identification process. Then, the overall weight of a translation edge is estimated by the formula:

$$w_{\mathrm{node}_A,\mathrm{node}_B} = (10 - q)\,c + s\,(1 - c),$$

where $q \in [1,10]$ is the quality of translation (10 is the best result possible⁴). At each translation task, the choice of the involved language pairs is given by the (dynamic programming inspired) equation [5]:

$$w_{\mathrm{node}_i,\mathrm{node}_j} = \min \{\, w_{\mathrm{node}_i,\mathrm{node}_j},\; w_{\mathrm{node}_i,\mathrm{node}_k} + w_{\mathrm{node}_k,\mathrm{node}_j} \,\},$$

where $k$ is any potential intermediate language choice.
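The two formulas above can be tried out with a short sketch. It applies the relaxation in its all-pairs form (the Floyd-Warshall style of iteration), which is one straightforward reading of the equation; a single-source search as in [5] would serve equally well for one request. The function names, the language codes and the (q, s) ratings in the example are invented for illustration and are not taken from the ServiceNet class.

```python
import math


def edge_weight(q, s, c):
    """Edge weight w = (10 - q) * c + s * (1 - c) from the report."""
    return (10 - q) * c + s * (1 - c)


def best_routes(languages, edges, c):
    """edges maps (source, target) to (q, s). Returns (dist, nxt) tables."""
    dist = {(a, b): (0.0 if a == b else math.inf)
            for a in languages for b in languages}
    nxt = {}
    for (a, b), (q, s) in edges.items():
        dist[a, b] = edge_weight(q, s, c)
        nxt[a, b] = b
    # Relaxation: w[i, j] = min(w[i, j], w[i, k] + w[k, j]) for every k.
    for k in languages:
        for i in languages:
            for j in languages:
                via = dist[i, k] + dist[k, j]
                if via < dist[i, j]:
                    dist[i, j] = via
                    nxt[i, j] = nxt[i, k]
    return dist, nxt


def route(nxt, src, dst):
    """Rebuild the list of languages on the chosen route, or None."""
    if src != dst and (src, dst) not in nxt:
        return None
    path = [src]
    while src != dst:
        src = nxt[src, dst]
        path.append(src)
    return path


# Invented ratings: the direct Greek -> Japanese edge is slow and poor,
# while relaying through English is faster and better on both hops.
edges = {("el", "ja"): (3, 20), ("el", "en"): (8, 4), ("en", "ja"): (7, 5)}
dist, nxt = best_routes(["el", "en", "ja"], edges, c=0.5)
print(route(nxt, "el", "ja"), dist["el", "ja"])
```

With c = 0.5 the direct edge weighs (10 - 3)(0.5) + 20(0.5) = 13.5, while the relay through English weighs 3.0 + 4.0 = 7.0, so the relay route is selected.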

Users provide their feedback by rating the last translated message they received. These quality ratings are used to adjust the weight of the involved language pairs. The collective feedback of users ensures that the most popular routes are always used. Furthermore, the time taken to service each request is measured by the .Pigeon server, and is used as the actual translation speed rating.
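The report does not state how individual ratings are combined, so the following is only one plausible sketch: quality is kept as a running mean of user ratings, and speed is the most recent measured service time clamped to the [1, 35] range. The `EdgeStats` class, its default values and both update rules are assumptions.

```python
from dataclasses import dataclass


@dataclass
class EdgeStats:
    q: float = 5.0    # assumed default quality, replaced by the first rating
    s: float = 35.0   # assumed default latency in seconds
    votes: int = 0

    def rate(self, rating: float) -> None:
        """Fold one user rating in [1, 10] into the running mean."""
        if not 1 <= rating <= 10:
            raise ValueError("rating must be in [1, 10]")
        self.votes += 1
        self.q += (rating - self.q) / self.votes

    def record_latency(self, seconds: float) -> None:
        """Store the measured service time, clamped to [1, 35] seconds."""
        self.s = min(max(seconds, 1.0), 35.0)


edge = EdgeStats()
edge.rate(8)
edge.rate(6)
edge.record_latency(4.2)
print(edge.q, edge.s)
```

A running mean gives every user the same influence; a scheme that favours recent ratings would react faster when a 3rd party server changes in quality.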

Transformation Phase: delivering requested media formats

.Pigeon has built-in abilities for easily swapping from one media format to another, since the user is able to input two different formats (text and speech) for translation. The Speech to text and Text to speech services are exposed as XML Web services. Both services are supported by the Microsoft SAPI 5.1 engine [6]. The Speech-to-Text (STT) component is preconfigured for generic recognition of voices in a noisy environment. The default semantic vocabularies are used, so that users can give input in a free dictionary. The Text-to-Speech (TTS) engine is intended for use by visually impaired clients, or clients that participate in a telephony conversation. The service returns wave files of machine voice synthesis. This service is utilized in Scenario #1 of §2.2. The conjunction of the two services is utilized in Scenario #2 of §2.2.

3.2 Client-side phases

Registration Phase: New user joining .Pigeon

In order to access the .Pigeon online and bot interfaces, the user is asked to sign up at the .Pigeon server webpage. After filling in some information concerning her native language, the preferred way of receiving information and the quality to speed ratio setting, the user is granted access to these interfaces. Impaired users can take advantage of special commands, also available, which act as quick settings. Finally, in order to access the bot, the user is required to add it to her contact list, start a conversation and log in. A step of the process is depicted in Figure 3 of §5.

Usage Phase I: MSN messenger user profile

The MSN messenger bot ensures that users can transparently access the .Pigeon services, regardless of the tasks undertaken by the server. The user actions are similar to the ones common to MSN messenger clients, with the exception of commands which wrap the additional functionality. After the login procedure, all messages sent by the users are translated so that each recipient receives the message in his own native language and preferred format. The commands required by the bot are given in a series of questions and answers. The user only needs to memorize the >login command which initializes the connection process. Table 2 contains the available commands and their use. A screenshot of the MSN messenger user experience is available in Figure 4 of §5.

² This technique is documented as "relay translation".
³ The limit of 35 seconds for the translation service is used as an indicative maximum acceptable time period, after which the translation process should be restarted.
⁴ Initially the system considers a default value for the quality rating (q); over time the users specify their own quality ratings based on the received translation results.

>login                       initiates the login procedure; the user is subsequently asked for other commands
>setqualitytospeed <value>   sets the quality to speed factor to the desired value. The number can range in [1,100] and is internally remapped to [0,1]
>setlanguage <language>      sets the user's language to the desired value. The name of the language is expected to be typed in the language itself (e.g. Francais)
>talkto <email>              sets the recipient of the messages sent by this client. Each recipient is indexed by her email and has to log in in order to receive messages
>whisper <email> <text>      sends a message to a specific recipient
>help                        lists all the available commands of the .Pigeon bot

Table 2: MSN bot Commands
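A parser for the commands of Table 2 might look like the sketch below. The function name `parse_command` and the returned dictionary shapes are hypothetical, and the linear remapping of [1,100] onto [0,1] is an assumption, since the report only says that the value is remapped internally.

```python
def parse_command(line):
    """Return (command, arguments) for a bot command, or None for chat text."""
    if not line.startswith(">"):
        return None  # ordinary message: translate and forward it
    name, _, rest = line[1:].strip().partition(" ")
    name = name.lower()
    rest = rest.strip()
    if name in ("login", "help"):
        return (name, {})
    if name == "setqualitytospeed":
        value = int(rest)
        if not 1 <= value <= 100:
            raise ValueError("value must be in [1, 100]")
        # Assumed linear remap: 1 -> 0.0 and 100 -> 1.0.
        return (name, {"ratio": (value - 1) / 99})
    if name == "setlanguage":
        return (name, {"language": rest})
    if name == "talkto":
        return (name, {"email": rest})
    if name == "whisper":
        email, _, text = rest.partition(" ")
        return (name, {"email": email, "text": text.strip()})
    raise ValueError("unknown command: " + name)


print(parse_command(">setqualitytospeed 100"))
print(parse_command(">whisper friend@example.com good morning"))
print(parse_command("good morning"))
```

Treating every line that does not start with `>` as chat text keeps the command set invisible to users who simply want their messages translated.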

Usage Phase II: .Pigeon messenger user experience

The .Pigeon messenger interface enables the user to take full advantage of the services provided by the .Pigeon bot through a graphical user interface, where the user is asked to fill in his MSN Messenger username and password, together with an email linked to a .Pigeon bot. Then, he is connected to MSN Networks and automatically logs in to the .Pigeon bot he specified, while the main interface screen is presented to him⁵. There are two methods to enter and send a message. The first one requires writing the message in the outgoing messages textbox and pressing the Send button, while the second one requires the user to speak the message into the device microphone after pressing the Record and send button. The user is provided with a contact list to communicate with multiple users that participate in a conversation hosted on the MSN bot. In order to specify a recipient for his message, instead of using bot commands, the user can double click on the preferred contact on the list (contacts of the .Pigeon services are stored locally and new contacts can be added or removed on demand). The interface has an incoming text box containing all the messages addressed to the user, and client input is set to audio or text.

The .Pigeon messenger exposes a direct translation interface which gives the user the ability to translate text and speech from any language to any language. In the interface the user is asked to specify the source and target languages of the translation, together with a value for the quality to speed ratio.

Finally, the interface can be simplified for impaired people: upon retrieval of the user profile, the interface is modified to fit the user's special needs. More specifically, if the user is mute the interface hides all references to voice input and output, whereas if the user is visually impaired, the client enables a voice interface that can control every aspect of the application. In all cases, attempts by other users to send content that is not deliverable to the user are rejected with a notice message that explains the action.

⁵ The MSN Messenger account username and password are not transferred to any .Pigeon servers but are stored locally.

4. Technological Resources

4.1 Existing External Resources

Servers:

Windows 2003 Server Web edition [www.microsoft.com/windowsserver2003] was chosen as the premier platform for running ASP.NET applications. It is the first Windows server to incorporate the .NET runtime and is documented to have stable behavior on older hardware equipment.

Babelfish/Systran [babelfish.altavista.com] and Worldlingo [www.worldlingo.com] translation servers, which (as mentioned in §3) are free translation servers. Systran is a global leader in machine translation and Worldlingo is known for providing solid APIs and integrated solutions. The use of different engines allows the Translation Graph component of the .Pigeon infrastructure to demonstrate its power.

Development Tools:

Visual Studio .NET 2003 Enterprise Architect Edition [msdn.microsoft.com/vstudio] is the platform of choice for the development of .NET solutions that span many systems. The environment includes tools that allow remote debugging of distributed applications.

Smart Device programmability toolkit 2003, a series of tools that integrate with VS.NET to allow programming on newer mobile devices.

Other Technologies:

.NET Compact Framework 1.0 SP1, .NET Framework 1.1, Visual Basic.Net

XML Web Services, ASP.NET

SAPI 5.1 SDK, which provides seamless integration with the .NET Framework and is the base for a variety of commercial products. In its basic edition it provides recognition of three languages: English, Chinese and Japanese. The use of generalized interfaces and semantic grammars allows for extension to other languages.

Acapella group / Babel technologies [www.infovox.se] MS SAPI Voices. Babel technologies provide our TTS engine, which is built on the SAPI 5 specification. It supports realistic synthetic voices in a variety of languages.

4.2 Developed Resources

The code base of the .Pigeon framework includes wrappers to external resources and two components written from scratch. The entire framework is written in the Visual Basic.NET programming language.

The Speech-to-Text component builds on the functionality of the Microsoft SAPI 5.1 SDK.

The Text-to-Speech component wraps the voice synthesis engine from Babel technologies and is hosted in the SAPI 5.1 environment. To integrate voice recording with the mobile applications, a separate recording control was authored using platform invocations to capture sound from the devices' microphone.

The MSN Messenger bot is a data flow construct implementing the 'Push' design pattern. It communicates with the MSN servers using sockets and constructs messages that are compatible with the MSN6 protocol. To address inconsistencies in the different protocols (currently MSN10 is widely used) it provides an alternative to contact lists, which is stored at the client application.

The translation graph is written with the 'Abstract' design pattern in mind. It operates over a variety of systems and in the current implementation uses HTTP request messages to request translation from free servers. The result is extracted from the returned page using regular expressions.
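Extracting a translation from a returned web page with a regular expression can be sketched as follows. The page layout is an assumption: the pattern presumes that the translated text sits in an element with id "result", which is a made-up convention and not the markup of Babelfish or Worldlingo.

```python
import html
import re

# Assumed layout: the translation is the content of <div id="result">.
RESULT_PATTERN = re.compile(r'<div\s+id="result"[^>]*>(.*?)</div>',
                            re.IGNORECASE | re.DOTALL)


def extract_translation(page):
    """Return the translated text found in the page, or None."""
    match = RESULT_PATTERN.search(page)
    if match is None:
        return None
    # Drop any nested tags, then decode HTML entities.
    text = re.sub(r"<[^>]+>", "", match.group(1))
    return html.unescape(text).strip()


sample = ('<html><body><div id="result" class="box"> '
          'Buenos d&iacute;as </div></body></html>')
print(extract_translation(sample))
```

Scraping of this kind breaks whenever the provider changes its page layout, which is one reason to keep the pattern in the per-service description rather than in the shared code.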

5. Implementation Details

The screenshots below are representative of the described phases of the .Pigeon framework. Figures 3, 4 and 5 are related to the end user perspective, while Figure 6 is taken from the server configuration tool.

Figure 3: The new user registration page

Figure 4: Conversation with the MSN Bot

Figure 5: .Pigeon Messenger

Figure 6: The translation graph configuration tool

6. .Pigeon … Flying ahead

The .Pigeon framework is an ongoing effort towards extending the usefulness and efficiency of the provided services while also providing new ones. There are currently two planned features that are to be released with new versions of the client:

Support for optical recognition: As many new smart devices now incorporate high definition digital cameras, effort is being made to incorporate recognition of text from images. This feature can be of use to travelers dealing with signs in unknown languages as well as people who want to quickly transform short written messages to text.

Support for regular cell phones: Support is planned for a component that will allow users of traditional cell phones to communicate with the .Pigeon framework using SMS messages. Analysis of multimedia messages for extraction of sound is also considered. The provision of .Pigeon messaging over legacy phones is of interest to telephony providers who want to offer added-value services to their customers.

Moreover, the .Pigeon team is planning to extend the framework towards making it an appealing solution for content distribution (CDN-oriented) markets. In this context, the future workplan involves designing components enhanced with the ability to support automatic distribution of content (such as advertisements, weather reports and news headlines) in order to address an emerging market of wide popularity and increasing audiences.

7. References

[1] Microsoft Press Release: http://www.microsoft.com/presspass/press/2003/may03/05-12100MillionPR.asp, May 2003.

[2] A. Vakali and G. Pallis: "CDNs: Status and Trends", IEEE Internet Computing, 7(6): 68-74, Nov.-Dec. 2003.

[3] F. Ren and H. Shi: "Parallel Machine Translation: Principles and Practice", Proceedings of the 7th International Conference on Engineering of Complex Computer Systems (ICECCS'01), June 2001.

[4] H. Alsharaf, S. Cardey, P. Greenfield, Y. Shen: "Problems and Solutions in Machine Translation Involving Arabic, Chinese and French", Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'04), April 2004.

[5] E.W. Dijkstra: "A Note on Two Problems in Connection with Graphs", Numerische Mathematik, vol. 1, pp. 269-271, 1959.

[6] Microsoft: Telephony Call Control with Microsoft® Speech Server 2004 White Paper, August 2004.