synchronous conferencing with translation

Upload: atlash

Post on 14-Apr-2018

  • 7/30/2019 Synchronous Conferencing with Translation

    SYNCHRONOUS CONFERENCING

    WITH TRANSLATION

    HARDWARE & SOFTWARE REQUIREMENTS

    MINIMUM HARDWARE REQUIRED

    PROCESSOR : Pentium IV at 1.5GHz or higher

    RAM : 256MB or above

    CACHE : 256KB or above

    HARD DISK : 40GB or above

    SOFTWARE REQUIREMENT

    SDK : JDK 1.6

    IDE : Eclipse Helios 3.6

    Database : MySQL Server

    Web Server : Tomcat 7.0

    Operating System : Windows XP / Windows 7 / Ubuntu 8.04 or higher

    Web Browser : Google Chrome/Mozilla Firefox 3.0

    JAVA

    Java is a programming language originally developed by James Gosling at Sun

    Microsystems (which has since merged into Oracle Corporation) and released in 1995

    as a core component of Sun Microsystems' Java platform. The language derives much

    of its syntax from C and C++ but has a simpler object model and fewer low-level

    facilities than either C or C++. Java applications are typically compiled to bytecode

    (class file) that can run on any Java Virtual Machine (JVM) regardless of computer

    architecture. Java is a general-purpose, concurrent, class-based, object-oriented

    language that is specifically designed to have as few implementation dependencies as

    possible. It is intended to let application developers "write once, run anywhere"

    (WORA), meaning that code that runs on one platform does not need to be

    recompiled to run on another. Java is as of 2012 one of the most popular

    programming languages in use, particularly for client-server web applications, with a

    reported 10 million users.[9][10]

    The original and reference implementation Java compilers, virtual machines, and class

    libraries were developed by Sun from 1995. As of May 2007, in compliance with the

    specifications of the Java Community Process, Sun relicensed most of its Java

    technologies under the GNU General Public License. Others have also developed

    alternative implementations of these Sun technologies, such as the GNU Compiler for

    Java and GNU Classpath.

    History

    Sun Microsystems released the first public implementation as Java 1.0 in 1995. It

    promised "Write Once, Run Anywhere" (WORA), providing no-cost run-times on

    popular platforms. Fairly secure and featuring configurable security, it allowed

    network- and file-access restrictions. Major web browsers soon incorporated the

    ability to run Java applets within web pages, and Java quickly became popular. With

    the advent of Java 2 (released initially as J2SE 1.2 in December 1998), new

    versions had multiple configurations built for different types of platforms. For

    example, J2EE targeted enterprise applications and the greatly stripped-down version

    J2ME for mobile applications (Mobile Java). J2SE designated the Standard Edition.

    In 2006, for marketing purposes, Sun renamed new J2 versions as Java EE, Java ME,

    and Java SE, respectively.

    In 1997, Sun Microsystems approached the ISO/IEC JTC1 standards body and later

    the Ecma International to formalize Java, but it soon withdrew from the process.[15]

    Java remains a de facto standard, controlled through the Java Community

    Process.[16] At one time, Sun made most of its Java implementations available

    without charge, despite their proprietary software status. Sun generated revenue from

    Java through the selling of licenses for specialized products such as the Java

    Enterprise System. Sun distinguishes between its Software Development Kit (SDK)

    and Runtime Environment (JRE) (a subset of the SDK); the primary distinction

    involves the JRE's lack of the compiler, utility programs, and header files.

    Principles

    There were five primary goals in the creation of the Java language:

    It should be "simple, object-oriented and familiar"

    It should be "robust and secure"

    It should be "architecture-neutral and portable"

    It should execute with "high performance"

    It should be "interpreted, threaded, and dynamic"

    Versions

    Major release versions of Java, along with their release dates:

    JDK 1.0 (January 23, 1996)

    JDK 1.1 (February 19, 1997)

    J2SE 1.2 (December 8, 1998)

    J2SE 1.3 (May 8, 2000)

    J2SE 1.4 (February 6, 2002)

    J2SE 5.0 (September 30, 2004)

    Java SE 6 (December 11, 2006)

    Java SE 7 (July 28, 2011)

    Java Platform

    One characteristic of Java is portability, which means that computer programs written

    in the Java language must run similarly on any hardware/operating-system platform.

    This is achieved by compiling the Java language code to an intermediate

    representation called Java bytecode, instead of directly to platform-specific machine

    code. Java bytecode instructions are analogous to machine code, but are intended to

    be interpreted by a virtual machine (VM) written specifically for the host hardware.

    End-users commonly use a Java Runtime Environment (JRE) installed on their own

    machine for standalone Java applications, or in a Web browser for Java applets.

    Standardized libraries provide a generic way to access host-specific features such as

    graphics, threading, and networking.

    A major benefit of using bytecode is portability. However, the overhead of

    interpretation means that interpreted programs almost always run more slowly than

    programs compiled to native executables would. Just-in-Time (JIT) compilers, which

    compile bytecode to machine code at run time, were introduced from an early stage.

    Automatic Memory Management

    Java uses an automatic garbage collector to manage memory in the object lifecycle.

    The programmer determines when objects are created, and the Java runtime is

    responsible for recovering the memory once objects are no longer in use. Once no

    references to an object remain, the unreachable memory becomes eligible to be freed

    automatically by the garbage collector. Something similar to a memory leak may still

    occur if a programmer's code holds a reference to an object that is no longer needed,

    typically when objects that are no longer needed are stored in containers that are still

    in use. If methods for a nonexistent object are called, a "null pointer exception" is

    thrown.
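
    The two failure modes described above can be reproduced in a few lines. The sketch

    below is illustrative only: the class name MemoryDemo, the CACHE list and the method

    names are assumptions made for this example, not part of the project code. It shows

    objects kept alive by a container that is still in use, and a method call on a

    reference that points to no object.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names are chosen for illustration only.
public class MemoryDemo {
    // A long-lived container: everything added here stays reachable,
    // so the garbage collector cannot reclaim it.
    private static final List<byte[]> CACHE = new ArrayList<byte[]>();

    // Adds blocks to the container and returns how many it now holds.
    public static int retain(int blocks) {
        for (int i = 0; i < blocks; i++) {
            CACHE.add(new byte[1024]); // still referenced after this method returns
        }
        return CACHE.size();
    }

    // Drops the references; the blocks become eligible for collection.
    public static void release() {
        CACHE.clear();
    }

    // Returns true if calling a method through a null reference
    // raised a NullPointerException.
    public static boolean callOnNull() {
        String text = null;
        try {
            text.length(); // there is no object behind this reference
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("retained blocks: " + retain(100));
        release();
        System.out.println("retained after release: " + retain(0));
        System.out.println("NullPointerException thrown: " + callOnNull());
    }
}
```

    Clearing the container (or removing individual entries) is what makes the memory

    reclaimable; when the collector actually runs is decided by the Java runtime.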

    Java does not support C/C++ style pointer arithmetic, where object addresses and

    unsigned integers (usually long integers) can be used interchangeably. This allows the

    garbage collector to relocate referenced objects and ensures type safety and security.

    JavaServer Pages

    JavaServer Pages (JSP) is a technology that helps software

    developers create dynamically generated web pages based on HTML, XML, or other

    document types. Released in 1999 by Sun Microsystems[1], JSP is similar to PHP, but

    it uses the Java programming language.

    To deploy and run JavaServer Pages, a compatible web server with a servlet

    container, such as Apache Tomcat or Jetty, is required.

    Overview

    Architecturally, JSP may be viewed as a high-level abstraction of Java servlets. JSPs

    are translated into servlets at runtime; each JSP's servlet is cached and re-used until

    the original JSP is modified.[2]

    JSP can be used independently or as the view component of a server-side

    model-view-controller design, normally with JavaBeans as the model and Java servlets

    (or a framework such as Apache Struts) as the controller. This is a type of Model

    2 architecture.[3]

    JSP allows Java code and certain pre-defined actions to be interleaved with static web

    markup content, with the resulting page being compiled and executed on the server to

    deliver a document. The compiled pages, as well as any dependent Java libraries, use

    Java bytecode rather than a native software format. Like any other Java program, they

    must be executed within a Java virtual machine (JVM) that integrates with the server's

    host operating system to provide an abstract platform-neutral environment.

    JSP pages are usually used to deliver HTML and XML documents, but through the

    use of OutputStream, they can deliver other types of data as well.

    Syntax

    JSP pages use several delimiters for scripting functions. The most basic is <% ... %>,

    which encloses a JSP scriptlet. A scriptlet is a fragment of Java code that is run when

    the user requests the page. Other common delimiters include <%= ... %> for

    expressions, where the value of the expression is placed into the page

    delivered to the user, and directives, denoted with <%@ ... %>.

    Comparison with similar technologies

    JSP pages are similar to PHP pages and ASP.NET Web Forms, in that all three add

    server-side code to an HTML page. However, all three terms refer to a different

    component of the system. JSP refers to the JSP pages, which can be used alone, with

    Java servlets, or with a framework such as Apache Struts. PHP is itself a

    programming language, designed for dynamic Web pages.[11] ASP.NET is a

    framework comparable to Struts or JavaServer Faces that uses pages called Web

    Forms.[12]

    While JSP pages use the Java language, ASP.NET pages can use any .NET-

    compatible language, usually C#.

    ASP.NET is designed for a Microsoft Windows web server, while PHP and Java

    server technologies (including JSP) are fairly operating system agnostic.

    Ajax

    Ajax (an acronym for Asynchronous JavaScript and XML) is a group of interrelated

    web development techniques used on the client-side to create asynchronous web

    applications. With Ajax, web applications can send data to, and retrieve data from, a

    server asynchronously (in the background) without interfering with the display and

    behavior of the existing page. Data can be retrieved using the XMLHttpRequest

    object. Despite the name, the use of XML is not required (JSON is often used

    instead), and the requests do not need to be asynchronous.

    Ajax is not a single technology, but a group of technologies. HTML and CSS can be

    used in combination to mark up and style information. The DOM is accessed with

    JavaScript to dynamically display, and to allow the user to interact with the

    information presented. JavaScript and the XMLHttpRequest object provide a method

    for exchanging data asynchronously between browser and server to avoid full page

    reloads.

    Technologies

    The term Ajax has come to represent a broad group of web technologies that can be

    used to implement a web application that communicates with a server in the

    background, without interfering with the current state of the page. In the article that

    coined the term Ajax, Jesse James Garrett explained that the following technologies

    are incorporated:

    HTML (or XHTML) and CSS for presentation

    The Document Object Model (DOM) for dynamic display of and interaction with data

    XML for the interchange of data, and XSLT for its manipulation

    The XMLHttpRequest object for asynchronous communication

    JavaScript to bring these technologies together

    Since then, however, there have been a number of developments in the technologies

    used in an Ajax application, and the definition of the term Ajax. XML is not required

    for data interchange and therefore XSLT is not required for the manipulation of data.

    JavaScript Object Notation (JSON) is often used as an alternative format for data

    interchange, although other formats such as preformatted HTML or plain text can also

    be used.

    Cascading Style Sheets

    Cascading Style Sheets (CSS) is a style sheet language used for describing the

    presentation semantics (the look and formatting) of a document written in a markup

    language. Its most common application is to style web pages written in HTML and

    XHTML, but the language can also be applied to any kind of XML document,

    including plain XML, SVG and XUL.

    CSS is designed primarily to enable the separation of document content (written in

    HTML or a similar markup language) from document presentation, including

    elements such as the layout, colors, and fonts. This separation can improve content

    accessibility, provide more flexibility and control in the specification of presentation

    characteristics, enable multiple pages to share formatting, and reduce complexity and

    repetition in the structural content (such as by allowing for tableless web design). CSS

    can also allow the same markup page to be presented in different styles for different

    rendering methods, such as on-screen, in print, by voice (when read out by a speech-

    based browser or screen reader) and on Braille-based, tactile devices. It can also be

    used to allow the web page to display differently depending on the screen size or

    device on which it is being viewed. While the author of a document typically links

    that document to a CSS style sheet, readers can use a different style sheet, perhaps

    one on their own computer, to override the one the author has specified.

    CSS specifies a priority scheme to determine which style rules apply if more than one

    rule matches against a particular element. In this so-called cascade, priorities or

    weights are calculated and assigned to rules, so that the results are predictable.

    The CSS specifications are maintained by the World Wide Web Consortium (W3C),

    which also operates a free CSS validation service. The Internet media type (MIME

    type) text/css is registered for use with CSS by RFC 2318 (March 1998).

    Syntax

    CSS has a simple syntax and uses a number of English keywords to specify the names

    of various style properties.

    A style sheet consists of a list of rules. Each rule or rule-set consists of one or more

    selectors, and a declaration block. A declaration-block consists of a list of declarations

    in braces. Each declaration itself consists of a property, a colon (:), and a value. If

    there are multiple declarations in a block, a semi-colon (;) must be inserted to separate

    each declaration.

    In CSS, selectors are used to declare which part of the markup a style applies to, a

    kind of match expression. Selectors may apply to all elements of a specific type, to

    elements specified by attribute, or to elements depending on how they are placed

    relative to, or nested within, others in the document tree.

    Pseudo-classes are used in CSS selectors to permit formatting based on information

    that is outside the document tree. An often-used example of a pseudo-class is :hover,

    which identifies content only when the user 'points to' the visible element, usually by

    holding the mouse cursor over it. It is appended to a selector as in a:hover or

    #elementid:hover. A pseudo-class classifies document elements, such as :link or

    :visited, whereas a pseudo-element makes a selection that may consist of partial

    elements, such as :first-line or :first-letter.

    Selectors may be combined in many ways, especially in CSS 2.1, to achieve great

    specificity and flexibility.

    Browser support

    Further information: Comparison of layout engines (Cascading Style Sheets)

    Because not all browsers parse CSS code correctly, developers use coding techniques

    known as CSS hacks to either filter out specific browsers or target specific browsers

    (generally both are known as CSS filters). The former can be defined as CSS filtering

    hacks and the latter can be defined as CSS targeting hacks. Both can be used to hide

    or show parts of the CSS to different browsers. This is achieved either by exploiting

    CSS-handling quirks or bugs in the browser, or by taking advantage of lack of support

    for parts of the CSS specifications. Using CSS filters, some designers have gone as far

    as delivering different CSS to certain browsers to ensure designs render as expected.

    Because very early web browsers were either completely incapable of handling CSS,

    or rendered CSS very poorly, designers today often routinely use CSS filters that

    completely prevent these browsers from accessing any of the CSS. Internet Explorer

    support for CSS began with IE 3.0 and increased progressively with each version. By

    2008, the first Beta of Internet Explorer 8 offered support for CSS 2.1 in its best web

    standards mode.

    An example of a well-known CSS browser bug is the Internet Explorer box model

    bug, where box widths are interpreted incorrectly in several versions of the browser,

    resulting in blocks that are too narrow when viewed in Internet Explorer, but correct

    in standards-compliant browsers. The bug can be avoided in Internet Explorer 6 by

    using the correct doctype in (X)HTML documents. CSS hacks and CSS filters are

    used to compensate for bugs such as this, just one of hundreds of CSS bugs that have

    been documented in various versions of Netscape, Mozilla Firefox, Opera, and

    Internet Explorer (including Internet Explorer 7).

    Even when the availability of CSS-capable browsers made CSS a viable technology,

    the adoption of CSS was still held back by designers' struggles with browsers'

    incorrect CSS implementation and patchy CSS support. Even today, these problems

    continue to make the business of CSS design more complex and costly than it was

    intended to be, and cross-browser testing remains a necessity. Other reasons for the

    continuing non-adoption of CSS are: its perceived complexity, authors' lack of

    familiarity with CSS syntax and required techniques, poor support from authoring

    tools, the risks posed by inconsistency between browsers and the increased costs of

    testing.

    Currently there is strong competition between Mozilla's Gecko layout engine used in

    Firefox, the WebKit layout engine used in Apple Safari and Google Chrome, the

    similar KHTML engine used in KDE's Konqueror browser, and Opera's Presto layout

    engine; each of them leads in different aspects of CSS. As of August 2009,

    Internet Explorer 8, Firefox 2 and 3 have reasonably complete levels of

    implementation of CSS 2.1.

    Bing Translator

    Bing Translator (previously Live Search Translator and Windows Live Translator) is

    a user facing translation portal provided by Microsoft as part of its Bing services to

    translate texts or entire web pages into different languages. All translation pairs are

    powered by Microsoft Translator statistical machine translation platform and web

    service, developed by Microsoft Research, as its backend translation software. Two

    transliteration pairs (between Chinese Traditional and Chinese Simplified) are

    provided by Microsoft's Windows International team.

    Features

    In addition to standard text and web page translation, Bing Translator includes several

    additional features:

    When translating an entire web page, or when the user selects "Translate this page" in

    Bing search results, the Bilingual Viewer will be shown, which allows users to

    browse the original web page text and translation in parallel, supported by

    synchronized highlights, scrolling, and navigation

    Four Bilingual Viewer layouts are available:

    Side by side

    Top and bottom

    Original with hover translation

    Translation with hover original

    Website owners can add a translation widget to their website for translating it into

    other languages supported by Bing Translator; this is done by inserting an HTML

    code snippet on the web page

    Any-to-Any language translation pairs

    Automatically detect the language of the text or website being translated

    Ability to easily reverse the translation direction

    The user can play back a spoken version of the translation through text-to-speech (not

    supported in every language)

    SPHINX

    CMU Sphinx, also called Sphinx in short, is the general term to describe a group of

    speech recognition systems developed at Carnegie Mellon University. These include a

    series of speech recognizers (Sphinx 2 - 4) and an acoustic model trainer

    (SphinxTrain).

    In 2000, the Sphinx group at Carnegie Mellon committed to open source several

    speech recognizer components, including Sphinx 2 and later Sphinx 3 (in 2001). The

    speech decoders come with acoustic models and sample applications. The available

    resources also include software for acoustic model training, language model

    compilation and a public-domain pronunciation dictionary, cmudict.

    Sphinx

    Sphinx is a continuous-speech, speaker-independent recognition system making use

    of hidden Markov acoustic models (HMMs) and an n-gram statistical language model.

    It was developed by Kai-Fu Lee. Sphinx demonstrated the feasibility of continuous-speech,

    speaker-independent large-vocabulary recognition, the possibility of which was in

    dispute at the time (1986). Sphinx is of historical interest only; it has been superseded

    in performance by subsequent versions. An archival article describes the system in

    detail.

    Sphinx 2

    A fast performance-oriented recognizer, originally developed by Xuedong Huang at

    Carnegie Mellon and released as open source with a BSD-style license on

    SourceForge by Kevin Lenzo at LinuxWorld in 2000. Sphinx 2 focuses on real-time

    recognition suitable for spoken language applications. As such it incorporates

    functionality such as end-pointing, partial hypothesis generation, dynamic language

    model switching and so on. It is used in dialog systems and language learning

    systems. It can be used in computer based PBX systems such as Asterisk. Sphinx 2

    code has also been incorporated into a number of commercial products. It is no longer

    under active development (other than for routine maintenance). Current real-time

    decoder development is taking place in the Pocket Sphinx project. An archival article

    describes the system.

    Sphinx 3

    Sphinx 2 used a semi-continuous representation for acoustic modeling (i.e., a single

    set of Gaussians is used for all models, with individual models represented as a

    weight vector over these Gaussians). Sphinx 3 adopted the prevalent continuous

    HMM representation and has been used primarily for high-accuracy, non-real-time

    recognition. Recent developments (in algorithms and in hardware) have made Sphinx

    3 "near" real-time, although not yet suitable for critical interactive applications.

    Sphinx 3 is under active development and in conjunction with SphinxTrain provides

    access to a number of modern modeling techniques, such as LDA/MLLT, MLLR and

    VTLN, that improve recognition accuracy (see the article on Speech Recognition for

    descriptions of these techniques).

    Sphinx 4

    Sphinx 4 is a complete re-write of the Sphinx engine with the goal of providing a

    more flexible framework for research in speech recognition, written entirely in the

    Java programming language. Sun Microsystems supported the development of Sphinx

    4 and contributed software engineering expertise to the project. Participants included

    individuals at MERL, MIT and CMU.

    Current development goals include:

    developing a new (acoustic model) trainer

    implementing speaker adaptation (e.g. MLLR)

    improving configuration management

    creating a graph-based UI for graphical system design

    Basic concepts of speech

    Speech is a complex phenomenon. People rarely understand how it is produced and

    perceived. The naive perception is often that speech is built with words, and each

    word consists of phones. The reality is unfortunately very different. Speech is a

    dynamic process without clearly distinguished parts. It's always useful to get a sound

    editor, look at a recording of speech, and listen to it.

    All modern descriptions of speech are to some degree probabilistic. That means that

    there are no certain boundaries between units, or between words. Speech to text

    translation and other applications of speech are never 100% correct. That idea is

    rather unusual for software developers, who usually work with deterministic systems.

    And it creates a lot of issues specific only to speech technology.

    Structure of speech

    In current practice, speech structure is understood as follows:

    Speech is a continuous audio stream where rather stable states mix with dynamically

    changed states. In this sequence of states, one can define more or less similar classes

    of sounds, or phones. Words are understood to be built of phones, but this is certainly

    not true. The acoustic properties of a waveform corresponding to a phone can vary

    greatly depending on many factors - phone context, speaker, style of speech and so

    on. So-called coarticulation makes phones sound very different from their

    canonical representation. Next, since transitions between words are more

    informative than stable regions, developers often talk about diphones - parts of phones

    between two consecutive phones. Sometimes developers talk about subphonetic units

    - different substates of a phone. Often three or more regions of a different nature can

    easily be found.

    The number three is easily explained. The first part of the phone depends on its

    preceding phone, the middle part is stable, and the next part depends on the

    subsequent phone. That's why there are often three states in a phone selected for

    HMM recognition.

    Sometimes phones are considered in context. There are triphones or even quinphones.

    But note that unlike phones and diphones, they are matched with the same range in

    waveform as just phones. They just differ by name. That's why we prefer to call this

    object a senone. A senone's dependence on context could be more complex than just

    left and right context. It can be a rather complex function defined by a decision tree,

    or in some other way.

    Next, phones build subword units, like syllables. Sometimes, syllables are defined as

    reduction-stable entities. To illustrate, when speech becomes fast, phones often

    change, but syllables remain the same. Also, syllables are related to intonational

    contour. There are other ways to build subwords - morphologically-based in

    morphology-rich languages or phonetically-based. Subwords are often used in open

    vocabulary speech recognition.

    Subwords form words. Words are important in speech recognition because they

    restrict combinations of phones significantly. If there are 40 phones and an average

    word has 7 phones, there could be as many as 40^7 words. Luckily, even a very

    educated person rarely uses more than 20k words in practice, which makes

    recognition far more feasible.
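
    To make the size of that gap concrete, the arithmetic can be worked out directly. The

    class and method names below are assumptions made for this illustration; the figures

    (40 phones, 7 phones per word, a 20k-word vocabulary) are the ones quoted above.

```java
// Hypothetical helper for the arithmetic above; not part of the project code.
public class PhoneCombinations {
    // Integer power by repeated multiplication; fine for small exponents.
    public static long power(long base, int exponent) {
        long result = 1;
        for (int i = 0; i < exponent; i++) {
            result *= base;
        }
        return result;
    }

    public static void main(String[] args) {
        long sequences = power(40, 7);   // all 7-phone sequences over 40 phones
        long vocabulary = 20000;         // words a well-educated speaker uses
        System.out.println(sequences);              // prints 163840000000
        System.out.println(sequences / vocabulary); // prints 8192000
    }
}
```

    In other words, a real vocabulary covers roughly one in eight million of the

    phone sequences that are possible in principle.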

    Words and other non-linguistic sounds, which we call fillers (breath, um, uh, cough),

    form utterances. They are separate chunks of audio between pauses. They don't

    necessarily match sentences, which are more semantic concepts.

    On top of this, there are dialog acts like turns, but they go beyond the scope of

    this document.

    Recognition process

    The common way to recognize speech is the following: we take a waveform, split it

    into utterances at silences, and then try to recognize what is being said in each

    utterance. To do that, we want to take all possible combinations of words and try to

    match them with the audio, then choose the best-matching combination. A few

    concepts are important in this matching.

    First is the concept of features. Since the number of parameters in raw audio is large,

    we try to reduce it. Features are numbers calculated from speech, usually by dividing

    the speech into frames. Then for each frame, typically 10 milliseconds long, we extract

    39 numbers that represent the speech. This is called a feature vector. The way to

    generate these numbers is a subject of active investigation, but in the simple case it is

    derived from the spectrum.
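
    The framing step can be sketched as follows. This is a simplified illustration, not
    the Sphinx front end: it cuts the signal into non-overlapping 10 ms frames and computes
    a single toy feature (log energy) per frame instead of a 39-number vector. Real front
    ends typically use overlapping windows and spectral features. All names are assumptions
    made for the example.

```java
import java.util.Arrays;

final class Framing {

    /**
     * Splits samples into consecutive, non-overlapping frames of frameMs milliseconds.
     * A trailing partial frame is dropped.
     */
    static double[][] frames(double[] samples, int sampleRateHz, int frameMs) {
        int frameLen = sampleRateHz * frameMs / 1000;
        if (frameLen <= 0) {
            throw new IllegalArgumentException("frame length must be positive");
        }
        int count = samples.length / frameLen;
        double[][] out = new double[count][];
        for (int i = 0; i < count; i++) {
            out[i] = Arrays.copyOfRange(samples, i * frameLen, (i + 1) * frameLen);
        }
        return out;
    }

    /** Toy feature: log energy of one frame (a stand-in for a full feature vector). */
    static double logEnergy(double[] frame) {
        double sum = 0.0;
        for (double s : frame) {
            sum += s * s;
        }
        // Small constant avoids log(0) on silent frames.
        return Math.log(sum + 1e-10);
    }
}
```

    With these assumptions, one second of audio sampled at 16 kHz yields 100 frames of 160
    samples each.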

    Second is the concept of the model. A model describes some mathematical object that

    gathers common attributes of the spoken word. In practice, the audio model of a senone

    is a Gaussian mixture of its three states; to put it simply, it is the most probable feature

    vector. The concept of the model raises the following issues: how well the model

    fits practice, whether the model can be made better given its internal problems, and how

    adaptive the model is to changed conditions.

    Third is the matching process itself. Since comparing all feature vectors with all

    models would take longer than the universe has existed, the search is often

    optimized by many tricks. At any point we maintain the best matching variants and

    extend them as time goes on, producing the best matching variants for the next frame.
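
    Keeping only the best matching variants at each frame is commonly called beam search.
    The sketch below is a minimal illustration under strong assumptions: each frame supplies
    one log-score per label, there is no transition or language model, and the names
    (BeamSearch, decode, beamWidth) are invented for the example. It is not how Sphinx
    implements its search.

```java
import java.util.ArrayList;
import java.util.List;

final class BeamSearch {

    /** A partial match: the labels chosen so far and their accumulated log-score. */
    static final class Hypothesis {
        final String path;
        final double score;

        Hypothesis(String path, double score) {
            this.path = path;
            this.score = score;
        }
    }

    /**
     * frameScores[t][s] is the log-score of label s at frame t.
     * At every frame only the beamWidth best hypotheses are kept and extended.
     */
    static String decode(double[][] frameScores, String[] labels, int beamWidth) {
        List<Hypothesis> beam = new ArrayList<>();
        beam.add(new Hypothesis("", 0.0));
        for (double[] scores : frameScores) {
            List<Hypothesis> next = new ArrayList<>();
            for (Hypothesis h : beam) {
                for (int s = 0; s < scores.length; s++) {
                    next.add(new Hypothesis(h.path + labels[s], h.score + scores[s]));
                }
            }
            // Best score first, then prune to the beam width.
            next.sort((a, b) -> Double.compare(b.score, a.score));
            beam = new ArrayList<>(next.subList(0, Math.min(beamWidth, next.size())));
        }
        return beam.get(0).path;
    }
}
```

    A narrower beam is faster but may discard a variant that would have turned out best
    once later frames are taken into account.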

    Models

    According to the speech structure, three models are used in speech recognition to do

    the match:

    An acoustic model contains acoustic properties for each senone. There are context-

    independent models that contain properties (most probable feature vectors for each

    phone) and context-dependent ones (built from senones with context).

    A phonetic dictionary contains a mapping from words to phones. This mapping is not

    very effective. For example, only two to three pronunciation variants are noted in it,

    but it's practical enough most of the time. The dictionary is not the only variant of

    mapper from words to phones. It could be done with some complex function learned

    with a machine learning algorithm.
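
    In its simplest form, a phonetic dictionary is a lookup table from a word to one or
    more phone sequences. The sketch below assumes an in-memory map; the class name and
    the phone symbols used in the usage note are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PhoneticDictionary {

    // word -> list of pronunciation variants, each a list of phones
    private final Map<String, List<List<String>>> entries = new HashMap<>();

    /** Registers one pronunciation variant for a word. */
    void add(String word, String... phones) {
        entries.computeIfAbsent(word.toLowerCase(), k -> new ArrayList<>())
               .add(Arrays.asList(phones));
    }

    /** Returns all known variants, or an empty list for out-of-vocabulary words. */
    List<List<String>> pronunciations(String word) {
        List<List<String>> found = entries.get(word.toLowerCase());
        return found != null ? found : Collections.<List<String>>emptyList();
    }
}
```

    For example, after add("the", "DH", "AH") and add("the", "DH", "IY"), a lookup of
    "the" returns both variants, while an unknown word returns an empty list.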

    A language model is used to restrict word search. It defines which word could follow

    previously recognized words (remember that matching is a sequential process) and

    helps to significantly restrict the matching process by stripping words that are not

    probable. The most common language models are n-gram language models, which

    contain statistics of word sequences, and finite state language models, which define

    speech sequences by a finite state automaton, sometimes with weights. To reach a good

    accuracy rate, your language model must be very successful in search space

    restriction. This means it should be very good at predicting the next word. A language

    model usually restricts the vocabulary considered to the words it contains. That's an

    issue for name recognition. To deal with this, a language model can contain smaller

    chunks like subwords or even phones. Please note that search space restriction in this

    case is usually worse and corresponding recognition accuracies are lower than with a

    word-based language model.
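
    To make the idea of an n-gram model concrete, here is a minimal bigram sketch. It
    counts word pairs in training sentences and estimates the probability of the next word
    as a plain relative frequency, with no smoothing. The class and method names are
    assumptions for the example and do not correspond to any CMUSphinx tool.

```java
import java.util.HashMap;
import java.util.Map;

final class BigramModel {

    // previous word -> (next word -> count)
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();
    // previous word -> total number of observed successors
    private final Map<String, Integer> totals = new HashMap<>();

    /** Adds the word pairs of one whitespace-separated sentence to the statistics. */
    void train(String sentence) {
        String[] words = sentence.trim().toLowerCase().split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            counts.computeIfAbsent(words[i], k -> new HashMap<>())
                  .merge(words[i + 1], 1, Integer::sum);
            totals.merge(words[i], 1, Integer::sum);
        }
    }

    /** Relative frequency of next following previous; 0 if previous was never seen. */
    double probability(String previous, String next) {
        Integer total = totals.get(previous);
        if (total == null) {
            return 0.0;
        }
        return counts.get(previous).getOrDefault(next, 0) / (double) total;
    }
}
```

    A word pair that never occurred in training gets probability zero here, which is the
    vocabulary restriction described above in its bluntest form.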

    Those three entities are combined together in an engine to recognize speech.

    CMUSphinx Toolkit

    CMUSphinx toolkit is a leading speech recognition toolkit with various tools used to

    build speech applications. The CMUSphinx toolkit has a number of packages for different

    tasks and applications:

    Sphinx4: adjustable, modifiable recognizer written in Java

    CMUclmtk: language model tools

    Sphinxtrain: acoustic model training tools

    MySQL

    MySQL is the world's most used open source relational database management

    system (RDBMS) that runs as a server providing multi-user access to a number of

    databases. The SQL phrase stands for Structured Query Language.

    The MySQL development project has made its source code available under the terms

    of the GNU General Public License, as well as under a variety of

    proprietary agreements. MySQL was owned and sponsored by a single for-profit firm,

    the Swedish company MySQL AB, now owned by Oracle Corporation.

    Free and open source software projects that require a full-featured database management

    system often use MySQL. For commercial use, several paid editions are available,

    and offer additional functionality. Applications which use MySQL databases

    include: TYPO3, Joomla, WordPress, phpBB, MyBB, Drupal and other software built

    on the LAMP software stack. MySQL is also used in many high-profile, large-

    scale World Wide Web products, including Wikipedia, Google (though not for

    searches), Facebook, and Twitter.

    MySQL is a popular choice of database for use in web applications, and is a central

    component of the widely used LAMP open source web application software stack.

    LAMP is an acronym for "Linux, Apache, MySQL, Perl/PHP/Python".

    Many programming languages with language-specific APIs include libraries for

    accessing MySQL databases. These include MySQL Connector/Net for integration

    with Microsoft's Visual Studio (languages such as C# and VB are most commonly

    used) and the JDBC driver for Java. In addition, an ODBC interface

    called MyODBC allows additional programming languages that support the ODBC

    interface to communicate with a MySQL database, such as ASP or ColdFusion.

    The HTSQL - URL-based query method also ships with a MySQL adapter, allowing

    direct interaction between a MySQL database and any web client via structured

    URLs.

    As of April 2009, MySQL offered MySQL 5.1 in two different variants: the open

    source MySQL Community Server and the commercial Enterprise Server. MySQL

    5.5 is offered under the same licenses. They have a common code base and include

    the following features:

    A broad subset of ANSI SQL 99, as well as extensions

    Cross-platform support

    Stored procedures

    Triggers

    Cursors

    Updatable Views

    Information schema

    Strict mode (ensures MySQL does not truncate or otherwise modify data to conform to an underlying data type, when an incompatible value is inserted into that type)

    X/Open XA distributed transaction processing (DTP) support; two phase commit as part of this, using Oracle's InnoDB engine

    Independent storage engines (MyISAM for read speed, InnoDB for transactions and referential integrity, MySQL Archive for storing historical data in little space)

    Transactions with the InnoDB and Cluster storage engines; savepoints with InnoDB

    SSL support

    Query caching

    Sub-SELECTs (i.e. nested SELECTs)

    Replication support (i.e. Master-Master Replication & Master-Slave Replication) with one master per slave, many slaves per master, no automatic support for multiple masters per slave

    Full-text indexing and searching using MyISAM engine

    Embedded database library

    Partial Unicode support (UTF-8 and UCS-2 encoded strings are limited to the BMP)

    ACID compliance when using transaction capable storage engines (InnoDB and Cluster)

    Partitioned tables with pruning of partitions in optimiser

    Shared-nothing clustering through MySQL Cluster

    Hot backup (via mysqlhotcopy) under certain conditions

    MySQL implements the following features, which some other RDBMS systems may

    not:

    Multiple storage engines, allowing one to choose the one that is most effective for each table in the application (in MySQL 5.0, storage engines must be compiled in; in MySQL 5.1, storage engines can be dynamically loaded at run time):

        Native storage engines (MyISAM, Falcon, Merge, Memory (heap), Federated, Archive, CSV, Blackhole, Cluster, EXAMPLE, Maria, and InnoDB, which was made the default as of 5.5)

        Partner-developed storage engines (solidDB, NitroEDB, ScaleDB, TokuDB, Infobright (formerly Brighthouse), Kickfire, XtraDB, IBM DB2). InnoDB used to be a partner-developed storage engine, but with recent acquisitions, Oracle now owns both MySQL core and InnoDB.

        Community-developed storage engines (memcache engine, httpd, PBXT, Revision Engine)

        Custom storage engines

    Commit grouping, gathering multiple transactions from multiple connections together to increase the number of commits per second.

    APACHE STRUTS FRAMEWORK

    Apache Struts is an open-source web application framework for developing Java EE

    web applications. It uses and extends the Java Servlet API to encourage developers to

    adopt a model-view-controller (MVC) architecture. It was originally created by Craig

    McClanahan and donated to the Apache Software Foundation in May 2000. Formerly located

    under the Apache Jakarta Project and known as Jakarta Struts, it became a top-level

    Apache project in 2005.

    Design goals and overview

    In a standard Java EE web application, the client will typically call to the server via a

    web form. The information is then either handed over to a Java Servlet which

    interacts with a database and produces an HTML-formatted response, or it is given to

    a JavaServer Pages (JSP) document that intermingles HTML and Java code to achieve

    the same result. Both approaches are often considered inadequate for large projects

    because they mix application logic with presentation and make maintenance difficult.

    The goal of Struts is to separate the model (application logic that interacts with a

    database) from the view (HTML pages presented to the client) and the controller

    (instance that passes information between view and model). Struts provides the

    controller (a servlet known as ActionServlet) and facilitates the writing of templates

    for the view or presentation layer (typically in JSP, but XML/XSLT and Velocity are

    also supported). The web application programmer is responsible for writing the model

    code, and for creating a central configuration file struts-config.xml that binds together

    model, view and controller.

    Requests from the client are sent to the controller in the form of "Actions" defined in

    the configuration file; if the controller receives such a request it calls the

    corresponding Action class that interacts with the application-specific model code.

    The model code returns an "ActionForward", a string telling the controller what

    output page to send to the client. Information is passed between model and view in the

    form of special JavaBeans. A powerful custom tag library allows it to read and write

    the content of these beans from the presentation layer without the need for any

    embedded Java code.
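
    The request flow described above (action name, action code, result string, output page)
    can be illustrated without the Struts library. The following is a plain-Java sketch of
    the front-controller idea, not the Struts API: MiniController, mapAction, mapForward and
    the /error.jsp fallback are all hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class MiniController {

    // action name -> code that handles the request and returns a result string
    private final Map<String, Function<Map<String, String>, String>> actions = new HashMap<>();
    // "action:result" -> page to show, playing the role of the configuration file
    private final Map<String, String> forwards = new HashMap<>();

    void mapAction(String name, Function<Map<String, String>, String> action) {
        actions.put(name, action);
    }

    void mapForward(String actionName, String result, String page) {
        forwards.put(actionName + ":" + result, page);
    }

    /** Runs the named action and resolves its result string to an output page. */
    String dispatch(String actionName, Map<String, String> params) {
        Function<Map<String, String>, String> action = actions.get(actionName);
        if (action == null) {
            return "/error.jsp"; // hypothetical fallback page
        }
        String result = action.apply(params);
        return forwards.getOrDefault(actionName + ":" + result, "/error.jsp");
    }
}
```

    The point of the design is that the action code only returns a short result string; which
    page that string leads to is decided by the mapping, so pages can be rearranged without
    touching the action code.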

    Struts is categorized as a request-based web application framework.

    Struts also supports internationalization by web forms, and includes a template

    mechanism called "Tiles" that (for instance) allows the presentation layer to be

    composed from independent header, footer, and content components.

    Struts2 features

    Simple POJO based Actions

    Simplified testability

    Thread Safe

    AJAX Support

    jQuery Plugin

    Dojo Plugin (deprecated)

    AJAX Client Side Validation

    Template Support

    Support for different result types

    Easy to extend with Plugins

    REST Plugin (REST based Actions, Extension-less URLs)

    Convention Plugin (Action Configuration via Conventions and Annotations)

    Spring Plugin (Dependency Injection)

    Hibernate Plugin support in Design

    JFreechart Plugin (Charts)

    jQuery Plugin (AJAX Support, UI Widgets, Dynamic Table, Charts)

    Rome Plugin (RSS Feeds)

    Competing MVC frameworks

    Although Struts is a well-documented, mature, and popular framework for building

    front ends to Java applications, there are other frameworks categorized as

    "lightweight" MVC frameworks such as Spring MVC, Stripes, Wicket, Play!, and

    Tapestry. The new XForms standards and frameworks may also be another option to

    building complex web Form validations with Struts in the future.

    The WebWork framework spun off from Apache Struts aiming to offer enhancements

    and refinements while retaining the same general architecture of the original Struts

    framework. However, it was announced in December 2005 that Struts would re-merge

    with WebWork. WebWork 2.2 has been adopted as Apache Struts2, which reached its

    first full release in February 2007.

    In 2004, Sun launched an addition to the Java platform, called JavaServer Faces

    (JSF). Aside from the original Struts framework, the Apache project previously

    offered a JSF-based framework called Shale, which was retired in May 2009.

    Other MVC frameworks that are Java based include WebObjects and Grails.

    ECLIPSE IDE

    Eclipse is a multi-language software development environment comprising an

    integrated development environment (IDE) and an extensible plug-in system. It is

    written mostly in Java. It can be used to develop applications in Java and, by means of

    various plug-ins, other programming languages including Ada, C, C++, COBOL,

    Haskell, Perl, PHP, Python, R, Ruby (including Ruby on Rails framework), Scala,

    Clojure, Groovy, Android and Scheme. It can also be used to develop packages for

    the software Mathematica. Development environments include the Eclipse Java

    development tools (JDT) for Java, Eclipse CDT for C/C++, and Eclipse PDT for PHP,

    among others.

    The initial codebase originated from IBM VisualAge. The Eclipse SDK (which

    includes the Java development tools) is meant for Java developers. Users can extend

    its abilities by installing plug-ins written for the Eclipse Platform, such as

    development toolkits for other programming languages, and can write and contribute

    their own plug-in modules.

    Released under the terms of the Eclipse Public License, Eclipse SDK is free and open

    source software. It was one of the first IDEs to run under GNU Classpath and it runs

    without issues under IcedTea.

    Architecture

    The Eclipse Platform uses plug-ins to provide all functionality within and on top of

    the runtime system, in contrast to some other applications, in which functionality is

    hard coded. The Eclipse Platform's runtime system is based on Equinox, an

    implementation of the OSGi core framework specification.

    This plug-in mechanism is a lightweight software componentry framework. In

    addition to allowing the Eclipse Platform to be extended using other programming

    languages such as C and Python, the plug-in framework allows the Eclipse Platform

    to work with typesetting languages like LaTeX, networking applications such as telnet

    and database management systems. The plug-in architecture supports writing any

    desired extension to the environment, such as for configuration management. Java and

    CVS support is provided in the Eclipse SDK, with support for other version control

    systems provided by third-party plug-ins.

    With the exception of a small run-time kernel, everything in Eclipse is a plug-in. This

    means that every plug-in developed integrates with Eclipse in exactly the same way as

    other plug-ins; in this respect, all features are "created equal". Eclipse provides plug-

    ins for a wide variety of features, some of which are through third parties using both

    free and commercial models. Examples of plug-ins include a UML plug-in for

    Sequence and other UML diagrams, a plug-in for DB Explorer, and many others.

    The Eclipse SDK includes the Eclipse Java development tools (JDT), offering an IDE

    with a built-in incremental Java compiler and a full model of the Java source files.

    This allows for advanced refactoring techniques and code analysis. The IDE also

    makes use of a workspace, in this case a set of metadata over a flat filespace allowing

    external file modifications as long as the corresponding workspace "resource" is

    refreshed afterwards.

    Eclipse implements widgets through a widget toolkit for Java called SWT, unlike

    most Java applications, which use the Java standard Abstract Window Toolkit (AWT)

    or Swing. Eclipse's user interface also uses an intermediate graphical user interface

    layer called JFace, which simplifies the construction of applications based on SWT.

    The Eclipse 4.x series also includes XWT (XML Window Toolkit), a declarative framework

    for describing user interfaces in XML. XWT is built on top of SWT and JFace; it does

    not replace them, and SWT remains the widget toolkit of the Eclipse workbench.

    Language packs developed by the "Babel project" provide translations into over a

    dozen natural languages.

    Rich Client Platform

    Eclipse provides the Rich Client Platform (RCP) for developing general purpose

    applications. The following components constitute the rich client platform:

    Equinox OSGi: a standard bundling framework

    Core platform: boot Eclipse, run plug-ins

    Standard Widget Toolkit (SWT): a portable widget toolkit

    JFace: viewer classes to bring model view controller programming to SWT, file buffers, text handling, text editors

    Eclipse Workbench: views, editors, perspectives, wizards

    Examples of rich client applications based on Eclipse are:

    Lotus Notes 8

    Novell/NetIQ Designer for Identity Manager

    Apache Directory Studio

    Server platform

    Eclipse supports development for Tomcat, GlassFish and many other servers and is

    often capable of installing the required server (for development) directly from the

    IDE. It supports remote debugging, allowing the user to watch variables and step

    through the code of an application that is running on the attached server.

    Web Tools Platform

    The Eclipse Web Tools Platform (WTP) project is an extension of the Eclipse

    platform with tools for developing Web and Java EE applications. It includes source

    and graphical editors for a variety of languages, wizards and built-in applications to

    simplify development, and tools and APIs to support deploying, running, and testing

    apps.

    APACHE TOMCAT SERVER

    Apache Tomcat (or simply Tomcat, formerly also Jakarta Tomcat) is an open source

    web server and servlet container developed by the Apache Software Foundation

    (ASF). Tomcat implements the Java Servlet and the JavaServer Pages (JSP)

    specifications from Oracle Corporation, and provides a "pure Java" HTTP web server

    environment for Java code to run.

    Tomcat is not the same as the Apache web server, which is a C implementation of an

    HTTP web server; these two web servers are not bundled together, although they are

    frequently used together as part of a server application stack. Apache Tomcat includes

    tools for configuration and management, but can also be configured by editing XML

    configuration files.

    PROJECT

    SYNCHRONOUS CONFERENCING

    WITH TRANSLATION

    DESCRIPTION

    It is a dynamic web application developed for synchronous conferencing over a

    network. The application is written in the Java programming language and is

    run through the Eclipse Helios IDE on a Windows or Unix-based OS with an Internet

    connection.

    SCOPE

    This web application has been developed for use in field communication. The objective

    of this project is to establish communication between clients who speak different

    languages (specifically Hindi and English). The clients can use two interfaces:

    i. Text Oriented Interface

    The client can communicate by typing words in the provided text area using a standard

    keyboard or a native keyboard.

    ii. Voice Oriented Interface (For future implementation)

    The client can communicate by speaking into the microphone, which is then

    converted into text by the integrated speech recognizer.

    INVOCATION

    The Dynamic Web Application Project has to be imported into the workspace of

    Eclipse Helios IDE. The project requires an API key, which provides access to the

    Bing Translator, available at Microsoft Azure Marketplace. Clients wishing to

    communicate need to sign up first.

    SOURCE CODE

    When the project is run on the Apache Tomcat server, the XML file web.xml opens the

    welcome page signup.jsp named in it. The client signs up through this

    JSP page. The code for the above-mentioned files is shown below along with the

    screenshot of the signup page:

    web.xml

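    <web-app>
        <display-name>Struts 2</display-name>

        <welcome-file-list>
            <welcome-file>/signup.jsp</welcome-file>
        </welcome-file-list>

        <filter>
            <filter-name>struts2</filter-name>
            <filter-class>org.apache.struts2.dispatcher.FilterDispatcher</filter-class>
        </filter>

        <filter-mapping>
            <filter-name>struts2</filter-name>
            <url-pattern>/*</url-pattern>
        </filter-mapping>
    </web-app>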

    signup.jsp

    #signup-box {

    background: url(images/login-box-backg1.png) no-repeat;

    padding-top: 100px;

    padding-left: 200px;

    font: 12px Arial, Helvetica, sans-serif;

    color: #ebebeb;

    margin-left: 100px;

    }

    #signup-box h2 {

    padding: 0;

    margin: 0;

    color: #ebebeb;

    font: bold 50px "Calibri", Arial;

    }

    .form-Name {

    float: left;

    margin-left: 20px;

    margin-top: 18px;

    border: 1px solid #0d2c52;

    background-color: #1e4f8a;

    width: 205px;

    font-size: 16px;

    color: #ebebeb;

    }

    .form-submit{

    float: left;

    font-weight: 900;

    font-size: 14px;

    background: #4F8CDF;

    padding-left: 10px;

    padding-right: 10px;

    height: 42px;

    width: 103px;

    border: 2px;

    border-radius: 8px;

    color: #ebebeb;

    margin-top: 20px;

    margin-left: 200px;

    box-shadow: 0px 2px 3px #000000;

    }

    Signup

    Sign Up

    labelposition="left"value=""size="30"

    maxlength="2048">












    After the form is filled in and the submit button is clicked, the form is validated. Control

    then moves to the Java class Signup.java.

    The code for Signup-validation.xml and Signup.java is shown below.

    Signup-validation.xml

    Display Name is required.

    156

    Password is required.

    15

    6

    Password is required.

    FirstName is required.

    Last Name is required.

    Language is required.

    Email is required.

    Signup.java

    package translator;

    import java.sql.Connection;

    import java.sql.DriverManager;

    import java.sql.ResultSet;

    import java.sql.Statement;

    import java.util.*;

    import org.apache.struts2.ServletActionContext;

    import com.opensymphony.xwork2.ActionSupport;

    public class Signup extends ActionSupport {

        // Form fields populated by Struts 2 from the signup form.
        String firstname, lastname, dname, password, password1, email, lang;

        public String getLang() {
            return lang;
        }

        public void setLang(String lang) {
            this.lang = lang;
        }

        public Signup() {
        }

        public String getFirstname() {
            return firstname;
        }

        public void setFirstname(String firstname) {
            this.firstname = firstname;
        }

        public String getLastname() {
            return lastname;
        }

        public void setLastname(String lastname) {
            this.lastname = lastname;
        }

        public String getDname() {
            return dname;
        }

        public void setDname(String dname) {
            this.dname = dname;
        }

        public String getPassword() {
            return password;
        }

        public void setPassword(String password) {
            this.password = password;
        }

        public String getPassword1() {
            return password1;
        }

        public void setPassword1(String password1) {
            this.password1 = password1;
        }

        public String getEmail() {
            return email;
        }

        public void setEmail(String email) {
            this.email = email;
        }

        // JDBC handles used by execute().
        Connection con;
        ResultSet rs;
        Statement st;

        // flag: 1 if the email is already registered; f: 1 if signup succeeded.
        int flag = 0, f = 0;

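        public String execute() {
            // Remember the chosen language for the chat session.
            ServletActionContext.getContext().getSession().put("l", lang);
            try {
                Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
                con = DriverManager.getConnection("jdbc:odbc:mydsn");
                try {
                    // Check whether the email is already registered.
                    st = con.createStatement();
                    rs = st.executeQuery("select * from Table1");
                    while (rs.next()) {
                        String x = rs.getString("email");
                        if (x != null && x.equals(email)) {
                            flag = 1;
                        }
                    }
                    if (password != null && password.equals(password1) && (flag == 0)) {
                        // Bind the values as parameters rather than concatenating
                        // them into the SQL text, which would allow SQL injection.
                        java.sql.PreparedStatement insert = con.prepareStatement(
                                "insert into Table1 values(?, ?, ?, ?, ?)");
                        insert.setString(1, firstname);
                        insert.setString(2, lastname);
                        insert.setString(3, dname);
                        insert.setString(4, email);
                        insert.setString(5, password);
                        insert.executeUpdate();
                        ServletActionContext.getContext().getSession()
                                .put("fsign", dname);
                        f = 1;
                    } else {
                        f = 0;
                    }
                } finally {
                    // Always release the database connection.
                    con.close();
                }
            } catch (Exception e) {
                System.out.print(e);
            }
            if (f == 1) {
                return SUCCESS;
            } else {
                return NONE;
            }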

    }

    }

    If the method execute(), defined in Signup.java, returns SUCCESS the details filled

    by the client are stored in the database through Java Database Connectivity (JDBC)

    and the next JSP page chat.jsp as mentioned in the XML file struts.xml is

    displayed. The code and screen shots are shown below.

    struts.xml

    /signup.jsp

    /chat.jsp

    /signup.jsp

    /login.jsp/chat.jsp

    /login.jsp

    /chat.jsp

    /chat.jsp

    chat.jsp

    #signup-box {

    background: url(images/login-box-backg1.png) no-repeat;

    padding-top: 20px;

    padding-left: 100px;

    font: 12px Arial, Helvetica, sans-serif;

    color: #ebebeb;

    margin-left: 100px;

    }

    #signup-box h2 {

    padding: 0px;

    margin-left: 230px;

    margin-top: 40px;

    color: #ebebeb;

    font: bold 50px "Calibri", Arial;

    }

    .form-Name {

    float: left;

    margin-left: 0px;

    margin-top: 0px;

    border: 1px solid #0d2c52;

    background-color: #1e4f8a;

    width: 205px;

    font-size: 16px;

    color: #ebebeb;

    }

    .form-submit{

    float: left;

    font-weight: 900;

    font-size: 14px;

    background: #4F8CDF;

    padding-left: 10px;

    padding-right: 10px;

    height: 42px;

    width: 103px;

    border: 2px;

    border-radius: 8px;

    color: #ebebeb;

    margin-top: 20px;

    margin-left: 490px;

    box-shadow: 0px 2px 3px #000000;

    }

    Chat

    CHAT



























    Chat.java

    package translator;

    import org.apache.struts2.ServletActionContext;

    import com.memetix.mst.language.Language;

    import com.memetix.mst.translate.Translate;

    import com.opensymphony.xwork2.ActionSupport;
    import edu.cmu.sphinx.frontend.util.Microphone;
    import edu.cmu.sphinx.recognizer.Recognizer;
    import edu.cmu.sphinx.result.Result;
    import edu.cmu.sphinx.util.props.ConfigurationManager;

    public class Chat extends ActionSupport {

        String t, translatedText, translatedChat, lng, usr, rec;

        // Transcript of the conversation so far, shared by all clients.
        static String chat = "";

        public String getT() {
            return t;
        }

        public void setT(String t) {
            this.t = t;
        }

        public String execute() throws Exception {
            // Username for the Microsoft Azure Marketplace.
            Translate.setClientId(/* Enter your Windows Azure Client Id here */);
            // API key.
            Translate.setClientSecret(/* Enter your Windows Azure Client Secret here */);

            lng = (String) ServletActionContext.getContext().getSession().get("l");
            usr = (String) ServletActionContext.getContext().getSession()
                    .get("ulog");
            rec = (String) ServletActionContext.getContext().getSession().get("y");

            if (t != null || rec != null) {
                // Constant-first comparison avoids a NullPointerException if lng is unset.
                if ("ENGLISH".equals(lng)) {
                    System.out.println("english");

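                    translatedText = Translate.execute(chat,
                            Language.ENGLISH);
                    // Skip whichever of t and rec is null so that the literal
                    // text "null" is never sent for translation.
                    translatedChat = Translate.execute(
                            (t == null ? "" : t) + (rec == null ? "" : rec),
                            Language.ENGLISH);
                    // "<br>" is assumed: the markup inside this string literal
                    // was lost when the document was extracted.
                    chat = translatedText + " " + usr + ":" + " "
                            + translatedChat + "." + "<br>";
                    System.out.println(chat);
                    ServletActionContext.getContext().getSession().put("z",
                            chat);
                }
                if ("HINDI".equals(lng)) {
                    translatedText = Translate.execute(chat,
                            Language.HINDI);
                    translatedChat = Translate.execute(
                            (t == null ? "" : t) + (rec == null ? "" : rec),
                            Language.HINDI);
                    // "<br>" is assumed here as well (see above).
                    chat = translatedText + " " + usr + ":" + " "
                            + translatedChat + "" + "<br>";
                    System.out.println(chat);
                    ServletActionContext.getContext().getSession().put("z",
                            chat);
                }
            }
            else {
                // No new message: only re-translate the existing transcript.
                if ("ENGLISH".equals(lng)) {
                    translatedText = Translate.execute(chat,
                            Language.ENGLISH);
                    System.out.println(chat);
                    ServletActionContext.getContext().getSession()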

                            .put("z", translatedText);
                }
                if ("HINDI".equals(lng)) {
                    translatedText = Translate.execute(chat,
                            Language.HINDI);
                    System.out.println(chat);
                    ServletActionContext.getContext().getSession()
                            .put("z", translatedText);
                }
            }
            return SUCCESS;
        }
    }

    Once the user is on the JSP page chat.jsp and another client is online, the
    synchronous conferencing can start. A separate session is started for each client
    when the client logs in.

    Another session to carry out the translation process is created based on the selected
    languages.

    The message is given as input and stored in a String object. This string is then
    translated from the source language to the destination language selected by the
    clients.

    The previous content, if any, is held in the static String variable chat. Its
    translation is stored in translatedText, and the translation of the new message is
    stored in translatedChat. The two are concatenated, together with the sender's name,
    and stored back in chat. Finally, chat is stored in the session variable z, which is
    printed on the JSP page result.jsp. The JSP page chat.jsp uses AJAX to refresh and
    display result.jsp at regular intervals.
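
    The concatenation step described above can be sketched in isolation. This is a minimal
    illustration, not the project's code: the class ChatLogSketch, the Translator interface
    and the upper-casing stand-in translator are all hypothetical, and no real translation
    service is called.

```java
/** Hypothetical sketch of how the chat log grows by one translated message. */
final class ChatLogSketch {

    /** Stand-in for a translation service (an assumption, not the real API). */
    interface Translator {
        String translate(String text);
    }

    /**
     * Translates the existing log and the new message, then joins them
     * in the "log user: message." layout used by the action class.
     */
    static String append(String log, String user, String message, Translator translator) {
        String translatedLog = translator.translate(log);
        String translatedMessage = translator.translate(message);
        return translatedLog + " " + user + ": " + translatedMessage + "." + "\n";
    }

    public static void main(String[] args) {
        // Fake translator: upper-cases the text instead of translating it.
        Translator fake = new Translator() {
            public String translate(String text) {
                return text.toUpperCase();
            }
        };
        String log = append("", "alice", "hello", fake);
        log = append(log, "bob", "hi there", fake);
        System.out.print(log);
    }
}
```

    Because the whole log is passed through the translator on every call, earlier
    messages are translated again each time a new message arrives, which matches the
    behaviour of the action class.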

    A registered client may log in from the JSP page login.jsp. The code for login.jsp
    and Login.java, and the screenshot, are shown below.

    login.jsp

    Login

    #login-box {
        width: 333px;
        height: 352px;
        padding: 58px 76px 0 76px;
        color: #ebebeb;
        font: 12px Arial, Helvetica, sans-serif;
        background: url(images/login-box-backg.png) no-repeat left top;
    }

    #login-box img {
        border: none;
    }

    #login-box h2 {
        padding: 0;
        margin: 0;
        color: #ebebeb;
        font: bold 44px "Calibri", Arial;
    }

    #login-box-name {
        float: left;
        display: inline;
        width: 80px;
        text-align: right;
        padding: 14px 10px 0 0;
        margin: 0px 0 7px 0;
    }

    #login-box-field {
        float: left;
        display: inline;
        width: 230px;
        margin: 0;
        margin: 0 0 7px 0;
    }

    .form-submit {
        float: left;
        margin-left: 110px;
    }

    .form-login {
        width: 205px;
        padding: 10px 4px 6px 3px;
        border: 1px solid #0d2c52;
        background-color: #1e4f8a;
        font-size: 16px;
        color: #ebebeb;
    }

    .login-box-options {
        clear: both;
        padding-left: 87px;
        font-size: 11px;
    }

    .login-box-options a {
        color: #ebebeb;
        font-size: 11px;
    }

    .nits {
        margin-top: 0px;
    }

    Login




    labelposition="left" title="Username" value="" size="30" maxlength="2048"/>


    Remember Me SIGNUP




    Login.java

    package translator;

    import java.sql.*;

    import org.apache.struts2.ServletActionContext;

    import com.opensymphony.xwork2.ActionSupport;

    public class Login extends ActionSupport {

        public Login() {}

        // Form fields populated by Struts from login.jsp
        String email, password, lang;

        public String getLang() {
            return lang;
        }

        public void setLang(String lang) {
            this.lang = lang;
        }

        public String getEmail() {
            return email;
        }

        public void setEmail(String email) {
            this.email = email;
        }

        public String getPassword() {
            return password;
        }

        public void setPassword(String password) {
            this.password = password;
        }

        String e, p, fn;
        Connection con;
        ResultSet rs;
        Statement st;
        int fl = 0; // set to 1 when a matching email/password row is found

        public String execute() {
            try {
                Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
                con = DriverManager.getConnection("jdbc:odbc:mydsn");
                st = con.createStatement();
                rs = st.executeQuery("select * from Table1");
                // Scan every registered user and compare against the submitted credentials.
                while (rs.next()) {
                    e = rs.getString("email");
                    p = rs.getString("password");
                    // Null-safe comparison: a null column value must not abort the scan.
                    if (e != null && p != null && e.equals(email) && p.equals(password)) {
                        fl = 1;
                    }
                }
            } catch (Exception ex) {
                System.out.print(ex);
            } finally {
                // Release the JDBC connection even if the query fails.
                try {
                    if (con != null) {
                        con.close();
                    }
                } catch (SQLException ignored) {
                }
            }

            if (fl == 1) {
                // Remember who logged in and which language was selected.
                ServletActionContext.getContext().getSession().put("ulog", email);
                ServletActionContext.getContext().getSession().put("l", lang);
                return SUCCESS;
            } else {
                return NONE;
            }
        }
    }

    The Voice Interface : CMUSphinx

    Speaking.java

    package edu.cmu.sphinx.speaking;

    import org.apache.struts2.ServletActionContext;

    import edu.cmu.sphinx.frontend.util.Microphone;
    import edu.cmu.sphinx.recognizer.Recognizer;
    import edu.cmu.sphinx.result.Result;
    import edu.cmu.sphinx.util.props.ConfigurationManager;

    public class Speaking {

        public static void main(String[] args) {
            ConfigurationManager cm;
            String resultText = "";

            // Load the configuration named by the first argument,
            // or fall back to the bundled speaking.config.xml.
            if (args.length > 0) {
                cm = new ConfigurationManager(args[0]);
            } else {
                cm = new ConfigurationManager(Speaking.class.getResource("speaking.config.xml"));
            }

            Recognizer recognizer = (Recognizer) cm.lookup("recognizer");
            recognizer.allocate();

            Microphone microphone = (Microphone) cm.lookup("microphone");
            if (!microphone.startRecording()) {
                System.out.println("Cannot start microphone.");
                recognizer.deallocate();
                System.exit(1);
            }

            // Recognize utterances until the process is stopped.
            while (true) {
                Result result = recognizer.recognize();
                if (result != null) {
                    resultText = result.getBestFinalResultNoFiller();
                    System.out.println(resultText);
                    // Hand the recognized text to the chat action through session key "y".
                    // This relies on an active Struts session, so it is expected to work
                    // only when the recognizer runs inside the web application.
                    ServletActionContext.getContext().getSession().put("y", resultText);
                } else {
                    System.out.println("I can't hear what you said.\n");
                }
            }
        }
    }


    The lines below define the frequently tuned properties. They are located at the top of

    the configuration file so that they can be edited quickly.

    Recognizer

    The lines below define the recognizer component that performs speech recognition. They
    define the name and class of the recognizer, Recognizer. This is the class that any
    application should interact with. According to its javadoc, the Recognizer class has two
    properties, 'decoder' and 'monitors'. This configuration file is where the values of
    these properties are defined.

    Monitors: accuracyTracker, speedTracker, memoryTracker

    Decoder


    The 'decoder' property of the recognizer is set to the component called 'decoder',

    which is defined as:

    The decoder component is of class edu.cmu.sphinx.decoder.Decoder. Its property

    'searchManager' is set to the component 'searchManager', defined as:

    The searchManager is of class

    edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager. This class

    performs a simple breadth-first search through the search graph during the decoding

    process to find the best path. This search manager is suitable for small to medium

    sized vocabulary decoding.

    The logMath property is the log math that is used for calculation of scores during the

    search process. It is defined as having the log base of 1.0001. Note that typically the

    same log base should be used throughout all components, and therefore there should

    only be one logMath definition in a configuration file:
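
    To see why a single log base matters, the arithmetic can be sketched in plain Java.
    This is an illustration under stated assumptions, not Sphinx-4's own LogMath class:
    the class LogMathSketch and its method names are hypothetical, and only the base
    1.0001 comes from the text above.

```java
/** Hypothetical sketch of score arithmetic in a log domain with base 1.0001. */
final class LogMathSketch {

    static final double LOG_BASE = 1.0001;

    /** Converts a linear probability to the log domain. */
    static double linearToLog(double probability) {
        return Math.log(probability) / Math.log(LOG_BASE);
    }

    /** Converts a log-domain score back to a linear probability. */
    static double logToLinear(double logScore) {
        return Math.pow(LOG_BASE, logScore);
    }

    public static void main(String[] args) {
        double a = linearToLog(0.5);
        double b = linearToLog(0.25);
        // Multiplying probabilities becomes adding log scores.
        double product = logToLinear(a + b);
        System.out.println("log(0.5)  = " + a);
        System.out.println("log(0.25) = " + b);
        System.out.println("0.5 * 0.25 recovered from the log domain = " + product);
    }
}
```

    Adding two scores is only meaningful when both were computed with the same base,
    which is why the components are expected to share one definition.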


    The linguist of the searchManager is set to the component 'flatLinguist', which again

    is suitable for small to medium sized vocabulary decoding. The pruner is set to the

    'trivialPruner':

    which is of class edu.cmu.sphinx.decoder.pruner.SimplePruner. This pruner performs

    simple absolute beam and relative beam pruning based on the scores of the tokens.
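
    The two pruning rules can be illustrated with a short, self-contained sketch. It is
    not the SimplePruner source: the class BeamPruneSketch, its parameters, and the use
    of bare scores in place of tokens are assumptions made for the example. Scores are
    treated as log-domain values, so the relative beam is a non-positive margin below
    the best score.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of absolute and relative beam pruning over token scores. */
final class BeamPruneSketch {

    /**
     * Keeps at most absoluteBeam scores (absolute beam) and drops any score
     * that falls more than |relativeBeam| below the best one (relative beam).
     */
    static List<Double> prune(List<Double> scores, int absoluteBeam, double relativeBeam) {
        List<Double> sorted = new ArrayList<Double>(scores);
        Collections.sort(sorted, Collections.reverseOrder()); // best score first
        List<Double> kept = new ArrayList<Double>();
        if (sorted.isEmpty()) {
            return kept;
        }
        double threshold = sorted.get(0) + relativeBeam;
        for (double score : sorted) {
            if (kept.size() >= absoluteBeam || score < threshold) {
                break; // everything after this point is worse, so stop
            }
            kept.add(score);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Double> scores = new ArrayList<Double>();
        Collections.addAll(scores, -10.0, -50.0, -12.0, -11.0, -300.0);
        System.out.println(prune(scores, 3, -20.0));
    }
}
```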

    The scorer of the searchManager is set to the component 'threadedScorer', which is of

    class edu.cmu.sphinx.decoder.scorer.ThreadedAcousticScorer. It can use multiple

    threads (usually one per CPU) to score the tokens in the active list. Scoring is one of

    the most time-consuming steps of the decoding process. Tokens can be scored

    independently of each other, so scoring on multiple CPUs can shorten decoding time.

    The threadedScorer is defined as follows:

    The 'frontend' property is the front end from which features are obtained.
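
    The idea of scoring independent tokens on several threads can be sketched with the
    standard java.util.concurrent classes. This is not the ThreadedAcousticScorer
    implementation: the class ParallelScoreSketch, the integer "tokens" and the
    placeholder score function are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: score independent tokens on a fixed pool of threads. */
final class ParallelScoreSketch {

    /** Placeholder score; a real scorer would compare a token with acoustic features. */
    static double score(int token) {
        return -Math.abs(token - 5);
    }

    /** Scores every token on a pool of the given size and returns results in input order. */
    static List<Double> scoreAll(List<Integer> tokens, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Double>> pending = new ArrayList<Future<Double>>();
            for (final Integer token : tokens) {
                pending.add(pool.submit(new Callable<Double>() {
                    public Double call() {
                        return score(token);
                    }
                }));
            }
            List<Double> results = new ArrayList<Double>();
            for (Future<Double> future : pending) {
                results.add(future.get()); // waits for that token's score
            }
            return results;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Scoring was interrupted", ex);
        } catch (ExecutionException ex) {
            throw new IllegalStateException("Scoring failed", ex);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> tokens = new ArrayList<Integer>();
        tokens.add(3);
        tokens.add(5);
        tokens.add(9);
        int threads = Runtime.getRuntime().availableProcessors(); // roughly one per CPU
        System.out.println(scoreAll(tokens, threads));
    }
}
```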

    Finally, the activeListFactory property of the searchManager is set to the component

    'activeList', which is defined as follows:


    JSGF grammars are defined in JSAPI. The class that translates JSGF into a form that

    Sphinx-4 understands is edu.cmu.sphinx.jsapi.JSGFGrammar. (Note that the javadoc for

    this class also describes the limitations of the current implementation.)

    The property 'grammarLocation' can take two kinds of values. If it is a URL, it

    specifies the URL of the directory where JSGF grammar files are to be found.

    Otherwise, it is interpreted as a resource locator. The 'grammarName' property specifies

    the grammar to use when creating the search graph.

    'logMath' is the same log math as the other components.

    The 'dictionary' is the component that maps words to their phonemes. It is almost

    always the dictionary of the acoustic model, which lists all the words that were used

    to train the acoustic model:


    The locations of these dictionary files are specified using the Sphinx-4 resource

    mechanism. The dictionary for filler words like BREATH and LIP_SMACK is

    the file fillerdict.

    For details about the other possible properties, please refer to the javadoc for

    FastDictionary.
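
    Conceptually, such a dictionary is a lookup table from a word to its phoneme
    sequence. The sketch below shows that idea only. It is not the FastDictionary
    class: the name PronunciationDictionarySketch, its methods, and the ARPAbet-style
    sample entries are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a word-to-phonemes lookup table. */
final class PronunciationDictionarySketch {

    private final Map<String, List<String>> entries = new HashMap<String, List<String>>();

    /** Registers the phoneme sequence for a word (case-insensitive). */
    void add(String word, String... phonemes) {
        entries.put(word.toLowerCase(), Arrays.asList(phonemes));
    }

    /** Returns the phonemes for a word, or an empty list if the word is unknown. */
    List<String> lookup(String word) {
        List<String> phonemes = entries.get(word.toLowerCase());
        return phonemes == null ? Collections.<String>emptyList() : phonemes;
    }

    public static void main(String[] args) {
        PronunciationDictionarySketch dictionary = new PronunciationDictionarySketch();
        dictionary.add("hello", "HH", "AH", "L", "OW"); // illustrative entry
        dictionary.add("world", "W", "ER", "L", "D");   // illustrative entry
        System.out.println(dictionary.lookup("Hello"));
        System.out.println(dictionary.lookup("unknown"));
    }
}
```

    A word missing from the table has no phoneme sequence, so a recognizer built on
    it has no pronunciation to match that word against.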

    Acoustic Model

    The next important property of the flatLinguist is the acoustic model, which describes

    the sounds of the language. It is defined as:

    'wsj' stands for the Wall Street Journal acoustic models.

    Sphinx-4 can load acoustic models trained by SphinxTrain. Common models are

    packed into JAR files during the build and placed in the lib folder. The Sphinx3Loader

    class is used to load them. The JAR needs to be included in the classpath.


    The JAR file for the WSJ models is called

    WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar, and is in the sphinx4/lib

    directory.

    CONVERSATION SCREEN SHOTS




    APPENDIX

    Reserved Words

    Reserved words are words that can't be used as identifiers. Many of them are

    keywords that have a special purpose in Java.

    abstract boolean break byte byvalue case cast

    catch char class const continue default do

    double else extends final finally float for

    future generic goto if implements import inner

    instanceof int interface long native new null

    operator outer package private protected public rest

    return short static super switch synchronized this

    throw throws transient try var void volatile

    while

    Packages

    One of the goals of Java is code reusability. Code reuse is supported through
    packages of precompiled classes that can be imported into other projects. This allows
    developers to distribute their work as toolkits without distributing the source code,
    and lets programmers use these classes as easily as those distributed in the
    developer's kit.

    The package location's root is determined by an operating system environment
    variable called classpath. It can have multiple entries, which are used in sequence as
    search paths; the first matching entry is used. For example, the following classpath
    checks the current directory and, if the file isn't there, the bin folder of the
    installed distribution package: classpath=.;C:\Program Files\Java\jdk1.6.0_24\bin;
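
    The "first matching entry wins" rule can be sketched as follows. The class
    ClasspathSketch and its methods are hypothetical, and for simplicity the sketch
    treats every entry as a directory, whereas real classpath entries may also be JAR
    files.

```java
import java.io.File;
import java.util.regex.Pattern;

/** Hypothetical sketch of searching classpath entries in order. */
final class ClasspathSketch {

    /** Splits a classpath string into its entries, preserving search order. */
    static String[] entries(String classpath, String separator) {
        return classpath.split(Pattern.quote(separator));
    }

    /** Returns the file from the first entry that contains it, or null if none does. */
    static File findFirst(String[] entries, String relativePath) {
        for (String entry : entries) {
            File candidate = new File(entry, relativePath);
            if (candidate.exists()) {
                return candidate; // first match wins; later entries are not consulted
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // java.class.path holds the classpath the running JVM was started with.
        String classpath = System.getProperty("java.class.path");
        for (String entry : entries(classpath, File.pathSeparator)) {
            System.out.println(entry);
        }
    }
}
```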


    Java Beans

    Java Beans are precompiled software components (i.e., building blocks) that have been

    designed to be reusable in a variety of different environments. All properties, events,

    and methods of a Bean that are exposed, can be controlled. Beans can be localized for

    global usage. Beans can exchange events with other Beans. Bean designers write

    classes conforming to a specific convention.
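
    A minimal sketch of that convention is given below, using the standard java.beans
    classes: a no-argument constructor, get/set accessor pairs, Serializable, and
    property-change events that other components can listen to. The bean name UserBean
    and its language property are invented for the example. A real bean would normally
    be declared public; it is package-private here only so the snippet compiles in any
    file.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.io.Serializable;

/** Hypothetical bean with one bound property, "language". */
class UserBean implements Serializable {

    private static final long serialVersionUID = 1L;

    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private String language = "ENGLISH";

    /** Beans expose a no-argument constructor so tools can instantiate them. */
    public UserBean() {
    }

    public String getLanguage() {
        return language;
    }

    public void setLanguage(String language) {
        String old = this.language;
        this.language = language;
        // Notifies registered listeners; no event is fired if the value is unchanged.
        changes.firePropertyChange("language", old, language);
    }

    public void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    public void removePropertyChangeListener(PropertyChangeListener listener) {
        changes.removePropertyChangeListener(listener);
    }
}
```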

    Builder tools provide ways to create Beans, palettes of available Beans, worksheets
    for laying out Beans in a GUI, editors to configure Beans, commands to interconnect
    Beans to form an application, and checks for state and behaviour.