Speak to your customers loudly and clearly

Upload: heinz

Post on 12-Jan-2016

DESCRIPTION

Speak to your customers loudly and clearly. Elan Speech mission statement: beyond the words. As a leading world player in Text to Speech, Elan Speech focuses exclusively on the development and marketing of natural-language interfaces.

TRANSCRIPT

Page 1: Speak to your customers loudly and clearly

Speak to your customers loudly and clearly

Page 2: Speak to your customers loudly and clearly

Elan Speech mission statement

Beyond the words

As a leading world player in Text to Speech, Elan Speech focuses exclusively on the development and marketing of natural-language interfaces.

Page 3: Speak to your customers loudly and clearly

Elan Speech brings organisations new ways of interacting with their clients, providing new opportunities to speech-enable their world through revenue-generating applications.

“Our mission is to vocalize content for the end user with efficiency and accuracy, whatever the situation.”

Antoine Kauffeisen, CEO, Elan Speech

Page 4: Speak to your customers loudly and clearly

Worldwide speech provider

Elan Speech profile

Private company, headquartered in Toulouse, France.

Funded by venture capital (raised in 2002): IRDI, Part’Com, WT.

Strong in-house R&D, Elan Sayso™ technology ownership.

Wide range of TTS technologies, covering up to 12 languages with more to come.

Large Partner & Customer network in Europe, Eastern Europe, North America, Latin America, Japan & India.

New, growth-oriented management with a long-term vision and roadmap.

Page 5: Speak to your customers loudly and clearly

Elan Speech background

Elan Speech was created in June 2002 from the assets of the company previously named Elan Informatique.

1980: creation of Elan Informatique

1986: beginning of work on text to speech (LPC technology)

1996: exclusive focus on TTS (diphone concatenation technology: Elan Tempo™)

2000: company sold to Lernout & Hauspie (L&H)

2001: legal battle against L&H, won in November 2001

February 2002: decision to file for Chapter 11-style protection (redressement judiciaire, RJ)

June 2002: creation of Elan Speech, acquisition of all Elan Informatique assets, new management, new capital structure.

July 2002: launch of new high-end TTS technology: Elan Sayso™

Page 6: Speak to your customers loudly and clearly

Elan Speech : figures

More than 3 million licenses in automotive and multimedia applications.

More than 10,000 ports deployed in telephony services.

More than 350 active customers.

12 languages already supported.

3 target markets: Telecom, Multimedia and Mobility.

2 text to speech technology families: Elan Tempo™ and Elan Sayso™


Page 7: Speak to your customers loudly and clearly

Elan Speech: geographical markets

Focus #1: Europe (Germany, France, UK, Spain, Netherlands, Belgium, Italy, Switzerland…)

Focus #2: North America

Focus #3: Latin America (Brazil, Chile, Argentina, Venezuela)

Focus #4: Middle East (Arabic), India, Japan, Korea, Australia

Page 8: Speak to your customers loudly and clearly

Elan Speech’s markets

> Telecom: server-based vocalization of content for multiple users over the phone

- For enterprises: unified messaging, auto attendant, CRM
- For telcos: unified messaging, voice portal, SMS2Voice, directory and reverse directory

> Automotive and mobile terminals: on-board and off-board speech solutions that free the user from reading instructions

- On-board and off-board car navigation systems
- Traffic information
- Telematics, RDS-TMC

> Multimedia: personal software on PC & Mac

- Edutainment software
- Disabilities assistance
- Personal productivity

Page 9: Speak to your customers loudly and clearly

Elan Speech markets’ requirements

> TTS component for Telecom

- High quality
- High density (ports per server)
- High reliability (24/7)
- Support of markup languages and standard APIs
- Support of 3 major OSs: Windows NT/XP/2000, Solaris Sparc, Linux

> TTS component for Automotive and mobile terminals

- High quality
- Low footprint (2 to 16 MB depending on platform)
- Support of multiple RTOSs (VxWorks, WinCE, PSOS, Neutrino, etc.)
- Support for multiple processors
- Support for phonetic input/output and phonetic lexicons of proper names

> TTS component for Multimedia

- High quality
- High flexibility (speed and pitch adjustment, voice customization)
- Availability for PC & Mac platforms
- Support of standard APIs

Page 10: Speak to your customers loudly and clearly

Elan Speech’s direct & indirect business model

[Diagram: value chain from Elan Speech to the end user]

Actors, in order:

- Elan Speech
- VAR & OEM (integrator, platform vendor, publisher)
- End customer (ASP, service provider, telco, car manufacturer)
- End customer (subscriber, mass-market user)

Deliverables along the chain:

- Core technology (TTS licences)
- Solution, platform
- Service / consumer product

Value-added services around the technology: custom voice, quality monitoring, expertise

Page 11: Speak to your customers loudly and clearly

Elan Speech TTS technologies

> Diphone-based concatenative TTS

Advantages:

- High density (over 250 ports per server)
- Small footprint (2 to 6 MB)
- Flexible (pitch and speed adjustment, prosody copying)
- High intelligibility
- 12 languages supported

Disadvantage: robotic-sounding

Markets/applications targeted:

- Automotive & consumer electronics (low footprint)
- High-density, short-ROI, server-based TTS (telephony) with a low cost of ownership
- Multimedia software products

Page 12: Speak to your customers loudly and clearly

Elan Speech TTS technologies

> Unit selection concatenative TTS

Advantages:

- Very high quality
- Highly natural
- Flexible (pitch and speed adjustment, timbre alteration, whisper feature)
- Support for custom voices (“Speech Brand” program)

Disadvantages:

- Lower density (50 ports per server)
- Larger footprint (16 to 70 MB)

Markets/applications targeted:

- High-end telephony applications
- Mass-market telco services (voice portal, news)
- Public address
- High-end multimedia software
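The port densities quoted for the two technologies translate directly into server counts. The sketch below is only an illustration of that arithmetic: the function name and the 1,000-port example are assumptions, and because the diphone figure is given as "over 250", its result is an upper bound on the servers needed.

```python
import math

# Port densities quoted on the slides (ports per server).
# "diphone" is stated as "over 250", so 250 is a conservative value.
PORTS_PER_SERVER = {"diphone": 250, "unit_selection": 50}


def servers_needed(ports: int, technology: str) -> int:
    """Return the minimum number of servers for the requested port count."""
    if ports < 0:
        raise ValueError("ports must be non-negative")
    return math.ceil(ports / PORTS_PER_SERVER[technology])


if __name__ == "__main__":
    # Hypothetical deployment size, chosen only for illustration.
    for technology in PORTS_PER_SERVER:
        print(technology, servers_needed(1000, technology))
```

For the same port count, the unit selection engine needs about five times as many servers, which is the density trade-off listed above.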

Page 13: Speak to your customers loudly and clearly

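Comparison of Elan’s TTS technologies

[Diagram: two processing chains side by side]

Diphone chain: Pre-processing -> Text normalization -> Phonetic transcription -> Prosody calculation -> Synthesizer -> Audio output

Unit selection chain: Pre-processing -> Text normalization -> Phonetic transcription -> Unit selection -> Decoder -> Audio output

Resources shown: abbreviation & exception lexicon, diphone database, units database, Elan Studio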
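The two chains share their first three stages and differ only in the back end. The sketch below models that structure with placeholder stages; all names are hypothetical and no real synthesis is performed, so it shows only how a common front end can feed two different back ends.

```python
from typing import Callable, List

Stage = Callable[[str], str]

# Stage names taken from the comparison slide.
SHARED_FRONT_END = ["pre-processing", "text normalization", "phonetic transcription"]
DIPHONE_BACK_END = ["prosody calculation", "synthesizer"]
UNIT_SELECTION_BACK_END = ["unit selection", "decoder"]


def make_stage(name: str) -> Stage:
    """Return a placeholder stage that only appends its name to a trace."""
    def stage(trace: str) -> str:
        return f"{trace} -> {name}" if trace else name
    return stage


def build_pipeline(back_end: List[str]) -> List[Stage]:
    """Combine the shared front end with a technology-specific back end."""
    return [make_stage(name) for name in SHARED_FRONT_END + back_end]


def run(pipeline: List[Stage]) -> str:
    """Run every stage in order and return the resulting trace."""
    trace = ""
    for stage in pipeline:
        trace = stage(trace)
    return trace


if __name__ == "__main__":
    print(run(build_pipeline(DIPHONE_BACK_END)))
    print(run(build_pipeline(UNIT_SELECTION_BACK_END)))
```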

Page 14: Speak to your customers loudly and clearly

Positioning of the two technologies

[Chart: quality / naturalness against footprint (axis marks at 4 MB, 12 MB and 32 MB), with a human speaker as the quality reference]

- Elan Tempo™: 2-6 MB
- Elan Sayso™ Embedded: 10-16 MB
- Elan Sayso™: 25-50 MB

Page 15: Speak to your customers loudly and clearly

Languages available with Elan Tempo™ technology

- American English: male, female
- British English: male, female
- French: male, female
- German: male, female
- Italian: male, female
- Spanish: male, female
- Polish: male
- Russian: male
- Dutch: male
- Brazilian Portuguese: male, female
- Latin American Spanish: male
- Arabic: male

Page 16: Speak to your customers loudly and clearly

R&D approach : Automation & Tools

> Elan Studio: a strong set of R&D tools

Advantages:

- Automates most R&D tasks
- Built-in signal processing
- Built-in linguistic analysis
- Built-in phonetic analysis
- Built-in database generation
- Automatic segmentation
- Voice factory
- Fast and easy tuning and improvement
- Optimization tools

= Key component for R&D to roll out languages and voices rapidly.

Page 17: Speak to your customers loudly and clearly

Elan Sayso™ language/voice roadmap

(Dates are in DD/MM/YY format.)

Language              Gender  Voice samples  Interactive demo  Product GA
French                Female  Available      Available         Available
US English            Female  Available      Available         Available (V1.0)
German                Female  Available      Available         Available
Spanish (Castilian)   Female  Available      30/06/03          22/07/03
Brazilian Portuguese  Female  Available      30/07/03          20/08/03
Italian               Female  14/08/03       28/08/03          15/09/03
UK English            Male    Available      10/09/03          01/10/03
Polish                Female  30/08/03       21/11/03          07/12/03
Arabic                Male    01/09/03       28/11/03          15/12/03
UK English            Female  Available      15/11/03          20/12/03
US English            Male    01/08/03       28/11/03          30/12/03

Page 18: Speak to your customers loudly and clearly

Elan Speech products framework

Common product framework for Elan Tempo™ and Elan Sayso™, providing full compatibility

[Diagram: layered framework]

- Unit selection engine: Pre-processing -> Text normalization -> Phonetic transcription -> Unit selection -> Decoder (HNM)
- Prosody-model engine: Pre-processing -> Text normalization -> Phonetic transcription -> Prosody model -> Synthesizer
- Speech manager
- Product layer (OS-related level, native API): SAPI 4, SAPI 5, NSC API, NVIF, JavaSpeech
- Audio layer

5 APIs supported, a 6th to be discussed

Page 19: Speak to your customers loudly and clearly

Elan Speech’s market (1)

Telecom: content vocalisation solutions for operators and enterprises.

Applications:

- Customer service automation
- IVR
- Voice portal
- SMS to voice
- Unified messaging and e-mail reading

Elan Speech offer: Elan Sayso™ Telecom & Elan Tempo™ Telecom

> Multilingual, multi-channel, carrier-grade TTS engine
> Client-server architecture, heterogeneous architecture supported
> Load balancing (multi-server architecture), centralized supervision
> Dynamic user lexicons (abbreviations, exceptions)
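A dynamic user lexicon can be pictured as a table of abbreviations and exceptions that is applied to the text before synthesis and can be updated while the service runs. The sketch below is a generic illustration under that assumption; the function name and the sample entries are hypothetical, and it does not reproduce Elan's lexicon format or API.

```python
import re
from typing import Dict


def expand_with_lexicon(text: str, lexicon: Dict[str, str]) -> str:
    """Replace whole-token lexicon entries in text, ignoring case."""
    if not lexicon:
        return text
    # Try longer entries first so a short entry cannot pre-empt a longer one.
    keys = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(key) for key in keys) + r")(?!\w)",
        re.IGNORECASE,
    )
    lowered = {key.lower(): value for key, value in lexicon.items()}
    return pattern.sub(lambda match: lowered[match.group(1).lower()], text)


if __name__ == "__main__":
    # Hypothetical entries; a deployed lexicon would be application-specific.
    user_lexicon = {"Dr.": "Doctor", "IVR": "interactive voice response"}
    print(expand_with_lexicon("Dr. Lee called the IVR", user_lexicon))
    # The lexicon is an ordinary dict, so entries can be added at run time.
    user_lexicon["TTS"] = "text to speech"
    print(expand_with_lexicon("TTS demo", user_lexicon))
```

Matching on token boundaries keeps an entry such as "IVR" from rewriting part of a longer word.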

Page 20: Speak to your customers loudly and clearly

Elan Speech’s market (2)

Elan Sayso™ Telecom & Elan Tempo™ Telecom

> Available for Windows NT/2000/XP, Solaris Sparc, Linux x86
> Support for 12 languages, with male and female voices
> Support for 5 APIs (SAPI4, SAPI5, NVIF, Elan NSC API, JavaSpeech)
> Cross-platform integration with Elan NSC API
> Client-server architecture, heterogeneous architecture supported
> Load balancing (multi-server architecture), centralized supervision
> Dynamic user lexicons (abbreviations, exceptions)
> Specific modules included:

- E-mail pre-processing
- Automatic language identification
- Markup languages supported: SSML (VoiceXML), JSML
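As an illustration of the kind of markup meant by SSML support, the sketch below builds a small SSML 1.0 document with Python's standard library. The `speak` and `prosody` elements and their `rate` and `pitch` attributes come from the W3C SSML specification; the function name is hypothetical, and which SSML features the Elan engines accept is not stated in the slides.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_ssml(text: str, lang: str = "en-US",
               rate: str = "medium", pitch: str = "medium") -> str:
    """Wrap text in an SSML <speak> document with one <prosody> element."""
    # Serialize SSML elements without a prefix (default namespace).
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang},
    )
    prosody = ET.SubElement(
        speak, f"{{{SSML_NS}}}prosody", {"rate": rate, "pitch": pitch}
    )
    prosody.text = text  # ElementTree escapes &, < and > on output
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("You have two new messages.", rate="slow", pitch="high"))
```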

Page 21: Speak to your customers loudly and clearly

Elan Speech markets (3)

Multimedia & Web: products for personal communication and content enhancement.

Applications:

- Edutainment software
- Aid for the disabled
- Personal productivity
- Personal Web assistant (agent)
- Voice-enabled tutorials
- Consumer electronics vocal interface

Specific support:

> Elan Sayso™ for Multimedia & Elan Tempo™ for Multimedia, TTS software components for Windows and Mac OS X platforms
> Elan Sayso™ PocketSpeech & Elan Tempo™ PocketSpeech, TTS software components for Pocket PC

Page 22: Speak to your customers loudly and clearly

Elan Speech markets (4)

Automotive & mobile devices: multi-platform TTS for compact embedded solutions.

Applications:

- Embedded navigation aid
- Traffic information
- Navigation systems for PDAs
- Telematics services
- Vocal interface on professional devices
- Public address services

Elan Speech offer:

> A wide range of ports covering more than 10 RTOSs and 20 processors, specifically adapted to customers’ platforms
> Pocket Speech, a specific offer for PDAs running Windows CE

Page 23: Speak to your customers loudly and clearly

Elan Speech markets (5)

Elan Sayso™ PocketSpeech, Elan Tempo™ PocketSpeech

> Multilingual TTS engine for PDA-based applications
> Support for both Tempo & Sayso technologies
> Available for WinCE 2.xx, WinCE 3.0 / PocketPC 2002 / WinCE.NET
> Support for 8 languages, with male and female voices
> Support for 3 APIs (SAPI4, SAPI5, Elan NSC API)
> Tempo PocketSpeech™: small-footprint engine, high quality: 3 to 6 MB
> Sayso PocketSpeech™: high quality and high naturalness: 8 to 16 MB

Page 24: Speak to your customers loudly and clearly

Elan Speech markets (6)

Elan Sayso™ Embedded, Elan Tempo™ Automotive

> Multilingual TTS engine for embedded platforms
> Supports Tempo technology in 8 languages with male & female voices
> Available for WinCE 2.xx, WinCE 3.0 Automotive, QNX, Neutrino, VxWorks, PSOS, µITRON, RTXC, Linux Embedded
> On Intel x86, Motorola 68332, Motorola 68360, Motorola PowerPC, Hitachi SuperH (SH3, SH4), Philips TriMedia, OKI 763X, OKI ML2110, StrongARM, MIPS…
> Support for 3 APIs (SAPI4, SAPI5, Elan NSC API)
> Unlimited vocabulary (names, numbers and currencies, dates, free text, e-mail, etc.)
> High-quality voice, smooth and natural intonation with concatenative synthesis
> Voice speed and voice pitch control
> Female and male voices
> User abbreviation lexicon for each language
> Text tags
> Phonetic input/output (SAMPA, IPA)

Page 25: Speak to your customers loudly and clearly

A-TTS (Applicative Text-to-Speech) for a mix of prompts and Elan Sayso™

[Diagram: A-TTS processing chain]

Chain: Text input -> Pre-processing -> Text normalization -> Phonetic transcription -> Unit selection -> Decoder (HNM) -> Audio output

Resources:

- Textual abbreviations & exceptions
- Generic units database
- Sound exceptions (prompts): application dependent, encoded in HNM frames

> A-TTS: Applicative Text to Speech means that prompts are fully tunable and updatable (application corpus) and treated like “Sound Exceptions” within the generic TTS system.
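One way to picture "sound exceptions" is as a lookup that runs before generic synthesis: phrases that exist as recorded prompts are played from the prompt store, and everything else goes to the TTS engine. The sketch below shows only that routing decision; the function name, the greedy longest-match strategy and the sample phrases are assumptions, not Elan's implementation.

```python
from typing import Iterable, List, Tuple

Segment = Tuple[str, str]  # (source, text) where source is "prompt" or "tts"


def plan_utterance(text: str, prompts: Iterable[str]) -> List[Segment]:
    """Split text into recorded-prompt segments and TTS segments.

    Uses greedy longest match on whitespace-separated words, ignoring case.
    Punctuation attached to a word prevents a match (a known simplification).
    """
    words = text.split()
    keys = {" ".join(phrase.lower().split()) for phrase in prompts}
    max_len = max((len(key.split()) for key in keys), default=0)
    plan: List[Segment] = []
    pending: List[str] = []  # words waiting to be sent to generic TTS
    i = 0
    while i < len(words):
        match_len = 0
        for n in range(min(max_len, len(words) - i), 0, -1):
            if " ".join(words[i:i + n]).lower() in keys:
                match_len = n
                break
        if match_len:
            if pending:
                plan.append(("tts", " ".join(pending)))
                pending = []
            plan.append(("prompt", " ".join(words[i:i + match_len])))
            i += match_len
        else:
            pending.append(words[i])
            i += 1
    if pending:
        plan.append(("tts", " ".join(pending)))
    return plan


if __name__ == "__main__":
    recorded = ["your balance is", "thank you for calling"]  # hypothetical prompts
    print(plan_utterance("Your balance is 42 euros", recorded))
```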

Page 26: Speak to your customers loudly and clearly

Applicative TTS and recorded messages included in the TTS system

[Diagram: footprint reduction and prompt storage for A-TTS]

Footprints shown:

- Sayso: 30-50 MB
- Sayso Embedded: 10-16 MB
- 6 to 10 MB
- Size ratio shown: 1/3 to 1/4

Reduction settings shown:

- HNM frames of 10 ms to 15 ms, 50% of units removed (pruning)
- HNM frames of 15 ms to 20 ms, more than 70% of units removed (pruning)

Prompts:

- Recorded prompts for applicative TTS
- Prompts generated with the full Elan Sayso version
- Recorded or TTS-generated applicative prompts are stored at 1.6 Kbps (22 kHz, 15 ms frames)

Page 27: Speak to your customers loudly and clearly

D-TTS (Distributed Text-to-Speech) for Web and telematic applications

[Diagram: distributed architecture]

Server side:

- Elan Sayso TTS server
- Application (server)
- Servlet (Java security)
- GPRS/UMTS gateway

Clients:

- ActiveX client
- Java applet client
- Embedded Java client

Links:

- HTTP
- TCP/IP socket over an Internet connection
- TCP/IP socket over GPRS/UMTS

Less than 16 Kbps of bandwidth is used for TTS streamed at a 22 kHz sampling rate.

Page 28: Speak to your customers loudly and clearly

Elan HNM coder

Coder performance for applicative TTS and distributed TTS

[Chart: bit rate in Kb/s (axis from 0 to 70) at sampling rates of 44 kHz, 22 kHz, 16 kHz, 11 kHz and 8 kHz, with one series each for skip 1, skip 2, skip 3 and skip 4]

With Sayso Embedded, at a 22 kHz sampling rate, 1 hour of recorded prompts for applicative TTS takes less than 6.5 MB.

Skip 1: 5 ms frames, transparent

Skip 2: 10 ms frames, no audible change

Skip 3: 15 ms frames, slightly degraded acoustic quality, hard to perceive

Skip 4: 20 ms frames, audibly degraded acoustic quality, acceptable
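The storage figure can be cross-checked against the streaming bandwidth quoted elsewhere in the deck. The sketch below converts "6.5 MB per hour" into an average bit rate, assuming decimal units (1 MB = 1,000,000 bytes, 1 kbit = 1,000 bits), which the slides do not specify; the function name is illustrative.

```python
def average_bitrate_kbps(size_megabytes: float, duration_seconds: float) -> float:
    """Average bit rate in kbit/s for a given size and duration.

    Assumes 1 MB = 10**6 bytes and 1 kbit = 1000 bits.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return size_megabytes * 1_000_000 * 8 / duration_seconds / 1000


if __name__ == "__main__":
    rate = average_bitrate_kbps(6.5, 3600)  # one hour of prompts in 6.5 MB
    print(f"{rate:.1f} kbit/s")
```

Under these assumptions the result is about 14.4 kbit/s, in line with the "less than 16 Kbps" quoted for streamed 22 kHz TTS.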

Page 29: Speak to your customers loudly and clearly

Elan Speech Web solutions

> “Digalo Cast”

- Distributed TTS over an IP network (D-TTS)
- High-quality, server-based Sayso and Tempo TTS
- Small-footprint remote client, native Java (100 KB)
- Low-bandwidth connection (<15 Kbps)
- “HiFi” playback quality (22 kHz, no degradation)
- Lip-synchronization tags for animated web agents (3D agents)

[Diagram: Digalo Cast architecture]

- Java client for Digalo Cast (100 KB)
- Digalo Cast server running Elan Tempo or Elan Sayso technology, serving 30 to 300 users simultaneously
- IP connection: less than 15 Kbps of bandwidth used; TCP, UDP or HTTP encapsulated

Page 30: Speak to your customers loudly and clearly

Elan Speech tools

Elan Virtual Speaker: voice prompt creation tool for telephony or multimedia applications

• Quick and easy to use, available for audio updates 24/7

- Automatic generation
- Batch processing
- Editing features
- Multiple output formats
- Adjustable pitch and speed
- 8 kHz and 22 kHz sampling
- A-law, µ-law

Page 31: Speak to your customers loudly and clearly

Elan Speech Technology tools

Elan Prosel

Applies natural intonation to synthetic speech

Elan Lexitool

Edits and enriches exception and abbreviation lexicons

Page 32: Speak to your customers loudly and clearly

Elan Speech services

Proprietary voice: “Speech Brand”

An exclusive TTS voice based on an existing speaker of your choice. Based on Elan Sayso technology, the new voice will mimic the timbre, the intonations and the accent of the original speaker.

Technology adaptation & porting

Elan’s core technology adapted to a specific platform (processor, RTOS), especially for embedded TTS.

Quality monitoring

A global service offer to continuously improve the result of TTS for a specific application: audit of written content, specialization of the TTS.

Page 33: Speak to your customers loudly and clearly

Speech Brand: the process for creating custom Sayso voices

[Diagram: process steps within the Elan Studio voice factory framework]

- Text corpus (5 weeks). Reduced if the language is already available with Sayso.
- Recordings (4 weeks). Requires the speaker. Might be reduced for Latin languages.
- Autosegmentation (2 weeks). Computer processing.
- Segmentation verification, manual (2 to 4 months depending on size). Longest part, required to achieve high quality. Currently investigating how to reduce and automate part of this task.
- Database generation and optimization (2 weeks).
- Ready for integration.

Overall: a 5 to 7 month process.
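The "5 to 7 month" total can be checked by adding up the step durations. The sketch below assumes the steps run strictly one after another and uses an average of 52/12 weeks per month; both are simplifying assumptions, and the names are illustrative.

```python
from typing import Tuple

WEEKS_PER_MONTH = 52 / 12  # averaging assumption, about 4.33 weeks

# Step durations in weeks, as quoted on the slide.
FIXED_STEPS_WEEKS = {
    "text corpus": 5,
    "recordings": 4,
    "autosegmentation": 2,
    "database generation and optimization": 2,
}
# Manual segmentation verification is quoted in months (min, max).
MANUAL_VERIFICATION_MONTHS = (2, 4)


def total_duration_months() -> Tuple[float, float]:
    """Return (minimum, maximum) total duration in months, steps in sequence."""
    fixed_months = sum(FIXED_STEPS_WEEKS.values()) / WEEKS_PER_MONTH
    low, high = MANUAL_VERIFICATION_MONTHS
    return fixed_months + low, fixed_months + high


if __name__ == "__main__":
    shortest, longest = total_duration_months()
    print(f"{shortest:.1f} to {longest:.1f} months")
```

The fixed steps add up to 13 weeks, roughly 3 months, so the manual verification step accounts for the whole spread between 5 and 7 months.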

Page 34: Speak to your customers loudly and clearly

Elan’s marketing tools

Elan’s partner program: Web, news and tradeshow support

Elan news & EvenTTS: a monthly newsletter dedicated to customers’ applications and deployments, sent out to a highly focused database of 13,000 e-mail addresses

Digalo.com: a website dedicated to promoting consumer speech-enabled applications

Joint marketing agreements: a program to refer qualified leads and prospects

Page 35: Speak to your customers loudly and clearly

Telecom references

Page 36: Speak to your customers loudly and clearly

Automotive references

Page 37: Speak to your customers loudly and clearly

Beyond the words