siva voice


Upload: balunarasimhap

Post on 10-Apr-2018


Page 1: Siva Voice

8/8/2019 Siva Voice

http://slidepdf.com/reader/full/siva-voice 1/5

Embedded ViaVoice

A single, fully integrated architecture

The modular Embedded ViaVoice architecture provides fully integrated automatic speech recognition (ASR), speech synthesis through text-to-speech (TTS) and other technology engines, supporting the full-feature requirements of an application with minimal processor utilization and memory requirements. A single architecture with consistent application interfaces enables Embedded ViaVoice to support solutions from low-resource Personal Navigation Devices to high-performance in-vehicle solutions to Java technology. This single-architecture implementation is a particular advantage to applications that need to span a broad range of platform capacities, as well as to solutions where significant growth in capacity is a requirement.

A broad language base

Embedded ViaVoice is available in a broad set of languages to provide speech-recognition and speech-synthesis capabilities through the support of a worldwide network of IBM speech research and development laboratories. High-quality embedded concatenative TTS (eCTTS) capabilities provide more-human-sounding speech synthesis to support more-advanced applications. To learn more about IBM's continuing development of other language models for ASR and voices for TTS, as well as its continuous improvement of existing languages, contact your IBM representative.

High recognition accuracy

The Embedded ViaVoice recognition engine is based on small units of speech, called phonemes. This phoneme-based model uses finite state grammars to support highly accurate and noise-robust continuous speech recognition. Through a comprehensive and vigorous research and development effort, IBM has significantly reduced the word-error rate of Embedded ViaVoice over the past several years.

Large vocabulary recognition

The maximum vocabulary supported by Embedded ViaVoice has grown by a factor of 25 over the past four years. Embedded ViaVoice supports the recognition of lists of a virtually limitless number of words, bounded only by the platform's processing and memory resources.

Services and workshops

Porting and integration services include porting to a new operating system, recompiling for a different processor architecture or modifying the embedded audio layer to use a new driver or codec. Alternatively, with a device adaptation kit, IBM supplies the tools that enable you to perform and test the audio adaptations yourself.

IBM can provide on-site classes to application developers about the Embedded ViaVoice Software Developer Kit (SDK). Customized development workshops are also available to provide skills transfer and instruction on application development, evaluation methodology and tools, so you can design and tune your own system to suit your organization's business needs.

Support for multiple programming models

Many small-footprint, embedded applications use Embedded ViaVoice through its C/C++ language application interface.

IBM expertise in voice

IBM's sustained research and development investment in speech recognition and synthesis for more than 30 years has resulted in multiple advances, including Embedded ViaVoice. IBM Embedded ViaVoice software enables you to gain competitive advantage in today's fast-moving marketplace, and offers a clear path to future growth through a single, fully integrated architecture.

Functionality

• Portable, event-driven architecture
• Fully integrated automatic speech recognition (ASR) and text-to-speech (TTS)
• Low processor utilization
• Small static and dynamic footprint
• Scalable, modular architecture
• Single-threading and multithreading support
• Runtime event notification
• Unsupervised adaptation to speakers
• Optional speaker enrollment
• Phoneme-based
• Speaker-independent

Accuracy and robustness

• Very large vocabulary recognition, exceeding 200 000 spoken words in real time
• Freeform commands combining statistical language models and semantic interpretation
• Tunable rejection to address nonspeech sounds and out-of-vocabulary words
• Advanced front-end noise suppression
• Support for vendor-supplied noise suppression
• Enhanced speech and silence detection
• Continuous and discrete digit recognition
• Spell-mode capable
• Word and phrase confidence scoring
• Detection and adaptation for gender
• Pronunciation confusability reporting
• N-Best and homonym support
• Grammar weights
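To make the interplay of N-Best results, confidence scoring and tunable rejection concrete, here is a minimal sketch in Python. It is not the Embedded ViaVoice API: the `Hypothesis` type, the `pick_result` function, the 0.0 to 1.0 confidence scale and the example scores are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hypothesis:
    """One entry of a hypothetical N-Best list."""
    text: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)


def pick_result(n_best: List[Hypothesis], reject_below: float) -> Optional[Hypothesis]:
    """Return the highest-confidence hypothesis, or None when it is rejected.

    reject_below is the tunable rejection threshold: raising it rejects more
    nonspeech sounds and out-of-vocabulary words, at the cost of rejecting
    more valid utterances.
    """
    if not n_best:
        return None
    best = max(n_best, key=lambda h: h.confidence)
    return best if best.confidence >= reject_below else None


# Made-up scores for illustration only.
n_best = [
    Hypothesis("call home", 0.82),
    Hypothesis("call Hong", 0.11),
    Hypothesis("cold foam", 0.04),
]
accepted = pick_result(n_best, reject_below=0.5)   # best entry clears the threshold
rejected = pick_result(n_best, reject_below=0.9)   # threshold too strict, so None
```

An application would typically re-prompt the user when the result is `None` rather than act on a low-confidence guess.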

Solution-development tools

• Eclipse technology-based IBM Embedded Voice Toolkit, Version 6.0, including a customized integrated development environment (IDE) for embedded speech developers
• Application-creation wizards
• Grammar editor and templates
• Vocabulary testing and analysis
• Pronunciation compiler and variant generator
• Gain-control tuning tool
• Tracing and debugging interface
• Device adaptation kit

Flexibility

• Broad language coverage
• Additional languages in development
• JSAPI and extensions
• Automatic gain adjustment
• Multiple listening modes, including push to talk, push to activate and always listening
• Run-time language switching
• Run-time pronunciation manipulation
• Scalable acoustic models
• 11/16/22 kHz sampling rates
• Signal-to-noise (SNR) feedback
• Voice tags from text or acoustic input
• Embedded baseform generation
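As an illustration of what signal-to-noise feedback measures, the sketch below estimates SNR in decibels from a block of speech samples and a block of noise-only samples. How Embedded ViaVoice computes or reports SNR is not described here; the function names and the synthetic sample values are assumptions.

```python
import math
from typing import Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Average power of a block of audio samples (mean of squared amplitudes)."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(speech: Sequence[float], noise: Sequence[float]) -> float:
    """Signal-to-noise ratio in dB: 10 * log10(P_speech / P_noise)."""
    p_noise = mean_power(noise)
    p_speech = mean_power(speech)
    if p_noise == 0:
        return math.inf       # no measurable noise
    if p_speech == 0:
        return -math.inf      # no measurable speech
    return 10 * math.log10(p_speech / p_noise)


# Synthetic blocks: speech amplitude is 10x the noise amplitude,
# so the power ratio is 100, which corresponds to 20 dB.
speech_block = [1000.0, -1000.0] * 80
noise_block = [100.0, -100.0] * 80
estimate = snr_db(speech_block, noise_block)
```

An application could use such an estimate to prompt the user to speak closer to the microphone or to reduce background noise.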

Grammar and compiler support

• Scalable vocabulary support
• Built-in grammar compiler
• Finite state grammars
• Multiple grammar formats, including Speech Recognition Grammar Specification (SRGS), Backus-Naur Form (BNF) and Java Speech Grammar Format (JSGF)
• Annotations
• Statistical language models
• Dynamic and unlimited vocabularies
• Precompiled and runtime grammars
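A small example may help show what a finite state grammar is. The sketch below pairs a short JSGF grammar with a hand-built state machine that accepts the same phrases. It is an illustration only: the grammar name, the rule and the transition table are invented, and the grammar compiler shipped with Embedded ViaVoice is not modeled.

```python
# A small JSGF grammar (illustrative; the grammar and rule names are made up).
JSGF_SOURCE = """#JSGF V1.0;
grammar commands;
public <command> = (call | dial) (home | office);
"""

# The same language written by hand as a finite state machine:
# state 0 --call|dial--> state 1 --home|office--> state 2 (final).
TRANSITIONS = {
    0: {"call": 1, "dial": 1},
    1: {"home": 2, "office": 2},
}
FINAL_STATES = {2}


def accepts(utterance: str) -> bool:
    """Return True if the word sequence is a complete phrase of the grammar."""
    state = 0
    for word in utterance.lower().split():
        state = TRANSITIONS.get(state, {}).get(word)
        if state is None:
            return False  # no transition for this word: out of grammar
    return state in FINAL_STATES
```

For example, `accepts("dial office")` is true, while `accepts("call")` is false because the phrase stops before reaching a final state. Restricting recognition to the paths of such a machine is what keeps the search space small.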

Speech synthesis (TTS)

• Unlimited pronunciation domain
• Multiple voices
• Customizable voices
• Dictionary support
• Indexing support, and pause-and-resume capabilities
• Adjustable performance-tuning parameters
• API for phoneme generation
• Manual override of automatic synthesis
• SSML support
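To show what SSML input to a TTS engine looks like, the following sketch builds a short prompt with Python's standard `xml.etree.ElementTree` module. The `speak`, `break` and `prosody` elements come from the W3C SSML 1.0 specification; which elements Embedded ViaVoice honors is not stated here, and the `build_prompt` function and the prompt text are assumptions.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"  # W3C SSML 1.0 namespace


def build_prompt(street: str, pause_ms: int = 500) -> str:
    """Build a small SSML document: a sentence, a pause, then a slower sentence."""
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": "en-US"},
    )
    speak.text = "In 300 meters, turn left."
    # <break> inserts a pause of the given duration.
    ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    # <prosody rate="slow"> asks the synthesizer to slow down for this span.
    slow = ET.SubElement(speak, "prosody", {"rate": "slow"})
    slow.text = f"Then continue on {street}."
    return ET.tostring(speak, encoding="unicode")


ssml = build_prompt("Main Street")
```

The resulting string would be passed to a synthesizer in place of plain text, letting the application control pauses and speaking rate.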

Processors currently supported*

• Hitachi SH4
• Motorola PowerPC
• IBM PowerPC® processor
• Intel® x86
• Intel StrongARM
• Intel XScale
• Blackfin 539 DSP
• MIPS

*Others can be added based on customer requirements.

Operating Systems Supported

• Windows XP
• Windows 2000
• Windows CE / Windows Mobile
• QNX
• Linux
• Embedded Linux
• T-Engine
• MicroItron
• VxWorks
• RTXC

Languages Offered

Automatic Speech Recognition (ASR)

• US English
• North American Spanish
• Canadian French
• UK English
• French
• Italian
• German
• Spanish
• Dutch
• Japanese
• Mandarin Chinese
• European Portuguese
• Swedish
• Korean

Concatenative Text-to-speech (eCTTS)

• US English
• North American Spanish
• Canadian French
• UK English
• German
• French
• Italian
• Spanish
• Japanese
• Dutch

Formant Text-to-speech

• US English
• North American Spanish
• Canadian French
• UK English
• German
• French
• Italian
• Spanish
• Japanese
• Dutch
• Simplified Chinese
• Brazilian Portuguese
• Korean
• Traditional Chinese