Speech Recognizers & Generators

Let’s Get Started…

Presented by: P. Kahoro

Presented to: Prof P. Okanda

Speech Recognizers: What are they?

Speech is the vocalized form of human communication.

In computer science and electrical engineering, speech recognition (SR), also known as "automatic speech recognition" (ASR), is the translation of spoken words or dictation into text.

Speech recognition has evolved considerably over the past few years. Early systems worked in discrete dictation mode, where you had to pause between spoken words. Today’s systems handle continuous dictation. They have also become smarter, applying grammar rules to work out the meaning of what is being said.

Terms and Concepts

• Utterances

• Pronunciation

• Grammar

• Speaker Dependent System

• Speaker Independent System

• Training

• Accuracy

Terms & Concepts

Utterances:

An utterance is any stream of speech between two periods of silence. Silence delineates the start and end of an utterance. An utterance can be a single word, or it can contain multiple words (a phrase or a sentence).
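The idea of silence delineating utterances can be sketched in a few lines of Python. This is only an illustration: the function name `split_utterances`, the amplitude `threshold`, and the `min_silence` length are assumptions made for the example, and real recognizers use more robust endpoint detection.

```python
def split_utterances(samples, threshold=0.1, min_silence=3):
    """Return (start, end) index pairs for runs of speech separated by silence.

    A sample counts as silence when its absolute amplitude is below
    `threshold`; an utterance ends once `min_silence` consecutive silent
    samples have been seen. `end` is exclusive.
    """
    utterances = []
    start = None      # index where the current utterance began
    silent_run = 0    # consecutive silent samples seen inside an utterance
    for i, sample in enumerate(samples):
        if abs(sample) >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence:
                # Close the utterance at the first sample of the silent run.
                utterances.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
    if start is not None:
        # The signal ended while an utterance was still open.
        utterances.append((start, len(samples) - silent_run))
    return utterances


# Toy amplitude values (hypothetical): two bursts of speech with silence between.
signal = [0, 0, 0.5, 0.6, 0, 0.4, 0, 0, 0, 0.7, 0.8, 0]
print(split_utterances(signal))
```

The single quiet sample inside the first burst is shorter than `min_silence`, so it does not split the utterance.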

Pronunciations:

One piece of information that the speech recognition engine uses to process a word is its pronunciation, which represents what the speech engine thinks a word should sound like.

Words can have multiple pronunciations associated with them. For example, the word “the” has at least two pronunciations in U.S. English: “thee” and “thuh”.

Cont…

Grammar: Grammars define the domain, or context, within which the recognition engine works. The engine compares the current utterance against the words and phrases in the active grammars. If the user says something that is not in the grammar, the speech engine will not be able to understand it correctly. For this reason, speech engines usually have a very large grammar.
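As a rough illustration of comparing an utterance against the active grammar, the sketch below works on text that has already been transcribed. A real engine matches audio against the grammar, so this is a simplification; the names `ACTIVE_GRAMMAR`, `normalize`, and `recognize` are hypothetical, and the phrases are taken from the examples elsewhere in this presentation.

```python
import re

# Hypothetical active grammar: the only phrases the engine will accept.
ACTIVE_GRAMMAR = {
    "dial office",
    "dial home",
    "how far is it to the motorway junction",
}


def normalize(utterance):
    """Lowercase the text and drop punctuation so only the words are compared."""
    return " ".join(re.findall(r"[a-z0-9']+", utterance.lower()))


def recognize(utterance, grammar=ACTIVE_GRAMMAR):
    """Return the matching grammar phrase, or None if the utterance is out of grammar."""
    text = normalize(utterance)
    return text if text in grammar else None


print(recognize("Dial office!"))   # in grammar
print(recognize("Call the office"))  # out of grammar, so not understood
```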

Accuracy: The ability of a recognizer can be examined by measuring its accuracy, that is, how well it recognizes utterances.
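The slides do not say which accuracy measure is meant. One commonly used measure is the word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the recognized text into the reference text, divided by the number of reference words. The sketch below is a minimal version under that assumption; the function name is made up for the example.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the two texts, divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# One of three reference words was dropped by the (hypothetical) recognizer.
print(word_error_rate("dial the office", "dial office"))
```

A lower value is better; 0.0 means the recognized text matches the reference word for word.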

Training:

Some speech recognizers have the ability to adapt to a speaker. When the system has this ability, it may allow training to take place.

Cont…

Speaker Dependent Systems: Speech recognition systems that require a user to train the system to his/her voice are known as speaker-dependent systems. Most desktop dictation systems, such as IBM ViaVoice, are speaker dependent.

Speaker Independent Systems:

Speech recognition systems that do not require a user to train the system are known as speaker-independent systems.

How do humans do it?

Articulation produces sound waves, which the ear conveys to the brain for processing.

How might computers do it?

Digitization

Acoustic analysis of the speech signal

Linguistic interpretation

Figure labels: acoustic waveform, acoustic signal, speech recognition

How Does Speech Recognition Work?

• Audio input

• Apply a "grammar" so the speech recognizer knows what phonemes to expect.

• Acoustic Model

• Recognized text

How do computers do it?

• First, the user speaks a voice command into the microphone, which passes it to the sound card in the system. This analog signal is sampled and converted into digital form using a technique called Pulse Code Modulation (PCM). The resulting digital waveform is a stream of amplitude values which, when plotted, looks like a wavy line.

• The digital signal is then divided into short segments, and each segment is converted into the frequency domain. The incoming stream is now a set of discrete frequency bands, in a form that the speech recognizer can use.

• The next stage involves recognizing these bands of frequencies. For this, the speech recognition software has a database containing thousands of frequency patterns for the basic units of speech sound, or "phonemes", as they’re called.
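The first two steps above (PCM encoding and conversion to the frequency domain) can be sketched with the Python standard library alone. This is an illustration only: the 8000 Hz sample rate, the 400-sample frame, the 440 Hz test tone, and all function names are assumptions chosen for the example, and the naive transform shown here is far slower than the FFT a real system would use.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this example)
FRAME_SIZE = 400    # one 50 ms frame at 8000 Hz (assumed)


def pcm_encode(signal, bits=16):
    """Quantize floats in [-1.0, 1.0] to signed integers (linear PCM)."""
    max_level = 2 ** (bits - 1) - 1
    return [round(max(-1.0, min(1.0, s)) * max_level) for s in signal]


def magnitude_spectrum(frame):
    """Naive discrete Fourier transform; returns magnitudes for bins 0..N/2."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def dominant_frequency(frame, sample_rate=SAMPLE_RATE):
    """Return the frequency (Hz) of the strongest bin in the frame."""
    spectrum = magnitude_spectrum(frame)
    peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
    return peak_bin * sample_rate / len(frame)


# Stand-in for microphone input: a pure 440 Hz tone sampled at SAMPLE_RATE.
tone = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(FRAME_SIZE)]
pcm_frame = pcm_encode(tone)          # stream of integer amplitudes
print(dominant_frequency(pcm_frame))  # strongest frequency band in the frame
```

A recognizer would compute such a spectrum for every frame and compare the resulting frequency patterns against its phoneme database.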

Hardware:

Sound Cards: Sound cards with the cleanest A/D (analog-to-digital) conversions are recommended.

Microphone: The best choice of microphone is the headset style.

Computers / Processors: The faster the processor, the better speech recognition works. For good speech recognition you should have a 1 GHz processor and 1 GB of RAM.

Where can it be used?

• GPS: system control/navigation, e.g. GPS-connected digital maps: “How far is it to the motorway junction?”

• Commercial/Industrial applications: in-car steering systems

• Mobile telephony: voice dialing and hands-free use of a mobile phone in the car, e.g. “Dial office”

• Home automation - heating, ventilation and air conditioning

https://mix.office.com/watch/1otxpj7hz6kbx


Where can it be used?

• Military: system control/navigation, e.g. in high-performance fighter aircraft and helicopters, and the training of air traffic controllers

• Computer and Video Games: Speech input has been used in a limited number of computer and video games. The Microsoft Xbox, Nintendo GameCube, and Sony PlayStation 2 consoles all offer games with speech input/output.

• Education: assisting students who are blind

• Voice security systems: voice-controlled locks on gates and doors

• Wearable Computers: The most futuristic application is in the use and functionality of wearable computers.

Speech Recognition Software

• Dragon NaturallySpeaking

• IBM ViaVoice

• Microsoft Speech Recognition System

• MacSpeech Dictate

• Philips SpeechMagic

Pros of Speech Recognition

• Faster than “hand-writing”.

• Allows for better spelling, whether it be in text or documents.

• Helpful for people with a mental or physical disability.

• Hands-free capability.

Cons of Speech Recognition

• No program is 100% perfect

• Factors that affect the accuracy of speech recognition are: slang, homonyms, signal-to-noise ratio, and overlapping speech

• Can be expensive depending on the program

• Can easily misinterpret voice commands, e.g. Siri

Conclusion

• Speech recognition will revolutionize the way people conduct business over the Web and will differentiate world-class e-businesses.

• VoiceXML ties speech recognition and telephony together.

• Voice-enabled Web solutions are available today!

Generators:

• The term has two meanings. Software generators are programs that build other programs or produce other output, such as keys or passwords. In computer science, a generator is also a special routine that can be used to control the iteration behavior of a loop. In fact, all generators are iterators.

• A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values. However, instead of building an array containing all the values and returning them all at once, a generator yields the values one at a time, which requires less memory and allows the caller to get started processing the first few values immediately. In short, a generator looks like a function but behaves like an iterator.
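The contrast between returning a whole array and yielding values one at a time can be shown in Python, which supports generators directly through the `yield` keyword. The function names below are made up for the example.

```python
def squares_list(n):
    """Build the whole list in memory, then return it all at once."""
    return [i * i for i in range(n)]


def squares_gen(n):
    """Yield one value at a time; nothing is computed until it is requested."""
    for i in range(n):
        yield i * i


gen = squares_gen(5)
first = next(gen)  # the caller can start with the first value immediately
rest = list(gen)   # the remaining values are produced on demand

print(first, rest)
print(iter(gen) is gen)  # a generator is its own iterator
```

Calling `squares_gen(5)` looks like an ordinary function call, but it returns an iterator rather than a finished list, which is why memory use stays small even for very long sequences.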

Types of software generators:

• Key generator (key-gen)

• Random password generator

• Code generator

• Natural language generator

• Random test generator

• Pseudorandom number generator

Key generator (key-gen)

• A key generator (key-gen) is a computer program that generates a product licensing key, such as a serial number, necessary to activate a software application for use.

• Key-gens may be legitimately distributed by software manufacturers for licensing software in commercial environments where software has been licensed in bulk for an entire site or enterprise, or they may be distributed illegitimately in circumstances of copyright infringement or software piracy.

• A software license is a legal instrument that governs the usage and distribution of computer software.

• Illegitimate key generators are typically distributed by software crackers; for example, key-gens used to activate unlicensed copies of Windows, such as Windows 8, are already available.

Random password generator

• A random password generator is a software program or hardware device that takes input from a random or pseudo-random number generator and automatically generates a password. Random passwords can be generated manually, using simple sources of randomness such as dice or coins, or they can be generated using a computer.

• While there are many examples of "random" password generator programs available on the Internet, generating randomness can be tricky and many programs do not generate random characters in a way that ensures strong security. A common recommendation is to use open source security tools where possible, since they allow independent checks on the quality of the methods used. Note that simply generating a password at random does not ensure the password is a strong password, because it is possible, although highly unlikely, to generate an easily guessed or cracked password. In fact there is no need at all for a password to have been produced by a perfectly random process: it just needs to be sufficiently difficult to guess.
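A minimal random password generator might look like the sketch below. It uses Python's standard `secrets` module, which draws from the operating system's cryptographically secure random source; the default length of 16 and the character set are assumptions chosen for the example, not a recommendation from the slides.

```python
import secrets
import string

# Assumed character set: letters, digits and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=16, alphabet=ALPHABET):
    """Pick each character independently from a secure random source."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(generate_password())
```

The general-purpose `random` module is deliberately avoided here, because it is a pseudorandom generator designed for simulation rather than security.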

Pseudorandom number generators

• A pseudorandom number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers whose properties approximate the properties of sequences of random numbers.

• Although sequences that are closer to truly random can be generated using hardware random number generators, pseudorandom number generators are important in practice for their speed in number generation and their reproducibility.
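Reproducibility is easy to demonstrate with one of the simplest PRNG algorithms, the linear congruential generator (LCG). The slides do not name a specific algorithm, so the LCG is an assumption made for illustration; the constants are a widely published textbook parameter set, and this kind of generator is not suitable for security purposes.

```python
def lcg(seed, a=1664525, c=1013904223, m=2 ** 32):
    """Linear congruential generator: state = (a * state + c) mod m."""
    state = seed % m
    while True:
        state = (a * state + c) % m
        yield state


def take(gen, n):
    """Collect the next n values from a generator."""
    return [next(gen) for _ in range(n)]


run_1 = take(lcg(42), 5)
run_2 = take(lcg(42), 5)   # same seed, so the same "random" sequence
other = take(lcg(43), 5)   # different seed, so a different sequence

print(run_1 == run_2, run_1 == other)
```

Because the output is fully determined by the seed, a simulation or test that records its seed can be replayed exactly. The sketch is also itself written as a generator in the iterator sense described earlier.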

Code generator

• In computing, code generation is the process by which a compiler's code generator converts some intermediate representation of source code into a form (e.g., machine code) that can be readily executed by a machine.

• Sophisticated compilers typically perform multiple passes over various intermediate forms. This multi-stage process is used because many algorithms for code optimization are easier to apply one at a time, or because the input to one optimization relies on the completed processing performed by another optimization.
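To make the idea of code generation concrete, the toy sketch below turns an intermediate representation (an expression tree produced by Python's standard `ast` module) into instructions for a small stack machine, then executes them. The instruction names `PUSH`, `ADD`, `SUB`, and `MUL` and the tiny interpreter are invented for this example; a real compiler back end emits machine code or bytecode and performs optimization passes in between.

```python
import ast

# Hypothetical instruction names for a toy stack machine.
OPCODES = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def generate(node):
    """Emit stack-machine instructions for an expression tree (post-order walk)."""
    if isinstance(node, ast.Constant):
        return [("PUSH", node.value)]
    if isinstance(node, ast.BinOp) and type(node.op) in OPCODES:
        return (generate(node.left)
                + generate(node.right)
                + [(OPCODES[type(node.op)], None)])
    raise ValueError(f"unsupported expression: {ast.dump(node)}")


def run(program):
    """Execute the generated instructions on a simple stack."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        else:
            right, left = stack.pop(), stack.pop()
            if op == "ADD":
                stack.append(left + right)
            elif op == "SUB":
                stack.append(left - right)
            else:  # "MUL"
                stack.append(left * right)
    return stack.pop()


tree = ast.parse("2 + 3 * 4", mode="eval").body  # intermediate representation
program = generate(tree)                          # generated code
print(program)
print(run(program))
```

The parser has already resolved operator precedence in the tree, so the generator only needs to walk it and emit operands before operators.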

Men have become the tools of their tools. - P. Kahoro

The End
