Perceptual Intelligent Systems 1
Seminar Report On
PERCEPTUAL INTELLIGENT SYSTEMS
Guided By: Ms. Bindu S. Moni
Submitted By:
N.M.Jophi
S1 MCA
Roll No. 29
A.J.C.E Dept. Of Comp. Science & Engg.
CONTENTS
1. Introduction
2. Perception
- Filters that make up perception
3. Perceptual User Interfaces
4. Information Flow in Perceptual User Interfaces
5. Perceptual Intelligence
6. Perceptual Intelligent Systems
7. Gesture Recognition Systems
- Challenges of Gesture Recognition
8. Speech Recognition Systems
- Performance of Speech Recognition Systems
9. Nouse Perceptual Vision Interface
- Tools available
10. Conclusion
11. Reference
Introduction
Inanimate things are coming to life. That is, the simple objects that surround us are
gaining sensors, computational power, and actuators. Consequently, desks and doors, TVs
and telephones, cars and trains, eyeglasses and shoes, and even the shirts on our backs are
changing from static, inanimate objects into adaptive, reactive systems that can be more
friendly, useful, and efficient. These new systems could also be even more difficult to use
than current systems; it depends on how we design the interface between the world of humans
and the world of this new generation of machines. To change inanimate objects into smart,
active helpmates, we must give them perceptual intelligence.
The main problem with today’s systems is that they are both deaf and blind. They mostly
experience the world around them through a slow serial line to a keyboard and mouse. Even
multimedia computers, which can handle signals like sound and image, do so only as a
transport device that knows nothing of the signals’ content. Computers need to share our
perceptual environment before they can be really helpful. They need to be situated in the
same world that we are; they need to know much more than just the text of our words.
This is where perceptual intelligence becomes important. If systems have the ability
to learn perception, they can act in a smart way. Perceptual intelligence is actually a learned
skill.
Perception
Perception is the end result of a thought that begins its journey with the senses. We
see, hear, physically feel, smell or taste an event. After the event is experienced it must then
go through various filters before our brains decipher what exactly has happened and how we
feel about it. Even though this process can seem instantaneous, it still always happens.
The filters that make up perception are as follows:
- What we know about the subject or event.
  I saw an orange and knew it was edible.
- What our previous experience (and/or knowledge) with the subject or event was.
  Last time I ate an orange I peeled it first (I knew to peel an orange before
  eating it) and it was sweet.
  Our previous experience forms our expectations.
- Our current emotional state. How we are feeling at the time of the event affects
  how we will feel after the event.
  I was in a bad mood when I ate the orange, and it angered me that it was sour
  and not sweet (my expectation).
In the end, my intellectual and emotional perception of eating the orange was that it
was an unpleasant experience. How strong that experience was determines how I will
feel the next time I eat an orange. For example, if I got violently sick after eating
an orange, the next time I see an orange I probably won’t want to eat it. If I had a
pleasant experience eating an orange, the next time I see an orange I’ll likely want
to eat it.
Even though emotions seemingly occur as a direct result of an experience, they are actually
the result of a complicated process. This process involves interpreting action and thought and
then assigning meaning to them. The mind attaches meaning with prejudice as the information
goes through the perceptual filters mentioned above.
Our perceptual filters also determine truth and logic along with meaning, though they
don’t always do this accurately. Only when we become aware that a bad feeling could be an
indication of a misunderstanding (an error in perception) can we begin to make adjustments to
our filters and change the emotional outcome.
When left alone and untrained, the mind chooses emotions and reactions based on a
"survival" program which does not take into account that we are civilized beings; it is only
concerned with survival.
A good portion of this program is faulty because the filters have created distortions,
deletions and generalizations which alter perception. One example is jumping to a conclusion
about "all" or "none" of something based on one experience. The unconscious tends to think
in absolutes and supports "one time" learning from experience (this is the survival aspect of
learning).
Perceptual User Interfaces
A perceptual interface is one that allows a computer user to interact with the computer
without having to use the normal keyboard and mouse. These interfaces are realised by
giving the computer the capability of interpreting the user's movements or voice commands.
Perceptual interfaces are concerned with extending human-computer interaction to use
all modalities of human perception. Current research efforts focus on including vision,
audition, and touch in the process. The goal of perceptual reality is to create virtual
and augmented versions of the world that are, to a human, perceptually identical to the
real world. The goal of creating perceptual user interfaces is to give humans natural
means of interacting with computers, appliances and devices using voice, sounds, gestures,
and touch.
Perceptual User Interfaces (PUIs) are characterised by interaction techniques that
combine an understanding of natural human capabilities with computer I/O devices and
machine perception and reasoning. They seek to make the user interface more natural and
compelling by taking advantage of the ways in which people naturally interact with each
other and with the world, both verbally and nonverbally. Devices and sensors should be
transparent and passive if possible, and machines should perceive relevant human
communication channels as well as generate output that is naturally understood. This is
expected to require integration, at multiple levels, of technologies such as speech and sound
recognition and generation, computer vision, graphical animation and visualization, language
understanding, touch-based sensing and feedback, learning, user modelling and dialogue
management.
Information Flow in Perceptual User Interfaces
PUI integrates perceptive, multimodal, and multimedia interfaces to bring our human
capabilities to bear on creating more natural and intuitive interfaces.
A perceptive user interface is one that adds human-like perceptual capabilities to the
computer, for example, making the computer aware of what the user is saying or what the
user’s face, body and hands are doing. These interfaces provide input to the computer while
leveraging human communication and motor skills.
A multimodal user interface is closely related, emphasizing human communication
skills. We use multiple modalities when we engage in face-to-face communication, leading to
more effective communication. Most work on multimodal UI has focused on computer
input (for example, using speech together with pen-based gestures). Multimodal output uses
different modalities, like visual display, audio and tactile feedback, to engage human
perceptual, cognitive and communication skills in understanding what is being presented. In
a multimodal UI the various modalities may be used independently, simultaneously, or
tightly coupled.
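As an illustration of tightly coupled input, the sketch below pairs a spoken command with the pen gesture closest to it in time. It is only a minimal sketch under stated assumptions: the event types, the `fuse` function and the 1.5 second window are hypothetical names and values chosen for the example, not part of any system described in this report.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SpeechEvent:
    time: float    # seconds since the session started
    command: str   # recognised command word, e.g. "delete"


@dataclass
class PointEvent:
    time: float    # seconds since the session started
    target: str    # identifier of the object the pen touched


def fuse(speech: SpeechEvent,
         points: List[PointEvent],
         window: float = 1.5) -> Optional[Tuple[str, str]]:
    """Pair a spoken command with the pen gesture nearest to it in time.

    Returns (command, target), or None when no gesture falls inside
    the time window (an assumed value, not taken from the report).
    """
    candidates = [p for p in points if abs(p.time - speech.time) <= window]
    if not candidates:
        return None
    nearest = min(candidates, key=lambda p: abs(p.time - speech.time))
    return (speech.command, nearest.target)


points = [PointEvent(0.2, "icon_a"), PointEvent(2.1, "icon_b")]
print(fuse(SpeechEvent(2.0, "delete"), points))
```

Here the user says "delete" at about the moment the pen touches the second icon, so the two inputs are combined into one command. A speech event with no nearby gesture is left unresolved rather than guessed.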
Multimedia UI uses perceptual and cognitive skills to interpret information presented
to the user. Text, graphics, audio and video are the typical media used.
PUIs will enhance the use of computers as tools or appliances, directly enhancing
GUI-based applications, for example by taking into account gestures, speech and eye gaze.
Perhaps more importantly, these technologies will enable broad use of computers as
assistants, or agents, that will interact in more human-like ways. Perceptual interfaces will
enable multiple styles of interaction, such as speech only, speech and gesture, text and touch,
vision, and synthetic sound, each of which may be appropriate in different circumstances,
whether that be desktop apps, hands-free mobile use, or embedded household systems.
Perceptual Intelligence
Perceptual Intelligence is the knowledge and understanding that everything we
experience (especially thoughts and feelings) is defined by our perception. For machines,
perceptual intelligence means paying attention to people and the surrounding situation in
the same way another person would, thus allowing these new devices to learn to adapt their
behaviour to suit us, rather than our adapting to them as we do today.
In the language of cognitive science, perceptual intelligence is the ability to deal with
the frame problem: it is the ability to classify the current situation, so that it is possible to
know which variables are important and thus to take appropriate action. Once a computer has
the perceptual ability to know who, what, when, where, and why, then the probabilistic rules
derived by statistical learning methods are normally sufficient for the computer to determine
a good course of action.
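The two steps described above (classify the situation, then apply a probabilistic rule) can be sketched as follows. This is a deliberately small illustration, assuming that observations are simple (who, where) pairs and that rules are learned by counting labelled examples. The class, the situation labels and the action names are all hypothetical.

```python
from collections import Counter, defaultdict


class SituationClassifier:
    """Learns P(situation | observation) from labelled examples by counting."""

    def __init__(self):
        # observation -> Counter mapping each situation to how often it was seen
        self.counts = defaultdict(Counter)

    def learn(self, observation, situation):
        self.counts[observation][situation] += 1

    def classify(self, observation):
        """Return (most likely situation, estimated probability)."""
        seen = self.counts.get(observation)
        if not seen:
            return None, 0.0
        situation, n = seen.most_common(1)[0]
        return situation, n / sum(seen.values())


# Hypothetical mapping from a recognised situation to an action.
ACTIONS = {"user_reading": "dim_notifications", "user_away": "lock_screen"}


def choose_action(clf, observation, min_confidence=0.6):
    """Act only when the situation is recognised with enough confidence."""
    situation, p = clf.classify(observation)
    if situation is None or p < min_confidence:
        return "do_nothing"
    return ACTIONS.get(situation, "do_nothing")


clf = SituationClassifier()
for _ in range(3):
    clf.learn(("alice", "desk"), "user_reading")
clf.learn(("alice", "desk"), "user_away")
print(choose_action(clf, ("alice", "desk")))
```

The confidence threshold is a design choice for the sketch: a system that is unsure what is happening does nothing rather than take an inappropriate action.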
The key to perceptual intelligence is making machines aware of their environment,
and in particular, sensitive to the people who interact with them. They should know who we
are, see our expressions and gestures, and hear the tone and emphasis of our voice.
Perceptual Intelligent Systems
We have developed computer systems that can follow people’s actions, recognizing
their faces, gestures, and expressions.
Some of these systems are:
- Gesture Recognition System
- Speech Recognition System
- Nouse Perceptual Vision Interface
Gesture Recognition System
Gesture Recognition deals with the goal of interpreting human gestures via
mathematical algorithms. Gestures can originate from any bodily motion or state but
commonly originate from the face or hand. Current focuses in the field include emotion
recognition from the face and hand gesture recognition. Many approaches have been made
using cameras and computer vision algorithms to interpret sign language.
Gesture Recognition can be seen as a way for computers to begin to understand
human body language, thus building a richer bridge between machines and humans than
primitive text user interfaces or even GUIs (Graphical User Interfaces), which still limit the
majority of input to keyboard and mouse.
Gesture Recognition enables humans to interface with the machine (human-machine
interaction, HMI) and interact naturally without any mechanical devices. Using gesture
recognition, it is possible to point a finger at the computer screen so that the cursor moves
accordingly. This could potentially make conventional input devices such as mice, keyboards
and even touch-screens redundant.
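Moving the cursor by pointing comes down to mapping a fingertip position in the camera image onto screen coordinates. The sketch below shows only that mapping step and assumes the fingertip has already been located by some vision algorithm. The function name, the frame and screen sizes, and the mirroring option are illustrative assumptions.

```python
def fingertip_to_cursor(x, y, frame_size, screen_size, mirror=True):
    """Map a fingertip pixel (x, y) in the camera frame to a screen pixel.

    frame_size and screen_size are (width, height) tuples. With mirror=True
    the x axis is flipped, so that moving the hand to the right moves the
    cursor to the right when the camera faces the user.
    """
    frame_w, frame_h = frame_size
    screen_w, screen_h = screen_size

    # Normalise to the range 0..1 within the camera frame.
    nx = x / (frame_w - 1)
    ny = y / (frame_h - 1)
    if mirror:
        nx = 1.0 - nx

    # Scale to the screen and clamp to its bounds.
    cx = min(max(round(nx * (screen_w - 1)), 0), screen_w - 1)
    cy = min(max(round(ny * (screen_h - 1)), 0), screen_h - 1)
    return cx, cy


print(fingertip_to_cursor(0, 0, (640, 480), (1920, 1080)))
```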
Gesture Recognition can be conducted with techniques from computer vision and
image processing.
The term gesture interaction is often used to refer to inking or mouse gesture
interaction, which is computer interaction through the drawing of symbols with a pointing
device cursor. Strictly speaking, the term mouse strokes should be used instead of mouse
gestures, since this interaction is a form of written communication: making a mark to
represent a symbol.
Challenges of Gesture Recognition
There are many challenges associated with the accuracy and usefulness of Gesture
Recognition software. For image-based gesture recognition there are limitations on the
equipment used and on image noise. Images or video may not be captured under consistent
lighting, or in the same location. Items in the background or distinct features of the users
may make recognition more difficult. The variety of implementations for image-based gesture
recognition may also limit the viability of the technology for general use. For
example, recognition using stereo cameras or depth-detecting cameras is not currently
commonplace. Video or web cameras can give less accurate results because of their limited
resolution.
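One common way to reduce the effect of image noise on a tracked position is to smooth it over time. The sketch below uses an exponential moving average. This technique is offered only as an example and is not something the report attributes to any particular system. The function name and the value of `alpha` are assumptions.

```python
def smooth_positions(points, alpha=0.5):
    """Exponentially smooth a sequence of (x, y) positions.

    alpha close to 1 follows the raw measurements closely;
    alpha close to 0 smooths more but reacts more slowly.
    """
    smoothed = []
    prev = None
    for x, y in points:
        if prev is None:
            # The first measurement is taken as it is.
            prev = (float(x), float(y))
        else:
            prev = (alpha * x + (1 - alpha) * prev[0],
                    alpha * y + (1 - alpha) * prev[1])
        smoothed.append(prev)
    return smoothed


print(smooth_positions([(0, 0), (10, 10), (10, 10)]))
```

The trade-off is between jitter and lag: stronger smoothing gives a steadier position but makes it trail behind fast movements.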
Speech Recognition System
Speech recognition converts spoken words to machine-readable input (for example, to
the binary code for a string of character codes). The term voice recognition may also be used
to refer to speech recognition, but more precisely refers to speaker recognition, which
attempts to identify the person speaking, as opposed to what is being said.
Speech recognition applications include voice dialling (e.g., "Call home"), call routing
(e.g., "I would like to make a collect call"), domotic appliance control and content-based
spoken audio search (e.g., find a podcast where particular words were spoken), simple data
entry (e.g., entering a credit card number), preparation of structured documents (e.g., a
radiology report), speech-to-text processing (e.g., word processors or emails), and aircraft
cockpit control (usually termed Direct Voice Input).
Performance of Speech Recognition Systems
The performance of speech recognition systems is usually specified in terms of
accuracy and speed. Most speech recognition users would tend to agree that dictation
machines can achieve very high performance in controlled conditions. There is some
confusion, however, over the interchangeability of the terms "speech recognition" and
"dictation".
Commercially available speaker-dependent dictation systems usually require only a
short period of training (sometimes also called "enrolment") and may successfully capture
continuous speech with a large vocabulary at normal pace with very high accuracy. Most
commercial companies claim that recognition software can achieve between 98% and 99%
accuracy if operated under optimal conditions. "Optimal conditions" usually assume that
users:
- Have speech characteristics which match the training data,
- Can achieve proper speaker adaptation, and
- Work in a low-noise environment (e.g. a quiet office or laboratory space).
This explains why some users, especially those whose speech is heavily accented,
might achieve recognition rates much lower than expected. Speech recognition in video has
become a popular search technology used by several video search companies.
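Accuracy figures like those quoted above are commonly derived from the word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the recogniser's output into the reference transcript, divided by the number of words in the reference. Word accuracy is then often reported as 1 minus WER. The sketch below is one straightforward way to compute it. The function name is chosen for the example, and the report does not say how the vendors it mentions measure accuracy.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate between a reference transcript and recogniser output.

    Uses the word-level edit (Levenshtein) distance, computed row by row.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] is the distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate("call home now please", "call phone now")
print(wer, 1 - wer)
```

In this example one word is substituted and one is dropped out of four, so the error rate is 0.5. An accuracy of 98% would correspond to roughly two word errors per hundred reference words.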
Limited vocabulary systems, requiring no training, can recognize a small number of
words (for instance, the ten digits) as spoken by most speakers. Such systems are popular for
routing incoming phone calls to their destinations in large organizations.
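A limited-vocabulary call router of this kind can be pictured as a lookup from the recognised word to a destination. The sketch below assumes the recogniser has already produced a single word. The routing table and department names are invented for the example.

```python
# The ten digit words a limited-vocabulary recogniser might return.
DIGITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
}

# Hypothetical routing table for a large organization.
ROUTES = {1: "sales", 2: "support", 0: "operator"}


def route_call(recognized_word, routes=ROUTES, default="operator"):
    """Return the destination for a spoken digit, or the default."""
    digit = DIGITS.get(recognized_word.strip().lower())
    return routes.get(digit, default)


print(route_call("Two"))
```

Words outside the vocabulary, and digits with no assigned destination, fall back to the default destination.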
Nouse Perceptual Vision Interface
Nouse PVI is a perceptual vision interface program that offers a complete solution for
working hands-free with a computer running the Microsoft Windows OS. Using a camera
connected to the computer, the program analyzes the facial motion of the user so that it can
be used instead of a mouse and a keyboard. As such, Nouse PVI allows a user to perform the
three basic computer-control actions:
- Cursor control, which is normally performed using mouse motion. It includes:
  - Cursor positioning
  - Cursor moving, and
  - Object dragging
- Clicking, which is normally performed using the mouse buttons. It includes:
  - Right-button click
  - Left-button click
  - Double-click, and
  - Holding the button down
- Key/letter entry, which would normally be performed using a keyboard. It includes:
  - Typing of English letters
  - Switching from capital to small letters, and to functional keys
  - Entering basic MS Windows functional keys as well as Nouse functional keys
The program is equipped with such tools as:
- Nousor (Nouse Cursor):
  The video-feedback-providing cursor that is used to point and to provide the
  feeling of “touch” with a computer.
- Nouse Click:
  A nose-operated mechanism to simulate types of clicks.
- Nouse Codes:
  A configurable Nouse tool that allows entering computer commands and operating
  the program using head motion codes.
- Nouse Editor:
  Provides an easy way of typing and storing messages hands-free using face
  motion. Typed messages are automatically stored in the Clipboard (as with Ctrl+A,
  Ctrl+C).
- Nouse Board:
  An on-screen keyboard, specially designed for face-motion-based typing, that
  automatically maps to the user's facial motion range.
- Nouse Typer:
  A configurable Nouse tool that allows typing letters by drawing them inside
  the cursor (instead of using the Nouse Board).
- Nouse Chalk:
  A configurable Nouse tool that allows writing letters as with a chalk on a piece
  of paper. Written letters are automatically saved on the hard drive as images that
  can be opened and emailed.
And such features as:
- Automatic focusing on the user's nose and motion range calibration.
- Lock On Area and Glue/Unglue mechanisms that allow the user's motion range to be
  mapped onto an arbitrary Windows application.
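Motion range calibration and mapping onto an application window can be sketched as two steps: record the extremes of the nose position while the user moves, then scale later positions linearly into the target window. This is an illustrative sketch only and does not reproduce how Nouse itself is implemented. The function names and the (left, top, width, height) window format are assumptions.

```python
def calibrate(samples):
    """Return (x_min, x_max, y_min, y_max) over recorded nose positions."""
    xs = [x for x, _ in samples]
    ys = [y for _, y in samples]
    return min(xs), max(xs), min(ys), max(ys)


def map_to_region(pos, motion_range, region):
    """Map a nose position onto a window given as (left, top, width, height)."""
    x, y = pos
    x_min, x_max, y_min, y_max = motion_range
    left, top, width, height = region

    def scale(value, lo, hi, size):
        if hi == lo:
            # Degenerate calibration: fall back to the middle of the window.
            return 0.5 * (size - 1)
        t = (value - lo) / (hi - lo)
        t = min(max(t, 0.0), 1.0)  # clamp positions outside the range
        return t * (size - 1)

    return (left + round(scale(x, x_min, x_max, width)),
            top + round(scale(y, y_min, y_max, height)))


motion_range = calibrate([(300, 200), (340, 230), (320, 215)])
print(map_to_region((320, 215), motion_range, (100, 50, 401, 301)))
```

Because the range is measured per user, a small comfortable head movement can cover the whole window, and positions outside the calibrated range are clamped to the window edge.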
Figure: The appearance of the Nouse Board. Letters are grouped by four to suit the four
directions of the “clicking” motion.
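The idea of grouping letters by four can be sketched as follows: the cursor selects a group, and the direction of the clicking motion selects one letter within it. The alphabetical grouping and the order of directions used here are assumptions made for the example. The report does not describe the actual layout of the Nouse Board.

```python
import string

# Assumed order in which the four motion directions index a group.
DIRECTIONS = ("up", "right", "down", "left")


def build_groups(letters=string.ascii_uppercase, size=4):
    """Split the alphabet into consecutive groups of four letters."""
    return [letters[i:i + size] for i in range(0, len(letters), size)]


def select_letter(groups, group_index, direction):
    """Return the letter chosen by a motion direction within a group."""
    group = groups[group_index]
    slot = DIRECTIONS.index(direction)
    # The last group may hold fewer than four letters.
    return group[slot] if slot < len(group) else None


groups = build_groups()
print(groups[0], select_letter(groups, 0, "right"))
```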
Many universities have research centres that focus on perceptual intelligence.
In India, MIT has developed two experimental test beds: smart rooms and smart clothes.
Conclusion
It is now possible to track people’s motion, identify them by voice and facial
appearance, and recognize their actions in real time using only modest computational
resources. By using this perceptual information we have been able to build smart rooms and
smart clothes that can recognize people, understand their speech, allow them to control
information displays without mouse or keyboard, communicate by facial and hand gesture,
and interact in a more personalized, adaptive manner. Our overall goal is to make
computers seem as natural to interact with as another person. Sometimes this means that
there should be no interface; the system should just recognize what is going on and do the
right thing. At other times, it means that the system should engage in a dialogue with a
person. We want a system that is truly human-centred and natural to interact with; this
requires not just perception but also a significant understanding of the semantics of the
everyday world and the reasoning capabilities to use this understanding flexibly.
Reference
www.ayrmetes.com
www.centerforfuturehealth.com
www.infibeam.com
www.wikipedia.org