Colour Recognition Grid Connected to a Robot Dog
8/3/2019
Velagapudi Ramakrishna Siddhartha Engineering College
Department of Computer Science, Vijayawada

A paper presented by

Ch. Phani Krishna (III Year CSE) - [email protected]
P. Veerendra (III Year CSE) - [email protected]
1. Abstract: Multimedia data is rapidly gaining importance along with recent developments such as the increasing deployment of surveillance cameras in public locations. AIBO is a commercially available quadruped robot dog equipped with a colour CMOS camera. Its main aim is to detect a mark on the floor using its camera and to track people. In a few years' time, analysing the content of multimedia data will be a problem of phenomenal proportions, as digital video may produce data at rates beyond 100 Mb/s, and multimedia archives steadily run into petabytes of storage space. Consequently, for urgent problems in multimedia content analysis, Grid computing is rapidly becoming indispensable. This demonstration shows the viability of wide-area Grid systems in adhering to the heavy demands of a real-time task in multimedia content analysis. Specifically, we show the application of a Sony AIBO robot dog, capable of recognizing objects from a set of learned objects, while connected to a large-scale Grid system comprising cluster computers located in Europe, the United States, and Australia. As such, we demonstrate the effective integration of state-of-the-art results from two largely distinct research fields: multimedia content analysis and Grid computing. This paper briefly introduces the Parallel-Horus architecture and describes a services-based approach to wide-area multimedia computing.
2. Introduction: In order to create a social robot, able to interact with people in a natural way, one of the things it should be able to do is to follow a person around an area. This master's research is aimed at enabling a Sony AIBO robot dog to watch people closely and follow them around the house. This is done using the visual data provided by the robot's nose-mounted camera, by analysing the data with respect to colour distributions and salient features.
This is the time in which information, be it scientific, industrial, or otherwise, is generally composed of multimedia items, i.e. a combination of pictorial, linguistic, and auditory data. Due to the increasing storage and connectivity of multimedia data, automatic multimedia content analysis is becoming an ever more important area of research. Fundamental research questions in this area include:
1. Can we automatically find (sub-)genres in images from a statistical evaluation of large image sets?
2. Can we automatically learn to find objects in images and video streams from partially annotated image sets?
Scientifically, these questions deal with the philosophical foundations of cognition and giving names to things. From a societal perspective, solutions are urgently needed, given the rapidly increasing volume of multimedia data.
Figure (a): Robotic dog
The existing user-transparent programming model (i.e. Parallel-Horus) is matched with an execution model based on Wide-Area Multimedia Services, i.e. high-performance multimedia functionality that can be invoked from sequential applications running on a desktop machine. The approach is evaluated by applying it to a state-of-the-art visual recognition task: specifically, the application of a Sony AIBO robot dog, capable of recognizing objects from a set of 1,000 objects, while connected to a large-scale Grid system comprising cluster systems in Europe and Australia.
3. Mark Detection:
Two solutions are possible to detect the mark in the picture:
1. Colour recognition: Knowing the colour of the mark in advance, its position in the picture can be computed using a virtual mask moving over all pixels: the mean colour inside the mask is calculated to find the mark position in the picture.
Figure (b): Robotic dog recognizing the ball
2. Shape recognition: Image processing algorithms allow finding circles or lines in the picture. In the image processing course of Mr. Jourlin, circle recognition first needed different filters applied to the picture to detect the edges. After extracting edges, circles can be detected by using shape parameters, and lines are detected by using the Hough transform.
For the four following reasons, the colour recognition method is adopted:
1. The camera's perspective distortion modifies the shape of the mark and could disturb the shape recognition.
2. The shape recognition algorithms need more computational power due to the image filtering and edge extraction.
3. The floor's colour will be very different from the mark colour, so the colour recognition technique can be efficient.
4. Colour recognition algorithms are easier to implement than shape recognition.
4. Parallel-Horus:
Parallel-Horus is a cluster programming framework that allows programmers to implement parallel multimedia applications as fully sequential programs. The Parallel-Horus framework consists of commonly used multimedia data types and associated operations, implemented in C++ and MPI. The library's API is made identical to that of an existing sequential library: Horus.
4.1. Unary pixel operation:
Operations in which a unary function is applied to each pixel in the image. The function's sole argument is the pixel value itself.
E.g.: negation, absolute value, square root
4.2. Binary pixel operation:
Operations in which a binary function is applied to each pixel in the image. The function's first argument is the pixel value itself; the second argument is either a single value or a pixel value obtained from another image.
E.g.: addition, multiplication, threshold
4.3. Global reduction:
Operations in which all pixels in the image are combined to obtain a single result value.
E.g.: sum, product, maximum
4.4. Neighborhood operation:
Operations in which several pixel values in the neighborhood of each pixel in the image are combined.
E.g.: percentile, median
4.5. Generalized convolution:
A special case of neighborhood operation: the combination of pixels in the neighborhood of each pixel is expressed in terms of two binary operations.
E.g.: convolution, Gauss, dilation
4.6. Geometric transformations:
Operations in which the image's domain is transformed.
E.g.: rotation, scaling, reflection
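The six operation classes in sections 4.1 through 4.6 map naturally onto array primitives. The sketch below is purely illustrative: NumPy and SciPy stand in for the C++/MPI Horus operations, which are not publicly shown in this paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, median_filter, rotate

img = np.arange(16.0).reshape(4, 4)            # toy 4x4 "image"

neg    = -img                                  # 4.1 unary pixel operation
thresh = np.where(img > 8, 1.0, 0.0)           # 4.2 binary pixel operation (single-value argument)
total  = img.sum()                             # 4.3 global reduction
med    = median_filter(img, size=3)            # 4.4 neighborhood operation
blur   = gaussian_filter(img, sigma=1.0)       # 4.5 generalized convolution (Gauss)
rot    = rotate(img, angle=90)                 # 4.6 geometric transformation
```

Each line touches the data in the same pattern the corresponding Parallel-Horus class would parallelize: per-pixel work for 4.1/4.2, a tree reduction for 4.3, and halo exchange between partitions for 4.4/4.5.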
Current developments include patterns for operations on large datasets, as well as patterns for increasingly important data structures, such as feature vectors obtained from earlier calculations on image and video data.
For reasons of efficiency, all Parallel-Horus operations are capable of adapting to the performance characteristics of the parallel machine at hand, i.e. by being flexible in the partitioning of data structures.
5. Services-Based Multimedia Grid Computing:
From the user's perspective, Grid computing is still far from being more than just an academic concept. Essentially, this is because Grids do not yet have the full basic functionality needed for extensive use. Consequently, as long as programming and usage are hard, most researchers in multimedia computing will not regard Grids as a viable alternative to more traditional computer systems.
6. Performance Evaluation:
In this section, the architecture is assessed for its effectiveness in providing significant performance gains. At the end, the wide-area execution of a state-of-the-art vision task is described: specifically, the application of a Sony AIBO robot dog, capable of recognizing objects, while connected to a large-scale Grid system comprising clusters in Europe and Australia.
6.1. Object Recognition by a Sony AIBO Robot Dog:
The example application demonstrates object recognition performed by a Sony AIBO robot dog (see Figure (c)). Irrespective of the application of a robot, the general problem of object recognition is to determine which, if any, of a given repository of objects appears in an image or video stream. It is a computationally demanding problem that involves a non-trivial tradeoff between specificity of recognition (e.g., discriminating between different faces) and invariance (e.g., to different lighting conditions). Due to the rapid increase in the size of multimedia repositories consisting of known objects, state-of-the-art sequential computers can no longer live up to the computational demands, making high-performance distributed computing indispensable.
Figure (c): Object recognition by our robot dog:
(1) an object is shown to the dog's camera;
(2) video frames are processed on a per-cluster basis;
(3) given the resulting feature vectors describing the scene, a database of known objects is searched;
(4) in case of recognition, the dog reacts accordingly.
6.2. Local Histogram Invariants:
In the robot application, local histograms of invariant features are computed for each aspect of an object. The colour invariant features are highly invariant to illumination colour, shadow effects, and shading of the object. The features are derived from a kernel-based histogram of feature responses. By exploiting natural image statistics, these histograms are modelled by parameterized density functions. The parameters act as a new class of photometric and geometric invariants, yielding a very condensed representation of local image content.
Figure (d): Illumination direction variation
In the learning phase of this system, a single condensed representation of each observed object is stored in a database. Subsequently, object recognition is achieved by matching local histograms, extracted from the video stream generated by the camera in the dog's nose, against the learned database.
In this approach, each pixel's RGB value is first transformed to an opponent colour representation.
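The transform itself was lost in transcription; the sketch below uses the 3x3 coefficient matrix of the Gaussian colour model published in the Color Invariance paper cited as reference 5. The function name is our own, and readers should verify the coefficients against that paper.

```python
import numpy as np

# Gaussian colour model (Geusebroek et al., reference 5): maps RGB to an
# opponent representation of intensity (E), yellow-blue (El), and
# red-green (Ell). Coefficients as published in that paper.
M = np.array([[0.06,  0.63,  0.27],
              [0.30,  0.04, -0.35],
              [0.34, -0.60,  0.17]])

def rgb_to_opponent(rgb_image):
    """Apply the 3x3 transform to every pixel of an (h, w, 3) RGB image."""
    return rgb_image @ M.T
```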
The rationale behind this is that the RGB sensitivity curves of the camera are transformed to Gaussian basis functions, being the Gaussian and its first- and second-order derivatives. Hence, the transformed values represent an opponent colour system, measuring intensity, yellow versus blue, and red versus green. Spatial scale is incorporated by convolving the opponent colour images with a Gaussian filter. Photometric invariance is now obtained by considering two non-linear transformations. The first colour invariant, W, isolates intensity variation from chromatic variation, i.e. edges due to shading, cast shadow, and albedo changes of the object surface. The second invariant feature, C, measures all chromatic variation in the image, disregarding intensity variation, i.e. all variation where the colour of the pixels changes. These invariants measure point-properties of the scene, and are referred to as point-based invariants.
Channel histograms are computed separately for the invariant gradients {Ww, Cλw, Cλλw} and for the edge detectors {Wx, Wy, Cλx, Cλy, Cλλx, Cλλy}.
6.3. Histogram Parameterization:
From natural image statistics research, it is known that histograms of derivative filters can be well modelled by simple distributions. It has been shown that histograms of Gaussian derivative filters in a large collection of images follow a Weibull-type distribution. Furthermore, the gradient magnitude for the invariants W and C given above follows a Weibull distribution,

p(r) = (γ / β^γ) r^(γ−1) exp(−(r/β)^γ),    (1)

where r represents the response for one of the invariants {Ww, Cλw, Cλλw}.
The local histogram of invariants of derivative filters can be well modelled by an integrated Weibull-type distribution,

p(r) = (γ / (2 γ^(1/γ) β Γ(1/γ))) exp(−(1/γ) |r/β|^γ).    (2)

In this case, r represents the response for one of the invariants {Wx, Wy, Cλx, Cλy, Cλλx, Cλλy}. Furthermore, Γ represents the complete Gamma function,

Γ(x) = ∫₀^∞ t^(x−1) e^(−t) dt.    (3)
In our implementation, we convert the histogram density of all local invariant histograms to Weibull form by first centring the data at the mode of the distribution, then multiplying each bin frequency p(ri) by its absolute response value |ri|, and normalizing the distribution. These transformations allow the estimation of the Weibull density parameters β and γ, indicating the (local) edge contrast and the (local) roughness or textureness, respectively.
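A loose sketch of such a parameter estimate is given below. Note that it fits the mode-centred responses directly with SciPy's Weibull fitter rather than reweighting histogram bins, so it approximates the described procedure rather than reproducing it; the function name and bin count are assumptions.

```python
import numpy as np
from scipy.stats import weibull_min

def weibull_params(responses, bins=64):
    """Estimate (beta, gamma) for a set of filter responses: locate the
    mode of the response histogram, fold the data around it, and fit a
    Weibull density to the folded magnitudes with the location fixed at 0."""
    hist, edges = np.histogram(responses, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    mode = centers[np.argmax(hist)]          # centre the data at the mode
    r = np.abs(responses - mode)             # fold responses around the mode
    # weibull_min.fit returns (shape, loc, scale) = (gamma, 0, beta)
    gamma, _, beta = weibull_min.fit(r, floc=0.0)
    return beta, gamma
```

With β as contrast and γ as roughness, each of the 37 local histograms in the retina described later collapses to just two numbers.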
6.4. Color-Based Object Recognition:
In the robot dog system a simple algorithm is applied for object recognition and localization, based on the described invariant features.
In the first, learning phase of the experiment, an object is characterized by learning the invariant Weibull parameters at fixed locations in the image, representing a sort of fixed "retina" of receptive fields positioned on a hexagonal grid, 2 apart, on three rings from the centre of the image. Hence, a total of 1 + 6 + 12 + 18 = 37 histograms are constructed. For each histogram, the invariant Weibull parameters are estimated. In the learning phase, the dog is presented with a set of 1,000 objects under a single visual setting. For each of these objects, the learned set of Weibull parameters is stored in a database.
Figure (f): Learning phase
In the second, recognition phase, the learning step is validated by showing the same objects again under many different appearances, with varying lighting direction, lighting colour, and viewing position, using the same retinal structure. In this manner, the robot dog has learned each of the 1,000 objects from only one example, while being capable of recognizing more than 300 of these under a diversity of imaging conditions that may occur in everyday life. Interestingly, this recognition rate is higher than the recognition rate of around 200 objects reported for a real dog.
Importantly, the algorithm runs at around 1 frame per 4 seconds. Moreover, voting is done over the results obtained from 8 consecutive frames. As a result, object recognition takes approximately 30 seconds, which is not even close to real-time performance. This problem is overcome by applying high-performance distributed computing at a very large scale.
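The frame-voting step might be sketched as below. The paper only states that results from 8 consecutive frames are combined; the majority rule and function name are our own assumptions.

```python
from collections import Counter

def vote(frame_results, window=8):
    """Combine per-frame recognition results by majority vote over the
    last `window` frames. Returns the winning object label, or None when
    no label wins a strict majority of the window."""
    recent = frame_results[-window:]
    label, count = Counter(recent).most_common(1)[0]
    return label if count > window // 2 else None
```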
7. Benefits:
There are many benefits in using colour-based object recognition in robot dogs:
1. Robotic pets can replace real ones.
2. Robotic pets can provide services and health benefits for senior citizens.
Figure (g): Robot dog in a household environment
3. These robotic dogs are flexible to move and can bring objects according to the user's requirements.
8. Conclusion:
The results obtained by matching the Parallel-Horus programming model with an execution model based on so-called Wide-Area Multimedia Services are described above. Considering the above features, it can be concluded that the system has succeeded in designing and building a framework. These results have served well in showing the potential of applying Grid resources to state-of-the-art multimedia processing in a coordinated manner. However, there is still room for improvement in functionality, reliability, fault tolerance, and performance. With such improvements, Parallel-Horus will continue to have an immediate and stimulating effect on the study of the many computationally demanding problems in multimedia content analysis. The robot dog application is merely one of these.
9. References:
1. G. Alonso, F. Casati, H. Kuno, and V. Machiraju. Web Services - Concepts, Architectures and Applications. Springer-Verlag, 2004.
2. H.E. Bal et al. The Distributed ASCI Supercomputer Project. Operating Systems Review, 34(4):76-96, 2000.
3. D. Comaniciu, V. Ramesh, and P. Meer. Kernel-based Object Tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(4):564-577, 2003.
4. D. Crookes, P.J. Morrow, and P.J. McParland. IAL: A Parallel Image Processing Programming Language. IEE Proceedings, Part I, 137(3):176-182, June 1990.
5. J.M. Geusebroek et al. Color Invariance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(12):1338-1350, 2001.