Colour Recognition: Grid-Connected Robot Dog

Upload: ramana-yellapu

Post on 06-Apr-2018



  • 8/3/2019 25 Colour Recognition Grid Conected a Robot Dog

    1/7

    Velagapudi Ramakrishna Siddhartha Engineering College
    Department of Computer Science
    Vijayawada

    A paper presented by:

    Ch. Phani Krishna, III Year CSE, [email protected]
    P. Veerendra, III Year CSE, [email protected]


    1. Abstract:

    Multimedia data is rapidly gaining importance along with recent developments such as the increasing deployment of surveillance cameras in public locations. AIBO is a commercially available quadruped robot dog equipped with a color CMOS camera; its main aim is to detect a mark on the floor using its camera and to track people. In a few years' time, analyzing the content of multimedia data will be a problem of phenomenal proportions, as digital video may produce data at rates beyond 100 Mb/s, and multimedia archives steadily run into petabytes of storage space. Consequently, for urgent problems in multimedia content analysis, Grid computing is rapidly becoming indispensable. This demonstration shows the viability of wide-area Grid systems in meeting the heavy demands of a real-time task in multimedia content analysis. Specifically, we show the application of a Sony AIBO robot dog, capable of recognizing objects from a set of learned objects, while connected to a large-scale Grid system comprising cluster computers located in Europe, the United States, and Australia. As such, we demonstrate the effective integration of state-of-the-art results from two largely distinct research fields: multimedia content analysis and Grid computing. This paper briefly introduces the Parallel-Horus architecture and describes a services-based approach to wide-area multimedia computing.

    2. Introduction:

    In order to create a social robot, able to interact with people in a natural way, one of the things it should be able to do is to follow a person around an area. This master's research is aimed at enabling a Sony AIBO robot dog to watch people closely and follow them around the house. This is done by analyzing the visual data provided by the robot's nose-mounted camera with respect to colour distributions and salient features.

    This is a time in which information, be it scientific, industrial, or otherwise, is generally composed of multimedia items, i.e. a combination of pictorial, linguistic, and auditory data. Due to the increasing storage and connectivity of multimedia data, automatic multimedia content analysis is becoming an ever more important area of research. Fundamental research questions in this area include:

    1. Can we automatically find (sub-)genres

    in images from a statistical evaluation

    of large image sets?

    2. Can we automatically learn to find

    objects in images and video streams

    from partially annotated image sets?

    Scientifically, these questions deal with the philosophical foundations of cognition and giving names to things. From a societal perspective, solutions are urgently needed, given the rapidly increasing volume of multimedia data.

    Figure (a): Robotic dog

    The existing user-transparent programming model (i.e. Parallel-Horus) is matched with an execution model based on Wide-Area Multimedia Services, i.e. high-performance multimedia functionality that can be invoked from sequential applications running on a desktop machine. The approach is evaluated by applying it to a state-of-the-art visual recognition task. Specifically, a Sony AIBO robot dog, capable of recognizing objects from a set of 1,000 objects, is connected to a large-scale Grid system comprising cluster systems in Europe and Australia.

    3. Mark Detection:



    Two solutions are possible to detect the mark in the picture:

    1. Color recognition: Knowing the color of the mark in advance, its position in the picture can be computed using a virtual mask moved over all pixels: the mean color inside the mask is calculated to find the mark position in the picture.

    Figure (b): Robotic dog recognizing the ball

    2. Shape recognition: Image processing algorithms allow finding circles or lines in the picture. In the image processing course of Mr. Jourlin, circle recognition first required different filters applied to the picture to detect the edges. After extracting the edges, circles can be detected using shape parameters, and lines are detected using the Hough transform.

    For the following 4 reasons, the color recognition method is adopted:

    1. The camera's perspective distortion modifies the shape of the mark and could disturb shape recognition.

    2. Shape recognition algorithms need more computational power, due to the image filtering and edge extraction.

    3. The floor's color will be very different from the mark's color, so the color recognition technique can be efficient.

    4. Color recognition algorithms are easier to implement than shape recognition.
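The virtual-mask approach can be sketched as a toy in Python (the AIBO's actual implementation is not shown in this paper; the image representation, `mask_size`, and the known `mark_rgb` are assumptions for illustration):

```python
# Illustrative sketch: locate a mark of known colour by sliding a small
# mask over the image and comparing the mean colour inside the mask
# against the expected mark colour. The image is a list of rows of
# (R, G, B) tuples.

def find_mark(image, mark_rgb, mask_size=3):
    h, w = len(image), len(image[0])
    best_pos, best_dist = None, float("inf")
    for y in range(h - mask_size + 1):
        for x in range(w - mask_size + 1):
            # mean colour inside the mask
            pixels = [image[y + dy][x + dx]
                      for dy in range(mask_size) for dx in range(mask_size)]
            n = len(pixels)
            mean = tuple(sum(p[c] for p in pixels) / n for c in range(3))
            # squared Euclidean distance to the expected mark colour
            dist = sum((mean[c] - mark_rgb[c]) ** 2 for c in range(3))
            if dist < best_dist:
                best_dist, best_pos = dist, (x, y)
    return best_pos

# usage: a 5x5 dark floor with a 3x3 red mark whose top-left corner is (2, 2)
floor = [[(20, 20, 20)] * 5 for _ in range(5)]
for y in range(2, 5):
    for x in range(2, 5):
        floor[y][x] = (200, 30, 30)
print(find_mark(floor, (200, 30, 30)))  # -> (2, 2)
```

This also makes reason 2 above concrete: the sketch needs only sums and comparisons per window, whereas shape recognition would first require filtering and edge extraction over the whole image.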


    4. Parallel-Horus:

    Parallel-Horus is a cluster programming framework that allows programmers to implement parallel multimedia applications as fully sequential programs. The Parallel-Horus framework consists of commonly used multimedia data types and associated operations, implemented in C++ and MPI. The library's API is made identical to that of an existing sequential library: Horus.

    4.1. Unary pixel operation:

    Operations in which a unary function is applied to each pixel in the image. The function's sole argument is the pixel value itself.

    E.g.: negation, absolute value, square root
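As an illustration, the unary pattern can be sketched as follows (Parallel-Horus itself is a C++/MPI library; this Python analogue only mirrors the shape of the operation on a grey-value image stored as a list of rows):

```python
# Unary pixel operation: apply a one-argument function to every pixel
# independently of all other pixels.

def unary_op(image, f):
    return [[f(p) for p in row] for row in image]

img = [[4, 9], [16, 25]]
print(unary_op(img, lambda p: p ** 0.5))  # square root of each pixel
print(unary_op(img, lambda p: -p))        # negation of each pixel
```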

    4.2. Binary pixel operation:

    Operations in which a binary function is applied to each pixel in the image. The function's first argument is the pixel value itself. The second argument is either a single value or a pixel value obtained from a second image.

    E.g.: addition, multiplication, threshold
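The binary pattern, sketched in the same illustrative style (the dispatch on "constant versus second image" is an assumption about the intended semantics, not Parallel-Horus's actual API):

```python
# Binary pixel operation: combine each pixel with either a constant or
# the corresponding pixel of a second image of the same size.

def binary_op(image, f, other):
    if isinstance(other, list):
        return [[f(p, q) for p, q in zip(r1, r2)]
                for r1, r2 in zip(image, other)]
    return [[f(p, other) for p in row] for row in image]

a = [[1, 2], [3, 4]]
b = [[10, 10], [20, 20]]
print(binary_op(a, lambda p, q: p + q, b))   # image + image
print(binary_op(a, lambda p, q: p * q, 5))   # image * constant
```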

    4.3. Global reduction:

    Operations in which all pixels in the

    image are combined to obtain a single result

    value.

    E.g.: sum, product, maximum
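A global reduction folds the whole image into one value; an illustrative sketch:

```python
# Global reduction: combine all pixels with an associative binary
# function and an initial accumulator value.

def global_reduce(image, f, init):
    acc = init
    for row in image:
        for p in row:
            acc = f(acc, p)
    return acc

img = [[1, 5], [3, 2]]
print(global_reduce(img, lambda a, p: a + p, 0))  # sum of all pixels
print(global_reduce(img, max, img[0][0]))         # maximum pixel value
```

Associativity of the combining function is what lets a parallel implementation reduce each partition locally and then merge the partial results.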

    4.4. Neighborhood operation:

    Operations in which several pixel values in the neighborhood of each pixel in the image are combined.

    E.g.: percentile, median
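An illustrative neighborhood operation, here a median filter; the clipped handling of border pixels is an assumption, since the paper does not specify it:

```python
# Neighborhood operation: for each pixel, combine the pixel values in a
# square window around it with a user-supplied function.

def neighborhood_op(image, f, radius=1):
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # gather pixels in the (border-clipped) neighbourhood of (x, y)
            nbrs = [image[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            row.append(f(nbrs))
        out.append(row)
    return out

def median(vals):
    s = sorted(vals)
    return s[len(s) // 2]

img = [[0, 0, 0], [0, 99, 0], [0, 0, 0]]  # single noisy pixel
print(neighborhood_op(img, median))        # the median filter removes it
```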

    4.5. Generalized convolution:

    A special case of neighborhood operation. The combination of pixels in the neighborhood of each pixel is expressed in terms of two binary operations.

    E.g.: convolution, Gaussian filtering, dilation
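The "two binary operations" idea can be made concrete: one operation combines a pixel with a kernel weight, the other accumulates the results. Instantiating them as (multiply, add) gives ordinary convolution, and as (add, max) gives grey-scale dilation. A sketch under the same illustrative assumptions as above:

```python
# Generalized convolution: combine each neighbourhood with a kernel
# using a pointwise operation 'mul' and an accumulator operation 'add'.

def gen_convolve(image, kernel, mul, add, init):
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oy, ox = kh // 2, kw // 2
    out = [[init] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = init
            for j in range(kh):
                for i in range(kw):
                    yy, xx = y + j - oy, x + i - ox
                    if 0 <= yy < h and 0 <= xx < w:  # clip at the border
                        acc = add(acc, mul(image[yy][xx], kernel[j][i]))
            out[y][x] = acc
    return out

box = [[1, 1, 1]] * 3
img = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
# ordinary convolution: (multiply, add) with identity 0
print(gen_convolve(img, box, lambda a, b: a * b, lambda a, b: a + b, 0))
# grey-scale dilation: (add, max) with a flat structuring element
print(gen_convolve(img, [[0] * 3] * 3, lambda a, b: a + b, max, float("-inf")))
```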


  • 8/3/2019 25 Colour Recognition Grid Conected a Robot Dog

    4/7

    4.6. Geometric transformations:

    Operations in which the image's domain is transformed.

    E.g.: rotation, scaling, reflection
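Unlike the pixel operations above, these transform where pixels live rather than their values; two illustrative examples:

```python
# Geometric transformations operate on the image domain, not on pixel
# values: each output pixel takes its value from a remapped position.

def reflect_horizontal(image):
    # mirror the domain left-to-right
    return [list(reversed(row)) for row in image]

def rotate90(image):
    # rotate the domain 90 degrees clockwise
    return [list(col) for col in zip(*image[::-1])]

img = [[1, 2], [3, 4]]
print(reflect_horizontal(img))  # -> [[2, 1], [4, 3]]
print(rotate90(img))            # -> [[3, 1], [4, 2]]
```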

    Current developments include patterns for operations on large datasets, as well as patterns for increasingly important data structures, such as feature vectors obtained from earlier calculations on image and video data.

    For reasons of efficiency, all Parallel-Horus operations are capable of adapting to the performance characteristics of the parallel machine at hand, i.e. by being flexible in the partitioning of data structures.

    5. Services-Based Multimedia Grid Computing:

    From the user's perspective, Grid computing is still far from being more than just an academic concept. Essentially, this is because Grids do not yet have the full basic functionality needed for extensive use. Consequently, as long as programming and usage are hard, most researchers in multimedia computing will not regard Grids as a viable alternative to more traditional computer systems.

    6. Performance Evaluation:

    In this section the architecture is assessed for its effectiveness in providing significant performance gains. At the end, the wide-area execution of a state-of-the-art vision task is described. Specifically, the application of a Sony AIBO robot dog, capable of recognizing objects, is presented, while connected to a large-scale Grid system comprising clusters in Europe and Australia.

    6.1. Object Recognition by a Sony AIBO Robot Dog:

    The example application demonstrates object recognition performed by a Sony AIBO robot dog (see Figure (c)). Irrespective of the application to a robot, the general problem of object recognition is to determine which, if any, of a given repository of objects appears in an image or video stream. It is a computationally demanding problem that involves a non-trivial tradeoff between specificity of recognition (e.g., discriminating between different faces) and invariance (e.g., to different lighting conditions). Due to the rapid increase in the size of multimedia repositories of known objects, state-of-the-art sequential computers can no longer live up to the computational demands, making high-performance distributed computing indispensable.

    Fig. (c): Object recognition by the robot dog:
    (1) an object is shown to the dog's camera;
    (2) video frames are processed on a per-cluster basis;
    (3) given the resulting feature vectors describing the scene, a database of known objects is searched;
    (4) in case of recognition, the dog reacts accordingly.

    6.2. Local Histogram Invariants:

    In the robot application, local histograms of invariant features are computed for each aspect of an object. The color invariant features are highly invariant to illumination color, shadow effects, and shading of the object. The features are derived from a kernel-based histogram of feature responses. By exploiting natural image statistics, these histograms are modeled by parameterized density functions. The parameters act as a new class of photometric and geometric invariants, yielding a very condensed representation of local image content.



    Figure (d): Illumination direction variation

    In the learning phase of this system, a single condensed representation of each observed object is stored in a database. Subsequently, object recognition is achieved by matching local histograms, extracted from the video stream generated by the camera in the dog's nose, against the learned database.

    In this approach, each pixel's RGB value is first transformed to an opponent color representation.
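The transform itself appeared in the original only as an equation image; a sketch using the Gaussian colour model of Geusebroek et al. is given below. Treat the exact 3x3 coefficients as an assumption recovered from that related work, not as reproduced from this paper:

```python
# Sketch of the RGB -> opponent colour transform. Each row of the matrix
# produces one opponent channel: intensity, yellow-blue, and red-green.

OPPONENT = [
    [0.06,  0.63,  0.27],   # E:   intensity
    [0.30,  0.04, -0.35],   # El:  yellow versus blue
    [0.34, -0.60,  0.17],   # Ell: red versus green
]

def to_opponent(rgb):
    return tuple(sum(row[c] * rgb[c] for c in range(3)) for row in OPPONENT)

e, el, ell = to_opponent((255, 255, 255))  # a pure white pixel
print(round(e, 2), round(el, 2), round(ell, 2))
```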

    The rationale behind this is that the RGB sensitivity curves of the camera are transformed to Gaussian basis functions, being the Gaussian and its first- and second-order derivatives. Hence, the transformed values represent an opponent color system, measuring intensity, yellow versus blue, and red versus green. Spatial scale is incorporated by convolving the opponent color images with a Gaussian filter. Photometric invariance is now obtained by considering two non-linear transformations. The first color invariant, W, isolates intensity variation from chromatic variation, i.e. edges due to shading, cast shadow, and albedo changes of the object surface. The second invariant feature, C, measures all chromatic variation in the image, disregarding intensity variation, i.e. all variation where the color of the pixels changes. These invariants measure point properties of the scene, and are referred to as point-based invariants. Local image content is then described by channel histograms of the invariant gradients {Ww, Cλw, Cλλw}, and of the edge detectors {Wx, Wy, Cλx, Cλy, Cλλx, Cλλy}, computed separately.

    6.3. Histogram Parameterization:

    From natural image statistics research, it is known that histograms of derivative filters can be well modeled by simple distributions. Previous work has shown that histograms of Gaussian derivative filters in a large collection of images follow a Weibull-type distribution. Furthermore, the gradient magnitudes for the invariants W and C given above follow a Weibull distribution,

    p(r) = (γ/β) (r/β)^(γ−1) exp(−(r/β)^γ),     (1)

    where r represents the response for one of the invariants {Ww, Cλw, Cλλw}.

    The local histograms of invariant derivative filters can be well modeled by an integrated Weibull-type distribution,

    p(r) = γ / (2 γ^(1/γ) β Γ(1/γ)) exp(−(1/γ) |(r − μ)/β|^γ).     (2)

    In this case, r represents the response for one of the invariants {Wx, Wy, Cλx, Cλy, Cλλx, Cλλy}.

    Furthermore, Γ represents the complete Gamma function,

    Γ(x) = ∫₀^∞ t^(x−1) e^(−t) dt.     (3)

    In our implementation we convert the histogram density of all local invariant histograms to Weibull form by first centering the data at the mode of the distribution, then multiplying each bin frequency p(ri) by its absolute response value |ri|, and normalizing the distribution. These transformations allow the estimation of the Weibull density parameters β and γ, indicating the (local) edge contrast and the (local) roughness or texturedness, respectively.
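The three transformation steps just described can be sketched directly (illustrative Python on a synthetic five-bin histogram; the binning itself is assumed):

```python
# Bring a local histogram to Weibull form: (1) centre the responses at
# the mode, (2) weight each bin frequency p(r_i) by |r_i|, and
# (3) renormalize to a probability distribution.

def to_weibull_form(responses, freqs):
    mode = responses[freqs.index(max(freqs))]      # bin with highest frequency
    centred = [r - mode for r in responses]        # step 1
    weighted = [p * abs(r) for r, p in zip(centred, freqs)]  # step 2
    total = sum(weighted)
    return centred, [w / total for w in weighted]  # step 3

responses = [-2, -1, 0, 1, 2]
freqs = [0.1, 0.2, 0.4, 0.2, 0.1]
centred, density = to_weibull_form(responses, freqs)
print(centred)                 # the mode is already at 0 here
print(round(sum(density), 6))  # the result is a proper distribution
```

The Weibull parameters β and γ would then be estimated from this transformed density; the estimator itself is not specified in the paper and is omitted here.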

    6.4. Color-Based Object Recognition:



    In the robot dog system a simple algorithm is applied for object recognition and localization, based on the described invariant features.

    In the first, learning phase of the experiment, an object is characterized by learning the invariant Weibull parameters at fixed locations in the image, representing a sort of fixed retina of receptive fields positioned on a hexagonal grid, 2 apart, on three rings from the center of the image. Hence, a total of 1 + 6 + 12 + 18 = 37 histograms are constructed. For each histogram, the invariant Weibull parameters are estimated. In the learning phase the dog is presented with a set of 1,000 objects under a single visual setting. For each of these objects, the learned set of Weibull parameters is stored in a database.
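The fixed retina above can be sketched as one centre point plus rings of 6, 12, and 18 receptive fields; placing ring k's 6k points evenly on a circle, and the spacing constant, are illustrative assumptions:

```python
import math

# Sketch of the fixed "retina": a centre receptive field plus three
# rings of 6, 12 and 18 fields (1 + 6 + 12 + 18 = 37 in total).

def retina(spacing=2.0, rings=3):
    points = [(0.0, 0.0)]                 # centre of the image
    for k in range(1, rings + 1):
        n = 6 * k                         # ring k carries 6*k fields
        r = k * spacing
        for i in range(n):
            a = 2 * math.pi * i / n       # evenly spaced around the ring
            points.append((r * math.cos(a), r * math.sin(a)))
    return points

fields = retina()
print(len(fields))  # -> 37
```

One local invariant histogram would then be extracted at each of these 37 positions and summarized by its Weibull parameters.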

    Figure (f): Learning phase

    In the second recognition phase, the

    learning step is validated by showing the sameobjects again, under many different

    appearances, with varying lighting direction,

    lighting color, and viewing position, using the

    same retinal structure. In this manner, the robot

    dog has learned each of the 1,000 objects from

    only one example, while being capable of

    recognizing more than 300 of these under

    a diversity of imaging conditions that may occur

    in everyday life. Interestingly, this recognition

    rate is higher than the recognition rate of around

    200 objects reported for a real dog.The important thing is the fact that

    the algorithm runs at around 1 frame per 4

    seconds. Moreover, voting is done by the results

    obtained from 8 consecutive frames. As a result,

    object recognition takes approximately 30

    seconds, which is not even close to real-time

    performance. This problem is overcome

    by applying high-performance distributed

    computing at a very large scale.

    7. Benefits:

    There are many benefits in using color-based object recognition in robot dogs:

    1. Robotic pets can replace real ones.

    2. Robotic pets can provide services and health benefits for senior citizens.

    Figure (g): Robot dog in a household environment

    3. These robotic dogs are flexible to move and can bring objects according to the user's requirements.

    8. Conclusion:

    The results obtained by matching the Parallel-Horus programming model with an execution model based on so-called Wide-Area Multimedia Services are described above. Considering the above features, it can be concluded that the system has succeeded in designing and building a framework. These results have served well in showing the potential of applying Grid resources to state-of-the-art multimedia processing in a coordinated manner. However, there is still room for improvement in functionality, reliability, fault tolerance, and performance. With such improvements, Parallel-Horus will continue to have an immediate and stimulating effect on the study of the many computationally demanding problems in multimedia content analysis. The robot dog application is merely one of these.




