new memory

A Memory Learning Framework for Effective Image Retrieval

Upload: brinda-bm

Post on 10-May-2017

TRANSCRIPT

Page 1: New Memory

A Memory Learning Framework for Effective Image Retrieval

ABSTRACT

Due to the rapidly growing amount of digital image data on the

Internet and in digital libraries, there is a great need for large image database

management and effective image retrieval tools.

The project presents a framework for effective image retrieval by

employing a novel idea of memory learning. It forms a knowledge memory

model to store the semantic information by simply accumulating user-

provided interactions.

A learning strategy is then applied to predict the semantic

relationships among images according to the memorized knowledge. Image

queries are finally performed based on a seamless combination of low-level

features and learned semantics.

The feedback knowledge memory model and the learning strategy are

jointly known as memory learning. Here the spatial features of the images

are used to retrieve them from large databases.

1. INTRODUCTION

1.1 OVERVIEW OF IMAGE RETRIEVAL SYSTEMS

An image retrieval system is a computer system for browsing,

searching and retrieving images from a large database of digital images.

Most traditional and common methods of image retrieval utilize some

method of adding metadata such as captioning, keywords, or descriptions to

the images so that retrieval can be performed over the annotation words.

Manual image annotation is time-consuming, laborious and expensive; to

address this, there has been a large amount of research done on automatic

image annotation.

Image retrieval can be performed using three main features:

Color

Texture

Edge

1.1.1 Overview of CBIR

"Content-based" means that the search will analyze the actual contents

of the image. The term 'content' in this context might refer to colors, shapes,

textures, or any other information that can be derived from the image itself.

Without the ability to examine image content, searches must rely on

metadata such as captions or keywords, which may be laborious or

expensive to produce.

An image can be considered as a mosaic of different texture regions,

and the image features associated with these regions can be used for search

and retrieval. A typical query could be a region of interest provided by the

user, such as outlining a vegetation patch in a satellite image. The input

information in such cases is an intensity pattern or texture within a

rectangular window.

Potential uses for CBIR include:

Art collections

Photograph archives

Retail catalogs

Medical records

CBIR software systems and techniques

Query techniques

Different implementations of CBIR make use of different types of user

queries.

Query by example

Query by example is a query technique that involves providing the CBIR

system with an example image that it will then base its search upon. Options

for providing example images to the system include:

A preexisting image may be supplied by the user or chosen from a

random set.

The user draws a rough approximation of the image they are looking

for, for example with blobs of color or general shapes.

This query technique removes the difficulties that can arise when trying to

describe images with words.

Current CBIR systems therefore generally make use of lower-level

features like texture, color, and shape, although some systems take

advantage of very common higher-level features.

Other query methods

Other methods include specifying the proportions of colors desired

(e.g. "80% red, 20% blue") and searching for images that contain an object

given in a query image.
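
A query by color proportion can be sketched in Java as follows. This is only an illustration under stated assumptions: pixels are packed RGB integers, a pixel counts as "red", "green", or "blue" when that channel is strictly dominant, and the class name ColorProportion and both method names are hypothetical rather than part of the described system.

```java
final class ColorProportion {
    /** Returns the {red, green, blue} fractions of pixels whose strictly dominant channel is that color. */
    static double[] dominantFractions(int[] rgbPixels) {
        double[] counts = new double[3];
        for (int p : rgbPixels) {
            int r = (p >> 16) & 0xFF;
            int g = (p >> 8) & 0xFF;
            int b = p & 0xFF;
            if (r > g && r > b) counts[0]++;
            else if (g > r && g > b) counts[1]++;
            else if (b > r && b > g) counts[2]++;
            // pixels without a strictly dominant channel (e.g. gray) are not counted
        }
        for (int i = 0; i < 3; i++) counts[i] /= Math.max(1, rgbPixels.length);
        return counts;
    }

    /** L1 distance between measured and requested proportions; smaller means a better match. */
    static double mismatch(double[] actual, double[] wanted) {
        double d = 0;
        for (int i = 0; i < 3; i++) d += Math.abs(actual[i] - wanted[i]);
        return d;
    }
}
```

Ranking the database by ascending mismatch against a request such as {0.8, 0.0, 0.2} would return the images closest to "80% red, 20% blue".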

CBIR systems can also make use of relevance feedback, where the user

progressively refines the search results by marking images in the results as

"relevant", "not relevant", or "neutral" to the search query, then repeating the

search with the new information.
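
One common way to turn such markings into a refined query is a Rocchio-style update, sketched below. The text does not say which update rule the system uses, so this is a swapped-in, illustrative technique: the class name RocchioFeedback and the weights alpha, beta, and gamma are assumptions, and images marked "neutral" are simply left out.

```java
final class RocchioFeedback {
    /**
     * Moves the query feature vector toward the mean of the relevant examples
     * and away from the mean of the non-relevant ones.
     */
    static double[] refine(double[] query, double[][] relevant, double[][] notRelevant,
                           double alpha, double beta, double gamma) {
        double[] out = new double[query.length];
        for (int i = 0; i < query.length; i++) out[i] = alpha * query[i];
        addMean(out, relevant, beta);
        addMean(out, notRelevant, -gamma);
        return out;
    }

    /** Adds weight * mean(examples) to acc; does nothing when there are no examples. */
    private static void addMean(double[] acc, double[][] examples, double weight) {
        if (examples.length == 0) return;
        for (double[] e : examples) {
            for (int i = 0; i < acc.length; i++) acc[i] += weight * e[i] / examples.length;
        }
    }
}
```

The search is then repeated with the returned vector as the new query.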

Relevance feedback (RF) was introduced into CBIR to improve retrieval
performance, but conventional RF lacks a memory mechanism.

The feedback knowledge memory model is presented to gather the

users’ feedback information during the process of image search and

feedback. It is efficient and can be simply implemented.

A learning strategy based on the memorized information is proposed. It can
estimate the hidden semantic relationships among images. Consequently, this
technique can alleviate the problem of user log sparsity to a certain extent.

During the interactive process, a seamless combination of normal RF

(low-level feature based) and the memory learning (semantics based) is

proposed to improve the retrieval performance.
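
The text does not specify how the two sources are combined. As one plausible sketch, a linear blend of a low-level feature similarity and a learned semantic similarity could look like this; the class name CombinedScore, the parameter semanticWeight, and the assumption that both similarities lie in [0, 1] are all illustrative.

```java
final class CombinedScore {
    /**
     * Blends a feature-based similarity with a semantics-based similarity.
     * semanticWeight = 0 uses features only; semanticWeight = 1 uses semantics only.
     */
    static double combine(double featureSim, double semanticSim, double semanticWeight) {
        if (semanticWeight < 0 || semanticWeight > 1) {
            throw new IllegalArgumentException("semanticWeight must be in [0, 1]");
        }
        return semanticWeight * semanticSim + (1 - semanticWeight) * featureSim;
    }
}
```

A system of this kind might raise semanticWeight as more feedback accumulates, since the memorized semantics become more reliable over time.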

1.2 OVERVIEW OF GABOR FILTER

Gabor filters are used to extract fractional energies in various
spatial-frequency channels. The system is able to serve queries ranging from
scenes of purely natural objects such as vegetation, trees, sky, etc. to images
containing conspicuous structural objects such as buildings, towers, bridges,
etc.
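
As a rough sketch of what "fractional energies" could mean in code, the example below assumes each channel's filter responses are already available as an array, takes a channel's energy to be the sum of its squared responses, and divides by the total over all channels. That definition, the class name FractionalEnergy, and the array layout are assumptions made for illustration.

```java
final class FractionalEnergy {
    /**
     * channelResponses[k] holds the filter responses of channel k.
     * Returns each channel's share of the total squared-response energy.
     */
    static double[] fractions(double[][] channelResponses) {
        double[] energy = new double[channelResponses.length];
        double total = 0;
        for (int k = 0; k < channelResponses.length; k++) {
            for (double r : channelResponses[k]) energy[k] += r * r;
            total += energy[k];
        }
        if (total == 0) return energy; // no response in any channel: all fractions stay zero
        for (int k = 0; k < energy.length; k++) energy[k] /= total;
        return energy;
    }
}
```

The resulting vector sums to one whenever any channel responds, which makes it usable as a texture feature that does not depend on overall image contrast.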

The Gabor filter (and the Gabor wavelet) has been a popular tool for
extracting such frequency components from both color and grayscale images.

A Gabor filter is capable of first locating and then analyzing regions of
subtle texture differences. When the texture patterns are similar, color
analysis can then be used.

Gabor filters have been used in many applications, such as texture

segmentation, target detection, fractal dimension management, document

analysis, edge detection, retina identification, image coding and image

representation.

Gabor filtering

This block implements one or multiple convolutions of an input image

with a two-dimensional Gabor function:

To visualize a Gabor function select the option "Gabor function"

under "Output image". The Gabor function for the specified values of the

parameters "wavelength", "orientation", "phase offset", "aspect ratio", and

"bandwidth" will be calculated and displayed as an intensity map image in

the output window.

The complex Gabor function in the space domain is

$$g(x, y) = s(x, y)\, w_r(x, y)$$

where $s(x, y)$ is a complex sinusoid, known as the carrier, and $w_r(x, y)$
is a 2-D Gaussian-shaped function, known as the envelope.

The complex sinusoid is defined as

$$s(x, y) = \exp\bigl(j\,(2\pi(u_0 x + v_0 y) + P)\bigr)$$

where $(u_0, v_0)$ and $P$ define the spatial frequency and the phase of the
sinusoid, respectively.

This sinusoid can be thought of as two separate real functions, conveniently
allocated to the real and imaginary parts of a complex function. The real and
imaginary parts of this sinusoid are

$$\operatorname{Re}(s(x, y)) = \cos(2\pi(u_0 x + v_0 y) + P)$$

$$\operatorname{Im}(s(x, y)) = \sin(2\pi(u_0 x + v_0 y) + P)$$

The parameters $u_0$ and $v_0$ define the spatial frequency of the sinusoid in
Cartesian coordinates.

This spatial frequency can also be expressed in polar coordinates as
magnitude $F_0$ and direction $\omega_0$:

$$F_0 = \sqrt{u_0^2 + v_0^2}, \qquad \omega_0 = \tan^{-1}(v_0 / u_0)$$

that is,

$$u_0 = F_0 \cos\omega_0, \qquad v_0 = F_0 \sin\omega_0$$

Using this representation, the complex sinusoid is

$$s(x, y) = \exp\bigl(j\,(2\pi F_0 (x \cos\omega_0 + y \sin\omega_0) + P)\bigr)$$
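
The polar form of the carrier can be evaluated directly in Java. The sketch below multiplies it by an envelope to obtain the real and imaginary parts of the Gabor function at one point. The text does not give the envelope's formula, so an isotropic Gaussian with a hypothetical standard deviation sigma is assumed here; the class name GaborFunction is likewise illustrative.

```java
final class GaborFunction {
    /**
     * Evaluates g(x, y) = s(x, y) * w_r(x, y) and returns {Re g, Im g}.
     * f0 and omega0 are the polar spatial frequency, phase is P, and
     * sigma is the width of the assumed isotropic Gaussian envelope.
     */
    static double[] value(double x, double y, double f0, double omega0,
                          double phase, double sigma) {
        double u0 = f0 * Math.cos(omega0); // Cartesian frequency components
        double v0 = f0 * Math.sin(omega0);
        double arg = 2 * Math.PI * (u0 * x + v0 * y) + phase;
        double envelope = Math.exp(-(x * x + y * y) / (2 * sigma * sigma));
        return new double[] { envelope * Math.cos(arg), envelope * Math.sin(arg) };
    }
}
```

Sampling this function on a small grid of (x, y) offsets would give a convolution kernel for one frequency and orientation channel.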

The Gabor space is very useful in image processing applications such as iris
recognition and fingerprint recognition. Relations between activations for a
specific spatial location are very distinctive between objects in an image.

1.3 SOFTWARE DESCRIPTION

Java is an object-oriented, multithreaded programming language whose
development began at Sun Microsystems in 1991 and which was publicly released
in 1995. It is designed to be simple and portable across different platforms
and operating systems. The popularity of Java is due to a technology built on
three key elements: the use of applets, powerful programming language
constructs, and a rich set of significant object classes.

Features of Java

Java was designed to meet all the real world requirements with its

features, which are explained in the following paragraphs:

Simple and portable

Java keeps itself simple by avoiding surprising features. Because the Java
Virtual Machine abstracts the internal workings of the underlying machine,
programmers can carry out their intended actions without fear. It is portable
across multiple platforms.

Multithreaded

Java supports multithreaded programming, which allows the user to write
programs that perform many functions simultaneously.
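
As a small illustration of multithreading in an image retrieval setting, the sketch below computes the mean intensity of several grayscale images in parallel using the standard java.util.concurrent executor API. The class name ParallelMean and the choice of mean intensity as the per-image work are assumptions made only for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelMean {
    /** Computes the mean pixel value of each image, one task per image. */
    static double[] meanIntensities(List<int[]> grayImages, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Double>> futures = new ArrayList<>();
            for (int[] img : grayImages) {
                futures.add(pool.submit(() -> {
                    double sum = 0;
                    for (int v : img) sum += v;
                    return img.length == 0 ? 0.0 : sum / img.length;
                }));
            }
            double[] out = new double[futures.size()];
            for (int i = 0; i < out.length; i++) out[i] = futures.get(i).get(); // waits for task i
            return out;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Feature computation failed", e);
        } finally {
            pool.shutdown(); // always release the worker threads
        }
    }
}
```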

Security

The security manager determines which resources a class can access, such as
reading from and writing to the local disk.

Dynamic Binding

The linking of data and methods to where they are located is done at

run-time. New classes can be loaded while a program is running. Linking is

done on the fly.
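
Run-time loading can be demonstrated with the standard reflection API: a class is looked up by name while the program is running and then instantiated. The helper class DynamicLoading below is a hypothetical wrapper written for this illustration.

```java
final class DynamicLoading {
    /** Loads the named class at run time and creates an instance with its no-argument constructor. */
    static Object instantiate(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load " + className, e);
        }
    }
}
```

For example, DynamicLoading.instantiate("java.util.ArrayList") returns a new list even though the calling code never names that class at compile time.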

JAI:

Java Advanced Imaging (JAI) is a Java platform extension API that provides

a set of object-oriented interfaces that support a simple, high-level

programming model which allows images to be manipulated easily in Java

applications and applets. JAI goes beyond the functionality of traditional

imaging APIs to provide a high-performance, platform-independent,

extensible image processing framework.

Java Swing:

Swing is a widget toolkit for Java. It is part of Sun Microsystems' Java

Foundation Classes (JFC) — an API for providing a graphical user

interface (GUI) for Java programs.

Swing was developed to provide a more sophisticated set of GUI components
than the earlier Abstract Window Toolkit. Swing provides a native look and
feel that emulates the look and feel of several platforms, and also supports a
pluggable look and feel that allows applications to have a look and feel
unrelated to the underlying platform.

Architecture

Swing is a platform-independent, Model-View-Controller GUI

framework for Java. It follows a single-threaded programming model, and

possesses the following traits:

Platform independence: Swing is platform independent both in terms

of its expression (Java) and its implementation (non-native universal

rendering of widgets).

Extensibility: Swing users can extend the framework by extending

existing (framework) classes and/or providing alternative

implementations of core components.

Component-Oriented: Swing is a component-based framework.

Swing components are Java Beans components, compliant with the

Java Beans Component Architecture specifications.

Customizable: Users can programmatically customize a standard

Swing component (such as a JTable) by assigning specific Borders,

Colors, Backgrounds, opacities, etc., as the properties of that

component.

Configurable: Swing's heavy reliance on runtime mechanisms and

indirect composition patterns allows it to respond at runtime to

fundamental changes in its settings.

Lightweight UI: Swing's configurability is a result of a choice not to

use the native host OS's GUI controls for displaying itself. Swing

"paints" its controls programmatically through the use of Java 2D

APIs, rather than calling into a native user interface toolkit.

Loosely-Coupled/MVC: The Swing library makes heavy use of the

Model/View/Controller software design pattern, which conceptually

decouples the data being viewed from the user interface controls

through which it is viewed.

2. LITERATURE REVIEW

Relevance feedback for content-based image retrieval using Bayesian

network

3. SYSTEM SPECIFICATION

3.1 HARDWARE SPECIFICATION

Monitor : EGA / VGA

Keyboard : 112 Multimedia keyboards

Processor : Pentium IV processor

Hard Disk : 40GB

RAM : 256MB

3.2 SOFTWARE SPECIFICATION

Operating System : Windows XP

Language : Java

Software : JCreator Pro

4. SYSTEM ANALYSIS

4.1 EXISTING SYSTEM

The limited retrieval accuracy of image-centric retrieval systems is

essentially due to the inherent gap between semantic concepts and low-level

features. In order to reduce the gap, the interactive relevance feedback (RF)

is introduced into CBIR.

The basic idea of RF is to incorporate human perception subjectivity

into the query process and provide users with the opportunity to evaluate the

retrieval results. The similarity measures are automatically refined on the

basis of these evaluations.

Although RF can significantly improve the retrieval performance, its

applicability still suffers from three inherent drawbacks.

1) Incapability of capturing semantics.

2) Scarcity and imbalance of feedback examples.

3) Lack of the memory mechanism.

To overcome these difficulties, another method, generally called long-term
learning, was introduced into CBIR. Long-term learning algorithms memorize and
accumulate users’ preferences during the RF process. They are mainly based on
previous users’ behaviors, which embody more semantic information than
low-level features.

The idea of long-term learning in CBIR is borrowed from work on
collaborative filtering and link structure analysis in web information
retrieval.

However, existing long-term learning systems inevitably encounter two
problems in practice.

1. The memorized feedback information is sparse.

2. Such systems perform little or no learning on the memorized information.

4.2 PROPOSED SYSTEM

A novel memory learning framework has been proposed to address those two
issues.

A feedback knowledge memory model is introduced to accumulate

the previous users’ preferences. Furthermore, a learning strategy is presented

to predict hidden semantics using the memorized information, which is able

to reduce the limitation of user log sparsity to a certain extent.

The feedback knowledge memory model and the learning strategy are

jointly known as memory learning.

Feedback knowledge memory model

The feedback images are provided to the system by the experts who

use them. These feedback images are retrieved when the user gives the same

query next time. Since the user log accumulates feedback knowledge from

various users, the semantic correlations can reflect the preference of the

majority of the users.
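
A feedback knowledge memory of this kind can be sketched as a table of votes between image pairs. The representation below is an assumption for illustration, not the report's actual data structure: images are identified by integer indices, and each feedback session adds one vote to every pair of images that the user marked relevant together. The class name FeedbackMemory and its methods are hypothetical.

```java
final class FeedbackMemory {
    private final int[][] coRelevant; // coRelevant[i][j] = sessions in which i and j were both relevant

    FeedbackMemory(int imageCount) {
        coRelevant = new int[imageCount][imageCount];
    }

    /** Records one feedback session given the ids of the images marked relevant. */
    void record(int[] relevantIds) {
        for (int a = 0; a < relevantIds.length; a++) {
            for (int b = a + 1; b < relevantIds.length; b++) {
                coRelevant[relevantIds[a]][relevantIds[b]]++;
                coRelevant[relevantIds[b]][relevantIds[a]]++; // keep the table symmetric
            }
        }
    }

    /** Number of sessions that linked images i and j. */
    int votes(int i, int j) {
        return coRelevant[i][j];
    }
}
```

Because every user's sessions add to the same table, pairs with many votes reflect the judgment of the majority rather than of a single user.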

Learning strategy:

A learning strategy is used to estimate the hidden semantic correlation
between two images that have no “direct link.”
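
The text does not state the estimation rule, so the following is only one plausible sketch. It assumes the memorized correlations are normalized to [0, 1] and stored in a symmetric matrix, and it estimates a missing link between images i and j through the strongest intermediate image k, using the product of the two known links. The class name HiddenSemantics and this product rule are illustrative assumptions.

```java
final class HiddenSemantics {
    /**
     * Returns the memorized correlation of (i, j) if one exists; otherwise
     * estimates it as the best product corr[i][k] * corr[k][j] over all other images k.
     */
    static double estimate(double[][] corr, int i, int j) {
        if (corr[i][j] > 0) return corr[i][j]; // a direct link is already memorized
        double best = 0;
        for (int k = 0; k < corr.length; k++) {
            if (k == i || k == j) continue;
            best = Math.max(best, corr[i][k] * corr[k][j]);
        }
        return best;
    }
}
```

Under this rule, two images that were never judged together still receive a nonzero correlation when both are strongly linked to a common third image, which is how a sparse user log can be partly filled in.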

Advantages:

The memory learning provides the normal RF with a pool of

positive examples according to its captured knowledge, which helps the

normal RF to alleviate the problem of scarcity and imbalance of feedback

examples.

It is able to automatically collect and analyze the users’ historical

judgments offline without additional cost of user interaction. Also, it hardly

influences the speed of the real-time retrieval system.
