Shape extraction framework for similarity search in image databases
Jan Klíma, Tomáš Skopal
Charles University in Prague, Department of Software Engineering, Czech Republic
IVPF (Image and vector processing framework)
Motivation
Search in image databases
Text-based methods become useless, since the annotation effort they require exceeds human capabilities
Metadata-based systems need explicit additional information to work effectively (images.google.com)
Content-based low-level methods like color histograms may be misleading and do not capture high-level features (Amore system, ImageMiner, ...)
High-level feature extraction is in practice limited to domain-specific systems (biometric feature recognition, ...)
Overall approach
Shape is one of the most important features found in images
Although it is one of the basic features recognized by human sight, it often carries high-level information
But how should we do the shape extraction to achieve the best results?
There exist plenty of algorithms for shape extraction, but which should be used and how?
One would like to have freedom for experimentation with different approaches
The IVP framework was implemented to allow configurable extraction of image features, especially shapes
Overall approach
IVPF separates objects that figure in image processing
Bitmaps
Histograms
Vectors (polylines,...)
...
and algorithms which work with these objects on input-output basis
Edge detection
Vectorization
Artifact removal
...
Overall approach
Each algorithm is considered as a black box - a component that takes some input and produces defined output
Components can be put together to form a component network
A component network usually consists of
Input components that send data into the network
Output components that save processed data
Worker components that transform their inputs into outputs
The component network handles the high-level functionality and in effect forms a separate application
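The black-box idea above can be sketched in a few lines. This is only an illustration in Python, not the framework's .NET code: the class names (Component, ListInput, Worker, CollectOutput) and the run_network helper are hypothetical, and the network is assumed to be a simple linear chain.

```python
from typing import Any, Callable, List


class Component:
    """A black box: takes some input and produces a defined output."""

    def process(self, data: Any) -> Any:
        raise NotImplementedError


class ListInput(Component):
    """Input component (hypothetical): sends stored data into the network."""

    def __init__(self, items: List[Any]) -> None:
        self.items = items

    def process(self, data: Any = None) -> Any:
        return self.items


class Worker(Component):
    """Worker component: transforms its input into an output."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def process(self, data: Any) -> Any:
        return self.fn(data)


class CollectOutput(Component):
    """Output component (hypothetical): saves the processed data."""

    def __init__(self) -> None:
        self.saved: List[Any] = []

    def process(self, data: Any) -> Any:
        self.saved.append(data)
        return data


def run_network(components: List[Component]) -> Any:
    """Run a linear network: each component's output feeds the next one."""
    data = None
    for component in components:
        data = component.process(data)
    return data


# Toy "thresholding" worker applied to one row of grey values.
threshold = Worker(lambda row: [1 if v >= 128 else 0 for v in row])
sink = CollectOutput()
run_network([ListInput([10, 200, 130, 90]), threshold, sink])
```

Swapping the worker for another one changes the behaviour of the whole chain without touching the input or output components, which is the kind of experimentation the slides aim for.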
Data flow example
Overall approach
Advantages
Flexibility and configurability
Maximum reusability of existing code
Room for experimentation
Disadvantages
There is always some necessary amount of redundant work
• The objects that components work with (bitmaps, vectors) must be defined in a general-purpose way
• But certain algorithms might need data in different representations
Higher memory demands
Some performance penalty
Further details
Framework is implemented in .NET 2.0
Components are encapsulated in managed classes
Which are loaded dynamically from a DLL using .NET reflection
Minimal amount of effort is needed to create a new component
• All the work is handled by the higher levels of the framework
Component network can be created from or saved to an XML file
GUI to simplify network creation is on the way
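As a rough illustration of creating a network from an XML file and saving it back, the sketch below round-trips a small description with Python's standard library. The slides do not show the real file format, so the schema here (network, component, link elements and their id, type, from, to attributes) is entirely hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: element and attribute names are invented for illustration.
NETWORK_XML = """
<network>
  <component id="in" type="BitmapInput"/>
  <component id="edges" type="EdgeDetection"/>
  <component id="out" type="VectorOutput"/>
  <link from="in" to="edges"/>
  <link from="edges" to="out"/>
</network>
"""


def load_network(xml_text):
    """Return (components, links) parsed from the XML description."""
    root = ET.fromstring(xml_text)
    components = {c.get("id"): c.get("type") for c in root.findall("component")}
    links = [(l.get("from"), l.get("to")) for l in root.findall("link")]
    return components, links


def save_network(components, links):
    """Serialize the same structure back to an XML string."""
    root = ET.Element("network")
    for comp_id, comp_type in components.items():
        ET.SubElement(root, "component", id=comp_id, type=comp_type)
    for src, dst in links:
        # "from" is a Python keyword, so the attributes are passed as a dict.
        ET.SubElement(root, "link", {"from": src, "to": dst})
    return ET.tostring(root, encoding="unicode")


components, links = load_network(NETWORK_XML)
round_trip = load_network(save_network(components, links))
```

In the real framework the type attribute would presumably be resolved to a managed class loaded from a DLL; here it stays a plain string.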
Component catalogue
The currently implemented components focus on providing basic shape extraction capabilities
Component groups
Bitmap handling (resize, thresholding, ...)
Edge detection
Binary image processing
Vectorization
Polyline simplification
Artifact removal
Line connection
Transformation examples
Edge detection components
Thinning component
Iterative artifact pruning component
Scenarios
It's hard to obtain robust shape extraction capabilities on a general set of images
Instead, some methods might work only in certain situations
By creating a set of scenarios for different image types, shape extraction should bring good results even in big image databases
The most obvious examples of such shape extraction scenarios are
Maps
Drawings
Photos
...
„Simple drawing“ scenario example
For high-contrast images, edge detection alone is a reliable way to extract the required feature information
Artifact removal is a relatively safe operation then
A reconnection of disconnected lines and corners that follows will almost completely reconstruct the full shape information
Finally, a polyline simplification is done to straighten jagged lines and minimize the produced number of line segments
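The final simplification step can be illustrated with the Ramer-Douglas-Peucker algorithm, one common way to straighten jagged polylines. The slides do not say which simplification algorithm IVPF uses, so this is a stand-in, and the function names and the tolerance value are assumptions made for the example.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Ramer-Douglas-Peucker: drop vertices closer than tolerance to the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, -1.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [first, last]
    # Keep the farthest vertex and simplify both halves recursively.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


# A jagged "L" shape: small wiggles along both arms, one real corner at (3, 0).
jagged = [(0, 0), (1, 0.1), (2, -0.1), (3, 0), (3.1, 1), (2.9, 2), (3, 3)]
simplified = simplify(jagged, tolerance=0.2)
```

With a tolerance of 0.2 the wiggles along both arms are removed while the corner is kept, so seven points shrink to three line endpoints.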
„Simple drawing“ scenario
„Simple drawing“ scenario
Work progress example
Original image
Gradient
Edge detection
Polished vector result
Future plans
Shape representation and similarity measure for database queries
Shape information made of polylines can be turned into a time series and matched using methods from the DTW family
Self-configuration
A component is not restricted to image processing work only
Components could evaluate the quality of their outputs and adjust network settings accordingly
Such self-configuration could eventually lead to fully automatic scenario recommendation
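The planned time-series matching of polylines could look roughly like the sketch below. The slides do not specify how a polyline becomes a time series or which DTW variant is meant, so two assumptions are made here: the series is the sequence of turning angles at inner vertices, and the distance is plain, unconstrained dynamic time warping.

```python
import math


def turning_angles(polyline):
    """Turn a polyline into a series of turning angles (radians) at inner vertices."""
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(polyline, polyline[1:], polyline[2:]):
        a = math.atan2(y1 - y0, x1 - x0)
        b = math.atan2(y2 - y1, x2 - x1)
        # Wrap the heading difference into the range [-pi, pi].
        angles.append(math.atan2(math.sin(b - a), math.cos(b - a)))
    return angles


def dtw(s, t):
    """Classic dynamic time warping distance between two numeric series."""
    inf = float("inf")
    cost = [[inf] * (len(t) + 1) for _ in range(len(s) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            d = abs(s[i - 1] - t[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(s)][len(t)]


square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
big_square = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
triangle = [(0, 0), (2, 0), (1, 2), (0, 0)]

same_shape = dtw(turning_angles(square), turning_angles(big_square))
other_shape = dtw(turning_angles(square), turning_angles(triangle))
```

Turning angles do not change under scaling or translation, so the two squares match with distance zero while the triangle does not. A real similarity measure would also have to handle the choice of starting point and noise in the extracted polylines.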