International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367(Print),
ISSN 0976 – 6375(Online) Volume 1, Number 2, Sep - Oct (2010), © IAEME
A NOVEL APPROACH FOR SATELLITE IMAGERY STORAGE BY CLASSIFYING THE
NON-DUPLICATE REGIONS
Cyju Varghese
Computer Science Department
Karunya University, India
E-Mail: [email protected]
John Blesswin
Computer Science Department
Karunya University, India
E-Mail: [email protected]
Navitha Varghese
Computer Science Department
Karunya University, India
E-Mail: [email protected]
Sonia Singha
Computer Science Department
Karunya University, India
E-Mail: [email protected]
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367 (Print),
ISSN 0976 – 6375 (Online), Volume 1, Number 2, Sep - Oct (2010), pp. 147-159
© IAEME, http://www.iaeme.com/ijcet.html
ABSTRACT
Every day, satellites capture thousands of images that need to be classified
properly. In this paper, we address the problem of replacing the existing images
with the captured ones. We provide a new solution that stores only the non-existing part of
the image. Though satellite images have been classified in the past using various
techniques, researchers are always looking for alternative strategies for satellite image
classification so that they can select the most appropriate technique for
the feature extraction task at hand. To overcome this difficulty, we propose an
efficient approach consisting of an algorithm that adopts robust kernel
principal component analysis (KPCA) features to reduce the dimensionality of the image. For
image clustering, we utilize the Fuzzy N-Means algorithm. Finally, the data is stored in the
database according to its specific class by a support vector machine classifier. Thus
the proposed scheme improves the storage efficiency of satellite images in the database,
saves time, and makes the correction of satellite images more efficient.
Index Terms- Compression, Duplicate Detection, Feature Extraction, Image Clustering,
Satellite Image
1. INTRODUCTION
Satellite images play an important role in many applications, especially in
capturing images of the earth for environmental study and homeland security. ‘Geo’ is a
commonly used satellite for capturing earth images. Thousands upon thousands of images are
transmitted every day to the ‘digital globe’ database. The topography of the earth changes
every day, and therefore updating the images in the database frequently is very tedious. In
current applications, each image is replaced in the database in its entirety. Instead of updating the
whole image, this paper employs an approach that detects duplicate and non-duplicate
blocks in the captured image and updates only the non-duplicate blocks in the
corresponding image in the database. The approach makes use of a duplication
detection algorithm.
To avoid storing the same image content twice, a duplication detection approach needs
to be applied. Traditional approaches to duplication detection of image objects normally
partition images into several blocks. These detection methods are designed specifically
for separating duplicate and non-duplicate image regions. They can detect duplication
when the locations of the extracted objects are invariant to scaling, translation, or
rotation.
The traditional techniques used in detecting duplication include the discrete wavelet
transform (DWT), principal component analysis (PCA), and the Fourier-Mellin transform (FMT).
These techniques are restricted to linear features. Duplicate detection involves
dividing the image into overlapping blocks, extracting features from each block, and detecting
similar features. Depending on the type of duplication, various measures and mechanisms
can be adopted and implemented to counter duplication.
A discrete wavelet transform (DWT) [3][4] maps the time-domain signal f(t)
into a real-valued time-frequency domain, and the signals are described by the wavelet
coefficients. Five-scale signal decomposition is performed to ensure that all disturbance
features in both high and low frequencies are extracted. Thus, the output of the wavelet
transform consists of five decomposed scale signals with different levels of resolution.
Principal component analysis (PCA) [5] in signal processing can be described as a
transform of a given set of input vectors (variables) of the same length, formed in an
n-dimensional vector. FMT [6][7] is a global transform and applies to all pixels in the same
way. The Fourier-Mellin transform provides translation, scaling, and rotation invariance. To
achieve these properties, the image is divided into overlapping blocks. The Fourier transform is
applied to each block to obtain features.
KPCA [1][2] is preferable to the other techniques because it performs nonlinear
feature extraction. It can detect duplication even if a particular portion of an image has been
rotated in any direction. Quantitative analyses indicate that the KPCA-based feature
obtains excellent performance in additive noise and lossy JPEG compression
environments. This method uses a global geometric transformation and a labeling
technique to identify the mentioned duplication. Experiments with a good number of
natural images show very promising results when compared with the other conventional
approaches. KPCA is mainly used for nonlinear feature extraction, whereas the other
techniques are used for linear feature extraction. KPCA extracts more useful features than
linear PCA. The initial mapping to a high-dimensional space provides smoother
dimensionality reduction than standard PCA. It does not require nonlinear optimization,
only the solution of an eigenvalue problem. Although signal reconstruction is unnecessary
for tampering detection, KPCA is computationally more expensive than linear PCA.
2. PROPOSED SCHEME
A satellite image is used as the input for this application. Before the image is
stored in the database, its size is reduced, so that it requires less memory.
Kernel principal component analysis is used to reduce the dimensionality of the image.
Kernel PCA is a nonlinear feature extractor, which is used to detect duplicate and
non-duplicate regions in the satellite image. In kernel PCA, one important concern is the selection
of the kernel function and the computation of the Gram matrix. Gaussian kernels can capture
nonlinearity in the data and can simulate the behavior of other kernels. The Gram matrix can be
found from the following equation:
$K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right)$ (1)
where $\sigma$ is the parameter of the Gaussian kernel. The value of this kernel parameter is very
important.
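As a rough illustration of how a Gram matrix might be computed, the sketch below assumes the common Gaussian form exp(-||xi - xj||^2 / (2 sigma^2)). The function name `gaussian_gram`, the value of `sigma`, and the three sample vectors are hypothetical and are not taken from the paper.

```python
import math

def gaussian_gram(samples, sigma):
    """Return the Gram matrix K with K[i][j] = exp(-||xi - xj||^2 / (2 sigma^2))."""
    n = len(samples)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Squared Euclidean distance between sample i and sample j.
            sq_dist = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
            value = math.exp(-sq_dist / (2.0 * sigma ** 2))
            gram[i][j] = gram[j][i] = value  # the matrix is symmetric
    return gram

# Three hypothetical feature vectors standing in for image blocks.
blocks = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = gaussian_gram(blocks, sigma=1.0)
```

A small `sigma` makes all off-diagonal entries approach zero, while a large `sigma` makes them approach one, which is one way to see why the choice of kernel parameter matters.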
Figure 1 Flowchart of proposed scheme
To compute the principal components, the following steps are followed:
1. Construct one training and one testing matrix.
2. Compute the Gram matrix for the training matrix.
3. Center the training Gram matrix.
4. Diagonalize the centered matrix and compute its eigenvalues and eigenvectors.
5. Construct the test Gram matrix.
6. Center the test Gram matrix.
7. Compute the projection of all vectors onto the eigenvectors.
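The seven steps listed here can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a Gaussian kernel of the form exp(-||x - y||^2 / (2 sigma^2)), and the names `gaussian_gram`, `kpca_project`, `train_blocks`, and `test_blocks`, as well as the random data, are hypothetical.

```python
import numpy as np

def gaussian_gram(a, b, sigma):
    """Gaussian kernel values between every row of a and every row of b."""
    sq_dist = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma ** 2))

def kpca_project(train, test, n_components, sigma=1.0):
    m = train.shape[0]
    k_train = gaussian_gram(train, train, sigma)                 # step 2
    ones_m = np.full((m, m), 1.0 / m)
    k_train_c = (k_train - ones_m @ k_train - k_train @ ones_m
                 + ones_m @ k_train @ ones_m)                    # step 3
    eigvals, eigvecs = np.linalg.eigh(k_train_c)                 # step 4 (ascending order)
    top = np.argsort(eigvals)[::-1][:n_components]
    # Scale so that the corresponding feature-space directions have unit norm.
    alphas = eigvecs[:, top] / np.sqrt(np.maximum(eigvals[top], 1e-12))
    k_test = gaussian_gram(test, train, sigma)                   # step 5
    ones_t = np.full((test.shape[0], m), 1.0 / m)
    k_test_c = (k_test - ones_t @ k_train - k_test @ ones_m
                + ones_t @ k_train @ ones_m)                     # step 6
    return k_train_c @ alphas, k_test_c @ alphas                 # step 7

rng = np.random.default_rng(0)
train_blocks = rng.normal(size=(8, 4))   # step 1: 8 hypothetical training blocks
test_blocks = rng.normal(size=(3, 4))    # step 1: 3 hypothetical test blocks
train_proj, test_proj = kpca_project(train_blocks, test_blocks, n_components=2)
```

Only an eigendecomposition of the centered Gram matrix is needed, which matches the remark that KPCA requires no nonlinear optimization. The cost grows with the number of training samples, since the Gram matrix has one row and one column per sample.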
Extracting image features from the compressed images is the most important step,
since it has a great impact on retrieval performance.
A. Satellite Image Clustering
The concept of points having significant membership in multiple classes is
exploited by the fuzzy algorithm. The points situated in the overlapping regions of different
clusters are first identified and excluded from consideration while clustering. Thereafter,
these points are given class labels by a support vector machine classifier, which is
trained on the remaining points. The well-known Fuzzy N-Means algorithm and some
recently proposed genetic clustering schemes are utilized in the process. The image is divided
into a number of blocks. Each block can have the same features or different kinds of features.
Clustering is performed to group features of the same kind. This step gives an
additional advantage for duplication detection in the image.
B. Satellite Image Segmentation
Using KPCA, the image is divided into a number of blocks. Each block can have the same
features or different kinds of features. Here, image segmentation is performed to group
features of the same kind. This step gives an additional advantage for duplication
detection. Image segmentation is the basis of image analysis and understanding. Image
segmentation is exactly the problem of classifying the pixel set of an image, so clustering
analysis is naturally applied to image segmentation. Here we use the Fuzzy N-Means
algorithm for image segmentation. Fuzzy N-Means is an improved version of Fuzzy C-Means.
An outlier test is also performed to improve the performance of segmentation. The internal
level is used for calculating the new centroid and updating the fuzzy membership matrix,
and the external level is for judging whether the algorithm has converged to the estimated
threshold. After the iteration finishes, the membership level of each pixel with respect to
each cluster centre can be read from the generated fuzzy membership matrix, and the
category of the pixel is determined by the magnitude of the matrix entries [8].
Algorithm:
Input: Test image
Output: Segmented image
Step 1: Initialize the parameter and perform normalization.
Step 2: For k = 1, ..., N:
Perform the outlier test on $\lVert x_k - v_i \rVert$, where $u_i(k)$ is the fuzzy membership value.
Update the centroid by:
$v_i^{(\mathrm{new})} = v_i^{(\mathrm{old})} + (x_k - v_i^{(\mathrm{old})})$
Step 3: Termination test: $\lVert v_i^{(\mathrm{new})} - v_i^{(\mathrm{old})} \rVert > \epsilon$
The problem of segmenting the image into different clusters is handled iteratively [10]
by means of a single parameter. The outlier test is performed to improve the cluster
validity index. After the cluster centers are found, the fuzzy membership value can be
measured at any point. Thus, groups of clusters with similar features are obtained after
performing this algorithm.
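To make the interplay of memberships, centroids, and the termination test concrete, here is a minimal sketch of standard fuzzy C-means on one-dimensional pixel intensities. It is a swapped-in, textbook variant and not the Fuzzy N-Means algorithm with its outlier test, whose exact update rules are not recoverable from the text. The fuzzifier `m`, the tolerance `eps`, the initial centers, and the sample intensities are all assumptions.

```python
def fuzzy_c_means(values, centers, m=2.0, eps=1e-6, max_iter=100):
    """Standard fuzzy C-means on scalar values; returns (centers, memberships)."""
    centers = list(centers)
    for _ in range(max_iter):
        # Inner level: update the membership of every value in every cluster.
        memberships = []
        for x in values:
            # A tiny floor avoids division by zero when x sits on a center.
            inv = [max(abs(x - v), 1e-12) ** (-2.0 / (m - 1.0)) for v in centers]
            total = sum(inv)
            memberships.append([w / total for w in inv])
        # Inner level: recompute each centroid as a membership-weighted mean.
        new_centers = []
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            new_centers.append(sum(w * x for w, x in zip(weights, values)) / sum(weights))
        # Outer level: stop once the centroids move less than the threshold.
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < eps:
            break
    return centers, memberships

# Hypothetical pixel intensities: a dark group and a bright group.
pixels = [10, 12, 11, 200, 205, 198]
centers, memberships = fuzzy_c_means(pixels, centers=[0.0, 255.0])
```

Each row of `memberships` sums to one, so a pixel lying between two clusters receives a split membership instead of a hard label. Such pixels are the ones the text proposes to set aside and label later with a classifier.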
Image segmentation means that the image is represented as a set of physically
meaningful connected areas [9]. Generally, we achieve image segmentation by
analyzing such different image characteristics using the Fuzzy N-Means clustering
algorithm.
Table 1 List of Symbols
Symbol    Meaning
u_i       Fuzzy membership value
v_i       Centroid value
x_k       Pixel values
C. Duplication Detection of Satellite Image
The satellite image database contains previously captured images of the real world, which
are used as training data. Every day, thousands of images are captured by satellite. In order to
update the database, duplication detection has to be performed before storing an image
in the database. Currently, each time the satellite stores a new image in the database, it
replaces the previously captured one. This process is time consuming and requires
additional memory space. This paper proposes a new approach, which updates the
existing image with the identified non-duplicate blocks.
To find the duplicate and non-duplicate blocks in the images, a duplication
detection algorithm is proposed. The input of this algorithm is the test image block.
The duplication detection steps are as follows:
Algorithm:
Input: Test image of N pixels.
Output: Non-duplicate block of image.
Step 1: Initialize block processing parameters:
b: Number of pixels per block,
Q: Number of quantization bins,
Rth: Number of neighboring rows to search in the lexicographically sorted matrix,
Dth: Minimum offset-magnitude threshold,
∊: Fraction of the ignored variance along the principal axes or the fraction of the
ignored local variance of the wavelet coefficients,
M: Number of training samples for the KPCA.
Step 2: Apply KPCA on each block of b pixels and compute a transform vector of length
L, which is equal to (M, Nt2) for the KPCA-based features with dimension
reduction.
Step 3: Construct a data matrix, Mdata, of size Nb × L, whose rows contain the
component-wise quantized features, i.e., $\lfloor a_i / Q \rfloor$.
Step 4: Apply lexicographic sorting to the rows of the above matrix to obtain a new
matrix S. Let si be the i-th row of S, which represents the i-th block with its
center coordinates (xi, yi).
Step 5: For every row si of S, select a number of adjacent rows, sj, such that |i − j| < Rth,
and place all pairs of coordinates (xi, yi) and (xj, yj) for j = 0, 1, ..., (Rth − 1) onto
a list Pin.
Step 6: Eliminate all pairs of points whose offset magnitude, Dof, is less than Dth.
Construct a set, OF, of the various offsets (m, n) and offset frequencies (fm,n) for all
elements in Pin.
Step 7: Create a refined list of point pairs, Pout, from Pin by the algorithm or by using
a manual threshold, fth.
The proposed duplication detection algorithm has several parameters that must be
selected and justified before use. These are the block size (b), the number of quantization
bins (Q), the block-similarity threshold (Rth), the minimum offset-magnitude threshold (Dth),
the offset-frequency threshold (fth), and the fraction of ignored variance (∊). The selection
of Q depends on the feature variations. The selection of Rth depends on how well
lexicographic sorting arranges similar vectors (blocks) in the sorted matrix, S. The parameter Dth is
used to avoid false detection.
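The sorting and offset-filtering steps (Steps 3 to 7) can be illustrated with the following sketch. To stay self-contained it uses quantized raw pixel values as block features in place of KPCA features, identifies each block by its top-left corner where the algorithm uses the center, and treats an offset (m, n) and its negation as the same offset. The function names, the 12 x 12 random test image, and all parameter values are hypothetical.

```python
import math
import random
from collections import Counter

def normalized_offset(p, q):
    """Offset between two block positions, sign-normalized so (m, n) and (-m, -n) coincide."""
    dx, dy = p[0] - q[0], p[1] - q[1]
    if dx < 0 or (dx == 0 and dy < 0):
        dx, dy = -dx, -dy
    return dx, dy

def detect_duplicate_pairs(image, side, q, r_th, d_th, f_th):
    height, width = len(image), len(image[0])
    rows = []
    for y in range(height - side + 1):
        for x in range(width - side + 1):
            # Step 3: component-wise quantized features, floor(a / Q).
            feature = tuple(image[y + dy][x + dx] // q
                            for dy in range(side) for dx in range(side))
            rows.append((feature, (x, y)))
    rows.sort()  # Step 4: lexicographic sorting of the feature rows
    pairs = []
    for i in range(len(rows)):
        # Step 5: compare row i with the following rows j where |i - j| < r_th.
        for j in range(i + 1, min(i + r_th, len(rows))):
            p, q_pos = rows[i][1], rows[j][1]
            # Step 6: drop pairs whose offset magnitude is below d_th.
            if math.hypot(p[0] - q_pos[0], p[1] - q_pos[1]) >= d_th:
                pairs.append((p, q_pos))
    # Steps 6 and 7: count offset frequencies and keep only the frequent offsets.
    freq = Counter(normalized_offset(p, q_pos) for p, q_pos in pairs)
    return [(p, q_pos) for p, q_pos in pairs if freq[normalized_offset(p, q_pos)] >= f_th]

rng = random.Random(0)
image = [[rng.randrange(256) for _ in range(12)] for _ in range(12)]
for dy in range(5):           # copy a 5 x 5 patch from (0, 0) to (6, 6)
    for dx in range(5):
        image[6 + dy][6 + dx] = image[dy][dx]
pairs = detect_duplicate_pairs(image, side=3, q=4, r_th=2, d_th=3, f_th=5)
```

In this toy image the nine 3 x 3 blocks inside the copied patch share the offset (6, 6), so they pass the frequency threshold, while chance neighbors in the sorted list mostly have scattered offsets. This is the role the text assigns to Dth and fth in avoiding false detection.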
D. Categorization of Satellite Image
Classifying satellite images manually from the identified blocks is a tedious
process. To perform this task, the computer utilizes numerical "signatures" for each training
class. Each pixel in the image is compared to these signatures and labeled as the class it
most closely resembles digitally. Hence, supervised classifiers require the user to decide
which classes exist in the image and then to define training areas for these classes. SVM
not only achieves the best classification performance (e.g., accuracy) on the training data,
but also leaves much room for the correct classification of future data [11].
After detecting, using the KPCA algorithm, a few duplicates whose similarity scores are
bigger than the threshold, we have positive examples, namely the identified duplicate
blocks in D, and negative examples, namely the remaining non-duplicate blocks in N.
Table 2 Some Samples of the Test High-Resolution Satellite Image Database
Algorithm:
Input: Duplicate regions D, non-duplicate regions N, original image I
Output: Updated image
Step 1: Train classifier C1 using D and N.
Step 2: Classify the non-duplicate region N to the corresponding class label C in the
database.
Step 3: Repeat Step 2 until all the non-duplicate blocks in N are inserted into the
original image I.
The duplicate blocks in D and the non-duplicate blocks in N are used to train the classifier
(SVM) in order to identify where to place each non-duplicate block in the already
stored image I, thus updating the image. In this way the satellite images are stored
efficiently in the database. The proposed scheme works as follows: the captured image
is compressed using kernel principal component analysis (KPCA) and the features are
extracted. The extracted features are clustered using the Fuzzy N-Means algorithm
so that the duplicate detection algorithm can be performed efficiently.
Duplication detection is performed by comparing the captured image with the
stored image. The duplicate and non-duplicate blocks are thus detected. The
missing part of the image stored in the database is then updated by bringing in the
non-duplicate blocks.
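The storage idea itself, writing only the non-duplicate blocks into the stored image, can be sketched as below. This simplified version compares non-overlapping blocks at the same position by exact equality. It therefore stands in for, and does not reproduce, the KPCA-based detection and the SVM-based placement described in the paper. The function name `update_stored_image` and the 4 x 4 example are hypothetical.

```python
def update_stored_image(stored, captured, side):
    """Copy into `stored` only the blocks of `captured` that differ.

    Returns the top-left corners (x, y) of the blocks that were written.
    """
    changed = []
    height, width = len(stored), len(stored[0])
    for y in range(0, height, side):
        for x in range(0, width, side):
            rows = range(y, min(y + side, height))
            cols = slice(x, min(x + side, width))
            # A block is a duplicate when every row segment matches exactly.
            if any(stored[r][cols] != captured[r][cols] for r in rows):
                for r in rows:
                    stored[r][cols] = captured[r][cols]
                changed.append((x, y))
    return changed

stored = [[0] * 4 for _ in range(4)]
captured = [row[:] for row in stored]
captured[3][3] = 9                      # a single changed pixel
changed = update_stored_image(stored, captured, side=2)
```

Only one of the four blocks is rewritten in this example, which is the saving the scheme aims for when most of a newly captured image duplicates what is already stored.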
Existing image in database | Captured image | Results after applying the algorithm | Stored image | MSE | PSNR (dB)
(image) | (image) | (image) | (image) | 0 | 33.56
Table 2 Detection Accuracy for JPEG Dataset
Intra-dataset average precision (P%) and recall (R%)
Features    JPEG P (%)    JPEG R (%)
KPCA        73.19         40.27
The KPCA-based feature obtains the best recall (40.27%) and medium
precision (73.19%) for JPEG in the compressed and noisy domain, as shown
in Table 2.
In our experiment, KPCA is performed on JPEG satellite images. It can also be
performed on BMP and SNR images. Recall varies roughly in a sigmoid fashion with
increasing JPG.
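The paper does not state how precision and recall were computed. Under the usual definitions, precision is the fraction of detected items that are truly duplicated and recall is the fraction of truly duplicated items that are detected. A minimal sketch with hypothetical pixel coordinates:

```python
def precision_recall(detected, truth):
    """Precision and recall of a detected set against a ground-truth set."""
    detected, truth = set(detected), set(truth)
    true_positives = len(detected & truth)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall

# Hypothetical ground-truth duplicated pixels and detector output.
truth = {(0, 0), (0, 1), (1, 0), (1, 1)}
detected = {(0, 0), (0, 1), (5, 5)}
p, r = precision_recall(detected, truth)
```

A detector with high precision and lower recall, as in Table 2, reports few false duplicates but misses part of the duplicated area.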
3. EXPERIMENTAL RESULTS
The experiments on satellite images address four objectives, and more
than 100 satellite JPEG images have been tested. Sample tested satellite images are given
in Table 1. The first objective is the implementation of KPCA, which reduces the
dimensionality of the original image. The image is resized to 256 x 256 before applying the
proposed duplicate detection method, and the features are extracted using KPCA. The
second is clustering the extracted features of the compressed image. The Fuzzy N-Means
clustering algorithm groups similar features. This clustered information is used to
identify the duplicate and non-duplicate blocks of the image. The third objective is
duplication detection. A different color is used to show the non-duplicate blocks of the
image. The first set of experiments uses parameters that were empirically fixed to b=64,
Q=256, Rth=50, Dth=16, =0, =1. The identified duplicate (D) and non-duplicate (N) blocks
are used to train the SVM classifier, which performs the task of inserting blocks into the
database.
In our scheme, the peak signal-to-noise ratio (PSNR) is used to evaluate the quality of
the updated image. Similarly, we use the mean square error (MSE) to measure the difference
between the updated image and the captured image. The quality of the updated image is
considered from two points of view. First, under the human visual system the
updated image is almost indistinguishable from the original image. Secondly, the PSNR
values of the updated images with respect to the original images range from 32 to 34.5 dB.
Moreover, all MSEs are equal to zero when the image is exactly updated.
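For reference, MSE and PSNR between two equally sized grayscale images can be computed as in the sketch below, assuming 8-bit pixels with a peak value of 255. The function names and the 2 x 2 example images are hypothetical.

```python
import math

def mse(image_a, image_b):
    """Mean square error between two equally sized grayscale images."""
    count = len(image_a) * len(image_a[0])
    total = sum((pa - pb) ** 2
                for row_a, row_b in zip(image_a, image_b)
                for pa, pb in zip(row_a, row_b))
    return total / count

def psnr(image_a, image_b, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    error = mse(image_a, image_b)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)

original = [[0, 0], [0, 0]]
updated = [[0, 0], [0, 10]]
error = mse(original, updated)
quality = psnr(original, updated)
```

With this definition an MSE of zero gives an unbounded PSNR, so a finite PSNR value always corresponds to a nonzero MSE between the two images being compared.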
4. CONCLUSION
This paper has presented a method for detecting duplicated regions in
satellite images. An automatic duplication detection technique has been proposed. This
technique reduces false detection and eliminates an important threshold parameter.
Although its time cost is high, the method can perform well. A clustering method
is applied next. Finally, a classification method is applied to
store the non-duplicate regions of the image in the database.
REFERENCES
[1] M. K. Bashar, K. Noda, N. Ohnishi, and K. Mori, “Exploring Duplicated Regions in
Natural Images,” IEEE Transactions on Image Processing, Vol. 1, pp. 1-40, March 2010.
[2] M. Turk and A. Pentland, “Eigenfaces for recognition,” Journal of Cognitive
Neuroscience, vol. 3, no. 1, 1991.
[3] G. Li, Q. Wu, D. Tu, and S. Sun, “A Sorted Neighborhood Approach for
Detecting Duplicated Regions in Image Forgeries based on DWT and SVD,” in
Proceedings of IEEE International Conference on Multimedia and Expo, Beijing,
China, July 2-5, 2007, pp. 1750-1753.
[4] W. Luo, J. Huang, and G. Qiu, “Robust Detection of Region Duplication Forgery in
Digital Image,” in Proceedings of the 18th International Conference on Pattern
Recognition, Vol. 4, 2006, pp. 746-749.
[5] C. Popescu and H. Farid, “Exposing Digital Forgeries by Detecting Duplicated Image
Regions,” Technical Report TR2004-515, Dartmouth College, Computer Science,
2004.
[6] Sevinc Bayram, Taha Sencar, and Nasir Memon, “An efficient and robust method
for detecting copy-move forgery,” in Proceedings of ICASSP 2009.
[7] H. Huang, W. Guo, and Y. Zhang, “Detection of Copy-Move Forgery in Digital
Images Using SIFT Algorithm,” in Proceedings of IEEE Pacific-Asia Workshop on
Computational Intelligence and Industrial Application, Vol. 2, pp. 272-276, 2008.
[8] Sang Wan Lee, Yong Soo Kim, and Zeungnam Bien, “A
Nonsupervised Learning Framework of Human Behavior Patterns Based on
Sequential Actions,” IEEE Transactions on Knowledge and Data Engineering, vol.
22, no. 4, April 2010.
[9] Z. Bien and M.-G. Chun, “A Fuzzy Petri Net Model,” Handbook of Fuzzy
Computation, C2.4, IOP Publishing Ltd., 1998.
[10] T. Tajima et al., “Development of a Marketing System for Recognizing Customer
Buying Behavior Sensor,” J. Japan Soc. for Fuzzy Theory and Intelligent
Informatics, vol. 20, no. 5, pp. 18-22, Apr. 2007.
[11] Weifeng Su, Jiying Wang, and Frederick H. Lochovsky, “Record Matching over
Query Results from Multiple Web Databases,” IEEE
Transactions on Knowledge and Data Engineering, vol. 22, no. 4, April 2010.
[12] R. Baeza-Yates and B. Ribeiro-Neto, “Modern Information Retrieval,” ACM Press,
1999.
Cyju Elizabeth Varghese received the B.E. degree in Computer Science and
Engineering from CSI Institute of Technology, Thovalai, India, in
2001 and has been working since. Currently she is pursuing an M.Tech in
Computer Science and Engineering at Karunya University,
Coimbatore. Her research interests include Web Mining and areas
related to databases.
John Blesswin received the B.Tech degree in Information Technology from Karunya
University, Coimbatore, India, in 2009. He passed the B.Tech
examination with a gold medal. He is pursuing an M.Tech in Computer
Science and Engineering at Karunya University. His research
interests include visual cryptography, visual secret sharing schemes,
image hiding, and information retrieval.
Navitha Varghese received the B.Tech degree in Computer Science and Engineering
from Model Engineering College, Ernakulam, India, in
2009. Currently she is pursuing an M.Tech in Computer Science at
Karunya University, Coimbatore. Her research interests include Web
Mining and Web technology.
Sonia Singha received the B.Tech degree in Computer Science and Engineering from
Calcutta Institute of Technology, Kolkata, India, in 2009. Currently
she is pursuing an M.Tech in Computer Science at Karunya University,
Coimbatore. Her research interests include Data Mining and Image
Processing.