Distinguishing Photographic Images and Photorealistic Computer Graphics Using Visual Vocabulary on Local Image Edges
Rong Zhang, Rand-Ding Wang, and Tian-Tsong Ng
23/4/18
Definition
Photographic images (PIM): generated from natural scenes by digital imaging devices; also called natural images.
Photorealistic computer graphics (PRCG): created with rendering software to achieve high photorealism.
Outline
Introduction
Data Sets
Image Classification Based on Image Edge Vocabulary
Experimental Results
Conclusions and Future Work
Introduction
Acquisition of PIM
Introduction
Creation of PRCG
By skilled artists or professional programmers
Using artificial models
Virtual scene
Simplified generation process due to time cost and computational complexity
Introduction
We aim to distinguish natural images from photorealistic computer graphics.
Basic idea: exploit the statistical properties of local edge patches in images.
Data Sets
We collect two data sets: a PIM data set of 1,000 images and a PRCG data set of 900 images.
Data Sets
Considerations
To explore the essential properties of natural images, we do not use images from the Internet, which may have undergone unknown post-processing and compression at various quality factors.
All images in the PIM set are high-quality JPEGs taken with 8 consumer cameras, with no post-processing outside the cameras.
Data Sets
Considerations
The PRCG set contains 800 images from the Columbia PRCG data set and 100 CG images with high visual realism from the website www.raph.com.
Image Classification Based on Image Edge Vocabulary
Image Edge Vocabulary
It is derived from the bag-of-words model in text categorization and the bag-of-visual-words model in visual categorization.
A visual word corresponds to a cluster center, and the vocabulary is the set of cluster centers.
The bag-of-visual-words idea lets us efficiently capture the significant difference between PIM and PRCG in the statistical distribution of the geometric structure of local edge patches.
Image Classification Based on Image Edge Vocabulary
Representation of 3×3 Edge Patches
Convert color to grayscale.
Each patch is regarded as a 9-tuple of real numbers (logs of gray values), i.e. a vector in $\mathbb{R}^9$.
Pipeline: detect edges, define neighborhood blocks around the edge points, then randomly select 1000 3×3 local patches.
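The patch extraction steps can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the slides do not name the edge detector, so a simple central-difference gradient threshold stands in for it; the luma weights, the `log(1 + gray)` offset (to avoid `log(0)`), the threshold value, and all function names are hypothetical choices, not the authors' settings.

```python
import math
import random

def to_log_gray(rgb_image):
    """Convert rows of (r, g, b) pixels to log gray values.

    Assumption: BT.601 luma weights and log(1 + gray) to avoid log(0).
    """
    return [[math.log(1.0 + 0.299 * r + 0.587 * g + 0.114 * b)
             for (r, g, b) in row]
            for row in rgb_image]

def edge_points(gray, threshold):
    """Interior pixels whose gradient magnitude exceeds the threshold.

    Assumption: a central-difference gradient stands in for the
    (unspecified) edge detector used in the slides.
    """
    height, width = len(gray), len(gray[0])
    points = []
    for i in range(1, height - 1):
        for j in range(1, width - 1):
            gx = gray[i][j + 1] - gray[i][j - 1]
            gy = gray[i + 1][j] - gray[i - 1][j]
            if math.hypot(gx, gy) > threshold:
                points.append((i, j))
    return points

def sample_edge_patches(gray, threshold=0.5, n_patches=1000, seed=0):
    """Randomly select up to n_patches 3x3 patches centered on edge points.

    Each patch is returned as a 9-element list in row-major order.
    """
    points = edge_points(gray, threshold)
    if len(points) > n_patches:
        points = random.Random(seed).sample(points, n_patches)
    return [[gray[i + di][j + dj] for di in (-1, 0, 1) for dj in (-1, 0, 1)]
            for (i, j) in points]
```

Restricting edge points to interior pixels guarantees that every 3×3 neighborhood lies fully inside the image.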
Image Classification Based on Image Edge Vocabulary
Data preprocessing
Define the contrast $\|x\|_D$ (D-norm):
\[ \|x\|_D = \sqrt{\sum_{i \sim j} (x_i - x_j)^2} \]
where $i \sim j$ denotes 4-connected neighbors.
Image Classification Based on Image Edge Vocabulary
Data preprocessing
Subtracting the mean and normalizing the contrast lead to a new vector:
\[ y = \frac{\tilde{x}}{\|\tilde{x}\|_D}, \quad \text{where} \quad \tilde{x} = x - \frac{1}{9}\sum_{i=1}^{9} x_i \]
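A minimal sketch of the D-norm and the contrast normalization, assuming a patch is stored as a 9-element list in row-major order and that each unordered pair of 4-connected neighbors is counted once in the sum. The names `NEIGHBOR_PAIRS`, `d_norm`, and `contrast_normalize` are illustrative, not from the slides.

```python
import math

# 4-connected neighbor pairs (i ~ j) on a 3x3 grid, pixels indexed 0..8
# in row-major order: 6 horizontal pairs followed by 6 vertical pairs.
NEIGHBOR_PAIRS = (
    [(r * 3 + c, r * 3 + c + 1) for r in range(3) for c in range(2)]
    + [(r * 3 + c, (r + 1) * 3 + c) for r in range(2) for c in range(3)]
)

def d_norm(x):
    """Contrast ||x||_D: root of summed squared neighbor differences."""
    return math.sqrt(sum((x[i] - x[j]) ** 2 for i, j in NEIGHBOR_PAIRS))

def contrast_normalize(x):
    """Subtract the patch mean, then divide by the D-norm of the result."""
    mean = sum(x) / 9.0
    centered = [xi - mean for xi in x]
    norm = d_norm(centered)
    if norm == 0.0:
        raise ValueError("flat patch has zero contrast and cannot be normalized")
    return [c / norm for c in centered]
```

Flat patches have zero contrast and are rejected here; patches taken around edge points should rarely be flat.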
Image Classification Based on Image Edge Vocabulary
Data preprocessing
Change basis using the 2-dimensional Discrete Cosine Transform (DCT) basis for 3×3 image patches:
\[ v = A y, \quad \text{where} \quad A = \operatorname{diag}\!\left(\frac{1}{\|e_1\|^2}, \frac{1}{\|e_2\|^2}, \dots, \frac{1}{\|e_8\|^2}\right) B^{T}, \quad B = [e_1, \dots, e_8] \]
and $e_1, \dots, e_8$ are the contrast-normalized non-constant DCT basis vectors.
Image Classification Based on Image Edge Vocabulary
Data preprocessing
Because $y$ has zero mean and unit contrast, $v$ lies on the 7-sphere in 8-D Euclidean space:
\[ S^7 = \{ v \in \mathbb{R}^8 : \|v\| = 1 \} \]
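The change of basis can be checked numerically. The sketch below assumes the standard 3×3 DCT-II basis with the constant patch dropped, each basis patch scaled to unit D-norm, and coefficients computed as $v_i = \langle e_i, y \rangle / \|e_i\|^2$. Function names are illustrative. Under these assumptions the resulting 8-D vector should have unit Euclidean length.

```python
import math

# 4-connected neighbor pairs on a 3x3 grid (row-major indices 0..8).
PAIRS = (
    [(r * 3 + c, r * 3 + c + 1) for r in range(3) for c in range(2)]
    + [(r * 3 + c, (r + 1) * 3 + c) for r in range(2) for c in range(3)]
)

def d_norm(x):
    """Contrast (D-norm) of a 9-element patch."""
    return math.sqrt(sum((x[i] - x[j]) ** 2 for i, j in PAIRS))

def contrast_normalize(x):
    """Zero-mean, unit-D-norm version of patch x (x must not be flat)."""
    mean = sum(x) / 9.0
    centered = [xi - mean for xi in x]
    norm = d_norm(centered)
    return [c / norm for c in centered]

def dct_basis():
    """Eight non-constant 3x3 DCT-II basis patches, each with unit D-norm."""
    basis = []
    for k in range(3):
        for l in range(3):
            if k == 0 and l == 0:
                continue  # skip the constant (DC) patch
            e = [math.cos(math.pi * (2 * m + 1) * k / 6)
                 * math.cos(math.pi * (2 * n + 1) * l / 6)
                 for m in range(3) for n in range(3)]
            scale = d_norm(e)
            basis.append([ei / scale for ei in e])
    return basis

def to_sphere(y, basis):
    """Coefficients v_i = <e_i, y> / ||e_i||^2 of y in the DCT basis."""
    return [sum(a * b for a, b in zip(e, y)) / sum(a * a for a in e)
            for e in basis]

# Usage: map an arbitrary non-flat patch to the 7-sphere.
patch = [0.1, 0.5, 0.9, 0.2, 0.4, 1.3, 0.0, 0.7, 0.8]
v = to_sphere(contrast_normalize(patch), dct_basis())
squared_length = sum(c * c for c in v)  # expected to be 1 up to rounding
```

The unit length follows because the DCT patches are mutually orthogonal and diagonalize the quadratic form behind the D-norm, so the D-norm of $y$ equals the Euclidean norm of its coefficient vector.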
Image Classification Based on Image Edge Vocabulary
Data preprocessing
Calculate the angular distance between two points:
\[ d(x_1, x_2) = 1 - \cos\langle x_1, x_2 \rangle = 1 - \frac{x_1 \cdot x_2}{\|x_1\| \, \|x_2\|} \]
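As a small illustration, the distance above can be written directly in plain Python. The function name is hypothetical, and the inputs are assumed to be non-zero vectors of equal length.

```python
import math

def angular_distance(x1, x2):
    """d(x1, x2) = 1 - cos(angle between x1 and x2), for non-zero vectors."""
    dot = sum(a * b for a, b in zip(x1, x2))
    norm1 = math.sqrt(sum(a * a for a in x1))
    norm2 = math.sqrt(sum(b * b for b in x2))
    return 1.0 - dot / (norm1 * norm2)
```

The value ranges from 0 for vectors pointing the same way, through 1 for orthogonal vectors, to 2 for opposite vectors.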
Image Classification Based on Image Edge Vocabulary
Construction of Visual Vocabulary
17,520 Voronoi cells of roughly equal size that efficiently cover the 7-sphere are selected, giving the sampling point set:
\[ O = \{o_1, o_2, \dots, o_N\}, \quad N = 17520 \]
where $o_i$ is the sampling point of the $i$-th cell and is an 8-D vector.
Image Classification Based on Image Edge Vocabulary
Construction of Visual Vocabulary
The problem of observing the distribution of data points on the 7-sphere is now converted into computing the probability that data points fall into each Voronoi cell.
We can use histograms with 17,520 bins to describe the probability distributions of the geometric structure of edge patches for PIM and for PRCG, respectively.
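The histogram over Voronoi cells can be sketched as a nearest-sampling-point assignment. The example below is a toy illustration: it uses four hypothetical sampling points on the unit circle rather than the 17,520 points on the 7-sphere, whose coordinates are not given in the slides. It relies on the fact that, for unit-length sampling points, the smallest value of 1 − cos corresponds to the largest dot product.

```python
def nearest_cell(v, sampling_points):
    """Index of the sampling point closest to v in angular distance.

    Assumes unit-length sampling points, so the nearest point is the one
    with the largest dot product with v.
    """
    best_index, best_dot = 0, float("-inf")
    for index, o in enumerate(sampling_points):
        dot = sum(a * b for a, b in zip(v, o))
        if dot > best_dot:
            best_index, best_dot = index, dot
    return best_index

def cell_histogram(vectors, sampling_points):
    """Fraction of vectors falling into each Voronoi cell."""
    counts = [0] * len(sampling_points)
    for v in vectors:
        counts[nearest_cell(v, sampling_points)] += 1
    total = len(vectors)
    return [count / total for count in counts]

# Toy usage with four hypothetical sampling points on the unit circle.
toy_points = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
toy_vectors = [(1.0, 0.0), (0.8, 0.6), (0.0, 1.0), (-0.6, -0.8)]
toy_hist = cell_histogram(toy_vectors, toy_points)
```

A brute-force scan is used here for clarity; with 17,520 cells and many patches a spatial index or vectorized computation would likely be preferable.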
Image Classification Based on Image Edge Vocabulary
Construction of Visual Vocabulary
Figure: Probabilities of edge patches in Voronoi cells, sorted by decreasing probability: (left) 200K edge patches from 200 PIM images; (right) 98,599 edge patches from 100 CG images that we collected. Horizontal axis: Voronoi cells (sorted); vertical axis: Probability.
Image Classification Based on Image Edge Vocabulary
Construction of Visual Vocabulary
We find that a small subset of the Voronoi cells captures the majority of the patches for both PIM and PRCG.
We treat the sampling points of the Voronoi cells with higher probability as key sampling points.
Image Classification Based on Image Edge Vocabulary
Construction of Visual Vocabulary
We construct the image edge vocabulary from these key sampling points.
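One possible way to turn cell probabilities into a vocabulary is sketched below. The slides do not say how the PIM and PRCG histograms are combined or how patches outside the vocabulary are handled, so two assumptions are made here: a cell is kept if its probability reaches the threshold in either histogram, and patches falling outside the vocabulary are ignored when an image is represented. All names are hypothetical.

```python
def key_cells(pim_hist, prcg_hist, threshold):
    """Indices of cells whose probability reaches the threshold in either set.

    Assumption: the union over PIM and PRCG is used; the slides only say
    that cells with larger probability give the key sampling points.
    """
    return [index for index, (p, q) in enumerate(zip(pim_hist, prcg_hist))
            if p >= threshold or q >= threshold]

def word_histogram(patch_cells, vocabulary):
    """Normalized counts of an image's patches over the vocabulary cells.

    patch_cells holds the Voronoi cell index of each patch in the image.
    Assumption: patches whose cell is not in the vocabulary are ignored.
    """
    position = {cell: k for k, cell in enumerate(vocabulary)}
    counts = [0] * len(vocabulary)
    for cell in patch_cells:
        if cell in position:
            counts[position[cell]] += 1
    kept = sum(counts)
    return [count / kept for count in counts] if kept else [0.0] * len(counts)

# Toy usage with four cells and hypothetical probabilities.
vocabulary = key_cells([0.5, 0.3, 0.1, 0.1], [0.1, 0.1, 0.6, 0.2], threshold=0.3)
features = word_histogram([0, 0, 2, 3, 1], vocabulary)
```

Raising the threshold shrinks the vocabulary, which lowers the feature dimension at some cost in coverage of the patches.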
Image Classification Based on Image Edge Vocabulary
Image Classification
Experimental Results
We evaluate the proposed method on the remaining 800 PIM and 800 PRCG images from our data sets, i.e. those not used to build the vocabulary.
These 1,600 images are evaluated with 10-fold cross-validation.
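The 10-fold split can be illustrated with the standard library alone. This is a generic sketch, not the authors' protocol: the shuffling seed is arbitrary, the folds are not stratified by class, and the function name is hypothetical.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Indices are shuffled once, dealt into k folds, and each fold serves
    as the test set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test

# Usage: 1,600 images give ten folds of 160 test and 1,440 training images.
splits = list(k_fold_indices(1600, k=10))
```

The reported accuracy would then be the average over the ten test folds.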
Experimental Results
Image Classification
We determine the vocabulary size according to different probability thresholds.
Table: Classification accuracy for different vocabulary sizes
As a performance-cost tradeoff
On the same datasets, the results of Farid’s method
Experimental Results
Generalization Capability
To verify the generalization capability, we remove all 50 images taken by the Samsung camera from the PIM samples and train the classifier on the remaining 750 PIM images and the 800 Columbia PRCG images.
On the 50 held-out Samsung images, the proposed method misclassifies only one image (49/50, a classification accuracy of 98%), while Farid’s method fails to classify any of them correctly.
This result may indicate that the visual vocabulary based on local edge patches better characterizes the general properties of a particular image source.
More experiments are needed to confirm this.
Experimental Results
Compression Attack
The 1,600 images are each compressed with JPEG quality factors 90, 70, and 40.
Table: Comparison of the proposed approach and Farid’s method [5] on datasets with different JPEG quality factors.
Conclusions and Future Work
Conclusions
We have proposed a new approach to PIM and PRCG classification based on the bag-of-visual-words idea.
By projecting the image patch data onto the 7-sphere through a series of transforms, we observe the distribution of data points over the individual Voronoi cells.
The visual vocabulary is then constructed by determining the key sampling points of the Voronoi cells.
A given image is then represented as a histogram of visual words.
Finally, we classify the histograms with an SVM.
Conclusions and Future Work
Conclusions
Our experimental results demonstrate the discriminative power of the features.
They suggest that the intrinsic difference between PIM and PRCG can be captured by the geometric structure of local edge patches.
Our conclusions are significant for digital image forensics as well as for evaluating the photorealism of computer graphics.
Conclusions and Future Work
Future work
To improve the proposed method, we are considering a closer analogy to document retrieval.
We would like to interpret the visual vocabulary set.
We have made initial attempts to evaluate the generalization of the proposed approach and its resistance to compression.
More experiments are under way on a wider range of images.