Image Forgery Localization via Fine-Grained Analysis of CFA Artifacts
P. Ferrara1, T. Bianchi2, A. De Rosa3, A. Piva1
1 Dipartimento di Ingegneria dell’Informazione – Università di Firenze
2 Dipartimento di Ingegneria Elettronica e delle Telecomunicazioni – Politecnico di Torino
3 Consorzio Nazionale InterUniversitario delle Telecomunicazioni
IEEE Transactions on Information Forensics and Security, vol. 7, no. 5, pp. 1566–1577, Oct. 2012
Overview
Introduction
• Image Forensics
• Color Filter Array and Demosaicing
Proposed Method
• Local analysis of the prediction error
• Gaussian Mixture Model and Expectation-Maximization Algorithm
• Bayesian approach for localization
Experimental evaluation
• Model Validation
• ROC curves
Conclusions
Fake Images
The spread of digital cameras and photo-editing software allows anyone to create photomontages without leaving visible traces
The authenticity of digital images must be ensured in many areas:
• Crime investigation
• Video surveillance
• Journalism
• Insurance
• Science
• Health
• And so on…
Image Forensics
A branch of Digital Forensics whose aims are:
• Image authentication
• Source identification
Both rely on digital fingerprints left by:
• In-camera processing (CFA, demosaicing, white balancing, etc.)
• Out-of-camera processing (compression, tampering, enhancement, etc.)
Colour Filter Array and Demosaicing
Bayer CFA
Demosaicing (interpolation) introduces a correlation between pixels
These artifacts are periodic because of the spatial distribution of the Color Filter Array
Prediction Error
$$e(x,y) = I(x,y) - \sum_{(u,v) \neq (0,0)} \alpha(u,v)\, I(x+u, y+v)$$
The sum is the value interpolated from a neighborhood of pixels
e(x,y) is a high-pass random signal, usually modeled by a Laplacian or Gaussian distribution
e(x,y) is a cyclostationary random process → its distribution parameters change periodically
Prediction Error
Working on the green channel
Figure: Bayer pattern and the quincunx of acquired and interpolated green pixels
𝑨 is the set of acquired pixels
𝑰 is the set of interpolated pixels
Prediction Error
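$$e(x,y) = I(x,y) - \sum_{(u,v) \neq (0,0)} \alpha(u,v)\, I(x+u, y+v)$$
Assumption: the interpolation kernel $\alpha(u,v)$ is known
If $(x,y) \in A$ → $e(x,y) \sim N(0, \sigma_{lA})$
If $(x,y) \in I$ → $e(x,y) = 0$ ideally
In practice, for $(x,y) \in I$: $e(x,y) \sim N(0, \sigma_{lI})$, where $\sigma_{lA} > \sigma_{lI}$
Periodic structure → cyclostationarity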
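The prediction error on this slide can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes a bilinear kernel for α(u,v), a green channel whose acquired pixels lie where x + y is even, and reflect padding at the image borders. The names `predict`, `prediction_error` and `bilinear_green_demosaic` are hypothetical.

```python
import numpy as np

# Assumed interpolation kernel alpha(u, v): bilinear prediction of a green
# pixel from its four horizontal and vertical neighbours.
BILINEAR_GREEN = np.array([[0.00, 0.25, 0.00],
                           [0.25, 0.00, 0.25],
                           [0.00, 0.25, 0.00]])

def predict(green, kernel=BILINEAR_GREEN):
    """Return the sum over (u, v) != (0, 0) of alpha(u, v) * I(x + u, y + v)."""
    green = np.asarray(green, dtype=float)
    r = kernel.shape[0] // 2
    padded = np.pad(green, r, mode="reflect")  # border handling is an assumption
    h, w = green.shape
    out = np.zeros((h, w))
    for u in range(-r, r + 1):
        for v in range(-r, r + 1):
            out += kernel[u + r, v + r] * padded[r + u:r + u + h, r + v:r + v + w]
    return out

def prediction_error(green, kernel=BILINEAR_GREEN):
    """e(x, y) = I(x, y) minus the value predicted from the neighbourhood."""
    return np.asarray(green, dtype=float) - predict(green, kernel)

def bilinear_green_demosaic(raw, acquired):
    """Toy demosaicing: keep acquired samples, interpolate the others bilinearly."""
    sampled = np.where(acquired, raw, 0.0)
    return np.where(acquired, raw, predict(sampled))

rng = np.random.default_rng(0)
x, y = np.indices((16, 16))
acquired = (x + y) % 2 == 0            # set A (quincunx); ~acquired is set I
green = bilinear_green_demosaic(rng.uniform(0, 255, (16, 16)), acquired)
err = prediction_error(green)
print("variance of e on A:", err[acquired].var())
print("max |e| on I:", np.abs(err[~acquired]).max())
```

Because the predictor matches the toy demosaicing exactly, the error on the interpolated pixels vanishes up to rounding, which is the ideal case stated on the slide. With a real in-camera algorithm the match is only approximate.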
State of the Art
• Expectation-Maximization algorithm by A. C. Popescu and H. Farid [1]
• M. Kirchner [2]
• Dirik and Memon [3]
• Gallagher and Chen [4]
Idea: absence of periodic correlation → forgery
Limitations
Limitations of previous works
• They work on the whole image or on a large sub-region of the image
• A fine localization of the tampering is not possible
• Performance is content dependent
Innovative approach
• Fine-grained localization of artifacts
• Content-independent approach
Local variance of prediction error
To estimate $\sigma$ for both classes, we employ a local weighted estimate of the variance, which limits the dependence on the image content.
The weights have a Gaussian shape
$\sigma_{lA}(x,y)$ is estimated from A pixels only
$\sigma_{lI}(x,y)$ is estimated from I pixels only
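A local weighted variance restricted to one pixel class could look like the sketch below. It is an illustration under stated assumptions, not the authors' code. The 7 × 7 window, the unit standard deviation of the Gaussian weights, the reflect padding and the function names are all assumed here.

```python
import numpy as np

def gaussian_window(size=7, sigma=1.0):
    """Unnormalised Gaussian weights on a size x size grid (assumed shape)."""
    ax = np.arange(size) - size // 2
    return np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2.0 * sigma ** 2))

def weighted_sum(img, window):
    """Correlate img with window, using reflect padding, same-size output."""
    r = window.shape[0] // 2
    padded = np.pad(img, r, mode="reflect")
    h, w = img.shape
    out = np.zeros((h, w))
    for i in range(window.shape[0]):
        for j in range(window.shape[1]):
            out += window[i, j] * padded[i:i + h, j:j + w]
    return out

def local_variance_by_class(err, mask, size=7, sigma=1.0):
    """Weighted local variance of err computed from pixels where mask is True."""
    win = gaussian_window(size, sigma)
    m = np.asarray(mask, dtype=float)
    norm = weighted_sum(m, win)                      # total weight of the class
    mean = weighted_sum(err * m, win) / norm         # weighted local mean
    second = weighted_sum(err ** 2 * m, win) / norm  # weighted second moment
    return np.maximum(second - mean ** 2, 0.0)

rng = np.random.default_rng(1)
x, y = np.indices((32, 32))
A = (x + y) % 2 == 0
# Synthetic prediction error: larger spread on acquired pixels than on interpolated ones.
err = np.where(A, rng.normal(0, 3, (32, 32)), rng.normal(0, 1, (32, 32)))
var_A = local_variance_by_class(err, A)
var_I = local_variance_by_class(err, ~A)
print("mean local variance, A vs I:", var_A.mean(), var_I.mean())
```

Masking before the weighted sums is what keeps the two classes separate: each estimate uses only samples of its own class within the window.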
L-statistics
We need a metric to distinguish locally between the presence and the absence of periodic patterns
We work on sub-blocks of the image. The dimension B of the blocks is a multiple of 2 × 2 pixels, the size of the Bayer filter
$$L(k,l) = \log \frac{GM\{\sigma^2_{lA}(k,l)\}}{GM\{\sigma^2_{lI}(k,l)\}}$$
where GM is the geometric mean, $\sigma^2_{lA}(k,l)$ is the local variance of A pixels in the (k,l)-th block and $\sigma^2_{lI}(k,l)$ is the local variance of I pixels in the (k,l)-th block
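The blockwise L-statistic can be sketched as below, using the fact that the logarithm of a geometric mean is the arithmetic mean of the logarithms. This is a hedged illustration. The function name `l_statistic`, the small constant `eps` and the convention that `var_map` holds the A-class variance at acquired positions and the I-class variance elsewhere are assumptions.

```python
import numpy as np

def l_statistic(var_map, acquired, block=8, eps=1e-12):
    """L(k, l) = log(GM of A variances / GM of I variances) per block."""
    var_map = np.asarray(var_map, dtype=float)
    h, w = var_map.shape
    if block % 2 or h % block or w % block:
        raise ValueError("block must be even and divide both image dimensions")
    log_var = np.log(var_map + eps)          # eps avoids log(0) on flat regions
    a = np.asarray(acquired, dtype=float)

    def block_mean(mask):
        # Mean of log-variances over the masked pixels of each block.
        shape = (h // block, block, w // block, block)
        total = (log_var * mask).reshape(shape).sum(axis=(1, 3))
        count = mask.reshape(shape).sum(axis=(1, 3))
        return total / count

    return block_mean(a) - block_mean(1.0 - a)

x, y = np.indices((32, 32))
A = (x + y) % 2 == 0
L_cfa = l_statistic(np.where(A, 9.0, 1.0), A)      # A variance larger than I variance
L_flat = l_statistic(np.full((32, 32), 2.0), A)    # no difference between classes
print(L_cfa[0, 0], L_flat[0, 0])
```

A positive value indicates that acquired pixels have the larger local variance, as expected when demosaicing traces are present. Values near zero indicate no difference between the two classes.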
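Rationale
In presence of CFA (M1 Hypothesis)
Assumption: working on a 2 × 2 block of the green channel of a Bayer pattern
Figure: 2 × 2 Bayer block with two $\sigma^2_{lA}$ positions and two $\sigma^2_{lI}$ positions
$$L(k,l) = \log\{\sigma^2_{lA}(k,l+1)\} + \log\{\sigma^2_{lA}(k+1,l)\} - [\log\{\sigma^2_{lI}(k,l)\} + \log\{\sigma^2_{lI}(k+1,l+1)\}]$$
(up to the constant factor 1/2 coming from the geometric mean)
$$E[\sigma^2_{lA}] > E[\sigma^2_{lI}]$$
By the Central Limit Theorem:
$$\Pr\{L(k,l) \mid M_1\} \sim N(\mu_1, \sigma_1^2), \quad \text{with } \mu_1 > 0$$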
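Rationale
In absence of CFA (M2 Hypothesis)
Assumption: working on a 2 × 2 block of the green channel of a Bayer pattern
$$L(k,l) = \log\{\sigma^2_{lA}(k,l+1)\} + \log\{\sigma^2_{lA}(k+1,l)\} - [\log\{\sigma^2_{lI}(k,l)\} + \log\{\sigma^2_{lI}(k+1,l+1)\}]$$
(up to the constant factor 1/2 coming from the geometric mean)
$$E[\sigma^2_{lA}] = E[\sigma^2_{lI}]$$
By the Central Limit Theorem:
$$\Pr\{L(k,l) \mid M_2\} \sim N(0, \sigma_2^2)$$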
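The step from the geometric means to a Gaussian model can be written out as follows. This is a sketch of the reasoning rather than a derivation taken from the slides. The symbols $A_{k,l}$, $I_{k,l}$ (acquired and interpolated pixels of block (k,l)) and $N_A$, $N_I$ (their counts) are notation introduced here. The zero-mean conclusion additionally assumes that, without CFA traces, the local variances of the two classes are identically distributed.

```latex
\begin{align*}
L(k,l) &= \log \frac{GM\{\sigma^2_{lA}(k,l)\}}{GM\{\sigma^2_{lI}(k,l)\}}
        = \frac{1}{N_A} \sum_{(i,j) \in A_{k,l}} \log \sigma^2_{lA}(i,j)
        - \frac{1}{N_I} \sum_{(i,j) \in I_{k,l}} \log \sigma^2_{lI}(i,j) \\
E[L(k,l) \mid M_2] &= E[\log \sigma^2_{lA}] - E[\log \sigma^2_{lI}] = 0
\end{align*}
```

Since L(k,l) is a difference of averages over many pixels, the Central Limit Theorem motivates modeling it as Gaussian under both hypotheses.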
Gaussian Mixture Model
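In the presence of a forged region, L(k,l) is distributed as a mixture of Gaussians:
$$p(x \mid \theta) = \pi\, N(\mu_1, \sigma_1^2) + (1 - \pi)\, N(0, \sigma_2^2)$$
where $\pi$ is the mixing weight of the M1 component and $\theta$ collects the model parameters
To estimate the parameters of the model, we employ the Expectation-Maximization Algorithm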
Expectation-Maximization Algorithm
Introduced by Dempster, Laird and Rubin in 1977 [5], it is an iterative algorithm employed in estimation problems such as fitting Gaussian Mixture models.
It estimates the mean and the variance of the component distributions by maximizing the expected value of the complete log-likelihood function with respect to the distribution parameters.
In our case, the mean $\mu_2$ is fixed to 0 when estimating the model
Then, we need to estimate $\mu_1$, $\sigma_1$ and $\sigma_2$
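A two-component EM fit with the second mean fixed to zero might be sketched as below. This is a generic textbook EM under stated assumptions, not the authors' implementation. The initial guesses, the fixed iteration count, the numerical floors and the estimation of a mixing weight alongside the three parameters are choices made for this example. The name `fit_two_gaussians` is hypothetical.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def fit_two_gaussians(values, n_iter=200):
    """EM for weight*N(mu1, sigma1^2) + (1 - weight)*N(0, sigma2^2)."""
    x = np.asarray(values, dtype=float).ravel()
    # Assumed initialisation: mu1 starts in the upper part of the data.
    weight, mu1 = 0.5, np.percentile(x, 75)
    sigma1 = sigma2 = x.std() + 1e-6
    for _ in range(n_iter):
        # E-step: responsibility of the M1 component for each sample.
        p1 = weight * normal_pdf(x, mu1, sigma1)
        p2 = (1.0 - weight) * normal_pdf(x, 0.0, sigma2)
        resp = p1 / (p1 + p2 + 1e-300)
        # M-step: weighted moments; the second mean stays fixed at 0.
        n1 = resp.sum() + 1e-12
        n2 = (1.0 - resp).sum() + 1e-12
        weight = min(max(resp.mean(), 1e-6), 1.0 - 1e-6)
        mu1 = (resp * x).sum() / n1
        sigma1 = max(np.sqrt((resp * (x - mu1) ** 2).sum() / n1), 1e-6)
        sigma2 = max(np.sqrt(((1.0 - resp) * x ** 2).sum() / n2), 1e-6)
    return weight, mu1, sigma1, sigma2

rng = np.random.default_rng(0)
# Synthetic L-values: 3000 blocks with CFA traces, 1000 blocks without.
samples = np.concatenate([rng.normal(4.0, 1.0, 3000), rng.normal(0.0, 0.5, 1000)])
weight, mu1, sigma1, sigma2 = fit_two_gaussians(samples)
print("weight, mu1, sigma1, sigma2:", weight, mu1, sigma1, sigma2)
```

Fixing the second mean at zero removes one free parameter and ties that component to the hypothesis of absent CFA traces.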
Bayesian Approach to Localization
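From Bayes’ Theorem, by assuming that Pr{M1} = Pr{M2} = 1/2 (maximum uncertainty):
$$\Pr\{M_1 \mid L(k,l)\} = \frac{\Pr\{L(k,l) \mid M_1\}}{\Pr\{L(k,l) \mid M_1\} + \Pr\{L(k,l) \mid M_2\}}$$
which simplifies to
$$\Pr\{M_1 \mid L(k,l)\} = \frac{1}{1 + \mathcal{L}[L(k,l)]}$$
where
$$\mathcal{L}[L(k,l)] = \frac{\Pr\{L(k,l) \mid M_2\}}{\Pr\{L(k,l) \mid M_1\}}$$
is a Likelihood Map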
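The posterior and the likelihood map can be computed from the Gaussian parameters as in the sketch below. It assumes the two conditional densities are the Gaussians N(μ1, σ1²) and N(0, σ2²) of the mixture model. The function name `cfa_posterior` and the clipping of the log-ratio (to avoid overflow) are choices made for this example.

```python
import numpy as np

def cfa_posterior(L, mu1, sigma1, sigma2):
    """Return (Pr{M1 | L}, likelihood map Pr{L | M2} / Pr{L | M1})."""
    L = np.asarray(L, dtype=float)
    # Log-densities; the common factor 1/sqrt(2*pi) cancels in the ratio.
    log_p1 = -0.5 * ((L - mu1) / sigma1) ** 2 - np.log(sigma1)
    log_p2 = -0.5 * (L / sigma2) ** 2 - np.log(sigma2)
    ratio = np.exp(np.clip(log_p2 - log_p1, -700.0, 700.0))
    return 1.0 / (1.0 + ratio), ratio

post, ratio = cfa_posterior(np.array([1.0, 5.0, -3.0]), mu1=2.0, sigma1=1.0, sigma2=1.0)
print(post)
```

With equal priors, a likelihood-map value of 1 corresponds to a posterior of 1/2. Large values of the map indicate blocks where demosaicing traces are likely absent, that is, candidate tampered blocks.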
Denoising
These maps are usually noisy
Denoising by:
• Increasing the dimension B of the L-statistics
• Cumulating C blocks of L-statistics, by assuming the L-values to be i.i.d. random variables (their likelihoods are multiplied)
• Filtering: mean filter or median filter
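Two of the denoising options can be sketched as follows. This is an illustration under assumptions: cumulation is done here over non-overlapping c × c groups of neighbouring blocks, the median filter uses a 3 × 3 window with edge padding, and both function names are hypothetical.

```python
import numpy as np

def cumulate_likelihood(ratio, c=2):
    """Multiply likelihood ratios over non-overlapping c x c groups of blocks."""
    ratio = np.asarray(ratio, dtype=float)
    h, w = ratio.shape
    h2, w2 = h - h % c, w - w % c                 # drop incomplete groups
    log_r = np.log(ratio[:h2, :w2])
    # A product of i.i.d. likelihoods is a sum of log-likelihoods.
    summed = log_r.reshape(h2 // c, c, w2 // c, c).sum(axis=(1, 3))
    return np.exp(summed)

def median_filter(values, size=3):
    """Sliding median with edge padding, output has the input shape."""
    r = size // 2
    padded = np.pad(np.asarray(values, dtype=float), r, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1))

coarse = cumulate_likelihood(np.full((4, 4), 2.0), c=2)
spike = np.zeros((5, 5))
spike[2, 2] = 1.0                                  # isolated noisy block
smoothed = median_filter(spike)
print(coarse)
print(smoothed.max())
```

Cumulation trades spatial resolution for reliability, whereas the median filter keeps the map size and suppresses isolated outliers.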
Experimental Results
Dataset
Model validation: HOW GOOD IS THE MODEL WE ASSUMED?
Performance evaluation: HOW DOES THE SYSTEM WE PROPOSED WORK?
Dataset
• 400 images from different cameras: Canon EOS 450D, Nikon D50, Nikon D90, Nikon D7000
• Each camera equipped with a known CFA Bayer pattern
• In-camera demosaicing algorithm is unknown
• Each image is cropped to 512x512, maintaining the original Bayer pattern
• 4 different predictors are employed:
– Bilinear
– Bicubic
– Median
– Gradient Based
• Images without CFA artifacts are obtained by resizing to 200%, applying a 7x7 median filter and subsampling by 50%
Model Validation
• Generalized Gaussian distribution (GGD)
$$P(L) = \frac{1}{Z}\, e^{-\left(\frac{|L - \mu|}{\eta}\right)^{\nu}}$$
where $\nu$ is the shape parameter
Median values of the shape parameter, estimated for each image by using [6], with an 8x8 dimension of the L-statistics
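The role of the shape parameter can be checked numerically with the sketch below. It uses the standard normalization constant Z = 2ηΓ(1/ν)/ν, which is not written out on the slide. The function name `ggd_pdf` is hypothetical.

```python
import math

def ggd_pdf(L, mu=0.0, eta=1.0, nu=2.0):
    """Generalized Gaussian density with location mu, scale eta and shape nu."""
    z = 2.0 * eta * math.gamma(1.0 / nu) / nu     # normalization constant Z
    return math.exp(-((abs(L - mu) / eta) ** nu)) / z

sigma = 1.5
# nu = 2 with eta = sqrt(2) * sigma gives a Gaussian with standard deviation sigma.
gauss_peak = ggd_pdf(0.0, 0.0, math.sqrt(2.0) * sigma, 2.0)
# nu = 1 gives a Laplacian with scale eta.
laplace_peak = ggd_pdf(0.0, 0.0, 2.0, 1.0)
# Riemann-sum check that the density integrates to about 1 for another shape.
area = sum(ggd_pdf(-20.0 + 0.001 * i, 0.0, 1.0, 1.5) for i in range(40001)) * 0.001
print(gauss_peak, laplace_peak, area)
```

An estimated shape parameter close to 2 therefore supports the Gaussian model assumed for L.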
Model Validation
Histograms of the mean values of L under both hypotheses, when the predictor matches the interpolation algorithm
Panels: Bilinear, Bicubic, Median, Gradient-based
Model Validation
Figure grid indexed by predictor (Bilinear, Bicubic, Median, Gradient-based) and camera (Canon EOS 450D, Nikon D50, Nikon D7000, Nikon D90)
Performance Evaluation
Image Quality:
• Ideal case: uncompressed images with matching between interpolation and prediction algorithms
• Uncompressed images with no matching between interpolation and prediction algorithms
• Compressed images and no matching: JPEG with quality factor 100%, 95%, 90% and 85%
Tampering: a centered square region of different dimensions (32x32, 64x64 and 128x128), obtained by removing the CFA artifacts through resizing and filtering.
In general, low-pass processing decreases the performance of the localization
Comparison with the State of the Art
• DM, method proposed in [3]: based on the MSE between interpolated and acquired pixels
• GC – B, method proposed in [4]: based on a blockwise DFT of the variance of the prediction error
• GC – L, method proposed in [4]: based on a sliding DFT of the variance of the prediction error
Resolution is fixed to 8x8 pixels for each block
Conclusions
A forgery localization algorithm has been developed by using the periodic correlations introduced by demosaicing
The system is based on a fine-grained local analysis of the variance of the prediction error, on a Gaussian Mixture Model and on a Bayesian approach for localizing forgeries
The system performs accurately at high JPEG quality factors
However, it should be remarked that the detection performance is strongly affected by JPEG compression
Due to the presence of uniform or very sharp regions, automatic detection may give a considerable false positive rate.
Therefore, in order to limit the incidence of false positives, human interpretation of the forgery maps is still required.
References
1. A. C. Popescu and H. Farid, “Exposing digital forgeries in color filter array interpolated images,” IEEE Trans. Signal Proc., pp. 3948–3959, Oct. 2005.
2. M. Kirchner, “Fast and reliable resampling detection by spectral analysis of fixed linear prediction residue,” in 10th ACM Multimedia and Security Workshop (MM&Sec ’08), 2008, pp. 11–20.
3. A. E. Dirik and N. Memon, “Image tamper detection based on demosaicing artifacts,” in 16th IEEE Int. Conf. on Image Processing (ICIP ’09), 2009, pp. 1497–1500.
4. A. C. Gallagher and T. Chen, “Image authentication by detecting traces of demosaicing,” in IEEE Computer Vision and Pattern Recognition Workshops (CVPRW 2008), 2008, pp. 1–8.
5. A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” Journal of the Royal Statistical Society: Series B 39, pp. 1–38, 1977.
6. S. Mallat, “A theory for multiresolution signal decomposition: the wavelet representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674–693, July 1989.
7. Demo is available on the web site http://iapp.det.unifi.it/index.php?page=source-code_en