
Image Forgery Localization via Fine-Grained Analysis of CFA Artifacts

P. Ferrara¹, T. Bianchi², A. De Rosa³, A. Piva¹

¹ Dipartimento di Ingegneria dell’Informazione, Università di Firenze

² Dipartimento di Ingegneria Elettronica e delle Telecomunicazioni, Politecnico di Torino

³ Consorzio Nazionale InterUniversitario delle Telecomunicazioni

IEEE Transactions on Information Forensics and Security, vol. 7, no. 5, pp. 1566–1577, Oct. 2012

Overview

Introduction

• Image Forensics
• Color Filter Array and Demosaicing

Proposed Method

• Local analysis of the prediction error
• Gaussian Mixture Model and Expectation-Maximization Algorithm
• Bayesian approach for localization

Experimental evaluation

• Model Validation
• ROC curves

Conclusions

Fake Images

The spread of digital cameras and photo-editing software allows anyone to create photomontages, without leaving visible traces

The authenticity of digital images must be ensured in many areas:

• Crime investigation
• Video surveillance
• Journalism
• Insurance
• Science
• Health
• And so on…

Image Forensics

A branch of Digital Forensics whose aims are:

• Image authentication
• Source identification

Digital fingerprints:

• In-camera processing (CFA, demosaicing, white balancing, etc.)
• Out-camera processing (compression, tampering, enhancing, etc.)

Colour Filter Array and Demosaicing

Bayer CFA

Demosaicing (interpolation) introduces a correlation between pixels.

These artifacts are periodic, because of the spatial layout of the Color Filter Array.

Prediction Error

$$e(x,y) = I(x,y) - \sum_{u,v \neq 0} \alpha_{u,v}\, I(x+u, y+v)$$

The sum is the value interpolated from a neighborhood of pixels.

• e(x,y) is a high-pass random signal
• Usually modeled by a Laplacian or Gaussian distribution
• e(x,y) is a cyclostationary random process → distribution parameters change periodically

Prediction Error

Working on the green channel

Bayer pattern: quincunx of acquired and interpolated pixels

𝑨 is the set of acquired pixels

𝑰 is the set of interpolated pixels

Prediction Error

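$$e(x,y) = I(x,y) - \sum_{u,v \neq 0} \alpha_{u,v}\, I(x+u, y+v)$$

Assumption: the interpolation kernel $\alpha_{u,v}$ is known

• If $(x,y) \in A$ → $e(x,y) \sim N(0, \sigma_{lA})$
• If $(x,y) \in I$ → $e(x,y) = 0$ ideally

In practice, for $(x,y) \in I$: $e(x,y) \sim N(0, \sigma_{lI})$, where $\sigma_{lA} > \sigma_{lI}$

Periodic structure → cyclostationarity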
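The prediction-error model can be illustrated with a minimal Python sketch. It assumes a bilinear kernel (α = 1/4 on the four nearest neighbours) and a quincunx in which green is acquired where x + y is even. Both choices, and all function names, are illustrative assumptions and not the authors' implementation.

```python
import random

# Assumed bilinear predictor for the green channel: each pixel is predicted
# as the mean of its four horizontal/vertical neighbours (alpha = 1/4 each).
BILINEAR_KERNEL = {(-1, 0): 0.25, (1, 0): 0.25, (0, -1): 0.25, (0, 1): 0.25}


def is_acquired(x, y):
    """Assumed quincunx layout: green is acquired where x + y is even."""
    return (x + y) % 2 == 0


def prediction_error(img, kernel=BILINEAR_KERNEL):
    """e(x,y) = I(x,y) - sum alpha[u,v] * I(x+u, y+v); borders are left at 0."""
    h, w = len(img), len(img[0])
    err = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            pred = sum(a * img[y + v][x + u] for (u, v), a in kernel.items())
            err[y][x] = img[y][x] - pred
    return err


def synthetic_green(h, w, seed=0):
    """Toy green channel: random samples on A, bilinear interpolation on I."""
    rng = random.Random(seed)
    g = [[rng.uniform(0, 255) if is_acquired(x, y) else 0.0 for x in range(w)]
         for y in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if not is_acquired(x, y):
                g[y][x] = 0.25 * (g[y - 1][x] + g[y + 1][x]
                                  + g[y][x - 1] + g[y][x + 1])
    return g
```

On this toy image the error vanishes (up to rounding) on interior 𝑰 pixels and not on 𝑨 pixels, which is the ideal case described above.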

State of the Art

• Expectation-Maximization algorithm by A. C. Popescu and H. Farid [1]
• M. Kirchner [2]
• Dirik and Memon [3]
• Gallagher and Chen [4]

Idea: absence of periodic correlation → forgery

Limitations of previous works

• They work on the whole image or on a large sub-region of the image
• Fine localization of the tampering is not possible
• Performance is content dependent

Innovative approach

• Fine-grained localization of artifacts
• Content-independent approach

Overall System

Local variance of prediction error

To estimate 𝝈 for both classes, we employ a local weighted estimate of the variance, which limits the dependence on the image content.

Weights have a Gaussian shape

𝝈𝒍𝑨(𝒙, 𝒚) is estimated by 𝑨 pixels only

𝝈𝒍𝑰(𝒙, 𝒚) is estimated by 𝑰 pixels only

L-statistics

We need a metric to distinguish locally between the presence and the absence of periodic patterns.

We work on sub-blocks of the image. The block size 𝑩 is a multiple of 𝟐 × 𝟐 pixels, the size of the Bayer filter.

$$L(k,l) = \log\left[\frac{GM\{\sigma^2_{lA}(k,l)\}}{GM\{\sigma^2_{lI}(k,l)\}}\right]$$

• $GM\{\cdot\}$: geometric mean
• $\sigma^2_{lA}(k,l)$: local variance of A pixels in the (k,l)-th block
• $\sigma^2_{lI}(k,l)$: local variance of I pixels in the (k,l)-th block
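The local variance and the L-statistic can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the Gaussian window radius and width, the quincunx layout (acquired where x + y is even), the convention that k indexes block columns and l block rows, and the small constant `eps` that guards the logarithm are all illustrative choices not specified on the slides.

```python
import math


def is_acquired(x, y):
    """Assumed quincunx layout: green is acquired where x + y is even."""
    return (x + y) % 2 == 0


def local_variance(err, x, y, radius=3, sigma=1.0):
    """Gaussian-weighted variance of err around (x, y), using only pixels
    of the same class (A or I) as (x, y)."""
    h, w = len(err), len(err[0])
    same = is_acquired(x, y)
    sw = se = se2 = 0.0
    for v in range(-radius, radius + 1):
        for u in range(-radius, radius + 1):
            xx, yy = x + u, y + v
            if 0 <= xx < w and 0 <= yy < h and is_acquired(xx, yy) == same:
                wgt = math.exp(-(u * u + v * v) / (2.0 * sigma * sigma))
                sw += wgt
                se += wgt * err[yy][xx]
                se2 += wgt * err[yy][xx] ** 2
    mean = se / sw
    return max(se2 / sw - mean * mean, 0.0)


def l_statistic(err, k, l, block=2, eps=1e-12):
    """L(k,l) = log(GM of A variances / GM of I variances) in block (k,l)."""
    log_a = log_i = 0.0
    n_a = n_i = 0
    for y in range(l * block, (l + 1) * block):
        for x in range(k * block, (k + 1) * block):
            log_var = math.log(local_variance(err, x, y) + eps)
            if is_acquired(x, y):
                log_a += log_var
                n_a += 1
            else:
                log_i += log_var
                n_i += 1
    # The mean of the logs is the log of the geometric mean.
    return log_a / n_a - log_i / n_i
```

Because the two classes are handled separately, scaling the error on 𝑨 pixels by a factor c shifts L by 2·log(c), while a region without CFA artifacts gives values around 0.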

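Rationale

In presence of CFA (M1 hypothesis)

Assumption: working on a 𝟐 × 𝟐 block of the green channel of the Bayer pattern (two acquired and two interpolated pixels)

$$L(k,l) = \log\{\sigma^2_{lA}(k,l+1)\} + \log\{\sigma^2_{lA}(k+1,l)\} - \left[\log\{\sigma^2_{lI}(k,l)\} + \log\{\sigma^2_{lI}(k+1,l+1)\}\right]$$

$$E[\sigma^2_{lA}] > E[\sigma^2_{lI}]$$

Central Limit Theorem:

$$\Pr\{L(k,l) \mid M_1\} \sim N(\mu_1, \sigma_1^2), \quad \text{with } \mu_1 > 0$$

(Figure: 𝟐 × 𝟐 green-channel block labeled with $\sigma^2_{lA}$ on acquired pixels and $\sigma^2_{lI}$ on interpolated pixels.)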

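Rationale

In absence of CFA (M2 hypothesis)

Assumption: working on a 𝟐 × 𝟐 block of the green channel of the Bayer pattern

$$L(k,l) = \log\{\sigma^2_{lA}(k,l+1)\} + \log\{\sigma^2_{lA}(k+1,l)\} - \left[\log\{\sigma^2_{lI}(k,l)\} + \log\{\sigma^2_{lI}(k+1,l+1)\}\right]$$

$$E[\sigma^2_{lA}] = E[\sigma^2_{lI}]$$

Central Limit Theorem:

$$\Pr\{L(k,l) \mid M_2\} \sim N(0, \sigma_2^2)$$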

Gaussian Mixture Model

When a forged region is present, L(k,l) is distributed as a mixture of Gaussians:

$$p(x \mid \theta) = \pi\, N(\mu_1, \sigma_1^2) + (1 - \pi)\, N(0, \sigma_2^2)$$

where $\pi$ is the mixing weight.

To estimate the parameters of the model, we employ the Expectation-Maximization algorithm.

Expectation-Maximization Algorithm

Introduced by Dempster, Laird and Rubin in 1977 [5], it is an iterative algorithm for parameter estimation with incomplete data, as in Gaussian mixture models.

It estimates the mean and the variance of the component distributions by maximizing the expected value of the complete log-likelihood function with respect to the distribution parameters.

In our case, the mean 𝝁𝟐 is fixed to 0 when estimating the model.

Then, we need to estimate 𝜇1, 𝜎1 and 𝜎2.
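A minimal Python sketch of EM for a two-component mixture with the second mean held at 0 is given below. The initialization, the iteration count, the numerical floors and the name `fit_mixture` are assumptions made for illustration. The mixing weight is estimated together with the other parameters, which the slide does not list explicitly.

```python
import math


def normal_pdf(x, mu, var):
    """Density of N(mu, var) evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_mixture(values, iters=100):
    """EM for p(x) = pi*N(mu1, var1) + (1 - pi)*N(0, var2); mu2 is fixed at 0.

    Returns (pi, mu1, var1, var2).
    """
    n = len(values)
    mean = sum(values) / n
    var = max(sum((x - mean) ** 2 for x in values) / n, 1e-9)
    # Crude initial guess (an assumption): component 1 starts at the largest value.
    pi, mu1, var1, var2 = 0.5, max(values), var, var
    for _ in range(iters):
        # E-step: responsibility of component 1 (CFA present) for each sample.
        resp = []
        for x in values:
            p1 = pi * normal_pdf(x, mu1, var1)
            p2 = (1.0 - pi) * normal_pdf(x, 0.0, var2)
            resp.append(p1 / (p1 + p2) if p1 + p2 > 0.0 else 0.5)
        # M-step: weighted updates; the mean of component 2 is never updated.
        n1 = max(sum(resp), 1e-12)
        n2 = max(n - sum(resp), 1e-12)
        pi = min(max(n1 / n, 1e-6), 1.0 - 1e-6)
        mu1 = sum(r * x for r, x in zip(resp, values)) / n1
        var1 = max(sum(r * (x - mu1) ** 2 for r, x in zip(resp, values)) / n1, 1e-9)
        var2 = max(sum((1.0 - r) * x * x for r, x in zip(resp, values)) / n2, 1e-9)
    return pi, mu1, var1, var2
```

Fixing 𝝁𝟐 = 0 simply means that the M-step computes the variance of the second component around 0 instead of around a weighted mean.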

Bayesian Approach to Localization

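From Bayes’ Theorem, by assuming that Pr{M1} = Pr{M2} = 1/2 (maximum uncertainty):

$$\Pr\{M_1 \mid L(k,l)\} = \frac{\Pr\{L(k,l) \mid M_1\}}{\Pr\{L(k,l) \mid M_1\} + \Pr\{L(k,l) \mid M_2\}}$$

which we simplify to

$$\Pr\{M_1 \mid L(k,l)\} = \frac{1}{1 + \mathcal{L}[L(k,l)]}$$

where $\mathcal{L}$ is a likelihood map:

$$\mathcal{L}[L(k,l)] = \frac{\Pr\{L(k,l) \mid M_2\}}{\Pr\{L(k,l) \mid M_1\}}$$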
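The posterior under equal priors can be sketched as follows. The sketch assumes the two Gaussian likelihoods N(μ1, σ1²) and N(0, σ2²) from the mixture model and works in the log domain for numerical safety. The function names and the definition of the tampering map as 1 − Pr{M1 | L} are illustrative assumptions.

```python
import math


def log_normal_pdf(x, mu, var):
    """Log-density of N(mu, var) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mu) ** 2 / (2.0 * var)


def posterior_cfa(l_value, mu1, var1, var2):
    """Pr{M1 | L} = 1 / (1 + ratio), with ratio = Pr{L | M2} / Pr{L | M1}."""
    log_ratio = log_normal_pdf(l_value, 0.0, var2) - log_normal_pdf(l_value, mu1, var1)
    if log_ratio > 700.0:  # math.exp would overflow; the posterior is ~0
        return 0.0
    return 1.0 / (1.0 + math.exp(log_ratio))


def tampering_map(l_map, mu1, var1, var2):
    """Per-block probability that CFA artifacts are absent (M2)."""
    return [[1.0 - posterior_cfa(v, mu1, var1, var2) for v in row] for row in l_map]
```

With equal variances, a block whose L value lies halfway between 0 and 𝜇1 gets a posterior of exactly 1/2.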

Denoising

These maps are usually noisy.

Denoising by:

• Increasing the dimension B of the L-statistics
• Cumulating C blocks of L-statistics, assuming the L-values are i.i.d. random variables (multiplication of the likelihoods)
• Filtering: mean filter or median filter
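Two of these denoising options can be sketched in Python. The sketch assumes the map is stored as log-likelihood ratios, so that multiplying i.i.d. likelihoods becomes a sum, and it assumes that the C blocks are grouped as non-overlapping c × c tiles. Neither detail is stated on the slide.

```python
import statistics


def cumulate(log_ratio, c=2):
    """Sum log-likelihood ratios over non-overlapping c x c groups of blocks.

    Under the i.i.d. assumption this equals multiplying the likelihood ratios.
    """
    h, w = len(log_ratio) // c, len(log_ratio[0]) // c
    return [[sum(log_ratio[i * c + di][j * c + dj]
                 for di in range(c) for dj in range(c))
             for j in range(w)]
            for i in range(h)]


def median_filter(m, radius=1):
    """Median filter with a (2*radius + 1)^2 window, clipped at the borders."""
    h, w = len(m), len(m[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [m[a][b]
                      for a in range(max(0, i - radius), min(h, i + radius + 1))
                      for b in range(max(0, j - radius), min(w, j + radius + 1))]
            out[i][j] = statistics.median(window)
    return out
```

Cumulating trades spatial resolution for reliability, while the median filter keeps the resolution and removes isolated outliers.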

Experimental Results

Dataset

Model validation: HOW GOOD IS THE MODEL WE ASSUMED?

Performance evaluation: HOW DOES THE SYSTEM WE PROPOSED WORK?

Dataset

• 400 images from different cameras: Canon EOS 450D, Nikon D50, Nikon D90, Nikon D7000

• Each camera is equipped with a known Bayer CFA pattern

• The in-camera demosaicing algorithm is unknown

• Each image is cropped to 512x512, maintaining the original Bayer pattern

• 4 different predictors are employed:
– Bilinear
– Bicubic
– Median
– Gradient-based

• Absence of CFA artifacts is obtained by resizing to 200%, 7x7 median filtering and subsampling by 50%

Model Validation

• Generalized Gaussian distribution (GGD)

$$P(L) = \frac{1}{Z}\, e^{-\left(|L - \mu| / \eta\right)^{\nu}}$$

where $\nu$ is the shape parameter ($\nu = 2$ corresponds to a Gaussian).

(Table: median values of the shape parameter, estimated for each image by using [6], with 8x8 L-statistics.)

Model Validation

Histograms of the mean values of L under both hypotheses, when the predictor matches the interpolation algorithm

(Figure panels: Bilinear, Bicubic, Median, Gradient-based.)

Model Validation

(Figure grid: rows are cameras: Canon EOS 450D, Nikon D50, Nikon D7000, Nikon D90; columns are predictors: Bilinear, Bicubic, Median, Gradient-based.)

Performance Evaluation

Image quality:

• Ideal case: uncompressed images with matching between interpolation and prediction algorithms
• Uncompressed images, no matching between interpolation and prediction algorithms
• Compressed images and no matching:
– JPEG with quality factor 100%
– JPEG with quality factor 95%
– JPEG with quality factor 90%
– JPEG with quality factor 85%

Tampering: a centered square region of varying size (32x32, 64x64 and 128x128), obtained by removing the CFA artifacts through resizing and filtering.

In general, low-pass processing decreases the performance of the localization

ROC Curve

Localization performance

(Figure: ROC curve, 𝑅𝑇𝑃 versus 𝑅𝐹𝑃.)

• 𝑹𝑻𝑷 = True Positive Rate
• 𝑹𝑭𝑷 = False Positive Rate
• AUC = area under the ROC curve
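How an ROC curve and its AUC are obtained from a probability map can be sketched as follows. The sketch assumes per-block tampering scores and a binary ground truth (1 = tampered); the function names and the trapezoidal integration are illustrative choices.

```python
def roc_curve(scores, labels):
    """Sweep a threshold over the scores; return (fpr, tpr) lists.

    labels: 1 for tampered blocks, 0 for original blocks.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Blocks with equal scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
               for i in range(1, len(fpr)))
```

An AUC of 1 means perfect separation of tampered and original blocks, while 0.5 corresponds to random guessing.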

ROC Curve: Predictor

(Figure panels: demosaicing is known; demosaicing is unknown.)

ROC Curve: Tampering Dimension and Filtering

(Figure panels: 128x128; 64x64; 32x32; filtering.)

Comparison with the State of the Art

• DM, method proposed in [3]: based on the MSE between interpolated and acquired pixels
• GC – B, method proposed in [4]: based on the blockwise DFT of the variance of the prediction error
• GC – L, method proposed in [4]: based on the sliding DFT of the variance of the prediction error

Resolution is fixed to 8x8 pixels for each block

Examples

(Figure: uncompressed images; panels DM, GC – B, GC – L.)

Examples

(Figure: DM; panels JPEG 100%, JPEG 95%, JPEG 90%.)

Examples

(Figure: GC – B.)

Examples

(Figure: GC – L.)

Conclusions

A forgery localization algorithm has been developed that exploits the periodic correlations introduced by demosaicing.

The system is based on a fine-grained local analysis of the variance of the prediction error, on a Gaussian Mixture Model and on a Bayesian approach for localizing forgeries.

The system is accurate when the JPEG quality factor is high.

However, the detection performance is strongly affected by JPEG compression.

In uniform or very sharp regions, automatic detection may give a considerable false positive rate.

Therefore, human interpretation of the forgery maps is still required to limit the incidence of false positives.

References

1. A. C. Popescu and H. Farid, “Exposing digital forgeries in color filter array interpolated images,” IEEE Trans. Signal Proc., pp. 3948–3959, Oct. 2005.

2. M. Kirchner, “Fast and reliable resampling detection by spectral analysis of fixed linear prediction residue,” in 10th ACM Multimedia and Security Workshop (MM&Sec ’08), 2008, pp. 11–20.

3. A. E. Dirik and N. Memon, “Image tamper detection based on demosaicing artifacts,” in 16th IEEE Int. Conf. on Image Processing (ICIP ’09), 2009, pp. 1497–1500.

4. A. C. Gallagher and T. Chen, “Image authentication by detecting traces of demosaicing,” in IEEE Computer Vision and Pattern Recognition Workshops (CVPRW 2008), 2008, pp. 1–8.

5. A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” Journal of the Royal Statistical Society: Series B 39, pp. 1–38, 1977.

6. S. Mallat, “A theory for multiresolution signal decomposition: the wavelet representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674–693, July 1989.

7. Demo is available on the web site http://iapp.det.unifi.it/index.php?page=source-code_en




Thank you for your kind attention
