multimedia quality of...
TRANSCRIPT
Outline
Quality of Service vs Quality of Experience
Multimedia QoE – subjective approach
Multimedia QoE – psycho-visual approach in video quality evaluation
Multimedia QoE – objective approach in image quality evaluation
Quality of Service
The “Quality of Service” (QoS) is the totality of characteristics of a
telecommunications service that bear on its ability to satisfy stated and
implied needs of the user of the service.
Following this definition implies a strongly technological approach, focused on the performance of the systems.
International Telecommunication Union, "ITU-T Rec. E.800: Definitions of terms related to quality of
service," ITU, 2009
QoS
QoS: born with the first multimedia transmissions (radio, phone, TV).
Monitor and solve problems related to a young/unstable/complex technology:
Transmission devices
Protocols (if any…)
Physical aspects (power, distance, environmental condition, etc…)
Examples:
Bandwidth monitoring
Error rates monitoring
Throughput
Transmission delay
…
QoS
1871: first telephone patent filed (Meucci)
1895: first radio transmission (Marconi)
1909: first tv demonstration (Rignoux, Fournier)
1969: ARPANET
1989: World Wide Web (Tim Berners-Lee)
QoS: mature and stable paradigm for telecommunications when Internet and
the WWW were invented.
QoS led the development of the new technologies.
QoS
Early 2000s: ICT technologies have become extremely popular and cheap
Resource allocation has never been easier
Web 2.0: explosion of user-generated content
Smartphones: explosion of user-generated multimedia content
QoS
Evaluating the QoS of a transmission technology means understanding its
capacity to transmit a message (data) in terms of its
Reliability
Availability
Velocity
Delay
QoS techniques for Internet networks:
Traffic classification
Routing
Traffic shaping
Scheduling
Traffic control
QoS objective: the perfect optimization of resources.
No human subjectivity is included in the evaluation.
Good for many industries and fields (including legal/contract law)
QoS open discussions
Do service providers/apps need to give the user a 100% optimized service
even if the user would be perfectly satisfied with a 75%, 85% or 95% optimized
one?
Can we accept a Six Sigma optimization if the user perceives a low-quality service?
Quality of Experience: definitions
2013: Qualinet, “Qualinet White Paper on Definitions of Quality of Experience
(QoE) and Related Concepts,” Qualinet, Novi Sad
http://www.qualinet.eu/
“Quality” is the outcome of an individual’s comparison and judgment process.
It includes perception, reflection about the perception, and the description of
the outcome.
An “Experience” is an individual’s stream of perception and interpretation of
one or multiple events.
Quality of Experience (QoE) is the degree of delight or annoyance of the user
of an application or service. It results from the fulfillment of his or her
expectations with respect to the utility and/or enjoyment of the application
or service in the light of the user’s personality and current state.
Quality of Experience: definitions
Notes:
«Working definition»
It can be applied to an «application» or a «service»
Multi-disciplinary subject
The switch from the QoS approach to the QoE approach is happening, and will
not be over soon.
The USER is now the center of the service/app evaluation process.
QoE
Features:
Direct perception (e.g., noisiness, blurriness)
Interaction (e.g., response velocity, latency gap)
Usage situation (e.g., stability of the service)
Service (e.g., usability, usefulness)
QoE vs. QoS
QoE is heavily dependent on QoS.
Generic relationship focusing on a single factor:
$QoE = f(I_0)$
where parameter $I_0$ is controlled through QoS means.
The user-perceived quality of a content/application is more sensitive to changes when the quality level is already considered good or better, while changes in the QoS are barely perceivable if the QoE is already low.
QoS – QoE model
Weber-Fechner law: psychophysical principle that relates the magnitude of a physical stimulus and its perceived intensity:
$$dP = a \, \frac{dS}{S_0}$$
where $dP$, the differential perception, is directly proportional to the relative change of the physical stimulus, $dS / S_0$.
P. Reichl, S. Egger, R. Schatz and A. D'Alconzo, "The Logarithmic Nature of QoE and the Role of the Weber-Fechner Law in QoE Assessment," in IEEE International Conference on Communications (ICC), Cape Town, South Africa, 2010.
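A one-step integration shows why this differential law yields the logarithmic QoE-QoS relationship studied in the paper above. A minimal worked derivation (writing S for the running stimulus level and S_th for the perception threshold where P = 0):

```latex
% Weber-Fechner: from the differential form to the logarithmic law.
% dP = a * dS / S   (perception responds to the *relative* stimulus change)
\begin{align*}
  P(S) = \int_{S_{th}}^{S} a \, \frac{ds}{s} = a \ln \frac{S}{S_{th}}
\end{align*}
% Equal ratios of stimulus produce equal increments of perception, so an
% exponential improvement of a QoS parameter buys only a linear QoE gain.
```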
Challenges of QoE
Understanding the fundamentals of quality perception
Guidelines for system design and network planning
QoE metrics and models
QoE monitoring
QoE-centric network management and service management
Subjective Quality Assessment
Interviewing human users to collect measures of the perceived quality of a
service/app/digital content.
Mean Opinion Score: the only officially accepted metric in the current state of the
art.
Setting up a subjective experiment is quite complex and time consuming:
ITU-T Rec. P.910, "Subjective video quality assessment methods for multimedia applications," ITU Telecommunication Standardization Sector, 04/2008.
ITU-T Rec. P.911, "Subjective audiovisual quality assessment methods for multimedia applications," ITU Telecommunication Standardization Sector, 12/1998.
ITU-T Rec. P.913, "Methods for subjectively assessing audiovisual quality of internet video and distribution quality television, including separate assessment of video quality and audio quality, and including multiple environments," ITU Telecommunication Standardization Sector, 01/2014.
Core characteristics of a SQ evaluation
Evaluation scale (MOS): continuous or discrete; 5-point, 10-point or 100-point
Typology: No Reference, Reduced Reference, Full Reference
Single stimulus/double stimulus (MOS or Differential MOS, DMOS)
Subjects (QoE experts, naive viewers)
Physiological metrics: heart rate, Human Visual System (HVS) behavior
Task performance (speed of goal completion)
Impossible to use in real-time environments
Sample test for subjective video quality assessment
Step 1: Selection of test material
Generally provided by the client of the evaluation/state of the art literature
Digital content has a significant influence on the test results
From 8 to 16 different contents must be included in the test (e.g., from www.vqeg.org: Susie, Waterfall, Tree, Tempete, Mobile & Calendar ...)
Sample test for subjective video quality assessment
Step 2: choice of testers
Experts vs. non-experts
All must pass an eligibility threshold (acuity, color perception)
Snellen Eye Chart
Ishihara test plate
Acceptable: 16 to 24 testers to get statistically acceptable results
Experts only for tests aimed at developing algorithms; a test that simulates the perception of the average consumer will use non-expert testers.
Sample test for subjective video quality assessment
Step 3: Design of experiments session
Each subject should not usually watch more than 90 minutes of content, in
sessions of 30 minutes max.
The viewing mode is selected (single monitor, triple monitor), together with the physical parameters (screen size, distance, light)
The external environment plays a role too: brightness and calibration of the
display compared to the light of the room, lab conditions or real-life
conditions… etc.
Sample test for subjective video quality assessment
Step 3: Design of experiments session
Definition of session warm-up and reset.
Generation of instructions to testers.
Choice of Single or Double Stimulus test.
Choice of trial sequences.
Sample test for subjective video quality assessment
Step 4: Choosing the quality scale
Double stimulus impairment method
5-point discrete scale (5 down to 1) with meanings:
Imperceptible
Perceptible but not annoying
Slightly annoying
Annoying
Very annoying.
Double stimulus quality scale method
Continuous scale (0 to 100) via an ad hoc application (slider).
Sample test for subjective video quality assessment
Step 5: data analysis
We obtain the quality score averaged over all test participants, for EACH test sequence and for EACH variable studied
We analyze the statistics to evaluate which variables are significant and which are irrelevant to the perception of video quality
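As a sketch of this step (the data layout — one raw opinion score per row with tester, sequence and condition columns — is a hypothetical example, not the lecture's format), the per-sequence/per-condition MOS and a 95% confidence interval could be computed as follows:

```python
import pandas as pd
from scipy import stats

# Hypothetical layout: one row per collected rating.
scores = pd.DataFrame({
    "tester":    [1, 2, 3, 1, 2, 3],
    "sequence":  ["Susie"] * 3 + ["Waterfall"] * 3,
    "condition": ["ref"] * 3 + ["br_450"] * 3,
    "score":     [5, 4, 4, 3, 3, 2],
})

# MOS per (sequence, condition): the mean of the raw opinion scores.
mos = scores.groupby(["sequence", "condition"])["score"].agg(["mean", "std", "count"])

# 95% confidence interval on each MOS (t distribution, small samples).
mos["ci95"] = stats.t.ppf(0.975, mos["count"] - 1) * mos["std"] / mos["count"] ** 0.5
print(mos.rename(columns={"mean": "MOS"}))
```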
Sample test for subjective video quality assessment
Which variables?
Psychological variables (interest and expectations of the user)
Display properties (resolution, physical dimensions, size, viewing distance)
Properties of coding / transmission (bit rate, encoder, video resolution)
Viewing conditions (artificial lighting, contrast of the screen)
Fidelity (due to compression: color saturation, naturalness)
Is the sound also evaluated? (synchronization, sound quality)
Others, defined by the tester
Test for video quality assessment
Role of the Engineer:
Set up the test and characterize it completely.
Recruit testers and verify their reliability.
Run the test and consolidate the data.
Present statistical results with an analysis of their meaning and possible interpretations.
Objective Quality Metrics
NON STANDARD
Idea: find mathematical/ML methodologies or metrics that can replicate the
human SQA
Challenge: the mathematical methodology must ideally have a perfect correlation with the subjective evaluation, because the perceived quality is subjective by definition.
Objective Quality Metrics: characterization criteria
Target service: VoIP, IPTV, web browsing, video conferencing, etc…
Model type: presence of reference signal.
Application: codec testing, network transmission testing, monitoring, etc.
Model input: parameters describing the processing path.
Model output: overall quality or specific quality aspects, in terms of MOS or
using another equivalent index.
Modelling approach: Psychophysical or Statistical/Empirical
Model type characterization
No Reference: impossible to analyze the original unimpaired content
Reduced Reference: only a subset of the original unimpaired content is
available
Full Reference: the full set of original unimpaired content is available
Most relevant: NR.
The original unimpaired content is on one end of the transmission channel, the
user is subject to the digital content on the other end.
Outside lab conditions, it is almost impossible to know and reverse-engineer all the elaboration a digital content goes through when transmitted.
Video QoE – HVS cognitive process characteristics
Observers' gazes are usually focused on the same regions of the same stimuli.
RoI, Regions of Interest model
Motion attracts the gaze: the HVS evolved to keep moving objects in the center of the fovea. The HVS doesn't react well to large accelerations of objects or to unpredictable movements.
Change-blindness
How to define the RoIs?
Eye-Tracking
Eye-Tracking
Saccade: interval of time when the eye is moving from one focus point to
another
Fixation: interval of time when the eye is gathering information
Gaze-map: representation of the eye-tracked data
Raw map: the eye position is sampled at a constant frequency
Fixations map: only fixations are shown
Saliency-map: most watched regions of the stimuli.
A Saliency-map is the ground truth for the RoI models of QoE metrics.
Average Gaze Point Weight
Aim: anticipate the human quality ranking of a set of video clips without analyzing the digital content
INPUT DATA? The “involuntary” human behavior – not the digital content or a model of the HVS.
Principle:
Eye reacts differently to different quality stimuli
J. Lassalle, L. Gros and G. Coppin, "Combination of Physiological and Subjective Measures to Assess Quality of Experience for Audiovisual Technologies," in Third International Workshop on Quality of Multimedia Experience, Mechelen, Belgium, 2011.
“[…] the viewing strategy changes as a function of the distortion severity […]”
A. Mittal, A. K. Moorthy, W. Geisler and A. C. Bovik, "Task dependence of visual attention on compressed videos: point of gaze statistics and analysis," in Human Vision and Electronic Imaging XVI, San Francisco, CA, 2011.
Semantic dependency
HVS behavior strongly depends on the semantics of the content of the stimulus.
Principle of general agreement between viewers
R. B. Goldstein, R. L. Woods and E. Peli, "Where people look when watching movies: do all viewers
look at the same place?," Computers in Biology and Medicine, vol. 37, no. 7, pp. 957 - 964, 2007.
L. B. Stelmach, W. J. Tam and P. J. Hearty, "Static and dynamic spatial resolution in image coding:
An investigation of eye movements," in Human Vision, Visual Processing, and Digital Display II, San
Jose, CA, 1992
Memory effect: can only be eliminated by an accurate design of the validation
protocol
The algorithm
Step 1: dataset creation and initial filtering
Step 2: timestamp normalization and clustering
Step 3: Gaze-Maps generation
Step 4: R-dependent voting process analysis
The algorithm: step 1
Required data
Set of stimuli impaired to create different quality levels of the same semantics.
The more different semantics, the better.
Eye-Tracked data. The records must include:
X and Y coordinates of the position of the user’s gaze on the screen while watching each
stimulus.
Timestamps
Initial filtering
Clean the dataset of outlying measures, usually caused by a misreading of the device or by the user moving their gaze out of the recording plane
Negative values of the gaze coordinates are excluded.
The algorithm: step 2
Each set of gaze records is elaborated into gaze maps through clustering over intervals of 1 s.
All the records whose timestamp (ms) is included in $[t_i, t_i + 1000]$ are averaged to create the clustered Gaze-Point ($cGP_k$) relative to that interval, for that user, on that stimulus.
The procedure is repeated for each user’s scanpath.
Example 1: from the recorded scanpath to the clustered Gaze-Points
Different colors show different 1 s intervals. The
simplified example shows a 3 s scanpath.
Fixation and saccade records have the same relevance: our algorithm includes the study of the saccadic behavior.
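A minimal numpy sketch of steps 1-2 for one scanpath (the record layout — timestamps in ms plus x/y screen coordinates — follows the required data listed above; the function name and windowing details are illustrative assumptions):

```python
import numpy as np

def clustered_gaze_points(t_ms, x, y, win_ms=1000):
    """Average all gaze records falling in each [t_i, t_i + win_ms) window
    into one clustered Gaze-Point (cGP) per window."""
    t_ms, x, y = map(np.asarray, (t_ms, x, y))
    # Step 1 filtering: drop records with negative coordinates (device
    # misreads, or the gaze leaving the recording plane).
    keep = (x >= 0) & (y >= 0)
    t_ms, x, y = t_ms[keep], x[keep], y[keep]
    # Step 2 clustering: one cGP per 1-second interval.
    bins = ((t_ms - t_ms.min()) // win_ms).astype(int)
    return np.array([(x[bins == b].mean(), y[bins == b].mean())
                     for b in np.unique(bins)])
```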
The algorithm: step 3
Clustered Gaze-Maps (cGM) generation.
Grouping of the elaborated scanpaths obtained from different observers on
the same stimulus.
At the end of this step, each stimulus corresponds to one cGM.
Example 2: from clustered Gaze-Points to clustered Gaze-Maps
[Figure: the clustered Gaze-Points of observers 1 … n on video «vid» are merged into the clustered Gaze-Map «vid».]
The algorithm: step 4
R-dependent voting process: self-definition of each cGP weight by the cGPs on the same cGM.
Contribution of $cGP_h$ to the weight of $cGP_i$:
$$w_{h \to i} = \begin{cases} 1, & \text{if } cGP_h \text{ belongs to the neighborhood of } cGP_i \\ 0, & \text{otherwise} \end{cases}$$
Neighborhood of $cGP_i$: the circle centered at $cGP_i$ with radius R (pixels).
This analysis is performed on all the clustered Gaze-Maps.
The Average Gaze-Point Weight ($AGPW_k$) of $cGM_k$ is the average of all the weights of the cGPs on $cGM_k$: one value for each cGM.
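A sketch of steps 3-4 for one clustered Gaze-Map (cGPs as an (n, 2) array of pixel coordinates; whether a cGP votes for itself is not stated in the slides, so this version excludes self-votes):

```python
import numpy as np

def agpw(cgm, R):
    """Average Gaze-Point Weight of one clustered Gaze-Map: the weight of
    each cGP is the number of other cGPs within radius R (pixels) of it."""
    cgm = np.asarray(cgm, dtype=float)                  # shape (n, 2)
    d = np.linalg.norm(cgm[:, None, :] - cgm[None, :, :], axis=-1)
    votes = d <= R
    np.fill_diagonal(votes, False)                      # no self-votes (assumption)
    return votes.sum(axis=1).mean()                     # one AGPW per cGM
```

Sweeping R and averaging agpw over stimuli of the same quality level reproduces the kind of analysis shown in the Results slides below.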
Example 3: R-dependent voting process
[Figure: scheme of the cGM for stimulus «vid», with all the cGPs obtained from all the observers; the weight of the highlighted cGP for $R = R_0$ is 4.]
Experimental validation
Database of 19 videos, each impaired twice: 57 total stimuli taken from the current literature.
Reference, compressed at 450 b/s and compressed at 150 b/s
Three playlists with mixed quality stimuli and no repeated semantic: no memory
effect in the results.
6 testers per playlist, for a total of 18 people involved. The same setup was used in (Mittal, et al. 2011) to study the HVS behavior.
Absolute Category Rating with a 5-point discrete scale. The MOS was collected after each viewing and averaged.
Elaboration | Avg. MOS | Variance
Reference   | 3.77     | 0.74
Br_450      | 2.81     | 0.66
Br_150      | 1.86     | 0.42
Results
The independent variable is R on both graphs.
1) QLAGPW: average of AGPW over videos of the same quality level.
2) SDQLGPW: standard deviation of AGPW over videos of the same quality
level.
Results
We can identify an interval of R on graph 1 where the curves are completely monotonic and separated, 10 < R < 27 (analysis radius in [100; 270]).
The average gaze-point weight of videos is inversely proportional to the perceived quality in that interval.
Graph 2 shows that the SDQLGPW has the same regular behavior: the higher the perceived quality, the lower the standard deviation of the GP weight.
Interpretation
The lower weight of Gaze-Points (given R) of HQ stimuli means that the cGPs on an HQ video are more distant from each other.
The lower standard deviation of Gaze-Point weights on HQ videos suggests that the AGPW difference between HQ videos is smaller than the one between LQ stimuli.
The lower AGPW of HQ videos suggests that viewers had time and chance to
gaze around the finest details
The higher average weight and standard deviation obtained on LQ stimuli,
instead, suggest that the viewer focused on smaller portions of the screen,
with a high density of “heavy” observations in it and a low number of “light”
observations outside the region of interest.
Image quality and similarity evaluation
Digital images are subject to a wide variety of distortions during
Acquisition
Processing
Compression
Storage
Transmission
Reproduction
Any of these may result in perceivable degradation of quality
The ultimate aim of data compression is to remove the redundancy from the source signal, thereby reducing the number of bits required to represent the information contained within the source.
What is Image Quality Assessment?
Image quality is a characteristic of an image that measures the perceived image degradation.
It plays an important role in various image processing applications.
The goal of image quality assessment is to supply quality metrics that can predict perceived image quality automatically.
Two Types of image quality assessment
Subjective quality assessment
Objective quality assessment
Subjective Quality Measure
Subjective image quality is how a viewer perceives an image,
including giving his or her opinion on a particular image.
The mean opinion score (MOS) has been used for subjective quality assessment for many years.
Too inconvenient, time-consuming and expensive
Example of MOS score
The MOS is generated by averaging the results of a set of standard, subjective tests.
Mean Opinion Score (MOS)
MOS | Quality   | Impairment
5   | Excellent | Imperceptible
4   | Good      | Perceptible but not annoying
3   | Fair      | Slightly annoying
2   | Poor      | Annoying
1   | Bad       | Very annoying
Objective Quality Measure
Mathematical models that approximate results of
subjective quality assessment
Goal of objective quality evaluation is to develop a
quantitative measure that can predict perceived image
quality.
It plays a variety of roles, for example:
To monitor and control image quality in quality control systems;
To benchmark image processing systems;
To optimize algorithms and parameters.
Objective evaluation
Three types of objective evaluation, classified according to the availability of the original image with which the distorted image is to be compared:
Full reference (FR)
No reference – Blind (NR)
Reduced reference (RR)
Full reference quality metrics
MSE and PSNR: the most widely used image “quality” metrics of the last 20 years.
SSIM: a newer metric (2004) that shows better results than PSNR with a reasonable increase in computational complexity.
Other metrics have been suggested by VQEG, private companies and universities, but are not as popular.
A great effort has been made to develop new objective quality measures for image/video that incorporate perceptual quality measures by considering human visual system (HVS) characteristics.
HVS – Human visual system
Quality assessment (QA) algorithms predict visual quality
by comparing a distorted signal against a reference,
typically by modeling the human visual system.
These measurement methods consider human visual
system (HVS) characteristics in an attempt to incorporate
perceptual quality measures.
MSE – Mean square error
MSE and PSNR are defined as
$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( x_{ij} - y_{ij} \right)^2, \qquad \mathrm{PSNR} = 10 \log_{10} \frac{L^2}{\mathrm{MSE}}$$
where x is the original image and y is the distorted image, M and N are the width and height of the image, and L is the dynamic range of the pixel values.
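These two definitions translate directly into numpy; a minimal sketch (x and y as equal-shape arrays; PSNR is undefined when MSE = 0):

```python
import numpy as np

def mse(x, y):
    """Pixel-wise mean squared error between original x and distorted y."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.mean((x - y) ** 2)

def psnr(x, y, L=255):
    """Peak signal-to-noise ratio in dB; L is the dynamic range (255 for 8-bit)."""
    return 10 * np.log10(L ** 2 / mse(x, y))
```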
Property of MSE
If the MSE decreases to zero, the pixel-by-pixel matching of the images becomes perfect.
If the MSE is small enough, this corresponds to a “high quality” compressed image.
In general, the MSE value increases as the compression ratio increases.
57
Original “Einstein” image with different distortions and MSE values
[Figure: (a) original image, MSE = 0; (b)–(g) six different distortions with nearly identical MSE values: 306, 309, 309, 313, 309, 308.]
SSIM – Structural similarity index
A recently proposed approach for image quality assessment.
A full-reference method for measuring the similarity between two images.
Values lie in [0, 1].
The SSIM is designed to improve on traditional metrics like PSNR and MSE, which have proved to be inconsistent with human visual perception. It is based on the human visual system.
SSIM measurement system
[Fig. 2: Structural Similarity (SSIM) Measurement System]
Example images at different quality levels and their SSIM index maps
Equation for SSIM
For two non-negative image signals x and y, aligned with each other:
Mean intensity: $\mu_x = \frac{1}{N} \sum_{i=1}^{N} x_i$
Standard deviation: $\sigma_x = \left( \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \mu_x)^2 \right)^{1/2}$
Luminance comparison: $l(x,y) = \frac{2 \mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}$; contrast comparison: $c(x,y) = \frac{2 \sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}$
$C_1$, $C_2$ are constants.
SSIM components
Structure comparison $s(x,y)$ is conducted on the normalized signals $(x - \mu_x)/\sigma_x$ and $(y - \mu_y)/\sigma_y$:
$$s(x,y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3}, \quad \text{with } C_3 = C_2 / 2$$
Equation for SSIM
$$\mathrm{SSIM}(x,y) = [l(x,y)]^{\alpha} \, [c(x,y)]^{\beta} \, [s(x,y)]^{\gamma}$$
$\alpha$, $\beta$ and $\gamma$ are parameters used to adjust the relative importance of the three components.
Property of SSIM
Symmetry: S(x,y) = S(y,x)
0 <= S(x,y) <= 1
Unique maximum:
S(x,y) = 1 if and only if x=y
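A minimal numpy sketch of the equations above, with $\alpha = \beta = \gamma = 1$ and $C_3 = C_2/2$, under which SSIM collapses to its usual closed form; K1 = 0.01 and K2 = 0.03 are the constants suggested in the original SSIM paper. A practical implementation computes this over local windows and averages the results (the MSSIM reported in the tables below); this single-window version only illustrates the formula:

```python
import numpy as np

def ssim_global(x, y, L=255, K1=0.01, K2=0.03):
    """Single-window SSIM with alpha = beta = gamma = 1 and C3 = C2/2."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + C1) * (2 * cov_xy + C2)
            / ((mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2)))
```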
MSE vs. MSSIM
MSE vs. SSIM simulation result
Type of Noise       | MSE    | MSSIM  | VIF
Salt & Pepper Noise | 228.34 | 0.7237 | 0.3840
Speckle Noise       | 225.91 | 0.4992 | 0.4117
Gaussian Noise      | 226.80 | 0.4489 | 0.3595
Blurred             | 225.80 | 0.7136 | 0.2071
JPEG compressed     | 213.55 | 0.3732 | 0.1261
Contrast Stretch    | 406.87 | 0.9100 | 1.2128
MSE vs. MSSIM
[Figure: MSE = 226.80, MSSIM = 0.4489 (Gaussian noise) vs. MSE = 225.91, MSSIM = 0.4992 (speckle noise)]
MSE vs. MSSIM
[Figure: MSE = 213.55, MSSIM = 0.3732 (JPEG compressed) vs. MSE = 225.80, MSSIM = 0.7136 (blurred)]
MSE vs. MSSIM
[Figure: MSE = 226.80, MSSIM = 0.4489 (Gaussian noise) vs. MSE = 406.87, MSSIM = 0.910 (contrast stretch)]
Why is MSE poor?
MSE and PSNR are widely used because they are easy to calculate and mathematically easy to deal with for optimization purposes.
There are a number of reasons why MSE or PSNR do not correlate well with the human perception of quality.
Digital pixel values, on which the MSE is typically computed, may not exactly represent the light stimulus entering the eye.
Simple error summation, like the one implemented in the MSE formulation, may be markedly different from the way the HVS and the brain arrive at an assessment of the perceived distortion.
Two distorted image signals with the same amount of error energy may have very different structure of errors, and hence different perceptual quality.
Visual Information Fidelity (VIF)
Relies on modeling of the statistical image source, the image distortion
channel and the human visual distortion channel.
VIF was developed for image and video quality measurement based on natural
scene statistics (NSS).
Images come from a common class: the class of natural scenes.
VIF – Visual Information Fidelity
Mutual information between C and E quantifies the information that the brain could ideally extract from the reference image, whereas the mutual information between C and F quantifies the corresponding information that could be extracted from the test image [11].
Image quality assessment is done based on information fidelity where the channel imposes fundamental limits on how much information could flow from the source (the reference image), through the channel (the image distortion process) to the receiver (the human observer).
VIF = Distorted Image Information / Reference Image Information
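In the notation of the VIF literature (C: the natural-scene source, E and F: the outputs of the HVS model for the reference and test images, s: the mixing field of the NSS model, j: the wavelet subband index), the ratio above is commonly written as below; this is a reconstruction from the cited description, not a formula taken from the slides:

```latex
% VIF as an information ratio: what the brain can extract from the test
% image (numerator) over what it can extract from the reference (denominator).
\mathrm{VIF} \;=\;
\frac{\sum_{j} I\!\left(C^{j}; F^{j} \mid s^{j}\right)}
     {\sum_{j} I\!\left(C^{j}; E^{j} \mid s^{j}\right)}
```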
VIF quality
The VIF has a distinction over traditional quality assessment methods: a linear contrast enhancement of the reference image that does not add noise to it will result in a VIF value larger than unity, thereby signifying that the enhanced image has a superior visual quality to the reference image.
No other quality assessment algorithm has the ability to
predict if the visual image quality has been enhanced by a
contrast enhancement operation.
SSIM vs. VIF
VIF and SSIM
Type of Noise       | MSE    | MSSIM  | VIF
Salt & Pepper Noise | 101.78 | 0.8973 | 0.6045
Speckle Noise       | 119.11 | 0.7054 | 0.5944
Gaussian Noise      | 65.01  | 0.7673 | 0.6004
Blurred             | 73.80  | 0.8695 | 0.6043
JPEG compressed     | 49.03  | 0.8558 | 0.5999
Contrast Stretch    | 334.96 | 0.9276 | 1.1192
VIF and SSIM
[Figure: VIF = 0.6045, MSSIM = 0.8973 (salt & pepper noise) vs. VIF = 0.5944, MSSIM = 0.7054 (speckle noise)]
VIF and SSIM
[Figure: VIF = 0.60, MSSIM = 0.7673 (Gaussian noise) vs. VIF = 0.6043, MSSIM = 0.8695 (blurred)]
VIF and SSIM
[Figure: VIF = 0.5999, MSSIM = 0.8558 (JPEG compressed) vs. VIF = 1.11, MSSIM = 0.9272 (contrast stretch)]
JPEG compressed image – Tiffany.bmp
Quality Factor | Compression Ratio | MSSIM
100            | 0                 | 1
1              | 52.79             | 0.3697
4              | 44.50             | 0.4285
7              | 33.18             | 0.5041
10             | 26.81             | 0.7190
15             | 20.65             | 0.7916
20             | 17.11             | 0.8158
25             | 14.72             | 0.8332
45             | 9.36              | 0.8732
60             | 7.68              | 0.8944
80             | 4.85              | 0.9295
90             | 3.15              | 0.9578
99             | 1.34              | 0.9984
Comparison of QF, CR and MSSIM
[Figure: Q.F = 100, CR = 0, MSSIM = 1 vs. Q.F = 1, CR = 52.79, MSSIM = 0.3697]
[Figure: Q.F = 4, CR = 44.50, MSSIM = 0.4285 vs. Q.F = 7, CR = 33.18, MSSIM = 0.5041]
[Figure: Q.F = 10, CR = 26.81, MSSIM = 0.7190 vs. Q.F = 15, CR = 20.65, MSSIM = 0.7916]
[Figure: Q.F = 20, CR = 17.11, MSSIM = 0.8158 vs. Q.F = 25, CR = 14.72, MSSIM = 0.8332]
[Figure: Q.F = 45, CR = 9.36, MSSIM = 0.8732 vs. Q.F = 80, CR = 4.85, MSSIM = 0.9295]
[Figure: Q.F = 90, CR = 3.15, MSSIM = 0.9578 vs. Q.F = 99, CR = 1.34, MSSIM = 0.9984]
What is Wavelet Analysis ?
What is a wavelet?
A wavelet is a waveform of effectively
limited duration that has an average value
of zero.
The Continuous Wavelet Transform (CWT)
A mathematical representation of the Fourier transform:
$$F(\omega) = \int_{-\infty}^{\infty} f(t) \, e^{-i \omega t} \, dt$$
Meaning: the sum over all time of the signal f(t) multiplied by a complex exponential; the results are the Fourier coefficients.
Wavelet Transform (Cont’d)
Those coefficients, when multiplied by a sinusoid of the appropriate frequency, yield the constituent sinusoidal components of the original signal.
Wavelet Transform
And the results of the CWT are wavelet coefficients.
Multiplying each coefficient by the appropriately scaled and shifted wavelet yields the constituent wavelet of the original signal.
CWT
Reminder: the CWT is the sum over all time of the signal, multiplied by scaled and shifted versions of the wavelet function.
Step 1:
Take a Wavelet and compare
it to a section at the start
of the original signal
CWT
Step 2:
Calculate a number, C, that represents
how closely correlated the wavelet is
with this section of the signal. The
higher C is, the more the similarity.
CWT
Step 3: Shift the wavelet to the right and
repeat steps 1-2 until you’ve covered
the whole signal
CWT
Step 4: Scale (stretch) the wavelet and
repeat steps 1-3
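The four steps map directly onto a brute-force implementation. A numpy sketch (the Ricker/“Mexican hat” wavelet is written out explicitly; a real implementation would use a wavelet library and normalize per scale):

```python
import numpy as np

def ricker(length, scale):
    """Ricker ("Mexican hat") wavelet: effectively limited duration, zero mean."""
    t = np.arange(length) - (length - 1) / 2
    u = t / scale
    return (1 - u ** 2) * np.exp(-u ** 2 / 2)

def cwt_bruteforce(signal, scales, support=10):
    """Steps 1-4: slide a scaled wavelet along the signal and record the
    correlation C at every (scale, shift) pair."""
    signal = np.asarray(signal, dtype=float)
    coeffs = np.zeros((len(scales), len(signal)))
    for i, s in enumerate(scales):                    # step 4: stretch the wavelet
        w = ricker(int(support * s), s)
        for tau in range(len(signal) - len(w)):       # step 3: shift to the right
            section = signal[tau:tau + len(w)]        # step 1: take a section
            coeffs[i, tau] = np.dot(section, w)       # step 2: correlation C
    return coeffs
```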
Types of Wavelets
There are many different wavelets, for example:
Morlet, Haar, Daubechies
Continuous Wavelet Transform
Define the continuous wavelet transform of f(x):
$$W(s, \tau) = \frac{1}{\sqrt{s}} \int_{-\infty}^{\infty} f(x) \, \psi^{*}\!\left( \frac{x - \tau}{s} \right) dx$$
This transforms a continuous function of one variable into a continuous function of two variables: translation ($\tau$) and scale ($s$).
The wavelet coefficients measure how closely correlated the wavelet is with each section of the signal.
Discrete Wavelet Transform
Don't need to calculate wavelet coefficients at every possible scale.
Can choose scales based on powers of two, and get equivalent accuracy.
We can represent a discrete function f(n) as a weighted summation of wavelets plus a coarse approximation (where $j_0$ is an arbitrary starting scale, and n = 0, 1, 2, … M):
$$f(n) = \frac{1}{\sqrt{M}} \sum_{k} W_{\varphi}(j_0, k) \, \varphi_{j_0,k}(n) + \frac{1}{\sqrt{M}} \sum_{j = j_0}^{\infty} \sum_{k} W_{\psi}(j, k) \, \psi_{j,k}(n)$$
Discrete wavelet transform
Approximations and Details:
Approximations: High-scale, low-
frequency components of the signal
Details: low-scale, high-frequency
components
Input Signal
LPF
HPF
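One level of this filter bank, written out for the Haar wavelet introduced on the next slide (pairwise averages act as the LPF, pairwise differences as the HPF; the 1/√2 factor keeps the step orthonormal). A sketch for an even-length 1-D signal:

```python
import numpy as np

def haar_dwt_step(x):
    """One Haar analysis step: approximation (low-pass) and detail (high-pass)."""
    x = np.asarray(x, dtype=float)
    cA = (x[0::2] + x[1::2]) / np.sqrt(2)   # approximations: low frequency
    cD = (x[0::2] - x[1::2]) / np.sqrt(2)   # details: high frequency
    return cA, cD

# A smooth ramp with one spike: the spike shows up only in the details cD.
cA, cD = haar_dwt_step([1, 2, 3, 4, 10, 4, 3, 2])
```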
Haar Wavelet
Scaling function: $\varphi(t) = 1$ for $0 \le t < 1$, $0$ otherwise
Mother wavelet: $\psi(t) = 1$ for $0 \le t < \frac{1}{2}$, $-1$ for $\frac{1}{2} \le t < 1$, $0$ otherwise
The mother wavelet must be
Normalized: $\| \psi \| = 1$
Finite: $\int_{-\infty}^{\infty} |\psi(t)|^2 \, dt < \infty$
Zero mean: $\int_{-\infty}^{\infty} \psi(t) \, dt = 0$
Lena
Let's consider an N × N image as a two-dimensional pixel array I with N rows and N columns.
Lena
Complex Wavelet SSIM
It extends the SSIM to the complex wavelet domain.
It is a “general purpose” similarity index according to its creators.
Robust to small rotations and translations.
Complex wavelets used: those that can be written as modulated low-pass filters.
The rationale: the structural information of local image features is carried by the relative phase patterns of the wavelet coefficients, and a consistent phase shift does not change the structure of the local image feature.
CWSSIM
Feature Point Hash scheme
Content-based image similarity (copy detection)
Perceptual Image Hash Functions
Wavelet based detection algorithm to extract feature points
Dictionary of base features (SIFT)
GSSIM
Not state of the art
Remembering that
We define
GSSIM
That is generalized to:
And finally, with an appropriate choice of coefficients
Or:
Experimental validation
TID 2008 and TID 2013
[Figures: TID 2008 dataset overview]
Experimental validation: ROC curves and empirical probability
Used to evaluate a classifier
X: false positive rate, Y: true positive rate
Perfect classifier: FPR = 0, TPR = 1
Area Under Curve (AUC)
Empirical probability for metric comparisons.
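A sketch of this evaluation with scikit-learn (the labels and scores are hypothetical: 1 if an image pair is a true copy/same content, 0 otherwise, with the similarity metric — e.g. GSSIM — as the score):

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Hypothetical ground truth and metric scores for image pairs.
y_true  = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_score = np.array([0.95, 0.90, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20])

fpr, tpr, thresholds = roc_curve(y_true, y_score)   # x: FPR, y: TPR
print("AUC =", auc(fpr, tpr))                       # 1.0 only for a perfect classifier
```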
AUC for GSSIM with different father wavelets
Empirical probability of GSSIM in its Haar implementation
Comparison with the state-of-the-art metrics
GSSIM for QoE
GSSIM for perceptual image similarity