multimedia quality of...
TRANSCRIPT
Outline
Quality of Service vs Quality of Experience
Multimedia QoE – subjective approach
Multimedia QoE – psycho-visual approach in video quality evaluation
Multimedia QoE – objective approach in image quality evaluation
Quality of Service
The “Quality of Service” (QoS) is the totality of characteristics of a
telecommunications service that bear on its ability to satisfy stated and
implied needs of the user of the service.
Following this definition implies a strongly technological approach, focused on the performance of the systems.
International Telecommunication Union, "ITU-T Rec. E.800: Definitions of terms related to quality of
service," ITU, 2009
QoS
QoS: born with the first multimedia transmissions (radio, phone, TV).
Monitor and solve problems related to a young/unstable/complex technology:
Transmission devices
Protocols (if any…)
Physical aspects (power, distance, environmental condition, etc…)
Examples:
Bandwidth monitoring
Error rates monitoring
Throughput
Transmission delay
…
QoS
1871: first telephone patent filed (Meucci)
1895: first radio transmission (Marconi)
1909: first tv demonstration (Rignoux, Fournier)
1969: ARPANET
1989: World Wide Web (Tim Berners-Lee)
QoS: mature and stable paradigm for telecommunications when Internet and
the WWW were invented.
QoS led the development of the new technologies.
QoS
Early 2000s: ICT technologies have become extremely popular and cheap
Resource allocation has never been easier
Web 2.0: explosion of user-generated content
Smartphones: explosion of user-generated multimedia content
QoS
Evaluating the QoS of a transmission technology means understanding its
capacity to transmit a message (data) in terms of its
Reliability
Availability
Velocity
Delay
QoS techniques for Internet networks:
Traffic classification
Routing
Traffic shaping
Scheduling
Traffic control
QoS objective: the perfect optimization of resources.
No human subjectivity is included in the evaluation.
Good for many industries and fields (including legal/contract law)
QoS open discussions
Do service providers/apps need to give the user a 100% optimized service
even if the user would be perfectly satisfied with a 75%, 85% or 95% optimized
one?
Can we accept a Six Sigma optimization if the user perceives a low-quality service?
Quality of Experience: definitions
2013: Qualinet, “Qualinet White Paper on Definitions of Quality of Experience
(QoE) and Related Concepts,” Qualinet, Novi Sad
http://www.qualinet.eu/
“Quality” is the outcome of an individual’s comparison and judgment process.
It includes perception, reflection about the perception, and the description of
the outcome.
An “Experience” is an individual’s stream of perception and interpretation of
one or multiple events.
Quality of Experience (QoE) is the degree of delight or annoyance of the user
of an application or service. It results from the fulfillment of his or her
expectations with respect to the utility and/or enjoyment of the application
or service in the light of the user’s personality and current state.
Quality of Experience: definitions
Notes:
«Working definition»
It can be applied to an «application» or a «service»
Multi-disciplinary subject
The switch from the QoS approach to the QoE approach is happening, and will
not be over soon.
The USER is now the center of the service/app evaluation process.
QoE
Features:
Direct perception (e.g., noisiness, blurriness)
Interaction (e.g., response velocity, latency gap)
Usage situation (e.g., stability of the service)
Service (e.g., usability, usefulness)
QoE vs. QoS
QoE is heavily dependent on QoS.
Generic relationship focusing on a single factor:
$QoE = f(I_0)$
where parameter $I_0$ is controlled through QoS means.
The user-perceived quality of a content/application is more sensitive to changes when the quality level is already considered good or better, while changes in the QoS are barely perceivable if the QoE is already low.
QoS – QoE model
Weber-Fechner law: psychophysical principle that relates the magnitude of a physical stimulus and its perceived intensity:
$$dP = a \, \frac{dS}{S_0}$$
where $dP$, the differential perception, is directly proportional to the relative change of the physical stimulus, $dS / S_0$.
P. Reichl, S. Egger, R. Schatz and A. D'Alconzo, "The Logarithmic Nature of QoE and the Role of the Weber-Fechner Law in QoE Assessment," in IEEE International Conference on Communications (ICC), Cape Town, South Africa, 2010.
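A one-step integration shows why this differential law yields the logarithmic QoE-QoS relationship studied in the paper above. A minimal worked derivation (writing S for the running stimulus level and S_th for the perception threshold where P = 0):

```latex
% Weber-Fechner: from the differential form to the logarithmic law.
% dP = a * dS / S   (perception responds to the *relative* stimulus change)
\begin{align*}
  P(S) = \int_{S_{th}}^{S} a \, \frac{ds}{s} = a \ln \frac{S}{S_{th}}
\end{align*}
% Equal ratios of stimulus produce equal increments of perception, so an
% exponential improvement of a QoS parameter buys only a linear QoE gain.
```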
Challenges of QoE
Understanding the fundamentals of quality perception
Guidelines for system design and network planning
QoE metrics and models
QoE monitoring
QoE-centric network management and service management
Subjective Quality Assessment
Interviewing human users to collect measures of the perceived quality of a
service/app/digital content.
Mean Opinion Score: the only officially accepted metric in the current state of the
art.
Setting up a subjective experiment is quite complex and time consuming:
ITU-T Rec. P.910, "Subjective video quality assessment methods for multimedia applications," ITU Telecommunication Standardization Sector, 04/2008.
ITU-T Rec. P.911, "Subjective audiovisual quality assessment methods for multimedia applications," ITU Telecommunication Standardization Sector, 12/1998.
ITU-T Rec. P.913, "Methods for subjectively assessing audiovisual quality of internet video and distribution quality television, including separate assessment of video quality and audio quality, and including multiple environments," ITU Telecommunication Standardization Sector, 01/2014.
Core characteristics of a SQ evaluation
Evaluation scale (MOS): continuous or discrete; 5-point, 10-point or 100-point
Typology: No Reference, Reduced Reference, Full Reference
Single stimulus/double stimulus (MOS or Differential MOS, DMOS)
Subjects (QoE experts, naive viewers)
Physiological metrics: heart rate, Human Visual System (HVS) behavior
Task performance (speed of goal completion)
Impossible to use in real-time environments
Sample test for subjective video quality assessment
Step 1: Selection of test material
Generally provided by the client of the evaluation/state of the art literature
Digital content has a significant influence on the test results
From 8 to 16 different contents must be included in the test (e.g., from www.vqeg.org: Susie, Waterfall, Tree, Tempete, Mobile & Calendar ...)
Sample test for subjective video quality assessment
Step 2: choice of testers
Experts vs. non-experts
All must pass an eligibility threshold (acuity, color perception)
Snellen Eye Chart
Ishihara test plate
Acceptable: 16 to 24 testers to get statistically acceptable results
Experts only for tests aimed at developing algorithms; a test that simulates the perception of the average consumer will use non-expert testers.
Sample test for subjective video quality assessment
Step 3: Design of experiments session
Each subject should not usually watch more than 90 minutes of content, in
sessions of 30 minutes max.
The viewing mode is selected (single monitor, triple monitor), together with the physical parameters (screen size, distance, light)
The external environment plays a role too: brightness and calibration of the
display compared to the light of the room, lab conditions or real-life
conditions… etc.
Sample test for subjective video quality assessment
Step 3: Design of experiments session
Definition of session warm-up and reset.
Generation of instructions to testers.
Choice of Single or Double Stimulus test.
Choice of trial sequences.
Sample test for subjective video quality assessment
Step 4: Choosing the quality scale
Double stimulus impairment method
5-point discrete scale (5 down to 1) with meanings:
Imperceptible
Perceptible but not annoying
Slightly annoying
Annoying
Very annoying.
Double stimulus quality scale method
Continuous scale (0 to 100) via an ad hoc application (slider).
Sample test for subjective video quality assessment
Step 5: data analysis
We obtain the quality score averaged over all test participants, for EACH test sequence and for EACH variable studied
We analyze the statistics to evaluate which variables are significant and which are irrelevant to the perception of video quality
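As a sketch of this step (the data layout — one raw opinion score per row with tester, sequence and condition columns — is a hypothetical example, not the lecture's format), the per-sequence/per-condition MOS and a 95% confidence interval could be computed as follows:

```python
import pandas as pd
from scipy import stats

# Hypothetical layout: one row per collected rating.
scores = pd.DataFrame({
    "tester":    [1, 2, 3, 1, 2, 3],
    "sequence":  ["Susie"] * 3 + ["Waterfall"] * 3,
    "condition": ["ref"] * 3 + ["br_450"] * 3,
    "score":     [5, 4, 4, 3, 3, 2],
})

# MOS per (sequence, condition): the mean of the raw opinion scores.
mos = scores.groupby(["sequence", "condition"])["score"].agg(["mean", "std", "count"])

# 95% confidence interval on each MOS (t distribution, small samples).
mos["ci95"] = stats.t.ppf(0.975, mos["count"] - 1) * mos["std"] / mos["count"] ** 0.5
print(mos.rename(columns={"mean": "MOS"}))
```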
Sample test for subjective video quality assessment
Which variables?
Psychological variables (interest and expectations of the user)
Display properties (resolution, physical dimensions, size, viewing distance)
Properties of coding / transmission (bit rate, encoder, video resolution)
Viewing conditions (artificial lighting, contrast of the screen)
Fidelity (due to compression: color saturation, naturalness)
Is the sound also evaluated? (synchronization, sound quality)
Others, defined by the tester
Test for video quality assessment
Role of the Engineer:
Set up the test and characterize it completely.
Recruit testers and verify their reliability.
Run the test and consolidate the data.
Present statistical results with an analysis of their meaning and possible interpretations.
Objective Quality Metrics
NON STANDARD
Idea: find mathematical/ML methodologies or metrics that can replicate the
human SQA
Challenge: the mathematical methodology must ideally have a perfect correlation with the subjective evaluation, because the perceived quality is subjective by definition.
Objective Quality Metrics: characterization criteria
Target service: VoIP, IPTV, web browsing, video conferencing, etc…
Model type: presence of reference signal.
Application: codec testing, network transmission testing, monitoring, etc.
Model input: parameters describing the processing path.
Model output: overall quality or specific quality aspects, in terms of MOS or
using another equivalent index.
Modelling approach: Psychophysical or Statistical/Empirical
Model type characterization
No Reference: impossible to analyze the original unimpaired content
Reduced Reference: only a subset of the original unimpaired content is
available
Full Reference: the full set of original unimpaired content is available
Most relevant: NR.
The original unimpaired content is on one end of the transmission channel, the
user is subject to the digital content on the other end.
Outside lab conditions, it is almost impossible to know and reverse-engineer all the elaboration a digital content goes through when transmitted.
Video QoE – HVS cognitive process characteristics
Observers' gazes are usually focused on the same regions of the same stimuli.
RoI, Regions of Interest model
Motion attracts the gaze: the HVS evolved to keep moving objects in the center of the fovea. The HVS doesn't react well to large accelerations of objects or to unpredictable movements.
Change-blindness
How to define the RoIs?
Eye-Tracking
Eye-Tracking
Saccade: interval of time when the eye is moving from one focus point to
another
Fixation: interval of time when the eye is gathering information
Gaze-map: representation of the eye-tracked data
Raw map: the eye position is sampled at a constant frequency
Fixations map: only fixations are shown
Saliency-map: most watched regions of the stimuli.
A Saliency-map is the ground truth for the RoI models of QoE metrics.
Average Gaze Point Weight
Aim: anticipate the human quality ranking of a set of video clips without analyzing the digital content
INPUT DATA? The “involuntary” human behavior – not the digital content or a model of the HVS.
Principle:
Eye reacts differently to different quality stimuli
J. Lassalle, L. Gros and G. Coppin, "Combination of Physiological and Subjective Measures to Assess Quality of Experience for Audiovisual Technologies," in Third International Workshop on Quality of Multimedia Experience, Mechelen, Belgium, 2011.
“[…] the viewing strategy changes as a function of the distortion severity […]”
A. Mittal, A. K. Moorthy, W. Geisler and A. C. Bovik, "Task dependence of visual attention on compressed videos: point of gaze statistics and analysis," in Human Vision and Electronic Imaging XVI, San Francisco, CA, 2011.
Semantic dependency
HVS behavior strongly depends on the semantics of the content of the stimulus.
Principle of general agreement between viewers
R. B. Goldstein, R. L. Woods and E. Peli, "Where people look when watching movies: do all viewers
look at the same place?," Computers in Biology and Medicine, vol. 37, no. 7, pp. 957 - 964, 2007.
L. B. Stelmach, W. J. Tam and P. J. Hearty, "Static and dynamic spatial resolution in image coding:
An investigation of eye movements," in Human Vision, Visual Processing, and Digital Display II, San
Jose, CA, 1992
Memory effect: can only be eliminated by an accurate design of the validation
protocol
The algorithm
Step 1: dataset creation and initial filtering
Step 2: timestamp normalization and clustering
Step 3: Gaze-Maps generation
Step 4: R-dependent voting process analysis
The algorithm: step 1
Required data
Set of stimuli impaired to create different quality levels of the same semantics.
The more different semantics, the better.
Eye-Tracked data. The records must include:
X and Y coordinates of the position of the user’s gaze on the screen while watching each
stimulus.
Timestamps
Initial filtering
Clean the dataset of outlying measures, usually caused by a misreading of the device or by the user moving their gaze out of the recording plane
Negative values of the gaze coordinates are excluded.
The algorithm: step 2
Each set of gaze records is elaborated into gaze maps through clustering over intervals of 1 s.
All the records whose timestamp (ms) is included in $[t_i, t_i + 1000]$ are averaged to create the clustered Gaze-Point ($cGP_k$) relative to that interval, for that user, on that stimulus.
The procedure is repeated for each user’s scanpath.
Example 1: from the recorded scanpath to the clustered Gaze-Points
Different colors show different 1 s intervals. The
simplified example shows a 3 s scanpath.
Fixation and saccade records have the same relevance: our algorithm includes the study of the saccadic behavior.
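A minimal numpy sketch of steps 1-2 for one scanpath (the record layout — timestamps in ms plus x/y screen coordinates — follows the required data listed above; the function name and windowing details are illustrative assumptions):

```python
import numpy as np

def clustered_gaze_points(t_ms, x, y, win_ms=1000):
    """Average all gaze records falling in each [t_i, t_i + win_ms) window
    into one clustered Gaze-Point (cGP) per window."""
    t_ms, x, y = map(np.asarray, (t_ms, x, y))
    # Step 1 filtering: drop records with negative coordinates (device
    # misreads, or the gaze leaving the recording plane).
    keep = (x >= 0) & (y >= 0)
    t_ms, x, y = t_ms[keep], x[keep], y[keep]
    # Step 2 clustering: one cGP per 1-second interval.
    bins = ((t_ms - t_ms.min()) // win_ms).astype(int)
    return np.array([(x[bins == b].mean(), y[bins == b].mean())
                     for b in np.unique(bins)])
```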
The algorithm: step 3
Clustered Gaze-Maps (cGM) generation.
Grouping of the elaborated scanpaths obtained from different observers on
the same stimulus.
At the end of this step, each stimulus corresponds to one cGM.
Example 2: from clustered Gaze-Points to clustered Gaze-Maps
[Figure: the clustered Gaze-Points of observers 1 … n on video «vid» are merged into the clustered Gaze-Map «vid».]
The algorithm: step 4
R-dependent voting process: self-definition of each cGP weight by the cGPs on the same cGM.
Contribution of $cGP_h$ to the weight of $cGP_i$:
$$w_{h \to i} = \begin{cases} 1, & \text{if } cGP_h \text{ belongs to the neighborhood of } cGP_i \\ 0, & \text{otherwise} \end{cases}$$
Neighborhood of $cGP_i$: the circle centered at $cGP_i$ with radius R (pixels).
This analysis is performed on all the clustered Gaze-Maps.
The Average Gaze-Point Weight ($AGPW_k$) of $cGM_k$ is the average of all the weights of the cGPs on $cGM_k$: one value for each cGM.
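A sketch of steps 3-4 for one clustered Gaze-Map (cGPs as an (n, 2) array of pixel coordinates; whether a cGP votes for itself is not stated in the slides, so this version excludes self-votes):

```python
import numpy as np

def agpw(cgm, R):
    """Average Gaze-Point Weight of one clustered Gaze-Map: the weight of
    each cGP is the number of other cGPs within radius R (pixels) of it."""
    cgm = np.asarray(cgm, dtype=float)                  # shape (n, 2)
    d = np.linalg.norm(cgm[:, None, :] - cgm[None, :, :], axis=-1)
    votes = d <= R
    np.fill_diagonal(votes, False)                      # no self-votes (assumption)
    return votes.sum(axis=1).mean()                     # one AGPW per cGM
```

Sweeping R and averaging agpw over stimuli of the same quality level reproduces the kind of analysis shown in the Results slides below.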
Example 3: R-dependent voting process
[Figure: scheme of the cGM for stimulus «vid», with all the cGPs obtained from all the observers; the weight of the highlighted cGP for $R = R_0$ is 4.]
Experimental validation
Database of 19 videos, each impaired twice: 57 total stimuli taken from the current literature.
Reference, compressed at 450 b/s and compressed at 150 b/s
Three playlists with mixed quality stimuli and no repeated semantic: no memory
effect in the results.
6 testers per playlist, for a total of 18 people involved. The same setup was used in (Mittal, et al. 2011) to study the HVS behavior.
Absolute Category Rating with a 5-point discrete scale. The MOS was collected after each viewing and averaged.
Elaboration | Avg. MOS | Variance
Reference   | 3.77     | 0.74
Br_450      | 2.81     | 0.66
Br_150      | 1.86     | 0.42
Results
The independent variable is R on both graphs.
1) QLAGPW: average of AGPW over videos of the same quality level.
2) SDQLGPW: standard deviation of AGPW over videos of the same quality
level.
Results
We can identify an interval of R on graph 1 where the curves are completely monotonic and separated, 10 < R < 27 (analysis radius in [100; 270]).
The average gaze-point weight of videos is inversely proportional to the perceived quality in that interval.
Graph 2 shows that the SDQLGPW has the same regular behavior: the higher the perceived quality, the lower the standard deviation of the GP weight.
Interpretation
The lower weight of Gaze-Points (given R) of HQ stimuli means that the cGPs on an HQ video are more distant from each other.
The lower standard deviation of Gaze-Point weights on HQ videos suggests that the AGPW difference between HQ videos is smaller than the one between LQ stimuli.
The lower AGPW of HQ videos suggests that viewers had time and chance to
gaze around the finest details
The higher average weight and standard deviation obtained on LQ stimuli,
instead, suggest that the viewer focused on smaller portions of the screen,
with a high density of “heavy” observations in it and a low number of “light”
observations outside the region of interest.
Image quality and similarity evaluation
Digital images are subject to a wide variety of distortions during
Acquisition
Processing
Compression
Storage
Transmission
Reproduction
Any of these may result in perceivable degradation of quality
The ultimate aim of data compression is to remove the redundancy from the source signal, thereby reducing the number of bits required to represent the information contained within the source.
What is Image Quality Assessment?
Image quality is a characteristic of an image that measures the perceived image degradation.
It plays an important role in various image processing applications.
The goal of image quality assessment is to supply quality metrics that can predict perceived image quality automatically.
Two Types of image quality assessment
Subjective quality assessment
Objective quality assessment
Subjective Quality Measure
Subjective image quality is how a viewer perceives an image,
including giving his or her opinion on a particular image.
The mean opinion score (MOS) has been used for subjective quality assessment for many years.
Too inconvenient, time-consuming and expensive
Example of MOS score
The MOS is generated by averaging the results of a set of standard, subjective tests.
Mean Opinion Score (MOS)
MOS | Quality   | Impairment
5   | Excellent | Imperceptible
4   | Good      | Perceptible but not annoying
3   | Fair      | Slightly annoying
2   | Poor      | Annoying
1   | Bad       | Very annoying
Objective Quality Measure
Mathematical models that approximate results of
subjective quality assessment
Goal of objective quality evaluation is to develop a
quantitative measure that can predict perceived image
quality.
It plays a variety of roles, for example:
To monitor and control image quality in quality control systems;
To benchmark image processing systems;
To optimize algorithms and parameters.
Objective evaluation
Three types of objective evaluation, classified according to the availability of the original image with which the distorted image is to be compared:
Full reference (FR)
No reference – Blind (NR)
Reduced reference (RR)
Full reference quality metrics
MSE and PSNR: the most widely used image “quality” metrics of the last 20 years.
SSIM: a newer metric (2004) that shows better results than PSNR with a reasonable increase in computational complexity.
Other metrics have been suggested by VQEG, private companies and universities, but are not as popular.
A great effort has been made to develop new objective quality measures for image/video that incorporate perceptual quality measures by considering human visual system (HVS) characteristics.
HVS – Human visual system
Quality assessment (QA) algorithms predict visual quality
by comparing a distorted signal against a reference,
typically by modeling the human visual system.
These measurement methods consider human visual
system (HVS) characteristics in an attempt to incorporate
perceptual quality measures.
MSE – Mean square error
MSE and PSNR are defined as
$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( x_{ij} - y_{ij} \right)^2, \qquad \mathrm{PSNR} = 10 \log_{10} \frac{L^2}{\mathrm{MSE}}$$
where x is the original image and y is the distorted image, M and N are the width and height of the image, and L is the dynamic range of the pixel values.
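These two definitions translate directly into numpy; a minimal sketch (x and y as equal-shape arrays; PSNR is undefined when MSE = 0):

```python
import numpy as np

def mse(x, y):
    """Pixel-wise mean squared error between original x and distorted y."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.mean((x - y) ** 2)

def psnr(x, y, L=255):
    """Peak signal-to-noise ratio in dB; L is the dynamic range (255 for 8-bit)."""
    return 10 * np.log10(L ** 2 / mse(x, y))
```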
Property of MSE
If the MSE decreases to zero, the pixel-by-pixel matching of the images becomes perfect.
If the MSE is small enough, this corresponds to a “high quality” compressed image.
In general, the MSE value increases as the compression ratio increases.
57
Original “Einstein” image with different distortions and MSE values
[Figure: (a) original image, MSE = 0; (b)–(g) six different distortions with nearly identical MSE values: 306, 309, 309, 313, 309, 308.]
SSIM – Structural similarity index
A recently proposed approach for image quality assessment.
A full-reference method for measuring the similarity between two images.
Values lie in [0, 1].
The SSIM is designed to improve on traditional metrics like PSNR and MSE, which have proved to be inconsistent with human visual perception. It is based on the human visual system.
SSIM measurement system
[Fig. 2: Structural Similarity (SSIM) Measurement System]
Example images at different quality levels and their SSIM index maps
Equation for SSIM
For two non-negative image signals x and y, aligned with each other:
Mean intensity: $\mu_x = \frac{1}{N} \sum_{i=1}^{N} x_i$
Standard deviation: $\sigma_x = \left( \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \mu_x)^2 \right)^{1/2}$
Luminance comparison: $l(x,y) = \frac{2 \mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}$; contrast comparison: $c(x,y) = \frac{2 \sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}$
$C_1$, $C_2$ are constants.
SSIM components
Structure comparison $s(x,y)$ is conducted on the normalized signals $(x - \mu_x)/\sigma_x$ and $(y - \mu_y)/\sigma_y$:
$$s(x,y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3}, \quad \text{with } C_3 = C_2 / 2$$
Equation for SSIM
$$\mathrm{SSIM}(x,y) = [l(x,y)]^{\alpha} \, [c(x,y)]^{\beta} \, [s(x,y)]^{\gamma}$$
$\alpha$, $\beta$ and $\gamma$ are parameters used to adjust the relative importance of the three components.
Property of SSIM
Symmetry: S(x,y) = S(y,x)
0 <= S(x,y) <= 1
Unique maximum:
S(x,y) = 1 if and only if x=y
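A minimal numpy sketch of the equations above, with $\alpha = \beta = \gamma = 1$ and $C_3 = C_2/2$, under which SSIM collapses to its usual closed form; K1 = 0.01 and K2 = 0.03 are the constants suggested in the original SSIM paper. A practical implementation computes this over local windows and averages the results (the MSSIM reported in the tables below); this single-window version only illustrates the formula:

```python
import numpy as np

def ssim_global(x, y, L=255, K1=0.01, K2=0.03):
    """Single-window SSIM with alpha = beta = gamma = 1 and C3 = C2/2."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + C1) * (2 * cov_xy + C2)
            / ((mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2)))
```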
MSE vs. MSSIM
MSE vs. SSIM simulation result
Type of Noise       | MSE    | MSSIM  | VIF
Salt & Pepper Noise | 228.34 | 0.7237 | 0.3840
Speckle Noise       | 225.91 | 0.4992 | 0.4117
Gaussian Noise      | 226.80 | 0.4489 | 0.3595
Blurred             | 225.80 | 0.7136 | 0.2071
JPEG compressed     | 213.55 | 0.3732 | 0.1261
Contrast Stretch    | 406.87 | 0.9100 | 1.2128
MSE vs. MSSIM
[Figure: MSE = 226.80, MSSIM = 0.4489 (Gaussian noise) vs. MSE = 225.91, MSSIM = 0.4992 (speckle noise)]
MSE vs. MSSIM
[Figure: MSE = 213.55, MSSIM = 0.3732 (JPEG compressed) vs. MSE = 225.80, MSSIM = 0.7136 (blurred)]
MSE vs. MSSIM
[Figure: MSE = 226.80, MSSIM = 0.4489 (Gaussian noise) vs. MSE = 406.87, MSSIM = 0.910 (contrast stretch)]
Why is MSE poor?
MSE and PSNR are widely used because they are easy to calculate and mathematically easy to deal with for optimization purposes.
There are a number of reasons why MSE or PSNR do not correlate well with the human perception of quality.
Digital pixel values, on which the MSE is typically computed, may not exactly represent the light stimulus entering the eye.
Simple error summation, like the one implemented in the MSE formulation, may be markedly different from the way the HVS and the brain arrive at an assessment of the perceived distortion.
Two distorted image signals with the same amount of error energy may have very different structure of errors, and hence different perceptual quality.
Visual Information Fidelity (VIF)
Relies on modeling of the statistical image source, the image distortion
channel and the human visual distortion channel.
VIF was developed for image and video quality measurement based on natural
scene statistics (NSS).
Images come from a common class: the class of natural scenes.
VIF – Visual Information Fidelity
Mutual information between C and E quantifies the information that the brain could ideally extract from the reference image, whereas the mutual information between C and F quantifies the corresponding information that could be extracted from the test image [11].
Image quality assessment is done based on information fidelity where the channel imposes fundamental limits on how much information could flow from the source (the reference image), through the channel (the image distortion process) to the receiver (the human observer).
VIF = Distorted Image Information / Reference Image Information
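In the notation of the VIF literature (C: the natural-scene source, E and F: the outputs of the HVS model for the reference and test images, s: the mixing field of the NSS model, j: the wavelet subband index), the ratio above is commonly written as below; this is a reconstruction from the cited description, not a formula taken from the slides:

```latex
% VIF as an information ratio: what the brain can extract from the test
% image (numerator) over what it can extract from the reference (denominator).
\mathrm{VIF} \;=\;
\frac{\sum_{j} I\!\left(C^{j}; F^{j} \mid s^{j}\right)}
     {\sum_{j} I\!\left(C^{j}; E^{j} \mid s^{j}\right)}
```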
VIF quality
The VIF has a distinction over traditional quality assessment methods: a linear contrast enhancement of the reference image that does not add noise to it will result in a VIF value larger than unity, thereby signifying that the enhanced image has a superior visual quality to the reference image.
No other quality assessment algorithm has the ability to
predict if the visual image quality has been enhanced by a
contrast enhancement operation.
SSIM vs. VIF
VIF and SSIM
Type of Noise       | MSE    | MSSIM  | VIF
Salt & Pepper Noise | 101.78 | 0.8973 | 0.6045
Speckle Noise       | 119.11 | 0.7054 | 0.5944
Gaussian Noise      | 65.01  | 0.7673 | 0.6004
Blurred             | 73.80  | 0.8695 | 0.6043
JPEG compressed     | 49.03  | 0.8558 | 0.5999
Contrast Stretch    | 334.96 | 0.9276 | 1.1192
VIF and SSIM
[Figure: VIF = 0.6045, MSSIM = 0.8973 (salt & pepper noise) vs. VIF = 0.5944, MSSIM = 0.7054 (speckle noise)]
VIF and SSIM
[Figure: VIF = 0.60, MSSIM = 0.7673 (Gaussian noise) vs. VIF = 0.6043, MSSIM = 0.8695 (blurred)]
VIF and SSIM
[Figure: VIF = 0.5999, MSSIM = 0.8558 (JPEG compressed) vs. VIF = 1.11, MSSIM = 0.9272 (contrast stretch)]
JPEG compressed image – Tiffany.bmp
Quality Factor | Compression Ratio | MSSIM
100            | 0                 | 1
1              | 52.79             | 0.3697
4              | 44.50             | 0.4285
7              | 33.18             | 0.5041
10             | 26.81             | 0.7190
15             | 20.65             | 0.7916
20             | 17.11             | 0.8158
25             | 14.72             | 0.8332
45             | 9.36              | 0.8732
60             | 7.68              | 0.8944
80             | 4.85              | 0.9295
90             | 3.15              | 0.9578
99             | 1.34              | 0.9984
Comparison of QF, CR and MSSIM
[Figure: Q.F = 100, CR = 0, MSSIM = 1 vs. Q.F = 1, CR = 52.79, MSSIM = 0.3697]
[Figure: Q.F = 4, CR = 44.50, MSSIM = 0.4285 vs. Q.F = 7, CR = 33.18, MSSIM = 0.5041]
[Figure: Q.F = 10, CR = 26.81, MSSIM = 0.7190 vs. Q.F = 15, CR = 20.65, MSSIM = 0.7916]
[Figure: Q.F = 20, CR = 17.11, MSSIM = 0.8158 vs. Q.F = 25, CR = 14.72, MSSIM = 0.8332]
[Figure: Q.F = 45, CR = 9.36, MSSIM = 0.8732 vs. Q.F = 80, CR = 4.85, MSSIM = 0.9295]
[Figure: Q.F = 90, CR = 3.15, MSSIM = 0.9578 vs. Q.F = 99, CR = 1.34, MSSIM = 0.9984]
What is Wavelet Analysis ?
What is a wavelet?
A wavelet is a waveform of effectively
limited duration that has an average value
of zero.
The Continuous Wavelet Transform (CWT)
A mathematical representation of the Fourier transform:
$$F(\omega) = \int_{-\infty}^{\infty} f(t) \, e^{-i \omega t} \, dt$$
Meaning: the sum over all time of the signal f(t) multiplied by a complex exponential; the results are the Fourier coefficients.
Wavelet Transform (Cont’d)
Those coefficients, when multiplied by a sinusoid of the appropriate frequency, yield the constituent sinusoidal components of the original signal.
Wavelet Transform
And the results of the CWT are wavelet coefficients.
Multiplying each coefficient by the appropriately scaled and shifted wavelet yields the constituent wavelet of the original signal.
CWT
Reminder: the CWT is the sum over all time of the signal, multiplied by scaled and shifted versions of the wavelet function.
Step 1:
Take a Wavelet and compare
it to a section at the start
of the original signal
CWT
Step 2:
Calculate a number, C, that represents
how closely correlated the wavelet is
with this section of the signal. The
higher C is, the more the similarity.
CWT
Step 3: Shift the wavelet to the right and
repeat steps 1-2 until you’ve covered
the whole signal
CWT
Step 4: Scale (stretch) the wavelet and
repeat steps 1-3
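The four steps map directly onto a brute-force implementation. A numpy sketch (the Ricker/“Mexican hat” wavelet is written out explicitly; a real implementation would use a wavelet library and normalize per scale):

```python
import numpy as np

def ricker(length, scale):
    """Ricker ("Mexican hat") wavelet: effectively limited duration, zero mean."""
    t = np.arange(length) - (length - 1) / 2
    u = t / scale
    return (1 - u ** 2) * np.exp(-u ** 2 / 2)

def cwt_bruteforce(signal, scales, support=10):
    """Steps 1-4: slide a scaled wavelet along the signal and record the
    correlation C at every (scale, shift) pair."""
    signal = np.asarray(signal, dtype=float)
    coeffs = np.zeros((len(scales), len(signal)))
    for i, s in enumerate(scales):                    # step 4: stretch the wavelet
        w = ricker(int(support * s), s)
        for tau in range(len(signal) - len(w)):       # step 3: shift to the right
            section = signal[tau:tau + len(w)]        # step 1: take a section
            coeffs[i, tau] = np.dot(section, w)       # step 2: correlation C
    return coeffs
```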
Types of Wavelets
There are many different wavelets, for example:
Morlet, Haar, Daubechies
Continuous Wavelet Transform
Define the continuous wavelet transform of f(x):
$$W(s, \tau) = \frac{1}{\sqrt{s}} \int_{-\infty}^{\infty} f(x) \, \psi^{*}\!\left( \frac{x - \tau}{s} \right) dx$$
This transforms a continuous function of one variable into a continuous function of two variables: translation ($\tau$) and scale ($s$).
The wavelet coefficients measure how closely correlated the wavelet is with each section of the signal.
Discrete Wavelet Transform
Don't need to calculate wavelet coefficients at every possible scale.
Can choose scales based on powers of two, and get equivalent accuracy.
We can represent a discrete function f(n) as a weighted summation of wavelets plus a coarse approximation (where $j_0$ is an arbitrary starting scale, and n = 0, 1, 2, … M):
$$f(n) = \frac{1}{\sqrt{M}} \sum_{k} W_{\varphi}(j_0, k) \, \varphi_{j_0,k}(n) + \frac{1}{\sqrt{M}} \sum_{j = j_0}^{\infty} \sum_{k} W_{\psi}(j, k) \, \psi_{j,k}(n)$$
Discrete wavelet transform
Approximations and Details:
Approximations: High-scale, low-
frequency components of the signal
Details: low-scale, high-frequency
components
Input Signal
LPF
HPF
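One level of this filter bank, written out for the Haar wavelet introduced on the next slide (pairwise averages act as the LPF, pairwise differences as the HPF; the 1/√2 factor keeps the step orthonormal). A sketch for an even-length 1-D signal:

```python
import numpy as np

def haar_dwt_step(x):
    """One Haar analysis step: approximation (low-pass) and detail (high-pass)."""
    x = np.asarray(x, dtype=float)
    cA = (x[0::2] + x[1::2]) / np.sqrt(2)   # approximations: low frequency
    cD = (x[0::2] - x[1::2]) / np.sqrt(2)   # details: high frequency
    return cA, cD

# A smooth ramp with one spike: the spike shows up only in the details cD.
cA, cD = haar_dwt_step([1, 2, 3, 4, 10, 4, 3, 2])
```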
Haar Wavelet
Scaling function: $\varphi(t) = 1$ for $0 \le t < 1$, $0$ otherwise
Mother wavelet: $\psi(t) = 1$ for $0 \le t < \frac{1}{2}$, $-1$ for $\frac{1}{2} \le t < 1$, $0$ otherwise
The mother wavelet must be
Normalized: $\| \psi \| = 1$
Finite: $\int_{-\infty}^{\infty} |\psi(t)|^2 \, dt < \infty$
Zero mean: $\int_{-\infty}^{\infty} \psi(t) \, dt = 0$
Lena
Let's consider an N × N image as a two-dimensional pixel array I with N rows and N columns.
Lena
Complex Wavelet SSIM
It extends the SSIM to the complex wavelet domain.
It is a “general purpose” similarity index according to its creators.
Robust to small rotations and translations.
Complex wavelets used: those that can be written as modulated low-pass filters.
The rationale: the structural information of local image features is carried by the relative phase patterns of the wavelet coefficients, and a consistent phase shift does not change the structure of the local image feature.
CWSSIM
Feature Point Hash scheme
Content-based image similarity (copy detection)
Perceptual Image Hash Functions
Wavelet based detection algorithm to extract feature points
Dictionary of base features (SIFT)
GSSIM
Not state of the art
Remembering that
We define
GSSIM
That is generalized to:
And finally, with an appropriate choice of coefficients
Or:
Experimental validation
TID 2008 and TID 2013
[Figures: TID 2008 dataset overview]
Experimental validation: ROC curves and empirical probability
Used to evaluate a classifier
X: false positive rate, Y: true positive rate
Perfect classifier: FPR = 0, TPR = 1
Area Under Curve (AUC)
Empirical probability for metric comparisons.
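A sketch of this evaluation with scikit-learn (the labels and scores are hypothetical: 1 if an image pair is a true copy/same content, 0 otherwise, with the similarity metric — e.g. GSSIM — as the score):

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Hypothetical ground truth and metric scores for image pairs.
y_true  = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_score = np.array([0.95, 0.90, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20])

fpr, tpr, thresholds = roc_curve(y_true, y_score)   # x: FPR, y: TPR
print("AUC =", auc(fpr, tpr))                       # 1.0 only for a perfect classifier
```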
AUC for GSSIM with different father wavelets
Empirical probability of GSSIM in its Haar implementation
Comparison with the state-of-the-art metrics
GSSIM for QoE
GSSIM for perceptual image similarity