Directed-Graph Epidemiological Models of Computer Viruses
Presented by: (Kelvin) Weiguo Jin
“… (we) adapt the techniques of mathematical epidemiology to the study of computer virus propagation. … We conclude that an imperfect defense against computer viruses can still be highly effective in preventing their widespread proliferation …”
Jeffrey O. Kephart and Steve R. White
-- Proceedings of the 1991 IEEE Computer Society Symposium on Research in Security and Privacy, California, 1991.
CompSci725 Oral Presentation, 24-May-2002
1. Outline
2. Computer Virus Recap
3. Motivation
4. Methodologies
5. Results
6. Conclusion
7. Questions
2. Computer Virus Recap
Impact of computer viruses
– "Hard costs" vs. "soft costs"
– US$2.62 billion: "Code Red" in 2001 (NewsFactor Network)
Cohen's pioneering work on computer viruses
– Transitive closure of information flow
– No algorithm can perfectly detect all possible viruses
Security flaw
– Taxonomy by genesis: intentional, malicious?
A computer virus is executable code that, when run by someone, infects or attaches itself to other executable code in a computer in an effort to reproduce itself.
3. Motivation
Biological analogy
– Neural networks, artificial life, etc.
Mathematical modeling
– Aids in the evaluation and development of general policies and heuristics for inhibiting the spread of viruses
– Can be applied to a particular epidemic
Previous epidemiological models and their limitations
– Origins date back to 1760
– Assume a homogeneous population
Epidemiology is the study of the distribution and determinants of health-related states or events in specified populations and the application of this study to the control of health problems.
4. Methodologies
Modeling computer systems & viral epidemics
Developing analytical techniques & approximations
– SIS model on a random graph: deterministic approximation, probabilistic analysis
– Weak links
– Hierarchical model
– Spatial model
Simulations
4a. Methodologies: Modeling Directed Graph
Assumptions
– Ignore the details of infection within a node
– A small set of discrete states, i.e. infected or susceptible
– Ignore how the virus is transmitted among nodes
Notation
– Node: an individual system, with cure rate $\delta$
– Arc: connects a node to an individual it can infect, with infection rate $\beta$
SIS model on a random graph
– Susceptible → Infected → Susceptible
– Random graph with $N$ nodes and edge probability $p$: $pN(N-1)$ directed edges expected
Infection rate (birth rate of a virus): the probability per unit time that a particular infected individual will infect a particular uninfected individual.
Cure rate (death rate of a virus): the probability per unit time for an infected individual to be cured.
[Figure: example directed graph with numbered nodes]
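The random-graph setup can be made concrete with a short sketch. It assumes that every ordered pair of distinct nodes receives an arc independently with probability p, which gives the expected count of pN(N-1) directed edges. The function names `random_directed_graph` and `edge_count` are illustrative only and do not come from the paper.

```python
import random


def random_directed_graph(n, p, seed=None):
    """Return adjacency lists: out_edges[u] lists the nodes that u can infect."""
    rng = random.Random(seed)
    out_edges = {u: [] for u in range(n)}
    for u in range(n):
        for v in range(n):
            # Each ordered pair (u, v) with u != v gets an arc with probability p.
            if u != v and rng.random() < p:
                out_edges[u].append(v)
    return out_edges


def edge_count(out_edges):
    """Total number of directed arcs in the graph."""
    return sum(len(targets) for targets in out_edges.values())


if __name__ == "__main__":
    n, p = 100, 0.05  # illustrative values, not taken from the slides
    graph = random_directed_graph(n, p, seed=1)
    print("arcs:", edge_count(graph), "expected:", p * n * (n - 1))
    print("mean out-degree:", edge_count(graph) / n, "expected:", p * (n - 1))
```

The mean out-degree printed at the end is the sample counterpart of the connectivity p(N-1) used in the deterministic approximation.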
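4b. Methodologies: SIS model
Deterministic approximation (DA)
Deterministic differential equation:
$$\frac{di}{dt} = \beta \bar{b}\, i (1 - i) - \delta i \qquad (1)$$
Solution:
$$i(t) = \frac{i_0 (1 - \rho')}{i_0 + (1 - \rho' - i_0)\, e^{-(\beta \bar{b} - \delta) t}} \qquad (2)$$
where $\rho' \equiv \delta / (\beta \bar{b})$ and $i_0 \equiv i(t = 0)$.
Interpretation: the fraction of infected individuals
– decays exponentially from $i_0$ to 0 when $\rho' > 1$
– grows exponentially from $i_0$ to $1 - \rho'$ when $\rho' < 1$
Definitions
– $\bar{b} = p(N - 1)$: the expected no. of edges emanating from a node (connectivity)
– $\bar{b}(1 - i)$: the no. of uninfected nodes that can be infected by this node
– $\beta \bar{b}(1 - i)$: the average total infection rate of this node
– $\beta \bar{b}(1 - i) I$: system-wide infection rate
– $\delta I$: system-wide cure rate
– $I(t)$: the total number of infected nodes at time $t$; $i(t) \equiv I(t)/N$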
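As a rough check on the deterministic approximation, the sketch below evaluates the logistic closed form for the infected fraction and compares it with a simple forward-Euler integration of di/dt = βb̄ i(1 - i) - δi. The parameter values are illustrative assumptions, not figures from the paper, and the function names are hypothetical.

```python
import math


def i_closed_form(t, i0, beta, b_bar, delta):
    """Infected fraction at time t from the closed-form logistic solution.

    Requires rho' = delta / (beta * b_bar) != 1; the formula is 0/0 at rho' = 1.
    """
    rho = delta / (beta * b_bar)
    growth = beta * b_bar - delta
    return i0 * (1 - rho) / (i0 + (1 - rho - i0) * math.exp(-growth * t))


def i_euler(t_end, i0, beta, b_bar, delta, dt=1e-3):
    """Forward-Euler integration of di/dt = beta*b_bar*i*(1-i) - delta*i."""
    i = i0
    for _ in range(int(round(t_end / dt))):
        i += dt * (beta * b_bar * i * (1 - i) - delta * i)
    return i


if __name__ == "__main__":
    # Illustrative values only: rho' = 0.2, so the equilibrium is 1 - rho' = 0.8.
    beta, b_bar, delta, i0 = 1.0, 5.0, 1.0, 0.01
    for t in (0.5, 1.0, 2.0, 5.0):
        print(t, i_closed_form(t, i0, beta, b_bar, delta),
              i_euler(t, i0, beta, b_bar, delta))
```

With ρ' below 1 both curves rise toward 1 - ρ'; swapping in a cure rate larger than βb̄ makes both decay toward 0.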
4c. Methodologies: Other Techniques
DA ignores the stochastic features of a virus
– Size of fluctuations in the number of infected individuals, …
Probabilistic Analysis (PA)
– $p(I, t)$: the probability distribution for the no. of infected individuals $I$ at time $t$
– PA corroborates DA results when $\rho' > 1$, where $\rho' = \delta/(\beta\bar{b})$; however, an epidemic may not happen even when $\rho' < 1$
Weak links (sparse systems)
– Infrequent program sharing with others
Hierarchical Model (localised systems)
– Hierarchy of cliques
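One way to see why an epidemic is not guaranteed is to follow the distribution p(I, t) directly. The sketch below is a simplified, homogeneous birth-death chain, not necessarily the paper's exact formulation: it assumes a system-wide infection rate of βb̄ I(1 - I/N) and a system-wide cure rate of δI, and integrates the resulting master equation with forward Euler. The name `master_equation` and all parameter values are hypothetical.

```python
def master_equation(n, beta_b, delta, i_start, t_end, dt=1e-3):
    """Euler-integrate p(I, t) for a birth-death chain on I = 0..n.

    beta_b stands for the product beta * b_bar. State I = 0 is absorbing,
    so p[0] is the probability that the virus has died out by time t_end.
    """
    birth = [beta_b * k * (1 - k / n) for k in range(n + 1)]  # I -> I + 1
    death = [delta * k for k in range(n + 1)]                 # I -> I - 1
    p = [0.0] * (n + 1)
    p[i_start] = 1.0
    for _ in range(int(round(t_end / dt))):
        new_p = p[:]
        for k in range(n + 1):
            flow_up = birth[k] * p[k] * dt    # probability moving k -> k + 1
            flow_down = death[k] * p[k] * dt  # probability moving k -> k - 1
            new_p[k] -= flow_up + flow_down
            if k < n:
                new_p[k + 1] += flow_up
            if k > 0:
                new_p[k - 1] += flow_down
        p = new_p
    return p


if __name__ == "__main__":
    # Illustrative values: rho' = delta / beta_b = 0.5, one initial infection.
    p = master_equation(n=30, beta_b=2.0, delta=1.0, i_start=1, t_end=10.0)
    print("probability of extinction by t = 10:", p[0])
    print("most likely nonzero state:", max(range(1, 31), key=lambda k: p[k]))
```

Even with ρ' below 1, a noticeable share of the probability mass ends up in the extinct state I = 0, which the deterministic approximation cannot represent.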
5. Results: Simulations
[Figure: comparison of $I(t)$ as given by deterministic theory and a typical simulation run on a randomly generated graph with 100 nodes]
[Figure: comparison of $I(t)$ in the deterministic and stochastic models. Black curve: deterministic $I(t)$. White curve: stochastic average of $I(t)$. Gray area: one standard deviation about the stochastic average.]
The final equilibrium values differ by only 0.3%.
[Figure annotations, symbols lost in extraction: 0.1, 2.0, 2.0, 8.01]
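A comparison of this kind can be reproduced in outline with a small simulation. The sketch below is a discrete-time approximation, not the paper's simulator: in each step of length dt, an infected node infects each susceptible out-neighbour with probability β·dt and is cured with probability δ·dt. The function name `simulate_sis` and the parameter values are illustrative assumptions.

```python
import random


def simulate_sis(n, p, beta, delta, initial_infected, t_end, dt=0.05, seed=0):
    """Simulate SIS spread on a random directed graph; return I(t) per step."""
    rng = random.Random(seed)
    # arcs[u] lists the nodes that u can infect (directed random graph).
    arcs = [[v for v in range(n) if v != u and rng.random() < p]
            for u in range(n)]
    infected = set(rng.sample(range(n), initial_infected))
    history = [len(infected)]
    for _ in range(int(round(t_end / dt))):
        newly_infected, cured = set(), set()
        for u in infected:
            for v in arcs[u]:
                # Infection attempt along the arc u -> v.
                if v not in infected and rng.random() < beta * dt:
                    newly_infected.add(v)
            if rng.random() < delta * dt:
                cured.add(u)
        infected = (infected - cured) | newly_infected
        history.append(len(infected))
    return history


if __name__ == "__main__":
    # Illustrative values only, chosen so that rho' is roughly 0.2.
    n, p, beta, delta = 100, 0.05, 1.0, 1.0
    history = simulate_sis(n, p, beta, delta, initial_infected=10, t_end=20.0)
    rho = delta / (beta * p * (n - 1))
    print("simulated I at end:", history[-1])
    print("deterministic equilibrium N(1 - rho'):", n * (1 - rho))
```

A single run fluctuates around its own equilibrium; averaging many seeds would give the stochastic mean and standard deviation described in the captions.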
6. Conclusion
Directed random graph model & three different techniques
– Deterministic approximation, probabilistic approximation and simulation
Theoretical results
– Homogeneous systems (fully-connected graphs): epidemic threshold at $\rho' = 1$, where $\rho' = \delta/(\beta\bar{b})$. When $\rho' < 1$, an epidemic occurs with probability $1 - \rho'$, so it is not certain; when it does occur, the number of infections increases exponentially and saturates at an equilibrium of $N(1 - \rho')$.
– Sparse systems: epidemic threshold < 1, slow growth rate and depressed equilibrium level
– Localised systems: the growth of the number of infections is sub-exponential
7. Questions
According to the preceding theoretical analysis, how can we prevent the widespread proliferation of computer viruses? What should be considered when developing anti-virus policies and heuristics?
Is it still necessary to update anti-virus software periodically? Why or why not?