Part I: Introductory Materials
Introduction to Graph Theory
Dr. Nagiza F. Samatova
Department of Computer Science
North Carolina State University
and
Computer Science and Mathematics Division
Oak Ridge National Laboratory
Graphs
Graph with 7 nodes and 16 edges
Figure labels: Nodes / Vertices, Edges
$G = (V, E)$
$V = \{v_1, v_2, \ldots, v_n\}$
$E = \{e_k = (v_i, v_j) \mid v_i, v_j \in V,\ k = 1, \ldots, m\}$
Undirected: $(v_i, v_j) = (v_j, v_i)$
Directed: $(v_i, v_j) \neq (v_j, v_i)$
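The difference between undirected and directed edges can be sketched in a few lines of Python. This is only an illustration: the integer vertex names and the helper functions `undirected_edge` and `directed_edge` are assumptions, not part of the slides.

```python
# Vertices V = {v_1, ..., v_n}; here simply the integers 1..4 (an assumption).
V = {1, 2, 3, 4}

def undirected_edge(u, v):
    """Unordered pair: (u, v) and (v, u) denote the same edge."""
    return frozenset((u, v))

def directed_edge(u, v):
    """Ordered pair: (u, v) and (v, u) are different edges."""
    return (u, v)

E_undirected = {undirected_edge(1, 2), undirected_edge(2, 1), undirected_edge(2, 3)}
E_directed = {directed_edge(1, 2), directed_edge(2, 1), directed_edge(2, 3)}

print(len(E_undirected))  # 2: the pair {1, 2} is stored once
print(len(E_directed))    # 3: (1, 2) and (2, 1) are distinct
```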
Types of Graphs
• Undirected vs. Directed
• Attributed/Labeled (e.g., vertex, edge) vs. Unlabeled
• Weighted vs. Unweighted
• General vs. Bipartite (Multipartite)
• Trees (no cycles)
• Hypergraphs
• Simple vs. w/ loops vs. w/ multi-edges
Labeled Graphs and Induced Subgraphs
Bold: A subgraph induced by vertices b, c and d
Labeled graph w/ loops
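As a rough sketch of what "induced" means, the subgraph induced by a vertex set keeps exactly those edges whose endpoints both lie in the set. The graph below is a made-up example with a loop, not the one drawn on the slide, and `induced_subgraph` is a hypothetical helper name.

```python
def induced_subgraph(edges, keep):
    """Return the edges of the subgraph induced by the vertex set `keep`.

    An edge survives only if both of its endpoints are in `keep`.
    """
    keep = set(keep)
    return {(u, v) for (u, v) in edges if u in keep and v in keep}

# Hypothetical labeled graph with a loop at 'b' (not the graph from the slide).
edges = {("a", "b"), ("b", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("b", "d")}

sub = induced_subgraph(edges, {"b", "c", "d"})
print(sorted(sub))
```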
Graph Automorphism
Which graphs are automorphic?
An automorphism is an isomorphism that preserves the labels.
(A) (B) (C)
Vertex degree, in-degree, out-degree
Directed edge: tail (t) → head (h)
In-degree of a vertex is the number of incoming edges
Out-degree of a vertex is the number of outgoing edges
Degree of a vertex is the total number of incident edges (in-degree plus out-degree)
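The three definitions above can be computed in one pass over an edge list. This is a minimal sketch, assuming each directed edge is stored as a (tail, head) pair; the function name `degrees` and the sample edges are illustrative.

```python
from collections import Counter

def degrees(directed_edges):
    """Return (in_degree, out_degree, degree) Counters for a directed edge list.

    Each edge is a (tail, head) pair pointing from tail to head.
    """
    in_deg, out_deg = Counter(), Counter()
    for tail, head in directed_edges:
        out_deg[tail] += 1   # the edge leaves its tail
        in_deg[head] += 1    # the edge enters its head
    total = in_deg + out_deg
    return in_deg, out_deg, total

# Made-up example: a -> b, a -> c, b -> c, c -> a
in_deg, out_deg, deg = degrees([("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")])
print(in_deg["c"], out_deg["c"], deg["c"])  # 2 1 3
```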
Graph Representation and Formats
• Adjacency Matrix (vertex vs. vertex)
• Incidence Matrix (vertex vs. edge)
• Sparse vs. Dense Matrices
• DIMACS file format
• In R: igraph object
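To make the formats listed above concrete, here is a small Python sketch (Python rather than R is a choice made for this example). It reads a graph in the DIMACS edge format, where `c` lines are comments, the `p edge <n> <m>` line gives the vertex and edge counts, and each `e <u> <v>` line is an edge with 1-based vertex ids. It then builds both matrix representations. The sample graph and function names are assumptions.

```python
def parse_dimacs(text):
    """Parse DIMACS edge-format text into (n, edges) with 1-based vertex ids."""
    n, edges = 0, []
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0] == "c":      # blank line or comment
            continue
        if parts[0] == "p":                   # problem line: p edge <n> <m>
            n = int(parts[2])
        elif parts[0] == "e":                 # edge line: e <u> <v>
            edges.append((int(parts[1]), int(parts[2])))
    return n, edges

def adjacency_matrix(n, edges):
    """n x n vertex-vs-vertex matrix of an undirected graph."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u - 1][v - 1] = 1
        A[v - 1][u - 1] = 1
    return A

def incidence_matrix(n, edges):
    """n x m vertex-vs-edge matrix: entry is 1 if the vertex is an endpoint."""
    B = [[0] * len(edges) for _ in range(n)]
    for k, (u, v) in enumerate(edges):
        B[u - 1][k] = 1
        B[v - 1][k] = 1
    return B

sample = """c a triangle plus a pendant vertex (made-up example)
p edge 4 4
e 1 2
e 2 3
e 1 3
e 3 4
"""
n, edges = parse_dimacs(sample)
A = adjacency_matrix(n, edges)
B = incidence_matrix(n, edges)
```

Every column of the incidence matrix `B` sums to 2, since each edge has two endpoints, while the row sums of `A` give the vertex degrees.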
Adjacency Matrix Representation
[Figure: labeled graph with vertices A(1), A(2), A(3), A(4), B(5), B(6), B(7), B(8)]
       A(1)  A(2)  A(3)  A(4)  B(5)  B(6)  B(7)  B(8)
A(1)    1     1     1     0     1     0     0     0
A(2)    1     1     0     1     0     1     0     0
A(3)    1     0     1     1     0     0     1     0
A(4)    0     1     1     1     0     0     0     1
B(5)    1     0     0     0     1     1     1     0
B(6)    0     1     0     0     1     1     0     1
B(7)    0     0     1     0     1     0     1     1
B(8)    0     0     0     1     0     1     1     1
[Figure: the same labeled graph with its vertices numbered in a different order]
       A(1)  A(2)  A(3)  A(4)  B(5)  B(6)  B(7)  B(8)
A(1)    1     1     0     1     0     1     0     0
A(2)    1     1     1     0     0     0     1     0
A(3)    0     1     1     1     1     0     0     0
A(4)    1     0     1     1     0     0     0     1
B(5)    0     0     1     0     1     0     1     1
B(6)    1     0     0     0     0     1     1     1
B(7)    0     1     0     0     1     1     1     0
B(8)    0     0     0     1     1     1     0     1
Representation is NOT unique. Algorithms can be order-sensitive.
Src: “Introduction to Data Mining” by Kumar et al
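A short sketch of why the representation is not unique: reordering the vertices permutes the rows and columns of the adjacency matrix, so the same graph yields a different matrix. The matrix below is the first one on this slide; the swap of the first two vertices is an arbitrary choice made for illustration, and `permute` is a hypothetical helper.

```python
def permute(A, order):
    """Reorder rows and columns of adjacency matrix A by the vertex order `order`."""
    return [[A[i][j] for j in order] for i in order]

# First adjacency matrix from the slide (rows/columns: A(1)..A(4), B(5)..B(8)).
A = [
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1, 1, 1],
]

order = [1, 0, 2, 3, 4, 5, 6, 7]   # swap the first two vertices
P = permute(A, order)

print(P != A)                                        # True: the matrix changed
print(sorted(map(sum, P)) == sorted(map(sum, A)))    # True: row sums are preserved
```

An algorithm that scans the matrix row by row would visit the vertices in a different order for `P` than for `A`, which is one way order sensitivity can arise.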
Families of Graphs
• Cliques
• Path and simple path
• Cycle
• Tree
• Connected graphs
Read the book chapter for definitions and examples.
The CLIQUE Problem
Maximum Clique of Size 5
Clique: a complete subgraph
Maximal Clique: a clique that cannot be enlarged by adding any more vertices
Maximum Clique: the largest maximal clique in the graph
$CLIQUE = \{ \langle G, k \rangle \mid G \text{ has a clique of size } k \}$
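The three notions can be told apart with a small brute-force sketch. It assumes a simple undirected graph stored as a dict mapping each vertex to its set of neighbors; the function names and the example graph are illustrative, and the exhaustive search in `has_clique` takes exponential time in general.

```python
from itertools import combinations

def is_clique(adj, S):
    """True if every pair of distinct vertices in S is adjacent."""
    return all(v in adj[u] for u, v in combinations(S, 2))

def is_maximal_clique(adj, S):
    """True if S is a clique and no outside vertex is adjacent to all of S."""
    S = set(S)
    if not is_clique(adj, S):
        return False
    return not any(S <= adj[w] for w in adj if w not in S)

def has_clique(adj, k):
    """Brute-force decision version: does the graph contain a clique of size k?"""
    return any(is_clique(adj, S) for S in combinations(adj, k))

# Made-up graph: triangle 1-2-3 plus the edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

print(is_maximal_clique(adj, {1, 2}))     # False: vertex 3 extends it
print(is_maximal_clique(adj, {3, 4}))     # True: maximal, but not maximum
print(is_maximal_clique(adj, {1, 2, 3}))  # True: maximal and maximum
print(has_clique(adj, 3), has_clique(adj, 4))  # True False
```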
Does this graph contain a 4-clique?
Indeed it does!
But, if it had not,
what evidence would have been needed?
Problem: Decision, Optimization or Search
Problem versions:
• Decision: “Yes”/“No” answer
• Optimization: parameter k → max/min
• Search: an actual solution
• Enumeration: all solutions
Formulate each version for the CLIQUE problem.
• Which problem is harder to solve?
• If we solve the Decision problem, can we use it for the others? (self-reduction)
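One possible answer to the last question, sketched for CLIQUE: given only a yes/no oracle, the optimization version follows by raising k until the answer turns to "no", and the search version follows by deleting every vertex whose removal still leaves a clique of that size. The oracle here is a brute-force stand-in, and all names are illustrative. What matters is that the number of oracle calls is polynomial in the number of vertices.

```python
from itertools import combinations

def has_clique(adj, k):
    """Decision oracle (brute force here): is there a clique of size k?"""
    return any(
        all(v in adj[u] for u, v in combinations(S, 2))
        for S in combinations(adj, k)
    )

def max_clique_size(adj):
    """Optimization via the oracle: the largest k with a 'yes' answer."""
    k = 0
    while k < len(adj) and has_clique(adj, k + 1):
        k += 1
    return k

def remove_vertex(adj, v):
    """Copy of the graph with vertex v and its incident edges deleted."""
    return {u: nbrs - {v} for u, nbrs in adj.items() if u != v}

def find_max_clique(adj):
    """Search via the oracle: drop every vertex whose removal keeps a k-clique."""
    k = max_clique_size(adj)
    g = dict(adj)
    for v in list(adj):
        smaller = remove_vertex(g, v)
        if has_clique(smaller, k):   # v is not needed, discard it
            g = smaller
    return set(g)                    # the remaining vertices form a k-clique

# Made-up graph: triangle 1-2-3 plus the edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(max_clique_size(adj))   # 3
print(find_max_clique(adj))   # {1, 2, 3}
```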
Refresher: Class P and Class NP
Definition: P (NP) is the class of languages/problems that are decidable in polynomial time on a (non-)deterministic single-tape Turing machine.
$P = \bigcup_k \mathrm{DTIME}(n^k)$
$NP = \bigcup_k \mathrm{NTIME}(n^k)$
Open question: P = NP?
NP stands for “non-deterministic polynomial” (polynomially verifiable), not “non-polynomial”.
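"Polynomially verifiable" can be illustrated with CLIQUE: a "yes" answer comes with a certificate (the k vertices), and checking it needs only about k² adjacency lookups. The sketch below assumes the graph is a dict of neighbor sets; the function name and the example are hypothetical.

```python
def verify_clique_certificate(adj, k, certificate):
    """Polynomial-time check that `certificate` is a clique of size k in the graph."""
    S = set(certificate)
    if len(S) != k or not S <= set(adj):
        return False   # wrong size, or it names vertices that are not in the graph
    return all(v in adj[u] for u in S for v in S if u != v)

# Made-up graph: triangle 1-2-3 plus the edge 3-4.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

print(verify_clique_certificate(adj, 3, [1, 2, 3]))  # True
print(verify_clique_certificate(adj, 3, [2, 3, 4]))  # False: 2 and 4 are not adjacent
```

Note the asymmetry: a "no" answer has no comparably short certificate known, which is why verifying is not the same as deciding.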
P vs. NP
The Classic Complexity Theory View:
P: “easy”
NP: “hard”
$\Sigma_2^P$, …, PSPACE, …: “forget about it”
“About ten years ago some computer scientists came by and said they heard we have some really cool problems. They showed that the problems are NP-complete and went away!”
Classical Graph Theory Problems
CSC505: Algorithms, CSC707: Complexity Theory, CSC5??: Graph Theory
• Longest Path
• Maximum Clique
• Minimum Vertex Cover
• Hamiltonian Path/Cycle
• Traveling Salesman (TSP)
• Maximum Independent Set
• Minimum Dominating Set
• Graph/Subgraph Isomorphism
• Maximum Common Subgraph
• …
NP-hard Problems
Graph Mining Problems
CSC 422/522 and Our Book
• Clustering + Maximal Clique Enumeration
• Classification
• Association Rule Mining + Frequent Subgraph Mining
• Anomaly Detection
• Similarity/Dissimilarity/Distance Measures
• Graph-based Dimension Reduction
• Link Analysis
• …
Many graph mining problems have to deal with classical graph problems as part of their data mining pipelines.
Dealing with Computational Intractability
• Exact Algorithms:
– Small graph problems
– Small parameters to graph problems
– Special classes of graphs (e.g., bounded tree-width)
• Polynomial-Time Approximation Algorithms ($O(n^c)$)
– Guaranteed error bound on the solution
• Heuristic Polynomial-Time Algorithms
– No guarantee on the quality of the solution
– Low degree polynomial solutions
Our focus
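The contrast between the last two strategies can be sketched with two textbook routines, chosen here purely for illustration. The first is the matching-based 2-approximation for Minimum Vertex Cover, which comes with a guaranteed factor. The second is a greedy heuristic for Maximum Clique, which runs in low-degree polynomial time but offers no quality guarantee. Both assume a simple undirected graph; the function names and the example are assumptions.

```python
def vertex_cover_2approx(edges):
    """Matching-based 2-approximation for Minimum Vertex Cover.

    Scan the edges; whenever an edge is still uncovered, add both endpoints.
    The chosen edges form a matching, so the cover is at most twice the optimum.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

def greedy_clique(adj):
    """Heuristic for Maximum Clique: no guarantee on solution quality.

    Visit vertices by decreasing degree and keep each one that is adjacent
    to everything kept so far. The result is a maximal clique, not
    necessarily a maximum one.
    """
    clique = set()
    for v in sorted(adj, key=lambda x: len(adj[x]), reverse=True):
        if clique <= adj[v]:
            clique.add(v)
    return clique

# Made-up graph: triangle 1-2-3 plus the edge 3-4.
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

cover = vertex_cover_2approx(edges)
print(cover)               # {1, 2, 3, 4}; an optimal cover such as {1, 3} has size 2
print(greedy_clique(adj))  # {1, 2, 3}
```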