Algorithmic Problems in the Internet
Christos H. Papadimitriou
www.cs.berkeley.edu/~christos
Iowa State, April 2003 2
Goals of TCS (1950-2000): Develop a productive mathematical understanding of the capabilities and limitations of the von Neumann computer and its software (the dominant and most novel computational artifacts of that time)
Mathematical tools: combinatorics, logic
What should the goals of TCS be today? (and what math tools will be handy?)
The Internet
• huge, growing, open, emergent, mysterious
• built, operated and used by a multitude of diverse economic interests
• as information repository: open, huge, available, unstructured, critical
• foundational understanding urgently needed
Today…
• Games and mechanism design
• Getting lost in the web
• The Internet’s heavy tail
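Games, games…
[Figure: payoff matrix for two players; rows and columns are the players' strategies, each cell holds a pair of payoffs such as 3,-2]
(NB: also, many players)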
e.g.
matching pennies:
1,-1   -1,1
-1,1   1,-1
chicken:
0,0   0,1
1,0   -1,-1
prisoner’s dilemma:
3,3   0,4
4,0   1,1
Nash equilibrium
• Definition: double best response (problem: may not exist)
• randomized Nash equilibrium
Theorem [Nash 1952]: Always exists.
• Problem: there are usually many...
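The "double best response" definition can be checked by brute force on small games. The following is a minimal Python sketch, not taken from the slides: the function name `pure_nash` and the dictionary encoding of the payoff matrices are illustrative assumptions. It enumerates the pure equilibria of three 2x2 games using the usual payoff values for matching pennies, chicken and the prisoner's dilemma.

```python
from itertools import product

def pure_nash(game):
    """Return all cells (r, c) where each strategy is a best response to the other.

    `game` maps (row strategy, column strategy) -> (row payoff, column payoff).
    This encoding is an assumption made for the example.
    """
    rows = sorted({r for r, _ in game})
    cols = sorted({c for _, c in game})
    equilibria = []
    for r, c in product(rows, cols):
        # Row player cannot gain by switching rows while the column stays fixed.
        row_ok = all(game[(r, c)][0] >= game[(r2, c)][0] for r2 in rows)
        # Column player cannot gain by switching columns while the row stays fixed.
        col_ok = all(game[(r, c)][1] >= game[(r, c2)][1] for c2 in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

matching_pennies = {(0, 0): (1, -1), (0, 1): (-1, 1),
                    (1, 0): (-1, 1), (1, 1): (1, -1)}
chicken = {(0, 0): (0, 0), (0, 1): (0, 1),
           (1, 0): (1, 0), (1, 1): (-1, -1)}
prisoners_dilemma = {(0, 0): (3, 3), (0, 1): (0, 4),
                     (1, 0): (4, 0), (1, 1): (1, 1)}

for name, game in [("matching pennies", matching_pennies),
                   ("chicken", chicken),
                   ("prisoner's dilemma", prisoners_dilemma)]:
    print(name, pure_nash(game))
```

Matching pennies has no pure equilibrium (the "may not exist" case), chicken has two (the "usually many" case), and the prisoner's dilemma has exactly one.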
The price of anarchy
$\text{price of anarchy} = \dfrac{\text{cost of worst Nash equilibrium}}{\text{“socially optimum” cost}}$ [Koutsoupias and P, 1998]
In network routing: = 2 [Roughgarden and Tardos, 2000; Roughgarden 2002]
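To make the ratio concrete, here is a small Python sketch on an assumed instance that does not come from the slides: the textbook two-link network in which one unit of traffic chooses between a link with latency x (x being the fraction using it) and a link with constant latency 1. The function names and the grid resolution are illustrative choices.

```python
def total_cost(x):
    """Total latency when fraction x uses the variable link and 1 - x the constant link."""
    return x * x + (1 - x) * 1.0

def is_equilibrium(x):
    """No infinitesimal user can lower its own latency by switching links."""
    lat_var, lat_const = x, 1.0
    if x > 0 and lat_var > lat_const:
        return False  # users on the variable link would rather switch
    if x < 1 and lat_const > lat_var:
        return False  # users on the constant link would rather switch
    return True

grid = [k / 1000 for k in range(1001)]
nash_costs = [total_cost(x) for x in grid if is_equilibrium(x)]
opt_cost = min(total_cost(x) for x in grid)
ratio = max(nash_costs) / opt_cost
print(ratio)
```

On this particular instance the ratio is 4/3, below the figure of 2 quoted for network routing; the sketch illustrates the definition only, not the cited theorems.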
mechanism design (or inverse game theory)
• agents have utilities – but these utilities are known only to them
• game designer prefers certain outcomes depending on players’ utilities
• designed game (mechanism) has designer’s goals as dominant strategies
e.g., Vickrey auction
• sealed-highest-bid auction encourages gaming and speculation
• Vickrey auction: Highest bidder wins, pays second-highest bid
Theorem: Vickrey auction is a truthful mechanism.
(Theorem: It maximizes social benefit and auctioneer expected revenue.)
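A minimal Python sketch of the second-price rule, with a brute-force check of truthfulness on a small grid of integer bids. The helper names (`vickrey`, `utility`, `truthful_is_weakly_dominant`), the tie-breaking rule (earliest bidder wins) and the grid are assumptions made for illustration; the check is a sanity test, not a proof.

```python
def vickrey(bids):
    """Second-price sealed-bid auction: return (winner index, price paid)."""
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    winner = order[0]
    price = bids[order[1]]  # the winner pays the second-highest bid
    return winner, price

def utility(value, my_bid, others):
    """Utility of bidder 0, whose true value is `value`, against fixed other bids."""
    winner, price = vickrey([my_bid] + list(others))
    return value - price if winner == 0 else 0.0

def truthful_is_weakly_dominant(value, others_grid, bid_grid):
    """True if no bid in bid_grid ever beats bidding the true value."""
    for others in others_grid:
        honest = utility(value, value, others)
        if any(utility(value, b, others) > honest for b in bid_grid):
            return False
    return True

others_grid = [(a, b) for a in range(0, 13, 3) for b in range(0, 13, 3)]
print(vickrey([5, 9, 7]))
print(truthful_is_weakly_dominant(7, others_grid, range(16)))
```

Shading the bid below the true value can only lose an auction that would have been profitable, and overbidding can only win one at a loss, which is what the grid check confirms.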
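Vickrey shortest paths
[Figure: network from s to t with a declared cost on each edge]
Pay each edge e on the shortest path its declared cost c(e), plus a bonus equal to $\mathrm{dist}(s,t)\big|_{c(e)=\infty} - \mathrm{dist}(s,t)$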
Problem:
[Figure: s and t joined by a path of cost-1 edges and by an alternative route of cost 10]
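The payment rule for Vickrey shortest paths (declared cost plus the increase in dist(s,t) when the edge is removed) can be sketched in Python as follows. The graph is an assumed instance in the spirit of this slide, five unit-cost edges in series plus one direct edge of cost 10; the exact graph on the slide is not recoverable, and the function names are illustrative.

```python
import heapq

def dist(edges, s, t, skip=None):
    """Dijkstra on an undirected graph; `skip` is an edge treated as having infinite cost."""
    adj = {}
    for (u, v), c in edges.items():
        if (u, v) == skip:
            continue
        adj.setdefault(u, []).append((v, c))
        adj.setdefault(v, []).append((u, c))
    best = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == t:
            return d
        if d > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, c in adj.get(u, []):
            nd = d + c
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

def vickrey_payment(edges, s, t, e):
    """Declared cost of e plus the bonus dist(s,t | c(e) = infinity) - dist(s,t)."""
    return edges[e] + dist(edges, s, t, skip=e) - dist(edges, s, t)

# Assumed instance: cheap path s-a-b-c-d-t of unit edges, and a direct s-t edge of cost 10.
path = [("s", "a"), ("a", "b"), ("b", "c"), ("c", "d"), ("d", "t")]
edges = {e: 1 for e in path}
edges[("s", "t")] = 10

payments = {e: vickrey_payment(edges, "s", "t", e) for e in path}
total_paid = sum(payments.values())
print(dist(edges, "s", "t"), total_paid)
```

In this instance the path costs 5, but each of its five edges is paid 1 + (10 - 5) = 6, so the total payment is 30: the overcharge grows with the number of edges on the cheap path.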
But…
• …in the Internet Vickrey overcharge would be only about 30% on the average [FPSS 2002]
• Could this be the manifestation of rational behavior at network creation?
• [FPSS 2002]: Vickrey charges
– Depend on origin and destination
– Can be computed on top of BGP
But… (cont)
• [with Mihail and Saberi, 2003]
– They are small in expectation in random graphs.
– (Also: Why traffic grows moderately as the Internet grows…)
The web as a graph
cf: [Google 98], [Kleinberg 98]
• how do you sample the web? [Bar-Yossef, Berg, Chien, Fakcharoenphol, Weitz, VLDB 2000]
• e.g.: 42% of web documents are in html. How do you find that?
• What is a “random” web document?
[Figure: the web as a graph; nodes are documents, edges are hyperlinks]
Idea: random walk
Problems:
1. asymmetric
2. uneven degree
3. 2nd eigenvalue? $\lambda_2 = 0.99999$
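One standard way around the first two problems is to follow links in both directions and pad every node with self-loops until all degrees are equal, so that the walk's stationary distribution is uniform. The Python sketch below illustrates that idea on a tiny assumed link graph; it is not claimed to be the exact procedure of the cited sampling paper, and the names `regular_walk_matrix` and `step` are illustrative.

```python
def regular_walk_matrix(adj):
    """Transition probabilities of a walk on an undirected graph that has been
    padded with self-loops so that every node has the same degree D."""
    nodes = sorted(adj)
    D = max(len(adj[u]) for u in nodes)
    P = {u: {v: 0.0 for v in nodes} for u in nodes}
    for u in nodes:
        for v in adj[u]:
            P[u][v] += 1.0 / D
        P[u][u] += (D - len(adj[u])) / D  # self-loops absorb the missing degree
    return P

def step(p, P):
    """One step of the walk applied to a probability distribution p."""
    return {v: sum(p[u] * P[u][v] for u in P) for v in P}

# Assumed toy web: directed hyperlinks, used here in both directions.
links = [(0, 1), (0, 2), (0, 3), (3, 4)]
adj = {}
for u, v in links:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

P = regular_walk_matrix(adj)
p = {u: 1.0 if u == 0 else 0.0 for u in P}  # start at page 0
for _ in range(500):
    p = step(p, P)
print(p)
```

Because the padded transition matrix is symmetric, the distribution converges to uniform over the five pages even though the original degrees are uneven; how fast it converges is governed by the second eigenvalue, which is the third problem on the slide.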
The web walker: results
• mixing time is ~ $\log N/(1-\lambda_2)$
• WW mixing time: 3,000,000
• actual WW mixing time: 100
• .com 49%, .jp 9%, .edu 7%, .cn 0.8%
Q: Is the web a random graph?
• Many $K_{3,3}$’s (“communities”)
• Indegrees/outdegrees obey “power laws”
• Model [Kumar et al, FOCS 2000]: copying
Also the Internet
• [Faloutsos³ 1999] the degrees of the Internet are power law distributed
• Both autonomous systems graph and router graph
• Eigenvalues: ditto!??!
• Model?
The world according to Zipf
• Power laws, Zipf’s law, heavy tails,…
• i-th largest is ~ $i^{-a}$ (cities, words: a = 1, “Zipf’s Law”)
• Equivalently: prob[greater than x] ~ $x^{-b}$
• (compare with law of large numbers)
• “the signature of human activity”
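The "equivalently" step can be checked numerically: if the i-th largest item has size i^(-a), then the number of items larger than x is about x^(-1/a), so the tail exponent is b = 1/a. The Python sketch below uses assumed values (a = 2, 100,000 items, two arbitrary thresholds) to estimate b from the slope on a log-log scale.

```python
import math

a = 2.0        # assumed rank-size exponent for the example
n = 100_000    # assumed number of items
sizes = [i ** -a for i in range(1, n + 1)]  # i-th largest item has size i^(-a)

def count_greater(x):
    """Number of items whose size exceeds x (proportional to prob[greater than x])."""
    return sum(1 for s in sizes if s > x)

# Two thresholds far apart; the log-log slope between them estimates -b.
x1, x2 = 3e-4, 3e-8
slope = (math.log(count_greater(x2)) - math.log(count_greater(x1))) / (
    math.log(x2) - math.log(x1))
b_estimate = -slope
print(b_estimate, 1 / a)
```

With a = 1 (Zipf's law for cities and words) the same relation gives a tail exponent of 1.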
Models
• Size-independent growth (“the rich get richer,” or random walk in log paper)
• Growing number of growing cities
• In the web: copying links [Kumar et al, 2000]
• Carlson and Doyle 1999: Highly optimized tolerance (HOT)
Our model [with Fabrikant and Koutsoupias, 2002]:
$\min_{j<i}\,[\alpha\, d_{ij} + \mathrm{hop}_j]$
Theorem:
• if $\alpha < \text{const}$, then the graph is a star: degree = n − 1
• if $\alpha > \sqrt{n}$, then there is exponential concentration of degrees: prob(degree > x) < $e^{-ax}$
• otherwise, if $\text{const} < \alpha < \sqrt{n}$, heavy tail: prob(degree > x) > $x^{-b}$
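A small simulation sketch of the model in Python, under assumptions the slide does not spell out: nodes arrive one at a time at uniformly random points of the unit square, d_ij is Euclidean distance, hop_j is the number of tree edges from node j to the first node, and alpha is the weight on distance. The function name `tradeoff_tree` and the parameter values are illustrative.

```python
import math
import random

def tradeoff_tree(n, alpha, seed=0):
    """Grow a tree where node i attaches to the earlier node j that minimizes
    alpha * d_ij + hop_j; return the list of node degrees."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    hop = [0]            # hop[j]: tree distance from node j to node 0
    degree = [0] * n
    for i in range(1, n):
        j = min(range(i),
                key=lambda k: alpha * math.dist(pts[i], pts[k]) + hop[k])
        hop.append(hop[j] + 1)
        degree[i] += 1
        degree[j] += 1
    return degree

n = 200
star = tradeoff_tree(n, alpha=0.1)       # small alpha: hops dominate
local = tradeoff_tree(n, alpha=1000.0)   # large alpha: distance dominates
print(max(star), max(local))
```

With alpha = 0.1 every node attaches to node 0, since reaching it costs at most 0.1·√2 while any other node costs at least one hop, so the maximum degree is n − 1. With a very large alpha each node simply attaches to its nearest predecessor and no large hub forms. Intermediate values of alpha can be explored the same way to look for the heavy-tail regime.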
Heuristically optimized tradeoffs
• Also: file sizes (trade-off between communication costs and file overhead)
• Power law distributions seem to come from tradeoffs between conflicting objectives (a signature of human activity?)
• cf HOT, [Mandelbrot 1954]
• Other examples?
• General theorem?
PS: eigenvalues
Model: Edge [i,j] has prob. ~ $d_i d_j$
Theorem [with Mihail, 2002]: If the $d_i$’s obey a power law, then the $n^b$ largest eigenvalues are almost surely very close to $\sqrt{d_1}, \sqrt{d_2}, \sqrt{d_3}, \dots$
(NB: The eigenvalue exponent observed in Faloutsos³ is about ½ of the degree exponent)
Corollary: Spectral methods are of dubious value in the presence of large features
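One way to see why square roots of degrees appear, and why the eigenvalue exponent is about half the degree exponent: a single vertex of degree d together with its neighbours forms a star, and the adjacency matrix of a star has largest eigenvalue √d. The Python sketch below verifies this for an assumed d = 9 by exhibiting the eigenvector directly. It illustrates the intuition only and is not the proof of the theorem.

```python
import math

def star_adjacency(d):
    """Adjacency matrix of a star: node 0 is the centre, nodes 1..d are leaves."""
    n = d + 1
    A = [[0] * n for _ in range(n)]
    for leaf in range(1, n):
        A[0][leaf] = A[leaf][0] = 1
    return A

def matvec(A, v):
    """Plain matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

d = 9
A = star_adjacency(d)
v = [math.sqrt(d)] + [1.0] * d          # candidate eigenvector: sqrt(d) at the centre
Av = matvec(A, v)
# If v is an eigenvector with eigenvalue sqrt(d), then A v - sqrt(d) v vanishes.
residual = max(abs(y - math.sqrt(d) * x) for x, y in zip(v, Av))
print(residual)
```

Since the eigenvector has all positive entries, √d is the largest eigenvalue of the star. A few very high-degree vertices therefore dominate the top of the spectrum, which is the concern behind the corollary.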