1 the zipf seminars at emu-um sunday, april 06, 2014 after zipf: from city size distributions to...

1 The Zipf Seminars at EMU-UM Saturday, July 2, 2022 After Zipf: From City Size Distributions to Simulations Or why we find it hard to build models of how cities talk to each other Michael Batty & Yichun Xie UCL EMU [email protected] [email protected] http://www.casa.ucl.ac.uk/ http://www.ceita.emich.edu /

Upload: bailey-bolton

Post on 28-Mar-2015


1 The Zipf Seminars at EMU-UM

Monday, April 10, 2023

After Zipf: From City Size Distributions to Simulations

Or why we find it hard to build models of how cities talk to each other

Michael Batty (UCL) & Yichun Xie (EMU)

[email protected] [email protected]

http://www.casa.ucl.ac.uk/ http://www.ceita.emich.edu/


What we will do in this talk

1. Continue Tom and John’s discussion of Zipf’s Law in particular and scaling in urban systems in general from last week

2. Review very briefly what this area is about from last week

3. Review the key problems – power functions v. lognormal, fat tails, thin tails, primate cities

4. Note the basic stochastic models where cities do not talk to each other but do produce ‘good’ simulations. Illustrate such a simulation.


5. Outline some more examples of Zipf’s Law in terms of data applications – countries, spatial partitions, telecoms systems, the geography of citations

6. Note how connectivity or interaction is entering the debate through social networks and the web

7.


Zipf’s Law …

Says that in a set of well-defined objects like words (or cities?), the size of any object is inversely proportional to its rank; in the strict Zipf case this inverse relation is

$P_r = K / r = K r^{-1}$

where $P_r$ is the size of the object ranked $r$ and $K$ is the size of the largest object.

This is the strict form because the power is -1, which gives it somewhat mystical properties, but a more general form is the inverse power form

$P_r = K r^{-\alpha}$
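A minimal sketch of the two forms of the rank size rule, assuming the size of the object at rank r is K times r raised to a negative power. The function name `rank_size`, the value of K, and the exponent 0.8 are illustrative choices, not values from the talk.

```python
def rank_size(K, n, alpha=1.0):
    """Return sizes P_r = K * r**(-alpha) for ranks r = 1..n."""
    return [K * r ** (-alpha) for r in range(1, n + 1)]

strict = rank_size(K=1_000_000, n=5)              # strict Zipf, power -1
general = rank_size(K=1_000_000, n=5, alpha=0.8)  # flatter hierarchy (hypothetical exponent)

for r, (s, g) in enumerate(zip(strict, general), start=1):
    # In the strict case rank * size equals K at every rank.
    print(r, round(s), round(g), round(r * s))
```

In the strict case the product of rank and size is constant, which is one way to see what is special about a power of -1.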


In one sense, this is obvious – in a competitive system where resources are scarce, it is intuitively clear that there are fewer big things than small things

And when you have a system in which big things ‘grow’ from small things, this is even more obvious

But why should the slope be -1, and why should the form be inverse power?

In fact, as we shall see and as Tom intimated last week, this is highly questionable


Here are some classic examples from last week

First from Zipf (1949)


Now from Tom (2003) – top 135 cities


As you can see, the curve is not quite straight but slightly curved – this is significant, but there are some obvious problems

• 1. Most researchers have taken the top 100 or so cities – they have disregarded the bottom, but what happens at the bottom is where it all begins – where growth starts – the short tail

• 2. Cities are not well-defined objects – they grow into each other

• 3. Cities do not keep their place in the rank order – they shift, yet the overall order stays stable – how?

• 4. Primate cities are problematic at the top of the long tail


Let’s look at some cities, countries, & spatial partitions

USA, 3149 cities: R-sq = 0.992, slope = -0.81

Mexico, 36 cities: R-sq = 0.927, slope = -1.27

World, 216 countries: R-sq = 0.708, slope = -2.26

UK, 459 areas: R-sq = 0.760, slope = -0.58
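Fits like these are usually obtained by regressing log size on log rank. The sketch below assumes ordinary least squares on base-10 logs; the helper `fit_rank_size` and the synthetic data are hypothetical, since the original data sets are not reproduced here.

```python
import math

def fit_rank_size(sizes):
    """OLS fit of log10(size) on log10(rank); returns (intercept, slope, r_squared)."""
    ranked = sorted(sizes, reverse=True)
    x = [math.log10(r) for r in range(1, len(ranked) + 1)]
    y = [math.log10(p) for p in ranked]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared

# Hypothetical data: an exact power law with K = 10**6 and slope -0.81.
demo = [1e6 * r ** -0.81 for r in range(1, 201)]
intercept, slope, r2 = fit_rank_size(demo)
print(round(intercept, 3), round(slope, 3), round(r2, 3))
```

On real data the R-sq falls below 1 wherever the log-log plot bends, which is one quick check on whether a single power function is a fair description.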


Basically what these relations show is that as soon as you define something a little bit different from cities, you get Zipf exponents which are nowhere near unity. In fact it would seem that for countries we have much greater inequality than cities which in turn is much greater than for exhaustive spatial divisions

Now to show how different this all is, then I will show yet another set of countries where there are now only 149 countries, not 216 – from another standard data set (MapInfo)

[Figure: Log Population against Log Rank]

[Figure: Log Population against Log Rank, annotated with the three features below]

The King or Primate City Effect

Scaling only over restricted orders of magnitude

A different regime in the thin tail

[Figure: Log Population versus Log Rank (axes: log rank, log population)]

[Figure: Residuals against Rank Orders (axes: log rank, residuals)]

Fitted relations shown on the plots:

$\log P_r = 10.157 - 1.99 \log r$

$\log P_r = 4.15 - 0.36 \log r$


Related Problems

• Scaling - many indeed most distributions are not power functions

• The events are not independent - in medieval times they may have been but for the last 200 years, cities have grown into each other, nations have become entirely urbanized, and now there are global cities - the tragedy in NY tells us this - where more than half of those killed were not US citizens

• Should we expect scaling ? We know that cities depend on history as well as economic growth


• Confusion over Zipf exponents and their value

• Why should we expect no characteristic length scale - when the world is finite? We should avoid the sin of ‘Asymptopia’.

• As scaling is often said to be the signature of self-organization, why should we expect disparate and distant places to self-organize ?

• The primate city effect is very dominant in historically old countries

• BUT should we expect these differences to disappear as the world becomes global ?


Let’s first look at arbitrary events - An Example for the UK based on Administrative Units, not on trying to define cities as separate fields

These are 458 admin units, somewhat less than full cities in many cases and some containing towns in county aggregates - we have data from 1901 to 1991 so we can also look at the dynamics of change - traditional rank size theory says very little about dynamics

[Figure: Log Population against Log Rank]


Year t   Correlation R2   Intercept Kt   P*_1t = 10^Kt    Slope
1901     0.879            6.547          3526157.772      -0.817
1911     0.880            6.579          3801260.554      -0.810
1921     0.887            6.604          4025650.857      -0.812
1931     0.892            6.607          4046932.207      -0.802
1941     0.865            6.532          3410371.276      -0.740
1951     0.869            6.482          3034245.953      -0.700
1961     0.830            6.414          2595897.640      -0.651
1971     0.815            6.322          2101166.738      -0.601
1981     0.816            6.321          2095242.746      -0.601
1991     0.791            6.272          1872348.019      -0.577

This is what we get when we fit the rank size relation $P_r = P_1 r^{-\alpha}$ to the data. The slope is hardly -1, but at between -0.82 and -0.58 it is greater than the -1.99 found for world population in 1994


A Digression – Many other systems show such rank size – here we will look at the geography of scientific citation – the Highly Cited

Table 2: Top Ten Ranking of Highly Cited Scientists by Country

Rank   Country       No. Highly Cited   No. of Places   Concentration: Scientists/Places   Highly Cited per Million Population
1      US            815                90              9.06                               3.16
2      UK            100                24              4.17                               1.72
3      Germany       62                 21              2.95                               0.78
4      Canada        42                 15              2.80                               1.53
5      Japan         34                 14              2.43                               0.27
6      France        29                 11              2.64                               0.50
8      Switzerland   26                 5               5.20                               3.78
9      Sweden        17                 2               8.50                               1.96
10     Italy         17                 10              1.7                                0.29


Table 1: Top Twenty Ranking of Highly Cited Scientists by Institution

Rank   Research Institution                                                                  No. of Highly Cited Scientists   Percent Highly Cited Scientists
1      Harvard                                                                               52                               4.3
2      Stanford                                                                              36                               2.9
3      U-Cal, San Diego                                                                      30                               2.5
4      MIT                                                                                   26                               2.1
5      NIH National Cancer Institute                                                         19                               1.6
6      U-Cal, San Francisco; Cornell                                                         17                               1.4
8      U-Cal, Berkeley; University College London UK                                         16                               1.3
10     CalTech                                                                               15                               1.2
11     NIH Allergy & Infectious Diseases                                                     13                               1.1
12     Johns Hopkins; University of Cambridge UK; Washington, Seattle; Washington, St Louis  12                               1.0
16     U-Cal, Davis; U-Texas Cancer Center                                                   11                               0.9
18     Michigan; Northwestern; Yale                                                          10                               0.8

Rank-Size Distributions of Highly Cited Scientists

[Figure: ln[P(x)/<x>] against ln[r/M]; red by institution, black by place, grey by country, with straight-line fits]

by institution (red)

$\ln[P(x)/\langle x \rangle] = -0.555 - 0.816 \ln(r/M), \quad M = 429, \quad R^2 = 0.938$

by place/city (black)

$\ln[P(x)/\langle x \rangle] = -0.768 - 1.049 \ln(r/M), \quad M = 232, \quad R^2 = 0.962$

by country (grey)

$\ln[P(x)/\langle x \rangle] = -1.583 - 1.997 \ln(r/M), \quad M = 27, \quad R^2 = 0.949$


The Highly Cited By Place


Explaining City Size Distributions Using Multiplicative Processes

The last 10 years have seen many attempts to explain scaling distributions such as these using various simple stochastic processes. Most take no account of the fact that cities compete – talk to each other.

In essence, the easiest is a model of proportionate effect or growth first used for economic systems by Gibrat in 1931 which leads to the lognormal distribution


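The key idea is that the change in size of the object in question is proportional to the size of the object and randomly chosen, that is

$\dfrac{\Delta P_{it}}{P_{it}} = \varepsilon_{it}, \qquad \Delta P_{it} = P_{i,t+1} - P_{it}$

This leads to the log of differences across time being a function of the sum of random changes

$\log P_{i,t+1} = \log P_{i0} + \sum_{\tau=0}^{t} \log(1 + \varepsilon_{i\tau}) \approx \log P_{i0} + \sum_{\tau=0}^{t} \varepsilon_{i\tau}$

This gives the model of proportionate effect

$P_{i,t+1} = (1 + \varepsilon_{it}) \, P_{it}$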
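A minimal sketch of proportionate effect, assuming each city is multiplied at every step by one plus an independent random growth rate. The normal distribution for the growth rate, its standard deviation, the clamp at -0.9, and the function name `simulate_gibrat` are all assumptions of this sketch; 400 cities matches a 20 x 20 lattice, but no spatial interaction is modelled.

```python
import math
import random

def simulate_gibrat(n_cities=400, steps=1000, sigma=0.05, seed=1):
    """Grow cities independently: new size = (1 + eps) * old size, eps ~ N(0, sigma)."""
    rng = random.Random(seed)
    pops = [1.0] * n_cities
    for _ in range(steps):
        # The clamp keeps the growth factor positive (an assumption of this sketch).
        pops = [p * (1.0 + max(rng.gauss(0.0, sigma), -0.9)) for p in pops]
    return pops

pops = simulate_gibrat()
logs = [math.log(p) for p in pops]
mean_log = sum(logs) / len(logs)
sd_log = math.sqrt(sum((v - mean_log) ** 2 for v in logs) / len(logs))
total = sum(pops)
shares = sorted((p / total for p in pops), reverse=True)
print("mean and s.d. of log size:", round(mean_log, 2), round(sd_log, 2))
print("largest five population shares:", [round(s, 4) for s in shares[:5]])
```

Because log size is a sum of many small independent shocks, its spread grows with the number of steps, which is why the lognormal takes a long time to emerge from equal starting sizes.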


Year t   Correlation R2   Intercept Kt   P*_1t = 10^Kt   Slope
1        1                0              1               0
900      0.840            -1.077         0.083           -0.777
1000     0.844            -0.995         0.101           -0.824

Here’s a simulation which shows that the lognormal is generated with much the same properties as the observed data for UK

Note how long it takes for the lognormal to emerge, note also the switches in rank – too many I think for this to be realistic

[Figure: Log of Population Shares against Log of Rank for t = 900 and t = 1000, with the t = 1000 population plotted on the t = 900 ranks]


This is a good model to show the persistence of settlements, it is consistent with what we know about urban morphology in terms of fractal laws, but it is not spatial.

In fact to demonstrate how this model works let me run a short simulation based on independent events – cities on a 20 x 20 lattice using the Gibrat process – here it is


Other Stochastic Processes which have been used to explain scaling

1. The Simon model - birth processes are introduced

2. Multiplicative random growth with constraints on the lowest size - size is not allowed to become too small otherwise the event is removed: Solomon’s model; Sornette’s work

3. Work on growth rates consistent with scaling involving Levy distributions – Stanley’s work

4. Economic variants – Gabaix, Krugman, LSE group, Dutch group, Reed etc
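Item 2 in the list above can be sketched by adding a lower bound to multiplicative random growth. The version below is a simplification and not the cited authors' own formulation: a city that would fall below the floor is reset to the floor rather than removed. The function name, the floor value, and the growth parameters are hypothetical. Processes of this bounded kind are usually argued to push the upper tail toward a power function rather than a lognormal.

```python
import random

def simulate_with_floor(n_cities=400, steps=2000, sigma=0.1, floor=1.0, seed=2):
    """Multiplicative random growth in which no city may fall below `floor`."""
    rng = random.Random(seed)
    pops = [floor] * n_cities
    for _ in range(steps):
        grown = (p * (1.0 + max(rng.gauss(0.0, sigma), -0.9)) for p in pops)
        # Assumption: a city that would drop below the floor is reset to it
        # (a reflecting barrier) instead of being removed and replaced.
        pops = [max(p, floor) for p in grown]
    return sorted(pops, reverse=True)

ranked = simulate_with_floor()
at_floor = sum(1 for p in ranked if p == 1.0)
print("largest five sizes:", [round(p, 1) for p in ranked[:5]])
print("cities sitting on the floor:", at_floor)
```

The ranked sizes can then be plotted on log-log axes to compare the bounded process with the unbounded one.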


Dynamics of Rank-Size: Applications

We will now look again at countries and population change and then at penetration of telecoms devices by country

We have country data from 1980 to 2000, and telecoms data over the same period – we are interested in the dynamics – we can measure changes using the so-called Havlin plot defined as


$R_{jk} = \left[ \dfrac{\sum_i (r_{ij} - r_{ik})^2}{N} \right]^{1/2}$

This is the average difference in ranks over N cities or countries with respect to two time periods j and k, where $r_{ij}$ is the rank of place $i$ at time $j$.

So at each time we can plot a curve of differences away from that time in terms of every other time period.

This lets us identify big shifts in rank and thus unusual dynamics.
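The rank distance can be computed directly from two lists of sizes. This sketch assumes the root-mean-square form of the measure over N places; the helper names and the four-place populations are hypothetical.

```python
import math

def ranks(sizes):
    """Rank 1 is the largest; ties are broken by position in the list."""
    order = sorted(range(len(sizes)), key=lambda i: -sizes[i])
    r = [0] * len(sizes)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def rank_distance(sizes_j, sizes_k):
    """Root-mean-square difference in rank between two time periods."""
    rj, rk = ranks(sizes_j), ranks(sizes_k)
    n = len(rj)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(rj, rk)) / n)

# Hypothetical populations for four places at two dates.
pop_1980 = [100, 80, 60, 40]
pop_2000 = [90, 95, 30, 50]
d_same = rank_distance(pop_1980, pop_1980)
d_shift = rank_distance(pop_1980, pop_2000)
print(d_same, d_shift)
```

Here every place moves exactly one rank between the two dates, so the distance is 1.0, while comparing a period with itself gives 0.0.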


This is population of countries


And the average rank distances


This is the telecoms data


And the average rank distances


Some More Issues

Note the way systems grow in terms of the telecoms data

Note the fact that there is no connectivity at all in these systems

Let’s finish by looking at connectivity – how cities talk to each other – can we say anything at all about models that take such interactions into account? It’s another seminar, but let us sketch some ideas


Networks and Scaling

These are distributions where the events are unambiguous or less ambiguous - the distributions of links in and out of nodes defining networks have been shown to be scaling by many people over the last four years, notably by Barabasi and his Notre Dame group and by Huberman and his Xerox Parc (now HP) Internet Ecologies group

Here we take a look at the distribution of in-degrees and out-degrees formed by links relating to web pages - a web page is pretty unambiguous, and so is a link, unlike a city. This is some work that we did in 1999 at CASA.


This is based on some network data produced by Martin Dodge and Naru Shiode in CASA from their web crawlers


Number of Web Pages and Total Links - indegrees and outdegrees

These are taken from relevant searches of AltaVista for 180 domains in 1999

Note the notion of a system which is immature – in terms of the lognormal form


Number of Web Pages, Total Links, GDP and Total World Populations


As a general conclusion, it does not look as though the event size issue has much to do with the scaling or lack of it. We urgently need some work on spatial systems with fixed event areas, thus shifting the focus to densities not distributions

Distribution      Intercept log K   Slope -q   Correlation r2   P’(1)/P(1)
No. Web Pages     21.22             2.91       0.90             35.84
Total Links       18.60             1.60       0.92             1.35
Incoming Links    21.48             2.98       0.89             37.28
Outgoing Links    17.83             1.46       0.91             1.03
GDP               11.98             2.18       0.80             22.67
Population        23.39             2.00       0.72             12.64


Two regimes for the in-degrees and out-degrees

Distribution      Slope -q1 for upper ranks   Correlation r2 for upper ranks   Slope -q2 for lower ranks   Correlation r2 for lower ranks   w2q2 / w1q1
No. Web Pages     0.88                        0.97                             4.25                        0.98                             31.05
Total Links       0.86                        0.97                             2.07                        0.91                             15.47
Incoming Links    1.04                        0.98                             4.49                        0.97                             26.30
Outgoing Links    0.78                        0.97                             1.87                        0.88                             17.29
GDP               1.22                        0.99                             3.25                        0.80                             5.65
Population        1.01                        0.91                             2.80                        0.73                             1.31


The Key Issues: Where do we go from here?

Scaling can be shown to be consistent with more micro-based, hence richer, less parsimonious models; but there is a disjunction between work on spatial fractals, such as in our 1994 book Fractal Cities (Academic Press), and the rank size rule – it is very hard to know how to build consistent models that work at the spatial level and give fractal relations which translate into city size distributions


Resources

References, Papers, Web Resources

We will assemble a list and put these on a web site. I will put this PowerPoint on the China Data Center site, like Tom and John’s from last week, if I can penetrate the Chinese walls of EMU! Take a look at our web site, where at least the web paper can be downloaded from the publications section and some of the work on cyberspace is reported

http://www.casa.ucl.ac.uk/

http://www.cybergeography.org/

http://www.casa.ucl.ac.uk/citations/


Links as indegrees and outdegrees compared to the Total Links


Network Approaches to Scaling

Here we take a look at the distribution of indegrees and outdegrees formed by links relating to web pages - a web page is pretty unambiguous. There is a lot of work on this produced during the last three years, notably the Xerox Parc group & the Notre Dame group

let me start with some notions about graphs


As an introductory example, I will repeat what I say in the editorial I handed out on ‘small worlds’. You can read this later.

The term ‘small worlds’ was first ‘coined’ in psychology and sociology in the 1960s by Stanley Milgram but remained only a talking point for 30 years, largely because

1. There was no technical apparatus to measure connectivity in very large graphs - where you have, say, more than 1 million nodes

2. There was no real way in which one could handle processes taking place on graphs

3. There was not much thinking about how real graph structures evolved through time


All these points needed to be resolved before one could get anywhere and they are slowly being resolved.

An example of a small world - a kind of connectivity in graphs
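One way to see the small world effect is to add a few random long-range links to a regular ring of nodes and measure how the average number of steps between nodes falls. This is a sketch in the spirit of the Watts and Strogatz construction, not a reproduction of the example on the slide; the graph size, the number of shortcuts, and the function names are hypothetical.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Each node is linked to its k nearest neighbours on either side."""
    return {i: {(i + d) % n for d in range(-k, k + 1) if d != 0} for i in range(n)}

def add_shortcuts(graph, count, rng):
    """Add `count` links between randomly chosen pairs of nodes."""
    nodes = list(graph)
    for _ in range(count):
        a, b = rng.sample(nodes, 2)
        graph[a].add(b)
        graph[b].add(a)
    return graph

def mean_path_length(graph):
    """Average shortest-path length over all connected pairs, by breadth-first search."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

rng = random.Random(4)
regular_len = mean_path_length(ring_lattice(300, 2))
small_world_len = mean_path_length(add_shortcuts(ring_lattice(300, 2), 15, rng))
print(round(regular_len, 2), round(small_world_len, 2))
```

Only 15 extra links are added to 600 local ones, yet the typical separation drops sharply, which is the sense in which such a graph is a small world.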


Examples:

• Evolution of transport systems in big cities

• What makes small spaces in cities attractive and livable in

• Spread of disease - foot and mouth for example

• How social systems hold together

• Academic communities, like us

• Nervous systems, how particles interact, WWW etc


Some of the most interesting work is being done in virtual space - in cyberspace not in real space. Here is an example of such a network


The world wide web is a small world, as are most systems that don’t break apart under tension - think about cities that break apart - London currently, with the fact that no decent freeway system was built in the automobile age and the subway hasn’t been fixed for 50 years. Global cities are small worlds.

However, there is a much more general theory of networks being devised which examines regularity and processes in such structures. Recently it looks as though most stable networks are scale free - this means that when you examine their structure, there is no characteristic length scale - they are fractal - moreover, as they grow, they grow through positive feedback - dense clusters get denser - the rich get richer - again think of cities - in short, they do not grow randomly


On the left, a random graph, whose distribution of the numbers/density of links at each node is near normal - this has a characteristic length - the average

On the right, what is much more typical - a graph which is scaling - one whose distribution is rank size, following a power law

$P(k) \sim k^{-2.5}$
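The contrast between the two kinds of graph can be sketched by comparing degrees when links are placed at random with degrees when each new node attaches to an existing node in proportion to its current degree (the rich get richer). Both generators below are simplified, hypothetical versions written for illustration; the textbook form of the attachment rule is usually quoted with an exponent near 3 rather than the 2.5 shown here.

```python
import random
from collections import Counter

def random_graph_degrees(n, m_edges, rng):
    """Degrees when m_edges links are placed between uniformly chosen node pairs."""
    deg = [0] * n
    for _ in range(m_edges):
        a, b = rng.sample(range(n), 2)
        deg[a] += 1
        deg[b] += 1
    return deg

def preferential_attachment_degrees(n, rng):
    """Each new node links to one existing node chosen in proportion to its degree."""
    deg = [1, 1]         # start from two linked nodes
    endpoints = [0, 1]   # every node appears once per link it holds
    for new in range(2, n):
        target = rng.choice(endpoints)  # degree-proportional choice
        deg[target] += 1
        deg.append(1)
        endpoints += [target, new]
    return deg

rng = random.Random(3)
n = 5000
rand_deg = random_graph_degrees(n, n - 1, rng)
pref_deg = preferential_attachment_degrees(n, rng)
print("max degree, random graph:", max(rand_deg))
print("max degree, preferential attachment:", max(pref_deg))
print("lowest degree counts, preferential:", sorted(Counter(pref_deg).items())[:6])
```

Both graphs have the same number of links, so the average degree is the same; what differs is the tail, where the preferential rule produces a few very highly connected hubs and no typical scale.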


Not only does the topology of web pages follow power laws

so does the physical hardware - the routers and wires

This and the last diagram are taken from the article by Barabasi called “The Physics of the Web” printed in the July 2001 issue of Physics World


Here is some work that Steve Coast in our group at CASA is doing on detecting and measuring the hardware of the web and visualizing it - all this is prior to measuring its properties - i.e. is it scaling, is it a small world and so on

Challenge is to map real space onto cyberspace and that so far has not really been attempted in these new ideas about how network systems grow

This is the cluster of routers, and hubs and machines in UCL


Some more fancy visualizations of these networks


Some statistics from Steve’s work - which imply scale free networks

Lots and lots of issues here - we need models of how networks grow and form, how does the small world effect mesh into scale free networks? We need to map cyberspace onto real space and back, and this is no more than mapping social space onto real space and back - it’s not new. I will finish


Some references - Martin Dodge and Rob Kitchin’s new book

Steve Coast’s web site www.fractalus.com/steve/

Our web site www.casa.ucl.ac.uk

and drill down to get to Martin’s www.cybergeography.org


[Figure: Log of Population Shares against Log of Rank for 1901 and 1991, with the 1991 population plotted on the 1901 ranks]

Here is an example of the shift in size and ranks over the last 100 years in GB
