MTMS.02.056 Algorithmic Information Theory

What are random sequences?

Sven Laur, University of Tartu

Page 1:

MTMS.02.056 Algorithmic Information Theory

What are random sequences?

Sven Laur, University of Tartu

Page 2:

Historical perspective

Page 3:

Three dominant ways to view probability

[Diagram: Probability, Knowledge of the Long Run, and Fair Price form a cycle whose edges are labelled Frequentism, Bayesianism, and Mathematical Statistics.]

. Three notions in the graph form a vicious cycle.

. Depending on the starting point we get different interpretations.

. Each of them has its own application area and weaknesses.

MTMS.02.056 Algorithmic Information Theory, What are random sequences?, October 4, 2013 1

Page 4:

Approximate time-line

1713 J. Bernoulli: Ars Conjectandi
1764 T. Bayes: Bayes' theorem
1774 P. Laplace: Bayes' theorem
1810 P. Laplace: Central limit theorem
1834 A. Cournot: Finite frequentism
1900 K. Pearson: χ²-test
1919 R. von Mises: Kollektivs
1921 J. M. Keynes: Logical probability
1931 F. Ramsey: Subjective probability
1933 A. Kolmogorov: Grundbegriffe
1937 B. de Finetti: Coherence principle
1954 L. Savage: Subjective utility
1969 P. Martin-Löf: Random sequences

Kolmogorov's neat axiomatisation of probability as a measure set off the balance, and mathematical statistics quickly became a dominant school.

. It took decades for other interpretations to return.

. The resurrection of von Mises' theory of kollektivs is particularly interesting.

Page 5:

Kollektivs as a way to define randomness

1919: von Mises postulated what a random sequence is. Probability is a property of an infinite sequence. A sequence x ∈ {0, 1}^∞ is a collective if it satisfies the following conditions.

. The relative frequency has a limiting value p(x).

. For any admissible subsequence x′, the corresponding relative frequency must converge to p(x).

. A subsequence is admissible if it is chosen by a method that uses only the values x_1, . . . , x_i to decide whether to take x_{i+1} or not.
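As a toy illustration (a made-up rule, not from the slides), "select the next element whenever the last seen value is 1" is admissible, since it inspects only the values seen so far:

```python
def select_after_one(prefix):
    """Admissible rule: take the next element iff the last seen value is 1."""
    return len(prefix) > 0 and prefix[-1] == 1

def subsequence(x, rule):
    """Apply a place-selection rule: x_{i+1} is taken iff rule(x_1..x_i)."""
    return [x[i] for i in range(len(x)) if rule(x[:i])]

x = [1, 0, 1, 1, 0, 0, 1, 0]
print(subsequence(x, select_after_one))  # the elements that follow a 1
```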

Page 6:

Formal definition

A sequence x = (x_i)_{i=1}^∞ is a uniformly random collective if for any sequence of selection functions φ_n : {0, 1}^n → {0, 1}, the subsequence x′ = (x_i)_{i∈I} with I = {n + 1 : φ_n(x_1, . . . , x_n) = 1} has the limiting frequency

p(x′) = lim_{n→∞} (1/n) · Σ_{k=1}^n x_{i_k} = 1/2 .

Theorem (Kamke 1932). There are no uniformly random collectives.

Proof. Fix a sequence x and define φ_n(·) = x_{n+1}. Then (φ_n)_{n=1}^∞ is an admissible sequence of selection functions, but p(x′) = 1.
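On a finite prefix, Kamke's diagonal trick is easy to act out (a sketch; the sequence is made up): the selection rule is built from the fixed sequence itself, so it picks out exactly the ones.

```python
x = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1]   # any fixed binary sequence

def phi(n, prefix):
    """phi_n ignores its argument and answers x_{n+1} (0-based: x[n])."""
    return x[n]

selected = [x[n] for n in range(len(x)) if phi(n, x[:n])]
print(selected)  # all ones: the selected subsequence has frequency 1, not 1/2
```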

Page 7:

Mises-Wald-Church random sequences

Theorem (Wald 1936). Let us consider a countable set of place selection functions H = (A_ℓ : {0, 1}* → {0, 1})_{ℓ=1}^∞ which for any sequence (x_i)_{i=1}^∞ defines the set of selected elements as follows:

I_ℓ(x) = {n + 1 : A_ℓ(x_1 . . . x_n) = 1}

Then the set of sequences (x_i)_{i=1}^∞ such that

∀ℓ ∈ N : p((x_i)_{i∈I_ℓ}) = 1/2    (1)

is uncountable. If the infinite sequence is generated by flipping a fair coin, the outcome (x_i)_{i=1}^∞ will satisfy condition (1) with probability 1.
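The probability-1 claim invites a quick simulation (seeded and purely illustrative; the three rules are made up): a fair-coin sequence keeps frequency about 1/2 on the subsequences picked out by a few fixed admissible rules.

```python
import random

random.seed(0)                       # fixed seed so the run is reproducible
N = 100_000
x = [random.randint(0, 1) for _ in range(N)]

# A few fixed place-selection rules; rule(x, i) may inspect only x[0:i]
# when deciding whether to select x[i].
rules = {
    "after a one":     lambda x, i: i >= 1 and x[i - 1] == 1,
    "even positions":  lambda x, i: i % 2 == 0,
    "after two zeros": lambda x, i: i >= 2 and x[i - 2] == 0 and x[i - 1] == 0,
}

freqs = {}
for name, rule in rules.items():
    sub = [x[i] for i in range(N) if rule(x, i)]
    freqs[name] = sum(sub) / len(sub)
    print(f"{name}: {len(sub)} selected, frequency {freqs[name]:.3f}")
```

All three frequencies land close to 0.5; of course a finite simulation only illustrates the theorem, it does not prove it.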

Page 8:

Fréchet's objections

Theorem I (Ville 1936). For any countable set of selection functions H there exists a specific sequence (x_i)_{i=1}^∞ that passes the Mises-Wald criterion for randomness but for which the following condition holds:

∃k_0 ∈ N : ∀k > k_0 : (1/k) · Σ_{i=1}^k x_i ≥ 1/2 .

In other words, a gambler always wins if he or she plays long enough.

Theorem II (Ville 1936). For any countable set of selection functions H there exists a specific sequence (x_i)_{i=1}^∞ that passes the Mises-Wald criterion for randomness but for which the law of the iterated logarithm does not hold.

Page 9:

Mises-Wald-Church sequences:

Ville’s construction

Page 10:

Construction target

Let A_1, A_2, . . . , A_ℓ, . . . be a countable set of place selection functions and let φ(x) be a continuously increasing positive function such that

lim_{x→∞} φ(x)/x = 0 and lim_{x→∞} φ(x) = ∞ ,

i.e., φ(x) is a sub-linear function. Then there exists a sequence (x_i)_{i=1}^∞ such that for any ℓ ∈ N there exist constants α_ℓ, β_ℓ > 0 such that

∀n ∈ N : −α_ℓ/n ≤ (1/n) · Σ_{k=1}^n x_{i_k} − 1/2 < α_ℓ/n + β_ℓ · φ(n)/n

for I_ℓ = (i_1, i_2, . . . , i_k, . . .), and thus the sequence satisfies the Mises-Wald criterion.

Page 11:

Corresponding illustration

[Plot: the relative frequency along the selected subsequence stays between the bounds 1/2 − α_ℓ/n and 1/2 + α_ℓ/n + β_ℓ · φ(n)/n.]

For each selection function, the frequency converges asymmetrically to 1/2.

Page 12:

High-level description of the construction

Direct operation with the selection functions A_1, A_2, . . . , A_ℓ, . . . is troublesome. Instead, we construct a new set of selection functions B_1, B_2, . . . , B_ℓ, . . . such that for our sequence (x_i)_{i=1}^∞ we can express

A_ℓ = D_ℓ ∨ ⋁_{y∈{0,1}^{ℓ−1}} B_{y1}

where A ∨ B denotes the disjunction of selections and D_ℓ is a selection function that selects only a finite number of indices.

. As a result, it is sufficient to guarantee that (x_i)_{i=1}^∞ is random with respect to the selection functions B_1, B_2, . . . , B_ℓ, . . ..

. The specific form of the selection functions B_ℓ makes it easy to construct the sequence (x_i)_{i=1}^∞.

Page 13:

Logical operations with selection functions

Conjunction and disjunction of two selection functions:

∀n ∈ N : (A ∧ B)(x_1, . . . , x_n) = A(x_1, . . . , x_n) ∧ B(x_1, . . . , x_n)

∀n ∈ N : (A ∨ B)(x_1, . . . , x_n) = A(x_1, . . . , x_n) ∨ B(x_1, . . . , x_n)

Power operation with a Boolean constant c ∈ {0, 1}:

∀n ∈ N : (A^c)(x_1, . . . , x_n) = A(x_1, . . . , x_n) if c = 1, and ¬A(x_1, . . . , x_n) if c = 0.

Special clipping operator for m ∈ N:

A^(m)(x_1, . . . , x_n) = A(x_1, . . . , x_n) if Σ_{i=1}^n A(x_1, . . . , x_i) ≤ m, and 0 if Σ_{i=1}^n A(x_1, . . . , x_i) > m.
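In code, these operations might look as follows (an illustrative sketch; the function names are my own, and the clipping sum is read as counting selections up to and including the current decision, which matches the worked table a few slides below):

```python
def conj(A, B):
    """(A ∧ B): select only where both select."""
    return lambda p: A(p) & B(p)

def disj(A, B):
    """(A ∨ B): select where at least one selects."""
    return lambda p: A(p) | B(p)

def power(A, c):
    """A^c: A itself for c = 1, its pointwise negation for c = 0."""
    return lambda p: A(p) if c == 1 else 1 - A(p)

def clip(A, m, x):
    """A^(m) along the fixed sequence x: act like A until A has selected m indices."""
    def clipped(p):
        n = len(p)
        fired = sum(A(tuple(x[:i])) for i in range(n + 1))  # selections up to this decision
        return A(p) if fired <= m else 0
    return clipped

always = lambda p: 1
x = [0, 1, 0, 1, 0]
c2 = clip(always, 2, x)
print([c2(tuple(x[:n])) for n in range(5)])  # only the first two selections survive
```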

Page 14:

List of orthogonal selection functions

We will construct a set of selection functions C_1, C_2, . . . , C_ℓ, . . . such that

. a sequence element x_i can be chosen only by a single selection function;

. each selection function C_ℓ will select up to m_ℓ indices;

. for a particular (x_i)_{i=1}^∞ there always exists a C_ℓ that chooses x_i.

Page 15:

Double recursive definition of selection operators

Let (y_i)_{i=1}^∞ be an arbitrary binary sequence and (m_i)_{i=1}^∞ be a sequence of selection limits. Then (y_i)_{i=1}^∞ together with the initial selection functions A_1, A_2, . . . , A_ℓ, . . . defines a series of selection functions:

B_{y_1} = A_1^{y_1}
C_{y_1} = B_{y_1}^{(m_1)}

B_{y_1 y_2} = ¬C_{y_1} ∧ A_1^{y_1} ∧ A_2^{y_2}
C_{y_1 y_2} = B_{y_1 y_2}^{(m_2)}

. . .

B_{y_1 ... y_k} = ¬C_{y_1} ∧ . . . ∧ ¬C_{y_1 ... y_{k−1}} ∧ A_1^{y_1} ∧ . . . ∧ A_k^{y_k}
C_{y_1 ... y_k} = B_{y_1 ... y_k}^{(m_k)}

Page 16:

Corresponding illustration

Let the sequence of selection limits be m_1 = 1, m_2 = 2, m_3 = 2, . . ..

position   1 2 3 4 5 6 7 . . .
(x_i)      1 0 1 0 1 0 1 . . .

A_1        1 0 0 1 1 1 1 . . .
A_2        0 0 1 1 0 0 0 . . .
A_3        1 1 0 1 1 0 1 . . .

B_1        1 0 0 1 1 1 1 . . .
B_10       0 0 0 0 1 1 1 . . .
B_101      0 0 0 1 0 0 1 . . .

C_1        1 0 0 0 0 0 0 . . .
C_10       0 0 0 0 1 1 0 . . .
C_101      0 0 0 0 0 0 1 . . .
C_111      0 0 0 1 0 0 0 . . .

Page 17:

Properties of the construction

Lemma. Let (x_i)_{i=1}^∞ be a candidate sequence, let (y_i)_{i=1}^∞ and (z_i)_{i=1}^∞ be arbitrary binary sequences, and let (m_i)_{i=1}^∞ be a sequence of selection limits. Then two different selection functions C_{y_1...y_k} and C_{z_1...z_ℓ} cannot choose the same element x_i:

∀i ∈ N : ¬C_{y_1...y_k}(x_1, . . . , x_{i−1}) ∨ ¬C_{z_1...z_ℓ}(x_1, . . . , x_{i−1})

Proof.

. If there exists j ≤ k, ℓ such that y_j ≠ z_j, the conjunction terms A_j^{y_j} and A_j^{z_j} cannot be true at the same time.

. If y_1 . . . y_k is a proper prefix of z_1 . . . z_ℓ, then the conjunction term ¬C_{y_1...y_k} does the trick.
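The lemma can be exercised on a finite prefix (an illustrative sketch: the base functions A_k, the candidate sequence, and the limits m_k are all made up, and C_y = B_y^(m) is read as "B_y truncated to its first m_|y| selections"):

```python
from functools import lru_cache

x = (1, 0, 1, 1, 0, 0, 1, 0, 1, 1)   # toy candidate sequence
m = {1: 1, 2: 2, 3: 2}               # selection limits m_k
# arbitrary base place-selection functions A_1, A_2, A_3
A = {k: (lambda kk: lambda p: int((len(p) + sum(p) + kk) % 2 == 0))(k)
     for k in (1, 2, 3)}

def A_pow(k, c, p):
    """A_k^c: A_k itself when c = 1, its negation when c = 0."""
    return A[k](p) if c == 1 else 1 - A[k](p)

@lru_cache(maxsize=None)
def B(y, n):
    """B_y(x_1..x_n): no proper prefix of y may have fired, and all A_j^{y_j} must hold."""
    if any(C(y[:j], n) for j in range(1, len(y))):
        return 0
    p = x[:n]
    return int(all(A_pow(j, yj, p) for j, yj in enumerate(y, 1)))

@lru_cache(maxsize=None)
def C(y, n):
    """C_y = B_y^(m_|y|): keep only the first m_|y| selections of B_y."""
    if not B(y, n):
        return 0
    return int(sum(B(y, t) for t in range(n + 1)) <= m[len(y)])

# all binary strings y of length 1..3, as tuples
ys = [tuple(int(b) for b in f"{v:0{L}b}") for L in (1, 2, 3) for v in range(2 ** L)]
for n in range(len(x)):
    assert sum(C(y, n) for y in ys) <= 1   # no position is claimed twice
```

The final loop checks the lemma's claim for whatever A_k are plugged in: equal-length strings disagree in some A_j^{y_j}, and a longer string carries the ¬C term of its prefix.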

Page 18:

Properties of the construction

Lemma. Let (x_i)_{i=1}^∞ be a candidate sequence and let (m_i)_{i=1}^∞ be a sequence of selection limits with a diverging sum Σ_{i=1}^∞ m_i. Then for any index i there exists a finite prefix y_1 . . . y_ℓ such that C_{y_1...y_ℓ} chooses x_i:

C_{y_1...y_ℓ}(x_1, . . . , x_{i−1}) = 1

and the selection functions C_{y_1...y_k} corresponding to its proper prefixes select exactly m_k indices.

Proof.

. Let us choose y_k such that A_k^{y_k}(x_1, . . . , x_{i−1}) = 1.

. Now B_{y_1...y_k}(x_1, . . . , x_{i−1}) ≡ 1 as long as C_{y_1}, . . . , C_{y_1...y_{k−1}} evaluate to 0 on this prefix.

. As C_{y_1...y_k} can select up to m_k indices, the functions C_{y_1...y_{k−1}} cannot remain 0 forever.

Page 19:

Construction of the sequence

High-level goal. We have countably many selection functions (C_y)_{y∈{0,1}^+} defined in terms of A_1, A_2, . . . , A_ℓ, . . .. We have to find a sequence (x_i)_{i=1}^∞ that for these finite selection functions looks suitably random.

Let I_{y_1...y_k}(x_1, . . . , x_i) = {n + 1 : n < i ∧ C_{y_1...y_k}(x_1, . . . , x_n) = 1} be the set of indices selected by looking at the first i − 1 sequence elements. Then we would like to construct (x_i)_{i=1}^∞ such that for all i ∈ N and for all y ∈ {0, 1}^+:

|I_y(x_1, . . . , x_i)|/2 ≤ Σ_{j∈I_y(x_1,...,x_i)} x_j < |I_y(x_1, . . . , x_i)|/2 + 1

Page 20:

Inductive construction

Basis. Set x_1 = 1. Then the goal holds for all non-empty index sets I_y().

Induction step. Assume that the goal holds for the sequence prefix x_1, . . . , x_i and for all non-empty index sets I_{y_1...y_k}(x_1, . . . , x_i).

. As the membership of index i + 1 in I_{y_1...y_k}(x_1, . . . , x_i, x_{i+1}) does not depend on x_{i+1}, we can find out into which set I_y(x_1, . . . , x_i, x_{i+1}) the index i + 1 falls.

. If |I_y(x_1, . . . , x_i, x_{i+1})| = 1, then set x_{i+1} = 1.

. Otherwise we must fix x_{i+1} such that

(|I_y(x_1, . . . , x_i)| + 1)/2 ≤ Σ_{j∈I_y(x_1,...,x_i)} x_j + x_{i+1} < (|I_y(x_1, . . . , x_i)| + 1)/2 + 1

This is always possible.
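The induction step can be turned into a greedy loop (a sketch; the three selection rules are hypothetical and chosen to be pairwise disjoint, so the orthogonality assumption holds trivially):

```python
def build_sequence(selectors, N):
    """Greedily choose x_{i+1} so that every selected set I keeps
    |I|/2 <= (#ones in I) < |I|/2 + 1."""
    x, stats = [], [[0, 0] for _ in selectors]   # per rule: [size, ones]
    for _ in range(N):
        prefix = tuple(x)
        owners = [k for k, f in enumerate(selectors) if f(prefix)]
        assert len(owners) <= 1                  # orthogonality assumption
        if not owners:
            x.append(0)                          # unclaimed position: value is irrelevant
            continue
        s, o = stats[owners[0]]
        bit = 1 if o < (s + 1) / 2 else 0        # first element of a set gets a 1
        x.append(bit)
        stats[owners[0]] = [s + 1, o + bit]
    return x, stats

# disjoint toy rules: ownership decided by the prefix length mod 3
rules = [lambda p, r=r: len(p) % 3 == r for r in range(3)]
seq, stats = build_sequence(rules, 31)
for s, o in stats:
    assert s / 2 <= o < s / 2 + 1                # the invariant from the slide
```

Each selected set keeps its number of ones equal to ⌈|I|/2⌉, which is exactly the invariant above, and the first element of every set is a 1, matching the basis.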

Page 21:

The first claim about randomness

Theorem. The constructed sequence (x_i)_{i=1}^∞ is Mises-Wald random wrt the selection functions (B_y)_{y∈{0,1}^+}.

Proof. Let J_{y_1...y_k}(x_1, . . . , x_i) = {n + 1 : n < i ∧ B_y(x_1, . . . , x_n) = 1} be the set of selected indices wrt the functions (B_y)_{y∈{0,1}^+}. Then we can decompose this set into a disjoint union of sets I_{y_1...y_k...z_ℓ}(x_1, . . . , x_i).

Page 22:

Corresponding illustration

I_y

I_y0  I_y1

I_y00  I_y01  I_y10  I_y11

I_y000  I_y001  I_y010  I_y011  I_y100  I_y101  I_y110  I_y111

There are two types of non-empty sets: full sets, denoted by the olive colour, and incomplete sets, denoted by the red colour. We must make sure that the approximation is good, i.e., that the sets are large enough.

Page 23:

How big are the sets?

If the length of y is k and the length of the longest suffix that creates a non-empty set is ℓ, then the number of selected elements is bounded:

|J_{y_1...y_k}(x_1, . . . , x_i)| ≥ m_k + m_{k+1} + · · · + m_{k+ℓ−1}

|J_{y_1...y_k}(x_1, . . . , x_i)| ≤ m_k + 2m_{k+1} + · · · + 2^ℓ m_{k+ℓ}

The total number r of full (olive) and incomplete (red) sets is bounded by 2^{ℓ+1}.

Page 24:

How good is the approximation?

Let a_j denote the size of the jth non-empty selection set and b_j the number of ones in this selection set. Then by the construction

a_j/2 ≤ b_j < a_j/2 + 1

|J_{y_1...y_k}(x_1, . . . , x_i)|/2 ≤ Σ_{j=1}^r b_j < |J_{y_1...y_k}(x_1, . . . , x_i)|/2 + 2^{ℓ+1}

and we must choose the set of selection limits (m_i)_{i=1}^∞ so that in

1/2 ≤ (1/|J_{y_1...y_k}(x_1, . . . , x_i)|) · Σ_{j=1}^r b_j < 1/2 + 2^{ℓ+1}/|J_{y_1...y_k}(x_1, . . . , x_i)|

the last term converges to zero with the right speed.

Page 25:

What do we need?

We need at least that the right-hand side of the inequality goes to zero,

ε(n) = 2^{ℓ+1}/|J_{y_1...y_k}(x_1, . . . , x_i)| ≤ 2^{ℓ+1}/(m_k + · · · + m_{k+ℓ−1}) ,

to prove that (x_i)_{i=1}^∞ is Mises-Wald random wrt the functions (B_y)_{y∈{0,1}^+}.

This means that m_{k+ℓ} − m_k ≥ 2^{ℓ+2} for all k, ℓ ∈ N, i.e., m_k = Ω(2^k).

Page 26:

The second claim about randomness

Theorem. The constructed sequence (x_i)_{i=1}^∞ is Mises-Wald random wrt the selection functions (A_ℓ)_{ℓ=1}^∞.

Proof. Let K_ℓ(x_1, . . . , x_i) = {n + 1 : n < i ∧ A_ℓ(x_1, . . . , x_n) = 1}. As the following disjunction is always true,

A_ℓ = ⋁_{y∈{0,1}^{ℓ−1}} (⋀_{i=1}^{ℓ−1} A_i^{y_i} ∧ A_ℓ)

we get

A_ℓ(x_1, . . . , x_n) = ⋁_{y∈{0,1}^{ℓ−1}} B_{y1}(x_1, . . . , x_n) ∨ D(x_1, . . . , x_n)

where D selects only a finite number of indices.

Page 27:

Direct consequences

By disjointness of the selection functions (B_y)_{y∈{0,1}^+}, we can express K_ℓ(x_1, . . . , x_i) in terms of J_{y1}(x_1, . . . , x_i) where y ∈ {0, 1}^{ℓ−1}. Let a_y denote the size of J_{y1}(x_1, . . . , x_i) and b_y the number of ones in J_{y1}(x_1, . . . , x_i). Then we get

a_y/2 ≤ b_y < a_y/2 + a_y ε(a_y)

which yields

1/2 < (1/|K_ℓ(x_1, . . . , x_i)|) · Σ_{y∈{0,1}^{ℓ−1}} b_y < 1/2 + Σ_{y∈{0,1}^{ℓ−1}} a_y ε(a_y)/|K_ℓ(x_1, . . . , x_i)| + o(1)

and thus the last term must converge to zero with the right speed.

Page 28:

Final push

It turns out that if we choose the selection limits (m_i)_{i=1}^∞ large enough, we can assure that

ε(n) ≤ φ(n + φ(ℓ))/(n · 2^{ℓ−1})

and thus

Σ_{y∈{0,1}^{ℓ−1}} a_y ε(a_y)/|K_ℓ(x_1, . . . , x_i)| ≤ (1/|K_ℓ(x_1, . . . , x_i)|) · Σ_{y∈{0,1}^{ℓ−1}} φ(a_y + φ(ℓ))/2^{ℓ−1}

≤ φ(|K_ℓ(x_1, . . . , x_i)| + φ(ℓ))/|K_ℓ(x_1, . . . , x_i)| = φ(n + φ(ℓ))/n

Page 29:

Properties of constructed random sequence

Note that every index among 1, . . . , i belongs to one of the sets J_y(x_1, . . . , x_i) for y ∈ {0, 1}^n, since

⋁_{y∈{0,1}^n} (⋀_{i=1}^n A_i^{y_i}) ≡ 1 .

Note that by the construction the number of ones in these sets is always equal to or more than one half. Thus, we have proven

Σ_{j=1}^i x_j ≥ i/2 .

Page 30:

Mises-Wald-Church sequences

Definition. Let (A_ℓ)_{ℓ=1}^∞ be the set of all partially recursive functions. Then the sequence (x_i)_{i=1}^∞ is a Mises-Wald-Church sequence if the Mises-Wald criterion is satisfied.

Unpredictable sequences. For any algorithm B : {0, 1}* → {0, 1} we can compute its asymptotic accuracy as follows:

Adv_B(n) = (1/(n + 1)) · Σ_{i=0}^n [B(x_1, . . . , x_i) = x_{i+1}]

A sequence is unpredictable if Adv_B(n) → 1/2 for any algorithm B.

Corollary. Mises-Wald-Church sequences are unpredictable sequences.
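A small sketch of the accuracy measure (the majority predictor here is a made-up example, not from the slides): it is perfect on a constant sequence but useless against alternation, which is why unpredictability must quantify over all algorithms.

```python
def majority_predictor(prefix):
    """Predict the majority bit of the prefix (ties and the empty prefix give 1)."""
    return 1 if sum(prefix) * 2 >= len(prefix) else 0

def advantage(B, x):
    """Adv_B(n) for n = len(x) - 1: fraction of correct next-bit predictions."""
    hits = sum(B(x[:i]) == x[i] for i in range(len(x)))
    return hits / len(x)

ones = [1] * 100
alternating = [i % 2 for i in range(100)]
print(advantage(majority_predictor, ones))         # 1.0: fully predictable
print(advantage(majority_predictor, alternating))  # 0.0: this predictor always misses
```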
