
Lecture 31: Some Applications of Eigenvectors: Markov Chains and Chemical Reaction Systems

Winfried Just, Department of Mathematics, Ohio University

April 9–11, 2018

Winfried Just, Ohio University MATH3200, Lecture 31: Applications of Eigenvectors

Review: Eigenvectors and left eigenvectors

A nonzero column vector ~x is an eigenvector (aka right eigenvector) of a square matrix A with eigenvalue λ if

A~x = λ~x.

A nonzero row vector ~y is a left eigenvector of a square matrix A with eigenvalue λ if

~yA = λ~y. Note that ~y left-multiplies A here.

By Homework 91, ~x is a (right) eigenvector of A with eigenvalue λ if, and only if, ~y = ~x^T is a left eigenvector of A^T with the same eigenvalue λ.


Review: Markov chains

A Markov chain is a stochastic process.

Time proceeds in discrete steps t = 0, 1, 2, . . .

At each time t the process can only be in one of several states that are numbered 1, . . . , n.

The probability of being in a given state at time t + 1 depends only on the state at time t.

The matrix P = [pij]n×n gives the transition probabilities pij from state i at time t to state j at time t + 1.

When ~x(t) = [x1(t), . . . , xn(t)] is the probability distribution for the states at time t, then the probability distribution ~x(t + 1) at time t + 1 is given by

~x(t + 1) = ~x(t)P = [x1(t), . . . , xn(t)]P.
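The Markov property above can also be read as a recipe for simulating the chain one state at a time. The following is a minimal Python sketch, not from the lecture; the 2-state matrix and the name `step` are illustrative assumptions.

```python
import random

# A hypothetical 2-state transition matrix (each row sums to 1).
P = [[0.6, 0.4],
     [0.3, 0.7]]

def step(state, P, rng):
    # Sample the state at time t+1; by the Markov property it
    # depends only on the current state, via row `state` of P.
    r = rng.random()
    cumulative = 0.0
    for j, p in enumerate(P[state]):
        cumulative += p
        if r < cumulative:
            return j
    return len(P[state]) - 1  # guard against floating-point round-off

rng = random.Random(42)        # seeded for reproducibility
trajectory = [0]               # start in state 1 (index 0) for sure
for _ in range(10):
    trajectory.append(step(trajectory[-1], P, rng))
```

Each run with the same seed produces the same sample path of states.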


Review: Markov chains for weather.com light

Time proceeds in steps of days.

State 1: sunny day. State 2: rainy day.
Each day is somehow unambiguously classified in this way.

The meaning of the transition probabilities:

p11 is the probability that a sunny day is followed by another sunny day.
p12 is the probability that a sunny day is followed by a rainy day.
p21 is the probability that a rainy day is followed by a sunny day.
p22 is the probability that a rainy day is followed by another rainy day.

P = [ p11  p12 ]
    [ p21  p22 ]

~x(t) = [x1(t), x2(t)], where

x1(t) is the probability that day t will be a sunny day.
x2(t) is the probability that day t will be a rainy day.


An example of P for weather.com light

Let P = [ p11  p12 ]  =  [ 0.6  0.4 ]
        [ p21  p22 ]     [ 0.3  0.7 ]

A sunny day is followed by another sunny day with probability 0.6.

A sunny day is followed by a rainy day with probability 0.4.

A rainy day is followed by a sunny day with probability 0.3.

A rainy day is followed by another rainy day with probability 0.7.

P is a stochastic matrix, which means that each row adds up to 1.

This will be true for every transition probability matrix of a Markov chain, as each state i must be followed by some state in the next time step.


One-step transitions for our example of P

Let P = [ 0.6  0.4 ]
        [ 0.3  0.7 ]

Consider the following probability distributions for day t:

~x(t) = [1, 0] means that day t is sunny for sure.

~y(t) = [0.5, 0.5] means equal likelihood of a sunny or a rainy day.

Note that the probabilities of all states always add up to 1.

The corresponding probabilities for the next day are:

~x(t + 1) = [1, 0]P = [0.6, 0.4]

~y(t + 1) = [0.5, 0.5]P = [0.45, 0.55]
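These one-step products are easy to check in code. A minimal Python sketch (the name `push_forward` is illustrative, not from the lecture):

```python
def push_forward(x, P):
    """One step of a Markov chain: x(t+1) = x(t) P (row vector times matrix)."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

P = [[0.6, 0.4],
     [0.3, 0.7]]

x_next = push_forward([1.0, 0.0], P)   # a surely-sunny day; ≈ [0.6, 0.4]
y_next = push_forward([0.5, 0.5], P)   # a 50/50 day; ≈ [0.45, 0.55]
```

Note that each output is again a probability distribution: its entries sum to 1.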


The eigenvalues and an eigenvector for P of our example

Let P = [ 0.6  0.4 ]
        [ 0.3  0.7 ]

P − λI = [ 0.6 − λ   0.4     ]
         [ 0.3       0.7 − λ ]

det(P − λI) = λ² − 1.3λ + 0.3 = (1 − λ)(0.3 − λ).

The eigenvalues are λ1 = 1 and λ2 = 0.3.

Let’s find an eigenvector with eigenvalue 1:

Form P − 1I = [ 0.6 − 1   0.4     ]  =  [ −0.4   0.4 ]
              [ 0.3       0.7 − 1 ]     [  0.3  −0.3 ]

Solve [ −0.4   0.4 ] [x1]  =  [0]
      [  0.3  −0.3 ] [x2]     [0]

−0.4x1 + 0.4x2 = 0
 0.3x1 − 0.3x2 = 0

By setting x1 = 1, we see that ~x = [1, 1]^T is an eigenvector with eigenvalue 1 of P.
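The eigenpair can be verified by direct multiplication. A quick Python check (illustrative, not part of the slides):

```python
def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

P = [[0.6, 0.4],
     [0.3, 0.7]]

v = [1.0, 1.0]
Pv = matvec(P, v)   # should equal 1 · v = [1, 1], since each row of P sums to 1
```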


The meaning of the eigenvector [1, 1]T

Eigenvectors with eigenvalues λ 6= 1 are less important fortransition matrices of Markov chains, so we will skip finding aeigenvector with eigenvalue λ2 = 0.3 in our example. But we willtake a closer look at the eigenvector [1, 1]T with eigenvalue λ1 = 1.

Let A = [aij ]n×n be any square matrix. Then [1, 1, . . . , 1]T is aneigenvector of A with eigenvalue λ if, and only if,a11 a12 . . . a1na21 a22 . . . a2n

......

...an1 an2 . . . ann

11...1

=

a11 + a12 + · · ·+ a1na21 + a22 + · · ·+ a2n

...an1 + an2 + · · ·+ ann

=

λλ...λ

Thus [1, 1, . . . , 1]T is an eigenvector of A with eigenvalue λ if, andonly if, each row of A adds up to λ.

In particular, [1, 1, . . . , 1]T is an eigenvector of A with eigenvalue 1if, and only if, A is a stochastic matrix.


We have proved a theorem ...

The observation on the previous slide proves parts (a) and (b) ofthe following result:

Theorem

Let P = [pij]n×n be the matrix of transition probabilities for a Markov chain. Then

(a) λ∗ = 1 is an eigenvalue of P.

(b) [1, 1, . . . , 1]^T is an eigenvector of P with eigenvalue 1.

(c) Every eigenvalue λ of P satisfies |λ| ≤ |λ∗| = 1.

Part (c) is a consequence of a more general result, the Perron-Frobenius Theorem, which goes beyond the scope of this course. This part says that λ∗ = 1 is a so-called leading eigenvalue of P.
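For a 2 × 2 matrix, part (c) can be checked by hand from the characteristic polynomial λ² − tr(P)λ + det(P). A Python sketch, under the assumption that the roots are real (which holds for our weather example):

```python
import math

def eigenvalues_2x2(A):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial
    λ² − tr(A)λ + det(A) = 0 (real roots assumed for this sketch)."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = math.sqrt(tr * tr - 4 * det)
    return ((tr + disc) / 2, (tr - disc) / 2)

P = [[0.6, 0.4],
     [0.3, 0.7]]

lam1, lam2 = eigenvalues_2x2(P)   # ≈ 1.0 and ≈ 0.3; both satisfy |λ| ≤ 1
```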


How about the eigenvectors of P^T?

Since every square matrix has the same eigenvalues as its transpose, λ∗ = 1 must also be an eigenvalue of P^T. Let’s find a corresponding eigenvector for our example of P:

Form P^T − 1I = [ 0.6  0.3 ] − [ 1  0 ]  =  [ −0.4   0.3 ]
                [ 0.4  0.7 ]   [ 0  1 ]     [  0.4  −0.3 ]

Solve [ −0.4   0.3 ] [x1]  =  [0]
      [  0.4  −0.3 ] [x2]     [0]

−0.4x1 + 0.3x2 = 0
 0.4x1 − 0.3x2 = 0

We find that every vector of the form ~x = [x1, (4/3)x1]^T is an eigenvector with eigenvalue 1 of P^T. These are the only eigenvectors with eigenvalue 1 of P^T.

Here it will be useful to find the eigenvector ~x = [x1, x2]^T with

x1 + x2 = x1 + (4/3)x1 = (7/3)x1 = 1.

It is ~x = [3/7, 4/7]^T ≈ [0.4286, 0.5714]^T.
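For a two-state chain, the computation above can be packaged into a closed form: solving ~xP = ~x together with x1 + x2 = 1 gives x∗ = [p21, p12]/(p12 + p21). A Python sketch (the function name is illustrative, not from the lecture):

```python
def stationary_2x2(P):
    """Stationary distribution of a 2-state chain: solve x P = x with
    x1 + x2 = 1, which gives x* = [p21, p12] / (p12 + p21)."""
    p12, p21 = P[0][1], P[1][0]
    s = p12 + p21
    return [p21 / s, p12 / s]

P = [[0.6, 0.4],
     [0.3, 0.7]]

x_star = stationary_2x2(P)   # ≈ [3/7, 4/7] ≈ [0.4286, 0.5714]

# Stationarity check: x* P should reproduce x*.
xP = [x_star[0] * P[0][0] + x_star[1] * P[1][0],
      x_star[0] * P[0][1] + x_star[1] * P[1][1]]
```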


The meaning of the eigenvector [0.4286, 0.5714]^T of P^T

By the result of Homework 91, the vector ([0.4286, 0.5714]^T)^T = [0.4286, 0.5714] is a left eigenvector of P.

Moreover, since the coordinates add up to 1 and are nonnegative, [0.4286, 0.5714] is a probability distribution.

It follows that if the probability distribution of the weather in our example on day t is

~x(t) = [0.4286, 0.5714],

then the probability distribution of the weather on day t + 1 is

~x(t + 1) = [0.4286, 0.5714]P = [0.4286, 0.5714].

~x∗ = [0.4286, 0.5714] is a stationary (probability) distribution, which means that it remains the same on the next and all future days.

In fact, ~x∗ = [0.4286, 0.5714] is the only stationary distribution in this example.


These observations generalize

Theorem

Let P be the transition probability matrix of a Markov chain with n states and let ~x∗ = [x∗1, x∗2, . . . , x∗n] be a probability distribution.

(a) ~x∗ is a stationary distribution for this Markov chain if, and only if, ~x∗ is a left eigenvector with eigenvalue 1 of P.

(b) There exists at least one stationary distribution ~x∗ of the Markov chain.

(c) If ~x∗ is the only stationary distribution of the Markov chain, then for any given initial distribution ~x(0), the distributions ~x(t) always approach ~x∗ as t → ∞.

Point (b) follows from point (a) and the previous theorem.

Note also that in point (c) it is necessary that ~x∗ is unique, because when we start in one stationary distribution ~y∗ then we cannot approach another stationary distribution ~x∗.
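Point (c) can be illustrated numerically for the weather example: iterating ~x(t + 1) = ~x(t)P from any starting distribution drives ~x(t) toward the stationary distribution. A minimal Python sketch:

```python
def push_forward(x, P):
    """One step of a Markov chain: x(t+1) = x(t) P."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

P = [[0.6, 0.4],
     [0.3, 0.7]]

x = [1.0, 0.0]            # start surely in state 1
for t in range(50):       # iterate x(t+1) = x(t) P
    x = push_forward(x, P)
# x is now very close to the stationary distribution [3/7, 4/7]
# computed on the earlier slides.
```

The error shrinks like |λ2|^t = 0.3^t, so 50 steps is far more than enough here.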


Some alternative versions of weather.com light

Let us consider some other transition probability matrices P for weather.com light Markov chains.

Homework 93:

(a) Let P1 = I2 (the weather always stays the same). Show that in this case every probability distribution ~x = [x1, x2] is a stationary distribution.

(b) Let P2 = [ 0  1 ]
             [ 1  0 ]

Show that in this case ~x∗ = [0.5, 0.5] is the unique stationary distribution.

(c) Find a third transition probability matrix P3 with stationary distribution ~x∗ = [0.5, 0.5].

(d) Formulate a condition on P that appears to guarantee that ~x∗ = [0.5, 0.5] is a stationary distribution and prove that it does.


Remember Waldo?

Waldo is highly gregarious and motivated and spends all of his evenings working with six students on his MATH 3200 homework. At 7 p.m. he visits a randomly chosen student i among those six, and then operates as follows:

He starts working with i. After 10 minutes, he flips a fair coin.

If the coin comes up heads, he continues working with i for another 10 minutes before flipping the coin again.

If the coin comes up tails, he moves to the room of a randomly chosen friend of i and repeats the procedure.

He never tires of these efforts until 1 a.m.

Where should we go looking for Waldo at midnight?


The stationary distribution for Waldo

Waldo’s itinerary can be modeled as a Markov chain with states i = 1, 2, . . . , 6, where one time step lasts 10 minutes. State i simply means that Waldo is in i’s room.

The transition probability matrix for this Markov chain is

    [ 1/2   0    0   1/4   0   1/4 ]
    [  0   1/2   0   1/4   0   1/4 ]
P = [  0    0   1/2  1/4  1/4   0  ]  =  [pij]6×6
    [ 1/8  1/8  1/8  1/2   0   1/8 ]
    [  0    0   1/2   0   1/2   0  ]
    [ 1/6  1/6   0   1/6   0   1/2 ]

Homework 94: Show that this Markov chain has a unique stationary probability distribution and find it.
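Before attempting Homework 94, it is worth confirming that this P really is a transition probability matrix. A quick Python check using exact fractions (this only verifies stochasticity; it does not solve the homework):

```python
from fractions import Fraction as F

# Waldo's 6x6 transition matrix, entered with exact fractions.
P = [
    [F(1, 2), F(0),    F(0),    F(1, 4), F(0),    F(1, 4)],
    [F(0),    F(1, 2), F(0),    F(1, 4), F(0),    F(1, 4)],
    [F(0),    F(0),    F(1, 2), F(1, 4), F(1, 4), F(0)],
    [F(1, 8), F(1, 8), F(1, 8), F(1, 2), F(0),    F(1, 8)],
    [F(0),    F(0),    F(1, 2), F(0),    F(1, 2), F(0)],
    [F(1, 6), F(1, 6), F(0),    F(1, 6), F(0),    F(1, 2)],
]

row_sums = [sum(row) for row in P]   # each row of a stochastic matrix sums to exactly 1
```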


Eigenvectors with eigenvalue 0 and the nullspace of A

Let A be a square matrix. Let N(A) denote the set of all eigenvectors of A with eigenvalue 0, together with the zero vector ~0. It is the nullspace of A. (A nullspace can be defined for any matrix A, but only for square matrices in terms of eigenvectors.)

Proposition

(a) N(A) has a nonzero element ~x ≠ ~0 if, and only if, A is singular.

(b) N(A) is the set of all solutions ~x of the homogeneous system A~x = ~0.

(c) If ~a1, ~a2, . . . , ~an denote the column vectors of A, then N(A) is the set of all vectors [x1, x2, . . . , xn]^T of coefficients such that

x1~a1 + x2~a2 + · · · + xn~an = ~0.

(d) N(A) is the set of all vectors ~x such that TA(~x) = ~0. N(A) is also called the kernel of TA.
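Part (b) gives a direct way to test membership in N(A). A minimal Python sketch with a hypothetical singular matrix, chosen only for illustration (it is not from the lecture):

```python
def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

# A hypothetical singular 2x2 matrix: its second row is twice the first.
A = [[1.0, 2.0],
     [2.0, 4.0]]

x = [2.0, -1.0]          # a nonzero solution of A x = 0
Ax = matvec(A, x)        # → [0.0, 0.0], so x ∈ N(A): an eigenvector with eigenvalue 0
```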


Review: Chemical reaction networks; net change of concentrations

Consider a chemical reaction network like:

A + 2B ⇌ 2C     A + 2C ⇌ 2D
A + B ⇌ D       B + D ⇌ 2C

If initial concentrations are denoted by [A]0, [B]0, [C]0, [D]0 and concentrations are measured again after some time and denoted by [A]1, [B]1, [C]1, [D]1, then the vector

~w = [[A]1 − [A]0, [B]1 − [B]0, [C]1 − [C]0, [D]1 − [D]0]

represents the net change in concentrations.

If some coordinate [X]1 − [X]0 is positive, then a net production of compound X was observed; if some coordinate [X]1 − [X]0 is negative, then a net consumption of compound X was observed.


Review: Chemical reaction networks; reaction vectors and stoichiometric matrix

The reaction vectors of the chemical reaction network

1  A + 2B ⇌ 2C
2  A + 2C ⇌ 2D
3  A + B ⇌ D
4  B + D ⇌ 2C

are

~v1 = [−1, −2, 2, 0]^T    ~v2 = [−1, 0, −2, 2]^T    ~v3 = [−1, −1, 0, 1]^T    ~v4 = [0, −1, 2, −1]^T

They represent the net changes in concentrations if only one reaction occurs and consumes one mole of its first reactant.

They can be written as the columns of the stoichiometric matrix S.
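Assembling S from the reaction vectors is mechanical: the vectors become the columns. A Python sketch (variable names are illustrative):

```python
# Reaction vectors from the slide, ordered (A, B, C, D).
v1 = [-1, -2, 2, 0]     # A + 2B <-> 2C
v2 = [-1, 0, -2, 2]     # A + 2C <-> 2D
v3 = [-1, -1, 0, 1]     # A + B <-> D
v4 = [0, -1, 2, -1]     # B + D <-> 2C

columns = [v1, v2, v3, v4]
# Transpose the list of columns into the rows of S (one row per species).
S = [[col[i] for col in columns] for i in range(4)]
```

Row i of S then records how every reaction changes the concentration of species i.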


Review: The linear transformation TS

If we let ~k = [k1, k2, k3, k4]^T be the column vector of average net rates at which the reactions occur over a given time interval, then the matrix product

S~k = ~w

gives us the net change in concentrations.

Positive values ki > 0 signify that the forward reaction dominates; negative values ki < 0 signify that the backward reaction dominates.

When ~k = ~0 over arbitrarily short time intervals, then each reaction is at equilibrium.

When S~k = ~0 over arbitrarily short time intervals, then no observable change occurs and the system is at equilibrium.

The nullspace N(S) is the set of all rate vectors ~k where the system is at equilibrium.
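The product S~k = ~w can be computed directly. A Python sketch with a hypothetical rate vector, chosen only for illustration (only reaction 1 running at unit net forward rate):

```python
def matvec(S, k):
    """Net concentration changes w = S k for a rate vector k."""
    return [sum(S[i][j] * k[j] for j in range(len(k))) for i in range(len(S))]

# Stoichiometric matrix from the slides (columns = reaction vectors v1..v4).
S = [[-1, -1, -1,  0],
     [-2,  0, -1, -1],
     [ 2, -2,  0,  2],
     [ 0,  2,  1, -1]]

k = [1.0, 0.0, 0.0, 0.0]      # hypothetical: only reaction 1 runs, at unit rate
w = matvec(S, k)              # → [-1.0, -2.0, 2.0, 0.0], i.e. the reaction vector v1
```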


The rank of the stoichiometric matrix S

Recall the result of Group Work 6:

Proposition

Suppose S represents a stoichiometric matrix of order m × n for n reactions between m chemical species in a closed reaction system (without net inflow, net outflow, or contributions from or to other reactions).

Then r(S) < m.

It follows that if m = n, then S is singular, so that it has at least one eigenvector with eigenvalue 0.

Each such eigenvector represents a vector of reaction rates where the system is at equilibrium, but at least one reaction is not at equilibrium.
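For the four-species network above, r(S) can be computed exactly with rational Gaussian elimination, confirming r(S) < m. A Python sketch (the rank routine is a generic textbook implementation, not from the lecture):

```python
from fractions import Fraction

def rank(M):
    """Rank by Gaussian elimination over the rationals (exact arithmetic)."""
    M = [[Fraction(x) for x in row] for row in M]
    r, rows, cols = 0, len(M), len(M[0])
    for c in range(cols):
        # Find a pivot in column c at or below row r.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        # Clear column c in every other row.
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# Stoichiometric matrix of the four-reaction network (columns = v1..v4).
S = [[-1, -1, -1,  0],
     [-2,  0, -1, -1],
     [ 2, -2,  0,  2],
     [ 0,  2,  1, -1]]

r_S = rank(S)   # r(S) < 4, so the square matrix S is indeed singular
```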


Homework problems

Homework 95: Let S be a stoichiometric matrix of order n × n, and let ~k be an eigenvector with eigenvalue 0 for S. Show that ~k must have at least 2 nonzero coordinates.

Homework 96: Let S be a stoichiometric matrix for the chemical reaction network

A + 2B ⇌ 2C     A + 2C ⇌ 2D
A + B ⇌ D       B + D ⇌ 2C

Find the set of all eigenvectors of S with eigenvalue 0.
