origin of life on earth and shannon’s


  • 8/10/2019 Origin of Life on Earth and Shannons


    H.P. Yockey / Computers & Chemistry 24 (2000) 105-123

    1 Introduction

    Socrates (The Republic, Book VI, p. 745) had noted, in an earlier conversation with Glaucon, that students of geometry and reckoning first set up postulates appropriate to each branch of science, treating them as known absolute assumptions, taking it for granted that they are obvious to everybody. Thus, as Socrates taught us, the essence of science is measuring, counting and weighing together with reasoning from postulates or axioms. When science attempts to proceed from qualitative arguments, there is a danger that we may be looking at Rorschach inkblots, so to speak, and seeing what we want to see. The more discussions can be made quantitative and avoid ad hoc explanations, the better theoretical biology is served. Only by expressing our knowledge in numbers and employing calculation and reasoning can we separate fact from the tricks of illusion in determining whether any proposal on the nature, origin and evolution of life can be retained or discarded. This breaks molecular biology out of sophisticated Kipling's Just So Stories into the quantitative mode used by natural scientists (Wolynes, 1998).

    Charles Darwin (1809-1882) and almost all naturalists of his time believed in the blending theory of genetics (Jenkin, 1867). However, Gregor Johann Mendel (1822-1884) published a paper in 1865 (Mendel, 1865) that proved inherited characteristics are segregated and do not blend. It was not until 1930 that Fisher (Fisher, 1930) showed that the blending theory is incorrect and that natural selection proceeds according to the particulate laws of Gregor Mendel. The next important discovery in genetics was that the mechanism of inheritance is linear. The moment of truth came in 1953, when Watson and Crick discovered that the formation of biomolecules is controlled by the sequences of nucleotides recorded in the double helix of DNA (or RNA in some viruses). Thus the genetic information system is segregated, linear and digital. It is astonishing, furthermore, that the technology of information theory and coding theory, just now being discovered, has been in place in biology for at least 3.850 billion years. Information theory and coding theory and their tools of measuring information in sequences are essential to understanding the crucial questions of the nature and origin of life.

    The development of a human being is guided by just 750 megabytes of digital information (Olson, 1995). It could be stored on a single CD-ROM in the biologist's personal computer. The genome in principle contains all the information necessary to bridge the gap between genotype and phenotype. All information necessary to determine the three-dimensional structure of proteins lies in the form of amino acid sequences, yet we remain unable to predict their tertiary structures (Huynen and Bork, 1998).
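
As a rough check on the 750 megabyte figure, the arithmetic can be sketched as follows. The genome length of about 3 x 10^9 nucleotide pairs is an assumption that the text does not state; the two bits per nucleotide follow from the four-letter alphabet of DNA, and a megabyte is taken here as 10^6 bytes.

```python
import math

# Assumption (not given in the text): a haploid human genome of roughly
# 3 x 10^9 nucleotide pairs.
GENOME_LENGTH = 3_000_000_000
ALPHABET_SIZE = 4  # A, C, G, T

bits_per_nucleotide = math.log2(ALPHABET_SIZE)   # 2 bits per letter
total_bits = GENOME_LENGTH * bits_per_nucleotide
total_megabytes = total_bits / 8 / 1_000_000     # 8 bits per byte, 10^6 bytes per megabyte

print(f"{total_megabytes:.0f} megabytes")
```

Under these assumptions the estimate agrees with the figure quoted from Olson (1995); a different genome length or a binary megabyte would shift it somewhat.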

    Chemical or physical reactions in non-living matter are not controlled by a message. If the genetical processes were purely chemical, as mechanist-reductionists believe, the law of mass action and thermodynamics would govern the placement of amino acids in the protein sequences according to their concentration. Therefore, if a scenario for the origin of life is to be acceptable, it must show how a genetic message was generated and attained a minimum threshold of complexity (which is measurable in bits and bytes) in a molecule characteristic of life and needed for the assimilation of carbon dioxide and nitrogen by the protobiont.

    First, how large an information content does a genetic message require to be characteristic of life? Second, do the laws of physics and chemistry have enough information content so that non-living matter may organize itself and become living matter, say in the manner that crystals are formed? Computer users are familiar with the rather large memory requirements, measured in bytes, of complicated programs and the capacities of their hard drives. The same applies to programs that purport to generate the genome of the protobiont.

    2 The mathematical theory of sequences

    I shall therefore approach the subject from the mathematical theory of sequences and the codes between sequences. First we must establish a mathematical meaning and a measure of the information in a message. Shannon (1948) established information theory as a mathematical discipline in his classic paper and pointed out, in the second paragraph:

    The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering (biological) problem. The significant aspect is that the actual message is selected from a set of possible messages. The system must be designed to operate for each possible selection, not just the one which will be actually chosen since this is unknown at the time of design.

    Similarly, the genetic logic system must be capable of accommodating the genetic messages of all organisms that have ever lived, live now or will be evolved in the future. The DNA sequences that make up the genome of any organism are selected from a set of possible genetic messages. Thus the biological information system or genetic logic system is independent of the specificity or meaning of the genetic message in the genome. Likewise, communication systems will process meaningless noise as well as a play by Sophocles.

    To some people it may seem counter-intuitive that the information in a sequence or a message can be measured, while there is no measure for meaning, knowledge, erudition or specificity. We have accepted this as true in our daily life in many ways. As computers, hard drives, modems and other devices that measure information in bits and bytes are becoming increasingly commonplace, one becomes aware that information and meaning are separate concepts.

    The meaning, if any, of words, that is, a sequence of letters, is arbitrary. It is determined by the natural language and is not a property of the letters or their arrangement. For example, the English word 'hell' means bright in German, 'fern' means far, 'Gift' means poison, 'bald' means soon, 'Boot' means boat, 'singe' means sing. In French 'pain' means bread, 'ballot' means a bundle, 'coin' means a corner or a wedge, 'chair' means flesh, 'cent' means hundred, 'son' means his, 'tire' means a pull, 'ton' means your. This confusion of meaning goes as far as sentences. For example, 'O singe fort' has no meaning as a sentence in English, although each is an English word, yet in German it means 'O sing on' and in French it means 'O strong monkey.' However, while the meaning of a sequence of letters is arbitrary, the meaning or specificity of a sequence of amino acids composing a protein is not arbitrary, but rather is determined by nature.

    In formal probability theory, one must establish the probability sample space $\Omega$ consisting of the set $A$ of events $x_i$ under discussion, which are random variables with the corresponding probabilities $p_i$. The set of probabilities $p_i$ forms a probability vector $\mathbf{p}$. The probability sample space is designated $(\Omega, A, \mathbf{p})$. All messages are formed from a sequence of the members of a finite alphabet. For example, the primary alphabet used in electronic communication is [0, 1]; in molecular biology the alphabets are the nucleotide bases in DNA and RNA and the amino acids in protein. The members of these alphabets are the events in the probability sample space.

    Gilbert (1987) and Gilbert et al. (1997) calculated the number of possible sequences in polypeptide chains of length N by the following expression:

    $20^N$    (1)

    Eq. (1) gives the total number of sequences we must be concerned with only if all events are equally probable. However, many events in general and amino acids in particular do not have the same probability. Does neglecting that fact affect our analysis? Shannon (1948) addressed this problem as follows: Let us consider a long sequence of N symbols selected from an alphabet of A letters or events. In the present case the letters or events will be the alphabet of either codons or amino acids. With high probability the sequence will contain $N p(i)$ occurrences of the $i$th symbol. Let P be the probability of the sequence. Then:

    $P = \prod_i p(i)^{N p(i)}$    (2)

    Taking the logarithm of both sides:

    $\log_2 P = N \sum_i p(i) \log_2 p(i)$    (3)

    $\log_2 P = N \sum_i p(i) \log_2 p(i) = -NH$    (4)

    where

    $H = -\sum_i p(i) \log_2 p(i)$    (5)

    H is the information or Shannon entropy of the probability space containing the events i. Accordingly, the probability of a sequence of N symbols or events is approximately:

    $P = 2^{-NH}$    (6)

    The number of sequences of length N is approximately:

    $2^{NH}$    (7)

    Notice that the expression for H was not introduced ad hoc; rather it comes out of the woodwork, so to speak.

    Let us apply Eq. (7) to calculating the number of sequences in 100 throws of a dice game where the probabilities of all events are known exactly and are not all equal. For a given throw the probability of a 2 or of a 12 is 1/36 each, whereas the probability of a 7 is 6/36, because there are six ways to roll a 7 and only one way to roll a 2 or a 12. Substituting the probabilities of each of the possible events in Eq. (5) to calculate H and then applying Eq. (7), one finds that the number of sequences is only $2.69 \times 10^{-6}$ of that calculated from Gilbert's expression (Eq. (1)), which for the 11 possible totals is $11^N$.
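
The dice calculation can be reproduced with a short sketch. It assumes two fair six-sided dice and N = 100 throws, as in the text; the function and variable names are illustrative only.

```python
import math
from fractions import Fraction
from itertools import product

# Probability of each total (2..12) when two fair dice are thrown.
counts = {}
for a, b in product(range(1, 7), repeat=2):
    counts[a + b] = counts.get(a + b, 0) + 1
probs = {total: Fraction(n, 36) for total, n in counts.items()}

def shannon_entropy(probabilities):
    """Eq. (5): H = -sum p(i) log2 p(i), in bits per symbol."""
    return -sum(float(p) * math.log2(float(p)) for p in probabilities if p > 0)

N = 100                                  # number of throws
H = shannon_entropy(probs.values())      # bits per throw
log2_typical = N * H                     # log2 of 2^(NH), Eq. (7)
log2_all = N * math.log2(len(probs))     # log2 of 11^N, Eq. (1) with 11 letters
ratio = 2.0 ** (log2_typical - log2_all)

print(f"H = {H:.4f} bits per throw")
print(f"2^(NH) / 11^N = {ratio:.3g}")
```

The entropy comes out at roughly 3.27 bits per throw, against log2(11), about 3.46 bits, for eleven equally probable totals. Over 100 throws this difference gives a ratio close to the value quoted above. The ratio is formed from logarithms so that the two very large counts never have to be held as floating-point numbers.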

    We have approached the calculation of the number of sequences of length N in two apparently correct ways and the question arises as to what happened to the sequences left out of the total possible by the second method. This is explained by the Shannon-McMillan-Breiman theorem (Shannon, 1948; McMillan, 1953; Breiman, 1957):

    For sequences of length N being sufficiently long, all sequences being chosen from an alphabet of A symbols, the ensemble of sequences can be divided into two groups such that:

    1. The probability P of any sequence in the first group is equal to $2^{-NH}$.

    2. The sum of the probabilities of all sequences in the second group is less than $\varepsilon$, a very small number.

    The Shannon-McMillan-Breiman theorem tells us that the number of sequences in the first or high probability group is $2^{NH}$ and they are all nearly equally probable. We can ignore all those in the second or low probability group because, if N is large, their total probability is very small.
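
The division into two groups can be illustrated numerically. The sketch below is not taken from the paper: it assumes a memoryless binary source with P(1) = 0.1, a length N = 1000 and a tolerance delta = 0.1 bit per symbol, and it counts a sequence as belonging to the high probability group when its probability lies between 2^(-N(H + delta)) and 2^(-N(H - delta)).

```python
import math

# Illustrative assumptions, not taken from the text: a memoryless binary
# source with P(1) = 0.1, sequence length N = 1000, tolerance delta = 0.1 bit.
p_one, N, delta = 0.1, 1000, 0.1
H = -(p_one * math.log2(p_one) + (1 - p_one) * math.log2(1 - p_one))

high_group_mass = 0.0    # total probability of the high probability group
high_group_count = 0     # number of sequences in that group
for k in range(N + 1):   # k = number of ones in the sequence
    # log2 of the probability of any single sequence containing k ones
    log2_p = k * math.log2(p_one) + (N - k) * math.log2(1 - p_one)
    if abs(-log2_p / N - H) <= delta:
        n_sequences = math.comb(N, k)
        high_group_count += n_sequences
        high_group_mass += 2.0 ** (math.log2(n_sequences) + log2_p)

print(f"H = {H:.3f} bits per symbol")
print(f"probability of the high probability group: {high_group_mass:.4f}")
print(f"log2(number of sequences in it): {math.log2(high_group_count):.1f}")
print(f"log2(number of all sequences):   {N}")
```

With these parameters the high probability group carries nearly all of the probability, although it contains only a vanishing fraction of the 2^1000 possible sequences. Shrinking delta while increasing N brings the count closer to the 2^(NH) of the theorem.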


    The expression for the measure of information that appears in Shannon (1948) is:

    $H = -K \sum_i p_i \log p_i$    (8)

    where $p_i$ is the probability of the $i$th member of the alphabet. Shannon was quick to explain that K is a positive constant that 'merely amounts to a choice of a unit of measure'. K appears from the mathematical argument that justifies Eq. (8). It has no physical dimensions, and when the logarithm is taken to the base 2 and K = 1, H is measured in bits, a contraction of binary digit. Like all messages, the life message is non-material but has a measurable information content. Of course, the genetic message, although non-material, must be recorded in matter or energy. In this paper, the term information has the meaning given in Eq. (8). It does not mean knowledge, although a message composed of a sequence of symbols may transfer knowledge to the receiver of the message.

    The expression given in Eq. (8) is very similar to that for entropy in statistical mechanics. As I explained in Yockey (1992), entropy is a general term and must be defined as a function of the probability space and the probability distribution of the elements or events to which it refers. As Hamming (1986) wrote:

    For us entropy is simply a function of a probability distribution $p_i$ ... The confusion at this point has been very great for outsiders who glance at information theory.

    Every probability sample space has an entropy. For example, the probability sample space in classical statistical mechanics, called phase space by theoretical physicists, is six-dimensional and the $p_i$ are defined by the position and momentum vectors of the particles in the ensemble. The function for entropy in both classical statistical mechanics and quantum statistical mechanics has the dimensions of the Boltzmann constant k and has to do with energy and momentum, not information. However, entropy in information theory and probability theory has no mechanical dimensions. There are no counterparts in communication theory to temperature, energy, pressure, work or volume. There is, furthermore, no counterpart to the First Law of Thermodynamics, namely, the conservation of the energy of a system. To illustrate this point further, one may consider the probability space of a dice game that consists of the numbers 2 through 12 as random variables and calculate the corresponding entropy (Yockey, 1992, see exercise on page 88). Clearly, the entropy of a dice game has nothing whatever to do with statistical mechanics or thermodynamics. It may have something to do with information theory since a sequence of symbols selected from the alphabet is generated as a stochastic process by a series of tosses of the dice. Such a sequence of letters forms a message in which some gamblers find meaning or knowledge by which they make their bets. On the other hand, information theory is concerned with messages expressed in sequences of letters selected from a finite alphabet by a Markov process. The probability sample space is constructed of the letters of the alphabet under consideration as random variables and the $p_i$ are defined accordingly.

    3 The extensions of an alphabet: definition of the byte

    A binary alphabet consisting of zero and one, each containing one bit of information, is used as the source alphabet of computers and all communication technology. The receiving alphabets in all applications have more than two letters. The binary source alphabet must be extended by forming pairs, triplets, quadruplets and so forth to form receiving alphabets larger than two. These alphabets are called the first, second, third and so forth extensions of the primary binary alphabet. The letters of these extensions are called bytes.

    Packaged items in stores have a bar code attached that permits the cashier to record the price of the item. The Postal Service in the United States has established a ZIP + 4 bar code in which mailing addresses can be written. The alphabets of the postal and other bar codes are composed of short bars and long bars, that is, the source alphabet is binary. There are 32 ($2^5 = 32$) members of the fifth extension of the source binary bar code alphabet. Thus the postal bar code has a five bit byte. An important receiving alphabet is one in which the ten numbers from 0 to 9 each have two ones or two zeros. The source probability space alphabet $(\Omega, A, \mathbf{p})$ is mapped on the receiver probability space alphabet $(\Omega, B, \mathbf{p})$. The Postal Service has arbitrarily assigned each of the ten members of the fifth extension that have two ones to one of the ten digits in the decimal system. The ten members selected from the fifth extension are called sense code letters. The assignment of sense code letters in the fifth extension alphabet of the postal ZIP + 4 code is shown by the first row in Table 1.

    The other code letters are called non-sense because they have been given no sense or meaning assignment in the receiving alphabet. (Remember that non-sense does not mean nonsense or foolishness.) I have listed those non-sense code letters of the fifth extension of the binary code with just one change from the sense code letters in each column, five rows down. Notice that the postal ZIP + 4 code is an error detecting code. A single error does not change one sense code letter to another sense code letter. If, due to smudging or other malfunction, the sorting machine reads a non-sense code letter, the mail piece is rejected to be examined by a postal employee.
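
The error detecting property can be checked by brute force. The sketch below copies the ten sense code letters from the first row of Table 1; the helper names `flip` and `read_letter` are illustrative and not part of any postal specification.

```python
from itertools import product

# Sense code letters for the digits 0-9, copied from the first row of Table 1.
SENSE = {
    "11000": 0, "00011": 1, "00101": 2, "00110": 3, "01001": 4,
    "01010": 5, "01100": 6, "10001": 7, "10010": 8, "10100": 9,
}

def flip(word, position):
    """Return the word with the bit at `position` inverted (a single read error)."""
    bit = "1" if word[position] == "0" else "0"
    return word[:position] + bit + word[position + 1:]

def read_letter(word):
    """Decode one five-bit letter; None means a non-sense letter (reject the mail piece)."""
    return SENSE.get(word)

all_letters = ["".join(bits) for bits in product("01", repeat=5)]   # 2^5 = 32 letters
two_ones = [w for w in all_letters if w.count("1") == 2]

# Apply every possible single error to every sense letter.
single_errors = [flip(w, i) for w in SENSE for i in range(5)]
undetected = [w for w in single_errors if read_letter(w) is not None]

print(len(all_letters), len(two_ones), len(undetected))   # prints: 32 10 0
```

Every sense letter contains exactly two ones, so a single flipped bit leaves one or three ones and can never produce another sense letter. Two errors in one letter can go undetected, which is why the code detects errors but cannot correct them.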

    Table 1
    The postal ZIP + 4 code

    Digit        0      1      2      3      4      5      6      7      8      9
    Sense        11000  00011  00101  00110  01001  01010  01100  10001  10010  10100
    Non-sense    01000  10011  10101  10110  11001  11010  11100  00001  00010  00100
    Non-sense    10000  01011  01101  01110  00001  00010  00100  11001  11010  11100
    Non-sense    11100  00111  00001  00010  01101  01110  01000  10101  10110  10000
    Non-sense    11010  00001  00111  00100  01011  01000  01110  10011  10000  10110
    Non-sense    11001  00010  00100  00111  01000  01011  01101  10000  10011  10101

    Computer equipment uses the seventh extension, the ASCII [A(merican) S(tandard) C(ode for) I(nformation) I(nterchange)] computer code. The byte is 7 bits, the amount of information in one printed character in the receiver alphabet. There are $2^7 = 128$ members of the seventh extension alphabet. Members of this seventh extension of the primary alphabet are assigned arbitrarily to the letters in the receiving alphabet of the word processor. This assignment is appropriate to the language for which the word processor is designed.

    4 The genetic code

    In another corner of the groves of academe, remote from the activities of mathematicians and communication engineers, after the discoveries of Watson and Crick in 1953, it was realized that there must be a code that performs a mapping from the sequences of the four nucleotides in DNA to the sequences of the 20 amino acids in protein. George Gamow made the first suggestion of an overlapping code. It was soon proved to be incorrect, but the search was on. It is easy to see that overlapping codes impose severe restrictions on the allowed amino acid sequences. Since it was clear that there are 64 possible triplets and only 20 amino acids, it was believed that the genetic code is 'degenerate' and that some codons must be 'nonsense'. Both terms are unnecessarily pejorative and non-constructive. The first sentence of Sonneborn (1965) expresses the attitude of the time: "Part I of this paper deals with the amount of degeneracy in the genetic code or, looked at in reverse, the amount of nonsense." Unfortunately, the use of the word 'nonsense' persists in some current publications (Maeshiro and Kimura, 1998).

    The apartheid in the groves of academe between mathematicians, electrical engineers and molecular biologists prevented the notion of block codes, initiators and enders from being applied to the understanding of the genetic code and the genetic logic system. For example, there are three different frames in which a sequence of nucleotides in DNA can be read. Crick et al. (1957) assumed that there must be 'commas' to separate a string of nucleotides forming codons to prevent them from being read out of frame. They attempted to show that the maximum number of sense codons cannot be greater than 20 and gave a solution for the 20. Unfortunately they proved (sic) that the triplet AAA and other triplets must be nonsense (sic). Table 2 shows that AAA codes for lysine, CCC codes for proline, UUU codes for phenylalanine and GGG codes for glycine. Furthermore, a tRNA species possesses an anticodon complementary to UGA, one of the nonsense codons, which codes for selenocysteine, thus making 21, not 20 amino acids (Hawkes and Tappel, 1983; Sunde and Evenson, 1987; Leinfelder et al., 1988; Mizutani and Hitaka, 1988). The use of an initiator codon and codons to terminate the sequence and the realization that the genetic code is a block code, like the bar codes discussed above, makes it unnecessary to assume that commas are needed.

    The source alphabet of the genetic code is the four nucleotides of DNA and mRNA. Thus each nucleotide has a two-bit byte. The first extension of that quaternary alphabet has sixteen letters. That is not sufficient to code the canonical 20 amino acids that are translated into protein; accordingly, Nature has gone to the second extension 64-letter alphabet. Thus the genetic code has a six-bit byte, usually called a codon. Sometimes the codon is called a word, but clearly this is inappropriate. The codon is a letter in the second extension of the mRNA alphabet; the protein is a word. The genetic code, shown in Table 2, shares a number of properties with the postal ZIP + 4 code and the ASCII computer codes. The genetic code is a block code because all codons are triplets. The genetic code is distinct and uniquely decodable because the single methionine codon AUG, and sometimes the leucine codons UUG and CUG, serve as a starting signal for the protein sequence and perform the same function as the long frame bar at the beginning of the postal message in the ZIP + 4 code. The non-sense codons UGA, UAA and UAG stop the translation of the protein from the mRNA and initiate the release of the protein sequence from the mRNA (Maeshiro and Kimura, 1998). They perform the same function as the long frame bar at the end of the postal bar code message.
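
The counting argument for the extensions of the quaternary alphabet can be sketched in a few lines. The sketch follows the paper's naming, in which doublets form the first extension and triplets the second; the variable names are illustrative.

```python
import math
from itertools import product

NUCLEOTIDES = "UCAG"      # source alphabet of mRNA
AMINO_ACIDS = 20          # canonical amino acids to be encoded

sizes = {}
for length, name in [(1, "single nucleotides"), (2, "doublets"), (3, "triplets (codons)")]:
    letters = ["".join(t) for t in product(NUCLEOTIDES, repeat=length)]
    sizes[length] = len(letters)
    bits = math.log2(len(letters))            # information per letter, in bits
    enough = len(letters) >= AMINO_ACIDS      # can every amino acid get a letter?
    print(f"{name}: {len(letters)} letters, {bits:.0f} bits each, "
          f"enough for 20 amino acids: {enough}")
```

The doublet alphabet, with 16 letters of four bits each, falls short of 20, whereas the triplet alphabet, with 64 letters of six bits each, leaves room for several codons per amino acid and for terminators.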


    Table 2
    The mRNA genetic code

    Amino acid   Triplet codons     Amino acid      Triplet codons   Amino acid      Triplet codons
    Glycine      GGG GGC GGU GGA    Phenylalanine   UUU UUC          Leucine         UUA UUG
    Proline      CCG CCC CCU CCA    Cysteine        UGU UGC          Tryptophan      UGG
                                                                     Nonsense        UGA
    Leucine      CUG CUC CUU CUA    Glutamine       CAA CAG          Histidine       CAU CAC
    Arginine     CGG CGC CGU CGA    Asparagine      AAU AAC          Lysine          AAA AAG
    Threonine    ACG ACC ACU ACA    Glutamic acid   GAA GAG          Aspartic acid   GAU GAC
    Valine       GUG GUC GUU GUA    Isoleucine      AUU AUC AUA      Methionine      AUG
    Alanine      GCG GCC GCU GCA    Nonsense        UAA UAG          Tyrosine        UAU UAC
    Serine       UCG UCC UCU UCA    Arginine        AGA AGG
                                    Serine          AGU AGC

    5 The origin and evolution of the genetic code

    The Standard Genetic Code as represented in Table 2 was once thought to be universal in all biology; indeed, in the earlier papers it was sometimes referred to as the Universal Genetic Code. Crick (1968) proposed that the genetic code was a frozen accident. It has been learned since that the genetic code is not universal, nor is it fixed or frozen. Placing this problem in the hands of the fickle goddess Fortuna begs the question. There has been considerable speculation about the order in the genetic code. Any small group of chance events seems to have order. This is the gambler's ruin. As the ancient Greeks and Romans presumed to see the future in the flight of birds, the gambler sees a pattern or order in the fall of the dice or the cards. Fortuna, the goddess of chance, may allow him a brief episode of luck but inevitably she turns against him to his ruin (Feller, 1957).

    Jukes (1966, 1983) suggested that the second extension triplet genetic code may have evolved from a preceding first extension doublet code. Jukes' archetypal genetic code had 16 codons that translated 14 or 15 amino acids together with a stop codon (Jukes, 1966). The codons of all amino acids except methionine and tryptophan have some redundancy in the third place in the codon; this lends some support to that suggestion. That redundancy provides some protection from mutational error. I have given a detailed discussion of the consequences of the evolution of the standard and mitochondrial genetic codes from an archetypal doublet code in which the third nucleotide is silent (Yockey, 1992).
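
The redundancy in the third place can be checked against the standard code. In the sketch below the 64-letter string of one-letter amino acid symbols is the standard code written out in the codon order UUU, UUC, UUA, UUG, UCU, ..., a compact convention chosen here for brevity rather than taken from the paper; '*' marks the three terminator codons.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in the order UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

codons = ["".join(c) for c in product(BASES, repeat=3)]
CODE = dict(zip(codons, AMINO_ACIDS))

# Several-to-one mapping: collect the codons assigned to each amino acid.
codons_for = defaultdict(list)
for codon, amino_acid in CODE.items():
    codons_for[amino_acid].append(codon)
single_codon = sorted(aa for aa, cs in codons_for.items() if len(cs) == 1)

# Doublets whose amino acid is already fixed whatever the third nucleotide is.
by_doublet = defaultdict(set)
for codon, amino_acid in CODE.items():
    by_doublet[codon[:2]].add(amino_acid)
silent_third = sorted(d for d, aas in by_doublet.items() if len(aas) == 1)

print("amino acids with a single codon:", single_codon)
print("doublets with a silent third nucleotide:", silent_third)
```

Only methionine (M) and tryptophan (W) have a single codon, in agreement with the statement above. In eight of the sixteen doublets the third nucleotide is silent in the standard code.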

    Crick (1968) pointed out, correctly, that any change in the code would cause errors throughout the process of protein formation, resulting in a disruption that would be lethal. However, Jukes (1993) noted that codon use varies considerably and that in some organisms certain codons fall out of use altogether. He suggested that after a time such codons had been reassigned or captured. For example, in vertebrate mitochondria AUA codes for methionine and UGA for tryptophan. The most recent studies of codons that code for amino acids that differ from those in the Standard Genetic Code in Table 2 are in Osawa (1995), Osawa et al. (1990, 1992) and Maeshiro and Kimura (1998).

    6 The Central Dogma of Francis Crick

    With the appearance of the experimental knowledge
    about the transfer of information from the four letter
    alphabet of DNA and of mRNA, the question emerged
    among molecular biologists of how a four letter alphabet
    could send information to a 20 letter alphabet. Crick
    (1968), in a remarkably prescient paper, suggested that
    information could be transferred from DNA to RNA
    and from RNA to protein, but not from protein to
    protein. He called this by the somewhat ecclesiastical
    name The Central Dogma of molecular biology (Crick,
    1970). The reason for these statements, as we have seen
    above, is that Nature had extended the primary four
    letter alphabet to the six-bit, 64-codon alphabet of the
    genetic code. Thus the genetic code is a several-to-one
    code. Note also that the dice game is a several-to-one
    code and for that reason has a Central Dogma. The
    source alphabet has 36 members, namely, all pairs of the
    numbers 1 through 6. These pairs of numbers are
    analogous to codons in the genetic code. The receiver
    alphabet is formed by adding the two numbers and has
    only 11 members, the numbers 2 through 12. These
    numbers are analogous to the amino acids in protein
    sequences. Except for 2 and 12, all totals can be made
    by more than one combination of numbers read from the
    dice. Thus there is a loss of information (and knowledge)
    when the numbers are added because the logic operation
    lacks a single-valued inverse. This is known as a logic
    ADD gate and is irreversible. Thus the Central Dogma
    is a theorem in coding theory and the logic of the genetic
    communication system. It does not come from
    biochemistry. In the full glory of its mathematical
    generality it applies to all codes where the information
    entropy of the source alphabet is larger than that of the
    receiver alphabet (Kolmogorov, 1958; Billingsley, 1965;
    Shields, 1973; Ornstein, 1974; Yockey, 1992).
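    The dice game described above can be enumerated directly.
    The following is a minimal sketch, not taken from the
    paper; the variable names are illustrative. It lists which
    ordered pairs collapse onto each total and confirms that
    only the totals 2 and 12 have a single-valued inverse.

```python
from collections import defaultdict
from itertools import product

# Source alphabet: all 36 ordered pairs read from two dice.
source_alphabet = list(product(range(1, 7), repeat=2))

# Receiver alphabet: the totals 2..12. Group the pairs by total to see
# which source symbols collapse onto the same received symbol.
preimages = defaultdict(list)
for first, second in source_alphabet:
    preimages[first + second].append((first, second))

# A total can be decoded uniquely only if exactly one pair produces it.
invertible_totals = sorted(t for t, pairs in preimages.items() if len(pairs) == 1)

for total in sorted(preimages):
    print(total, len(preimages[total]), preimages[total])
print("uniquely decodable totals:", invertible_totals)
```

    Every other total has between two and six possible
    sources, so the pair that was thrown cannot be recovered
    from the sum alone.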

    It is obvious that if the source and receiver alphabets
    have the same number of symbols, and a one-to-one
    correspondence between the members of the alphabets,
    the logic operation has a single-valued inverse and
    information may be passed, without loss, in either
    direction. Thus, since RNA and DNA both have a
    four-letter alphabet, genetic messages may be passed
    from RNA to DNA. The passage of information from
    protein to RNA or DNA is prohibited for two reasons:
    (1) the logic OR gate is irreversible, and (2) there is not
    enough information in the 20 letter protein alphabet to
    determine the 64 letter mRNA alphabet from which it is
    translated (Yockey, 1992, 1995a). There is no
    mathematical restriction of the transfer of information
    from protein to protein. This was once thought to be the
    means by which prions were formed but that is no longer
    believed to be the case. There appears to be no biological
    mechanism by which information can be exchanged
    between two proteins.
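    The contrast between a one-to-one code and a several-to-one
    code can be sketched as follows. This is an illustration
    under stated assumptions: the table is the standard genetic
    code, the DNA/RNA map is simplified to a letter substitution
    (T for U) that ignores strand complementarity, and the
    function names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated with the first base varying
# slowest in the order U, C, A, G; '*' marks a termination codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def dna_to_rna(dna):
    """One-to-one letter map (T -> U); a simplification for illustration."""
    return dna.replace("T", "U")

def rna_to_dna(rna):
    """Exact inverse of dna_to_rna."""
    return rna.replace("U", "T")

def translate(mrna):
    """Several-to-one map from codons to amino acids."""
    return "".join(CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3))

def back_translations(protein):
    """Number of distinct mRNA sequences that translate to `protein`."""
    count = 1
    for aa in protein:
        count *= sum(1 for v in CODON_TABLE.values() if v == aa)
    return count

dna = "ATGCTGTGG"
mrna = dna_to_rna(dna)
protein = translate(mrna)
print(rna_to_dna(mrna) == dna)               # the four-letter map round-trips
print(protein, back_translations(protein))   # the protein does not fix its mRNA
```

    The three-residue peptide in the example already has six
    candidate messages, because leucine is coded by six codons.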

    7. The genetic code and error in the genetic message

    Let us refer to the second paragraph in Shannon's 1948
    paper where he wrote:

        The fundamental problem of communication is that of
        reproducing at one point either exactly or approximately
        a message selected at another point.

    Shannon modeled the generation of a message as a
    Markov process. At all stages of the communication
    system the message may be acted upon by a second
    Markov process leading to an interchange in some of the
    letters in the message in a random and non-reproducible
    fashion. The result of this process is called noise.
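    A minimal simulation of noise acting on a message is
    sketched below. It is not from the paper and it simplifies
    the description above: the source letters are drawn
    independently (a zero-order Markov process), the noise is
    memoryless, and the alphabet, the parameter name alpha and
    its value 0.01 are assumptions chosen for illustration.

```python
import random

ALPHABET = "ACGU"

def transmit(message, alpha, rng):
    """Pass each letter through a memoryless channel: with probability
    alpha the letter is replaced by one of the other three letters."""
    received = []
    for letter in message:
        if rng.random() < alpha:
            letter = rng.choice([c for c in ALPHABET if c != letter])
        received.append(letter)
    return "".join(received)

rng = random.Random(0)  # fixed seed so the run is repeatable
message = "".join(rng.choice(ALPHABET) for _ in range(10_000))
received = transmit(message, alpha=0.01, rng=rng)

# Fraction of positions where the received letter differs from the sent one.
error_fraction = sum(a != b for a, b in zip(message, received)) / len(message)
print(f"observed error fraction: {error_fraction:.4f}")
```

    The receiver sees only the second string; which letters
    were interchanged is not recorded anywhere in it.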

    Fig. 1 (from Information Theory and Molecular Biology
    (1992), Cambridge University Press, with permission)
    shows the basic elements of a communications system as
    presented by Shannon. In Fig. 2 (from Information Theory
    and Molecular Biology, Cambridge University Press, with
    permission) we see the communication of information
    from the genome as recorded in DNA to the message that
    determines the protein. The sequence of nucleotides that
    compose DNA plays the same role as the tape in a tape
    recording machine. The protein molecule, which is the
    destination of the message, is also a tape. Thus the
    one-dimensional genetic message is recorded in a sequence
    of amino acids, which folds up to become a
    three-dimensional active protein molecule. One is
    reminded of the linear sequence of signals from a
    television station that, because of the raster, defines a
    two-dimensional object, namely, the picture on the screen.

    [Fig. 1 diagram not reproduced. Block labels: source tape,
    source code, encoder (transmitter), channel, channel code,
    noise, decoder (receiver), destination tape.]

    Fig. 1. The transmission of information from source to
    destination as conceived in electrical engineering. Noise
    occurs in all stages but is shown according to accepted
    practice in electrical engineering.

    [Fig. 2 diagram not reproduced. Block labels: source code,
    channel code, destination code; DNA tape (source tape),
    message in DNA code, transcription (encoding, DNA code to
    mRNA code), RNA polymerase, mRNA message in RNA code
    (channel), genetic noise, translation (decoding), tRNA,
    amino acids, aminoacyl synthetases, mischarged tRNA,
    message in protein code, protein tape (destination).]

    Fig. 2. The transmission of genetic messages from the DNA
    to the protein tape as conceived in molecular biology.
    Genetic noise occurs in all stages but is lumped in the
    figure to fix the idea.

    A reading error that afflicts reading the message in
    the postal code, and indeed, all codes, is called genetic
    noise in Fig. 2. There is very little error detection or
    correction attributed to the structure of the genetic
    code. Nevertheless, the genetic information system is
    extremely accurate. The error frequency in the first
    aminoacylation of an amino acid on its adapter tRNA
    is about $3 \times 10^{-4}$. It is a curious phenomenon that
    there is a proofreading or double sieving mechanism
    that checks this process by discarding 'wrong' amino
    acids. It also has an error frequency of about
    $3 \times 10^{-4}$. The error frequency after the operation of
    both mechanisms is the product
    $3 \times 10^{-4} \times 3 \times 10^{-4} = 9 \times 10^{-8}$
    (Freist et al., 1998).
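    The arithmetic above amounts to multiplying the two error
    frequencies, which presumes that the two mechanisms fail
    independently. That independence is an assumption made
    explicit here, and the variable names are illustrative.

```python
from math import isclose

initial_error = 3e-4       # error frequency of the first aminoacylation step
proofreading_error = 3e-4  # error frequency of the proofreading (double-sieve) step

# Assumption: the two steps fail independently, so the frequencies multiply.
combined_error = initial_error * proofreading_error
print(f"combined error frequency: {combined_error:.1e}")
```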

    8. Shannon's Channel Capacity Theorem and the role
    of error in protein function and specificity

    Shannon's Channel Capacity Theorem proved that
    codes exist such that sufficient redundance can be
    introduced so that a message can be sent from source to
    receiver with as few errors as may be specified. The
    theorem is not constructive so one must use one's
    ingenuity to construct suitable codes. Error detecting
    and correction codes are formed by going to higher
    extensions and using the redundance of the extended
    symbols for error detection and correction (Hamming,
    1986). The intense academic and industrial interest in
    the improvements in communication promised by
    Shannon's Channel Capacity Theorem led to the
    discovery by mathematicians and electrical engineers of a
    number of error correcting codes.
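    As one concrete instance of such a code, the sketch below
    implements the textbook (7,4) Hamming code, which adds
    three redundant parity bits to four data bits and corrects
    any single flipped bit. The paper cites Hamming (1986) but
    does not single out this code; it is chosen here only as a
    familiar example, and the function names are illustrative.

```python
def encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword (positions 1..7, with
    parity bits at positions 1, 2 and 4)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def decode(codeword):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # 0 means no error was detected
    if position:
        c[position - 1] ^= 1          # flip the offending bit back
    return [c[2], c[4], c[5], c[6]]

word = [1, 0, 1, 1]
garbled = encode(word)
garbled[5] ^= 1                       # noise flips one bit in transit
print(decode(garbled) == word)
```

    The redundance costs three extra symbols per four, and it
    buys correction of one error per block; two errors in the
    same block are decoded wrongly.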

    Biologists, working in another part of the groves of
    academe, confronted the problem of error in the
    formation of protein from the sequence of nucleotides in
    DNA. Although they had the idea of error accumulation
    and its effect on the viability of organisms, regrettably
    they attempted to consider the problem in terms of the
    errors themselves (Orgel, 1963, 1970). This led many of
    them to speculate that an 'error catastrophe' would
    occur inevitably as errors accumulated, leading to the
    death of the organism. Eigen (1971, 1992) and Eigen and
    Schuster (1977) also address the question of the effect
    of errors in the formation of proteins from
    considerations of the errors themselves. These authors
    find an 'error threshold' to apply to the transfer of
    information from DNA through mRNA to protein.

    Some error can be tolerated in the process of protein
    formation. Specific protein molecules having amino
    acids that differ from those coded for in DNA may
    have full specificity if the mutation is to a functionally
    equivalent amino acid. Protein molecules rendered
    inactive because of mischarged amino acids that are not
    functionally equivalent are degraded by proteolytic
    enzymes. It is only when the supply of essential proteins
    decays below a critical level that protein error becomes
    lethal. Although other considerations on the aging
    process have been made (Holliday, 1986), the accuracy of
    protein biosynthesis is still of topical interest (Freist et
    al., 1998).

    Since Shannon regarded the generation of a message
    to be a Markov process, it was natural to measure the
    effect of errors due to noise by the conditional entropy,
    $H(x|y)$, between the source probability space
    $(\Omega, A, p)$ with an input alphabet $A$ with elements
    $x_i$ and the receiving probability space $(\Omega, B, p)$
    with receiving alphabet $B$ with elements $y_j$. The
    conditional probability matrix $P$ with matrix elements
    $p(i|j)$ gives the probability that if letter $y_j$
    appears at the receiver that letter $x_i$ was sent. The
    probabilities $p_i$ in $(\Omega, A, p)$ and $p_j$ in
    $(\Omega, B, p)$ are related by the following equation:

    $$p_j = \sum_i p_i \, p(i|j) \tag{9}$$

    The conditional entropy, $H(x|y)$, is written in terms of
    the components of $p$ and the elements of $P$:

    $$H(x|y) = -\sum_{i,j} p_j \, p(i|j) \log_2 p(i|j) \tag{10}$$

    The mutual entropy $I(A;B)$ that measures the amount
    of shared or mutual information of the input sequence
    and the output is:

    $$I(A;B) = H(x) - H(x|y) \tag{11}$$

    It proves more convenient to deal with the message at
    the source and therefore with the $p_i$ and the $p(j|i)$
    (Yockey, 1974, 1992). From Bayes' theorem on
    conditional probabilities, we have (Feller, 1957):

    $$p(i|j) = \frac{p_i \, p(j|i)}{p_j} \tag{12}$$

    Substituting this expression for $p(i|j)$ in Eq. (10) and
    inserting the result in Eq. (11) we have:

    $$I(A;B) = H(x) - H(y|x) - \sum_{i,j} p_i \, p(j|i) \left[ \log_2 (p_j / p_i) \right] \tag{13}$$

    where

    $$H(y|x) = -\sum_{i,j} p_i \, p(j|i) \log_2 p(j|i) \tag{14}$$

    $H(y|x)$ vanishes if there is no noise because the matrix
    elements $p(j|i)$ are all either 0 or 1. The third term in
    Eq. (13) is the information that cannot be transmitted
    to the receiver if the entropy of $(\Omega, A, p)$ is greater
    than the entropy of $(\Omega, B, p)$. For illustration we may
    set all the matrix elements of $P$, $p(j|i)$, to the values
    given in Table 3 where $\alpha$ is the probability of
    misreading one nucleotide. Substituting these matrix
    elements in Eqs. (13) and (14) and, replacing the
    logarithm by its expansion, keeping only terms of second
    degree we have:

    $$I(A;B) = H(x) - 1.7915 + 34.2018\,\alpha^2 + 6.803\,\alpha \log_2 \alpha \tag{15}$$
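    The step from Eq. (11) to Eq. (13) can be checked
    numerically. The sketch below is an illustration, not the
    author's calculation: it takes source probabilities and a
    forward matrix p(j|i), obtains the receiver probabilities
    by total probability, forms the backward probabilities by
    Bayes' theorem, and evaluates the mutual entropy both ways.
    The four-letter source, two-letter receiver and all the
    numbers in the example are made up for the test.

```python
from math import isclose, log2

def entropy(probabilities):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

def mutual_entropy_two_ways(p_source, forward):
    """forward[i][j] is p(j|i). Returns I(A;B) from Eq. (11) and Eq. (13)."""
    n_in, n_out = len(p_source), len(forward[0])
    cells = [(i, j) for i in range(n_in) for j in range(n_out)
             if p_source[i] > 0 and forward[i][j] > 0]
    # Receiver probabilities from the source probabilities and p(j|i).
    p_recv = [sum(p_source[i] * forward[i][j] for i in range(n_in))
              for j in range(n_out)]
    h_x = entropy(p_source)
    # Eq. (12): backward probabilities p(i|j) from Bayes' theorem.
    backward = {(i, j): p_source[i] * forward[i][j] / p_recv[j] for i, j in cells}
    # Eq. (10): conditional entropy H(x|y).
    h_x_given_y = -sum(p_recv[j] * backward[i, j] * log2(backward[i, j])
                       for i, j in cells)
    # Eq. (14): conditional entropy H(y|x).
    h_y_given_x = -sum(p_source[i] * forward[i][j] * log2(forward[i][j])
                       for i, j in cells)
    # Third term of Eq. (13).
    third = sum(p_source[i] * forward[i][j] * log2(p_recv[j] / p_source[i])
                for i, j in cells)
    return h_x - h_x_given_y, h_x - h_y_given_x - third

p_source = [0.4, 0.3, 0.2, 0.1]
forward = [[0.9, 0.1], [0.9, 0.1], [0.2, 0.8], [0.2, 0.8]]
via_eq11, via_eq13 = mutual_entropy_two_ways(p_source, forward)
print(via_eq11, via_eq13)
```

    The two values agree to rounding error, as the
    substitution requires.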

    The genetic code cannot transfer 1.7915 bits of its
    six-bit alphabet to the protein sequences even when free
    of errors. The reader may find it instructive and
    amusing to carry out these calculations for the dice game.
    How much information is lost by adding the numbers
    on the dice? One must write zero instead of $\alpha$ for the
    matrix element in Table 3 if the replacement amino acid
    is functionally equivalent at each site in the protein
    sequence. I have pursued this discussion to calculate the
    information content of the cytochrome c molecule to be
    374 bits (Yockey, 1992).
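    The noise-free loss can be computed for both the dice game
    and the genetic code. The sketch below evaluates the third
    term of Eq. (13) with no noise, under assumptions that are
    not stated in the text: source symbols are taken as
    equiprobable, the code table is the standard genetic code,
    and only the 61 sense codons are counted. Under those
    assumptions the genetic-code value comes out close to the
    1.7915 bits quoted above.

```python
from collections import Counter
from itertools import product
from math import log2

def noise_free_loss(receiver_symbol_of):
    """Bits lost by a noiseless several-to-one code with equiprobable
    source symbols: the sum over i of p_i * log2(p_j / p_i), where j is
    the received symbol that source symbol i maps to."""
    n = len(receiver_symbol_of)
    class_size = Counter(receiver_symbol_of.values())
    # With p_i = 1/n and p_j = class_size/n, the ratio p_j / p_i is the class size.
    return sum(log2(class_size[j]) for j in receiver_symbol_of.values()) / n

# Dice game: 36 ordered pairs map onto their totals.
dice = {pair: sum(pair) for pair in product(range(1, 7), repeat=2)}

# Standard genetic code, first base varying slowest in the order U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Assumption: termination codons ('*') are left out, leaving 61 sense codons.
sense_codons = {"".join(c): aa
                for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
                if aa != "*"}

dice_loss = noise_free_loss(dice)
code_loss = noise_free_loss(sense_codons)
print(f"dice game:    {dice_loss:.4f} bits lost of {log2(len(dice)):.4f}")
print(f"genetic code: {code_loss:.4f} bits lost of {log2(len(sense_codons)):.4f}")
```

    For the dice, a little under two of the roughly 5.17 bits
    in an ordered pair are lost when only the total is
    reported.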

    We can replace the sender and receiver and calculate
    the mutual entropy of any sequences or families of
    sequences however they may have been generated. The
    mutual entropy will give us a measure of the 'similarity'
    of the sequences or families of sequences (Shannon,
    1948; Yockey, 1992; Tononi et al., 1996, 1999). Perhaps
    because of the apartheid in the groves of academe, the
    measure of the similarity of sequences is given in the
    literature as 'percent identity'. 'Percent identity'
    (Altschul et al., 1997; Levitt and Gerstein, 1998) is only
    an ad hoc score of similarity for the same reasons that
    error frequency is not an acceptable measure of protein
    error. Given a specific site in an alignment of two
    sequences or families of sequences, one must consider
    the closely related functionally equivalent amino acids.
    'Percent identity' does not take into account that
    amino acids are almost always not equally probable
    and for this reason leads to illusions. Since, as Socrates
    said, "... but satisfactory means have been found for
    dispelling these illusions by measuring, counting and
    weighing", the mutual entropy should be used as the
    only correct measure of 'similarity'.
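    A small example shows how the two measures can disagree.
    The sketch below is an assumption-laden illustration, not
    the author's procedure: it estimates the mutual entropy
    from the letter-pair frequencies of two aligned sequences
    (a plug-in estimate that ignores functional equivalence and
    sampling error), and the toy sequences are invented.

```python
from collections import Counter
from math import log2

def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions carrying the same letter."""
    return 100.0 * sum(x == y for x, y in zip(seq_a, seq_b)) / len(seq_a)

def mutual_entropy(seq_a, seq_b):
    """Plug-in estimate of I(A;B) in bits from aligned letter pairs."""
    n = len(seq_a)
    count_a, count_b = Counter(seq_a), Counter(seq_b)
    count_ab = Counter(zip(seq_a, seq_b))
    return sum((c / n) * log2((c / n) / ((count_a[x] / n) * (count_b[y] / n)))
               for (x, y), c in count_ab.items())

# Toy alignment: every L is replaced by I, every I by V, every V by L.
seq_a = "LIVLIVLIVLIV"
seq_b = "IVLIVLIVLIVL"
print(percent_identity(seq_a, seq_b), mutual_entropy(seq_a, seq_b))
print(percent_identity(seq_a, seq_a), mutual_entropy(seq_a, seq_a))
```

    The substituted pair scores zero percent identity, yet its
    mutual entropy equals that of the sequence with itself,
    because each letter of one sequence determines the letter
    of the other.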

    According to mechanists-reductionists, non-living
    matter can self-organize, and so life and evolution are
    manifestations of the spontaneous emergence of 'order'.
    Living matter, they say, has both 'order' and
    'complexity.' Information theory shows that the ideas of
    'complexity' and 'order' are contradictory and mutually
    exclusive. Crystals are highly ordered due to the
    regularities in the placement of their molecules as directed
    by the laws of physics and chemistry. However, these
    regularities, which allow crystals to be described by a
    short sequence or algorithm, mean that crystals have a
    very low information content. Information theory
    narrows this definition to say that we may only call a
    sequence 'highly ordered' if it has regularities and can
    be described by a much shorter sequence or algorithm.
    The information content of a sequence is measured in
    bits (or bytes).

    Table 3. Genetic code probability matrix elements p(j|i).
    [Table body not recoverable from the source. The matrix is
    indexed by amino acid (Leu, Ser, Arg, Ala, Val, Pro, Thr,
    Gly, Ile, Term, Tyr, His, Gln, Asn, Lys, Asp, Glu, Cys,
    Phe, Trp, Met) and by codon (UUA through AUG), with entries
    expressed in terms of the misreading probability $\alpha$.
    Source: Yockey, H.P., J. Theor. Biol., with permission.]

    Sequences of symbols that exhibit no orderly pattern

    from which the rest of the sequence can be predicted

    may, nevertheless, have low complexity because they

    can be computed from an algorithm of finite informa-

    tion content (Pincus and Kalman, 1997; Pincus and

    Singer, 1996). For example, π and e have been calcu-

    lated to sequences of more than a billion digits. They

    exhibit no orderly pattern, yet each digit in turn is

    computable. We may conclude, in general, that a finite

    message, specified by a computer program, may exist

    that carries all the information contained in an infinite

    sequence even though there is no discernible pattern in

    the sequence of symbols.
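    The point can be made concrete with e. The short program
    below is a sketch, not taken from the paper: it produces as
    many decimal digits of e as requested from the series
    e = 1/0! + 1/1! + 1/2! + ..., using exact integer
    arithmetic. The function name and the choice of ten guard
    digits are illustrative assumptions.

```python
def digits_of_e(n):
    """Return the first n decimal digits of e (including the leading 2)
    as a string, summing 1/k! in scaled integer arithmetic."""
    guard = 10                        # extra digits to absorb truncation error
    scale = 10 ** (n - 1 + guard)
    total, term, k = 0, scale, 0
    while term:                       # stop once scale/k! truncates to zero
        total += term
        k += 1
        term //= k
    return str(total // 10 ** guard)

print(digits_of_e(30))
```

    The program text stays the same length however many digits
    are requested, which is the sense in which the digit
    sequence carries only a finite amount of information.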

    A sequence of symbols is highly 'complex' when it

    has little or no redundance or 'order' and cannot be

    described by a much shorter sequence or calculated by

    an algorithm of finite length. A random sequence has

    the highest degree of complexity, has no redundance

    and cannot be described except by the sequence itself.

    Thus π and e, although their digits have no orderly

    pattern, are neither complex nor random. Thus when

    we speak of the amount of complexity in a sequence we

    are speaking about the amount of its randomness as

    well (Yockey, 1990).

    The principle, that computer data compression soft-

    ware may reduce the order, redundancies, patterns and

    regularities peculiar to natural languages, is based on a

    theorem in information theory by Shannon (1948). The

    result, achieved after the compression, is nearly indis-

    tinguishable from a random sequence since most of the

    patterns and regularities have been removed.
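    The effect of order on compressibility can be seen with a
    general-purpose compressor. The sketch below uses Python's
    zlib module as a stand-in for "data compression software";
    the choice of compressor, the sequence length and the test
    sequences are assumptions made for illustration. The
    pseudo-random bytes are themselves generated by a short
    program, so, like the digits of π and e, they only look
    random to the compressor.

```python
import random
import zlib

n = 10_000
ordered = b"AB" * (n // 2)            # highly ordered, crystal-like sequence
rng = random.Random(0)                # fixed seed so the run is repeatable
noise = bytes(rng.getrandbits(8) for _ in range(n))  # stand-in for a random sequence

def compression_ratio(data):
    """Compressed size divided by original size (smaller means more order)."""
    return len(zlib.compress(data, 9)) / len(data)

ordered_ratio = compression_ratio(ordered)
noise_ratio = compression_ratio(noise)
print(f"ordered sequence: {ordered_ratio:.3f}")
print(f"random-looking sequence: {noise_ratio:.3f}")
```

    The ordered sequence shrinks to a small fraction of its
    length, while the random-looking one does not shrink at
    all.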

    9. Information theory and the origin of life: the gap
    between living and nonliving

    It is a curious fact, highly relevant to the origin of

    life, that the genetic code is constructed to confront and

    solve the problems of communication and recording by

    the same principles found both in the genetic informa-

    tion system and in modern computer and communica-

    tion codes. The origin of a rather accurate genetic code

    is a pons asinorum that must be crossed to pass over

    the abyss that separates crystallography, high polymer

    chemistry and physics from biology. As I mentioned

    above, there is nothing in the physico-chemical world

    that remotely resembles reactions being determined by

    a sequence and codes between sequences. The existence

    of a genome and the genetic code divides living organ-

    isms from non-living matter.

    The origin of life has concerned philosophers and
    theologians since antiquity, for example, the idealist
    philosopher, Immanuel Kant (1724-1804) in his
    Critique of Judgment. (A quotation in English
    translation can be found in Wächtershäuser (1997).) It was
    not until the nineteenth century that these
    considerations could be called scientific. There were, in
    general, three philosophical approaches to the origin of
    life. The first was vitalism, the belief there is a
    metaphysical, supernatural, non-material, idealist élan
    vital, a vital principle or life force, which distinguishes
    living from non-living matter. Vitalism has its roots in the
    German idealist philosophy of F.W.A. Schelling
    (1775-1854) and others in the nineteenth century,
    members of a romantic philosophic movement,
    Naturphilosophie, who believed all creation was a
    manifestation of a World Spirit. They believed all
    matter possessed this Spirit and organized bodies had
    it to an intense degree. In the nineteenth century it
    was quite possible to be a vitalist, believing in a vital
    force or élan vital, without thinking of the vital force
    being supernatural. At the time it was as valid to
    attribute the laws and effects of vitality to a
    nonmaterial vital force as it was to attribute the laws and
    effects of gravity to a nonmaterial gravitational force.
    As Darwin put it, "Who can explain what is the
    essence of the attraction of gravity? No one now
    objects to following out the results consequent on this
    unknown element of attraction." (Origin of Species,
    Chapter XV) Vitalism is no longer considered since
    the discovery by Watson and Crick of the role played
    by DNA and RNA in the formation of protein. The
    second was mechanist-reductionism. A group of
    young physiologists at an 1869 meeting in Innsbruck,
    Austria, declared: "The ultimate objective of the
    natural sciences is to reduce all processes in Nature to
    the movements that underlie them and to find their
    driving forces, that is, to reduce them to mechanics"
    (Mayr, 1982). Wächtershäuser (1997) is a current
    supporter of mechanist reductionism, the view that if
    the historic process of the origin and evolution of life
    could be followed it would prove to be a purely
    chemical process. The third was the dialectical
    materialism of Friedrich Engels (1954), supported by Oparin
    (1957). According to dialectical materialism, the
    appearance of life is achieved, not through the laws of
    physics and chemistry, but through the Law of the
    Transformation of Quantity into Quality. For that
    reason the proponents of dialectical materialism are
    extremely hostile to mechanist reductionism. Engels
    (1820-1895) had observed in one of the fragments of
    Dialectics of Nature:

        Mechanism applied to life is a hopeless category, at
        the most we could speak of chemism, if we do not
        want to renounce all understanding of names.

    Oparin, upon reading that, abandoned reductionism-mechanism and
    became an immediate convert to dialectical materialism. Oparin, in
    his later books, went into great detail to denounce
    reductionism-mechanism and support dialectical materialism. Western
    intellectual apologists for Oparin who are not trained in the
    intricacies of philosophy may find the relationship between
    reductionism-mechanism and dialectical materialism to be a
    distinction without a difference (Lazcano, 1997; Miller et al., 1997).

    I shall pursue those proposals that purport to generate a genome and
    a genetic code. A central speculation involves the premise that life
    originated from a primeval soup of organic substances in the ocean by
    prebiotic or chemical evolution, a belief that life would arise
    spontaneously in flagrante delicto from this non-living matter and
    that it would almost inevitably arise on 'sufficiently similar young
    planets elsewhere'. George Gaylord Simpson (1964) pointed out that
    the National Aeronautics and Space Administration (NASA) has a
    program in 'space bioscience'. "There is even increasing recognition
    of a new science of extraterrestrial life, sometimes called
    exobiology -- a curious development in view of the fact that this
    'science' has yet to demonstrate that its subject matter exists."
    Thirty-five years later that statement is still true.

    Stanley Miller (1953) is usually given credit for being the first to
    generate amino acids in a prebiotic atmosphere. However, Walther Löb
    (1913) and Oskar Baudisch (1913) anticipated him by 40 years. These
    two workers, both addressing the question of the assimilation of
    nitrogen to form protein, showed that amino acids are generated in
    the silent electrical discharge (Löb, 1913) only in a reducing
    atmosphere, and by UV (Baudisch, 1913), also only in a reducing
    atmosphere. I have given a review of the views of Darwin, Oparin and
    Miller together with a criticism that there is no geological evidence
    that such a primeval soup ever existed (Yockey, 1992, 1995a,b, 1997).
    Miller et al. (1997) and Lazcano (1997) admit that there is no
    geological or geochemical evidence that a primeval soup ever existed.
    Mojzsis et al. (1998) remarked: "However, it is now held highly
    unlikely that the conditions used in these experiments (silent
    electrical discharge) could represent those in the Archaean
    atmosphere. Even so, scientific articles still occasionally appear
    that report experiments modelled on these conditions and explicitly
    or tacitly claim the presence of resulting products in reactive
    concentrations 'on the primordial Earth' or in a 'prebiotic soup.'
    (See for example, Deamer, 1997; Smith, 1998, 1999). The idea of such
    a 'soup' containing all the desired organic molecules in concentrated
    form in the ocean has been a misleading concept against which
    objections were raised early (Sillén, 1965)." This remark belies the
    title of the book in which it appears.

    Exactly by whom and when the modern notion that life originated from
    colloids or coacervates generated from the organic substances in the
    early ocean was first expounded in specific form is hard to find.
    Haeckel (1834-1919) claimed priority. He wrote: "The monistic
    hypothesis of abiogenesis, or autogeny in the strictly scientific
    sense of the word, was first formulated by me in 1866 in the second
    book of the General Morphology." (Today autogeny is called
    self-organization.) As Haeckel (1866) explained:

    "The chemical processes which first set in at this stage of
    development must have been catalysis, which led to the formation of
    albuminous combinations, and eventually of plasm. The earliest
    organisms to be thus formed can only have been plasmodomous Monera,
    structureless organisms without organs; the first forms in which
    living matter individualized were probably homogeneous globs of
    plasm, like certain of the actual chromacea (chroococcus). The first
    cells were developed secondarily from these primitive Monera, by
    separation of the central caryoplasm (nucleus) and peripheral
    cytoplasm (cell body)."

    Haeckel's views were well known in the nineteenth and early twentieth
    centuries, for he was widely published in professional books and
    journals and in best-selling popular books translated from the German
    to several languages. Haeckel's discussion of the origin of life from
    a primordial soup in the early ocean was so well known among
    scientists, theologians and the theater-going public in 1878 that Sir
    William Schwenck Gilbert (1836-1911) had Pooh-Bah, a comic, greedy
    and conceited character in The Mikado, introduce himself as follows:

    I am in point of fact, a particularly haughty and exclusive person,
    of pre-Adamite ancestral descent. You will understand this when I
    tell you that I can trace my ancestry back to a protoplasmal
    primordial atomic globule (Gilbert, 1878).

    Louis Pasteur (1822-1895) showed that properly sterilized cultures
    always became infected when exposed to air. Alpine air, which was
    almost free of germs, seldom produced a growth of organisms.
    Thoroughly sterilized cultures remained so, for years, in the absence
    of germs added from without. Life, therefore, must come only from
    life. Haeckel maintained that Pasteur had only settled the negative
    in certain circumstances. It being very difficult or impossible to
    prove a negative, many scientists in the nineteenth century, and many
    today, support spontaneous origin of life from a primeval chemical
    soup in the early ocean.

    The 'warm little pond' quotation (Darwin, 1898) from Darwin's private
    correspondence does not reflect Charles Darwin's views on the origin
    of life. Had he

    thought the 'warm little pond' idea worthy of publication, he would
    certainly have done so. Significantly, Darwin avoided the origin of
    life controversy in the sixth edition (1872) of The Origin of Species:

    It is no valid objection that science as yet throws no light on the
    far higher problem of the essence or origin of life. Who can explain
    what is the essence of the attraction of gravity? No one now objects
    to following out the results consequent on this unknown element of
    attraction; notwithstanding that Leibnitz formerly accused Newton of
    introducing 'occult qualities into philosophy' (Chapter XV). (My
    italics)

    Those who speculated about the origin of life proposed various models
    similar to Haeckel's. Before the discovery of the genetic code,
    Jacques Loeb in 1906 objected to the proposal that life emerged from
    colloids through catalysis (Loeb, 1906). Loeb was an expert in the
    chemistry of colloids and very well known in the first part of the
    twentieth century. In his book, The Dynamics of Living Matter,
    published in 1906, Loeb wrote:

    But we see that plants and animals during their growth continually
    transform dead into living matter, and that the chemical processes in
    living matter do not differ in principle from those in dead matter.
    There is, therefore, no reason to predict that abiogenesis is
    impossible and I believe that it can only help science if younger
    investigators realize that experimental biogenesis is the goal of
    biology. On the other hand, our lectures show clearly that we can
    only consider the problem of abiogenesis solved when the artificially
    produced substance is capable of development, growth, and
    reproduction. It is not sufficient for this purpose to make protein
    synthetically, or to produce in gelatin or other colloidal material
    round granules that have an external resemblance to living cells.

    Loeb rejected the colloid, or Pooh-Bah's protoplasmal primordial
    atomic globule, model of the origin of life, however much it may have
    appealed to his mechanist ideology. He saw that colloids lack the
    characteristic chemical processes, namely enzymes, by which organisms
    make sugars, fats, proteins and other molecules essential to their
    metabolism. In the terms I am discussing, they have no genome to
    control the formation of these critical compounds. It is a travesty
    that Loeb's comments are not now mentioned in the literature.

    Self-replication is a fundamental requirement in speculations on the
    origin of life. Self-replication mechanisms, at the origin of life,
    must be accomplished without the action of enzymes because enzymes
    themselves are produced by instructions in the genetic message. All
    schemes for non-enzymatic self-replication must have a skilled
    chemist acting as a deus ex machina using pure chemicals in a
    well-equipped chemical laboratory. The search for non-enzymatic
    self-replicating chemical systems is a current field of biochemistry.
    The function of enzymes is to speed up or catalyze chemical
    reactions. On the biologically significant time scale of seconds to
    years, some enzymes may catalyze as many as a million reactions per
    minute, whereas in the absence of the enzyme the reaction may take
    many years.
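
    The size of the speed-up implied by these figures can be made
    concrete with a short calculation. The sketch below is illustrative
    arithmetic only: the figure of one million reactions per minute is
    taken from the text, while the uncatalyzed time of one year per
    reaction is an assumed lower bound standing in for "many years", and
    the function name is hypothetical.

```python
# Illustrative arithmetic only; the uncatalyzed time is an assumed value.
MINUTES_PER_YEAR = 365 * 24 * 60  # minutes in a 365-day year


def rate_enhancement(catalyzed_per_minute: float,
                     uncatalyzed_years_per_reaction: float) -> float:
    """Return the ratio of the catalyzed rate to the uncatalyzed rate."""
    # Convert "years per reaction" into "reactions per minute".
    uncatalyzed_per_minute = 1.0 / (uncatalyzed_years_per_reaction * MINUTES_PER_YEAR)
    return catalyzed_per_minute / uncatalyzed_per_minute


# One million reactions per minute (from the text) against an assumed
# one year per uncatalyzed reaction (hypothetical lower bound).
factor = rate_enhancement(1_000_000, 1.0)
print(f"Implied rate enhancement: about {factor:.1e}")
```

    Under these assumptions the enzyme accelerates the reaction by a
    factor of several hundred billion; a longer uncatalyzed time would
    raise the factor in proportion.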

    Directed Panspermia, the suggestion that life was introduced from
    outer space, raises its head from time to time. Hermann Ludwig
    Ferdinand von Helmholtz (1821-1894) and Lord Kelvin, William Thomson
    (1824-1907), proposed the Directed Panspermia theory in the
    nineteenth century. Helmholtz and Arrhenius (1908) proposed that life
    is eternal and that meteors that roam about the solar system might
    contain germs of organisms that, under favorable conditions, might
    reach the Earth and other planets. Although Crick (1968) suggested
    that "Possibly the first enzyme was an RNA molecule with replicase
    properties", this proposal is now known as The RNA World, as coined
    by Gilbert (1986). The discovery of ribozymes, in which RNA plays
    both roles, namely, that of recording and transfer of information and
    the catalysis role of enzymes, proved this prophetic (Cech, 1986;
    Zaug and Cech, 1986). Some thought ribozymes provided the pathway
    from the primeval soup to the protobiont. It was proposed as a
    self-replicating system preceding the present system, which did not
    depend on the enzyme action of proteins but rather on a pre-enzyme
    action of nucleic acids (Joyce, 1998). This suggestion seems to move
    the problem a step nearer to the protobiont but still encounters the
    primary questions of the generation of the genetic message and of the
    origin and evolution of the genetic code.

    Taken at best, however, this proposal simply moved the search for an
    origin of life scenario one step from the primeval soup, where it
    encounters the need for a chemical procedure for the prebiotic
    formation of the sugar ribose, a component of RNA. There is only one
    plausible synthesis of ribose that may be considered in the prebiotic
    milieu, namely, the polymerization of formaldehyde. Ribose is only
    one of a number of sugars and never the primary product (Orgel,
    1986). Furthermore, the condensation of adenine or guanine with
    ribose leads to mixtures of the optical isomers that are very
    complex. From the point of view of the biochemist, this problem is as
    difficult as the one the discovery of ribozymes was supposed to
    solve. Furthermore, non-biological reactions in the prebiotic milieu
    lead to 2',5' isomers (Orgel, 1986). This does not lead in the
    direction of the origin of life because the reactions that are
    typical of biochemical products are almost

  • 8/10/2