mthesisv2 final corrected

Uploaded by sugumar-lakshmi-narayanan, posted 05-Apr-2018.

TRANSCRIPT

  • 7/31/2019 Mthesisv2 Final Corrected

    1/66

Masarykova Univerzita, Přírodovědecká fakulta (Masaryk University, Faculty of Science)

    Mathematical models of dissolution

Master's thesis

    Brno 2009 Jakub Cupera


    Declaration

Hereby I declare that this thesis is my original authorial work, which I have worked out on my own. All sources, references and literature used or excerpted during its elaboration are properly cited and listed in complete reference to the due source.

Brno, May 4, 2009. Jakub Cupera


    Acknowledgement

I would like to thank my advisor doc. RNDr. Petr Lansky, CSc., for his friendly and pleasant attitude, enthusiasm and care, and my parents for their love and support.


    Abstract

We present basic models of dissolution and their stochastic modifications, including a new model based on the theory of stochastic differential equations. The theory of Fisher information is applied to the stochastic models in order to obtain the optimal times at which to measure experimental dissolution data. Parameters of the studied dissolution models are estimated by the maximum likelihood method, and appropriate Matlab procedures are presented.


    Keywords

Deterministic models of dissolution, stochastic models of dissolution, Fisher information, Rao-Cramer lower bound, parameter estimation, maximum likelihood method.


    Contents

Introduction

Notation

1 Models of dissolution
  1.1 Deterministic models
    1.1.1 Homogenous model
    1.1.2 Weibull model
    1.1.3 Hixson-Crowell model
  1.2 Stochastic models
    1.2.1 Gaussian model
    1.2.2 Log-normal model

2 Theoretical background
  2.1 Fisher information and its lower bound
  2.2 Maximum likelihood estimation

3 Parameter estimation in stochastic models of dissolution
  3.1 Fisher information in theory of dissolution
    3.1.1 Single-parameter model
    3.1.2 Two-parameter model
  3.2 Parameters of dissolution models and their maximum likelihood estimation
    3.2.1 Gaussian model
    3.2.2 Log-normal model
  3.3 Examples
    3.3.1 Stochastic homogenous model
    3.3.2 Stochastic Weibull model
    3.3.3 Stochastic Hixson-Crowell model

4 Computational procedures and examples
  4.1 Simulation of random processes
  4.2 Maximum likelihood estimation

Summary

Bibliography


    Introduction

Tablet dissolution testing, where the drug release is observed as a function of time, is an essential part of tablet formulation studies. These tests provide information about the dissolution mechanism and ensure the reproducibility of drug release, which is important for tablet quality assurance. However, the dissolution tests are time-consuming, and thus optimal planning of the experiment and sophisticated methods for evaluating its results are needed. Identification of the parameters of specific dissolution models is a crucial problem in dissolution studies. The aim of this thesis is to find the optimal times for observation of the empirical dissolution data when the parametric form of the dissolution profile is assumed to be known and its parameters should be estimated. Our approach is based on stochastic properties of the dissolution process and the Fisher information concept.

This text is divided into four chapters. The most important results from the theory of dissolution are introduced in the first chapter, where the most common deterministic models are described and a new stochastic model is proposed. In the second chapter we summarize well-known results from the theory of regular densities, Fisher information and maximum likelihood estimation. The theory of Fisher information is applied to the stochastic models of dissolution, and maximum likelihood estimation of their parameters is proposed, in the third chapter. Finally, a brief introduction to the Matlab programs for numerical simulation of random processes and the procedures for maximum likelihood estimation is presented. The proposed theory is applied to Monte-Carlo simulated concentration data.

The supplement of this thesis is a CD with Matlab procedures and programs. A description of each can be displayed with the command help name-of-procedure.


    Notation

Matrices and fields

R^n                 n-dimensional real Euclidean space
y = (y1, ..., yn)^T   n-dimensional column vector
R^n_+ = {y ∈ R^n : yi > 0, i = 1, ..., n}
In                  unit matrix of size n
J = (Jij), i, j = 1, ..., n   square matrix of size n
J^(−1)              inverse matrix of the matrix J
J^T                 transposed matrix to J
|J|                 determinant of the square matrix J
J ⪰ 0               denotes that matrix J is positive semi-definite

Random vectors

θ = (θ1, ..., θm)     m-dimensional vector parameter
θ̂ = (θ̂1, ..., θ̂m)     estimate of the vector parameter θ
X = (X1, ..., Xq)^T   q-dimensional random vector
x = (x1, ..., xq)^T   realisation of the random vector X
f(x; θ)              probability density of the random vector X with vector parameter θ
EX                   expected value of the random vector X
var X                covariance matrix of the random vector X, resp. variance of the random variable X
cov(Xi, Xj)          covariance of two random variables Xi, Xj; it holds cov(Xi, Xi) = var Xi
X ∼ L                random variable X has probability distribution L
N(μ, σ²)             Gaussian distribution with mean μ and variance σ²
logN(μ, σ²)          log-normal distribution with parameters μ and σ²

Functions

argmax f(y) over y ∈ Y = {y0 ∈ Y : ∀y ∈ Y, f(y0) ≥ f(y)}
argmin f(y) over y ∈ Y = {y0 ∈ Y : ∀y ∈ Y, f(y0) ≤ f(y)}
f′(θ) = df(θ)/dθ     derivative of f(θ) with respect to θ


1 Models of dissolution

Dissolution is defined as a process of attraction and association of molecules of a solvent with molecules of a solute. The number of these associated particles is given in moles, where 1 mole contains approximately 6.022045 × 10²³ particles. The concentration C is defined as the number of dissolved molecules in a unit volume of the solution, mathematically C = n/V, where n is the total number of dissolved particles and V is the total volume of the solution.

The mathematical models of dissolution can be divided into two basic groups: deterministic and stochastic. Both investigate the whole population of particles and describe the time course of the concentration C(t). While the deterministic models work with a given function C(t), the stochastic ones describe the dissolution as a random process. It is natural to assume that the concentration C(t) is a nondecreasing function of time. Furthermore, we assume the function C(t) is continuous and smooth (with derivatives of all orders), C(0) = 0 and C(t) → C_S for increasing t, where C_S is the limit concentration reached after the solvent is dissolved or the solute is saturated. For comparison among dissolution profiles, we normalize the profile C(t) to the form

F(t) = C(t)/C_S,   (1.1)

where the function F(t) expresses the dissolved fraction of solvent at the time instant t. Note that F(t) → 1 for increasing t.

The function F(t) satisfies all the conditions to be a cumulative distribution function (cdf) of a random variable, and thus it can be seen as the cdf of a random variable T representing the time until a randomly selected molecule enters solution. Due to the assumptions made on C(t), the function F(t) is continuous and smooth, with corresponding probability density dF(t)/dt of the random variable T.


An alternative way to characterize the dissolution is by defining the fractional dissolution rate

k(t) = (dF(t)/dt) · 1/(1 − F(t)).   (1.2)

The quantity k(t)dt can be seen as the conditional probability that a randomly selected particle of solvent will be dissolved in the interval [t, t + dt), under the condition that this has not happened up to time t.

Fig. 1.1: Different profiles of fractional dissolution rate and corresponding cdfs. (A) Weibull functions (see Section 1.1.2) with scale parameter a = 1 and shape parameters b = 0.9 (red) and b = 1.1 (black); (B) corresponding fractional dissolution rates.

If the form of the fractional dissolution rate k(t) is known, then the dissolution profile can be evaluated from the equation

F(t) = 1 − exp(−∫₀ᵗ k(s)ds),   (1.3)

which is the solution of differential equation (1.2) with initial condition F(0) = 0. An advantage of the fractional dissolution rate is its sensitivity to certain properties of the dissolution profiles F(t). In Fig. 1.1 we can see that although the shapes of F(t) are hardly distinguishable, the shapes of k(t) are apparently different.
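Relation (1.3) translates directly into a numerical recipe: integrate k over [0, t] and exponentiate. The sketch below is illustrative Python (the thesis itself ships Matlab procedures; the function name is ours). For the constant rate k(t) ≡ a it must reproduce the homogenous profile 1 − e^(−at) of Section 1.1.1, for which the trapezoidal rule is exact.

```python
import math

def profile_from_rate(k, t, n=10_000):
    """Evaluate F(t) = 1 - exp(-int_0^t k(s) ds) of (1.3) by the trapezoidal rule."""
    h = t / n
    integral = 0.5 * h * (k(0.0) + k(t))
    for i in range(1, n):
        integral += h * k(i * h)
    return 1.0 - math.exp(-integral)

a, t = 2.0, 1.5
F_numeric = profile_from_rate(lambda s: a, t)   # constant rate: homogenous model
F_exact = 1.0 - math.exp(-a * t)                # closed form (1.7)
print(abs(F_numeric - F_exact) < 1e-9)          # True
```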


A crucial problem in dissolution studies is identification of the dissolution profile from measured data. In contrast to the typical problem of statistical inference, we are not able to measure realizations of the random variable T. What is measurable is the concentration C(t), which is related to F(t) by equation (1.1). So, under the assumption that we know or can measure the limit concentration C_S, we can determine empirical values of the dissolved fraction F(t). We have two basic methods to fit the measured data. The nonparametric methods, like kernel smoothing, are descriptive and can give information about the half dissolution time t_{1/2}, where F(t_{1/2}) = 0.5, or the mean dissolution time ET given by the formula

ET = ∫₀^∞ (1 − F(t)) dt.   (1.4)

The parametric methods fit the measured data to certain functions, whose estimated parameters can give us information about properties of the dissolution process and consequently about the random variable T.
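The mean dissolution time (1.4) is easy to approximate by quadrature once F(t) is known. A minimal Python sketch (illustrative only; the thesis works in Matlab) truncates the infinite integral at a finite upper limit: for the homogenous model of Section 1.1.1 with rate a, the exact value is ET = 1/a.

```python
import math

def mean_dissolution_time(F, upper=100.0, n=100_000):
    """Approximate ET = int_0^inf (1 - F(t)) dt of (1.4), truncated at `upper`."""
    h = upper / n
    total = 0.5 * h * ((1.0 - F(0.0)) + (1.0 - F(upper)))
    for i in range(1, n):
        total += h * (1.0 - F(i * h))
    return total

a = 2.0
F_hom = lambda t: 1.0 - math.exp(-a * t)   # homogenous profile (1.7)
et = mean_dissolution_time(F_hom)
print(abs(et - 1.0 / a) < 1e-3)            # True: ET = 1/a for this model
```

The truncation point must be chosen so that 1 − F(t) is negligible beyond it; for profiles with heavy tails the upper limit has to grow accordingly.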

    1.1 Deterministic models

Every continuous cdf defined on the interval [0, ∞) potentially characterizes a model of dissolution. The most common models in the theory of dissolution are described in the following sections.

    1.1.1 Homogenous model

The basic deterministic model of dissolution, called the homogenous or first-order model, is described by the differential equation

dC(t)/dt = a(C_S − C(t)),  C(0) = 0,   (1.5)

where a > 0 is a constant and C_S is the limit concentration achieved after the solvent is dissolved or the solute is saturated. In this model the rate of dissolution, dC(t)/dt, is proportional to the difference between the instantaneous and the final concentration of the solvent. This model was introduced by the chemists Noyes and Whitney in [23]. In generalizations of (1.5) the constant a is often considered to depend on temperature, surface of the solvent, etc.; for example, see Section 1.1.3. The solution of differential equation (1.5) is

C(t) = C_S(1 − e^(−at)),   (1.6)


thus

F(t) = 1 − e^(−at).   (1.7)

The fractional dissolution rate defined by equation (1.2) is constant, k(t) ≡ a, for model (1.7). Thus the probability that a particle of the solvent undissolved at time t will be dissolved in the time interval [t, t + dt) is constant. This property makes the homogenous model specific among the whole spectrum of dissolution models.

As seen from equation (1.7), the dissolution never ends in this model, and thus there is always a small amount of undissolved substance. Practically, as the end of the dissolution we could consider, for example, the time instant when 99% of the solvent is dissolved. Examples of the homogenous model with different values of the parameter a are shown in Fig. 1.2.
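For the homogenous model the practical end of dissolution just mentioned is available in closed form: solving F(t) = p in (1.7) gives t = −ln(1 − p)/a, so 99% is dissolved at t₉₉ = ln(100)/a. A small illustrative Python check (hypothetical helper, not from the thesis CD):

```python
import math

def t_fraction(a, p):
    """Time at which the homogenous profile (1.7) reaches dissolved fraction p."""
    return -math.log(1.0 - p) / a

for a in (0.5, 1.0, 2.0):
    t99 = t_fraction(a, 0.99)
    F = 1.0 - math.exp(-a * t99)    # evaluate (1.7) at t99
    print(round(F, 6))              # 0.99 for every a
```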

Fig. 1.2: Time course of dissolution profile F(t) of homogenous model (1.7) with parameter a = 0.5 (blue), a = 1 (red) and a = 2 (black).


    1.1.2 Weibull model

Although the Weibull model is one of the most successful in fitting experimental dissolution curves to theoretical functions (see ref. [3],[11],[32]), it has not been deduced from any physical law. This model was introduced by Langenbucher in [14] and is described by the cdf

F(t) = 1 − e^(−at^b),   (1.8)

where a > 0 is the scale parameter and b > 0 is the shape parameter. The fractional dissolution rate of the Weibull model has the form

k(t) = abt^(b−1)   (1.9)

and is increasing (b > 1), decreasing (b ∈ (0, 1)) or constant (b = 1). How the parameters a and b affect the course of the dissolution profile is illustrated in Fig. 1.3. Note that for b = 1 this model coincides with the homogenous model (1.7). Similarly to it, the dissolution never ends in this model and there is always an infinitely small amount of undissolved substance. Practically, as the end of the dissolution we could again consider the time instant when 99% of the solvent is dissolved.

As the Weibull model is only descriptive and has not been deduced from any fundamental physical law, it has been the subject of some criticism. Costa and Lobo summarized these arguments in [3]: for example, the lack of any kinetic background, or the fact that the model has no single parameter related to the intrinsic dissolution rate of the solvent.
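The stated behaviour of (1.8) and (1.9) — reduction to the homogenous model for b = 1 and a monotone rate otherwise — can be confirmed directly. An illustrative Python sketch (our own helper names, assuming nothing beyond the two formulas):

```python
import math

def weibull_F(t, a, b):
    """Weibull dissolution profile (1.8): F(t) = 1 - exp(-a*t**b)."""
    return 1.0 - math.exp(-a * t ** b)

def weibull_k(t, a, b):
    """Fractional dissolution rate (1.9): k(t) = a*b*t**(b-1)."""
    return a * b * t ** (b - 1)

# b = 1 coincides with the homogenous model (1.7):
print(abs(weibull_F(1.3, 2.0, 1.0) - (1.0 - math.exp(-2.0 * 1.3))) < 1e-12)
# k(t) decreases for b < 1 and increases for b > 1:
print(weibull_k(0.5, 1.0, 0.9) > weibull_k(1.5, 1.0, 0.9))   # True
print(weibull_k(0.5, 1.0, 1.1) < weibull_k(1.5, 1.0, 1.1))   # True
```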

Fig. 1.3: Time course of dissolution profile F(t) of Weibull model (1.8). (A) Parameter b fixed, b = 3, and parameter a = 4 (black), a = 1 (red), a = 0.4 (blue). (B) Parameter a fixed, a = 1, and parameter b = 0.4 (black), b = 1 (red) and b = 4 (blue).


    1.1.3 Hixson-Crowell model

From physical principles one can expect that the rate of dissolution depends on the surface of the solvent: the larger the area, the faster the dissolution. This statement can be expressed mathematically as follows,

dF(t)/dt = a₀S(t),  F(0) = 0,   (1.10)

where S(t) is the surface of the solvent at time instant t, and a₀ > 0 is a constant. The solution of differential equation (1.10) for a spherical solid dosage form of the solvent is presented in the following text.

For a sphere with radius r, the surface S = 4πr² is related to the volume V = (4/3)πr³ by the formula S = a₁V^(2/3), where a₁ = (36π)^(1/3). Furthermore, the volume V is linearly proportional to the weight w, thus we can evaluate the surface area as

S(t) = a₂w^(2/3)(t),   (1.11)

where w(t) is the remaining weight of the solvent at the time instant t and a₂ = a₁ρ^(−2/3) = (36π/ρ²)^(1/3), where ρ is the density of the solvent. The dissolved fraction F(t) can be written as a function of the weight w(t),

F(t) = 1 − w(t)/w₀,   (1.12)

where w₀ = w(0) is the initial amount of the solvent. Substituting (1.11) and (1.12) into (1.10) gives

dw(t)/dt = −a₃w^(2/3)(t),   (1.13)

where a₃ = a₀a₂w₀, with solution

w₀^(1/3) − w^(1/3)(t) = a₄t,   (1.14)

where a₄ = a₃/3 = a₀a₂w₀/3 = (a₀w₀/3)(36π/ρ²)^(1/3) is a constant. Dividing equation (1.14) by w₀^(1/3) and inserting into (1.12) gives the time course of the dissolved fraction,

F(t) = 1 − (1 − at)³,   (1.15)

where a = a₄w₀^(−1/3) = (a₀w₀^(2/3)/3)(36π/ρ²)^(1/3) is a constant. Using the knowledge of the initial radius r₀ and the density ρ of the solid dosage form of the solvent, we can write w₀ = 4πρr₀³/3, and thus a = (4π/3)a₀r₀². In contrast to the previous models, the dissolution ends at the finite time t = 1/a. This model was introduced by Hixson and Crowell in [12] and can be generalized to the form

F(t) = 1 − (1 − at)^b   for t ∈ [0, 1/a],
F(t) = 1                for t > 1/a,   (1.16)


where a > 0, b > 0 are constants. These models are called the root laws because of the form of equation (1.14). How the parameters a, b affect the shape of F(t) can be seen in Fig. 1.4.
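Unlike (1.7) and (1.8), the generalized model (1.16) reaches F = 1 at the finite time t = 1/a and stays there. A minimal Python sketch of the piecewise profile (illustrative; not the thesis Matlab code):

```python
def hixson_crowell_F(t, a, b=3.0):
    """Generalized Hixson-Crowell profile (1.16); dissolution ends at t = 1/a."""
    if t > 1.0 / a:
        return 1.0
    return 1.0 - (1.0 - a * t) ** b

a = 2.0
print(hixson_crowell_F(0.0, a))   # 0.0: nothing dissolved at t = 0
print(hixson_crowell_F(0.5, a))   # 1.0: complete exactly at t = 1/a
print(hixson_crowell_F(1.0, a))   # 1.0: stays complete for t > 1/a
```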

Fig. 1.4: Time course of dissolution profile F(t) of Hixson-Crowell model (1.16) with (A) fixed parameter b = 0.5, parameter a = 0.3 (black), a = 0.6 (red) and a = 1.2 (blue); (B) fixed parameter a = 2, parameter b = 0.5 (black), b = 1 (red) and b = 5 (blue).

The fractional dissolution rate of model (1.16) is

k(t) = ab/(1 − at)   for t ∈ [0, 1/a),
k(t) = 0             for t > 1/a,   (1.17)

and is an increasing function of time until the end of the dissolution.

Models (1.7), (1.8) and (1.16) are the most common in the theory of dissolution. For more information see ref. [3],[6],[17].

    1.2 Stochastic models

The models presented in Section 1.1 are stochastic from the microscopic point of view, but macroscopically they behave deterministically. It means that under identical conditions the dissolution profile remains unchanged. In this section we present stochastic modifications of the deterministic models. The random component used in the modified models disturbs our assumption of strictly increasing concentration C(t), but it allows us to study stochastic properties of the models. An interesting discussion on locally decreasing concentration and the physical principle of the dissolution process can be found in ref. [15].


    1.2.1 Gaussian model

Any deterministic model of dissolution described by a cdf F(t) does not take into account the random factors influencing the process of dissolution. In order to obtain a more realistic picture of reality, the stochastic model of dissolution can take the form

X(t) = F(t) + s(t)ε(t),   (1.18)

where ε(t) is a random process called white noise with the properties

Eε(t) = 0,  cov(ε(t), ε(s)) = 0 for s ≠ t,  cov(ε(t), ε(t)) = 1,   (1.19)

and F(t) is a selected deterministic dissolution profile we wish to randomize. The function s(t) determines the amplitude of the noise at the time instant t. In applied science, white noise is often taken as a mathematical idealization of phenomena involving sudden fluctuations of any size; for a formal mathematical treatment see ref. [25]. The simplest example which can be proposed is to assume that the noise has Gaussian distribution (Gaussian white noise),

X(t) ∼ N(F(t), s²(t)),   (1.20)

with probability density

f(x, t) = (1/√(2πs²(t))) exp(−(x − F(t))²/(2s²(t))),  x ∈ R.   (1.21)

An example of Gaussian probability densities with mean described by the homogenous model (1.7) and constant variance is shown in Fig. 1.5.

Model (1.18) violates our assumption of continuity of the dissolution trajectories. To improve it we should assume the model

X(t) = F(t) + (s(t)/√t)W(t),   (1.22)

where W(t) is a standard Wiener process given by the following definition.

Definition 1.1. A standard Wiener process (or a standard Brownian motion) is a stochastic process W(t), t ≥ 0, with the following properties:

1. W(0) = 0,

2. W(t) is continuous with probability 1,

3. the process W(t) has independent increments,


4. W(t + s) − W(s) ∼ N(0, t) for 0 ≤ s, 0 < t.

Conditions 1. and 4. of Definition 1.1 give, for s = 0,

W(t) ∼ N(0, t),   (1.23)

thus model (1.22) has probability distribution (1.20). Furthermore, it can be shown (see ref. [31]) that

W(t) = ∫₀ᵗ ε(s)ds,   (1.24)

where ε(s) is the white noise given by (1.19). Sample paths of random processes (1.22) and (1.24) are shown in Fig. 1.6, and details about their numerical simulation are given in Chapter 4.
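Definition 1.1 translates directly into a simulation recipe: on a grid with step Δt, the increments W(t + Δt) − W(t) are independent N(0, Δt) draws accumulated from W(0) = 0. Chapter 4 does this in Matlab; the following is an illustrative Python sketch (our own function name and defaults):

```python
import math, random

def wiener_path(T=3.0, n=300, seed=1):
    """One sample path of a standard Wiener process on [0, T] with n steps."""
    rng = random.Random(seed)
    dt = T / n
    w = [0.0]                                  # property 1: W(0) = 0
    for _ in range(n):
        # properties 3 and 4: independent increments, each N(0, dt)
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w

path = wiener_path()
print(len(path), path[0])   # 301 0.0
```

Fixing the seed makes the path reproducible, which is convenient when comparing different models driven by the same noise.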

Our aim is to analyze measured data, thus we assume that the variance s²(t) of the data obtained from stochastic model (1.18), resp. (1.22), has the form

s²(t) = σ²(t) + η(t),   (1.25)

where σ²(t) is the variance of the dissolution and η(t) ≥ 0 reflects the measurement error. The function σ²(t) is unknown in general. We should assume

Fig. 1.5: Gaussian probability density (1.21) with mean described by homogenous model (1.7), parameter a = 1, and constant variance of the data s²(t) ≡ 0.04.


Fig. 1.6: Sample paths of random processes. (A) Sample paths of Wiener process W(t); (B) sample paths of random process (1.22) with mean F(t) = 1 − exp(−t²) and variance s²(t) = 0.04 exp(−t²)(1 − exp(−t²)).

that σ²(0) = 0 (nothing is dissolved) and σ²(t) tends to zero for increasing t (everything is dissolved). As an example of such a function we can take

σ²(t) = pF(t)(1 − F(t)),   (1.26)

where F(t) is the dissolution profile and p > 0 is a constant. It can be easily verified that this function satisfies the assumptions mentioned above. Furthermore, it has maximum p/4 at the time instant t_{1/2}, where F(t_{1/2}) = 1/2.
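The properties claimed for (1.26) — vanishing at the start and end of dissolution, with maximum p/4 where F = 1/2 — are quickly confirmed numerically. An illustrative Python check using the homogenous profile (any profile with F(0) = 0 and F(t) → 1 would serve):

```python
import math

def var_dissolution(t, F, p):
    """Variance function (1.26): sigma^2(t) = p * F(t) * (1 - F(t))."""
    return p * F(t) * (1.0 - F(t))

a, p = 1.0, 0.16
F = lambda t: 1.0 - math.exp(-a * t)     # homogenous profile (1.7)
t_half = math.log(2.0) / a               # F(t_half) = 1/2

print(var_dissolution(0.0, F, p))                            # 0.0
print(abs(var_dissolution(t_half, F, p) - p / 4.0) < 1e-12)  # True: maximum p/4
print(var_dissolution(50.0, F, p) < 1e-12)                   # True: vanishes
```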

The measurement error η(t) should be proportional to the instantaneous concentration. Due to Weber's law (see ref. [33]), the stimulus (for us, the concentration) that is correctly discriminated is a constant fraction of the stimulus magnitude. So η(t), which represents the size of the error, is proportional to F(t): η(t) = qF(t) + r, where q, r ≥ 0 are constants.

A disadvantage of models (1.18) and (1.22) with probability distribution (1.20) is that they permit observations outside the interval [0, 1], and thus their modification may appear useful. Experimental data showing permanent fluctuations in both directions around 100% were presented e.g. in [26] or [34], but it is not realistic that the concentration data take negative values for small t. To improve the model one may require that the noise effect becomes asymmetric at the beginning of the dissolution. This could be achieved by considering noise with a different probability distribution instead of the Gaussian one. Such a model is not proposed here.


    1.2.2 Log-normal model

Lansky and Weiss introduced in [15] another stochastic modification of the homogenous model (1.7). In parallel with these authors, let us assume that the fractional dissolution rate is corrupted by a white noise defined by (1.19),

k̃(t) = k(t) + σε(t),   (1.27)

where σ > 0 is the amplitude of the noise. From (1.2), after replacing F(t) by ξ(t) to distinguish between the deterministic and stochastic models, we obtain the stochastic differential equation

dξ(t) = k(t)(1 − ξ(t))dt + σ(1 − ξ(t))dW(t),  ξ(0) = 0,   (1.28)

where W(t) is a standard Wiener process given by Definition 1.1, and dW(t) = ε(t)dt we obtain from (1.24). Applying the substitution U(t) = 1 − ξ(t), dU(t) = −dξ(t), to equation (1.28) gives

dU(t)/U(t) = −k(t)dt − σdW(t),  U(0) = 1,   (1.29)

thus

∫₀ᵗ dU(s)/U(s) = −∫₀ᵗ k(s)ds − σW(t).   (1.30)

To evaluate the integral on the left-hand side we have to introduce the Itô calculus.

Definition 1.2 (1-dimensional Itô process). Let W(t) be a Wiener process. A (1-dimensional) Itô process is a stochastic process X(t) of the form

X(t) = X(0) + ∫₀ᵗ u(s)ds + ∫₀ᵗ v(s)dW(s),   (1.31)

or equivalently

dX(t) = u(t)dt + v(t)dW(t),

where u(t), v(t) are random processes satisfying

P(∫₀ᵗ |u(s)|ds < ∞ for all t ≥ 0) = 1

and the analogous integrability condition on v(t).

For an Itô process X(t) and a smooth function g(t, x), the Itô formula states that Y(t) = g(t, X(t)) is again an Itô process with

dY(t) = (∂g/∂t)dt + (∂g/∂x)dX(t) + ½(∂²g/∂x²)[dX(t)]²,

where [dX(t)]² is computed using the rules dt·dt = dt·dW(t) = 0 and [dW(t)]² = dt. We apply the Itô formula to the function g(t, x) = ln x, x > 0, and obtain

d(ln U(t)) = (1/U(t))dU(t) − ½(1/U²(t))[dU(t)]²
           = dU(t)/U(t) − (1/(2U²(t)))σ²U²(t)dt
           = dU(t)/U(t) − ½σ²dt.

Hence

dU(t)/U(t) = d(ln U(t)) + ½σ²dt,

so from (1.30) we conclude

ln(U(t)/U(0)) = −∫₀ᵗ k(s)ds − ½σ²t − σW(t),

where U(0) = 1, so

U(t) = exp(−∫₀ᵗ k(s)ds − ½σ²t − σW(t)),


and thus

ξ(t) = 1 − exp(−∫₀ᵗ k(s)ds − ½σ²t − σW(t)).   (1.34)

We wish to obtain the probability distribution of the random variable ξ(t). Due to (1.23), the expression in the exponent in (1.34) has Gaussian probability distribution

N(−∫₀ᵗ k(s)ds − ½σ²t, σ²t),

thus the random variable 1 − ξ(t) has log-normal distribution (see ref. [21]),

1 − ξ(t) ∼ logN(μ(t), σ²(t)),   (1.35)

with probability density

f(x, t) = (1/(x√(2πσ²(t)))) exp(−(ln x − μ(t))²/(2σ²(t))),  x ∈ (0, ∞),   (1.36)

where the coefficients μ(t), σ²(t) are

μ(t) = −∫₀ᵗ k(s)ds − ½σ²t,
σ²(t) = σ²t.

Fig. 1.7: Sample paths of a random process ξ(t) with indicated Weibull dissolution profile F(t) given by (1.8) with parameters a = 1, b = 2. (A) Random process ξ(t) defined by equation (1.34), σ = 0.2; (B) random process ξ(t) defined by equation (1.42), σ = 0.05.


It holds that

E(1 − ξ(t)) = 1 − Eξ(t),  var(1 − ξ(t)) = var ξ(t),

and by applying (1.3) to the expressions for the mean and variance of the random process 1 − ξ(t) (see ref. [21]) we obtain

Eξ(t) = F(t),
var ξ(t) = (1 − F(t))²(exp(σ²t) − 1).

As the function f(x, t) for fixed t gives the probability density of the random variable 1 − ξ(t), the random variable ξ(t) has probability density f(1 − x, t).

Model (1.34) suffers from certain defects. In Fig. 1.7 (A) it can be seen that this model completely prevents the concentration from taking values above 100%, but it allows it to drop below zero, thus the model does not satisfy our requirements. In equation (1.28) the random component dW(t) influences the process of dissolution at its beginning only, because the term 1 − ξ(t) is large for t small. To obtain a more precise model one should assume, for example, the stochastic differential equation

dξ(t) = k(t)(1 − ξ(t))dt + σξ(t)dW(t),   (1.37)

or

dξ(t) = k(t)(1 − ξ(t))dt + σξ(t)(1 − ξ(t))dW(t),   (1.38)

Fig. 1.8: Sample paths of dissolution profile ξ(t), k(t) = 2t. (A) Model (1.37), σ = 0.1; (B) model (1.38), σ = 0.4.


with initial condition ξ(0) = 0, where the random factor dW(t) does not influence the process of dissolution at its beginning and end. In this case we are not able to find the solution of stochastic differential equations (1.37) and (1.38) in analytic form. Their numerical approximations (see Section 4.1) can be seen in Fig. 1.8.
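Since (1.37) and (1.38) lack analytic solutions, their sample paths have to be generated numerically. A standard scheme for this (the one sketched here; the thesis' own Matlab routines are described in Section 4.1) is the Euler-Maruyama method, which advances the state by the drift times Δt plus the diffusion times an N(0, Δt) increment. Illustrative Python for model (1.38) with k(t) = 2t, as in Fig. 1.8:

```python
import math, random

def euler_maruyama(sigma, T=3.0, n=3000, seed=2):
    """Euler-Maruyama approximation of model (1.38):
    d xi = k(t)(1 - xi) dt + sigma * xi * (1 - xi) dW,  xi(0) = 0, k(t) = 2t."""
    rng = random.Random(seed)
    dt = T / n
    xi, t = 0.0, 0.0
    for _ in range(n):
        k = 2.0 * t                            # fractional dissolution rate
        dW = rng.gauss(0.0, math.sqrt(dt))     # Wiener increment, N(0, dt)
        xi += k * (1.0 - xi) * dt + sigma * xi * (1.0 - xi) * dW
        t += dt
    return xi

print(euler_maruyama(0.0))   # sigma = 0: approximates F(3) = 1 - exp(-9)
print(euler_maruyama(0.4))   # noisy path, still ends near 1
```

With sigma = 0 the scheme reduces to the explicit Euler method for the deterministic equation, which gives a handy correctness check.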

Let us study another model with a solution in analytic form. Stochastic differential equation (1.28) with solution (1.34) can be rewritten, after dividing by (1 − ξ(t))dt, to the form

(dξ(t)/dt)/(1 − ξ(t)) = (dF(t)/dt)/(1 − F(t)) + σε(t),  ξ(0) = 0.

As we have seen, this model does not satisfy our assumptions, and thus let us consider the stochastic differential equation

(dξ(t)/dt)/ξ(t) = (dF(t)/dt)/F(t) + σε(t),  ξ(0) = 0.   (1.39)

Let δ > 0 be small. First we solve equation (1.39) with initial condition ξ(δ) = F(δ). This equation can be solved similarly to equation (1.28), where after application of the Itô formula we obtain

ln(ξ(t)/ξ(δ)) = ln(F(t)/F(δ)) − ½σ²t + σW(t),

which can be rewritten as

ξ(t) = (ξ(δ)/F(δ)) exp(ln F(t) − ½σ²t + σW(t)).   (1.40)

Our purpose is to find the limit for δ → 0. As ξ(δ) is a random process, we have to use a stochastic approach.

Definition 1.3. Let X(t) be a random process. We say that X(t) has the limit in the mean X(t₀) for t → t₀ if

lim_{t→t₀} E(X(t) − X(t₀))² = 0.

We write

l.i.m._{t→t₀} X(t) = X(t₀).

Theorem 1.2. Let F(t) be a dissolution profile and ξ(t) be a random process with properties Eξ(t) = F(t) and var ξ(t) = o(F²(t)) for t small. Then

l.i.m._{t→0} ξ(t)/F(t) = 1.   (1.41)


Proof. We have

lim_{t→0} E(ξ(t)/F(t) − 1)² = lim_{t→0} (Eξ²(t)/F²(t) − (2/F(t))Eξ(t) + 1)
  = lim_{t→0} ((var ξ(t) + (Eξ(t))²)/F²(t) − 1)
  = lim_{t→0} var ξ(t)/F²(t) = 0.

Hence (1.40) gives for δ → 0 the solution of stochastic differential equation (1.39) in the form

ξ(t) = exp(ln F(t) − ½σ²t + σW(t)).   (1.42)

This random process has log-normal probability distribution

ξ(t) ∼ logN(μ(t), σ²(t))   (1.43)

with coefficients

μ(t) = ln F(t) − ½σ²t,
σ²(t) = σ²t,

and its mean and variance have the form

Eξ(t) = F(t),
var ξ(t) = F²(t)(exp(σ²t) − 1).

Here we can see that the random process ξ(t) fulfills the conditions mentioned in Theorem 1.2. The model described by stochastic differential equation (1.39) with analytical solution (1.42) was deduced from the model described by (1.28), on which additional conditions were imposed. A disadvantage of this model is that its variance, var ξ(t), tends to infinity for large t. On the other hand, this model prevents the dissolution data from dropping below zero and permits fluctuation of the data around 100%, as we can see in Fig. 1.7. Furthermore, in Fig. 1.8 (A) it can be seen that sample paths of model (1.37) are similar to the sample paths of the solution (1.42) of stochastic differential equation (1.39) depicted in Fig. 1.7 (B).


Model (1.42) satisfies our assumptions, and thus in what follows we deal with the generalized model with log-normal probability distribution (1.43). Its parameters μ(t), σ²(t) are obtained from the equations for the mean and variance of the log-normal distribution (see ref. [21]),

Eξ(t) = e^(μ(t) + ½σ²(t)) = F(t),   (1.44)
var ξ(t) = e^(2μ(t) + σ²(t))(e^(σ²(t)) − 1) = s²(t),   (1.45)

where F(t) is a dissolution profile and s²(t) is the variance of the dissolution without measurement error. After a short calculation we obtain

μ(t) = ln F(t) − ½σ²(t),   (1.46)
σ²(t) = ln(1 + s²(t)/F²(t)).   (1.47)
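Equations (1.46)-(1.47) invert the log-normal moment formulas: given a target mean F(t) and variance s²(t), they return the matching parameters μ(t), σ²(t). A short Python round-trip check (illustrative values; not thesis code):

```python
import math

def lognormal_params(F, s2):
    """Invert (1.44)-(1.45): parameters mu, sigma^2 of a log-normal law
    with mean F and variance s2, via (1.46)-(1.47)."""
    sigma2 = math.log(1.0 + s2 / F ** 2)     # equation (1.47)
    mu = math.log(F) - 0.5 * sigma2          # equation (1.46)
    return mu, sigma2

F, s2 = 0.6, 0.01
mu, sigma2 = lognormal_params(F, s2)

# Round trip through the log-normal moments (1.44)-(1.45):
mean = math.exp(mu + 0.5 * sigma2)
var = math.exp(2.0 * mu + sigma2) * (math.exp(sigma2) - 1.0)
print(abs(mean - F) < 1e-12, abs(var - s2) < 1e-12)   # True True
```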

A disadvantage of all the models presented up to now is that they describe the dissolution without measurement error. The modified model can be described by a random process

Z(t) = ξ(t) + √(η(t)) ε(t),   (1.48)

where √(η(t))ε(t) represents the measurement error. It seems natural to assume that the measurement error has Gaussian distribution. The probability density of the random process given by (1.48) for fixed t is then described by the convolution of the corresponding densities of the log-normal and Gaussian distributions, and cannot be evaluated analytically.

If we consider a measurement error with Gaussian distribution, observations of negative concentrations at the beginning of the dissolution can appear again. To prevent this we assume a measurement error with log-normal distribution. Random process (1.48) with log-normal error is not log-normally distributed, but it can be reasonably approximated by a new random variable with log-normal distribution (see ref. [10]). Thus we assume that the original random process ξ(t) has the measurement error already taken into account and the variance has the form var ξ(t) = s²(t) = σ²(t) + η(t).


2 Theoretical background

Theoretical results needed in the text are presented in this chapter. It contains the basic theory of regular densities, the Fisher information and maximum likelihood estimation, including formulae for its numerical solution. The information was taken mostly from refs. [1], [2], [20], [27], where more details can be found.

    2.1 Fisher information and its lower bound

Definition 2.1. Let Θ ⊆ ℝⁿ be a space of parameters. We say that the set F = {f(x; θ) : θ = (θ₁, …, θₙ)ᵀ ∈ Θ, x = (x₁, …, xₘ)ᵀ ∈ ℝᵐ} is a system of regular densities if the following conditions are satisfied:

(a) Θ is a nonempty and open set,

(b) the support M = {x ∈ ℝᵐ : f(x; θ) > 0} is independent of θ,

(c) the finite partial derivative ∂f(x; θ)/∂θᵢ, i = 1, …, n, exists for every x ∈ M,

(d) it holds that

∫_M ∂f(x; θ)/∂θᵢ dx = 0, i = 1, …, n,

for every θ ∈ Θ,

(e) the finite integral

J_ij = ∫_M [∂ ln f(x; θ)/∂θᵢ] [∂ ln f(x; θ)/∂θⱼ] f(x; θ) dx, i, j = 1, …, n,   (2.1)

exists for every θ ∈ Θ and the matrix J = (J_ij), i, j = 1, …, n, is positive definite.


Matrix J is called the Fisher information matrix. If n = 1 (we deal with a single parameter), then J is called the Fisher information.

For example, if m = 1 and θ is a single parameter, then the Fisher information has the form

J = ∫_M [ (df(x; θ)/dθ) / f(x; θ) ]² f(x; θ) dx.   (2.2)

Definition 2.2. Let X = (X₁, …, Xₘ)ᵀ be a random vector and f(x; θ) a regular density. Then the random vector

U(θ) = (U₁(θ), …, Uₙ(θ))ᵀ = (∂ ln f(X; θ)/∂θ₁, …, ∂ ln f(X; θ)/∂θₙ)ᵀ   (2.3)

is called the score vector of the density f(x; θ).

Theorem 2.1. Let f(x) = f(x; θ) be a regular density. Then the score vector U = U(θ) of the density f(x) has mean EU = 0 and variance var U = J.

Proof. For every component Uᵢ of the score vector U it holds that

EUᵢ = ∫_M [∂ ln f(x)/∂θᵢ] f(x) dx = ∫_M ∂f(x)/∂θᵢ dx = 0

due to condition (d) of regularity. The second statement follows from definition (2.1) of the Fisher information matrix and from var U = E(UUᵀ), since EU = 0.

The following theorem shows a basic property of the Fisher information.

Theorem 2.2. Let X and Y be two independent random vectors with regular probability densities f_X(x), f_Y(y) ∈ F and corresponding Fisher information matrices J_X, J_Y. Then the random vector Z = (Xᵀ, Yᵀ)ᵀ has the regular probability density

f_Z(z) = f_X(x) f_Y(y), z = (xᵀ, yᵀ)ᵀ,   (2.4)

and its Fisher information matrix takes the form J_Z = J_X + J_Y.

Proof. Conditions (a)-(d) of regularity can be easily verified for the joint probability density (2.4). For the component J^Z_ij of the Fisher information matrix J_Z in the i-th row and j-th column it holds that

J^Z_ij = ∫_{M×M} [∂f_Z(z; θ)/∂θᵢ] [∂f_Z(z; θ)/∂θⱼ] (1/f_Z(z; θ)) dz

= ∫_{M×M} [∂(f_X(x)f_Y(y))/∂θᵢ] [∂(f_X(x)f_Y(y))/∂θⱼ] (1/(f_X(x)f_Y(y))) dx dy

= ∫_M [∂f_X(x)/∂θᵢ] [∂f_X(x)/∂θⱼ] (1/f_X(x)) dx + ∫_M [∂f_Y(y)/∂θᵢ] [∂f_Y(y)/∂θⱼ] (1/f_Y(y)) dy

+ ∫_M [∂f_X(x)/∂θᵢ] dx ∫_M [∂f_Y(y)/∂θⱼ] dy + ∫_M [∂f_X(x)/∂θⱼ] dx ∫_M [∂f_Y(y)/∂θᵢ] dy

= J^X_ij + J^Y_ij.

The last equality follows from condition (d) of regularity, by which both cross terms vanish. Thus J_Z = J_X + J_Y, and the matrix J_Z is positive definite.

Theorem 2.2 has an important consequence: let Xᵢ, i = 1, …, m, be mutually independent random variables with regular probability densities fᵢ(x; θ) and corresponding Fisher information matrices Jᵢ. Then the random vector X = (X₁, …, Xₘ)ᵀ has the Fisher information matrix J = Σ_{i=1}^m Jᵢ.

The main importance of the Fisher information is given by the following theorem.

Theorem 2.3. (Rao-Cramer) Let F = {f(x; θ) : θ ∈ Θ ⊆ ℝ} be a regular system of densities with a single parameter θ. Let S be an unbiased estimator of a parametric function g(θ), where ES² < ∞ for every θ ∈ Θ. Let the derivative g′(θ) = dg(θ)/dθ exist for every θ ∈ Θ and let

(d/dθ) ∫_M S(x) f(x; θ) dx = ∫_M S(x) (d/dθ) f(x; θ) dx.   (2.5)

Then it holds that

E(S − g(θ))² ≥ [g′(θ)]² / J   (2.6)

for every θ ∈ Θ.

Proof. We have

ES = ∫_M S(x) f(x; θ) dx = g(θ),


hence condition (2.5) implies

∫_M S(x) (d/dθ) f(x; θ) dx = g′(θ),   (2.7)

and condition (d) of regularity gives

∫_M g(θ) (d/dθ) f(x; θ) dx = 0.   (2.8)

Subtracting equation (2.8) from (2.7) gives

∫_M [S(x) − g(θ)] [(d/dθ) f(x; θ) / f(x; θ)] f(x; θ) dx = g′(θ),

and from the Cauchy-Schwarz inequality we obtain

[g′(θ)]² ≤ ∫_M [S(x) − g(θ)]² f(x; θ) dx · ∫_M [(d/dθ) f(x; θ) / f(x; θ)]² f(x; θ) dx,   (2.9)

which is equivalent to formula (2.6).

If we insert S = θ̂ and g(θ) = θ into formula (2.6), we obtain

var θ̂ ≥ 1/J,   (2.10)

where θ̂ is an unbiased estimator of the single parameter θ. Inequality (2.10) is called the Rao-Cramer inequality and the term 1/J is called the Rao-Cramer (lower) bound.

A complete knowledge of the distribution f(x; θ) is needed for evaluation of the Fisher information (2.2). Even if the distribution is available, calculation of the Fisher information is often substantially complicated. However, for a model with a single parameter we can use the lower bound J₂ of the Fisher information. Inserting S = X and g(θ) = EX into formula (2.6) gives

J₂ = (1/var X)(dEX/dθ)² ≤ J,   (2.11)

where X is a random variable with probability density f(x; θ). In contrast to the Fisher information, the value of the lower bound is based only on the first two moments of the random variable X.


The Fisher information (2.2) can be equal to its lower bound (2.11) under certain conditions. Equality in the Cauchy-Schwarz inequality (2.9) appears if S(x) − g(θ) ≡ 0 or if there exists a function K(θ), independent of x, such that

(d/dθ) f(x; θ) / f(x; θ) = K(θ)[S(x) − g(θ)], x ∈ M.

Let us assume that S(x) − g(θ) is not identically zero. Then we have

d ln f(x; θ)/dθ = K(θ)S(x) − g(θ)K(θ).   (2.12)

Denote by Q(θ) and R(θ) functions satisfying

Q′(θ) = K(θ), R′(θ) = g(θ)K(θ).   (2.13)

Then the solution of (2.12) is

ln f(x; θ) = Q(θ)S(x) − R(θ) + H(x),

where H(x) is independent of θ. Denote u(x) = e^{H(x)} and v(θ) = e^{−R(θ)}. Then

f(x; θ) = u(x) v(θ) e^{Q(θ)S(x)}.   (2.14)

This density belongs to the exponential family with one parameter. Only in this case can the Rao-Cramer inequality (2.11) become an equality. Note that the functions Q(θ) and R(θ) must fulfil the conditions mentioned above.

Theorem 2.3 can be generalized to the case of a parameter vector θ.

Theorem 2.4. Let F = {f(x; θ) : θ ∈ Θ ⊆ ℝⁿ} be a regular system of densities. Let S = S(X) = (S₁, …, S_k)ᵀ be an unbiased estimator of the parametric function g(θ) = (g₁(θ), …, g_k(θ))ᵀ, where ES_i² < ∞ for every i = 1, …, k and every θ ∈ Θ. Let the partial derivatives

g_ij(θ) = ∂gᵢ(θ)/∂θⱼ, i = 1, …, k; j = 1, …, n,

exist and let

∫_M Sᵢ(x) [∂f(x; θ)/∂θⱼ] dx = g_ij(θ), i = 1, …, k; j = 1, …, n.

Denote H = (g_ij(θ)), i = 1, …, k; j = 1, …, n. Then it holds that

var S ≥ H J⁻¹ Hᵀ.   (2.15)

Proof of the theorem can be found in refs. [2], [29]. The inequality A ≥ B for two symmetric matrices A and B means that the difference A − B is a positive-semidefinite matrix. If n = k > 1 and S = θ̂ = (θ̂₁, …, θ̂ₙ)ᵀ is an unbiased estimate of g(θ) = (θ₁, …, θₙ)ᵀ, then H = Iₙ, where Iₙ is the identity matrix of size n, and formula (2.15) takes the form

var θ̂ ≥ J⁻¹,   (2.16)

where var θ̂ is the covariance matrix of the unbiased estimate θ̂ = (θ̂₁, …, θ̂ₙ)ᵀ. Matrix inequality (2.15), resp. (2.16), is again called the Rao-Cramer inequality.

    2.2 Maximum likelihood estimation

Definition 2.3. Let X = (X₁, …, Xₘ)ᵀ be a random sample with joint regular probability density f(x; θ), where x ∈ ℝᵐ, the parameter vector θ ∈ Θ ⊆ ℝⁿ and Θ is convex. The function

L(θ; x) = f(x; θ),

as a function of θ for fixed x, is called the likelihood function, and the function

l(θ; x) = ln L(θ; x) = ln f(x; θ)

is called the log-likelihood function.

Definition 2.4. Let X = (X₁, …, Xₘ)ᵀ be a random sample with joint regular probability density f(x; θ), where x ∈ ℝᵐ, θ ∈ Θ ⊆ ℝⁿ and Θ is convex. An estimate θ̂_MLE of the parameter vector θ is called the maximum likelihood estimate if it maximizes the likelihood function L(θ; x) for given X = x, i.e. it holds that

L(θ̂_MLE; X) ≥ L(θ; X)   (2.17)

for every θ ∈ Θ.

Note that the logarithm is a monotonically increasing function, thus for the maximum likelihood estimate θ̂_MLE it holds that

l(θ̂_MLE; X) ≥ l(θ; X)   (2.18)

for every θ ∈ Θ.


Since the density f(x; θ) is assumed to be regular, the existence of the first partial derivatives with respect to every component of the parameter vector θ is ensured. Thus the maximum likelihood estimate θ̂_MLE of the parameter vector θ can be obtained as a solution of the system of equations

∂L(θ; X)/∂θᵢ = 0, i = 1, …, n,   (2.19)

or

∂l(θ; X)/∂θᵢ = 0, i = 1, …, n.   (2.20)

System (2.20) can be rewritten in the form

U(θ) = 0   (2.21)

due to Definition 2.2 of the score vector U(θ). It can be shown (see ref. [1]) that if the equation

∫_{ℝᵐ} ∂²f(x; θ)/∂θᵢ∂θⱼ dx = 0, i, j = 1, …, n,   (2.22)

is satisfied for every θ ∈ Θ, then for U′(θ) (the matrix of second partial derivatives of l(θ; X) with respect to the components of the parameter vector θ) it holds that

EU′(θ) = −J,

where J is the Fisher information matrix of the probability density f(x; θ). This matrix is positive definite due to Definition 2.1, thus −J is negative definite. Hence the function l(θ; X) is concave on the convex set Θ, and the solution θ̂_MLE of system (2.20) exists, is unique and maximizes the likelihood functions L(θ; X) and l(θ; X), see ref. [8].

Iterative methods

The likelihood equations U(θ) = 0 are generally nonlinear with respect to the unknown parameter θ, so we solve them with iterative methods. The most common are the Newton-Raphson method

θₖ = θₖ₋₁ − [U′(θₖ₋₁)]⁻¹ U(θₖ₋₁), k = 1, 2, …,   (2.23)

and Fisher's method of scoring

θₖ = θₖ₋₁ + [J(θₖ₋₁)]⁻¹ U(θₖ₋₁), k = 1, 2, …,   (2.24)

where the matrix U′(θ) is replaced with its mean EU′(θ) = −J(θ). Both of these methods require an initial approximation θ₀ ∈ Θ.
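A minimal sketch of the Newton-Raphson iteration (2.23) in plain Python, illustrated on a scalar problem not taken from the thesis: the likelihood equation of an exponential sample, where U(θ) = m/θ − Σxᵢ and U′(θ) = −m/θ². Because here −U′(θ) equals the Fisher information J(θ) = m/θ² exactly, Fisher scoring (2.24) would produce identical iterates.

```python
# Newton-Raphson iteration (2.23) for U(theta) = 0 of an exponential sample.
data = [0.8, 1.3, 0.4, 2.1, 0.9]   # illustrative data
m, S = len(data), sum(data)

def U(theta):
    # score: d/dtheta [m*ln(theta) - theta*S]
    return m / theta - S

def dU(theta):
    # derivative of the score; note -dU(theta) = J(theta) = m/theta^2
    return -m / theta ** 2

theta = 1.0                         # initial approximation theta_0
for _ in range(20):
    theta = theta - U(theta) / dU(theta)

print(theta, m / S)                 # iterates converge to the closed-form MLE m/S
```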


3 Parameter estimation in stochastic models of dissolution

The theory of Fisher information is applied to the stochastic models of dissolution in this chapter. A method for maximum likelihood estimation of their parameters is proposed and several examples of the Fisher information, resp. Rao-Cramer bounds, are presented.

    3.1 Fisher information in theory of dissolution

In this Section we present methods for evaluation of the Fisher information for the stochastic models of dissolution presented in Section 1.2. It can be easily verified that their probability densities (1.21) and (1.36) are regular.

    3.1.1 Single-parameter model

Formula (2.10), describing the Rao-Cramer inequality, suggests that the larger the Fisher information is, the better the estimate of the parameter that can be achieved. In other words, the most precise estimate of the parameter can be obtained from the data measured at the time instant with the highest Fisher information. This time instant we call the optimal time and denote it t_opt.

Gaussian model

First we study the stochastic model described by equation (1.22) with the Gaussian probability distribution N(F(t), s²(t)) of the dissolved fraction data, where F(t) = F(t; a) is a dissolution profile and s²(t) = s²(t; a) is a given variance. Both of them contain only a single unknown parameter a characterizing the scale of dissolution. The probability density f(x, t) = f(x, t; a) is given by (1.21) and


After inserting the specific functions F(t) and s²(t) into (3.3) we can find the optimal time t_opt, where

t_opt = argmax_{t ∈ (0,∞)} J(t).   (3.4)

We have seen that the computation of the Fisher information J(t) is complicated; in the following examples it is therefore not given explicitly, however, the approach is similar.

Log-normal model

Let us investigate the stochastic model of dissolution described by formula (1.42) with the log-normal distribution logN(μ(t), σ²(t)) of the dissolved fraction data. The parametric functions μ(t) = μ(t; a) and σ²(t) = σ²(t; a) are defined by (1.46) and (1.47), and contain only a single unknown parameter a characterizing the scale of dissolution. Inserting probability density (1.36) into formula (2.2), with θ = a, gives by a similar approach as in the previous case

J(t) = (1/2) [σ²_a(t)/σ²(t)]² + [μ_a(t)]²/σ²(t),   (3.5)

where σ²_a(t) = dσ²(t)/da and μ_a(t) = dμ(t)/da. After inserting the specific functions F(t) and s²(t) into (1.46), (1.47) and consequently into (3.5), we can find the optimal time t_opt as a solution of (3.4). One can see that formula (3.3) with the parameters of the Gaussian distribution (1.20) is similar to formula (3.5) with the parameters of the log-normal distribution (1.43) given by (1.46) and (1.47).

Multiple measurements

If we can measure at m ≥ 2 time instants given by the vector t = (t₁, …, tₘ)ᵀ, where tᵢ > 0 for i = 1, …, m, and the measurements can be considered mutually independent, then due to Theorem 2.2 the Fisher information J(t) has the form

J(t) = Σ_{i=1}^m J(tᵢ).   (3.6)

Thus, if the optimal time t_opt exists, then since J(tᵢ) ≤ J(t_opt) for every i, it holds that

J(t) = Σ_{i=1}^m J(tᵢ) ≤ Σ_{i=1}^m J(t_opt) = m J(t_opt).

This means that it is better to measure m data at the optimal time t_opt than to measure data at m different time instants spread over an interval, as is usually done in dissolution experiments, see refs. [19], [26], [34].


Lower bound of the Fisher information

The lower bound (2.11) of the Fisher information is invariant with respect to the probability distribution; hence it exists and is the same for the stochastic models X(t) given by (1.22) and (1.35). Both models satisfy the equations

E X(t) = F(t) = F(t; a), var X(t) = s²(t) = s²(t; a),

where a is the single parameter of dissolution. Hence from (2.11), by applying θ = a, we obtain

J₂(t) = [F_a(t)]² / s²(t),   (3.7)

where F_a(t) = dF(t)/da. After inserting the specific functions F(t) and s²(t) into (3.7) we get an approximation of the optimal time t_opt as the time instant where J₂(t) attains its maximum.

The lower bound J₂(t) coincides with the Fisher information (3.3) if s²_a(t) = ds²(t)/da ≡ 0, which means that the variance s²(t) is independent of the single parameter a. In this case density (1.21) has the exponential family form (2.14) and satisfies the required conditions (2.13).

    3.1.2 Two-parameter model

A different approach has to be used if the model contains two unknown parameters. In this case we deal with the Fisher information matrix J(t) given by definition (2.1). We assume that the data can be measured at m ≥ 1 time instants given by the vector t = (t₁, …, tₘ)ᵀ, tᵢ > 0 for i = 1, …, m, and that the measurements are mutually independent.

Formula (2.16) suggests that the parameter vector θ = (a, b)ᵀ has an estimate with minimal variance if the data are measured at such time instants where the diagonal components of the inverse Fisher information matrix J⁻¹(t) reach their minima. These time instants we call optimal times and denote them t_opt.a for parameter a, resp. t_opt.b for parameter b. The diagonal elements of J⁻¹(t) we again call Rao-Cramer bounds, in reference to the definition of the Rao-Cramer bound for a single-parameter model given by (2.10). In contrast to the single-parameter case, we need to find them directly.

First we need to investigate the Fisher information matrix in the case when only a single measurement at a single time instant can be made (m = 1). An estimate of both parameters based on such a measurement may not exist, but the results of the following Sections are needed for further calculations.


Single measurement: Gaussian model

Let us consider the Gaussian stochastic model (1.22) with mean F(t) = F(t; a, b) and variance s²(t) = s²(t; a, b) containing two parameters a and b. First we calculate the Fisher information matrix J(t), where each of its components J_ij(t) is given by formula (2.1), θ = (a, b)ᵀ, and the probability density f(x; θ) is defined by (1.21). With a similar approach as in the one-parameter case we obtain the components of the matrix

J(t) = [ (1/2)[s²_a(t)/s²(t)]² + [F_a(t)]²/s²(t)                  (1/2) s²_a(t)s²_b(t)/s⁴(t) + F_a(t)F_b(t)/s²(t) ]
       [ (1/2) s²_a(t)s²_b(t)/s⁴(t) + F_a(t)F_b(t)/s²(t)          (1/2)[s²_b(t)/s²(t)]² + [F_b(t)]²/s²(t)        ]   (3.8)

where F_a(t) = ∂F(t)/∂a, s²_a(t) = ∂s²(t)/∂a, and analogously for the derivatives with respect to b. The determinant of the matrix is

|J(t)| = [s²_a(t)F_b(t) − s²_b(t)F_a(t)]² / (2s⁶(t)) ≥ 0.   (3.9)

If the determinant is nonzero, the inverse matrix J⁻¹(t) exists and has the form

J⁻¹(t) = (1/|J(t)|) [ (1/2)[s²_b(t)/s²(t)]² + [F_b(t)]²/s²(t)              −(1/2) s²_a(t)s²_b(t)/s⁴(t) − F_a(t)F_b(t)/s²(t) ]
                    [ −(1/2) s²_a(t)s²_b(t)/s⁴(t) − F_a(t)F_b(t)/s²(t)      (1/2)[s²_a(t)/s²(t)]² + [F_a(t)]²/s²(t)        ]   (3.10)

From inequality (2.16) we obtain the two optimal times t_opt.a and t_opt.b as the time instants of the minima of the diagonal elements of (3.10),

t_opt.a = argmin_{t ∈ (0,∞)} J⁻¹₁₁(t), t_opt.b = argmin_{t ∈ (0,∞)} J⁻¹₂₂(t).   (3.11)

At the time instant t_opt.a we should obtain the best estimate of parameter a, but not the best estimate of parameter b, and vice versa.

Determinant (3.9) of the Fisher information matrix (3.8) can be zero, especially if the variance s²(t) is parameter independent. The case of a singular Fisher information matrix represents a significant complication for the theory of the Rao-Cramer lower bound and is usually handled by resorting to the pseudoinverse of the Fisher matrix, see ref. [29]. It has been shown there that the Rao-Cramer lower bound does not exist for estimation problems with a singular Fisher information matrix, and thus we cannot find the optimal times.

The dissolution profile F(t) is given; thus the singularity of the Fisher information matrix (3.8) depends on the form of the function s²(t). For example, the determinant is zero if the variance of the measured data has the form

s²(t) = Σ_{i=0}^N αᵢ Fⁱ(t),   (3.12)


where αᵢ ∈ ℝ for i = 0, …, N. It holds that

s²_a(t) = F_a(t)P(t), s²_b(t) = F_b(t)P(t),

where

P(t) = Σ_{i=1}^N i αᵢ F^{i−1}(t),   (3.13)

thus

s²_a(t)F_b(t) − s²_b(t)F_a(t) ≡ 0,   (3.14)

which gives |J(t)| ≡ 0 for the variance s²(t) given by (3.12). Hence we are not able to find the optimal times, in particular if the variance takes the form

s²(t) = pF(t)(1 − F(t)) + qF(t) + r   (3.15)

as proposed in Section 1.2.1.

Single measurement: Log-normal model

For the log-normal stochastic model (1.43) with parameters μ(t) = μ(t; a, b) and σ²(t) = σ²(t; a, b) defined by (1.46) and (1.47), the approach is similar. The Fisher information matrix for the parameter vector θ = (a, b)ᵀ and probability density f(x; θ) given by (1.36) has the form

J(t) = [ (1/2)[σ²_a(t)/σ²(t)]² + [μ_a(t)]²/σ²(t)                  (1/2) σ²_a(t)σ²_b(t)/σ⁴(t) + μ_a(t)μ_b(t)/σ²(t) ]
       [ (1/2) σ²_a(t)σ²_b(t)/σ⁴(t) + μ_a(t)μ_b(t)/σ²(t)          (1/2)[σ²_b(t)/σ²(t)]² + [μ_b(t)]²/σ²(t)        ]   (3.16)

with the determinant

|J(t)| = [σ²_a(t)μ_b(t) − σ²_b(t)μ_a(t)]² / (2σ⁶(t)) ≥ 0.   (3.17)

If the determinant is nonzero, the inverse Fisher information matrix has the form

J⁻¹(t) = (1/|J(t)|) [ (1/2)[σ²_b(t)/σ²(t)]² + [μ_b(t)]²/σ²(t)              −(1/2) σ²_a(t)σ²_b(t)/σ⁴(t) − μ_a(t)μ_b(t)/σ²(t) ]
                    [ −(1/2) σ²_a(t)σ²_b(t)/σ⁴(t) − μ_a(t)μ_b(t)/σ²(t)      (1/2)[σ²_a(t)/σ²(t)]² + [μ_a(t)]²/σ²(t)        ]   (3.18)

The optimal times t_opt.a and t_opt.b correspond to the argument minima of the diagonal elements of the matrix J⁻¹(t), as given by (3.11). Again, if the Fisher information matrix is singular, we cannot find the Rao-Cramer bounds nor the optimal times. Similarly as in the previous case, it can be shown that determinant (3.17) is zero if

σ²(t) = Σ_{i=0}^N βᵢ μⁱ(t),   (3.19)

where βᵢ ∈ ℝ. The functions μ(t) and σ²(t) depend on the functions F(t) and s²(t); hence the value of determinant (3.17) of the two-parameter log-normal model depends on their form. We show that the determinant is zero if the variance s²(t) has form (3.12). Differentiation of (1.46) and (1.47) with respect to parameters a and b gives

σ²_a(t) = s²_a(t)P(t) − F_a(t)Q(t), σ²_b(t) = s²_b(t)P(t) − F_b(t)Q(t),

μ_a(t) = F_a(t)R(t) − (1/2)s²_a(t)P(t), μ_b(t) = F_b(t)R(t) − (1/2)s²_b(t)P(t),

where

P(t) = 1/(F²(t) + s²(t)), Q(t) = 2s²(t)/(F(t)(F²(t) + s²(t))), R(t) = (1/2)Q(t) + 1/F(t).

It holds that

σ²_a(t)μ_b(t) = F_b(t)s²_a(t)P(t)R(t) − F_a(t)F_b(t)Q(t)R(t) − (1/2)s²_a(t)s²_b(t)P²(t) + (1/2)F_a(t)s²_b(t)P(t)Q(t),

σ²_b(t)μ_a(t) = F_a(t)s²_b(t)P(t)R(t) − F_a(t)F_b(t)Q(t)R(t) − (1/2)s²_a(t)s²_b(t)P²(t) + (1/2)F_b(t)s²_a(t)P(t)Q(t),

and

σ²_a(t)μ_b(t) − σ²_b(t)μ_a(t) = [F_b(t)s²_a(t) − F_a(t)s²_b(t)] P(t)[R(t) − (1/2)Q(t)].   (3.20)

For the variance s²(t) given by (3.12), (3.14) holds, hence |J(t)| ≡ 0 for the Fisher information matrix (3.16).


Multiple measurements

If the measurements can be taken at m ≥ 2 time instants given by the vector t = (t₁, …, tₘ)ᵀ, tᵢ > 0 for i = 1, …, m, and the dissolution data are mutually independent, then due to Theorem 2.2 the Fisher information matrix of the parameter vector θ = (a, b)ᵀ takes the form

J(t) = Σ_{i=1}^m J(tᵢ),   (3.21)

where J(tᵢ) is the Fisher information matrix of a single measurement studied previously. If the determinant of matrix (3.21) is nonzero, then we can find the inverse matrix J⁻¹(t) and obtain the optimal times as the arguments of the minima of its diagonal elements,

t_opt.a = argmin_{t ∈ ℝ₊ᵐ} J⁻¹₁₁(t), t_opt.b = argmin_{t ∈ ℝ₊ᵐ} J⁻¹₂₂(t).   (3.22)

For example, let us have m = 2 independent measurements, t = (t₁, t₂)ᵀ. Let J_jk(t) be the component of the Fisher information matrix J(t) in the j-th row and k-th column, j, k ∈ {1, 2}. Then (3.21) has the form

J(t) = [ J₁₁(t₁) + J₁₁(t₂)    J₁₂(t₁) + J₁₂(t₂) ]
       [ J₁₂(t₁) + J₁₂(t₂)    J₂₂(t₁) + J₂₂(t₂) ]   (3.23)

and for its determinant it holds that

|J(t)| = [J₁₁(t₁) + J₁₁(t₂)][J₂₂(t₁) + J₂₂(t₂)] − [J₁₂(t₁) + J₁₂(t₂)]²

= J₁₁(t₁)J₂₂(t₁) − J₁₂²(t₁) + J₁₁(t₂)J₂₂(t₂) − J₁₂²(t₂) + J₁₁(t₁)J₂₂(t₂) + J₁₁(t₂)J₂₂(t₁) − 2J₁₂(t₁)J₁₂(t₂)

= |J(t₁)| + |J(t₂)| + J₁₁(t₁)J₂₂(t₂) + J₁₁(t₂)J₂₂(t₁) − 2J₁₂(t₁)J₁₂(t₂).   (3.24)

If the determinant is nonzero, then the inverse Fisher information matrix has the form

J⁻¹(t) = (1/|J(t)|) [ J₂₂(t₁) + J₂₂(t₂)      −J₁₂(t₁) − J₁₂(t₂) ]
                    [ −J₁₂(t₁) − J₁₂(t₂)     J₁₁(t₁) + J₁₁(t₂)  ].   (3.25)

It can be easily verified that for its diagonal elements it holds that

J⁻¹ᵢᵢ(t₁, t₂) = J⁻¹ᵢᵢ(t₂, t₁), i = 1, 2,


thus if an optimal time (t⁽¹⁾_opt, t⁽²⁾_opt)ᵀ for any of the parameters exists, then (t⁽²⁾_opt, t⁽¹⁾_opt)ᵀ is an optimal time too.

If the Fisher information matrix J(t) for a single measurement is singular, the Fisher information matrix (3.23) does not necessarily need to be singular too. We show this for the Gaussian model (1.22) with the variance s²(t) given by (3.12); for the log-normal model (1.43) the approach is similar. The components of the Fisher information matrix J(t) have the form

J₁₁(t) = (1/2)[F_a(t)P(t)/s²(t)]² + [F_a(t)]²/s²(t),

J₂₂(t) = (1/2)[F_b(t)P(t)/s²(t)]² + [F_b(t)]²/s²(t),

J₁₂(t) = (1/2)F_a(t)F_b(t)P²(t)/s⁴(t) + F_a(t)F_b(t)/s²(t),

where F(t) is a two-parameter dissolution profile, s²(t) is given by (3.12) and P(t) is given by (3.13). In the previous Sections we showed that the Fisher information matrix J(t) of a single measurement has zero determinant for the Gaussian model (1.22) with variance (3.12). Hence from (3.24) we obtain after a short calculation

|J(t)| = [F_a(t₁)F_b(t₂) − F_a(t₂)F_b(t₁)]² [P²(t₁) + 2s²(t₁)][P²(t₂) + 2s²(t₂)] / (4s⁴(t₁)s⁴(t₂)).

Here we can see that the Fisher information matrix (3.23) of the Gaussian model (1.22) with variance (3.12) is singular if

F_a(t₁)F_b(t₂) − F_a(t₂)F_b(t₁) = 0.   (3.26)

In particular, matrix (3.23) is singular for t₁ = t₂. In light of these facts it seems that the singularity of the Fisher information matrix points to our inability to estimate two parameters from measurements made at a single time instant. As we have seen, adding another time instant can solve the problem.

It can be easily verified that (3.26) does not occur, for t₁ ≠ t₂, for the two-parameter dissolution profiles F(t) studied in the first chapter.
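A sketch of the design search (3.22) for m = 2 measurements, using as an illustration (not from the thesis) the two-parameter Weibull profile F(t) = 1 − e^{−a·t^b} with a constant, parameter-independent variance r: a single measurement then gives a singular, rank-one matrix (3.8), but two distinct time instants make the summed matrix (3.23) invertible, in agreement with (3.26). All numerical values and the grid are illustrative:

```python
import math

# Weibull profile F(t) = 1 - exp(-a*t^b), constant variance r (illustrative values).
a, b, r = 0.5, 1.5, 0.04

def grad_F(t):
    """Partial derivatives (F_a(t), F_b(t)) of the Weibull profile."""
    e = math.exp(-a * t**b)
    return (t**b * e, a * t**b * math.log(t) * e)

def fisher(t1, t2):
    """Fisher information matrix (3.21) for two measurements, constant variance."""
    M = [[0.0, 0.0], [0.0, 0.0]]
    for t in (t1, t2):
        ga, gb = grad_F(t)
        M[0][0] += ga * ga / r
        M[0][1] += ga * gb / r
        M[1][1] += gb * gb / r
    M[1][0] = M[0][1]
    return M

def inv11(M):
    """(1,1) element of the inverse: the Rao-Cramer bound for parameter a."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return M[1][1] / det

# Grid search for the optimal pair (3.22) for parameter a; t1 < t2 avoids the
# singular case t1 = t2 noted in the text.
grid = [0.1 * k for k in range(2, 80)]
best = min(((t1, t2) for t1 in grid for t2 in grid if t1 < t2),
           key=lambda p: inv11(fisher(*p)))
print(best)
```

In the thesis's own setting the variance (3.15) would itself depend on the parameters; the constant-variance choice here is only the simplest case exhibiting the single-measurement singularity.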


3.2 Parameters of dissolution models and their maximum likelihood estimation

The probability distribution of the dissolution data is assumed to be known; thus we can apply the maximum likelihood method for estimation of the parameters. The vector C = (C₁, …, Cₘ)ᵀ contains m ≥ 1 mutually independent dissolved-fraction data measured at the time instants given by the vector t = (t₁, …, tₘ)ᵀ; we do not exclude tᵢ = tⱼ for i ≠ j. We assume that the parametric forms of the dissolution profile F(t) = F(t; θ) and variance s²(t) = s²(t; θ) are known. The parameter vector can be, for example, θ = a, resp. θ = (a, b)ᵀ, for the models investigated in Section 3.1. In this Section, the method of maximum likelihood estimation is applied to the stochastic models of dissolution with Gaussian and log-normal probability distributions studied in Section 1.2. It can be easily verified that their probability densities (1.21) and (1.36) satisfy equation (2.22).

3.2.1 Gaussian model

Let us assume the dissolution is described by the stochastic model (1.22) with the Gaussian distribution N(F(t), s²(t)), where F(t) = F(t; θ) and s²(t) = s²(t; θ) depend on the parameter vector θ ∈ Θ ⊆ ℝⁿ. The log-likelihood function l(θ) = l(θ; C) has the form

l(θ) = ln Π_{i=1}^m fᵢ(θ) = Σ_{i=1}^m ln fᵢ(θ),   (3.27)

where m is the number of observations and

fᵢ(θ) = (1/√(2πs²(tᵢ))) exp[−(Cᵢ − F(tᵢ))²/(2s²(tᵢ))].   (3.28)

It holds that

ln fᵢ(θ) = −(1/2) ln(2πs²(tᵢ)) − (Cᵢ − F(tᵢ))²/(2s²(tᵢ)),

hence

∂ ln fᵢ(θ)/∂θⱼ = −(1/2)[∂s²(tᵢ)/∂θⱼ]/s²(tᵢ) + [2(∂F(tᵢ)/∂θⱼ)s²(tᵢ)(Cᵢ − F(tᵢ)) + (∂s²(tᵢ)/∂θⱼ)(Cᵢ − F(tᵢ))²] / (2s⁴(tᵢ))

= (1/2) [∂s²(tᵢ)/∂θⱼ]/s²(tᵢ) [(Cᵢ − F(tᵢ))²/s²(tᵢ) − 1] + (∂F(tᵢ)/∂θⱼ)(Cᵢ − F(tᵢ))/s²(tᵢ)


for j = 1, …, n. Thus the system of likelihood equations (2.20) takes the form

(1/2) Σ_{i=1}^m [∂s²(tᵢ)/∂θⱼ]/s²(tᵢ) [(Cᵢ − F(tᵢ))²/s²(tᵢ) − 1] + Σ_{i=1}^m (∂F(tᵢ)/∂θⱼ)(Cᵢ − F(tᵢ))/s²(tᵢ) = 0   (3.29)

for j = 1, …, n, where n is the number of parameters. The term on the left-hand side of (3.29) is equal to the component Uⱼ of the score vector U = (U₁, …, Uₙ)ᵀ. The Newton-Raphson (2.23) or Fisher scoring (2.24) iterative method can now be applied. The Fisher information matrix used in (2.24) has, due to Theorem 2.2, the form

J(θ) = Σ_{i=1}^m J(tᵢ; θ), tᵢ ∈ t,   (3.30)

where J(tᵢ; θ) = J(tᵢ), i = 1, …, m, are the Fisher information matrices of the parameter vector θ in the case of a single measurement made at the time instant tᵢ, and m is the number of observations.

Note that if the variance s²(t) is constant, then the solution of the maximum likelihood equations (3.29) coincides with the solution obtained by the least-squares estimation method. To be more specific, the least-squares minimization problem

θ̂ = argmin_θ Σ_{i=1}^m (Cᵢ − F(tᵢ))²   (3.31)

leads to the system of equations

Σ_{i=1}^m (∂F(tᵢ)/∂θⱼ)(Cᵢ − F(tᵢ)) = 0, j = 1, …, n,   (3.32)

where m is the number of observations and n is the number of parameters. If we insert a constant variance s²(t) ≡ c > 0 into equation (3.29), we obtain (3.32).
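A sketch of solving the likelihood equations for the homogeneous profile F(t) = 1 − e^{−at} with constant variance, where (3.29) reduces to the least-squares equation (3.32) with the single unknown a. The data, the perturbations and the true value a = 0.7 are illustrative, and for robustness simple bisection is used in place of the Newton-Raphson iteration:

```python
import math

# Illustrative dissolved-fraction data around F(t) = 1 - exp(-a*t), a = 0.7.
times = [0.5, 1.0, 2.0, 4.0, 8.0]
a_true = 0.7
C = [1.0 - math.exp(-a_true * t) + err
     for t, err in zip(times, [0.02, -0.01, 0.015, -0.005, 0.01])]

def score(a):
    # U(a) from (3.32): sum of F_a(t_i)*(C_i - F(t_i)), F_a(t) = t*exp(-a*t)
    return sum(t * math.exp(-a * t) * (c - (1.0 - math.exp(-a * t)))
               for t, c in zip(times, C))

# Bisection on U(a) = 0: U is positive for small a and negative for large a here.
lo, hi = 0.01, 10.0
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if score(mid) > 0.0:
        lo = mid
    else:
        hi = mid
a_mle = 0.5 * (lo + hi)
print(a_mle)   # close to a_true = 0.7
```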

3.2.2 Log-normal model

In contrast to the previous part, we assume that the dissolution is described by the stochastic model with the log-normal distribution logN(μ(t), σ²(t)), where μ(t) = μ(t; θ) and σ²(t) = σ²(t; θ) are known functions described by equations (1.46) and (1.47); θ ∈ Θ ⊆ ℝⁿ is the vector of parameters. The log-likelihood function l(θ) = l(θ; C) has form (3.27), where

fᵢ(θ) = (1/(Cᵢ√(2πσ²(tᵢ)))) exp[−(ln Cᵢ − μ(tᵢ))²/(2σ²(tᵢ))].   (3.33)


The system of likelihood equations is obtained similarly as in the previous case and has the form

(1/2) Σ_{i=1}^m [∂σ²(tᵢ)/∂θⱼ]/σ²(tᵢ) [(ln Cᵢ − μ(tᵢ))²/σ²(tᵢ) − 1] + Σ_{i=1}^m (∂μ(tᵢ)/∂θⱼ)(ln Cᵢ − μ(tᵢ))/σ²(tᵢ) = 0   (3.34)

for j = 1, …, n, where m is the number of observations and n is the number of parameters. The term on the left-hand side of (3.34) is equal to the component Uⱼ of the score vector U = (U₁, …, Uₙ)ᵀ. It can be inserted into (2.23) or (2.24) to obtain a numerical solution of the likelihood equations. The Fisher information matrix used in (2.24) has, due to Theorem 2.2, form (3.30).


    3.3 Examples

3.3.1 Stochastic homogeneous model

As an example of a stochastic model with the single parameter θ = a we take the stochastic homogeneous model with mean specified by (1.7). The form of its variance s²(t) substantially influences the time course of the Fisher information; thus we introduce several examples to illustrate this fact. The analytical form of the Fisher information (3.5) for the log-normal model is complicated in all cases and is not given here.

Example 1

The simplest example we can take is the stochastic homogeneous model with a constant variance,

s²(t) ≡ r, r > 0.   (3.35)

The Fisher information of the Gaussian model (1.22) takes, after substituting (1.7) and (3.35) into formula (3.3), the form

J(t) = J₂(t) = t²e^{−2at}/r.   (3.36)

For the variance given by (3.35), Fisher information (3.36) satisfies J(0) = 0 and lim_{t→∞} J(t) = 0. The optimal time for this model is obtained from (3.4) and has the analytic form t_opt = 1/a. Time courses of the Fisher information J(t) given by (3.3) and (3.5) for the Gaussian and log-normal models are shown in Fig. 3.1.
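The analytic result t_opt = 1/a can be checked by maximizing (3.36) on a grid; the values a = 0.8 and r = 0.05 are illustrative:

```python
import math

# Grid maximization of J(t) = t^2 * exp(-2*a*t) / r, formula (3.36).
a, r = 0.8, 0.05

def J(t):
    return t * t * math.exp(-2.0 * a * t) / r

grid = [0.001 * k for k in range(1, 20000)]
t_max = max(grid, key=J)
print(t_max, 1.0 / a)   # both close to 1.25
```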

Example 2

Another example we consider is the stochastic homogeneous model with the nonconstant variance

s²(t) = pe^{−at}(1 − e^{−at}), p > 0.   (3.37)

It holds that s²(0) = 0 and lim_{t→∞} s²(t) = 0. The Fisher information of the Gaussian model (1.22) and its lower bound (2.11) have the form

J(t) = (1/2) t² [(1 − 2e^{−at})/(1 − e^{−at})]² + t²e^{−at}/(p(1 − e^{−at})),   (3.38)

J₂(t) = t²e^{−at}/(p(1 − e^{−at})).   (3.39)


These functions are plotted together with the Fisher information (3.5) of the log-normal model in Fig. 3.1. It can be seen there that the functions J(t) tend to infinity with increasing t in this example. This is caused by the formal continuation of dissolution in model (1.7) for large t while the variance s^2(t) tends to zero. In this case the Fisher information J(t) gives no optimal time according to (3.4).

We have a natural requirement that no information about the dissolution process can be obtained once it is finished. The example of the Fisher information with variance (3.37) contradicts this requirement, hence the model cannot be accepted.

    Example 3

In the last example we take the measurement error into account. Variance (3.37) of the stochastic homogenous model can be, for example, modified to the form

    s^2(t) = p e^{-at} (1 - e^{-at}) + q (1 - e^{-at}),   p > 0, q > 0.   (3.40)

The analytic form of the Fisher information is complicated and is not given here. The lower bound of the Fisher information has the form

    J_2(t) = \frac{t^2 e^{-2at}}{(1 - e^{-at})(p e^{-at} + q)}.   (3.41)

An example of this function, together with the time courses of the Fisher information (3.3), resp. (3.5), of the Gaussian, resp. log-normal, model is given in Fig. 3.1. It holds J(0) = 0 and \lim_{t \to \infty} J(t) = 0, thus our requirements on the Fisher information are satisfied for the variance given by (3.40). All the optimal times can be found with an appropriate numeric method.

If we take into account variance (3.37) with a constant measurement error,

    s^2(t) = p e^{-at} (1 - e^{-at}) + r,   (3.42)

where p > 0, r > 0 are constants, then the course of the Fisher information is very similar to the one with variance (3.40) depicted in Fig. 3.1. We can see that a variance which does not tend to zero for increasing t is crucial for the behavior of the Fisher information of the stochastic homogenous model.
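For instance, the maximizer of the lower bound (3.41) can be located by a simple grid search; the parameter values a = 1, p = 0.1, q = 0.02 below are illustrative choices of ours:

```python
import math

a, p, q = 1.0, 0.1, 0.02

def J2(t):
    """Lower bound (3.41): t^2 e^{-2at} / ((1 - e^{-at})(p e^{-at} + q))."""
    e = math.exp(-a * t)
    return t ** 2 * e ** 2 / ((1.0 - e) * (p * e + q))

grid = [0.001 * k for k in range(1, 10001)]   # avoid t = 0, where J2 -> 0
t_opt = max(grid, key=J2)
```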


    3.3.2 Stochastic Weibull model

As an example of a model with two parameters we take the stochastic Weibull model with mean described by (1.8). In the next examples we see how its variance s^2(t) influences the form of the Rao-Cramer bounds for both parameters in the case of a single measurement. We have seen that for variance (3.15) the Fisher information matrix is singular, and thus we have to find a different function that satisfies our requirements. Examples of such functions presented in the following are shown in Fig. 3.2. The analytical form of the inverse Fisher information matrices described in Section 3.1.2 is complicated in all examples and is not given here.

Fig. 3.2: Different functions of variance s^2(t). (A) s^2(t) corresponding to the Weibull model (1.8) with parameters a = 1, b = 2: variance (3.43), p = 0.1 (black); variance (3.44), p = 0.1, r = 0.02 (blue); variance (3.45), p = 0.1, q = 0.02 (red). (B) Variance corresponding to the Hixson-Crowell model (1.16), a = 0.5, b = 2: variance (3.35), r = 0.02 (black); variance (3.46), p = 0.1 (blue); variance (3.47), p = 0.1, r = 0.015 (red).

    Example 1

As the first example of the stochastic Weibull model with a variance that gives a nonzero determinant of the Fisher information matrices described in Section 3.1.2 we take the function

    s^2(t) = p t^b (1 - F(t)) = p t^b e^{-a t^b},   p > 0,   (3.43)

where F(t) is the dissolution profile of the Weibull model described by formula (1.8) and a, b are its parameters. The component t^b in (3.43) ensures that the determinants of the Fisher information matrices of the Gaussian and log-normal


stochastic models are nonzero. It can be easily verified that (3.43) satisfies our assumptions about the variance of the dissolution given in Section 1.2.1. Inserting F(t) and s^2(t) into (3.10) and (3.18) gives the inverse Fisher information matrices of the Gaussian and log-normal models. Time courses of the diagonal functions J^{-1}_{11}(t) and J^{-1}_{22}(t) are shown in Fig. 3.3. The discontinuity of the function J^{-1}_{22}(t) at the time instant t = 1 can be seen there, which is due to the presence of a logarithm in the denominator of the function. While the one-parameter model had an increasing Fisher information, resp. a decreasing Rao-Cramer bound, if its variance tended to zero for increasing t, this does not hold for the two-parameter model.

For the model with Gaussian distribution, the optimal time to measure for parameter a can be evaluated from equation (3.11) and has the analytic form t_{opt,a} = a^{-1/b}. The parameter b has the optimal time to measure t_{opt,b} = 0, which implies that for the best estimate of parameter b we have to measure when nothing, or only a very small amount, of the solvent is dissolved. This is analogous to the previous examples of the single-parameter model, where the models without measurement errors had the optimal time at infinity, i.e. when everything is dissolved.
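The analytic optimum t_{opt,a} = a^{-1/b} can be traced to the factor t^b e^{-a t^b}: with variance (3.43), the information about a carried by a single measurement is (∂F/∂a)^2 / s^2(t) = t^b e^{-a t^b}/p. A quick numeric check of its maximizer (a Python sketch of ours, with a = 1, b = 2 as in Fig. 3.3):

```python
import math

a, b = 1.0, 2.0

def info_factor(t):
    """(dF/da)^2 / s^2(t) for Weibull mean (1.8) and variance (3.43),
    up to the constant 1/p: t^b * exp(-a * t^b)."""
    return t ** b * math.exp(-a * t ** b)

grid = [0.001 * k for k in range(1, 3001)]
t_opt = max(grid, key=info_factor)   # analytic value: a ** (-1.0 / b)
```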

For the model with log-normal distribution, the optimal time for the parameter a is close to the optimal time of the Gaussian model, as one can see in Fig. 3.3. In contrast to the Gaussian model, the optimal time for estimation of

Fig. 3.3: Rao-Cramer bounds for the stochastic Weibull model with mean (1.8) and variance (3.43), a = 1, b = 2, p = 0.1. Rao-Cramer bound for (A) parameter a and (B) parameter b of the Gaussian (black) and log-normal (red) stochastic models. The optimal times are t_{opt,a} = 1, t_{opt,b} = 0 (Gaussian model) and t_{opt,a} = 1.042, t_{opt,b} = 0.259 (log-normal model).


the parameter b is positive. In the example depicted in Fig. 3.3 (B) the optimal time is t_{opt,b} = 0.259. But for the given parameters it holds F(t_{opt,b}) = 0.065, which can be interpreted as the best time to measure being when only 6.5% of the solvent is dissolved.

    Example 2

As the second example we take the stochastic Weibull model with variance (3.43) with an added constant measurement error,

    s^2(t) = p t^b e^{-a t^b} + r,   p > 0, r > 0,   (3.44)

where a, b are parameters of the Weibull model (1.8). Inserting F(t) and s^2(t) into (3.10) and (3.18) gives the inverse Fisher information matrices of the Gaussian and log-normal models. Time courses of their diagonal functions J^{-1}_{11}(t) and J^{-1}_{22}(t) are shown in Fig. 3.4. A discontinuity of the Rao-Cramer bound of the parameter b at the time instant t = 1 can be seen there, due to the presence of a logarithm in the denominator of the function J^{-1}_{22}(t). The optimal times to measure can be obtained from (3.11). For a model with Gaussian distribution, adding a constant measurement error to variance (3.43) has no effect on the value of the optimal time t_{opt,a}, but the optimal time t_{opt,b} is no longer zero.

Fig. 3.4: Rao-Cramer bounds for the stochastic Weibull model with mean (1.8) and variance (3.44), a = 1, b = 2, p = 0.1, r = 0.02. Rao-Cramer bound for (A) parameter a and (B) parameter b of the Gaussian (black) and log-normal (red) stochastic models. The optimal times are t_{opt,a} = 1, t_{opt,b} = 0.382 (Gaussian model) and t_{opt,a} = 1.077, t_{opt,b} = 0.533 (log-normal model).


    Example 3

As the last example we take the stochastic Weibull model with variance

    s^2(t) = p t^b e^{-a t^b} + q (1 - e^{-a t^b}),   p > 0, q > 0,   (3.45)

where a and b are parameters of the Weibull model (1.8). Rao-Cramer bounds of the Gaussian and the log-normal stochastic models are shown in Fig. 3.5. At the time instant t = 1 the discontinuity of J^{-1}_{22}(t) appears again. The Gaussian model has the optimal time t_{opt,b} = 0.056, where F(t_{opt,b}) = 0.003. This can be interpreted as the best time to measure being when 0.3% of the solvent is dissolved. For the log-normal model it holds t_{opt,b} = 0.25 and F(t_{opt,b}) = 0.06. As one can see, in this example with variance (3.45) the optimal times for the parameter b correspond to very small values of the dissolved fraction. As we have seen in the previous example, this can be remedied by adding a constant to the variance s^2(t).

Fig. 3.5: Rao-Cramer bounds for the stochastic Weibull model with mean (1.8) and variance (3.45), a = 1, b = 2, p = 0.1, q = 0.02. Rao-Cramer bound for (A) parameter a and (B) parameter b of the Gaussian (black) and log-normal (red) stochastic models. The optimal times are t_{opt,a} = 0.719, t_{opt,b} = 0.056 (Gaussian model) and t_{opt,a} = 0.862, t_{opt,b} = 0.25 (log-normal model).


    3.3.3 Stochastic Hixson-Crowell model

In this section we investigate the properties of the Rao-Cramer bounds of the Gaussian model (1.22) with mean described by the dissolution profile F(t) of the Hixson-Crowell model (1.16) and a variance s^2(t). The measurements are assumed to be taken at two different time instants, t = (t_1, t_2)^T. As we have seen in the theoretical part of this chapter, this allows us to use those functions of variance s^2(t) for which the Fisher information matrix J(t) is singular. The analytical form of the inverse Fisher information matrix J^{-1}(t) is complicated in all examples and is not given here.

    Example 1

The simplest example of the stochastic Hixson-Crowell model is the one with constant variance (3.35). The inverse diagonal functions of the Fisher information matrix of the Gaussian model (1.22), 1/J^{-1}_{11}(t) and 1/J^{-1}_{22}(t), are shown in Fig. 3.6. We use this approach for a better view, as the diagonal functions have a large number of discontinuities. We can see that although variance (3.35) gives no single optimal time, due to the singularity of the Fisher information matrix, for measurements at two different time instants it gives optimal times for both parameters.
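The two-time optimum can be reproduced by brute force. The Python sketch below assumes the Hixson-Crowell profile (1.16) reads F(t) = 1 - (1 - at)^b (consistent with variance (3.46) being proportional to F(t)(1 - F(t))); for two measurements with constant variance r the Fisher information matrix is J = (g(t_1)g(t_1)^T + g(t_2)g(t_2)^T)/r with g = (∂F/∂a, ∂F/∂b)^T, and we minimize [J^{-1}]_{11} on a grid:

```python
import math

a, b, r = 0.5, 2.0, 0.02

def g(t):
    # gradient of F(t) = 1 - (1 - a t)^b with respect to (a, b)
    u = 1.0 - a * t
    return (b * t * u ** (b - 1.0), -(u ** b) * math.log(u))

def rc_bound_a(t1, t2):
    """[J^{-1}]_{11} for two measurements with constant variance r."""
    (ga1, gb1), (ga2, gb2) = g(t1), g(t2)
    det = (ga1 * gb2 - gb1 * ga2) ** 2 / r ** 2   # det of the 2x2 matrix J
    if det == 0.0:
        return float("inf")
    return (gb1 ** 2 + gb2 ** 2) / r / det

grid = [0.02 * k for k in range(1, 100)]          # t in (0, 2)
t1_opt, t2_opt = min(((t1, t2) for t1 in grid for t2 in grid if t1 < t2),
                     key=lambda pair: rc_bound_a(*pair))
```

Under these assumptions the search lands near the optimal times reported in Fig. 3.6.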

Fig. 3.6: Inverse Rao-Cramer bounds for the Gaussian model (1.22) with mean described by (1.16) of the Hixson-Crowell model and variance given by (3.35), a = 0.5, b = 2, r = 0.02. (A) inverse Rao-Cramer bound of the scale parameter a and (B) inverse Rao-Cramer bound of the shape parameter b. The optimal times are t_{opt,a} = (0.398, 1.6)^T, t_{opt,b} = (0.467, 1.647)^T.


    Example 2

As the second example we take the stochastic Hixson-Crowell model with a variance without measurement error,

    s^2(t) = \begin{cases} p (1 - at)^b \left( 1 - (1 - at)^b \right) & \text{for } t \in [0, 1/a], \\ 0 & \text{for } t > 1/a, \end{cases}   p > 0.   (3.46)

The inverse diagonal functions of the Fisher information matrix of the Gaussian model (1.22), 1/J^{-1}_{11}(t) and 1/J^{-1}_{22}(t), are shown in Fig. 3.7. We can see that the functions reach their maxima if at least one of the times to measure is equal to 2, i.e. when 100% is dissolved. This is analogous to the single-parameter case, where the stochastic homogenous model had the optimal time at infinity, i.e. when everything is dissolved, for a variance tending to zero with increasing t.

Fig. 3.7: Inverse Rao-Cramer bounds for the Gaussian model (1.22) with mean described by (1.16) of the Hixson-Crowell model and variance given by (3.46), a = 0.5, b = 2, p = 0.1. (A) inverse Rao-Cramer bound of the scale parameter a and (B) inverse Rao-Cramer bound of the shape parameter b. The optimal times are t_{opt,a} = (1.235, 2)^T, t_{opt,b} = (1.284, 2)^T.


    Example 3

As the last example we take

    s^2(t) = \begin{cases} p (1 - at)^b \left( 1 - (1 - at)^b \right) + r & \text{for } t \in [0, 1/a], \\ 0 & \text{for } t > 1/a, \end{cases}   p > 0, r > 0.   (3.47)

The inverse diagonal elements of the Fisher information matrix of the Gaussian model (1.22), 1/J^{-1}_{11}(t) and 1/J^{-1}_{22}(t), are shown in Fig. 3.8. We can see that a variance which does not tend to zero for t → 1/a is crucial for the behavior of the Fisher information of the stochastic Hixson-Crowell model.

Fig. 3.8: Inverse Rao-Cramer bounds for the Gaussian model (1.22) with mean described by (1.16) of the Hixson-Crowell model and variance given by (3.47), a = 0.5, b = 2, p = 0.1, r = 0.015. (A) inverse Rao-Cramer bound of the scale parameter a and (B) inverse Rao-Cramer bound of the shape parameter b. The optimal times are t_{opt,a} = (0.431, 1.697)^T, t_{opt,b} = (0.533, 1.744)^T.


4 Computational procedures and examples

    4.1 Simulation of random processes

Sample paths of random processes were used to illustrate the theory studied in the first chapter. In this section we describe our approach to their numeric approximation.

    Simulation of a Wiener process

Let W(t) be a standard Wiener process given by Definition 1.1. We are looking for its approximation W_n(t) at n + 1 equidistant time instants t_i = iΔt, where i = 0, . . . , n, and Δt is the time step of the approximation. A commonly employed procedure for generating the approximation W_n(t_i) is via the recursive equation

    W_n(t_i) = W_n(t_{i-1}) + N_i \sqrt{\Delta t},   W_n(0) = 0,   (4.1)

where {N_i} is a sequence of (simulated) independent, identically distributed Gaussian random variables with E N_i = 0 and var N_i = 1. For more theoretical details see refs. [28], [31]. It can be easily verified that

    E W_n(t_i) = 0,   var W_n(t_i) = t_i.

We use the described approach for the simulation of the Wiener process in the function wiener(t), where t is a vector of time instants and the N_i are pseudorandom variables obtained from the Matlab library program randn. The function wiener returns a vector of functional values of the simulated Wiener process W_n(t) at the time instants t_i. For example, a sample path of a Wiener process on the time interval from t_0 = 0 to t_n = 5 with step Δt = 0.1 can be plotted with the command


    >> t=0:0.1:5;

    >> plot(t, wiener(t));

The function wiener(t) has been used in the Matlab programs stochastic.m and stochastic2.m, which return plots of sample paths of the random processes (1.22), (1.34), (1.42) and of the Wiener process W(t). For more info type help stochastic, resp. help stochastic2, into the Matlab command line.
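The recursion (4.1) is language-independent; a minimal Python sketch of the same scheme (the function name wiener_path is ours) reads:

```python
import random

def wiener_path(t):
    """Approximate a standard Wiener process on the time grid t via the
    recursion (4.1): W(t_i) = W(t_{i-1}) + N_i * sqrt(dt)."""
    w = [0.0]
    for i in range(1, len(t)):
        dt = t[i] - t[i - 1]
        w.append(w[-1] + random.gauss(0.0, 1.0) * dt ** 0.5)
    return w

# same grid as the Matlab example: t from 0 to 5 with step 0.1
t = [0.1 * i for i in range(51)]
path = wiener_path(t)
```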

    Numerical solution of stochastic differential equations

A similar approach as in the previous case has been used for the numerical solution of stochastic differential equations. Let X(t) be a random process satisfying the stochastic differential equation

    dX(t) = \mu(t, X(t)) \, dt + \sigma(t, X(t)) \, dW(t),   X(0) = X_0,

where \mu and \sigma are given functions and W(t) is a standard Wiener process given by Definition 1.1. With similar notation as in the previous case we can obtain a numerical approximation X_n(t) at n + 1 equidistant time instants t_i = iΔt, i = 0, . . . , n, from the recursive equation (see refs. [5], [13], [30])

    X_n(t_i) = X_n(t_{i-1}) + \mu(t_{i-1}, X_n(t_{i-1})) \Delta t + \sigma(t_{i-1}, X_n(t_{i-1})) N_i \sqrt{\Delta t},   (4.2)

where {N_i} is a sequence of (simulated) independent, identically distributed Gaussian random variables with E N_i = 0, var N_i = 1, and t_i = iΔt is the i-th time instant. This approach has been used in the Matlab program sde.m, which plots sample paths of the random processes given by the stochastic differential equations (1.37) and (1.38). For more info type help sde into the Matlab command line.
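The scheme (4.2) translates just as directly; in the Python sketch below the drift and diffusion of the final example are illustrative placeholders of ours, not equations (1.37) or (1.38) from the thesis:

```python
import random

def euler_maruyama(mu, sigma, x0, t):
    """Numerical solution of dX = mu(t, X) dt + sigma(t, X) dW by the
    recursive scheme (4.2) on the time grid t."""
    x = [x0]
    for i in range(1, len(t)):
        dt = t[i] - t[i - 1]
        noise = random.gauss(0.0, 1.0) * dt ** 0.5
        x.append(x[-1]
                 + mu(t[i - 1], x[-1]) * dt
                 + sigma(t[i - 1], x[-1]) * noise)
    return x

# illustrative mean-reverting example: dX = -X dt + 0.1 dW, X(0) = 1
t = [0.01 * i for i in range(501)]
path = euler_maruyama(lambda s, x: -x, lambda s, x: 0.1, 1.0, t)
```

With sigma set to zero the scheme reduces to the deterministic Euler method, which gives a quick way to check the implementation.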

    4.2 Maximum likelihood estimation

In this section we introduce some examples of maximum likelihood estimation applied to Monte-Carlo simulated data. We show the error behavior of the parameter estimates. The results are based on dissolved fraction data measured at a single time instant for the single-parameter model, resp. at two time instants for the two-parameter model.

    Example 1

In the first example we investigate the error behavior of the maximum likelihood estimate of the single parameter a of the stochastic homogenous model with mean


described by (1.7). The dissolved fraction data are Monte-Carlo simulated in Matlab with the function generate_homogenous_data(m,t,a,type), where m is the required number of observations at the time points given by the vector t, a is the scale parameter of model (1.7), and type determines the probability distribution of the simulated data. This parameter can take the value 'g' for the Gaussian distribution or 'l' for the log-normal distribution. Variance s^2(t) is given by (3.40) with parameters p = 0.02 and q = 0.0005, which were chosen so that the standard deviation s(t) is around 5% of the instantaneous concentration. The output of the function is the matrix [t,C], where the second column C contains simulated fractions of concentration data at the time instants given by the first column t of the matrix. For example:

>> data = generate_homogenous_data(3, [1.2], 1, 'g')

    data =

    1.2000 0.7567

    1.2000 0.7835

    1.2000 0.5912

>> data = generate_homogenous_data(1, [0.8 1.2 1.6], 1, 'l')

    data =

    0.8000 0.4523

    1.2000 0.7349

    1.6000 0.7722

From the simulated data we can obtain maximum likelihood estimates of the parameter a. The Matlab function sp_gmle(data, a0) uses the approach given by (3.29) and assumes a Gaussian distribution of the data. The function sp_lmle(data, a0) uses the approach given by (3.34) and assumes a log-normal distribution of the data. For the numeric solution of the given equations we use the score method (2.24), because it needs no differentiation of any of the functions. The matrix data=[t, C] is the output of the function generate_homogenous_data and the variable a0 is an initial approximation of the parameter a. This initial approximation can be obtained from the function sp_fit(data), which fits the sample average \bar{C} = \frac{1}{m} \sum_{i=1}^{m} C_i of the data simulated at a single time instant t with the homogenous dissolution profile (1.7),

    1 - \exp(-at) = \bar{C},

and returns an initial approximation a_0 obtained from the equation

    a_0 = -\frac{1}{t} \ln \left( 1 - \frac{1}{m} \sum_{i=1}^{m} C_i \right),


where C_i, i = 1, . . . , m, are the dissolved fraction data Monte-Carlo simulated at the time instant t.

Now let us select a = 1 in arbitrary time units. We investigate the sample variance of an estimate \hat{a} based on m = 4 dissolved fraction data simulated at a single time instant t_i, i = 1, . . . , 11, given by the vector

    t = (0.4, 0.6, 0.8, 1, 1.2, 1.4, 1.6, 1.8, 2, 2.2, 2.4)^T.   (4.3)

For example, at the time instant t_1 = 0.4 we obtain a random sample from the Gaussian distribution

>> data = generate_homogenous_data(4, 0.4, 1, 'g')

    data =

    0.4000 0.3900

    0.4000 0.3397

    0.4000 0.2875

    0.4000 0.2116

At first we need an initial approximation of the parameter. We use the approach described above:

>> a_0 = sp_fit(data)

    a_0 =

    0.9175

At each of the time instants given by (4.3) we can now calculate the estimate \hat{a} with the functions sp_gmle and sp_lmle. For example, for the data given above we obtain

>> a_hat = [sp_gmle(data, a_0), sp_lmle(data, a_0)]

a_hat =

    0.9174    0.9110

In the next step we calculate the sample variance var(\hat{a}) \approx (\hat{a} - a)^2. For example, from the data given above we obtain

>> var_a_hat = ([1, 1] - a_hat).^2

    var_a_hat =

    0.0068 0.0079

This estimation procedure is done for every t_i given by the vector t, and for each time instant we obtain a vector of two sample variances, whose components correspond to the specific estimation method. These vectors vary depending on the simulated data. To gain a more realistic picture, we take the average of these sample variances. The Matlab function sp_point_estimate(r,m,t_i,a,type) simulates r times m dissolved fraction data at the time instant t_i and returns the average vector of the sample variances of the estimate \hat{a}. They are plotted in Fig. 4.1.
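The averaging experiment is easy to mirror outside Matlab. The Python sketch below is a simplification of ours: it treats only the Gaussian case and replaces the full maximum likelihood iteration by the closed-form sp_fit-type estimate \hat{a} = -ln(1 - \bar{C})/t:

```python
import math
import random

def avg_sq_error(r, m, t, a, p=0.02, q=0.0005):
    """Simulate r batches of m Gaussian dissolved-fraction observations at
    time t (mean (1.7), variance (3.40)) and average the squared error of
    the moment-type estimate a_hat = -ln(1 - mean(C)) / t."""
    mean = 1.0 - math.exp(-a * t)
    sd = math.sqrt((p * math.exp(-a * t) + q) * (1.0 - math.exp(-a * t)))
    total = 0.0
    for _ in range(r):
        c_bar = sum(random.gauss(mean, sd) for _ in range(m)) / m
        c_bar = min(c_bar, 0.999)          # guard the logarithm's domain
        total += (-math.log(1.0 - c_bar) / t - a) ** 2
    return total / r

v = avg_sq_error(r=2000, m=4, t=1.2, a=1.0)
```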


Fig. 4.1: Average sample variances of r = 2000 estimates of the single parameter a of the homogenous model (1.7), based on m = 4 data generated at the single time instants t_i given by the vector t in (4.3). The average sample variances are marked with the symbol o and connected with lines. The dissolved fraction data have mean (1.7), a = 1, and variance (3.40), p = 0.02, q = 0.0005. In panel (A) the dissolved fraction data were simulated from the Gaussian distribution, in panel (B) from the log-normal distribution. Gaussian (black) and log-normal (red) maximum likelihood estimation has been used.

    Example 2

In the second example we investigate the error behavior of the maximum likelihood estimate of the parameter vector (a, b)^T of the stochastic Weibull model with mean described by (1.8). The dissolved fraction data are Monte-Carlo simulated in Matlab with the function generate_weibull_data(m,t,a,b,type), where m is the required number of observations at the tim