
Package ‘GA4Stratification’

February 15, 2013

Type Package

Title A genetic algorithm approach to determine stratum boundaries and sample sizes of each stratum in stratified sampling

Version 1.0

Date 2010-12-03

Author Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer Sebnem Er <[email protected]>

Description This is a Genetic Algorithm package for the determination of the stratum boundaries and sample sizes of each stratum in stratified sampling

License GPL-2

LazyLoad yes

LazyData yes

Repository CRAN

Date/Publication 2012-10-29 08:57:06

NeedsCompilation no

R topics documented:

GA4Stratification-package
beta10_3
chi1
chi10
chi15
chi5
GA4Stratification
GA4StratificationP1
GA4StratificationP1fit
GA4StratificationP1fitt
GA4StratificationP1m
GA4StratificationP1x
GA4StratificationP2
GA4StratificationP2fit
GA4StratificationP2fitt
GA4StratificationP2m
GA4StratificationP2x
GA4StratificationP3
GA4StratificationP3fit
GA4StratificationP3fitt
GA4StratificationP3m
GA4StratificationP3x
GA4StratificationP4
GA4StratificationP4fit
GA4StratificationP4fitt
GA4StratificationP4m
GA4StratificationP4x
GA4StratificationSelection
iso2004
normal100_10
P75
randomnumGenerator
Index

GA4Stratification-package

A genetic algorithm approach to determine stratum boundaries and sample sizes of each stratum in stratified sampling

Description

This package implements the Genetic Algorithm approach, developed by Keskinturk and Er in 2007, for determining the stratum boundaries and the sample sizes in each stratum in stratified sampling.

Details

Package: GA4Stratification
Type: Package
Version: 1.0
Date: 2010-12-03
License: GPL-2
LazyLoad: yes


Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

See Also

iso2004 P75 chi1 chi5 chi10 chi15 normal100_10 beta10_3

GA4StratificationP1 GA4StratificationP1fit GA4StratificationP1fitt GA4StratificationP1m GA4StratificationP1x

GA4StratificationP2 GA4StratificationP2fit GA4StratificationP2fitt GA4StratificationP2m GA4StratificationP2x

GA4StratificationP3 GA4StratificationP3fit GA4StratificationP3fitt GA4StratificationP3m GA4StratificationP3x

GA4StratificationP4 GA4StratificationP4fit GA4StratificationP4fitt GA4StratificationP4m GA4StratificationP4x

GA4StratificationSelection randomnumGenerator

Examples

data(chi1)
data(chi10)
data(chi15)
data(chi5)
data(iso2004)
data(normal100_10)
data(beta10_3)
data(P75)
GA4Stratification(chi1,2,80,10,35,0.15,"Equal")
GA4Stratification(chi10,3,81,10,35,0.15,"Proportional")
GA4Stratification(chi15,4,80,10,35,0.15,"Neyman")
GA4Stratification(chi5,5,80,10,35,0.15,"GA")
GA4Stratification(iso2004,6,84,10,35,0.15,"Equal")
GA4Stratification(normal100_10,2,80,10,35,0.15,"Proportional")
GA4Stratification(beta10_3,3,81,10,35,0.15,"Neyman")
GA4Stratification(P75,4,80,10,35,0.15,"GA")


beta10_3 A randomly generated data set

Description

beta10_3 is a randomly generated data set drawn from a beta distribution with parameters 10 and 3.

Usage

data(beta10_3)

Format

A data frame with 1000 observations on the following variable.

V1 a numeric vector
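A data set of the same shape can be simulated with base R. The following is only a sketch: the seed and the exact procedure used to generate beta10_3 are not stated in this manual, so the object name beta_like and the seed are assumptions and the values will differ from the packaged data.

```r
# Hypothetical re-creation of a data set shaped like beta10_3.
# The seed is arbitrary and only makes this sketch reproducible.
set.seed(1)
beta_like <- data.frame(V1 = rbeta(1000, shape1 = 10, shape2 = 3))
str(beta_like)
```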

References

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

Examples

data(beta10_3)

chi1 A randomly generated data set

Description

chi1 is a randomly generated data set drawn from a chi-square distribution with 1 degree of freedom.

Usage

data(chi1)

Format

A data frame with 1000 observations on the following variable.

V1 a numeric vector

References

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

Examples

data(chi1)


chi10 A randomly generated data set

Description

chi10 is a randomly generated data set drawn from a chi-square distribution with 10 degrees of freedom.

Usage

data(chi10)

Format

A data frame with 1000 observations on the following variable.

V1 a numeric vector

References

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

Examples

data(chi10)

chi15 A randomly generated data set

Description

chi15 is a randomly generated data set drawn from a chi-square distribution with 15 degrees of freedom.

Usage

data(chi15)

Format

A data frame with 1000 observations on the following variable.

V1 a numeric vector

References

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

Examples

data(chi15)


chi5 A randomly generated data set

Description

chi5 is a randomly generated data set drawn from a chi-square distribution with 5 degrees of freedom.

Usage

data(chi5)

Format

A data frame with 1000 observations on the following variable.

V1 a numeric vector

References

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

Examples

data(chi5)

GA4Stratification A genetic algorithm approach to determine stratum boundaries and sample sizes of each stratum in stratified sampling

Description

This function applies the Genetic Algorithm approach to determine the stratum boundaries and the sample sizes in each stratum in stratified sampling.

Usage

GA4Stratification(dataName, numberOfStrata, sampleSize, iteration, GAgenerationSize, mutationRate, sampleAllocation)


Arguments

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

iteration An integer: The number of iterations in the Genetic Algorithm process.

GAgenerationSize An integer: The generation size, that is, the number of chromosomes in each generation of the Genetic Algorithm process.

mutationRate A numeric: The mutation rate in the Genetic Algorithm process. The mutation rate must be between 0 and 1, inclusive. Small mutation rates are preferable in the Genetic Algorithm approach.

sampleAllocation A string: The sample allocation method, which must be one of "Equal", "Proportional", "Neyman" or "GA".
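The three classical allocation schemes named above can be sketched with their textbook formulas. The helper allocate_sample and its arguments are hypothetical and are not part of this package; the package may round or constrain the sample sizes differently, and the "GA" scheme, in which the sample sizes are themselves searched for by the Genetic Algorithm, is not sketched here.

```r
# Hypothetical helper, not a GA4Stratification function.
# N_h: stratum sizes, S_h: stratum standard deviations, n: total sample size.
allocate_sample <- function(N_h, S_h, n,
                            method = c("Equal", "Proportional", "Neyman")) {
  method <- match.arg(method)
  switch(method,
    Equal        = rep(n / length(N_h), length(N_h)),  # same size in every stratum
    Proportional = n * N_h / sum(N_h),                 # proportional to stratum size
    Neyman       = n * N_h * S_h / sum(N_h * S_h)      # proportional to size times spread
  )
}

N_h <- c(500, 300, 200)
S_h <- c(1, 2, 3.5)
allocate_sample(N_h, S_h, 90, "Equal")         # 30 30 30
allocate_sample(N_h, S_h, 90, "Proportional")  # 45 27 18
allocate_sample(N_h, S_h, 90, "Neyman")        # 25 30 35
```

In this illustration the Neyman scheme moves sample units toward the small but highly variable third stratum.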

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

Examples

data(iso2004)
GA4Stratification(iso2004,2,80,10,35,0.15,"Equal")

## The function is currently defined as
function(dataName,numberOfStrata,sampleSize,iteration,GAgenerationSize,mutationRate,sampleAllocation){

    if(sampleAllocation == "Equal"){
        GA4StratificationP1(dataName,numberOfStrata,sampleSize,iteration,GAgenerationSize,mutationRate)
    } else if(sampleAllocation == "Proportional") {
        GA4StratificationP2(dataName,numberOfStrata,sampleSize,iteration,GAgenerationSize,mutationRate)
    } else if(sampleAllocation == "Neyman") {
        GA4StratificationP3(dataName,numberOfStrata,sampleSize,iteration,GAgenerationSize,mutationRate)
    } else if (sampleAllocation == "GA") {
        GA4StratificationP4(dataName,numberOfStrata,sampleSize,iteration,GAgenerationSize,mutationRate)
    }

}

GA4StratificationP1 The genetic algorithm function to determine the stratum boundaries and sample sizes of each stratum in stratified sampling with Equal Sample Allocation Scheme

Description

This is the main Genetic Algorithm function. It generates an initial random generation and then repeatedly applies the fitness function, selection, crossover and mutation in order to obtain the best solution.
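Judging from the function body listed under Examples for this topic, each chromosome is a 0/1 vector as long as the sorted data, where a 1 marks the last element of a stratum and the final position is always 1. A minimal sketch of that encoding follows; the helpers encode_boundaries and stratum_sizes are hypothetical names used only for illustration.

```r
# Hypothetical helpers illustrating the chromosome encoding; not package functions.
encode_boundaries <- function(cut_positions, length_data) {
  chromosome <- numeric(length_data)
  # The last position is always 1: it closes the final stratum.
  chromosome[c(cut_positions, length_data)] <- 1
  chromosome
}

# Stratum sizes are the gaps between consecutive boundary positions.
stratum_sizes <- function(chromosome) {
  diff(c(0, which(chromosome == 1)))
}

x <- sort(c(2, 9, 4, 7, 1, 8, 3, 6, 5, 10))
chrom <- encode_boundaries(c(3, 7), length(x))
chrom                 # 0 0 1 0 0 0 1 0 0 1
stratum_sizes(chrom)  # 3 4 3

# Recover the strata themselves from the sorted data.
split(x, rep(seq_along(stratum_sizes(chrom)), stratum_sizes(chrom)))
```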

Usage

GA4StratificationP1(dataName, numberOfStrata, sampleSize, iteration, GAgenerationSize, mutationRate)

Arguments

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

iteration An integer: The number of iterations in the Genetic Algorithm process.

GAgenerationSize An integer: The generation size, that is, the number of chromosomes in each generation of the Genetic Algorithm process.

mutationRate A numeric: The mutation rate in the Genetic Algorithm process. The mutation rate must be between 0 and 1, inclusive. Small mutation rates are preferable in the Genetic Algorithm approach.

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>


References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP1fit GA4StratificationP1fitt GA4StratificationP1m GA4StratificationP1x

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(dataName,numberOfStrata,sampleSize,iteration,GAgenerationSize,mutationRate){

    # Sort the data and tabulate its distinct values
    dataName=data.frame(dataName)
    dataName=data.frame(dataName[order(dataName[,1]),])
    tableData=as.data.frame(table(dataName))
    # Cumulative frequencies of the distinct values
    randomnumRange=cumsum(tableData[,2])
    lengthRandomnum=length(randomnumRange)

    lengthData=nrow(dataName)
    randomGeneration=array(0,dim=c(GAgenerationSize,lengthData))

    # Work arrays passed on to the fitness and crossover functions
    nocrom=nrow(randomGeneration)
    fitp1=array(0,dim=c(1,nocrom))
    fit=array(0,dim=c(nocrom,1))
    N=means=s=n=vars=mas=NN=k=p=t=array(0,dim=c(nocrom,numberOfStrata))
    dd=array(0,dim=c(nocrom,1))

    randomNumbers=array(0,dim=c(GAgenerationSize,numberOfStrata-1))

    randomNumbersX=array(0,dim=c(nocrom,2))

    aftermut=array(0,dim=c(GAgenerationSize,lengthData+numberOfStrata+1))
    cumTotal=cumsum(dataName)
    sumSquares=cumsum(dataName^2)

    # Draw numberOfStrata-1 boundary positions for every chromosome
    for (i in 1:GAgenerationSize){
        randomNumbers[i,]=randomnumGenerator(randomnumRange,lengthRandomnum,numberOfStrata-1)
    }

    # The last position of every chromosome is always a boundary
    son=array(lengthData,dim=c(GAgenerationSize,1))
    indis=array(c(1:GAgenerationSize,randomNumbers,son),dim=c(GAgenerationSize,(numberOfStrata+1)))

    for(i in 2:(numberOfStrata+1)){
        randomGeneration[indis[,c(1,i)]]=1
    }

    bestValue=-99999999999999999999999999999999999999999999999999999999999999
    for ( i in 1:iteration ){

        fitnessValueGeneration=GA4StratificationP1fit(randomGeneration,dataName,numberOfStrata,
            sampleSize,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,
            N,means,s,n,vars,mas,NN,k,p,t)
        if ( max(fitnessValueGeneration)>bestValue ){
            bestValue=max(fitnessValueGeneration)
            bestGeneration=randomGeneration[max(which(fitnessValueGeneration==bestValue)),]
        }

        randomGeneration=GA4StratificationSelection(randomGeneration,fitnessValueGeneration)

        randomGeneration=GA4StratificationP1x(randomGeneration,bestGeneration,dataName,numberOfStrata,
            sampleSize,fitnessValueGeneration,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,
            N,means,s,n,vars,mas,NN,k,p,t,randomNumbersX,tableData,randomnumRange,lengthRandomnum)

        randomGeneration=GA4StratificationP1m(randomGeneration,mutationRate,i)

        fitnessValueGeneration=GA4StratificationP1fit(randomGeneration,dataName,numberOfStrata,
            sampleSize,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,
            N,means,s,n,vars,mas,NN,k,p,t)

        if ( max(fitnessValueGeneration)>bestValue ){
            bestValue=max(fitnessValueGeneration)
            bestGeneration=randomGeneration[max(which(fitnessValueGeneration==bestValue)),]
        } else {
            # No improvement: put the best chromosome found so far back into the generation
            randomGeneration[sample(GAgenerationSize,1),]=bestGeneration
        }

        cat(i, " ",-bestValue,'\n')
        flush.console()
        Sys.sleep(1)

    }
    GA4StratificationP1fitt(bestGeneration,dataName,numberOfStrata,sampleSize,-bestValue,cumTotal,sumSquares)
}

GA4StratificationP1fit

The genetic algorithm (GA) fitness function that calculates the variance of the estimate of each chromosome in the GA generation with an Equal Sample Allocation Scheme

Description

This is the fitness function in GA that calculates the variance of the estimate according to the boundaries obtained with GA and sample sizes obtained with Equal Allocation.
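The variance expression in the function body listed under Examples for this topic is equivalent to the sum over strata of (N_h/N)^2 (1 - n_h/N_h) S_h^2 / n_h, with n_h equal to the total sample size divided by the number of strata. The sketch below evaluates that expression directly for one set of stratum sizes. The helper equal_allocation_variance is hypothetical; it returns Inf for an infeasible solution, whereas the package function assigns a large penalty value instead.

```r
# Hypothetical helper mirroring the variance expression of the fitness function.
# x: numeric data, sizes: stratum sizes over the sorted data, n: total sample size.
equal_allocation_variance <- function(x, sizes, n) {
  stopifnot(sum(sizes) == length(x))
  x <- sort(x)
  stratum <- rep(seq_along(sizes), sizes)
  N_h  <- as.numeric(sizes)
  S2_h <- as.numeric(tapply(x, stratum, var))
  S2_h[N_h == 1] <- 0              # a single-element stratum has no spread
  n_h <- n / length(sizes)         # equal allocation
  if (any(n_h > N_h)) return(Inf)  # a stratum is smaller than its sample
  sum((N_h / length(x))^2 * (1 - n_h / N_h) * S2_h / n_h)
}

equal_allocation_variance(1:10, sizes = c(5, 5), n = 4)  # 0.375
equal_allocation_variance(1:10, sizes = c(1, 9), n = 4)  # Inf (first stratum too small)
```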


Usage

GA4StratificationP1fit(randomGeneration, dataName, numberOfStrata, sampleSize, cumTotal, sumSquares, c, dd, nocrom, fitp1, fit, N, means, s, n, vars, mas, NN, k, p, t)

Arguments

randomGeneration

The generation to which the fitness function will be applied and for which a fitness value will be calculated. This is initially a random generation and after each iteration it is the mutated, crossovered and selected generation.

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

cumTotal An array: The cumulative total of the data elements from i=1 to i=N

sumSquares An array: The cumulative total of the squares of the data elements from i=1 to i=N

c An integer: The length of the data.

dd An array (nocrom X 1): The minimum, over the strata, of the difference between the size of each stratum and the sample size to be drawn from that stratum.

nocrom An integer: The number of the chromosomes in the generation.

fitp1 An array (1 X nocrom): The fitness value for each chromosome in the generation.

fit An array (nocrom X 1): The fitness value for each chromosome in the generation.

N An array (nocrom X nofstrata): The number of the elements in each stratum for each chromosome in the generation.

means An array (nocrom X nofstrata): The mean of the elements in each stratum for each chromosome in the generation.

s An array (nocrom X nofstrata): The standard deviation of the elements in each stratum for each chromosome in the generation.

n An array (nocrom X nofstrata): The sample size to be drawn from each stratum for each chromosome in the generation.

vars An array (nocrom X nofstrata): The variance of the estimate in each stratum for each chromosome in the generation.

mas An array (nocrom X nofstrata): The index of each stratum boundary for each chromosome in the generation.

NN An array (nocrom X nofstrata): The cumulative sum of the number of the elements in each stratum.

k An array (nocrom X nofstrata): The difference between the number of the elements and the sample sizes in each stratum.

t An array (nocrom X nofstrata): The maximum of k.

p An array (nocrom X nofstrata): The index of the element where k is equal to t.
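The cumTotal and sumSquares arguments let the mean and standard deviation of any stratum be obtained from two differences of cumulative sums rather than by rescanning the data. A minimal sketch of this idea follows; the names prefix_total, prefix_squares and stratum_stats are illustrative assumptions, not package objects.

```r
# Sketch of the cumulative-sum shortcut; all names here are illustrative.
x <- sort(c(4, 8, 15, 16, 23, 42))
prefix_total   <- c(0, cumsum(x))    # prefix_total[i + 1] = sum of x[1..i]
prefix_squares <- c(0, cumsum(x^2))  # same for the squared values

# Mean and standard deviation of the stratum x[(a + 1):b].
stratum_stats <- function(a, b) {
  N_h  <- b - a
  tot  <- prefix_total[b + 1]   - prefix_total[a + 1]
  tot2 <- prefix_squares[b + 1] - prefix_squares[a + 1]
  m <- tot / N_h
  # max(0, .) guards against tiny negative values caused by rounding
  s <- if (N_h == 1) 0 else sqrt(N_h / (N_h - 1) * max(0, tot2 / N_h - m^2))
  c(mean = m, sd = s)
}

stratum_stats(2, 5)          # stratum made of x[3:5]
c(mean(x[3:5]), sd(x[3:5]))  # the same values computed directly
```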


Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP1 GA4StratificationP1fitt GA4StratificationP1m GA4StratificationP1x

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(randomGeneration,dataName,numberOfStrata,sampleSize,cumTotal,sumSquares,c,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t){

    for ( i in 1:nocrom ){
        mas[i,]=which(randomGeneration[i,]==1,arr.ind=TRUE)
        N[i,1]=min(mas[i,])
        means[i,1]=cumTotal[mas[i,1],]/N[i,1]

        if(N[i,1]==1){
            s[i,1]=0
        } else {
            s[i,1]=((N[i,1]/(N[i,1]-1))*(sumSquares[N[i,1]]/N[i,1]-means[i,1]^2))^.5
        }

        for ( j in 2:numberOfStrata ){
            N[i,j]=mas[i,j]-mas[i,(j-1)]
            means[i,j]=(cumTotal[mas[i,j],]-cumTotal[mas[i,j-1],])/N[i,j]

            if(N[i,j]==1){
                s[i,j]=0
            } else {
                s[i,j]=((N[i,j]/(N[i,j]-1))*((sumSquares[mas[i,j]]-sumSquares[mas[i,j-1]])/N[i,j]-means[i,j]^2))^.5
            }
        }

        for ( j in 1:numberOfStrata ){
            n[i,j]=sampleSize/numberOfStrata
        }

        for ( j in 1:numberOfStrata ){
            vars[i,j]=((N[i,j]-n[i,j])*s[i,j]^2*N[i,j]^2)/(c^2*n[i,j]*N[i,j])
        }

        kl=0

        fit[i,]=sum(vars[i,])
        dd[i,]=min(N[i,]-n[i,])

        if ( dd[i]<0 ){
            fit[i]= 999999999999999999999
        }

        p1fit=array(-fit,dim=c(nocrom,1))

    }
    return(p1fit)
}

GA4StratificationP1fitt

The genetic algorithm fitness function to calculate the variance of the estimate of each chromosome in the generation with an Equal Sample Allocation Scheme

Description

This is the fitness function in Genetic Algorithm that calculates the variance of the estimate according to the boundaries obtained with GA and sample sizes obtained with Equal Allocation for the final chromosome.

Usage

GA4StratificationP1fitt(bestGeneration, dataName, numberOfStrata, sampleSize, bestValue, cumTotal, sumSquares)

Arguments

bestGeneration The generation with the smallest variance of the estimate, that is, the best generation, which will be delivered to the next step.

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

bestValue A numeric: The best fitness value that is the minimum variance of the estimate for the best generation.

cumTotal An array: The cumulative total of the data elements from i=1 to i=N

sumSquares An array: The cumulative total of the squares of the data elements from i=1 to i=N


Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP1 GA4StratificationP1fit GA4StratificationP1m GA4StratificationP1x

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(bestGeneration,dataName,numberOfStrata,sampleSize,bestValue,cumTotal,sumSquares){

    c=nrow(dataName)
    nocrom=length(bestGeneration)/c
    bestGeneration=array(bestGeneration,dim=c(length(bestGeneration)/c,c))
    fitp1=array(0,dim=c(1,nocrom))
    fit=array(0,dim=c(nocrom,1))
    N=means=s=n=vars=mas=NN=k=p=t=array(0,dim=c(nocrom,numberOfStrata))

    dd=array(0,dim=c(nocrom,1))

    for ( i in 1:nocrom ){

        mas[i,]=which(bestGeneration[i,]==1,arr.ind=TRUE)
        N[i,1]=min(mas[i,])
        means[i,1]=cumTotal[mas[i,1],]/N[i,1]
        s[i,1]=((N[i,1]/(N[i,1]-1))*(sumSquares[N[i,1]]/N[i,1]-means[i,1]^2))^.5

        for ( j in 2:numberOfStrata ){

            N[i,j]=mas[i,j]-mas[i,(j-1)]
            N

            means[i,j]=(cumTotal[mas[i,j],]-cumTotal[mas[i,j-1],])/N[i,j]
            means

            s[i,j]=((N[i,j]/(N[i,j]-1))*((sumSquares[mas[i,j]]-sumSquares[mas[i,j-1]])/N[i,j]-means[i,j]^2))^.5
            s
        }

        for ( j in 1:numberOfStrata ){
            n[i,j]=(sampleSize/numberOfStrata)
        }

    }

    return(array(c(N,n,bestValue),dim=c(numberOfStrata,3)))
}

GA4StratificationP1m The mutation operation in genetic algorithm for the determination of stratum boundaries and sample sizes of each stratum in stratified sampling

Description

This is the mutation operation in the Genetic Algorithm approach for the determination of the stratum boundaries and sample sizes in each stratum in stratified sampling with an Equal Sample Allocation Scheme.
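In the function body listed under Examples for this topic, the mutation used in early iterations moves one stratum boundary to a randomly chosen non-boundary position and never moves the final boundary; in later iterations a boundary is shifted by one position instead. The sketch below illustrates only the first variant. The helper mutate_chromosome is a hypothetical name, and the guard for degenerate chromosomes is an addition of this sketch.

```r
# Hypothetical sketch of the boundary-moving mutation; not the package function.
mutate_chromosome <- function(chromosome) {
  ones  <- which(chromosome == 1)
  zeros <- which(chromosome == 0)
  # Guard added for this sketch: nothing to move, or nowhere to move it.
  if (length(ones) < 2 || length(zeros) < 1) return(chromosome)
  # The final 1 marks the end of the data and is never moved.
  from <- ones[sample.int(length(ones) - 1, 1)]
  to   <- zeros[sample.int(length(zeros), 1)]
  chromosome[from] <- 0
  chromosome[to]   <- 1
  chromosome
}

set.seed(42)  # arbitrary seed for reproducibility
parent <- c(0, 0, 1, 0, 0, 0, 1, 0, 0, 1)
child  <- mutate_chromosome(parent)
sum(child) == sum(parent)  # TRUE: the number of strata is preserved
child[length(child)] == 1  # TRUE: the last boundary stays in place
```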

Usage

GA4StratificationP1m(mutationGeneration, mutationRate, it)

Arguments

mutationGeneration The generation that will be mutated and transferred to the next generation.

mutationRate A numeric: The mutation rate in the Genetic Algorithm process. The mutation rate must be between 0 and 1, inclusive. Small mutation rates are preferable in the Genetic Algorithm approach.

it An integer: The current iteration number.

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac


See Also

GA4StratificationP1 GA4StratificationP1fit GA4StratificationP1fitt GA4StratificationP1x

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(mutationGeneration,mutationRate,it)
{
    rowOfMutationGeneration=nrow(mutationGeneration)
    for ( i in 1:rowOfMutationGeneration ){
        for ( k in 1:5 ){
            if ( runif(1,0,1) < mutationRate ){
                if ( runif(1,0,1) < mutationRate ){
                    if ( it<50 ){
                        ones=which(mutationGeneration[i,]==1,arr.ind=TRUE)
                        zeros=which(mutationGeneration[i,]==0,arr.ind=TRUE)
                        mutationPoint=ones[sample((length(ones)-1),1)]
                        mutationPoint1=zeros[sample(length(zeros),1)]
                        mutationGeneration[i,mutationPoint]=0
                        mutationGeneration[i,mutationPoint1]=1
                    } else {
                        ones=which(mutationGeneration[i,]==1,arr.ind=TRUE)
                        mutationPoint=ones[sample((length(ones)-1),1)]
                        if ( runif(1,0,1)<0.51 ){
                            if ( mutationGeneration[i,(mutationPoint+1)]==0 ){
                                mutationGeneration[i,mutationPoint]=0
                                mutationGeneration[i,(mutationPoint+1)]=1
                            }
                        } else if ( mutationPoint>1){
                            if ( mutationGeneration[i,(mutationPoint-1)]==0 ){
                                mutationGeneration[i,mutationPoint]=0;
                                mutationGeneration[i,mutationPoint-1]=1;
                            }
                        }
                    }
                }
            }
        }
    }
    return(mutationGeneration)
}

GA4StratificationP1x The crossover operation in genetic algorithm for the determination of stratum boundaries and sample sizes of each stratum in stratified sampling

Description

This is the crossover operation in the Genetic Algorithm approach for the determination of the stratum boundaries and sample sizes in each stratum in stratified sampling with an Equal Sample Allocation Scheme.
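In the function body listed under Examples for this topic, the crossover is a single-point crossover that only accepts a cut point at which both parents have the same number of boundaries to the left, so the child keeps the required number of strata. The sketch below illustrates that constraint. The helper crossover and its candidate_points argument are hypothetical, and the sketch filters the admissible cut points up front, whereas the package function redraws the cut point in a loop until the condition holds.

```r
# Hypothetical sketch of the constrained single-point crossover; not the package function.
# candidate_points must be smaller than the chromosome length.
crossover <- function(mother, father, candidate_points) {
  # A cut point is admissible when both parents hold equally many boundaries before it.
  ok <- vapply(candidate_points, function(cp) {
    sum(mother[1:cp]) == sum(father[1:cp])
  }, logical(1))
  valid <- candidate_points[ok]
  if (length(valid) == 0) return(mother)  # no admissible cut point
  cp <- valid[sample.int(length(valid), 1)]
  c(mother[1:cp], father[(cp + 1):length(father)])
}

mother <- c(0, 1, 0, 0, 0, 1, 0, 0, 0, 1)
father <- c(0, 0, 0, 1, 0, 0, 0, 1, 0, 1)
set.seed(7)  # arbitrary seed for reproducibility
child <- crossover(mother, father, candidate_points = 1:9)
sum(child)   # 3: the child has as many strata as its parents
```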

Usage

GA4StratificationP1x(crossoverGeneration,bestGeneration,dataName,numberOfStrata,sampleSize,fitnessValueGeneration,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t,randomNumbersX,tableData,randomnumRange,lengthRandomnum)

Arguments

crossoverGeneration The generation that will be crossovered and transferred to the next generation.

bestGeneration The generation that has the best fitness value after the selection process.

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

fitnessValueGeneration The fitness value (the variance of the estimate) of the best generation after selection.

cumTotal An array: The cumulative total of the data elements from i=1 to i=N

sumSquares An array: The cumulative total of the squares of the data elements from i=1 to i=N

lengthData An integer: The size of the data.

dd An array (nocrom X 1): The minimum, over the strata, of the difference between the size of each stratum and the sample size to be drawn from that stratum.

nocrom An integer: The number of the chromosomes in the generation.

fitp1 An array (1 X nocrom): The fitness value for each chromosome in the generation.

fit An array (nocrom X 1): The fitness value for each chromosome in the generation.

N An array (nocrom X nofstrata): The number of the elements in each stratum for each chromosome in the generation.

means An array (nocrom X nofstrata): The mean of the elements in each stratum for each chromosome in the generation.

s An array (nocrom X nofstrata): The standard deviation of the elements in each stratum for each chromosome in the generation.

n An array (nocrom X nofstrata): The sample size to be drawn from each stratum for each chromosome in the generation.

vars An array (nocrom X nofstrata): The variance of the estimate in each stratum for each chromosome in the generation.

mas An array (nocrom X nofstrata): The index of each stratum boundary for each chromosome in the generation.

NN An array (nocrom X nofstrata): The cumulative sum of the number of the elements in each stratum.

k An array (nocrom X nofstrata): The difference between the number of the elements and the sample sizes in each stratum.

t An array (nocrom X nofstrata): The maximum of k.

p An array (nocrom X nofstrata): The index of the element where k is equal to t.

randomNumbersX An integer: The number of random numbers.

tableData The frequency table of the data.

randomnumRange An array: The range of the data where different random numbers will be chosen.

lengthRandomnum An integer: The number of random numbers that are different from each other.

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP1 GA4StratificationP1fit GA4StratificationP1fitt GA4StratificationP1m

GA4StratificationSelection randomnumGenerator


Examples

## The function is currently defined as
function(crossoverGeneration,bestGeneration,dataName,numberOfStrata,sampleSize,
    fitnessValueGeneration,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,
    N,means,s,n,vars,mas,NN,k,p,t,randomNumbersX,tableData,randomnumRange,lengthRandomnum){

    fitnessValueParents=fitnessValueGeneration
    parents=cbind(crossoverGeneration,fitnessValueParents)
    crossoverGenerationp=crossoverGeneration
    rowCrossoverGenerationp=nocrom

    # Draw a pair of parents (mother, father) for every child
    for (i in 1:rowCrossoverGenerationp){
        randomNumbersX[i,]=randomnumGenerator((1:rowCrossoverGenerationp),(rowCrossoverGenerationp+1),2)
    }

    mother=father=NULL

    for (i in 1:rowCrossoverGenerationp){
        mother=randomNumbersX[i,1]
        father=randomNumbersX[i,2]

        # Redraw the cut point until both parents have the same number of
        # boundaries to its left
        crossoverPoint=sample((randomnumRange[1:lengthRandomnum-1]),1)
        while ( sum(crossoverGenerationp[mother,c(1:crossoverPoint)])!=
                sum(crossoverGenerationp[father,c(1:crossoverPoint)]) ){
            crossoverPoint=sample((randomnumRange[1:lengthRandomnum-1]),1)
        }
        crossoverGeneration[i,c(1:crossoverPoint)]=crossoverGenerationp[mother,c(1:crossoverPoint)]
        crossoverGeneration[i,c((crossoverPoint+1):lengthData)]=crossoverGenerationp[father,c((crossoverPoint+1):lengthData)]

    }
    s=GA4StratificationP1fit(crossoverGeneration,dataName,numberOfStrata,sampleSize,
        cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,
        N,means,s,n,vars,mas,NN,k,p,t)
    crossoverGenerationx=cbind(crossoverGeneration,s)
    # Pool parents and children, sort by fitness and keep the better half
    GA4StratificationP1x=rbind(parents, crossoverGenerationx)
    GA4StratificationP1x=GA4StratificationP1x[order(GA4StratificationP1x[,(lengthData+1)]),]
    GA4StratificationP1x=GA4StratificationP1x[c((rowCrossoverGenerationp+1):(rowCrossoverGenerationp*2)),c(1:lengthData)]

    return(GA4StratificationP1x)
}

GA4StratificationP2 The genetic algorithm function to determine the stratum boundaries and sample sizes of each stratum in stratified sampling with the Proportional Sample Allocation Scheme

Description

This is the general Genetic Algorithm function. It first generates a random generation and then repeatedly applies the fitness function, selection, crossover and mutation in order to obtain the best solution.
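The chromosome encoding behind the random generation can be sketched as follows. This is a minimal illustration in base R, not the package's code: `random_chromosome` is a hypothetical helper, and unlike the package it ignores tied data values when picking cut points. It assumes a chromosome is a 0/1 vector over the sorted data in which a 1 marks the last element of a stratum and the final position is always 1.

```r
# Hypothetical helper (not part of GA4Stratification): build one random
# chromosome for a data vector of length(x).
# A 1 at position i means "a stratum ends at the i-th smallest value".
random_chromosome <- function(x, number_of_strata) {
  len <- length(x)
  # choose number_of_strata - 1 distinct interior cut points
  cuts <- sort(sample.int(len - 1, number_of_strata - 1))
  chrom <- integer(len)
  chrom[c(cuts, len)] <- 1L   # the last position always closes the final stratum
  chrom
}

set.seed(1)
x <- sort(c(2, 3, 5, 8, 13, 21, 34, 55))
chrom <- random_chromosome(x, 3)
sizes <- diff(c(0, which(chrom == 1)))  # stratum sizes N_h implied by the chromosome
```

The stratum sizes recovered with `diff()` always sum to the data length, which is what lets the fitness functions work from boundary positions alone.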

Usage

GA4StratificationP2(dataName, numberOfStrata, sampleSize, iteration, GAgenerationSize, mutationRate)

Arguments

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

iteration An integer: The number of iterations in the Genetic Algorithm process.

GAgenerationSize An integer: The size of the generation, that is, the number of chromosomes in each generation of the Genetic Algorithm process.

mutationRate A numeric: The mutation rate in the Genetic Algorithm process. The mutation rate must be between 0 and 1, inclusive. Small mutation rates are preferable in the Genetic Algorithm approach.

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP2fit GA4StratificationP2fitt GA4StratificationP2m GA4StratificationP2x

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(dataName,numberOfStrata,sampleSize,iteration,GAgenerationSize,mutationRate){
  dataName=data.frame(dataName)
  dataName=data.frame(dataName[order(dataName[,1]),])
  lengthData=nrow(dataName)
  randomGeneration=array(0,dim=c(GAgenerationSize,lengthData))
  randomNumbers=array(0,dim=c(GAgenerationSize,numberOfStrata-1))
  cumTotal=cumsum(dataName)
  sumSquares=cumsum(dataName^2)
  nocrom=GAgenerationSize
  fitp1=array(0,dim=c(1,nocrom))
  fit=array(0,dim=c(nocrom,1))
  N=means=s=n=vars=mas=NN=k=p=t=array(0,dim=c(nocrom,numberOfStrata))
  dd=array(0,dim=c(nocrom,1))
  tableData=as.data.frame(table(dataName))
  randomnumRange=cumsum(tableData[,2])
  lengthRandomnum=length(randomnumRange)
  for (i in 1:GAgenerationSize){
    randomNumbers[i,]=randomnumGenerator(randomnumRange,lengthRandomnum,numberOfStrata-1)
  }
  son=array(lengthData,dim=c(GAgenerationSize,1))
  indis=array(c(1:GAgenerationSize,randomNumbers,son),dim=c(GAgenerationSize,(numberOfStrata+1)))
  for(i in 2:(numberOfStrata+1)){
    randomGeneration[indis[,c(1,i)]]=1
  }
  bestValue=-99999999999999999999999999999999999999999999999999999999999999
  for ( i in 1:iteration ){
    fitnessValueGeneration=GA4StratificationP2fit(randomGeneration,dataName,numberOfStrata,sampleSize,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)
    if ( max(fitnessValueGeneration)>bestValue ){
      bestValue=max(fitnessValueGeneration)
      bestGeneration=randomGeneration[max(which(fitnessValueGeneration==bestValue)),]
    }
    randomGeneration=GA4StratificationSelection(randomGeneration,fitnessValueGeneration)
    randomGeneration=GA4StratificationP2x(randomGeneration,bestGeneration,dataName,numberOfStrata,sampleSize,fitnessValueGeneration,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)
    randomGeneration=GA4StratificationP2m(randomGeneration,mutationRate,i)
    fitnessValueGeneration=GA4StratificationP2fit(randomGeneration,dataName,numberOfStrata,sampleSize,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)
    if ( max(fitnessValueGeneration)>bestValue ){
      bestValue=max(fitnessValueGeneration)
      bestGeneration=randomGeneration[max(which(fitnessValueGeneration==bestValue)),]
    } else{
      randomGeneration[sample(GAgenerationSize,1),]=bestGeneration
    }
    cat(i, " ",-bestValue,'\n')
    flush.console()
    Sys.sleep(1)
  }
  GA4StratificationP2fitt(bestGeneration,dataName,numberOfStrata,sampleSize,-bestValue,cumTotal,sumSquares)
}

GA4StratificationP2fit

The genetic algorithm (GA) fitness function that calculates the variance of the estimate of each chromosome in the GA generation with a Proportional Sample Allocation Scheme

Description

This is the fitness function in GA that calculates the variance of the estimate according to the boundaries obtained with GA and the sample sizes obtained with Proportional Allocation.
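As an illustration of what this fitness value measures, the sketch below computes the variance of the estimated mean for one chromosome under proportional allocation, using the same per-stratum expression as the function body listed under Examples. It is a simplified base R sketch, not the package's implementation: `prop_alloc_variance` is a hypothetical name, it works on a plain numeric vector, and it omits the step that tops up the allocation when flooring leaves the total below the requested sample size.

```r
# Hypothetical sketch: variance of the stratified mean under proportional allocation.
# chrom is a 0/1 vector over the sorted data; a 1 marks the end of a stratum.
prop_alloc_variance <- function(x, chrom, sample_size) {
  x <- sort(x)
  ends <- which(chrom == 1)
  N_h <- diff(c(0, ends))                       # stratum sizes
  stratum <- rep(seq_along(N_h), times = N_h)   # stratum label of each element
  s_h <- as.numeric(tapply(x, stratum, sd))     # stratum standard deviations
  s_h[is.na(s_h)] <- 0                          # single-element strata have sd 0
  # proportional allocation: at least 1, at most the stratum size
  n_h <- pmin(pmax(1, floor(sample_size * N_h / sum(N_h))), N_h)
  # sum over strata of W_h^2 * s_h^2 / n_h * (1 - n_h / N_h)
  sum((N_h / sum(N_h))^2 * s_h^2 / n_h * (1 - n_h / N_h))
}

chrom <- c(0, 0, 0, 0, 1, 0, 0, 0, 0, 1)   # two strata: 1..5 and 6..10
v <- prop_alloc_variance(1:10, chrom, sample_size = 4)
v  # 0.375
```

The package returns the negative of this quantity so that the GA can maximise fitness while minimising variance.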

Usage

GA4StratificationP2fit(randomGeneration, dataName, numberOfStrata, sampleSize, cumTotal, sumSquares, c, dd, nocrom, fitp1, fit, N, means, s, n, vars, mas, NN, k, p, t)

Arguments

randomGeneration The generation to which the fitness function is applied and for which a fitness value is calculated. This is initially a random generation; after each iteration it is the mutated, crossovered and selected generation.

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

cumTotal An array: The cumulative total of the data elements from i=1 to i=N.

sumSquares An array: The cumulative total of the squares of the data elements from i=1 to i=N.

c An integer: The length of the data.

dd An array (nocrom X 1): The minimum of the difference between the size of each stratum and the sample size to be drawn from that stratum.

nocrom An integer: The number of chromosomes in the generation.

fitp1 An array (1 X nocrom): The fitness value for each chromosome in the generation.

fit An array (nocrom X 1): The fitness value for each chromosome in the generation.

N An array (nocrom X nofstrata): The number of elements in each stratum for each chromosome in the generation.

means An array (nocrom X nofstrata): The mean of the elements in each stratum for each chromosome in the generation.

s An array (nocrom X nofstrata): The standard deviation of the elements in each stratum for each chromosome in the generation.

n An array (nocrom X nofstrata): The sample size to be drawn from each stratum for each chromosome in the generation.

vars An array (nocrom X nofstrata): The variance of the estimate in each stratum for each chromosome in the generation.

mas An array (nocrom X nofstrata): The index of each stratum boundary for each chromosome in the generation.

NN An array (nocrom X nofstrata): The cumulative sum of the number of elements in each stratum.

k An array (nocrom X nofstrata): The difference between the number of elements and the sample size in each stratum.

t An array (nocrom X nofstrata): The maximum of k.

p An array (nocrom X nofstrata): The index of the element where k is equal to t.

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP2 GA4StratificationP2fitt GA4StratificationP2m GA4StratificationP2x

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(randomGeneration,dataName,numberOfStrata,sampleSize,cumTotal,sumSquares,c,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t){
  for ( i in 1:nocrom ){
    mas[i,]=which(randomGeneration[i,]==1,arr.ind=TRUE)
    N[i,1]=min(mas[i,])
    means[i,1]=cumTotal[mas[i,1],]/N[i,1]
    if(N[i,1]==1){
      s[i,1]=0
    } else{
      s[i,1]=((N[i,1]/(N[i,1]-1))*(sumSquares[N[i,1]]/N[i,1]-means[i,1]^2))^.5
    }
    for ( j in 2:numberOfStrata ){
      N[i,j]=mas[i,j]-mas[i,(j-1)]
      means[i,j]=(cumTotal[mas[i,j],]-cumTotal[mas[i,j-1],])/N[i,j]
      if(N[i,j]==1){
        s[i,j]=0
      } else{
        s[i,j]=((N[i,j]/(N[i,j]-1))*((sumSquares[mas[i,j]]-sumSquares[mas[i,j-1]])/N[i,j]-means[i,j]^2))^.5
      }
    }
    for ( j in 1:numberOfStrata ){
      n[i,j]=max(1,floor(sampleSize*N[i,j]/sum(N[i,])))
      n[i,j]=min(n[i,j],N[i,j])
    }
    if ( sampleSize-sum(n[i,])>0 ){
      k[i,]=N[i,]-n[i,]
      t[i,]=max(k[i,])
      p[i,]=which(k[i,]==t[i,],arr.ind=TRUE)[1]
      n[i,p[i,1]]=min(n[i,p[i,1]]+sampleSize-sum(n[i,]),N[i,p[i,1]])
    }
    for ( j in 1:numberOfStrata ){
      vars[i,j]=((N[i,j]-n[i,j])*s[i,j]^2*N[i,j]^2)/(c^2*n[i,j]*N[i,j])
    }
    dd[i,]=min((N[i,]-n[i,]))
    NN[i,]=cumsum(N[i,])
    kl=0
    fit[i,]=sum(vars[i,])
    if ( dd[i]<0 ){
      fit[i]= 9999999999999999
    } else if (!all(N[i,]!=1)){
      fit[i]= 999999999999999999
    } else if (!all(N[i,]!=0)){
      fit[i]= 999999999999999999
    } else{
      fit[i]=fit[i]
    }
    for ( j in 1:(numberOfStrata-1) ){
      kl=kl+dataName[(NN[i,j]+1),]-dataName[NN[i,j],]
    }
    p2fit=array(-fit,dim=c(nocrom,1))
  }
  return(p2fit)
}

GA4StratificationP2fitt

The genetic algorithm fitness function to calculate the variance of the estimate of each chromosome in the generation with a Proportional Sample Allocation Scheme

Description

This is the fitness function in Genetic Algorithm that calculates the variance of the estimate according to the boundaries obtained with GA and the sample sizes obtained with Proportional Allocation for the final chromosome.
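The sample sizes reported for the final chromosome come from a floor-based proportional allocation followed by a top-up step. The sketch below mirrors that logic as it appears in the function body under Examples; `allocate_proportional` is a hypothetical name, and the vectorised form is an assumption made for readability rather than the package's own code.

```r
# Hypothetical sketch: proportional allocation with a top-up of any shortfall.
allocate_proportional <- function(N_h, sample_size) {
  # floor-based proportional share, at least 1 and at most the stratum size
  n_h <- pmin(pmax(1, floor(sample_size * N_h / sum(N_h))), N_h)
  shortfall <- sample_size - sum(n_h)
  if (shortfall > 0) {
    slack <- N_h - n_h                    # unsampled elements per stratum
    h <- which(slack == max(slack))[1]    # first stratum with the most slack
    n_h[h] <- min(n_h[h] + shortfall, N_h[h])
  }
  n_h
}

n_h <- allocate_proportional(c(50, 30, 20), sample_size = 25)
n_h  # 13 7 5
```

Here flooring gives 12, 7 and 5 (total 24), so the one remaining unit goes to the first stratum, which has the largest gap between its size and its allocation.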

Usage

GA4StratificationP2fitt(bestGeneration, dataName, numberOfStrata, sampleSize, bestValue, cumTotal, sumSquares)

Arguments

bestGeneration The generation that has the smallest fitness value; it is the best generation and is delivered to the next step.

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

bestValue A numeric: The best fitness value, that is, the minimum variance of the estimate for the best generation.

cumTotal An array: The cumulative total of the data elements from i=1 to i=N.

sumSquares An array: The cumulative total of the squares of the data elements from i=1 to i=N.

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP2 GA4StratificationP2fit GA4StratificationP2m GA4StratificationP2x

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(bestGeneration,dataName,numberOfStrata,sampleSize,bestValue,cumTotal,sumSquares){
  c=nrow(dataName)
  nocrom=length(bestGeneration)/c
  bestGeneration=array(bestGeneration,dim=c(length(bestGeneration)/c,c))
  fitp1=array(0,dim=c(1,nocrom))
  fit=array(0,dim=c(nocrom,1))
  N=means=s=n=vars=mas=NN=k=p=t=array(0,dim=c(nocrom,numberOfStrata))
  dd=array(0,dim=c(nocrom,1))
  for ( i in 1:nocrom ){
    mas[i,]=which(bestGeneration[i,]==1,arr.ind=TRUE)
    N[i,1]=min(mas[i,])
    means[i,1]=cumTotal[mas[i,1],]/N[i,1]
    s[i,1]=((N[i,1]/(N[i,1]-1))*(sumSquares[N[i,1]]/N[i,1]-means[i,1]^2))^.5
    for ( j in 2:numberOfStrata ){
      N[i,j]=mas[i,j]-mas[i,(j-1)]
      N
      means[i,j]=(cumTotal[mas[i,j],]-cumTotal[mas[i,j-1],])/N[i,j]
      means
      s[i,j]=((N[i,j]/(N[i,j]-1))*((sumSquares[mas[i,j]]-sumSquares[mas[i,j-1]])/N[i,j]-means[i,j]^2))^.5
      s
    }
    for ( j in 1:numberOfStrata ){
      n[i,j]=max(1,floor(sampleSize*N[i,j]/sum(N[i,])))
      n[i,j]=min(n[i,j],N[i,j])
    }
    if ( sampleSize-sum(n[i,])>0 ){
      k[i,]=N[i,]-n[i,]
      t[i,]=max(k[i,])
      p[i,]=which(k[i,]==t[i,],arr.ind=TRUE)
      n[i,p[i,1]]=min(n[i,p[i,1]]+sampleSize-sum(n[i,]),N[i,p[i,1]])
    }
  }
  return(array(c(N,n,bestValue),dim=c(numberOfStrata,3)))
}

GA4StratificationP2m The mutation operation in genetic algorithm for the determination of stratum boundaries and sample sizes of each stratum in stratified sampling

Description

This is the mutation operation in the Genetic Algorithm approach for the determination of the stratum boundaries and sample sizes in each stratum in stratified sampling with a Proportional Sample Allocation Scheme.
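A boundary-moving mutation of this kind can be sketched in a few lines of base R. The example is illustrative only: `mutate_chromosome` is a hypothetical helper that performs a single unconditional move, whereas the function body under Examples applies the move with a probability tied to `mutationRate` and, once `it` reaches 50, shifts a boundary to an adjacent position instead of an arbitrary one.

```r
# Hypothetical sketch: move one interior stratum boundary to a random free position.
# The last position (closing the final stratum) is never moved.
mutate_chromosome <- function(chrom) {
  ones <- which(chrom == 1)
  zeros <- which(chrom == 0)
  if (length(ones) < 2 || length(zeros) == 0) return(chrom)  # nothing to move
  interior <- ones[-length(ones)]                 # every boundary except the last
  from <- interior[sample.int(length(interior), 1)]
  to <- zeros[sample.int(length(zeros), 1)]
  chrom[from] <- 0
  chrom[to] <- 1
  chrom
}

set.seed(7)
chrom <- c(0, 1, 0, 0, 1, 0, 0, 1)
mutated <- mutate_chromosome(chrom)
```

Because one 1 is cleared and one 0 is set, the number of strata is unchanged by the mutation.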

Usage

GA4StratificationP2m(mutationGeneration, mutationRate, it)

Arguments

mutationGeneration The generation that will be mutated and transferred to the next generation.

mutationRate A numeric: The mutation rate in the Genetic Algorithm process. The mutation rate must be between 0 and 1, inclusive. Small mutation rates are preferable in the Genetic Algorithm approach.

it An integer: The number of the current iteration.

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP2 GA4StratificationP2fit GA4StratificationP2fitt GA4StratificationP2x

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(mutationGeneration,mutationRate,it)
{
  rowOfMutationGeneration=nrow(mutationGeneration)
  for ( i in 1:rowOfMutationGeneration ){
    for ( k in 1:5 ){
      if ( runif(1,0,1) < mutationRate ){
        if ( runif(1,0,1) < mutationRate ){
          if ( it<50 ){
            ones=which(mutationGeneration[i,]==1,arr.ind=TRUE)
            zeros=which(mutationGeneration[i,]==0,arr.ind=TRUE)
            mutationPoint=ones[sample((length(ones)-1),1)]
            mutationPoint1=zeros[sample(length(zeros),1)]
            mutationGeneration[i,mutationPoint]=0
            mutationGeneration[i,mutationPoint1]=1
          } else{
            ones=which(mutationGeneration[i,]==1,arr.ind=TRUE)
            mutationPoint=ones[sample((length(ones)-1),1)]
            if ( runif(1,0,1)<0.51 ){
              if ( mutationGeneration[i,(mutationPoint+1)]==0 ){
                mutationGeneration[i,mutationPoint]=0
                mutationGeneration[i,(mutationPoint+1)]=1
              }
            } else if ( mutationPoint>1){
              if ( mutationGeneration[i,(mutationPoint-1)]==0 ){
                mutationGeneration[i,mutationPoint]=0;
                mutationGeneration[i,mutationPoint-1]=1;
              }
            }
          }
        }
      }
    }
  }
  return(mutationGeneration)
}

GA4StratificationP2x The crossover operation in genetic algorithm for the determination of stratum boundaries and sample sizes of each stratum in stratified sampling

Description

This is the crossover operation in the Genetic Algorithm approach for the determination of the stratum boundaries and sample sizes in each stratum in stratified sampling with a Proportional Sample Allocation Scheme.
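The key constraint in this crossover is that the child must keep the same number of strata as its parents, so the cut is only accepted where both parents have the same number of boundaries to the left of it. The base R sketch below illustrates the idea; `crossover` is a hypothetical helper, and it filters the admissible cut points up front rather than resampling in a `while` loop as the function body under Examples does.

```r
# Hypothetical sketch: single-point crossover that preserves the number of boundaries.
crossover <- function(mother, father,
                      candidates = seq_len(length(mother) - 1)) {
  # a cut point is valid when both parents have equally many 1s up to that point
  valid <- candidates[cumsum(mother)[candidates] == cumsum(father)[candidates]]
  if (length(valid) == 0) return(mother)        # no admissible cut: keep the mother
  point <- valid[sample.int(length(valid), 1)]
  # head from the mother, tail from the father
  c(mother[seq_len(point)], father[(point + 1):length(father)])
}

set.seed(3)
mother <- c(1, 0, 0, 1, 0, 0, 0, 1)
father <- c(0, 0, 1, 0, 0, 1, 0, 1)
child <- crossover(mother, father)
```

In the package the candidate cut points are drawn from `randomnumRange`, so cuts fall only between distinct data values; the default `candidates` used here is a simplifying assumption.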

Usage

GA4StratificationP2x(crossoverGeneration,bestGeneration,dataName,numberOfStrata,sampleSize,fitnessValueGeneration,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)

Arguments

crossoverGeneration The generation that will be crossovered and transferred to the next generation.

bestGeneration The generation that has the best fitness value after the selection process.

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

fitnessValueGeneration The fitness value (the variance of the estimate) of the best generation after selection.

cumTotal An array: The cumulative total of the data elements from i=1 to i=N.

sumSquares An array: The cumulative total of the squares of the data elements from i=1 to i=N.

lengthData An integer: The size of the data.

dd An array (nocrom X 1): The minimum of the difference between the size of each stratum and the sample size to be drawn from that stratum.

nocrom An integer: The number of chromosomes in the generation.

fitp1 An array (1 X nocrom): The fitness value for each chromosome in the generation.

fit An array (nocrom X 1): The fitness value for each chromosome in the generation.

N An array (nocrom X nofstrata): The number of elements in each stratum for each chromosome in the generation.

means An array (nocrom X nofstrata): The mean of the elements in each stratum for each chromosome in the generation.

s An array (nocrom X nofstrata): The standard deviation of the elements in each stratum for each chromosome in the generation.

n An array (nocrom X nofstrata): The sample size to be drawn from each stratum for each chromosome in the generation.

vars An array (nocrom X nofstrata): The variance of the estimate in each stratum for each chromosome in the generation.

mas An array (nocrom X nofstrata): The index of each stratum boundary for each chromosome in the generation.

NN An array (nocrom X nofstrata): The cumulative sum of the number of elements in each stratum.

k An array (nocrom X nofstrata): The difference between the number of elements and the sample size in each stratum.

t An array (nocrom X nofstrata): The maximum of k.

p An array (nocrom X nofstrata): The index of the element where k is equal to t.

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP2 GA4StratificationP2fit GA4StratificationP2fitt GA4StratificationP2m

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(crossoverGeneration,bestGeneration,dataName,numberOfStrata,sampleSize,fitnessValueGeneration,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t){
  fitnessValueParents=fitnessValueGeneration
  parents=cbind(crossoverGeneration,fitnessValueParents)
  crossoverGenerationp=crossoverGeneration
  rowCrossoverGenerationp=nocrom
  tableData=as.data.frame(table(dataName))
  randomnumRange=cumsum(tableData[,2])
  lengthRandomnum=length(randomnumRange)
  randomNumbers=array(0,dim=c(rowCrossoverGenerationp,2))
  for (i in 1:rowCrossoverGenerationp){
    randomNumbers[i,]=randomnumGenerator((1:rowCrossoverGenerationp),(rowCrossoverGenerationp+1),2)
  }
  mother=father=NULL
  for (i in 1:rowCrossoverGenerationp){
    mother=randomNumbers[i,1]
    father=randomNumbers[i,2]
    crossoverPoint=sample((randomnumRange[1:lengthRandomnum-1]),1)
    while ( sum(crossoverGenerationp[mother,c(1:crossoverPoint)])!=sum(crossoverGenerationp[father,c(1:crossoverPoint)]) ){
      crossoverPoint=sample((randomnumRange[1:lengthRandomnum-1]),1)
    }
    crossoverGeneration[i,c(1:crossoverPoint)]=crossoverGenerationp[mother,c(1:crossoverPoint)]
    crossoverGeneration[i,c((crossoverPoint+1):lengthData)]=crossoverGenerationp[father,c((crossoverPoint+1):lengthData)]
  }
  s=GA4StratificationP2fit(crossoverGeneration,dataName,numberOfStrata,sampleSize,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)
  crossoverGenerationx=cbind(crossoverGeneration,s)
  GA4StratificationP2x=rbind(parents, crossoverGenerationx)
  GA4StratificationP2x=GA4StratificationP2x[order(GA4StratificationP2x[,(lengthData+1)]),]
  GA4StratificationP2x=GA4StratificationP2x[c((rowCrossoverGenerationp+1):(rowCrossoverGenerationp*2)),c(1:lengthData)]
  return(GA4StratificationP2x)
}

GA4StratificationP3 The genetic algorithm function to determine the stratum boundaries and sample sizes of each stratum in stratified sampling with the Neyman Sample Allocation Scheme

Description

This is the general Genetic Algorithm function. It first generates a random generation and then repeatedly applies the fitness function, selection, crossover and mutation in order to obtain the best solution.
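The control flow of the main loop, in particular how the best chromosome is remembered and reinjected when an iteration brings no improvement, can be sketched with toy stand-ins. Everything named here is hypothetical: `toy_fitness` and `toy_vary` replace the package's fitness, selection, crossover and mutation functions, and the population is a plain numeric matrix rather than binary chromosomes.

```r
# Toy stand-ins (assumptions for illustration, not package functions).
toy_fitness <- function(pop) -rowSums((pop - 0.5)^2)   # larger is better
toy_vary <- function(pop) {
  # stands in for selection, crossover and mutation
  pop + matrix(rnorm(length(pop), sd = 0.1), nrow(pop))
}

run_ga <- function(pop, iterations) {
  best_value <- -Inf
  best <- NULL
  for (i in seq_len(iterations)) {
    fit <- toy_fitness(pop)
    if (max(fit) > best_value) {          # remember the best chromosome so far
      best_value <- max(fit)
      best <- pop[which.max(fit), ]
    }
    pop <- toy_vary(pop)
    fit <- toy_fitness(pop)
    if (max(fit) > best_value) {
      best_value <- max(fit)
      best <- pop[which.max(fit), ]
    } else {
      # no improvement: overwrite a random chromosome with the best one
      pop[sample.int(nrow(pop), 1), ] <- best
    }
  }
  list(best = best, value = best_value)
}

set.seed(42)
pop <- matrix(runif(20), nrow = 5)
res <- run_ga(pop, iterations = 20)
```

Reinjecting the best chromosome means the best-so-far value can never get worse from one iteration to the next, which is why the progress printed by the package is monotone.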

Usage

GA4StratificationP3(dataName, numberOfStrata, sampleSize, iteration, GAgenerationSize, mutationRate)

Arguments

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

iteration An integer: The number of iterations in the Genetic Algorithm process.

GAgenerationSize An integer: The size of the generation, that is, the number of chromosomes in each generation of the Genetic Algorithm process.

mutationRate A numeric: The mutation rate in the Genetic Algorithm process. The mutation rate must be between 0 and 1, inclusive. Small mutation rates are preferable in the Genetic Algorithm approach.

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP3fit GA4StratificationP3fitt GA4StratificationP3m GA4StratificationP3x

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(dataName,numberOfStrata,sampleSize,iteration,GAgenerationSize,mutationRate){
  dataName=data.frame(dataName)
  dataName=data.frame(dataName[order(dataName[,1]),])
  lengthData=nrow(dataName)
  randomGeneration=array(0,dim=c(GAgenerationSize,lengthData))
  randomNumbers=array(0,dim=c(GAgenerationSize,numberOfStrata-1))
  cumTotal=cumsum(dataName)
  sumSquares=cumsum(dataName^2)
  nocrom=GAgenerationSize
  fitp1=array(0,dim=c(1,nocrom))
  fit=array(0,dim=c(nocrom,1))
  N=means=s=n=vars=mas=NN=k=p=t=array(0,dim=c(nocrom,numberOfStrata))
  dd=array(0,dim=c(nocrom,1))
  tableData=as.data.frame(table(dataName))
  randomnumRange=cumsum(tableData[,2])
  lengthRandomnum=length(randomnumRange)-1
  for (i in 1:GAgenerationSize){
    randomNumbers[i,]=randomnumGenerator(randomnumRange,lengthRandomnum,numberOfStrata-1)
  }
  son=array(lengthData,dim=c(GAgenerationSize,1))
  indis=array(c(1:GAgenerationSize,randomNumbers,son),dim=c(GAgenerationSize,(numberOfStrata+1)))
  for(i in 2:(numberOfStrata+1)){
    randomGeneration[indis[,c(1,i)]]=1
  }
  bestValue=-99999999999999999999999999999999999999999999999999999999999999
  for ( i in 1:iteration ){
    fitnessValueGeneration=GA4StratificationP3fit(randomGeneration,dataName,numberOfStrata,sampleSize,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)
    if ( max(fitnessValueGeneration)>bestValue ){
      bestValue=max(fitnessValueGeneration)
      bestGeneration=randomGeneration[max(which(fitnessValueGeneration==bestValue)),]
    }
    randomGeneration=GA4StratificationSelection(randomGeneration,fitnessValueGeneration)
    randomGeneration=GA4StratificationP3x(randomGeneration,bestGeneration,dataName,numberOfStrata,sampleSize,fitnessValueGeneration,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)
    randomGeneration=GA4StratificationP3m(randomGeneration,mutationRate,i)
    fitnessValueGeneration=GA4StratificationP3fit(randomGeneration,dataName,numberOfStrata,sampleSize,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)
    if ( max(fitnessValueGeneration)>bestValue ){
      bestValue=max(fitnessValueGeneration)
      bestGeneration=randomGeneration[max(which(fitnessValueGeneration==bestValue)),]
    } else{
      randomGeneration[sample(GAgenerationSize,1),]=bestGeneration
    }
    cat(i, " ",-bestValue,'\n')
    flush.console()
    Sys.sleep(1)
  }
  GA4StratificationP3fitt(bestGeneration,dataName,numberOfStrata,sampleSize,-bestValue,cumTotal,sumSquares)
}

GA4StratificationP3fit

The genetic algorithm (GA) fitness function that calculates the variance of the estimate of each chromosome in the GA generation with a Neyman Sample Allocation Scheme

Description

This is the fitness function in GA that calculates the variance of the estimate according to the boundaries obtained with GA and the sample sizes obtained with Neyman Allocation.
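Neyman allocation assigns each stratum a share of the sample proportional to the product of its size and its standard deviation. The sketch below reproduces the rounding and clamping used in the function body under Examples; `neyman_allocation` is a hypothetical name, and the top-up step applied when the rounded total falls short of the sample size is left out for brevity.

```r
# Hypothetical sketch: Neyman allocation, n_h proportional to N_h * s_h,
# rounded, with at least 1 and at most N_h units per stratum.
neyman_allocation <- function(N_h, s_h, sample_size) {
  raw <- sample_size * N_h * s_h / sum(N_h * s_h)
  pmin(pmax(1, round(raw)), N_h)
}

n_h <- neyman_allocation(N_h = c(50, 30, 20), s_h = c(2, 5, 10), sample_size = 30)
n_h  # 7 10 13
```

The smallest stratum receives the largest sample here because it is by far the most variable, which is the behaviour that distinguishes Neyman from proportional allocation.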

Usage

GA4StratificationP3fit(randomGeneration, dataName, numberOfStrata, sampleSize, cumTotal, sumSquares, c, dd, nocrom, fitp1, fit, N, means, s, n, vars, mas, NN, k, p, t)

Arguments

randomGeneration The generation to which the fitness function is applied and for which a fitness value is calculated. This is initially a random generation; after each iteration it is the mutated, crossovered and selected generation.

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

cumTotal An array: The cumulative total of the data elements from i=1 to i=N.

sumSquares An array: The cumulative total of the squares of the data elements from i=1 to i=N.

c An integer: The length of the data.

dd An array (nocrom X 1): The minimum of the difference between the size of each stratum and the sample size to be drawn from that stratum.

nocrom An integer: The number of chromosomes in the generation.

fitp1 An array (1 X nocrom): The fitness value for each chromosome in the generation.

fit An array (nocrom X 1): The fitness value for each chromosome in the generation.

N An array (nocrom X nofstrata): The number of elements in each stratum for each chromosome in the generation.

means An array (nocrom X nofstrata): The mean of the elements in each stratum for each chromosome in the generation.

s An array (nocrom X nofstrata): The standard deviation of the elements in each stratum for each chromosome in the generation.

n An array (nocrom X nofstrata): The sample size to be drawn from each stratum for each chromosome in the generation.

vars An array (nocrom X nofstrata): The variance of the estimate in each stratum for each chromosome in the generation.

mas An array (nocrom X nofstrata): The index of each stratum boundary for each chromosome in the generation.

NN An array (nocrom X nofstrata): The cumulative sum of the number of elements in each stratum.

k An array (nocrom X nofstrata): The difference between the number of elements and the sample size in each stratum.

t An array (nocrom X nofstrata): The maximum of k.

p An array (nocrom X nofstrata): The index of the element where k is equal to t.

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP3 GA4StratificationP3fitt GA4StratificationP3m GA4StratificationP3x

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(randomGeneration,dataName,numberOfStrata,sampleSize,cumTotal,sumSquares,c,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t){
  for ( i in 1:nocrom ){
    mas[i,]=which(randomGeneration[i,]==1,arr.ind=TRUE)
    N[i,1]=min(mas[i,])
    means[i,1]=cumTotal[mas[i,1],]/N[i,1]
    if(N[i,1]==1){
      s[i,1]=0
    } else{
      s[i,1]=((N[i,1]/(N[i,1]-1))*(sumSquares[N[i,1]]/N[i,1]-means[i,1]^2))^.5
    }
    for ( j in 2:numberOfStrata ){
      N[i,j]=mas[i,j]-mas[i,(j-1)]
      means[i,j]=(cumTotal[mas[i,j],]-cumTotal[mas[i,j-1],])/N[i,j]
      if(N[i,j]==1){
        s[i,j]=0
      } else{
        s[i,j]=((N[i,j]/(N[i,j]-1))*((sumSquares[mas[i,j]]-sumSquares[mas[i,j-1]])/N[i,j]-means[i,j]^2))^.5
      }
    }
    for ( j in 1:numberOfStrata ){
      n[i,j]=max(1,round(sampleSize*N[i,j]*s[i,j]/sum(N[i,]*s[i,])))
      n[i,j]=min(n[i,j],N[i,j])
    }
    if ( (sampleSize-sum(n[i,]))>0 ){
      k[i,]=N[i,]-n[i,]
      t[i,]=max(k[i,])
      p[i,]=which(k[i,]==t[i,],arr.ind=TRUE)[1]
      n[i,p[i,1]]=min(n[i,p[i,1]]+sampleSize-sum(n[i,]),N[i,p[i,1]])
    }
    for ( j in 1:numberOfStrata ){
      vars[i,j]=((N[i,j]-n[i,j])*s[i,j]^2*N[i,j]^2)/(c^2*n[i,j]*N[i,j])
    }
    dd[i,]=min((N[i,]-n[i,]))
    NN[i,]=cumsum(N[i,])
    kl=0
    fit[i,]=sum(vars[i,])
    if ( dd[i]<0 ){
      fit[i]= 9999999999999999
    } else if (!all(N[i,]!=1)){
      fit[i]= 999999999999999999
    } else if (!all(N[i,]!=0)){
      fit[i]= 999999999999999999
    } else{
      fit[i]=fit[i]
    }
    for ( j in 1:(numberOfStrata-1) ){
      kl=kl+dataName[(NN[i,j]+1),]-dataName[NN[i,j],]
    }
    p3fit=array(-fit,dim=c(nocrom,1))
  }
  return(p3fit)
}

GA4StratificationP3fitt

The genetic algorithm fitness function to calculate the variance of the estimate of each chromosome in the generation with a Neyman Sample Allocation Scheme

Description

This is the fitness function in the Genetic Algorithm that calculates the variance of the estimate according to the boundaries obtained with the GA and the sample sizes obtained with Neyman Allocation for the final chromosome.
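The Neyman allocation step used by this function can be sketched in isolation. The helper below, neyman_allocation, is a hypothetical name introduced only for illustration; it mirrors the rounding, the bounds (at least 1, at most the stratum size) and the shortfall rule visible in the function listing, and is a sketch rather than the package's own code.

```r
# Hypothetical helper, not part of GA4Stratification.
# N: stratum sizes, s: stratum standard deviations, sampleSize: total sample size.
neyman_allocation <- function(N, s, sampleSize) {
  n <- round(sampleSize * N * s / sum(N * s))  # n_h proportional to N_h * S_h
  n <- pmin(pmax(n, 1), N)                     # at least 1, at most N_h
  shortfall <- sampleSize - sum(n)
  if (shortfall > 0) {
    slack <- N - n                             # unsampled elements per stratum
    h <- which(slack == max(slack))[1]         # first stratum with the most slack
    n[h] <- min(n[h] + shortfall, N[h])
  }
  n
}

alloc <- neyman_allocation(N = c(50, 30, 20), s = c(1, 2, 4), sampleSize = 20)
print(alloc)
```

As in the listing, only a shortfall is corrected; if rounding makes the allocation exceed the requested total, nothing is taken back.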


Usage

GA4StratificationP3fitt(bestGeneration, dataName, numberOfStrata, sampleSize, bestValue, cumTotal, sumSquares)

Arguments

bestGeneration The generation that has the smallest fitness value; this best generation is delivered to the next step.

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

bestValue A numeric: The best fitness value, that is, the minimum variance of the estimate for the best generation.

cumTotal An array: The cumulative total of the data elements from i=1 to i=N.

sumSquares An array: The cumulative total of the squares of the data elements from i=1 to i=N.
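To illustrate why cumTotal and sumSquares are passed in, the sketch below recovers the mean and standard deviation of one stratum from the two cumulative sums, without revisiting the raw data. The helper stratum_stats and the toy vector x are assumptions made for this example; the formula follows the one used in the function listing.

```r
x <- sort(c(2, 4, 4, 6, 10, 12))   # toy stratification variable, sorted
cumTotal   <- cumsum(x)            # running total of the values
sumSquares <- cumsum(x^2)          # running total of the squared values

# Hypothetical helper: statistics of the stratum holding elements (lo+1)..hi.
stratum_stats <- function(lo, hi) {
  Nh  <- hi - lo
  tot <- cumTotal[hi]   - (if (lo > 0) cumTotal[lo]   else 0)
  ssq <- sumSquares[hi] - (if (lo > 0) sumSquares[lo] else 0)
  m <- tot / Nh
  s <- if (Nh == 1) 0 else sqrt((Nh / (Nh - 1)) * (ssq / Nh - m^2))
  c(mean = m, sd = s)
}

st <- stratum_stats(3, 6)          # stratum made of the 4th to 6th elements
print(st)
```

The result agrees with mean(x[4:6]) and sd(x[4:6]), which is the point of keeping the cumulative sums: every candidate boundary can be evaluated in constant time.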

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP3 GA4StratificationP3fit GA4StratificationP3m GA4StratificationP3x

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(bestGeneration,dataName,numberOfStrata,sampleSize,bestValue,cumTotal,sumSquares)
{
  c=nrow(dataName)
  nocrom=length(bestGeneration)/c
  bestGeneration=array(bestGeneration,dim=c(length(bestGeneration)/c,c))
  fitp1=array(0,dim=c(1,nocrom))
  fit=array(0,dim=c(nocrom,1))
  N=means=s=n=vars=mas=NN=k=p=t=array(0,dim=c(nocrom,numberOfStrata))
  dd=array(0,dim=c(nocrom,1))
  for ( i in 1:nocrom )
  {
    # stratum boundaries: positions of the 1s in the chromosome
    mas[i,]=which(bestGeneration[i,]==1,arr.ind=TRUE)
    N[i,1]=min(mas[i,])
    means[i,1]=cumTotal[mas[i,1],]/N[i,1]
    s[i,1]=((N[i,1]/(N[i,1]-1))*(sumSquares[N[i,1]]/N[i,1]-means[i,1]^2))^.5
    for ( j in 2:numberOfStrata )
    {
      N[i,j]=mas[i,j]-mas[i,(j-1)]
      means[i,j]=(cumTotal[mas[i,j],]-cumTotal[mas[i,j-1],])/N[i,j]
      s[i,j]=((N[i,j]/(N[i,j]-1))*((sumSquares[mas[i,j]]-sumSquares[mas[i,j-1]])/N[i,j]-means[i,j]^2))^.5
    }
    # Neyman allocation, bounded below by 1 and above by the stratum size
    for ( j in 1:numberOfStrata )
    {
      n[i,j]=max(1,round(sampleSize*N[i,j]*s[i,j]/sum(N[i,]*s[i,])))
      n[i,j]=min(n[i,j],N[i,j])
    }
    # assign any shortfall to the stratum with the most unsampled elements
    if ( (sampleSize-sum(n[i,]))>0 )
    {
      k[i,]=N[i,]-n[i,]
      t[i,]=max(k[i,])
      p[i,]=which(k[i,]==t[i,],arr.ind=TRUE)[1]
      n[i,p[i,1]]=min(n[i,p[i,1]]+sampleSize-sum(n[i,]),N[i,p[i,1]])
    }
  }
  return(array(c(N,n,bestValue),dim=c(numberOfStrata,3)))
}

GA4StratificationP3m The mutation operation in the genetic algorithm for the determination of stratum boundaries and sample sizes of each stratum in stratified sampling

Description

This is the mutation operation in the Genetic Algorithm approach for the determination of the stratum boundaries and sample sizes in each stratum in stratified sampling with a Neyman Sample Allocation Scheme.


Usage

GA4StratificationP3m(mutationGeneration, mutationRate, it)

Arguments

mutationGeneration

The generation that will be mutated and transferred to the next generation.

mutationRate A numeric: The mutation rate in the Genetic Algorithm process. The mutation rate must be between 0 and 1, inclusive. Small mutation rates are preferable in the Genetic Algorithm approach.

it An integer: The number of the current iteration.
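As a rough illustration of the early-iteration branch of this operator (the case it<50 in the function listing), the sketch below moves one stratum boundary to a randomly chosen non-boundary position while leaving the last boundary untouched. The helper mutate_boundary is a hypothetical name, and the sketch assumes a chromosome with at least two boundaries and at least one free position.

```r
# Hypothetical helper, not part of GA4Stratification.
# chrom: 0/1 vector in which each 1 marks the last element of a stratum.
mutate_boundary <- function(chrom) {
  ones  <- which(chrom == 1)
  zeros <- which(chrom == 0)
  from <- ones[sample(length(ones) - 1, 1)]  # any boundary except the last one
  to   <- zeros[sample(length(zeros), 1)]    # any position that is not a boundary
  chrom[from] <- 0
  chrom[to]   <- 1
  chrom
}

set.seed(1)
chrom   <- c(0, 0, 1, 0, 0, 1, 0, 0, 0, 1)
mutated <- mutate_boundary(chrom)
print(rbind(chrom, mutated))
```

The number of boundaries, and therefore the number of strata, is unchanged by the move. The later-iteration branch in the listing instead shifts a boundary by one position to the left or right.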

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP3 GA4StratificationP3fit GA4StratificationP3fitt GA4StratificationP3x

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(mutationGeneration,mutationRate,it)
{
  rowOfMutationGeneration=nrow(mutationGeneration)
  for ( i in 1:rowOfMutationGeneration)
  {
    for ( k in 1:5 )
    {
      if ( runif(1,0,1) < mutationRate )
      {
        if ( runif(1,0,1) < mutationRate )
        {
          if ( it<50 )
          {
            # early iterations: move a boundary to a random free position
            ones=which(mutationGeneration[i,]==1,arr.ind=TRUE)
            zeros=which(mutationGeneration[i,]==0,arr.ind=TRUE)
            mutationPoint=ones[sample((length(ones)-1),1)]
            mutationPoint1=zeros[sample(length(zeros),1)]
            mutationGeneration[i,mutationPoint]=0
            mutationGeneration[i,mutationPoint1]=1
          } else
          {
            # later iterations: shift a boundary by one position
            ones=which(mutationGeneration[i,]==1,arr.ind=TRUE)
            mutationPoint=ones[sample((length(ones)-1),1)]
            if ( runif(1,0,1)<0.51 )
            {
              if ( mutationGeneration[i,(mutationPoint+1)]==0 )
              {
                mutationGeneration[i,mutationPoint]=0
                mutationGeneration[i,(mutationPoint+1)]=1
              }
            } else if ( mutationPoint>1)
            {
              if ( mutationGeneration[i,(mutationPoint-1)]==0 )
              {
                mutationGeneration[i,mutationPoint]=0;
                mutationGeneration[i,mutationPoint-1]=1;
              }
            }
          }
        }
      }
    }
  }
  return(mutationGeneration)
}

GA4StratificationP3x The crossover operation in the genetic algorithm for the determination of stratum boundaries and sample sizes of each stratum in stratified sampling

Description

This is the crossover operation in the Genetic Algorithm approach for the determination of the stratum boundaries and sample sizes in each stratum in stratified sampling with a Neyman Sample Allocation Scheme.
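The core idea of the crossover, as it appears in the function listing, is a single cut point at which both parents have the same number of boundaries to the left, so that the child keeps the required number of strata. The sketch below is a simplified variant: it enumerates the admissible cut points instead of resampling until one is found, and it considers every position rather than only positions where the data value changes. The helper crossover is a hypothetical name.

```r
# Hypothetical helper, not part of GA4Stratification.
# mother, father: 0/1 boundary vectors of equal length with the same number of 1s.
crossover <- function(mother, father) {
  len <- length(mother)
  # admissible cuts: both parents have the same number of boundaries up to the cut
  ok <- which(cumsum(mother)[-len] == cumsum(father)[-len])
  if (length(ok) == 0) return(mother)        # no admissible cut: keep the mother
  cut <- ok[sample(length(ok), 1)]
  c(mother[1:cut], father[(cut + 1):len])    # head from mother, tail from father
}

mother <- c(0, 1, 0, 0, 1, 0, 0, 1)
father <- c(0, 0, 1, 0, 0, 0, 1, 1)
set.seed(42)
child <- crossover(mother, father)
print(child)
```

Whatever cut is drawn, the child has the same number of boundaries as its parents and still ends with a boundary at the last position.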

Usage

GA4StratificationP3x(crossoverGeneration,bestGeneration,dataName,numberOfStrata,sampleSize,fitnessValueGeneration,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)


Arguments

crossoverGeneration

The generation that will be crossed over and transferred to the next generation.

bestGeneration The generation that has the best fitness value after the selection process.

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

fitnessValueGeneration

The fitness value (the variance of the estimate) of the best generation after selection.

cumTotal An array: The cumulative total of the data elements from i=1 to i=N.

sumSquares An array: The cumulative total of the squares of the data elements from i=1 to i=N.

lengthData An integer: The size of the data.

dd An array (nocrom X 1): The minimum of the difference between the size of each stratum and the sample size to be drawn from that stratum.

nocrom An integer: The number of chromosomes in the generation.

fitp1 An array (1 X nocrom): The fitness value for each chromosome in the generation.

fit An array (nocrom X 1): The fitness value for each chromosome in the generation.

N An array (nocrom X nofstrata): The number of elements in each stratum for each chromosome in the generation.

means An array (nocrom X nofstrata): The mean of the elements in each stratum for each chromosome in the generation.

s An array (nocrom X nofstrata): The standard deviation of the elements in each stratum for each chromosome in the generation.

n An array (nocrom X nofstrata): The sample size to be drawn from each stratum for each chromosome in the generation.

vars An array (nocrom X nofstrata): The variance of the estimate in each stratum for each chromosome in the generation.

mas An array (nocrom X nofstrata): The index of each stratum boundary for each chromosome in the generation.

NN An array (nocrom X nofstrata): The cumulative sum of the number of elements in each stratum.

k An array (nocrom X nofstrata): The difference between the number of elements and the sample size in each stratum.

t An array (nocrom X nofstrata): The maximum of k.

p An array (nocrom X nofstrata): The index of the element where k is equal to t.


Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP3 GA4StratificationP3fit GA4StratificationP3fitt GA4StratificationP3m

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(crossoverGeneration,bestGeneration,dataName,numberOfStrata,sampleSize,
         fitnessValueGeneration,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,
         N,means,s,n,vars,mas,NN,k,p,t)
{
  fitnessValueParents=fitnessValueGeneration
  parents=cbind(crossoverGeneration,fitnessValueParents)
  crossoverGenerationp=crossoverGeneration
  rowCrossoverGenerationp=nocrom
  tableData=as.data.frame(table(dataName))
  randomnumRange=cumsum(tableData[,2])
  lengthRandomnum=length(randomnumRange)
  randomNumbers=array(0,dim=c(rowCrossoverGenerationp,2))
  # draw a pair of parents for every child
  for (i in 1:rowCrossoverGenerationp)
  {
    randomNumbers[i,]=randomnumGenerator((1:rowCrossoverGenerationp),(rowCrossoverGenerationp+1),2)
  }
  mother=father=NULL
  for (i in 1:rowCrossoverGenerationp)
  {
    mother=randomNumbers[i,1]
    father=randomNumbers[i,2]
    # resample the cut point until both parents have the same number of boundaries before it
    crossoverPoint=sample((randomnumRange[1:lengthRandomnum-1]),1)
    while ( sum(crossoverGenerationp[mother,c(1:crossoverPoint)])!=sum(crossoverGenerationp[father,c(1:crossoverPoint)]) )
    {
      crossoverPoint=sample((randomnumRange[1:lengthRandomnum-1]),1)
    }
    crossoverGeneration[i,c(1:crossoverPoint)]=crossoverGenerationp[mother,c(1:crossoverPoint)]
    crossoverGeneration[i,c((crossoverPoint+1):lengthData)]=crossoverGenerationp[father,c((crossoverPoint+1):lengthData)]
  }
  # evaluate the children, pool them with the parents and keep the better half
  s=GA4StratificationP3fit(crossoverGeneration,dataName,numberOfStrata,sampleSize,cumTotal,sumSquares,
                           lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)
  crossoverGenerationx=cbind(crossoverGeneration,s)
  GA4StratificationP3x=rbind(parents, crossoverGenerationx)
  GA4StratificationP3x=GA4StratificationP3x[order(GA4StratificationP3x[,(lengthData+1)]),]
  GA4StratificationP3x=GA4StratificationP3x[c((rowCrossoverGenerationp+1):(rowCrossoverGenerationp*2)),c(1:lengthData)]
  return(GA4StratificationP3x)
}

GA4StratificationP4 The genetic algorithm function to determine the stratum boundaries and sample sizes of each stratum in stratified sampling with GA Sample Allocation Scheme

Description

This is the general function in the Genetic Algorithm that initially generates a random generation and then applies the fitness function, selects, mutates and crosses over in order to obtain the best solution.

Usage

GA4StratificationP4(dataName, numberOfStrata, sampleSize, iteration, GAgenerationSize, mutationRate)

Arguments

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

iteration An integer: The number of iterations in the Genetic Algorithm process.

GAgenerationSize

An integer: The size of a generation in the Genetic Algorithm process, that is, the number of chromosomes in each generation.

mutationRate A numeric: The mutation rate in the Genetic Algorithm process. The mutation rate must be between 0 and 1, inclusive. Small mutation rates are preferable in the Genetic Algorithm approach.
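Judging from the function listing, each chromosome in this scheme has lengthData boundary genes (a 1 marks the last element of a stratum) followed by numberOfStrata genes holding the sample sizes. The sketch below decodes such a chromosome; decode_chromosome and the toy chromosome are assumptions made for illustration only.

```r
# Hypothetical helper, not part of GA4Stratification.
decode_chromosome <- function(chrom, lengthData, numberOfStrata) {
  bounds <- which(chrom[1:lengthData] == 1)            # last element of each stratum
  N <- diff(c(0, bounds))                              # stratum sizes
  n <- chrom[(lengthData + 1):(lengthData + numberOfStrata)]  # sample sizes
  list(boundaries = bounds, N = N, n = n)
}

# 10 sorted data points, 3 strata ending at positions 3, 7 and 10,
# with sample sizes 2, 3 and 4 appended at the end.
chrom <- c(0, 0, 1, 0, 0, 0, 1, 0, 0, 1,   2, 3, 4)
d <- decode_chromosome(chrom, lengthData = 10, numberOfStrata = 3)
str(d)
```

Because the sample sizes are part of the chromosome, this scheme lets the GA search over boundaries and allocation together, whereas the Neyman variant derives the allocation from the boundaries.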

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).


Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP4fit GA4StratificationP4fitt GA4StratificationP4m GA4StratificationP4x

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(dataName,numberOfStrata,sampleSize,iteration,GAgenerationSize,mutationRate)
{
  # sort the data and set up the working arrays
  dataName=data.frame(dataName)
  dataName=data.frame(dataName[order(dataName[,1]),])
  lengthData=nrow(dataName)
  randomGeneration=array(0,dim=c(GAgenerationSize,lengthData+numberOfStrata))
  nocrom=GAgenerationSize
  fitp1=array(0,dim=c(1,nocrom))
  fit=array(0,dim=c(nocrom,1))
  N=means=s=n=vars=mas=NN=k=p=t=array(0,dim=c(nocrom,numberOfStrata))
  dd=array(0,dim=c(nocrom,1))
  randomNumbers=array(0,dim=c(GAgenerationSize,numberOfStrata-1))
  samples=sumSamples=array(0,dim=c(GAgenerationSize,numberOfStrata))
  aftermut=array(0,dim=c(GAgenerationSize,lengthData+numberOfStrata+1))
  cumTotal=cumsum(dataName)
  sumSquares=cumsum(dataName^2)
  tableData=as.data.frame(table(dataName))
  randomnumRange=cumsum(tableData[,2])
  lengthRandomnum=length(randomnumRange)
  # random boundaries and random sample-size weights for every chromosome
  for (i in 1:GAgenerationSize)
  {
    randomNumbers[i,]=randomnumGenerator(randomnumRange,lengthRandomnum,numberOfStrata-1)
    samples[i,]=floor(runif(numberOfStrata,1,10))
    sumSamples[i,]=sum(samples[i,])
  }
  samples=floor(samples/sumSamples*sampleSize)
  son=array(lengthData,dim=c(GAgenerationSize,1))
  indis=array(c(1:GAgenerationSize,randomNumbers,son),dim=c(GAgenerationSize,(numberOfStrata+1)))
  for(i in 2:(numberOfStrata+1))
  {
    randomGeneration[indis[,c(1,i)]]=1
  }
  # put the rounding remainder into the last stratum
  for ( i in 1:GAgenerationSize )
  {
    samples[i,numberOfStrata]=samples[i,numberOfStrata]+sampleSize-sum(samples[i,])
  }
  randomGeneration[,(lengthData+1):(lengthData+numberOfStrata)]=samples
  bestValue=-99999999999999999999999999999999999999999999999999999999999999
  for ( i in 1:iteration )
  {
    fitnessValueGeneration=GA4StratificationP4fit(randomGeneration,dataName,numberOfStrata,sampleSize,cumTotal,sumSquares,
                                                  lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)
    if ( max(fitnessValueGeneration)>bestValue )
    {
      bestValue=max(fitnessValueGeneration)
      bestGeneration=randomGeneration[max(which(fitnessValueGeneration==bestValue)),]
    }
    # selection, crossover and mutation
    randomGeneration=GA4StratificationSelection(randomGeneration,fitnessValueGeneration)
    randomGeneration=GA4StratificationP4x(randomGeneration,bestGeneration,dataName,numberOfStrata,sampleSize,
                                          fitnessValueGeneration,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,
                                          N,means,s,n,vars,mas,NN,k,p,t)
    randomGeneration=GA4StratificationP4m(randomGeneration,numberOfStrata,mutationRate,i)
    fitnessValueGeneration=GA4StratificationP4fit(randomGeneration,dataName,numberOfStrata,sampleSize,cumTotal,sumSquares,
                                                  lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)
    if ( max(fitnessValueGeneration)>bestValue )
    {
      bestValue=max(fitnessValueGeneration)
      bestGeneration=randomGeneration[max(which(fitnessValueGeneration==bestValue)),]
    } else
    {
      # no improvement: re-insert the best chromosome at a random position
      randomGeneration[sample(GAgenerationSize,1),]=bestGeneration
    }
    cat(i, " ",-bestValue,'\n')
    flush.console()
    Sys.sleep(1)
  }
  GA4StratificationP4fitt(bestGeneration,dataName,numberOfStrata,sampleSize,-bestValue,cumTotal,sumSquares)
}


GA4StratificationP4fit

The genetic algorithm (GA) fitness function that calculates the variance of the estimate of each chromosome in the GA generation with a GA Sample Allocation Scheme

Description

This is the fitness function in the GA that calculates the variance of the estimate according to the boundaries obtained with the GA and the sample sizes obtained with GA Allocation.
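The quantity being computed can be sketched independently of the GA. The helper strat_variance below evaluates the per-stratum expression used in the function listing and sums it; fitness adds a penalty for infeasible allocations. Both names, and the penalty value 1e22, are assumptions for this example and do not reproduce the exact constants in the package.

```r
# Hypothetical helpers, not part of GA4Stratification.
# N: stratum sizes, n: sample sizes, s: stratum standard deviations.
strat_variance <- function(N, n, s) {
  total <- sum(N)                      # plays the role of the data length c
  sum((N - n) * s^2 * N^2 / (total^2 * n * N))
}

# The GA maximises fitness, so the variance enters with a negative sign.
fitness <- function(N, n, s, penalty = 1e22) {
  if (min(N - n) < 0 || any(N <= 1)) return(-penalty)  # infeasible chromosome
  -strat_variance(N, n, s)
}

N <- c(3, 4, 3); n <- c(2, 2, 2); s <- c(1, 2, 1.5)
v <- strat_variance(N, n, s)
# textbook form: sum of W_h^2 * S_h^2 / n_h * (1 - n_h / N_h), with W_h = N_h / N
v_textbook <- sum((N / sum(N))^2 * s^2 / n * (1 - n / N))
print(c(v, v_textbook))
```

The two forms agree, which shows that the expression in the listing is the usual variance of the stratified sample mean with a finite population correction.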

Usage

GA4StratificationP4fit(randomGeneration, dataName, numberOfStrata, sampleSize, cumTotal, sumSquares, c, dd, nocrom, fitp1, fit, N, means, s, n, vars, mas, NN, k, p, t)

Arguments

randomGeneration

The generation to which the fitness function will be applied and for which a fitness value will be calculated. This is initially a random generation; after each iteration it is the mutated, crossed-over and selected generation.

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

cumTotal An array: The cumulative total of the data elements from i=1 to i=N.

sumSquares An array: The cumulative total of the squares of the data elements from i=1 to i=N.

c An integer: The length of the data.

dd An array (nocrom X 1): The minimum of the difference between the size of each stratum and the sample size to be drawn from that stratum.

nocrom An integer: The number of chromosomes in the generation.

fitp1 An array (1 X nocrom): The fitness value for each chromosome in the generation.

fit An array (nocrom X 1): The fitness value for each chromosome in the generation.

N An array (nocrom X nofstrata): The number of elements in each stratum for each chromosome in the generation.

means An array (nocrom X nofstrata): The mean of the elements in each stratum for each chromosome in the generation.

s An array (nocrom X nofstrata): The standard deviation of the elements in each stratum for each chromosome in the generation.

n An array (nocrom X nofstrata): The sample size to be drawn from each stratum for each chromosome in the generation.


vars An array (nocrom X nofstrata): The variance of the estimate in each stratum for each chromosome in the generation.

mas An array (nocrom X nofstrata): The index of each stratum boundary for each chromosome in the generation.

NN An array (nocrom X nofstrata): The cumulative sum of the number of elements in each stratum.

k An array (nocrom X nofstrata): The difference between the number of elements and the sample size in each stratum.

t An array (nocrom X nofstrata): The maximum of k.

p An array (nocrom X nofstrata): The index of the element where k is equal to t.

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP4 GA4StratificationP4fitt GA4StratificationP4m GA4StratificationP4x

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(randomGeneration,dataName,numberOfStrata,sampleSize,cumTotal,sumSquares,
         c,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)
{
  for ( i in 1:nocrom )
  {
    # stratum boundaries: positions of the 1s among the first c genes
    mas[i,]=which(randomGeneration[i,1:c]==1,arr.ind=TRUE)
    N[i,1]=min(mas[i,])
    means[i,1]=cumTotal[mas[i,1],]/N[i,1]
    if(N[i,1]==1)
    {s[i,1]=0} else
    {
      s[i,1]=((N[i,1]/(N[i,1]-1))*(sumSquares[N[i,1]]/N[i,1]-means[i,1]^2))^.5
    }
    # sample sizes are read from the last numberOfStrata genes
    n[i,]=randomGeneration[i,(c+1):(c+numberOfStrata)]
    for ( j in 2:numberOfStrata )
    {
      N[i,j]=mas[i,j]-mas[i,(j-1)]
      means[i,j]=(cumTotal[mas[i,j],]-cumTotal[mas[i,j-1],])/N[i,j]
      means
      if(N[i,j]==1)
      {s[i,j]=0} else
      {
        s[i,j]=((N[i,j]/(N[i,j]-1))*((sumSquares[mas[i,j]]-sumSquares[mas[i,j-1]])/N[i,j]-means[i,j]^2))^.5
      }
    }
    # variance of the estimate contributed by each stratum
    for ( j in 1:numberOfStrata )
    {
      vars[i,j]=((N[i,j]-n[i,j])*s[i,j]^2*N[i,j]^2)/(c^2*n[i,j]*N[i,j])
    }
    dd[i,]=min((N[i,]-n[i,]))
    NN[i,]=cumsum(N[i,])
    fit[i,]=sum(vars[i,])
    # penalise infeasible chromosomes with a very large variance
    if ( dd[i]<0 )
    {
      fit[i]= 99999999999999999999999
    } else if (!all(N[i,]!=1))
    {
      fit[i]= 99999999999999999999999
    } else if (!all(N[i,]!=0))
    {
      fit[i]= 99999999999999999999999
    } else
    {fit[i]=fit[i]}
    p4fit=array(-fit,dim=c(nocrom,1))
  }
  return(p4fit)
}

GA4StratificationP4fitt

The genetic algorithm fitness function to calculate the variance of the estimate of each chromosome in the generation with a GA Sample Allocation Scheme


Description

This is the fitness function in the Genetic Algorithm that calculates the variance of the estimate according to the boundaries obtained with the GA and the sample sizes obtained with GA Allocation for the final chromosome.

Usage

GA4StratificationP4fitt(bestGeneration, dataName, numberOfStrata, sampleSize, bestValue, cumTotal, sumSquares)

Arguments

bestGeneration The generation that has the smallest fitness value; this best generation is delivered to the next step.

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

bestValue A numeric: The best fitness value, that is, the minimum variance of the estimate for the best generation.

cumTotal An array: The cumulative total of the data elements from i=1 to i=N.

sumSquares An array: The cumulative total of the squares of the data elements from i=1 to i=N.

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP4 GA4StratificationP4fit GA4StratificationP4m GA4StratificationP4x

GA4StratificationSelection randomnumGenerator


Examples

## The function is currently defined as
function(bestGeneration,dataName,numberOfStrata,sampleSize,bestValue,cumTotal,sumSquares)
{
  c=nrow(dataName)
  nocrom=length(bestGeneration)/(c+numberOfStrata)
  bestGeneration=array(bestGeneration,dim=c(length(bestGeneration)/(c+numberOfStrata),(c+numberOfStrata)))
  fitp1=array(0,dim=c(1,nocrom))
  fit=array(0,dim=c(nocrom,1))
  N=means=s=n=vars=mas=NN=k=p=t=array(0,dim=c(nocrom,numberOfStrata))
  dd=array(0,dim=c(nocrom,1))
  for ( i in 1:nocrom )
  {
    # stratum boundaries: positions of the 1s among the first c genes
    mas[i,]=which(bestGeneration[i,1:c]==1,arr.ind=TRUE)
    N[i,1]=min(mas[i,])
    means[i,1]=cumTotal[mas[i,1],]/N[i,1]
    s[i,1]=((N[i,1]/(N[i,1]-1))*(sumSquares[N[i,1]]/N[i,1]-means[i,1]^2))^.5
    # sample sizes are read from the last numberOfStrata genes
    n[i,]=bestGeneration[i,(c+1):(c+numberOfStrata)]
    for ( j in 2:numberOfStrata )
    {
      N[i,j]=mas[i,j]-mas[i,(j-1)]
      N
      means[i,j]=(cumTotal[mas[i,j],]-cumTotal[mas[i,j-1],])/N[i,j]
      means
      s[i,j]=((N[i,j]/(N[i,j]-1))*((sumSquares[mas[i,j]]-sumSquares[mas[i,j-1]])/N[i,j]-means[i,j]^2))^.5
      s
    }
  }
  return(array(c(N,n,bestValue),dim=c(numberOfStrata,3)))
}
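The layout of the returned array can be checked with a small stand-alone sketch. The values of N, n and bestValue below are made up for illustration. The first column holds the stratum sizes and the second the sample sizes; since c(N, n, bestValue) has fewer elements than the array, base R's array() recycles the data, so only the first entry of the third column is bestValue.

```r
# Made-up values standing in for the quantities computed inside the function.
N <- c(3, 4, 3)          # stratum sizes
n <- c(2, 3, 4)          # sample sizes
bestValue <- 0.25        # variance of the estimate
numberOfStrata <- length(N)

res <- array(c(N, n, bestValue), dim = c(numberOfStrata, 3))
print(res)
```

When reading the result, it therefore seems safest to take the variance from res[1, 3] only and to ignore the remaining entries of that column.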

GA4StratificationP4m The mutation operation in the genetic algorithm for the determination of stratum boundaries and sample sizes of each stratum in stratified sampling

Description

This is the mutation operation in the Genetic Algorithm approach for the determination of the stratum boundaries and sample sizes in each stratum in stratified sampling with a GA Sample Allocation Scheme.


Usage

GA4StratificationP4m(mutationGeneration, numberOfStrata, mutationRate, it)

Arguments

mutationGeneration

The generation that will be mutated and transferred to the next generation.

numberOfStrata An integer: The number of strata.

mutationRate A numeric: The mutation rate in the Genetic Algorithm process. The mutation rate must be between 0 and 1, inclusive. Small mutation rates are preferable in the Genetic Algorithm approach.

it An integer: The number of the current iteration.
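In this scheme the mutation can also act on the sample-size genes by moving one unit from one stratum to another, which keeps the total sample size fixed. The sketch below is a simplified version of that idea: the helper mutate_sizes is a hypothetical name, the donor is chosen uniformly rather than with the 0.51 coin flip seen in the listing, and the lower bound of 2 per stratum is taken from the comparisons in the listing.

```r
# Hypothetical helper, not part of GA4Stratification.
# n: sample sizes per stratum; min_size: smallest size a stratum may keep.
mutate_sizes <- function(n, min_size = 2) {
  idx <- sample(length(n), 2)              # two distinct strata
  donors <- idx[n[idx] > min_size]         # strata that can give up one unit
  if (length(donors) == 0) return(n)       # neither can donate: no change
  from <- donors[sample(length(donors), 1)]
  to   <- setdiff(idx, from)
  n[from] <- n[from] - 1
  n[to]   <- n[to] + 1
  n
}

set.seed(7)
n0 <- c(2, 5, 3)
n1 <- mutate_sizes(n0)
print(rbind(n0, n1))
```

Moving a single unit at a time keeps each mutation local, so a chromosome that is already close to a good allocation is perturbed only slightly.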

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP4 GA4StratificationP4fit GA4StratificationP4fitt GA4StratificationP4x

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(mutationGeneration,numberOfStrata,mutationRate,it)
{
  rowOfMutationGeneration=nrow(mutationGeneration)
  colOfMutationGeneration=ncol(mutationGeneration)
  # c: number of boundary genes; the last numberOfStrata genes hold the sample sizes
  c=colOfMutationGeneration-numberOfStrata
  for ( i in 1:rowOfMutationGeneration)
  {
    if ( runif(1,0,1) < mutationRate )
    {
      for ( k in 1:2 )
      {
        if ( runif(1,0,1) < 0.35 )
        {
          # mutate the stratum boundaries
          if ( it<100 )
          {
            ones=which(mutationGeneration[i,1:c]==1,arr.ind=TRUE)
            zeros=which(mutationGeneration[i,1:c]==0,arr.ind=TRUE)
            mutationPoint=ones[sample(1:(length(ones)-1),1)]
            mutationPoint1=zeros[sample(1:length(zeros),1)]
            mutationGeneration[i,mutationPoint]=0
            mutationGeneration[i,mutationPoint1]=1
          } else
          {
            ones=which(mutationGeneration[i,1:c]==1,arr.ind=TRUE)
            mutationPoint=ones[sample(1:(length(ones)-1),1)]
            if ( runif(1,0,1)<0.51 )
            {
              if ( mutationGeneration[i,(mutationPoint+1)]==0 )
              {
                mutationGeneration[i,mutationPoint]=0
                mutationGeneration[i,(mutationPoint+1)]=1
              }
            } else if ( mutationPoint>1)
            {
              if ( mutationGeneration[i,(mutationPoint-1)]==0 )
              {
                mutationGeneration[i,mutationPoint]=0
                mutationGeneration[i,mutationPoint-1]=1
              }
            }
          }
        } else
        {
          # mutate the sample sizes: pick two distinct sample-size genes
          mutationPoint=sample((c+1):(c+numberOfStrata),1)
          mutationPoint1=sample((c+1):(c+numberOfStrata),1)
          while ( mutationPoint == mutationPoint1 )
          {
            mutationPoint=sample((c+1):(c+numberOfStrata),1)
            mutationPoint1=sample((c+1):(c+numberOfStrata),1)
          }
          if (mutationGeneration[i,mutationPoint]>2 & mutationGeneration[i,mutationPoint1]>2)
          {
            if (runif(1,0,1)< 0.51)
            {
              mutationGeneration[i,mutationPoint]=mutationGeneration[i,mutationPoint]-1
              mutationGeneration[i,mutationPoint1]=mutationGeneration[i,mutationPoint1]+1
            } else
            {
              mutationGeneration[i,mutationPoint]=mutationGeneration[i,mutationPoint]+1
              mutationGeneration[i,mutationPoint1]=mutationGeneration[i,mutationPoint1]-1
            }
          } else if (mutationGeneration[i,mutationPoint]==2 & mutationGeneration[i,mutationPoint1]>2)
          {
            mutationGeneration[i,mutationPoint]=mutationGeneration[i,mutationPoint]+1
            mutationGeneration[i,mutationPoint1]=mutationGeneration[i,mutationPoint1]-1
          } else (mutationGeneration[i,mutationPoint]>2 & mutationGeneration[i,mutationPoint1]==2)
          # Note: the line above is an else branch holding only a bare expression,
          # so the two assignments below run on every pass through this block.
          mutationGeneration[i,mutationPoint]=mutationGeneration[i,mutationPoint]-1
          mutationGeneration[i,mutationPoint1]=mutationGeneration[i,mutationPoint1]+1
        }
      }
    }
  }
  return(mutationGeneration)
}

GA4StratificationP4x The crossover operation in the genetic algorithm for the determination of stratum boundaries and sample sizes of each stratum in stratified sampling

Description

This is the crossover operation in the Genetic Algorithm approach for the determination of the stratum boundaries and sample sizes in each stratum in stratified sampling with a GA Sample Allocation Scheme.

Usage

GA4StratificationP4x(crossoverGeneration,bestGeneration,dataName,numberOfStrata,sampleSize,fitnessValueGeneration,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)

Arguments

crossoverGeneration

The generation that will be crossed over and transferred to the next generation.

bestGeneration The generation that has the best fitness value after the selection process.

dataName A string: The name of any type of numeric data containing the values of a univariate stratification variable.

numberOfStrata An integer: The number of strata.

sampleSize An integer: The total sample size.

fitnessValueGeneration

The fitness value (the variance of the estimate) of the best generation after selection.

cumTotal An array: The cumulative total of the data elements from i=1 to i=N.

sumSquares An array: The cumulative total of the squares of the data elements from i=1 to i=N.


lengthData An integer: The size of the data.

dd An array (nocrom X 1): The minimum difference between the size of each stratum and the sample size to be drawn from that stratum.

nocrom An integer: The number of chromosomes in the generation.

fitp1 An array (1 X nocrom): The fitness value for each chromosome in the generation.

fit An array (nocrom X 1): The fitness value for each chromosome in the generation.

N An array (nocrom X nofstrata): The number of elements in each stratum for each chromosome in the generation.

means An array (nocrom X nofstrata): The mean of the elements in each stratum for each chromosome in the generation.

s An array (nocrom X nofstrata): The standard deviation of the elements in each stratum for each chromosome in the generation.

n An array (nocrom X nofstrata): The sample size to be drawn from each stratum for each chromosome in the generation.

vars An array (nocrom X nofstrata): The variance of the estimate in each stratum for each chromosome in the generation.

mas An array (nocrom X nofstrata): The index of each stratum for each chromosome in the generation.

NN An array (nocrom X nofstrata): The cumulative sum of the number of elements in each stratum.

k An array (nocrom X nofstrata): The difference between the number of elements and the sample size in each stratum.

t An array (nocrom X nofstrata): The maximum of k.

p An array (nocrom X nofstrata): The index of the element where k is equal to t.
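Because cumTotal and sumSquares are cumulative sums, the mean and variance of any contiguous stratum can presumably be obtained from differences of these arrays rather than from a new pass over the data. The following is a minimal sketch of that idea, not the package's own code. The names x, stratumStats, lo and hi are hypothetical, and the data are assumed to be sorted in ascending order.

```r
# Hypothetical illustration: stratum statistics from cumulative sums.
x <- sort(c(12, 5, 7, 30, 18, 9, 41, 22))  # assumed sorted stratification variable
cumTotal   <- cumsum(x)                     # running total of the values
sumSquares <- cumsum(x^2)                   # running total of the squared values

# Size, mean and sample variance of elements lo..hi using only the two arrays.
stratumStats <- function(lo, hi, cumTotal, sumSquares) {
  Nh        <- hi - lo + 1
  prevTotal <- if (lo > 1) cumTotal[lo - 1] else 0
  prevSq    <- if (lo > 1) sumSquares[lo - 1] else 0
  total     <- cumTotal[hi] - prevTotal
  sq        <- sumSquares[hi] - prevSq
  m         <- total / Nh
  v         <- if (Nh > 1) (sq - Nh * m^2) / (Nh - 1) else 0
  c(N = Nh, mean = m, var = v)
}

res <- stratumStats(3, 6, cumTotal, sumSquares)
```

The result can be checked against mean(x[3:6]) and var(x[3:6]), which should agree up to floating-point rounding.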

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP4 GA4StratificationP4fit GA4StratificationP4fitt GA4StratificationP4m

GA4StratificationSelection randomnumGenerator

Examples

## The function is currently defined as
function(crossoverGeneration,bestGeneration,dataName,numberOfStrata,sampleSize,fitnessValueGeneration,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)
{
  # Keep the parents together with their fitness values
  fitnessValueParents=fitnessValueGeneration
  parents=cbind(crossoverGeneration,fitnessValueParents)
  crossoverGenerationp=crossoverGeneration
  rowCrossoverGenerationp=nrow(crossoverGenerationp)
  colCrossoverGenerationp=ncol(crossoverGenerationp)

  # Candidate crossover points: cumulative counts of the distinct data values
  tableData=as.data.frame(table(dataName))
  randomnumRange=cumsum(tableData[,2])
  lengthRandomnum=length(randomnumRange)

  # Draw three distinct row indices for every offspring
  randomNumbers=array(0,dim=c(rowCrossoverGenerationp,3))
  for (i in 1:rowCrossoverGenerationp)
  {
    randomNumbers[i,]=randomnumGenerator((1:rowCrossoverGenerationp),(rowCrossoverGenerationp+1),3)
  }

  mother=father=NULL
  for (i in 1:rowCrossoverGenerationp)
  {
    mother=randomNumbers[i,1]
    father=randomNumbers[i,2]

    # Redraw until both parents have the same row sum up to the crossover point
    crossoverPoint=sample((randomnumRange[1:lengthRandomnum-1]),1)
    while ( sum(crossoverGenerationp[mother,c(1:crossoverPoint)])!=sum(crossoverGenerationp[father,c(1:crossoverPoint)]) )
    {
      crossoverPoint=sample((randomnumRange[1:lengthRandomnum-1]),1)
    }

    # Head of the offspring comes from the mother, tail from the father
    crossoverGeneration[i,c(1:crossoverPoint)]=crossoverGenerationp[mother,c(1:crossoverPoint)]
    crossoverGeneration[i,c((crossoverPoint+1):colCrossoverGenerationp)]=crossoverGenerationp[father,c((crossoverPoint+1):colCrossoverGenerationp)]
  }

  # Evaluate the offspring, pool them with the parents, sort by the fitness
  # column (ascending) and keep the last half of the sorted rows
  s=GA4StratificationP4fit(crossoverGeneration,dataName,numberOfStrata,sampleSize,cumTotal,sumSquares,lengthData,dd,nocrom,fitp1,fit,N,means,s,n,vars,mas,NN,k,p,t)
  crossoverGenerationx=cbind(crossoverGeneration,s)
  GA4StratificationP4x=rbind(parents, crossoverGenerationx)
  GA4StratificationP4x=GA4StratificationP4x[order(GA4StratificationP4x[,(colCrossoverGenerationp+1)]),]
  GA4StratificationP4x=GA4StratificationP4x[c((rowCrossoverGenerationp+1):(rowCrossoverGenerationp*2)),c(1:colCrossoverGenerationp)]

  return(GA4StratificationP4x)
}
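The crossover step in the definition above can be illustrated on a toy pair of parents. This is a minimal sketch, not the package's implementation: it assumes that a chromosome is a 0/1 vector in which a 1 marks a stratum boundary, and it enumerates the admissible cut points directly instead of redrawing until one is found. The names mother, father, validPoints and child are hypothetical.

```r
# Hypothetical sketch of single-point crossover on 0/1 boundary chromosomes.
set.seed(1)
mother <- c(0, 1, 0, 0, 1, 0, 0, 1)
father <- c(1, 0, 0, 1, 0, 0, 0, 1)

# Admissible cut points: both parents have the same number of 1s up to the cut,
# so the child ends up with the same total number of 1s as its parents.
validPoints <- which(cumsum(mother) == cumsum(father))
validPoints <- validPoints[validPoints < length(mother)]

# Index into validPoints rather than calling sample() on the vector itself,
# because sample(x, 1) treats a single number x as the range 1:x.
crossoverPoint <- validPoints[sample.int(length(validPoints), 1)]

# Head from the mother, tail from the father.
child <- c(mother[1:crossoverPoint],
           father[(crossoverPoint + 1):length(father)])
```

Whichever admissible point is drawn, sum(child) equals sum(mother), which is the property the while loop in the definition enforces.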

GA4StratificationSelection

The selection operation in the genetic algorithm for the determination of stratum boundaries and sample sizes of each stratum in stratified sampling with Equal sample allocation scheme

Description

This is the selection operation in the Genetic Algorithm approach for the determination of the stratum boundaries and sample sizes in each stratum in stratified sampling with an Equal Sample Allocation Scheme.

Usage

GA4StratificationSelection(selectionGeneration, selectionGenerationFitness)

Arguments

selectionGeneration

The generation to which the selection operator will be applied.

selectionGenerationFitness

The fitness values of the generation to which the selection operator will be applied.

Note

This study is part of a project supported by the Scientific and Technological Research Council of Turkey (TUBITAK).

Author(s)

Sebnem Er, Timur Keskinturk, Charlie Daly

Maintainer: Sebnem Er <[email protected]>

References

http://ideas.repec.org/a/eee/csdana/v52y2007i1p53-67.html

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

See Also

GA4StratificationP1 GA4StratificationP1fit GA4StratificationP1fitt GA4StratificationP1m GA4StratificationP1x

randomnumGenerator

Examples

## The function is currently defined as
function(selectionGeneration,selectionGenerationFitness)
{
  rowSelectionGeneration=nrow(selectionGeneration)
  colSelectionGeneration=ncol(selectionGeneration)
  selectionStrata=array(0,dim=c(rowSelectionGeneration,(colSelectionGeneration+1)))
  newSelectionGeneration=cbind(selectionGeneration, -selectionGenerationFitness)
  sortedSelectionGeneration=newSelectionGeneration[order(newSelectionGeneration[,colSelectionGeneration+1]),]
  sortedSelectionGenerationFitness=selectionGenerationFitness[order(-selectionGenerationFitness),]
  wheelOld = sortedSelectionGenerationFitness / sum(selectionGenerationFitness)
  wheel=1:rowSelectionGeneration
  for(i in rowSelectionGeneration:1)
  {
    wheel[rowSelectionGeneration+1-i]=wheelOld[i]
  }
  wheel=cumsum(wheel)
  for ( i in 1:rowSelectionGeneration )
  {
    r = runif(1,0,1)
    for ( j in 1:rowSelectionGeneration )
    {
      if(r < wheel[j])
      {
        selectionStrata[i,] = sortedSelectionGeneration[j,]
        break;
      }
    }
  }
  randomGeneration=selectionStrata[,1:colSelectionGeneration]
  fitnessValueGeneration=selectionStrata[,(colSelectionGeneration+1)]
  return(randomGeneration)
}
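The wheel in the definition above is a cumulative sum of fitness shares that is searched with a uniform random number. The following is a minimal sketch of plain roulette-wheel selection under the assumption that a larger fitness value means a fitter chromosome. The function above additionally sorts the generation and reverses the order of the slices, which this sketch does not reproduce. The names generation, fitness, picked and newGeneration are hypothetical.

```r
# Hypothetical sketch of roulette-wheel selection (not the package's exact code).
set.seed(42)
generation <- matrix(c(0, 1, 0, 1,
                       1, 0, 0, 1,
                       0, 0, 1, 1), nrow = 3, byrow = TRUE)
fitness <- c(2, 5, 3)                      # assumed: larger means fitter

wheel  <- cumsum(fitness / sum(fitness))   # upper bounds of the slices, ending at 1
r      <- runif(nrow(generation))          # one spin per slot in the new generation
picked <- findInterval(r, wheel) + 1       # first slice whose upper bound exceeds r
picked <- pmin(picked, nrow(generation))   # guard against rounding at the top end

newGeneration <- generation[picked, , drop = FALSE]
```

Using findInterval replaces the inner for loop with a vectorised lookup; each row is still chosen with probability proportional to its share of the total fitness.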

iso2004 Net sales data of 487 Turkish manufacturing firms

Description

Net sales data of 487 Turkish manufacturing firms from the 500 largest corporations belonging to the Istanbul Chamber of Industry (ICI) in the year 2004.

Usage

data(iso2004)

Format

A data frame with 487 observations on the following variable.

V1 a numeric vector

Source

http://www.iso.org.tr/tr/web/statiksayfalar/index.aspx?ref=3

References

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

Examples

data(iso2004)

normal100_10 A randomly generated data set

Description

normal100_10 is a randomly generated data set from a normal distribution with mean 100 and standard deviation 10.

Usage

data(normal100_10)

Format

A data frame with 1000 observations on the following variable.

V1 a numeric vector

References

http://www.sciencedirect.com/science/article/B6V8V-4NHM520-1/2/a21e0295aa1616ff56da1ddf2c0ba1ac

Examples

data(normal100_10)

P75 Population data of Swedish municipalities

Description

P75 is the 1975 population (in thousands) of 284 Swedish municipalities obtained from the paper of Hedlin (2003).

Usage

data(P75)

Format

A data frame with 284 observations on the following variable.

V1 a numeric vector

Source

http://lib.stat.cmu.edu/datasets/mu284.txt

References

Hedlin, Dan (2003) Minimum variance stratification of a finite population. Southampton, UK, Southampton Statistical Sciences Research Institute, 30pp. (S3RI Methodology Working Papers, M03/07): http://eprints.soton.ac.uk/7796/

Examples

data(P75)

randomnumGenerator Random number generator

Description

This function generates random numbers that do not repeat, drawn from a given range.

Usage

randomnumGenerator(randomnumRange, lengthRandomnum, howManyRands)

Arguments

randomnumRange This is the range of the random numbers. randomnumRange can be given as, for example, randomnumRange=1:487 or randomnumRange=c(6, 12, 36, 67, 87, 146, 267). The length of randomnumRange is therefore 487 in the first case and 7 in the second, defining the index of the randomnumRange to be swapped.

lengthRandomnum

This is the length of the random number range

howManyRands Number of random values

Value

An array of different random numbers

Author(s)

Sebnem Er

References

Swap method of Charlie Daly

Examples

randomnumGenerator(c(1,3,5,7,9,15,21),7,3)

## The function is currently defined as
function(randomnumRange,lengthRandomnum,howManyRands)
{
  for (i in 1:(howManyRands))
  {
    # Swap position i with a position drawn from 1..(lengthRandomnum-1)
    integer=sample(lengthRandomnum-1,1)
    tmp=randomnumRange[integer]
    randomnumRange[integer]=randomnumRange[i]
    randomnumRange[i]=tmp
  }
  return(randomnumRange[1:howManyRands])
}
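The swap method above returns distinct values because it only ever exchanges positions within the vector. For comparison, the following is a minimal sketch of a partial Fisher-Yates shuffle, a related technique that draws the swap position from i..n so that every remaining value is equally likely at each step. It is not the package's code, and drawDistinct, values and k are hypothetical names. Base R's sample(values, k) gives the same kind of result directly.

```r
# Hypothetical sketch: draw k distinct values by a partial Fisher-Yates shuffle.
drawDistinct <- function(values, k) {
  n <- length(values)
  stopifnot(k <= n)
  for (i in seq_len(k)) {
    # Pick a position from i..n so every remaining value is equally likely.
    j <- i + sample.int(n - i + 1, 1) - 1
    tmp <- values[j]
    values[j] <- values[i]
    values[i] <- tmp
  }
  values[seq_len(k)]
}

set.seed(7)
picked <- drawDistinct(c(1, 3, 5, 7, 9, 15, 21), 3)
```

The call mirrors the documented example: three values are returned, all taken from the supplied range and none repeated.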
