lecture 4 research design and sampling

Lecture 4Research Design and Sampling

Research Design

Research design is a master plan specifying the methods and procedures for collecting and analyzing the needed information

Research Designs in Terms of the Controlling MethodExperimental DesignResearchers plan to measure the response variable depending (dependent variable) on the explanatory variable (independent variable)---Randomization

Quasi-experimental DesignA mixed design where random and non-random experiments are employed together---Lack of randomization

Observational DesignProspective or Retrospective

Research Design

Research Designs in Terms of Time SequencesProspective DesignResearcher follows the participants and measures or observes the behavior of the participants: clinical trials or cohort design

Retrospective DesignResearcher gathers data at once and classifies the participants simultaneously into the group categories: case-control studies vs. cross-sectional studies

Research Design

Research Designs in Terms of Sampling MethodsClinical TrialRandomly allocates participants to the various groups of interest and measures differences in the futureCohort StudyParticipants have a right to choose the group they want to join. The researcher measures differences between groups without randomizationCase-control StudyResearcher gathers the data at once and then looks into the past of the participants to classify themCross-sectional StudyResearcher gathers the data at once like case-control studies and then classifies them simultaneously on the classification (more than two categories) and their current responses.

Sources of Data

Two sources of information normally used for research purposesPrimary source of dataObtain from a survey or experimentSecondary source of data

Already available or have been collected for other purposes Types of data

Qualitative and quantitative Measurement of data

Nominal Scale: The numbers or letters assigned to objects serve as labels for identification or classificationOrdinal Scale: arranges objects or categorical variables according to an ordered

relationship---When a nominal scale follows an order Interval Scale: not only indicate order, they also measure the order or distance in

units of equal intervals---When a ordinal scale measure the differences BW eventsRatio Scale: have absolute rather than relative quantities (e.g., age, money)---When

an interval scale has an absolute zero

Method of Data Collection

Two phases in data collectionPre-test and main studyPre-test checks the data collection form to minimize errors due to improper design elements using a small-scale data

The unit of analysisOrganizations, departments, work groups, individuals, or objects

Sampling techniques Sampling is the process or technique of selecting a suitable sample for the purpose

of determining parameters or characteristics of the whole population. Decisions to be made:

Size of the sampleWhat method to sampleThe cost for samplingHow ‘representative’ is the sample

Two basic sampling techniquesProbability sample: every element of the population has an equal chance of

being selected.Non-probability sampling: sample units are selected on the basis of personal

judgment Sampling frame: the list of elements from which the sample may be drawn

sampling unit is a single element or group of elements subject to selection

Representative Sampling Plans

Simple Random Sample---applied to homogeneous populationsSelections are made from a specified and defined population (i.e., the frame is

known).Every unit in the population has an equal (known) chance of selection.The method of selection is specified, objective and replicable.

Stratified Random Sampling---applied to heterogeneous populationsSamples are drawn equally or proportionately from each stratum (i.e., each

homogeneous subpopulation). Systematic (Quasi-random) Sampling

Range the population from which selections are to be made in a list or series, choose a random staring point and then count through the list selecting every n-th unit


Cluster (Multistage) SamplingHave a number of clusters which are characterized by heterogeneity in

between and homogeneity within. Sequential (Multiphase) Sampling

A sampling scheme where the researcher is allowed to draw sample on more than one occasions.

Non-probability Sampling Methods The probability of selecting population elements is unknown

Convenience SamplingUnrestricted non-probability sampling---the cheapest and easiest


Purposive Sampling: A non-probability sample that conforms to certain criteriaJudgment sampling: A cross-section of the sample selected by the researcher

conforms to some criteria.Quota Sampling: control characteristics such as gender or social status to

draw a representative sample of the population. Snowball (Network or Chain) Sampling

A small number of the samples initially selected by the researcher are then asked to nominate a group who would be prepared to be interviewed for the research; these in turn nominate others, and so forth.

Summary: Random samplingprovide protection against selection biasenables the precision of estimates to be estimatedenables the scope for non-response bias to be addressed

Sample Size Determination

Various questions may arise:How big should the sample be? How small can we allow it to be? Whether the sample size is adequate in relation to the goals of the study?

Time and cost One of the most popular approaches

The power of a test of hypothesis

Test of Significance for Population Mean

The formula for determine sample size in the case of testing hypothesis ofpopulation means is:

Test of Significance for Population Proportion

The formula for determine sample size in the case of testing hypothesis ofpopulation proportion (e.g., percentage of voters, or prevalence rates of a disease) is:

Key Statistical Concepts

PopulationCan be enumeratedPopulation parameters: population mean (µ) and population standard

deviation (σ) Population TotalsProportions of the Population Having an Attribute

Sub-populationSample statistics: sample mean and sample varianceRatios for sub-populationStatistical estimates: correlation coefficient, multiple regressions coefficients,

and factor scores.

Some Problems With Random Sample Surveys

Non-response Bias: distort selection probabilities Limited Sample Size and the ‘Inverse Square Law’: useful even if not very precise Sampling Distribution, Sampling Bias and Sampling Variance

Sampling distribution: the complete set of estimates that could be obtained from all the samples that could be selected under the sample designSampling bias: a measure of the location of the sampling distribution, relative

to the population value.Sampling variance: the variance of sampling distribution around the true

population values. Sampling error: the tendency for estimates to differ from the population value

Sampling error=sampling bias + sampling variance Confidence Intervals

In the form of n standard errors (commonly n = 2)

The Normal Distribution

The normal distribution is a statistical distribution that has a particular shape and known properties.

• Symmetrical• A known and fixed relationship between the standard deviation of the

distribution (the standard error) and the percentiles of the distribution

lecture 4 research design and sampling

Documents