research methodology - isp-infinite possibilities · 2016. 8. 17. · lecture 12 sampling &...

LECTURE 12 SAMPLING & PROBABILITY DISTRIBUTIONS Mazhar Hussain Dept of Computer Science ISP,Multan [email protected] 1 RESEARCH METHODOLOGY


  • LECTURE 12

    SAMPLING & PROBABILITY

    DISTRIBUTIONS

    Mazhar Hussain

    Dept of Computer Science

    ISP,Multan

    [email protected]

    1

    RESEARCH

    METHODOLOGY

  • ROAD MAP

    Introduction

    Choosing your research problem

    Choosing your research advisor

    Literature Review

    Plagiarism

    Variables in Research

    Construction of Hypothesis

    Research Design

    Writing Research Proposal

    Writing your Thesis

    Data Collection

    Data Representation

    Sampling and Distributions

    Paper Writing

    Ethics of Research

    2

  • SAMPLING

    How to find average student age in the

    university?

    Ask each student and compute the average

    Randomly select 3 to 4 students from each discipline

    and find their average age – an estimate of the

    average age of students in the university

    3

  • SAMPLING

    Why sampling?

    Efforts and resources required to carry out the study

    on the population

    Examples

    Average income of families living in a city

    Results of an election

    Opinion about a problem

    4

  • SAMPLING

    5

    Sampling is the process of selecting a few (a

    sample) from a bigger group (the sampling

    population) to become the basis for estimating

    or predicting the prevalence of an unknown

    piece of information, situation or outcome

    regarding the bigger group

  • RECAP – MEAN & STANDARD DEVIATION

    6

    Mean/Average

    Standard Deviation

    On the average, how far the data values are from the

    mean
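The two recap quantities can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the four ages are the ones used later in this lecture for the sampling principles. The population form divides by N, while the sample form divides by n - 1.

```python
import math

def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)

def population_std(values):
    """Standard deviation when the values are the whole population (divide by N)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def sample_std(values):
    """Standard deviation when the values are a sample (divide by n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

ages = [18, 20, 23, 25]  # illustrative data: ages of four individuals
print(mean(ages), population_std(ages), sample_std(ages))
```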

  • POPULATION VS SAMPLE

    7

  • 8

    Karl Friedrich Gauss 1777-1855

    Gaussian

    Distribution

  • GAUSSIAN/NORMAL PROBABILITY

    DISTRIBUTION

    9

    Many naturally occurring processes can be

    modeled by a bell-shaped curve

  • GAUSSIAN/NORMAL PROBABILITY

    DISTRIBUTION

    The Gaussian probability distribution is perhaps

    the most used distribution in all of science.

    Sometimes it is called the "bell-shaped curve" or

    normal distribution.

    10

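    $$p(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

    $\mu$ = mean of distribution

    $\sigma$ = standard deviation of distribution

    $x$ is a continuous variable ($-\infty < x < \infty$)

    Notation: $N(\mu, \sigma^2)$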

  • GAUSSIAN/NORMAL PROBABILITY

    DISTRIBUTION

    11

    The area within +/- σ is ≈ 68%

    The area within +/- 2σ is ≈ 95%

    The area within +/- 3σ is ≈ 99.7%
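These areas can be checked numerically. The sketch below relies on the standard identity that, for a normal variable, the probability of falling within k standard deviations of the mean equals erf(k/√2); the function name `area_within` is a hypothetical helper, not something from the lecture.

```python
import math

def area_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within +/- {k} sigma: {area_within(k):.4f}")
```

The three values come out close to 68%, 95% and 99.7% for k = 1, 2 and 3 respectively.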

  • GAUSSIAN/NORMAL PROBABILITY

    DISTRIBUTION

    Probability (P) of x being in the range [a, b] is

    given by an integral:

    12

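    $$P(a \le x \le b) = \int_a^b p(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \int_a^b e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$$

    [Figure: Gaussian pdf with $\mu = 0$ and $\sigma = 1$; 95% of the area lies within $2\sigma$ of the mean and only 5% lies outside $2\sigma$.]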
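The integral has no closed form in elementary functions, so in practice it is evaluated numerically or read from tables. A minimal sketch, assuming a plain trapezoidal rule is accurate enough for illustration (the function names and the step count are choices made here, not part of the lecture):

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian density p(x) with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def probability_between(a, b, mu=0.0, sigma=1.0, steps=10_000):
    """Approximate P(a <= x <= b) by integrating the pdf with the trapezoidal rule."""
    width = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * width, mu, sigma)
    return total * width

# Area within two standard deviations of the mean for mu = 0, sigma = 1
print(round(probability_between(-2.0, 2.0), 4))
```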

  • GAUSSIAN/NORMAL PROBABILITY

    DISTRIBUTION

    13

    Standard Normal Distribution

    http://en.wikipedia.org/wiki/Image:Normal_Distribution_PDF.svg

  • STANDARD NORMAL DISTRIBUTION

    Normal distribution with mean of zero and

    standard deviation of one

    Since the mean and standard deviation define any

    normal distribution, the standard normal

    distribution can be used for any normally

    distributed variable by converting its mean to zero

    and its standard deviation to one (z-scores)

    14

  • Z SCORES

    By itself, a raw score or X value provides very little

    information about how that particular score

    compares with other values in the distribution.

    A score of X = 53, for example, may be a relatively

    low score, or an average score, or an extremely

    high score depending on the mean and standard

    deviation for the distribution from which the score

    was obtained.

    If the raw score is transformed into a z-score,

    however, the value of the z-score tells exactly

    where the score is located relative to all the other

    scores in the distribution.

  • Z SCORES

    The process of changing an X value into a z-score

    involves creating a signed number, called a z-

    score, such that

    The sign of the z-score (+ or –) identifies whether the

    X value is located above the mean (positive) or below

    the mean (negative).

    The numerical value of the z-score corresponds to the

    number of standard deviations between X and the

    mean of the distribution.

    Thus, a score that is located two standard deviations

    above the mean will have a z-score of +2.00

    16

  • Z SCORES

    In addition to knowing the basic definition of a z-

    score and the formula for a z-score, it is useful to

    be able to visualize z-scores as locations in a

    distribution.

    Remember, z = 0 is in the center (at the mean),

    and the extreme tails correspond to z-scores of

    approximately –2.00 on the left and +2.00 on the

    right.

    Although more extreme z-score values are

    possible, most of the distribution is contained

    between z = –2.00 and z = +2.00.

  • Z SCORES

    z-score for a sample value in a data set is obtained by

    subtracting the mean of the data set from the value

    and dividing the result by the standard deviation of

    the data set.

    NOTE: When computing the value of the z-score,

    the data values can be population values or sample

    values. Hence we can compute either a population z-

    score or a sample z-score.

    18

  • Z SCORES

    The Sample z-score for a value x is given by the

    following formula:

    $$z = \frac{x - \bar{x}}{s}$$

    where $\bar{x}$ is the sample mean and $s$ is the sample

    standard deviation.

  • Z SCORES

    The Population z-score for a value x is given by

    the following formula:

    $$z = \frac{x - \mu}{\sigma}$$

    where $\mu$ is the population mean and $\sigma$ is the

    population standard deviation.

  • EXAMPLE

    Example: What is the z-score for the value of 14

    in the following sample values?

    3 8 6 14 4 12 7 10

    21

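    Sample mean $\bar{x} = 64/8 = 8$; sample standard deviation $s = \sqrt{102/7} \approx 3.82$; so $z = (14 - 8)/3.82 \approx 1.57$.

    Thus, the data value of 14 is 1.57 standard deviations above the mean of 8, since the z-score is positive.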
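The same calculation can be reproduced with Python's standard `statistics` module. This is a sketch only; the two function names are hypothetical. `statistics.stdev` uses the n - 1 (sample) form and `statistics.pstdev` the N (population) form.

```python
import statistics

values = [3, 8, 6, 14, 4, 12, 7, 10]

def sample_z_score(x, data):
    """z-score using the sample mean and the sample standard deviation (n - 1)."""
    return (x - statistics.mean(data)) / statistics.stdev(data)

def population_z_score(x, data):
    """z-score treating the data as the whole population (divide by N)."""
    return (x - statistics.mean(data)) / statistics.pstdev(data)

z = sample_z_score(14, values)
print(round(z, 2))
```

Treating the eight values as a sample gives the 1.57 quoted in the example; treating them as a population would give a slightly larger z, because the population standard deviation is smaller.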

  • EXAMPLE

    Dot Plot of the data points with the location of

    the mean and the data value of 14.

    22

  • Z SCORE & PROBABILITY

    What is the probability of finding a value

    between 100 and 110?

    23

    How to calculate

    this area using z

    scores?

  • Z SCORE CHART

    24

    Reading the area under the curve for z = 1.55 gives 0.9394

  • Z SCORE & PROBABILITY

    25

    Probability of z > 1.55 (area in the right tail)

    [Figure: standard normal curve with the tail beyond z = 1.55 shaded; the area to the left of z = 1.55 is 0.9394.]

    P = 1 - 0.9394 = 0.0606

  • Z SCORE & PROBABILITY

    26

    Probability of z > 1.55 or z < -1.55 (area in both tails)

    [Figure: standard normal curve with both tails beyond z = -1.55 and z = +1.55 shaded.]

    P = 0.0606 + 0.0606 = 0.1212

  • Z SCORE & PROBABILITY

    27

    Probability of z > 0 and z < 1.55 (area between the mean and z = 1.55)

    [Figure: standard normal curve with the area between z = 0 and z = 1.55 shaded.]

    P = 0.5 - 0.0606 = 0.4394
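The three areas read from the z chart for z = 1.55 can also be computed directly. The sketch below assumes the usual expression of the standard normal cumulative area through the error function; `standard_normal_cdf` is a hypothetical helper name.

```python
import math

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z = 1.55
area_left = standard_normal_cdf(z)   # what the z chart reports for z = 1.55
upper_tail = 1.0 - area_left         # P(z > 1.55)
both_tails = 2.0 * upper_tail        # P(z > 1.55 or z < -1.55), by symmetry
mean_to_z = area_left - 0.5          # P(0 < z < 1.55)

print(f"{area_left:.4f} {upper_tail:.4f} {both_tails:.4f} {mean_to_z:.4f}")
```

Small differences in the fourth decimal place against the chart values come from the chart's rounding.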

  • EXAMPLE: 50 MEASURES OF POLLUTION

    28

    Value    Frequency
    20       0
    25       3
    30       5
    35       6
    40       8
    45       13
    50       5
    55       6
    60       3
    65       1
    More     0

    [Figure: histogram of Frequency against Value for the 50 measurements.]

    Mean = 40.68, standard deviation = 9.88

  • EXAMPLE: 50 MEASURES OF POLLUTION

    Probability value > 45

    29

    $$z = \frac{45 - 40.68}{9.88} = 0.4372$$

    [Figure: standard normal curve with the area to the right of z = 0.4372 shaded.]

    P = 0.3300

  • EXAMPLE: 50 MEASURES OF POLLUTION

    Probability from 35 to 45

    30

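    $$z = \frac{35 - 40.68}{9.88} = -0.5749$$

    $$z = \frac{45 - 40.68}{9.88} = 0.4372$$

    [Figure: standard normal curve with the area between z = -0.5749 and z = 0.4372 shaded.]

    Area between z = 0 and z = 0.4372: P = 0.5 - 0.3300 = 0.1700

    Area between z = -0.5749 and z = 0: P = 0.5 - 0.2843 = 0.2157

    P = 0.2157 + 0.1700 = 0.3857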
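As a cross-check, the pollution example can be reworked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This sketch assumes, as the example itself does, that the 50 measurements are approximately normal with mean 40.68 and standard deviation 9.88; the variable names are illustrative.

```python
from statistics import NormalDist

# Normal model with the mean and standard deviation used in the z calculations
pollution = NormalDist(mu=40.68, sigma=9.88)

z_45 = (45 - pollution.mean) / pollution.stdev   # z-score of the value 45
z_35 = (35 - pollution.mean) / pollution.stdev   # z-score of the value 35

p_above_45 = 1.0 - pollution.cdf(45)                 # P(value > 45)
p_35_to_45 = pollution.cdf(45) - pollution.cdf(35)   # P(35 < value < 45)

print(f"z(45) = {z_45:.4f}, z(35) = {z_35:.4f}")
print(f"P(>45) = {p_above_45:.4f}, P(35..45) = {p_35_to_45:.4f}")
```

The probabilities agree with the chart-based figures to about two or three decimal places; the remaining gap arises because a z chart is read at z rounded to two decimals.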

  • 31

    Sampling

  • SAMPLING

    Pros

    Saves time

    Resources – financial, human

    Cons

    Not exact value for the population

    An estimate or prediction

    Compromise on accuracy of findings

    32

  • SAMPLING – TERMINOLOGY

    Examples

    Average student age in the university

    Average income of families living in a city

    Results of an election

    Population or study population (N)

    The university students, families living in the city,

    electors

    Sample

    The small group of students, families or electors from whom you

    collect the required information

  • SAMPLING – TERMINOLOGY

    Sample size (n)

    The number of entities in your sample

    Sampling design or strategy

    The way you select the students, families or electors

    Sampling unit or sampling element

    Each student, family or elector in your study

    Sample statistics

    Your findings based on information obtained from

    your sample

  • SAMPLING – TERMINOLOGY

    Population Parameters

    Aim of research – find answers to research question

    for study population not the sample

    Use sample statistics to estimate answers to research

    questions in study population

    Estimates arrived at from sample statistics –

    population parameters

    Saturation Point

    When no new information is coming from your

    respondents

    35

  • SAMPLING – TERMINOLOGY

    Sampling Frame

    A list identifying each student, family or elector in

    the study population

    36

  • PRINCIPLES OF SAMPLING

    Example – Four individuals A,B,C, D

    A = 18 years

    B = 20 years

    C = 23 years

    D = 25 years

    Average age

    (18+20+23+25) / 4 = 21.5 years

    Use a sample of two individuals to estimate the

    average age of your study population (4

    individuals)

  • PRINCIPLES OF SAMPLING

    How many possible combinations of two

    individuals?

    38

    A and B

    A and C

    A and D

    B and C

    B and D

    C and D

  • PRINCIPLES OF SAMPLING

    A+B = 18+20 = 38/2 = 19.0 years

    A+C = 18+23 = 41/2 = 20.5 years

    A+D = 18+25 = 43/2 = 21.5 years

    B+C = 20+23 = 43/2 = 21.5 years

    B+D = 20+25 = 45/2 = 22.5 years

    C+D = 23+25 = 48/2 = 24.0 years

    In two cases – no difference between sample

    statistics and population parameters

    Difference – Sampling error

  • PRINCIPLES OF SAMPLING

    Sample    Sample Statistics    Population Parameters    Difference
    1         19.0                 21.5                     -2.5
    2         20.5                 21.5                     -1.0
    3         21.5                 21.5                      0.0
    4         21.5                 21.5                      0.0
    5         22.5                 21.5                     +1.0
    6         24.0                 21.5                     +2.5
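The six samples of two and their sampling errors can be enumerated programmatically. A small sketch using `itertools.combinations`; the function name `sampling_errors` is hypothetical, and the sampling error is taken as sample mean minus population mean.

```python
from itertools import combinations

ages = {"A": 18, "B": 20, "C": 23, "D": 25}

def sampling_errors(ages, sample_size):
    """Map each possible sample to (sample mean - population mean)."""
    population_mean = sum(ages.values()) / len(ages)
    errors = {}
    for members in combinations(sorted(ages), sample_size):
        sample_mean = sum(ages[m] for m in members) / sample_size
        errors["+".join(members)] = sample_mean - population_mean
    return errors

for sample, error in sampling_errors(ages, 2).items():
    print(f"{sample}: sampling error {error:+.1f} years")
```

Two of the six samples (A+D and B+C) reproduce the population mean exactly; the others miss it by between 1.0 and 2.5 years in either direction.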

    40

  • PRINCIPLES OF SAMPLING

    Principle I

    41

    In the majority of cases of sampling, there will

    be a difference between sample statistics

    and the true population parameters, which

    is attributable to the selection of the units

    in the sample

  • PRINCIPLES OF SAMPLING

    Instead of samples of two – take a sample of

    three

    Four possible combinations

    A+B+C = 18+20+23 = 61/3 = 20.33 years

    A+B+D = 18+20+25 = 63/3 = 21.00 years

    A+C+D = 18+23+25 = 66/3 = 22.00 years

    B+C+D = 20+23+25 = 68/3 = 22.67 years

    42

  • PRINCIPLES OF SAMPLING

    43

    Sample    Sample Statistics    Population Parameters    Difference
    1         20.33                21.5                     -1.17
    2         21.00                21.5                     -0.50
    3         22.00                21.5                     +0.50
    4         22.67                21.5                     +1.17

  • PRINCIPLES OF SAMPLING

    44

    Range of difference, samples of two: -2.5 to +2.5

    Range of difference, samples of three: -1.17 to +1.17

  • PRINCIPLES OF SAMPLING

    The gap between sample statistics and population parameters is reduced

    Principle II

    45

    The greater the sample size, the more

    accurate will be the estimate of the true

    population parameters

  • PRINCIPLES OF SAMPLING

    Same Example – Different Data

    A =18 years

    B = 26 years

    C = 32 years

    D = 40 years

    Variable (age) – markedly different

    46

  • PRINCIPLES OF SAMPLING

    Estimate average using

    Samples of two

    Samples of three

    Difference in the average age:

    Sample size of 2: -7.00 to +7.00 years

    Sample size of 3: -3.67 to +3.67 years

    Range of difference is greater than previously

    calculated

    47

  • PRINCIPLES OF SAMPLING

    Principle III

    48

    The greater the difference in the variable

    under study in a population for a given

    sample size, the greater will be the

    difference between the sample statistics

    and the true population parameters
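Principles II and III can be illustrated together by computing the range of sampling error for both four-person populations and both sample sizes. This is a sketch; `error_range` is a hypothetical helper, and the two age lists are the ones given in the lecture's examples.

```python
from itertools import combinations

def error_range(ages, sample_size):
    """Smallest and largest (sample mean - population mean) over all possible samples."""
    population_mean = sum(ages) / len(ages)
    errors = [sum(sample) / sample_size - population_mean
              for sample in combinations(ages, sample_size)]
    return min(errors), max(errors)

similar_ages = [18, 20, 23, 25]   # first example: ages close together
varied_ages = [18, 26, 32, 40]    # second example: ages markedly different

for label, ages in (("similar", similar_ages), ("varied", varied_ages)):
    for size in (2, 3):
        low, high = error_range(ages, size)
        print(f"{label} ages, samples of {size}: {low:+.2f} to {high:+.2f}")
```

For either population the range narrows when the sample grows from two to three (Principle II), and for either sample size the range is wider for the more varied population (Principle III).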

  • FACTORS AFFECTING THE INFERENCE

    Principles suggest that two factors may influence

    the degree of certainty about the inferences

    drawn from a sample

    Size of sample

    The larger the sample size, the more accurate will be the

    findings

    The extent of variation in the sampling population

    The greater the variation in the study population w.r.t. the

    characteristics under study, the greater will be the

    uncertainty for a given sample size

    49

  • AIMS IN SELECTING A SAMPLE

    Achieve maximum precision in your estimate

    Avoid bias in selection

    Bias can occur if: Non-random sampling – consciously or unconsciously affected

    by human choice

    Sampling frame does not cover the sampling population

    accurately or completely

    A section of sampling population is impossible to find or

    refuses to cooperate

    50

  • SAMPLING METHODS

    Probability Sampling

    Used to generate random/non-biased samples

    required for conducting inferential analyses

    Non-probability Sampling

    Used mostly in qualitative analysis

    Mixed Sampling

    51

  • SAMPLING METHODS

    Probability Sampling

    Non-probability Sampling

    Mixed Sampling

    52

  • PROBABILITY SAMPLING

    Each element in the population has an equal and independent chance of selection in the sample

    Equal:

    Probability of selection of each element is the same

    Choice is not affected by other considerations – human preferences

    Independent:

    Choice of one element is not dependent upon the choice of another element

    Selection or rejection of one element does not affect the inclusion or exclusion of another

  • PROBABILITY SAMPLING

    Example – Equal Chance

    80 students in the class

    20 refuse to participate in your study

    Each of 80 students (population) does not have an

    equal chance of selection

    Sample is not representative of your class

    54

  • PROBABILITY SAMPLING

    Example – Independence

    Three close friends in the class

    One is selected – Two are not

    The selected student refuses to participate without the friends

    You are forced to choose all three or none

    Not independent sampling

    55

    Inferences drawn from random samples

    can be generalized to the total sampling

    population

  • PROBABILITY SAMPLING

    Simple Random Sampling

    Fishbowl draw

    Computer program

    Table of random numbers

    Stratified Sampling

    Proportional

    Disproportional

    Cluster Sampling

    56

  • SIMPLE RANDOM SAMPLING

    The Fishbowl Draw

    Small population

    Number each element on a separate slip of paper

    Put them in a box

    Pick out one by one until you get desired sample size

    Similar to lotteries

    Computer Program

    Write a program to select a random sample
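A minimal sketch of such a program, assuming the sampling frame is simply a list of student numbers 1 to 80 (the frame, the sample size of 20 and the seed are illustrative choices). `random.sample` draws distinct elements, each subset being equally likely.

```python
import random

def simple_random_sample(frame, sample_size, seed=None):
    """Draw sample_size distinct elements from the sampling frame at random."""
    rng = random.Random(seed)   # a seed makes the draw reproducible for checking
    return rng.sample(frame, sample_size)

students = list(range(1, 81))   # hypothetical frame: students numbered 1..80
sample = simple_random_sample(students, 20, seed=12)
print(sorted(sample))
```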

    57

  • SIMPLE RANDOM SAMPLING

    Table of random numbers

    58

    Random

    Number Table

  • 59

    How to Use Random Number Tables

    ________________________________________________

    1. Assign a unique number to each population element in the

    sampling frame. Start with serial number 1, or 01, or 001,

    etc. upwards depending on the number of digits required.

    2. Choose a random starting position.

    3. Select serial numbers systematically across rows or down

    columns.

    4. Discard numbers that are not assigned to any population

    element and ignore numbers that have already been

    selected.

    5. Repeat the selection process until the required number of

    sample elements is selected.

  • 60

    How to Use a Table of Random Numbers to Select a Sample

    Your supervisor wants to randomly select 20 students from your class of 100

    students. Here is how he can do it using a random number table.

    Step 1: Assign all the 100 members of the population a unique number. You may

    identify each element by assigning a two-digit number. Assign 01 to the first name

    on the list, and 00 to the last name. If this is done, then the task of selecting the

    sample will be easier as you would be able to use a 2-digit random number table.

    NAME            NUMBER    NAME                NUMBER
    Adam, Tan       01        Tan Teck Wah        42
    …               …         …                   …
    Carrol, Chan    08        Tay Thiam Soon      61
    …               …         …                   …
    Jerry Lewis     18        Teo Tai Meng        87
    …               …         …                   …
    Lim Chin Nam    26        …                   …
    …               …         Yeo Teck Lan        99
    Singh, Arun     30        Zailani bt Samat    00

  • 61

    Step 2: Select any starting point in the Random Number Table and find the first number that

    corresponds to a number on the list of your population. In the example below, # 08 has been

    chosen as the starting point and the first student chosen is Carol Chan.

    10 09 73 25 33 76

    37 54 20 48 05 64

    08 42 26 89 53 19

    90 01 90 25 29 09

    12 80 79 99 70 80

    66 06 57 47 17 34

    31 06 01 08 05 45

    Step 3: Move to the next number, 42 and select the person corresponding to that number into

    the sample. #42 – Tan Teck Wah

    Step 4: Continue to the next number that qualifies and select that person into the sample.

    #26 – Lim Chin Nam, followed by #89, #53 and #19

    Step 5: After you have selected the student # 19, go to the next line and choose #90. Continue

    in the same manner until the full sample is selected. If you encounter a number selected

    earlier (e.g., 90, 06 in this example) simply skip over it and choose the next number.

    Starting point: move right to the end of the row, then down to the next row; move left to the end, then down to the next row, and so on.
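The table walk in Steps 2 to 5 can be sketched in code. The rows below are the random number table shown above; the reading order is an assumption (left to right along each row, then on to the next row, as Steps 3 to 5 describe), and `select_from_table` is a hypothetical helper. Labels stay as two-digit strings so that "00" can stand for the 100th student.

```python
RANDOM_TABLE = [
    "10 09 73 25 33 76",
    "37 54 20 48 05 64",
    "08 42 26 89 53 19",
    "90 01 90 25 29 09",
    "12 80 79 99 70 80",
    "66 06 57 47 17 34",
    "31 06 01 08 05 45",
]

def select_from_table(rows, start_row, start_col, sample_size, valid_labels):
    """Read the table from the starting cell, skipping invalid and repeated labels."""
    cells = [cell for row in rows for cell in row.split()]
    columns = len(rows[0].split())
    chosen = []
    for label in cells[start_row * columns + start_col:]:
        if label in valid_labels and label not in chosen:
            chosen.append(label)
            if len(chosen) == sample_size:
                break
    return chosen   # may be shorter than sample_size if the table runs out

valid = {f"{i:02d}" for i in range(100)}   # labels 00..99 for a class of 100
sample = select_from_table(RANDOM_TABLE, 2, 0, 20, valid)   # start at "08"
print(sample)
```

Starting at 08, the walk picks 08, 42, 26, 89, 53 and 19 from the third row, then continues on the following rows, passing over the second 90 and the second 80 because they were already selected.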

  • TABLE OF RANDOM NUMBERS

    Suppose you are using a table like this:

    62

  • TABLE OF RANDOM NUMBERS

    Sampling population – 256 individuals

    Numbered from 1 to 256

    You choose to select 10%, i.e. about 25 individuals

    Randomly select any starting point

    Pick the last three digits of each number

    Select the valid ones (001-256) and skip the

    invalid numbers (257-999)
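The last-three-digits rule can be sketched as follows. The five-digit entries are generated with a seeded random number generator purely as a stand-in for a printed table, so they are an assumption of this example, as is the helper name. Besides 257-999, the value 000 is also skipped because no individual carries that number.

```python
import random

def last_three_digit_sample(table_numbers, population_size, sample_size):
    """Keep the last three digits of each entry; accept valid, unused numbers only."""
    chosen = []
    for number in table_numbers:
        candidate = number % 1000              # last three digits
        if 1 <= candidate <= population_size and candidate not in chosen:
            chosen.append(candidate)
            if len(chosen) == sample_size:
                break
    return chosen

rng = random.Random(2016)                                      # stand-in for a printed table
table_numbers = [rng.randrange(100_000) for _ in range(2000)]  # hypothetical 5-digit entries
sample = last_three_digit_sample(table_numbers, population_size=256, sample_size=25)
print(sorted(sample))
```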

    63

  • DRAWING A RANDOM SAMPLE

    Two ways of selecting a random sample

    Sampling without replacement

    Sampling with replacement

    Example

    20 students to be selected out of 80

    First student is selected – Probability 1/80

    For second student – 79 left, Probability 1/79

    By the time you select the 20th – Probability 1/61

    64

  • DRAWING A RANDOM SAMPLE

    Sampling without replacement

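    Strictly contrary to randomization: the probability of selection changes with

    every draw, whereas each element should have an equal probability of selection

    Sampling with replacement

    The selected element is replaced in the population, so every draw has the same probability

    If it is selected again, it is discarded and the next draw is made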
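The difference between the two schemes can be made concrete. The first function below lists the per-draw selection probabilities without replacement for the 20-out-of-80 example; the second sketches sampling with replacement in which repeated picks are discarded. Both function names are hypothetical and the seed is arbitrary.

```python
import random
from fractions import Fraction

def draw_probabilities_without_replacement(population_size, sample_size):
    """Chance that a given remaining element is picked at each successive draw."""
    return [Fraction(1, population_size - i) for i in range(sample_size)]

def sample_with_replacement_discarding_repeats(population, sample_size, rng):
    """Every draw is made from the full population; repeated picks are thrown away."""
    chosen = []
    while len(chosen) < sample_size:
        pick = rng.choice(population)   # probability 1/len(population) on every draw
        if pick not in chosen:
            chosen.append(pick)
    return chosen

probabilities = draw_probabilities_without_replacement(80, 20)
print(probabilities[0], probabilities[1], probabilities[-1])

students = list(range(1, 81))
sample = sample_with_replacement_discarding_repeats(students, 20, random.Random(3))
```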

    65

  • PROBABILITY SAMPLING

    Simple Random Sampling

    Fishbowl draw

    Computer program

    Table of random numbers

    Stratified Sampling

    Proportional

    Disproportional

    Cluster Sampling

    66

  • STRATIFIED SAMPLING

    Step 1- Divide the population into homogeneous,

    mutually exclusive and collectively exhaustive

    subgroups or strata using some stratification variable.

    Step 2- Select an independent simple random sample

    from each stratum.

    Step 3- Form the final sample by consolidating all

    sample elements chosen in step 2.

    67

  • STRATIFIED SAMPLING

    Example

    Stratify on the basis of gender

    Two groups – male and female

    Select random samples from each group

    68

  • STRATIFIED SAMPLING

    Stratified samples can be:

    Proportionate: involving the selection of sample

    elements from each stratum, such that the ratio of sample

    elements from each stratum to the sample size equals that

    of the population elements within each stratum to the

    total number of population elements.

    Disproportionate: the sample is disproportionate when

    the above mentioned ratio is unequal.

    69

  • 70

    To select a stratified sample of 20 members of the Island Video Club which has 100 members

    belonging to three language based groups of viewers i.e., English (E), Mandarin (M) and Others

    (X).

    Step 1: Identify each member from the membership list by his or her respective language group

    00 (E ) 20 (M) 40 (E ) 60 ( X ) 80 (M)

    01 (E ) 21 ( X ) 41 ( X ) 61 (M) 81 (E )

    02 ( X ) 22 (E ) 42 ( X ) 62 (M) 82 (E )

    03 (E ) 23 ( X ) 43 (E ) 63 (E ) 83 (M)

    04 (E ) 24 (E ) 44 (M) 64 (E ) 84 ( X )

    05 (E ) 25 (M) 45 (E ) 65 ( X ) 85 (E )

    06 (M) 26 (E ) 46 ( X ) 66 (M) 86 (E )

    07 (M) 27 (M) 47 (M) 67 (E ) 87 (M)

    08 (E ) 28 ( X ) 48 (E ) 68 (M) 88 ( X )

    09 (E ) 29 (E ) 49 (E ) 69 (E ) 89 (E )

    10 (M) 30 (E ) 50 (E ) 70 (E ) 90 ( X )

    11 (E ) 31 (E ) 51 (M) 71 (E ) 91 (E )

    12 ( X ) 32 (E ) 52 ( X ) 72 (M) 92 (M)

    13 (M) 33 (M) 53 (M) 73 (E ) 93 (E )

    14 (E ) 34 (E ) 54 (E ) 74 ( X ) 94 (E )

    15 (M) 35 (M) 55 (E ) 75 (E ) 95 ( X )

    16 (E ) 36 (E ) 56 (M) 76 (E ) 96 (E )

    17 ( X ) 37 (E ) 57 (E ) 77 (M) 97 (E )

    18 ( X ) 38 ( X ) 58 (M) 78 (M) 98 (M)

    19 (M) 39 ( X ) 59 (M) 79 (E ) 99 (E )

  • 71

    Step 2: Sub-divide the club members into three homogeneous sub-groups or strata by the

    language groups: English, Mandarin and Others.

    English Language Stratum | Mandarin Language Stratum | Other Language Stratum
    00 22 40 64 82           | 06 35 66                  | 02 42
    01 24 43 67 85           | 07 44 68                  | 12 46
    03 26 45 69 86           | 10 47 72                  | 17 52
    04 29 48 70 89           | 13 51 77                  | 18 60
    05 30 49 71 91           | 15 53 78                  | 21 65
    08 31 50 73 93           | 19 56 80                  | 23 74
    09 32 54 75 94           | 20 58 83                  | 28 84
    11 34 55 76 96           | 25 59 87                  | 38 88
    14 36 57 79 97           | 27 61 92                  | 39 90
    16 37 63 81 99           | 33 62 98                  | 41 95

    1. Calculate the overall sampling fraction, f, in the following manner:

    $$f = \frac{n}{N} = \frac{20}{100} = \frac{1}{5} = 0.2$$

    where n = sample size and N = population size

  • 72

    Determine the number of sample elements (n1) to be selected from the English

    language stratum. In this example, n1 = 50 x f = 50 x 0.2 =10. By using a simple

    random sampling method [using a random number table] members whose numbers

    are 01, 03, 16, 30, 43, 48, 50, 54, 55, 75, are selected.

    Next, determine the number of sample elements (n2) from the Mandarin language

    stratum. In this example, n2 = 30 x f = 30 X 0.2 = 6. By using a simple random

    sampling method as before, members having numbers 10,15, 27, 51, 59, 87 are

    selected from the Mandarin language stratum.

    In the same manner, the number of sample elements (n3) from the 'Other language'

    stratum is calculated. In this example, n3 = 20 x f = 20 x 0.2 = 4. For this stratum,

    members whose numbers are 17, 18, 28, 38 are selected.

    These three different sets of numbers are now aggregated to obtain the ultimate

    stratified sample as shown below.

    S = (01, 03, 10, 15, 16, 17, 18, 27, 28, 30, 38, 43, 48, 50, 51, 54, 55, 59, 75, 87)
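The proportionate allocation and the per-stratum draws can be sketched in code. The stratum memberships are copied from the Step 2 table; the function name and the seed are hypothetical, and because the draw is random the members selected will generally differ from those listed above.

```python
import random

STRATA = {
    "English": [0, 1, 3, 4, 5, 8, 9, 11, 14, 16, 22, 24, 26, 29, 30, 31, 32,
                34, 36, 37, 40, 43, 45, 48, 49, 50, 54, 55, 57, 63, 64, 67,
                69, 70, 71, 73, 75, 76, 79, 81, 82, 85, 86, 89, 91, 93, 94,
                96, 97, 99],
    "Mandarin": [6, 7, 10, 13, 15, 19, 20, 25, 27, 33, 35, 44, 47, 51, 53,
                 56, 58, 59, 61, 62, 66, 68, 72, 77, 78, 80, 83, 87, 92, 98],
    "Other": [2, 12, 17, 18, 21, 23, 28, 38, 39, 41, 42, 46, 52, 60, 65, 74,
              84, 88, 90, 95],
}

def proportionate_stratified_sample(strata, sample_size, rng):
    """Allocate the sample in proportion to stratum size, then draw within each stratum."""
    population_size = sum(len(members) for members in strata.values())
    allocation, sample = {}, {}
    for name, members in strata.items():
        allocation[name] = round(sample_size * len(members) / population_size)
        sample[name] = sorted(rng.sample(members, allocation[name]))
    return allocation, sample

allocation, sample = proportionate_stratified_sample(STRATA, 20, random.Random(5))
print(allocation)
print(sorted(member for chosen in sample.values() for member in chosen))
```

Rounding the allocation works cleanly here because 50, 30 and 20 scale to whole numbers; with other stratum sizes the rounded counts may not add up to the intended sample size and would need adjusting.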

  • PROBABILITY SAMPLING

    Simple Random Sampling

    Fishbowl draw

    Computer program

    Table of random numbers

    Stratified Sampling

    Proportional

    Disproportional

    Cluster Sampling

    73

  • CLUSTER SAMPLING

    Simple random, systematic and stratified

    sampling – based on researcher’s ability to

    identify each element in population

    Small population size – easy

    Large population – country

    Cluster sampling

    74

  • CLUSTER SAMPLING

    Divide population into clusters

    Select elements within each cluster

    Cluster formation

    Geographical proximity

    Common characteristic – similar to stratified

    sampling

    75

  • STRATIFIED VS CLUSTER SAMPLING

    In stratified sampling, the target population is

    sub-divided into a few subgroups or strata, each

    containing a large number of elements.

    In cluster sampling, the target population is

    sub-divided into a large number of sub-populations or

    clusters, each containing a few elements.

    76

  • 77

    AREA SAMPLING

    A common form of cluster sampling where clusters consist of geographic areas, such as

    districts, housing blocks or townships. Area sampling could be one-stage, two-stage, or

    multi-stage.

    How to Take an Area Sample Using Subdivisions

    Your company wants to conduct a survey on the expected patronage of its new outlet in a new

    housing estate. The company wants to use area sampling to select the sample households to be

    interviewed. The sample may be drawn in the manner outlined below.

    ___________________________________________________________________________________

    Step 1: Determine the geographic area to be surveyed, and identify its subdivisions. Each

    subdivision cluster should be highly similar to all others. For example, choose ten housing

    blocks within 2 kilometers of the proposed site [say, Model Town ] for your new retail outlet;

    assign each a number.

    Step 2: Decide on the use of one-stage or two-stage cluster sampling. Assume that you decide to

    use two-stage cluster sampling.

    Step 3: Using random numbers, select the housing blocks to be sampled. Here, you select 4

    blocks randomly, say numbers #102, #104, #106, and #108.

    Step 4: Using some probability method of sample selection, select the households in each of the

    chosen housing blocks to be included in the sample. Identify a random starting point (say,

    apartment no. 103) and instruct field workers to drop off the survey at every fifth house

    (systematic sampling).
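A two-stage area sample along these lines can be sketched as follows. Everything concrete here is an assumption made for illustration: ten blocks numbered 101 to 110, 40 households per block, hypothetical household labels, and an arbitrary seed. Stage one picks blocks at random; stage two takes every fifth household from a random start within each chosen block.

```python
import random

def two_stage_area_sample(blocks, n_blocks, skip, rng):
    """Stage 1: pick blocks at random. Stage 2: systematic sample inside each block."""
    chosen_blocks = sorted(rng.sample(sorted(blocks), n_blocks))
    sample = {}
    for block in chosen_blocks:
        households = blocks[block]
        start = rng.randrange(skip)                # random starting household
        sample[block] = households[start::skip]    # then every skip-th household
    return sample

# Hypothetical frame: 10 housing blocks, each with 40 numbered households
blocks = {block: [f"{block}-{unit:02d}" for unit in range(1, 41)]
          for block in range(101, 111)}

sample = two_stage_area_sample(blocks, n_blocks=4, skip=5, rng=random.Random(11))
for block, households in sample.items():
    print(block, households)
```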

  • SAMPLING METHODS

    Probability Sampling

    Non-probability Sampling

    Mixed Sampling

    78

  • NON-PROBABILITY SAMPLING

    Do not follow probability theory

    Useful when the number of elements in

    population is unknown or cannot be individually

    identified

    Four common designs

    Quota Sampling

    Accidental Sampling

    Judgemental Sampling

    Snowball Sampling

    79

  • QUOTA SAMPLING

    Main consideration – ease of access to sample

    population

    Guided by some visible characteristic of study

    population – age, gender etc.

    Sample selection – location convenient to the

    researcher

    Whenever a person with required characteristic

    is seen – asked to participate in the study

    Process continues until the required number of

    respondents (quota) is reached

    80

  • QUOTA SAMPLING - EXAMPLE

    Average age of male students in the university

    Select a sample of 20 male students

    You decide to stand at the entrance of the

    university - convenient

    Whenever a male student arrives – ask his age

    When you get 20 – Target is achieved

    81

  • QUOTA SAMPLING

    Advantages

    Convenient

    Less expensive

    Disadvantages

    Not probability based

    May not be generalized to the population

    82

  • ACCIDENTAL SAMPLING

    Also based on convenience

    Quota sampling – include people with some

    obvious characteristic

    Accidental sampling – no such attempt

    Common among market researchers and newspaper

    reporters

    Since you simply approach whoever is available, they may

    not have the required information

    83

  • JUDGEMENTAL SAMPLING

    Judgement of the researcher as to who can

    provide the best information to achieve the

    objectives of the study

    Sampling based on some judgment, gut-feelings

    or experience of the researcher.

    84

  • SNOWBALL SAMPLING

    Sample selection using network

    Start with few individuals or organizations and

    collect the information

    They are then asked to identify other

    participants – people selected by them become a

    part of sample

    The process continues……

    85

  • SAMPLING METHODS

    Probability Sampling

    Non-probability Sampling

    Mixed Sampling

    86

  • MIXED SAMPLING

    Systematic sampling

    Divide the frame into segments or intervals

    Select one element from first interval using SRS

    Select elements from subsequent intervals

    depending upon the element selected from the

    first interval

    Example

    5th element selected from first interval

    Select the 5th from each interval

    87

  • SYSTEMATIC SAMPLING

    88

    To use systematic sampling, a researcher needs:

    [i] A sampling frame of the population;

    [ii] A skip interval calculated as follows:

    $$\text{Skip interval} = \frac{\text{population list size}}{\text{sample size}}$$

    Names are selected using the skip interval.

    If a researcher were to select a sample of 1000 people using the local telephone

    directory containing 215,000 listings as the sampling frame, the skip interval is

    215,000/1000, or 215. The researcher can select every 215th name of the entire

    directory [sampling frame] to form the sample.

  • 89

    Example: How to Take a Systematic Sample

    Step 1: Select a listing of the population, say the City Telephone Directory, from which to

    sample.

    Step 2: Compute the skip interval by dividing the number of entries in the directory by the

    desired sample size.

    Example: 250,000 names in the phone book and a desired sample size of 2500,

    so skip interval = 250,000/2500 = 100, i.e. every 100th name

    Step 3: Using random number(s), determine a starting position for sampling the list.

    Example: Select: Random number for page number. (page 01)

    Select: Random number of column on that page. (col. 03)

    Select: Random number for name position in that column (#38, say, A. Mahadeva)

    Step 4: Apply the skip interval to determine which names on the list will be in the sample.

    Example: A. Mahadeva (Skip 100 names), new name chosen is A Rahman b Ahmad.

    Step 5: Consider the list as “circular”; that is, the first name on the list is now the initial name

    you selected, and the last name is now the name just prior to the initially selected one.

    Example: When you come to the end of the phone book names (Zs), just continue on

    through the beginning (As).
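The five steps can be sketched as a short function. The directory here is a list of placeholder names invented for the example, and a single random index stands in for the page, column and position draws of Step 3; the modulo operation implements the "circular" list of Step 5.

```python
import random

def systematic_sample(frame, sample_size, rng):
    """Random start, then every skip-th entry, wrapping around the end of the list."""
    skip = len(frame) // sample_size        # Step 2: skip interval
    start = rng.randrange(len(frame))       # Step 3: random starting position
    return [frame[(start + i * skip) % len(frame)]   # Steps 4-5: apply skip, circular list
            for i in range(sample_size)]

# Hypothetical directory of 250,000 placeholder names
directory = [f"name_{i:06d}" for i in range(250_000)]
sample = systematic_sample(directory, 2500, random.Random(7))
print(sample[:3])
```

With 250,000 entries and a sample of 2500 the skip interval is 100, so consecutive selections sit 100 names apart until the walk wraps from the end of the list back to its beginning.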

  • CREDITS

    Chapter 12, Research Methodology, Ranjit

    Kumar

    Sampling in Market Research - APMF

    90