types_of_sampling.docx

Upload: wwwaqar

Post on 03-Apr-2018

  • 7/29/2019 Types_of_Sampling.docx

    1/7

    Types of Sampling

    We now consider the different types of samples. Although a number of different methods

    might be used to create a sample, they can generally be grouped into one of two categories: probability

    samples or non-probability samples.

    Probability Samples

    The idea behind this type is random selection. More specifically, each sample from the population of interest

    has a known probability of selection under a given sampling scheme. There are four categories of probability

    samples described below.

    Simple Random Sampling

    The most widely known type of a random sample is the simple random sample (SRS). This is characterized by

    the fact that the probability of selection is the same for every case in the population. Simple random sampling is

    a method of selecting n units from a population of size N such that every possible sample of size n has an equal

    chance of being drawn.

    An example may make this easier to understand. Imagine you want to carry out a survey of 100 voters in a

    small town with a population of 1,000 eligible voters. With a town this size, there are "old-fashioned" ways to

    draw a sample. For example, you could write each voter's name on a separate slip of paper, put all the slips

    into a box, and draw 100 of them at random. You shake the box, draw a slip and set it aside, shake

    again, draw another, set it aside, and so on until you have 100 slips of paper. These 100 form your sample. This

    sample would be drawn through a simple random sampling procedure: at each draw, every name still in the box

    has the same probability of being chosen.
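
    The box-and-slips procedure can be sketched in a few lines of Python. This is a minimal illustration, assuming the voter list is available as a list; the placeholder names and the helper simple_random_sample are hypothetical and not part of the original example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units so that every subset of size n is equally likely."""
    rng = random.Random(seed)
    # random.sample draws without replacement, like setting each slip aside.
    return rng.sample(population, n)

# Hypothetical voter roll: 1,000 placeholder names standing in for the town's list.
voters = [f"voter_{i:04d}" for i in range(1, 1001)]
sample = simple_random_sample(voters, 100, seed=42)
print(len(sample), len(set(sample)))
```

    Fixing the seed only makes the draw repeatable for checking; an actual survey draw would normally leave it unset.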

    In real-world social research, designs that employ simple random sampling are difficult to come by. We can

    imagine some situations where it might be possible: say you want to interview a sample of doctors in a hospital

    about work conditions. You get a list of all the physicians who work in the hospital, write their names on

    slips of paper, put those slips in a box, shake, and draw. But in most real-world instances it is

    impossible to list every element on a slip of paper, put the slips in a box, and draw at random until the desired

    sample size is reached.

    There are many reasons why one would choose a different type of probability sample in practice.

    Example 1

    Suppose you were interested in investigating the link between the family of origin and income and your

    particular interest is in comparing incomes of Hispanic and Non-Hispanic respondents. For statistical reasons,

    you decide that you need at least 1,000 non-Hispanics and 1,000 Hispanics. Hispanics comprise around 6 or

    7% of the population. If you take a simple random sample of all races that would be large enough to get you

    1,000 Hispanics, the sample size would be near 15,000, which would be far more expensive than a method

    that yields a sample of 2,000. One strategy that would be more cost-effective would be to split the population

    into Hispanics and non-Hispanics, then take a simple random sample within each portion (Hispanic and non-

    Hispanic).
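
    The arithmetic behind the "near 15,000" figure can be checked with a short calculation. This is a rough sketch that only asks how large a simple random sample must be for the expected number of subgroup members to reach the target; the function name is hypothetical.

```python
import math

def srs_size_for_expected_count(target_count, population_share):
    """Smallest SRS size whose expected number of subgroup members reaches target_count."""
    return math.ceil(target_count / population_share)

# Population shares of 6% and 7%, as in the example above.
for share in (0.06, 0.07):
    print(share, srs_size_for_expected_count(1000, share))
```

    The two shares give sizes of roughly 16,700 and 14,300, which bracket the figure quoted in the example. By contrast, sampling 1,000 from each group separately needs only 2,000 cases.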

    Example 2

    Let's suppose your sampling frame is a large city's telephone book that has 2,000,000 entries. To take

    an SRS, you need to associate each entry with a number and choose n = 200 numbers from N = 2,000,000. This

    could be quite an ordeal. Instead, you decide to take a random start between 1 and N/n = 10,000 and then take

    every 10,000th name after it. This is an example of systematic sampling, a technique discussed more fully below.

    Example 3

    Suppose you wanted to study dance club and bar employees in NYC with a sample of n = 600. Yet there is no

    list of these employees from which to draw a simple random sample. Suppose you obtained a list of all

    bars/clubs in NYC. One way to get the sample would be to randomly sample 300 bars/clubs and then randomly sample 2

    employees within each bar/club. This is an example of cluster sampling. Here the unit of analysis (the employee)

    is different from the primary sampling unit (the bar/club).

    In each of these three examples, a probability sample is drawn, yet none is an example of simple random

    sampling. Each of these methods is described in greater detail below.

    Although simple random sampling is the ideal for social science and most of the statistics used are based on

    assumptions of SRS, in practice SRS is rarely seen. It can be terribly inefficient, and it is particularly difficult

    when large samples are needed. Other probability methods are more common. Yet SRS is essential, both as a

    method in its own right and as an easy-to-understand way of selecting a sample.

    To recap, simple random sampling is a sampling procedure in which every element of the

    population has the same chance of being selected and every element in the sample is selected by chance.

    Stratified Random Sampling

    In this form of sampling, the population is first divided into two or more mutually exclusive segments based on

    categories of the variables of interest in the research. It is designed to organize the population into

    homogeneous subsets before sampling and then to draw a random sample within each subset. With stratified

    random sampling, the population of N units is divided into subpopulations of N_1, N_2, ..., N_L units respectively. These

    subpopulations, called strata, are non-overlapping, and together they comprise the whole of the population, so that N_1 + N_2 + ... + N_L = N.

    When the strata have been determined, a sample is drawn from each, with a separate draw for each of the different

    strata. The sample sizes within the strata are denoted by n_1, n_2, ..., n_L respectively. If an SRS is taken within each stratum,

    then the whole sampling procedure is described as stratified random sampling.

    The primary benefit of this method is to ensure that cases from smaller strata of the population are included in

    sufficient numbers to allow comparison. An example makes it easier to understand. Say that you're interested

    in how job satisfaction varies by race among a group of employees at a firm. To explore this issue, we need to

    create a sample of the employees of the firm. However, the employee population at this particular firm is

    predominantly white, as the following chart illustrates:

    If we were to take a simple random sample of employees, there's a good chance that we would end up with

    very small numbers of Blacks, Asians, and Latinos. That could be disastrous for our research, since we might

    end up with too few cases for comparison in one or more of the smaller groups.

    Rather than taking a simple random sample from the firm's population at large, in a stratified sampling design

    we ensure that appropriate numbers of elements are drawn from each racial group in proportion to its

    share of the population as a whole. Say we want a sample of 1,000 employees: we would stratify the

    employees by race (a group of White employees, a group of African American employees, etc.), then randomly draw

    750 employees from the White group, 90 from the African American, 100 from the Asian, and 60 from the

    Latino. This yields a sample that is proportionately representative of the firm as a whole.
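
    The proportional allocation in this example can be sketched as follows. The firm size of 10,000 and the group counts are assumptions chosen only to match the 75/9/10/6 percent split implied by the sample sizes above; the function name is likewise hypothetical.

```python
import random

def proportional_stratified_sample(strata, total_n, seed=None):
    """Draw an SRS within each stratum, sized in proportion to the stratum."""
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Proportional allocation: n_h = n * N_h / N, rounded to a whole unit.
        n_h = round(total_n * len(units) / population_size)
        sample[name] = rng.sample(units, n_h)
    return sample

# Hypothetical firm of 10,000 employees with the proportions implied above.
counts = {"White": 7500, "African American": 900, "Asian": 1000, "Latino": 600}
strata = {g: [f"{g}_{i}" for i in range(c)] for g, c in counts.items()}
sample = proportional_stratified_sample(strata, 1000, seed=1)
print({g: len(s) for g, s in sample.items()})
```

    With these counts the stratum sample sizes come out to 750, 90, 100, and 60. In general, rounding can make the stratum sizes sum to slightly more or less than the requested total, so a real design would need a rule for distributing the remainder.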

    Stratification is a common technique. There are many reasons for this, such as:

    1. If data of known precision are wanted for certain subpopulations, then each of these

    should be treated as a population in its own right.

    2. Administrative convenience may dictate the use of stratification, for example, if the

    agency administering a survey has regional offices, each of which can supervise the

    survey for a part of the population.

    3. Sampling problems may be inherent with certain subpopulations, such as people

    living in institutions (e.g., hotels, hospitals, prisons).

    4. Stratification may improve the estimates of characteristics of the whole population. It

    may be possible to divide a heterogeneous population into subpopulations, each of

    which is internally homogeneous. If these strata are homogeneous, i.e., the

    measurements vary little from one unit to another, a precise estimate of any stratum

    mean can be obtained from a small sample in that stratum. These estimates can then be

    combined into a precise estimate for the whole population.

    5. There is also a statistical advantage in the method, as a stratified random sample

    nearly always results in a smaller variance for the estimated mean or other population

    parameters of interest.

    Systematic Sampling

    This method of sampling is at first glance very different from SRS. In practice, it is a variant of simple random

    sampling that involves some listing of elements: every kth element of the list is then drawn for inclusion in the

    sample. Say you have a list of 10,000 people and you want a sample of 1,000.

    Creating such a sample involves three steps:

    1. Divide the number of cases in the population by the desired sample size. In this example,

    dividing 10,000 by 1,000 gives a value of 10.

    2. Select a random number between one and the value obtained in Step 1. In this

    example, we choose a number between 1 and 10; say we pick 7.

    3. Starting with the case number chosen in Step 2, take every tenth record (7, 17, 27, etc.).

    More generally, suppose that the N units in the population are numbered 1 to N in some order (e.g., alphabetical).

    To select a sample of n units, we take a unit at random from the first k = N/n units and take every kth unit thereafter.
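
    The three steps can be sketched in Python as below. This is a minimal illustration that assumes N is an exact multiple of n, as in the 10,000 and 1,000 example; the function name and the numbered records are hypothetical.

```python
import random

def systematic_sample(units, n, seed=None):
    """Take a random start among the first k units, then every k-th unit after it."""
    rng = random.Random(seed)
    k = len(units) // n          # sampling interval; assumes N is a multiple of n
    start = rng.randint(1, k)    # 1-based random start between 1 and k inclusive
    # Convert 1-based positions start, start + k, ... to 0-based list indices.
    return [units[i] for i in range(start - 1, len(units), k)][:n]

records = list(range(1, 10001))  # hypothetical list of 10,000 numbered records
sample = systematic_sample(records, 1000, seed=7)
print(sample[:3], len(sample))
```

    When N is not a multiple of n, the interval and the resulting sample size need extra care; this sketch does not handle that case.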

    The advantages of systematic sampling method over simple random sampling include:

    1. It is easier to draw a sample and often easier to execute without mistakes. This is a

    particular advantage when the drawing is done in the field.

    2. Intuitively, systematic sampling seems likely to be more precise

    than SRS. In effect, it stratifies the population into n strata, consisting of the first k

    units, the second k units, and so on. Thus, we might expect the systematic sample to be

    about as precise as a stratified random sample with one unit per stratum. The difference is

    that with the systematic sample the units occur at the same relative position in each

    stratum, whereas with the stratified sample the position in the stratum is determined

    separately by randomization within each stratum.

    Cluster Sampling

    In some instances the sampling unit consists of a group or cluster of smaller units that we call elements or

    subunits (these are the units of analysis for your study). There are two main reasons for the widespread

    application of cluster sampling. Although the first intention may be to use the elements as sampling units, it is

    found in many surveys that no reliable list of elements in the population is available and that it would be

    prohibitively expensive to construct such a list. In many countries there are no complete and updated lists of

    the people, the houses, or the farms in any large geographical region.

    Even when a list of individual houses is available, economic considerations may point to the choice of a larger

    cluster unit. For a given size of sample, a small unit usually gives more precise results than a large unit. For

    example, an SRS of 600 houses covers a town more evenly than 20 city blocks containing an average of 30

    houses apiece. But greater field costs are incurred in locating 600 houses and in traveling between them than

    in covering 20 city blocks. When cost is balanced against precision, the larger unit may prove superior.

    Important things about cluster sampling:

    1. Most large-scale surveys are done using cluster sampling;

    2. Clustering may be combined with stratification, typically by clustering within strata;

    3. In general, for a given sample size n, cluster samples are less accurate than the other

    types of sampling in the sense that the parameters you estimate will have greater

    variability than with an SRS, stratified random, or systematic sample.
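
    A two-stage cluster design, such as the bars/clubs case in Example 3, might be sketched as follows. The venue frame here is invented purely for illustration (1,000 venues with 5 to 40 employees each), and the function name is hypothetical.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: SRS of clusters. Stage 2: SRS of elements within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for cluster_id in chosen:
        elements = clusters[cluster_id]
        # Guard against clusters with fewer elements than requested.
        take = min(n_per_cluster, len(elements))
        sample.extend((cluster_id, e) for e in rng.sample(elements, take))
    return sample

# Hypothetical frame: 1,000 venues, each with 5 to 40 employees.
frame_rng = random.Random(0)
venues = {f"venue_{v}": [f"emp_{v}_{j}" for j in range(frame_rng.randint(5, 40))]
          for v in range(1000)}
sample = two_stage_cluster_sample(venues, 300, 2, seed=3)
print(len(sample))
```

    Note that only a list of venues is needed up front; employee lists are required only for the 300 venues actually drawn. Taking the same number of employees from every venue gives employees in small venues a higher chance of selection, so an analysis would typically need to weight for that.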

    Nonprobability Sampling

    Social research is often conducted in situations where a researcher cannot select the kinds of probability

    samples used in large-scale social surveys. For example, suppose you wanted to study homelessness: there is no list

    of homeless individuals, nor are you likely to be able to create such a list. However, you need to get some kind of a

    sample of respondents in order to conduct your research. To gather such a sample, you would likely use some

    form of non-probability sampling.

    To reiterate, the primary difference between probability methods of sampling and non-probability methods is

    that in the latter you do not know the likelihood that any element of a population will be selected for study.

    There are four primary types of non-probability sampling methods:

    Availability Sampling

    Availability sampling is a method of choosing subjects who are available or easy to find. This method is also

    sometimes referred to as haphazard, accidental, or convenience sampling. The primary advantage of the

    method is that it is very easy to carry out, relative to other methods. A researcher can merely stand out on

    his/her favorite street corner or in his/her favorite tavern and hand out surveys. One place this used to show up

    often is in university courses. Years ago, researchers often would conduct surveys of students in their large

    lecture courses. For example, all students taking introductory sociology courses would have been given a

    survey and compelled to fill it out. There are some advantages to this design - it is easy to do, particularly with

    a captive audience, and in some schools you can attain a large number of interviews through this method.

    The primary problem with availability sampling is that you can never be certain what population the participants

    in the study represent. The population is unknown, the method for selecting cases is haphazard, and the cases

    studied probably don't represent any population you could come up with.

    However, there are some situations in which this kind of design has advantages - for example, survey

    designers often want to have some people respond to their survey before it is given out in the "real" research

    setting as a way of making certain the questions make sense to respondents. For this purpose, availability

    sampling is not a bad way to get a group to take a survey, though in this case researchers care less about the

    specific responses given than whether the instrument is confusing or makes people feel bad.

    Despite the known flaws with this design, it's remarkably common. Consider the typical television call-in or online poll: ask a provocative question, give a telephone

    number and website address ("Vote now at CNN.com"), and announce the results of the poll. This method provides

    some form of statistical data on a current issue, but it is entirely unknown what population the results of such

    polls represent. At best, a researcher could make some conditional statement about people who were

    watching CNN at a particular point in time and who cared enough about the issue in question to log on or call in.

    Quota Sampling

    Quota sampling is designed to overcome the most obvious flaw of availability sampling. Rather than taking just

    anyone, you set quotas to ensure that the sample you get represents certain characteristics in proportion to

    their prevalence in the population. Note that for this method, you have to know something about the

    characteristics of the population ahead of time. Say you want to make sure you have a sample proportional to

    the population in terms of gender: you have to know what percentage of the population is male and what percentage is female,

    then collect cases until your sample matches. Marketing studies are particularly fond of this form of research design.
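
    The quota-filling logic can be sketched as below. The stream of passers-by and the 50/50 gender quota are assumptions made up for illustration; in practice the quotas would come from known population figures, and the function name is hypothetical.

```python
def quota_sample(respondents, quotas, key):
    """Accept respondents in arrival order until every quota cell is full."""
    remaining = dict(quotas)
    accepted = []
    for person in respondents:
        cell = key(person)
        if remaining.get(cell, 0) > 0:
            accepted.append(person)
            remaining[cell] -= 1
        if not any(remaining.values()):
            break  # all quotas met
    return accepted

# Hypothetical stream of passers-by; the 50/50 quota is an assumed population split.
stream = [{"id": i, "gender": "female" if i % 3 else "male"} for i in range(1, 301)]
sample = quota_sample(stream, {"female": 50, "male": 50}, key=lambda p: p["gender"])
print(len(sample))
```

    Nothing in this procedure is random: whoever turns up first in each category is accepted, which is why the sample can match the quota characteristics and still be unrepresentative in other respects.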

    The primary problem with this form of sampling is that even when we know that a quota sample is

    representative of the particular characteristics for which quotas have been set, we have no way of knowing if

    the sample is representative in terms of any other characteristics. If we set quotas for gender and age, we are likely

    to attain a sample with good representativeness on age and gender, but one that may not be very

    representative in terms of income and education or other factors.

    Moreover, because researchers can set quotas for only a small fraction of the characteristics relevant to a

    study, quota sampling is really not much better than availability sampling. To reiterate, you must know the

    characteristics of the entire population to set quotas; otherwise there's not much point to setting up quotas.

    Finally, interviewers often introduce bias when allowed to self-select respondents, which is usually the case in

    this form of research. In choosing males 18-25, interviewers are more likely to choose those who are better

    dressed or who seem more approachable or less threatening. That may be understandable from a practical point of

    view, but it introduces bias into research findings.

    Purposive Sampling

    Purposive sampling is a sampling method in which elements are chosen based on the purpose of the study.

    Purposive sampling may involve studying the entire population of some limited group (sociology faculty at

    Columbia) or a subset of a population (Columbia faculty who have won Nobel Prizes). As with other

    non-probability sampling methods, purposive sampling does not produce a sample that is representative of a larger

    population, but it can be exactly what is needed in some cases: the study of an organization, a community, or some

    other clearly defined and relatively limited group.

    Snowball Sampling

    Snowball sampling is a method in which a researcher identifies one member of some population of interest,

    speaks to him/her, then asks that person to identify others in the population that the researcher might speak to.

    Each person so identified is then asked to refer the researcher to yet another person, and so on.
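
    The referral chain can be sketched as a simple traversal. The referral network below is invented for illustration; in fieldwork the referrals would come from the interviews themselves, and the function name is hypothetical.

```python
from collections import deque

def snowball_sample(seed_person, get_referrals, max_size):
    """Follow referral chains from one seed until max_size people are reached."""
    sampled = [seed_person]
    seen = {seed_person}
    queue = deque([seed_person])
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        for referral in get_referrals(person):
            if referral not in seen and len(sampled) < max_size:
                seen.add(referral)
                sampled.append(referral)
                queue.append(referral)
    return sampled

# Hypothetical referral network; in fieldwork these come from interviews.
referrals = {"Ana": ["Luis", "Marta"], "Luis": ["Ana", "Pedro"],
             "Marta": ["Pedro"], "Pedro": ["Rosa"], "Rosa": []}
sample = snowball_sample("Ana", lambda p: referrals.get(p, []), max_size=4)
print(sample)
```

    Everyone in the resulting sample is reachable from the single starting contact, so people outside that person's network can never be selected.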

    Snowball sampling is very good for cases where members of a special population are difficult to locate. For

    example, several studies of Mexican migrants in Los Angeles have used snowball sampling to get respondents.

    The method also has an interesting application to group membership - if you want to look at patterns of

    recruitment to a community organization over time, you might begin by interviewing fairly recent recruits, asking

    them who introduced them to the group. Then interview the people named, asking them who recruited them to

    the group.

    The method creates a sample with questionable representativeness. A researcher is not sure who is in the

    sample. In effect, snowball sampling often leads the researcher into a realm he/she knows little about. It can be

    difficult to determine how a sample compares to a larger population. Also, there's an issue of whom respondents

    refer you to: friends refer friends, and respondents are less likely to refer people they dislike or fear.