chapter_1_s_

Upload: shalini-arivalagan

Post on 05-Apr-2018


  • 7/31/2019 Chapter_1_s_

    1/15

    Probability and Statistics I

    UECM1203 Probability and Statistics I

    Contents:

    Chapter 1. Introduction

    Chapter 2. Organizing Data

    Chapter 3. Numerical Descriptive Measures

    Chapter 4. Probability

    Chapter 5. Discrete Distributions

    Chapter 6. Continuous Distributions

    Reference Books

    1. Wackerly D.D., Mendenhall W., Scheaffer R.L., Mathematical Statistics with Applications, 7th edition, Duxbury.

    2. Hogg R.V. & Tanis E.A., Probability and Statistical Inference, 8th edition, Prentice Hall.

    3. Mann P.S., Introductory Statistics, 6th edition, John Wiley.

    Method of Assessment


    Chapter 1 Introduction

    Statistical Methods in Problem Solving

    Step 1 : Identify Problems
    Step 2 : Plan and Collect Data
    Step 3 : Classify and Simplify Data
    Step 4 : Analyze Data
    Step 5 : Draw Conclusions

    1.1 Definition of statistics

    The word statistics has 2 meanings.

    1. Statistics refers to numerical facts.
       Examples: the student population of UTAR was over 10,000 in 2010; the age of a student; the passing percentage of an examination.

    2. Statistics refers to the field or discipline of study.
       Statistics is a group of methods used to collect, organize, summarize, present and analyze data, as well as to draw valid conclusions and make reasonable decisions on the basis of such analysis.

    No.  Method of Assessment     Total
    1.   Coursework               40%
         a) Test 1 : 20%
         b) Test 2 : 20%
    2.   Final Examination        60%

    1.2 Types of Statistics

    1. Descriptive Statistics (or Deductive Statistics)
    - Consists of methods for organizing, displaying, and describing data by using tables, graphs, and summary measures.
    - Deals with the description and analysis of a given group of data.
    - Presents information in a convenient, usable and comprehensible form.

    2. Inferential Statistics (or Inductive Statistics)
    - Consists of methods that use sample results to make decisions or predictions about a population.
    - Deals with the problems of making inferences or drawing conclusions about a population based on information obtained from samples taken from the population.

    The reasons for learning statistics are the following:

    1. To know how to properly present and describe information.
    2. To know how to obtain reliable forecasts of variables of interest.
    3. To know how to improve processes.
    4. To know how to draw conclusions about large populations based on information obtained from samples.

    [Diagram: Statistics branches into descriptive statistics and inferential statistics.]


    1.3 Population versus Sample

    Population or Target Population
    Consists of all elements (individuals, items, or objects) whose characteristics are being studied.

    Sample
    A portion of the population selected for study.

    Example 1.1
    Suppose we want to study the heights of the students of UTAR. To do so, 100 students in Year 1 Semester 2 were selected and their heights were measured. State the population and the sample.

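    Solution:
    Population: all students of UTAR.
    Sample: the 100 Year 1 Semester 2 students whose heights were measured.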

    Population Parameter
    A numerical measure (mean, median, mode, range, variance, standard deviation) calculated for a population data set.

    Sample Statistic
    A summary measure calculated for a sample data set.

    Notations used for population parameters and sample statistics:

    Population Parameters                        Sample Statistics
    $\mu$       population mean                  $\bar{x}$  sample mean
    $\sigma^2$  population variance              $s^2$      sample variance
    $\sigma$    population standard deviation    $s$        sample standard deviation
    $p$         population proportion            $\hat{p}$  sample proportion
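    The distinction between a population parameter and a sample statistic can be sketched in a few lines of Python. The data values and the choice of sample below are hypothetical and are not taken from the notes; the sketch only shows which quantity is computed from which data set. Note that the population variance divides by N while the sample variance divides by n - 1.

```python
import statistics

# Hypothetical population of 8 values (illustrative data, not from the notes).
population = [2, 4, 4, 4, 5, 5, 7, 9]

# Population parameters: computed from every element of the population.
mu = statistics.mean(population)            # population mean
sigma2 = statistics.pvariance(population)   # population variance (divides by N)
sigma = statistics.pstdev(population)       # population standard deviation

# Sample statistics: computed from a sample only.
# The sample here is simply the first 4 values, chosen for illustration.
sample = population[:4]
x_bar = statistics.mean(sample)             # sample mean
s2 = statistics.variance(sample)            # sample variance (divides by n - 1)
s = statistics.stdev(sample)                # sample standard deviation

print("parameters:", mu, sigma2, sigma)
print("statistics:", x_bar, s2, s)
```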

    Why Sample ?


    1. Save time
    2. Save cost
    3. Impossibility of conducting a census

    (a) Sometimes it is impossible to identify all members of the population.
        Example: conduct a survey about the opinion of TV viewers about a program. We do not know exactly who watched that particular TV program.

    (b) Sometimes conducting a survey means destroying the items included in the survey.
        Example: conduct a survey to estimate the mean life of light bulbs. This would burn out all the bulbs included in the survey.

    Survey
    The collection of information from the elements of a population or a sample.

    Census
    - A survey that includes every member of the population.
    - The data obtained are known as census data.
    - It gives highly detailed information, but it takes a long time to analyze and is usually costly.
    - An example of a census is the measurement of variables such as age, gender, results and year of study for all students in a college.

    Sample survey
    - The technique of collecting information from a portion of the population.
    - The data obtained from a sample survey are called sample data.

    Methods of survey
    1. Observation
    2. Questionnaire


    Purpose of statistics
    Drawing conclusions about the population by studying the sample.

    [Diagram: collect information from a comparatively SMALL sample, then draw conclusions about a LARGE population.]

    Example 1.2
    Suppose a television executive wants to know the percentage of television viewers who watch the program ABC. Given that 100 million people may be watching television on a given evening, how would you find the percentage of TV viewers who watch ABC?
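    One possible approach to Example 1.2 is to survey a random sample of viewers and use the sample proportion as an estimate of the population proportion. The sketch below is a simulation under stated assumptions: the population is scaled down to 100,000 viewers, the true share of 30% and the sample size of 1,000 are invented for illustration, and none of these figures come from the notes.

```python
import random

# Hypothetical, scaled-down population: 100,000 viewers, 30,000 of whom
# watch ABC. Both numbers are assumptions for illustration only.
POPULATION_SIZE = 100_000
WATCHERS = 30_000
population = [True] * WATCHERS + [False] * (POPULATION_SIZE - WATCHERS)

rng = random.Random(2019)                # fixed seed so the run is repeatable
sample = rng.sample(population, 1_000)   # simple random sample of 1,000 viewers

# Sample proportion: fraction of sampled viewers who watch ABC.
p_hat = sum(sample) / len(sample)
print(f"Estimated share watching ABC: {p_hat:.1%}")
```

    The estimate will usually differ a little from the true share; that difference is the sampling error discussed later in the chapter.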

    1.4 Sampling technique

    Representative sample
    A sample that represents the characteristics of the population as closely as possible.

    Random Sample
    A sample drawn in such a way that each element of the population has some chance of being selected.

    Note
    For the sample to be useful,
    (i) it must not be biased; the sample should be random.
    (ii) the sample must be taken from the correct population.
    (iii) the sample must be a representative sample.


    Sampling Frame
    A list of all items in the population from which the sample will be drawn is called a sampling frame. For a sample to be truly representative, it is important that the sampling frame be complete, up to date and adequate for the purpose.

    Four common ways to select a random sample are discussed below:
    (i) simple random sampling
    (ii) systematic random sampling
    (iii) stratified random sampling
    (iv) cluster sampling

    (i) Simple Random Sampling (SRS)
    A simple random sample is a sample that is selected in such a way that each possible sample of the same size, and therefore each member of the population, has the same chance of being selected.
    - One way to select a simple random sample is by a mechanical process such as a lottery, drawing or card shuffling.
    - A table of random numbers may also be used. These numbers are generated by a random process.

    [Diagram: Sampling divides into random sampling (simple random sampling, systematic random sampling, ...) and nonrandom sampling (convenience sampling, judgment sampling, ...).]

    93716 16894 98953 73231
    32886 59780 09958 18065
    92052 06831 19640 99413
    39510 35905 85244 35159

    The following steps can be used to obtain a simple random sample of size n from a population of size N by using a random number table.
    1. Assign a number to each element of the population from 1 to N.
    2. Pick, at random, a starting number in the random number table with the same number of digits as N (2 digits if N = 25, 4 digits if N = 5000).
    3. From the starting number, move in any direction until n numbers are chosen. Numbers that are not from 1 to N, and repeated numbers, are ignored.
    4. This gives a simple random sample of size n.

    Example 1.3
    Suppose there are 600 items in a population. A simple random sample of size 10 is to be drawn from this population.

    Solution :
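    One way to work Example 1.3 is sketched below in Python, using the random number table given earlier. The starting point (top-left corner) and the reading direction (left to right, row by row, three digits at a time) are assumptions made for this illustration; any other starting point or direction is equally valid. The function name `sample_from_table` is hypothetical.

```python
# Rows of the random number table given earlier in the notes.
TABLE = [
    "93716 16894 98953 73231",
    "32886 59780 09958 18065",
    "92052 06831 19640 99413",
    "39510 35905 85244 35159",
]

def sample_from_table(rows, population_size, sample_size):
    """Select distinct labels in 1..population_size by reading the table."""
    digits = "".join(rows).replace(" ", "")
    width = len(str(population_size))       # 3 digits when N = 600
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        number = int(digits[i:i + width])
        # Ignore numbers outside 1..N and numbers already chosen.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            return chosen
    raise ValueError("table too short for the requested sample size")

sample = sample_from_table(TABLE, population_size=600, sample_size=10)
print(sample)  # [161, 498, 313, 288, 99, 581, 592, 52, 68, 311]
```

    Under these assumptions the numbers 937, 689, 953 and so on are skipped because they exceed 600, and the second occurrence of 099 is skipped because item 99 has already been chosen.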

    (ii) Systematic Random Sampling

    In systematic random sampling, we first randomly select one member from the first k units (by lottery system or by using a table of random numbers), where

    k = [(population size) / (sample size)]

    and [a] is the largest integer that is less than or equal to a.
    Then every k-th member, starting with the first selected member, is included in the sample.

    Features:


    - Easy to carry out.
    - Mainly used in factories: every k-th item produced by a machine is tested for quality control purposes.
    - Not every possible sample of size n has the same chance of being selected. When the population size is not a multiple of n, some members cannot be selected at all.

    Example 1.4
    Suppose there are 1000 voters in a village, and a survey is to be done on them using a systematic random sample of size n = 50.

    Solution :
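    A minimal Python sketch of Example 1.4 follows, assuming the voters are numbered 1 to 1000. Here k = [1000 / 50] = 20, so one voter is picked at random from the first 20 and every 20th voter after that is included. The function name `systematic_sample` and the seed are hypothetical choices for this illustration.

```python
import random

def systematic_sample(population_size, sample_size, rng):
    """Return the labels (1..N) chosen by systematic random sampling."""
    k = population_size // sample_size      # k = [N / n]
    start = rng.randint(1, k)               # random member among the first k
    # Take every k-th member, beginning with the starting member.
    return [start + j * k for j in range(sample_size)]

rng = random.Random(7)                      # fixed seed, for repeatability only
voters = systematic_sample(1000, 50, rng)
print(voters[:5], "...", voters[-1])
```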

    (iii) Stratified Random Sampling

    In stratified random sampling, we first subdivide the population into at least two subgroups (or strata) whose members share the same characteristics (such as gender or age bracket). Then, one sample is selected from each of these strata. The collection of all samples from all strata gives the stratified random sample.
    Usually, the sizes of the samples selected from different strata are proportional to the sizes of the strata.

    Features:
    - A population that differs widely in the possession of a characteristic is divided into different strata.
    - The elements in each stratum have similar characteristics.

    Example 1.5
    A study is carried out to estimate the mean income of the households of a village.
    Suppose we have 500 households in that village.
    First, divide the households into 3 different groups based on income levels:

    Stratum 1 : low income (stratum size: $N_1 = 100$)
    Stratum 2 : medium income (stratum size: $N_2 = 150$)
    Stratum 3 : high income (stratum size: $N_3 = 250$)

    where the population size is $N = N_1 + N_2 + N_3$. A stratified random sample of size n = 40 can be obtained as follows:

    Determine the sample size $n_i$ of the random sample to be taken from the i-th stratum using

    $n_i = \left(\frac{N_i}{N}\right) n$, i = 1, 2, 3, where $n = n_1 + n_2 + n_3$.

    Combining the 3 samples from each stratum forms a stratified random sample.

    Stratified sampling is an improvement over a pure random sample as it lessens the probability of one-sidedness.
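    The proportional allocation in Example 1.5 can be checked with a short Python sketch. The stratum sizes and n = 40 come from the example; the household labels within each stratum and the seed are assumptions for illustration. With these figures the allocation works out to whole numbers, but other sizes may need rounding.

```python
import random

strata_sizes = {"low": 100, "medium": 150, "high": 250}   # N1, N2, N3
N = sum(strata_sizes.values())                            # N = 500
n = 40                                                    # total sample size

# Proportional allocation: n_i = (N_i / N) * n.
# Integer arithmetic is exact here; other sizes may need rounding.
allocation = {name: size * n // N for name, size in strata_sizes.items()}

rng = random.Random(1)                  # fixed seed, for repeatability only
sample = {}
for name, size in strata_sizes.items():
    households = range(1, size + 1)     # hypothetical labels within the stratum
    # Simple random sample of n_i households from this stratum.
    sample[name] = rng.sample(households, allocation[name])

print(allocation)  # {'low': 8, 'medium': 12, 'high': 20}
```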

    (iv) Cluster Sampling
    Divide the population into groups called clusters such that each group is representative of the population. The clusters do not overlap each other. Then, a random sample of clusters is selected.
    Finally, a random sample of elements is selected from each of the selected clusters. These random samples are then combined to form a cluster random sample.

    Features:


    - Each cluster is representative of the population.

    Example 1.6
    Suppose 500 boxes of milk are to be sent to a shop, and each box contains 10 bottles of milk. A cluster random sample of size n = 50 bottles of milk is to be obtained.

    Solution :
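    One possible way to carry out Example 1.6 is sketched below in Python. Each box is treated as a cluster. The split into 10 boxes with 5 bottles taken from each is an assumption made for this illustration; other splits that give 50 bottles (for example, 5 boxes with all 10 bottles) follow the same two-stage idea.

```python
import random

BOXES = 500             # clusters
BOTTLES_PER_BOX = 10    # elements in each cluster
BOXES_TO_PICK = 10      # assumed number of clusters to select
BOTTLES_PER_PICK = 5    # assumed number of bottles taken from each selected box

rng = random.Random(3)  # fixed seed, for repeatability only

# Stage 1: random sample of clusters (boxes).
chosen_boxes = rng.sample(range(1, BOXES + 1), BOXES_TO_PICK)

# Stage 2: random sample of elements (bottles) within each chosen box.
sample = [
    (box, bottle)
    for box in chosen_boxes
    for bottle in rng.sample(range(1, BOTTLES_PER_BOX + 1), BOTTLES_PER_PICK)
]
print(len(sample), "bottles from boxes", sorted(chosen_boxes))
```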

    1.5 Nonrandom sample

    In a nonrandom sample, some elements of the population may have no chance of being included in the sample.

    Three types of nonrandom samples are:
    (i) Convenience sample
    A sample that includes the most accessible members of the population.

    (ii) Judgment sample
    A sample that includes the members which are selected from the population based on the judgment or prior knowledge of an expert or experimenter.

    (iii) Quota sample
    A sample selected in such a way that each group or subpopulation is represented in the sample in exactly the same proportion as in the target population.

    Chance error and systematic error


    Chance Error or Sampling Error
    - The difference between the result obtained from a sample survey and the result that would have been obtained from a census.
    - It occurs because of chance, and it cannot be avoided. A sampling error can occur only in a sample survey. It does not occur in a census.
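    The idea of sampling error can be illustrated with a small simulation in Python. The population below is simulated (1,000 hypothetical student heights) and is not data from the notes; the sketch only shows that the sampling error is the sample result minus the result a census would give.

```python
import random
import statistics

rng = random.Random(42)                 # fixed seed, for repeatability only

# Hypothetical population: heights in cm of 1,000 students (simulated).
population = [rng.gauss(165, 8) for _ in range(1000)]

census_mean = statistics.mean(population)   # result a census would give
sample = rng.sample(population, 50)         # simple random sample of size 50
sample_mean = statistics.mean(sample)       # result of the sample survey

sampling_error = sample_mean - census_mean
print(f"census mean    = {census_mean:.2f}")
print(f"sample mean    = {sample_mean:.2f}")
print(f"sampling error = {sampling_error:.2f}")
```

    Drawing a different sample (for example, by changing the seed) gives a different sampling error, which is why it is attributed to chance.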

    Nonsampling or Systematic Error
    - The errors that occur in the collection, recording, and tabulation of data.
    - It happens because of human mistakes and not chance. It can be minimized if questions are prepared carefully and data are handled cautiously.

    (1) Selection error
    - Occurs because the sampling frame is not representative of the population.
    - Examples: selecting a sample by using a telephone directory; a survey conducted by a magazine that includes only its own readers.

    (2) Response error
    - Occurs when people included in the survey do not provide correct answers.
    - Examples: respondents may not disclose their true incomes; for questions about race relations, the answers given may differ depending on the race of the interviewer.

    (3) Nonresponse error
    - Occurs because many of the people included in the sample do not respond to the survey (the response rate for mailed questionnaires is usually low, about 30%).

    (4) Voluntary response error
    - Occurs when a survey is not conducted on a randomly selected sample but people are invited to respond voluntarily to the survey (for example, a questionnaire published in a magazine or newspaper).

    [Diagram: Types of errors. Sampling error (chance error); nonsampling error (systematic error), which comprises selection error, response error, nonresponse error and voluntary response error.]

    Basic Terms

    Element or Member
    An element or member of a sample or population is a specific subject or object about which the information is collected.

    Variable
    A variable is a characteristic under study that assumes different values for different elements.

    Observation or Measurement
    An observation or a measurement is the value of a variable for an element.

    Data Set
    A data set is a collection of observations on one or more variables.

    Give an example


    1.7 Type of variables and data

    1. Quantitative variable
    - A variable whose values can be measured numerically.
    - The data collected on a quantitative variable are called quantitative data.
    - Examples : incomes, heights, sales, number of accidents etc.
    - Quantitative variables can be classified as discrete variables or continuous variables.

    Discrete variable
    (i) A variable whose values are countable and usually integer-valued.
    (ii) It can assume only certain values with no intermediate values.
    Example : The number of cars sold in one month etc.

    Continuous variable
    (i) A variable whose values cannot be measured exactly.
    (ii) The precision depends on the measuring instrument.
    (iii) It can assume any numerical value over a certain interval or intervals.
    Example : The height of students etc.

    2. Qualitative or Categorical variable
    (i) A variable that cannot assume a numerical value but can be classified or ranked into two or more nonnumeric categories.
    (ii) The data collected on such a variable are called qualitative data.
    Example : Beauty, intelligence, gender, etc.

    3. Cross-Section Data versus Time-Series Data
    Based on the time over which the data are collected, data can be classified as either cross-section data or time-series data.


    Cross-section data
    Data collected on different elements at the same point in time or for the same period of time.
    Example : Total population of each state of Malaysia in year 2000.

    Time-series data
    Data collected on the same element for the same variable at different points in time or for different periods of time.
    Example : New life insurance policies purchased between 1995 and 2000.

    Example:
    The following table lists the crude oil reserves (in billions of barrels) for six countries with the largest reserves as of June 2004.

    Country                 Oil Reserves
    Saudi Arabia            261.7
    Iraq                    112.0
    Kuwait                  97.7
    Iran                    94.4
    United Arab Emirates    80.3
    Venezuela               64.0
