Statistics and Probability

STATISTICS - deals with the collection, organization, presentation, analysis and interpretation of numerical data used as information for decision making.

Descriptive Statistics - is the branch of statistics concerned with the methods of collecting and describing data, without any generalization. This includes anything done to the data that is designed to summarize or describe it without attempting to infer anything that goes beyond it.

Inferential Statistics - uses data from a sample to make generalizations, predictions, estimations or approximations about the population in the face of uncertainty.

Methods of Collecting Data

1. Direct method - data are collected through interviews. The enumerator talks to the subject personally and gets the data through a series of questions asked of the subject.

2. Indirect method - data are collected through the use of questionnaires.

3. Observation - information is gathered by recording the behavior, attitude, or attribute of items, persons, or groups of items or persons at the time of occurrence.

4. Experimentation - data are usually gathered through experiments in laboratories and classrooms.

5. Registration - data are acquired from private and government agencies such as the National Statistics Office, the Bangko Sentral ng Pilipinas, the Department of Finance, etc.

Ways of Presenting Data

1. Textual form - data and information are presented in paragraph or narrative form.

2. Tabular form - quantitative data are summarized in rows and columns.

3. Graphical form - data are presented in charts, graphs or pictures.

Year Level      Number of Students
First Year      35
Second Year     50
Third Year      48
Fourth Year     24

Population and Sample

Population - is the set of all data that characterizes some phenomenon of interest, that is, the totality or collection of all elements to be studied. The population is also called the universal set.

Sample - is a representative portion of the population.

Census - is the process of gathering information from every unit in the population.

Survey - is the process of gathering information from a representative portion (a sample) of the population.

Variable - is a characteristic that changes or varies over time and across the different individuals or objects under consideration.

Quantitative and Qualitative Variable

Qualitative variable - measures a quality or characteristic of each individual or object.

Quantitative variable - measures a numerical amount or quantity on each individual or object.

Discrete and Continuous Variable

Discrete variable - can assume only a finite or countable number of values.

Continuous variable - can assume infinitely many values, corresponding to the points on a line interval.

Measurement Scales

1.) Nominal level - the first level of measurement, consisting of names, labels or categories only, on which no order or ranking can be imposed. Example: Gender (male and female), Marital Status (single, married, separated), Employment (business, construction, engineering, education, etc.)

2.) Ordinal level - data can be ordered or ranked, but precise differences between values do not exist. Example: Income Distribution (low income, middle income and upper income), Body Build (small, medium, large)

3.) Interval level - consists of data that may be ordered and for which meaningful differences between data values can be determined; however, there is no meaningful zero. Example: Temperature (in degrees Celsius or Fahrenheit), score in a particular examination

4.) Ratio level - consists of data that may be ordered, for which meaningful differences between data values can be determined, and for which ratios between data values are meaningful because there is a true zero. Example: Weight, Height, Age

Determine which level is most appropriate in measuring each of the following data.

1. SSS number

2. Weight of a package

3. Size of a family

4. T-shirt size (small, medium, large, extra large)

5. Religion

6. Speed of a car in km/hr

7. SASE rating

Two sources of errors

1. Sampling errors

2. Non-sampling errors

Sampling errors - result from the actual sampling process, such as the sampling technique used, a small sample size, and the fact that no sample can be expected to be a perfect representation of the entire population.

Non-sampling errors - arise from other external factors not related to sampling, such as a defective measuring instrument, missing values, errors in coding or recording data, or a discovered bias in the sample.

Sample design - is a definite plan, determined before any data are actually collected, for obtaining a sample from a given population.

Methods of Sampling

1. Non-probability Sampling - is a procedure of sampling wherein some elements of the population have no possibility of being drawn into the sample.

2. Probability Sampling - is a process of sampling wherein each element in the population and each possible sample has a nonzero probability of selection.

Non-Probability Sampling

1.) Purposive

2.) Judgemental

3.) Quota

4.) Convenience

5.) Snowball or Chain

Purposive - select a sample that agrees with the profile of the population based on some pre-selected characteristics.

Judgemental - select a sample on the basis of an "expert's" opinion, or on the judgement of the person or people drawing the sample.

Quota - select a specified number of units possessing certain characteristics, with the actual selection being left to the researcher's discretion.

Convenience - sometimes called accidental, grab or opportunity sampling. Use results that are readily available.

Snowball or Chain - select a sample where existing study subjects are used to recruit more subjects into the sample.

Probability Sampling

Random sampling - the process of selecting a random sample.

Simple random sampling - the most frequently used and simplest probability sampling procedure. In simple random sampling, all possible samples of the same size are equally likely.

Steps in Generating a Simple Random Sample:

1. Number the elements of the population from 1 to N.

2. Select n numbers from 1 to N using a random process, recording each number to identify the corresponding population element to be included in the sample.
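The two steps above can be sketched in Python. This is a minimal illustration, not a prescribed procedure: the function name `simple_random_sample`, the `seed` parameter, and the values N = 157 and n = 10 are assumptions chosen for the example. The standard library's `random.sample` draws without replacement, so every subset of size n is equally likely.

```python
import random

def simple_random_sample(N, n, seed=None):
    """Return n distinct element numbers drawn at random from 1..N."""
    rng = random.Random(seed)  # seed is only for reproducibility (assumption)
    # Step 1: the population elements are numbered 1 to N.
    numbered_population = range(1, N + 1)
    # Step 2: select n numbers at random, without replacement.
    return sorted(rng.sample(numbered_population, n))

# Hypothetical example: 10 students out of a population of 157.
sample_ids = simple_random_sample(N=157, n=10, seed=1)
print(sample_ids)
```

The printed numbers identify which population elements enter the sample.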

Systematic random sampling - every kth element of the population is selected, with the first unit being selected at random.

Steps in Generating a Systematic Random Sample

1. Number the elements of the population from 1 to N.

2. Determine the sampling interval k by $k = N/n$.

3. Select at random the first element (random start) r of the sample from the first k elements of the population. That is, $1 \le r \le k$.

4. Take every kth element from the random start as part of the sample.

5. Continue the process until the required number of sample elements is acquired.
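The systematic procedure can be sketched as follows. This is an illustration under stated assumptions: the sampling interval is taken as the integer part of N/n (so the sample never runs past element N), and the function name `systematic_sample`, the `seed` parameter, and the values N = 100 and n = 10 are hypothetical.

```python
import random

def systematic_sample(N, n, seed=None):
    """Return n element numbers from 1..N taken at a fixed interval k."""
    rng = random.Random(seed)
    k = N // n             # sampling interval (integer part of N/n, an assumption)
    r = rng.randint(1, k)  # random start, 1 <= r <= k
    # Take every k-th element from the random start until n elements are chosen.
    return [r + i * k for i in range(n)]

ids = systematic_sample(N=100, n=10, seed=1)
print(ids)
```

Only the first element is chosen at random; the random start then fixes the rest of the sample.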

Stratified random sampling

Steps in generating a stratified random sample of size n from a population of N elements (Proportional and Optimum Allocation)

1. Classify the population into homogeneous strata.

2. Draw a sample from each homogeneous stratum.

3. Under proportional allocation, the sample size $n_i$ of the stratum of size $N_i$, from a population of size N, is: $n_i = \frac{N_i}{N}\, n$
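A short sketch of proportional allocation, under stated assumptions: the four year levels from the table earlier in the deck (35, 50, 48 and 24 students) are used as strata, the total sample size n = 40 is hypothetical, and each stratum size is rounded to the nearest whole number. With other inputs the rounded sizes may not add up exactly to n and would need a manual adjustment. The function names and student labels are illustrative only.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Sample size per stratum: n_i = (N_i / N) * n, rounded to a whole number."""
    N = sum(stratum_sizes.values())
    return {name: round(n * N_i / N) for name, N_i in stratum_sizes.items()}

def stratified_sample(strata, n, seed=None):
    """Draw a simple random sample of the allocated size from every stratum."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    allocation = proportional_allocation(sizes, n)
    return {name: rng.sample(members, allocation[name])
            for name, members in strata.items()}

year_levels = {"First Year": 35, "Second Year": 50, "Third Year": 48, "Fourth Year": 24}
allocation = proportional_allocation(year_levels, n=40)
print(allocation)
# {'First Year': 9, 'Second Year': 13, 'Third Year': 12, 'Fourth Year': 6}

# Hypothetical student labels, only to show the drawing step.
strata = {name: [f"{name} #{i}" for i in range(1, size + 1)]
          for name, size in year_levels.items()}
chosen = stratified_sample(strata, n=40, seed=1)
```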

Cluster random sampling - one or more partitions (clusters) are selected at random, and random samples of elements are drawn from each of the selected partitions.

Steps in generating a cluster random sample

1. Partition the population into clusters.

2. Select at random one or several clusters.
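A minimal sketch of cluster sampling, following the definition given for it (random clusters first, then random elements within each selected cluster). The five class sections of 20 students each, the function name `cluster_sample`, and its parameters are hypothetical, introduced only for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Pick clusters at random, then sample elements inside each chosen cluster."""
    rng = random.Random(seed)
    # Step 2: select at random one or several clusters.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Then draw a random sample of elements from each selected cluster.
    return {name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
            for name in chosen}

# Step 1: the population partitioned into clusters (hypothetical sections A to E).
sections = {f"Section {s}": [f"{s}{i:02d}" for i in range(1, 21)] for s in "ABCDE"}
picked = cluster_sample(sections, n_clusters=2, n_per_cluster=5, seed=1)
print(picked)
```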

CHAPTER II

DEFINITION

Raw Data - data collected in original form.

Frequency Distribution Table - is a summary of the distribution of observations in systematically organized rows and columns.

1. One-way Frequency Distribution - a tabular presentation where data are grouped or categorized into different classes and the number of observations that fall in each class is recorded.

2. Two-way Frequency Distribution - the data are grouped according to two variables. It is also called a cross-tabulation or a contingency table. Example: distribution of respondents according to Year Level and Gender.

Weights (in kg) of Statistics and Probability Students

63 59 43 60 41 53 56 81
50 66 62 52 49 48 52 40
64 64 47 53 47 54 62 56
65 53 50 47 79 70 45 47
46 58 56 55 56 45 73 49

Construction of Frequency Distribution

1. Find the range (R) of the raw data. The range is the difference between the largest value and the smallest value. That is,

$R = (\text{highest value}) - (\text{lowest value})$

2. Determine the number of class intervals, k:

$k = \sqrt{n}$ or $k = 1 + 3.322 \log_{10} n$, where n = number of observations.

Note: round off k to the nearest whole number.

3. Determine the class size or class width, c. This is obtained by dividing the range of the raw data by the number of classes, $c = R/k$. The result is rounded up to the nearest higher value whose precision is the same as that of the raw data.


4. Construct the lower class limits and the upper class limits.

List the lower class limit (LL) of the first class. The starting lower limit could be the lowest value or any smaller number close to it. List the lower limits of the succeeding classes by simply adding c (the class width) to the lower limit of the preceding class.

The upper limit (UL) of the first class can then be obtained by subtracting one unit of measure from the lower limit of the next class. The upper limits of the rest of the classes can then be obtained in a similar fashion, or by adding c to the upper limit of the preceding class.


5. Tally the frequencies for each class constructed.

a. Class Boundaries (CB) - if the data are continuous, the CBs reflect the continuous property of the data. The CBs are obtained by taking the midpoints of the gaps between classes.

$LCB = LL - \frac{1}{2}(\text{one unit of measure})$, $UCB = UL + \frac{1}{2}(\text{one unit of measure})$

b. Class Mark (CM) - is the midpoint of a class or interval: $CM = \frac{LL + UL}{2}$ or $CM = \frac{LCB + UCB}{2}$

c. Relative frequency - is the frequency of a class expressed in proportion to the total number of observations, $f/n$.

d. Cumulative Frequency (CF) - is the accumulated frequency of a class. It is the total number of observations whose values do not exceed the upper limit or boundary of the class.
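The construction steps can be traced on the 40 student weights listed earlier. The sketch below is one possible reading, with its assumptions stated: the number of classes is taken from the square-root rule ($k \approx \sqrt{n}$), the first lower limit is the lowest value, and the unit of measure is 1 kg. Variable names are illustrative.

```python
import math

weights = [
    63, 59, 43, 60, 41, 53, 56, 81,
    50, 66, 62, 52, 49, 48, 52, 40,
    64, 64, 47, 53, 47, 54, 62, 56,
    65, 53, 50, 47, 79, 70, 45, 47,
    46, 58, 56, 55, 56, 45, 73, 49,
]

n = len(weights)
lowest, highest = min(weights), max(weights)
R = highest - lowest        # Step 1: range = 81 - 40 = 41
k = round(math.sqrt(n))     # Step 2: number of classes (square-root rule, assumed) = 6
c = math.ceil(R / k)        # Step 3: class width, rounded up to a whole kg = 7
unit = 1                    # unit of measure of the raw data (whole kilograms)

# If k classes of width c do not reach the highest value, add one more class.
num_classes = k if lowest + k * c - unit >= highest else k + 1

rows = []
cf = 0
for i in range(num_classes):
    LL = lowest + i * c     # Step 4: lower limit
    UL = LL + c - unit      # upper limit: one unit below the next lower limit
    f = sum(1 for w in weights if LL <= w <= UL)  # Step 5: tally
    cf += f
    rows.append({
        "limits": (LL, UL),
        "boundaries": (LL - unit / 2, UL + unit / 2),
        "class_mark": (LL + UL) / 2,
        "f": f,
        "rf": f / n,
        "cf": cf,
    })

for row in rows:
    print(row)
```

Under these assumptions the classes are 40-46, 47-53, 54-60, 61-67, 68-74 and 75-81, with frequencies 6, 14, 9, 7, 2 and 2.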

GRAPHICAL PRESENTATION

Types of Graphs

1. Bar Graph

2. Histogram

3. Frequency Ogive

4. Frequency Polygon

5. Pie Chart

Bar Graph - is a graph where the different classes are represented by rectangles or bars. The width of each rectangle is the length of the interval, represented by the class limits on the horizontal axis, or by the categories for nominal data. The length of the rectangle, corresponding to the class frequency, is drawn along the vertical axis.

Histogram - closely resembles the bar graph, with the basic difference that a bar graph uses the class limits for the horizontal axis while the histogram employs the class boundaries. Using the class boundaries eliminates the spaces between rectangles, thus giving the histogram a solid appearance.

Frequency Ogive - represents a cumulative frequency distribution. It is constructed by plotting the class boundaries on the horizontal scale and the cumulative frequencies less than the upper class boundaries on the vertical scale.

Frequency Polygon - is constructed by plotting the class marks against the frequencies. Straight lines then connect the set of points formed by the class marks and their corresponding frequencies, together with additional class marks at the beginning and the end of the distribution.

Pie Chart - is a circle divided into pie-shaped sections, which look like slices of a pizza. The angle of each sector is proportional to the frequency or percentage of its class; it is advisable to convert the frequencies into percentages first.
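As an illustration of how sector angles follow from frequencies, the sketch below converts the year-level table shown earlier (35, 50, 48 and 24 students) into percentages and angles. Each angle is the class's share of the total multiplied by 360 degrees. The variable names are illustrative.

```python
students = {"First Year": 35, "Second Year": 50, "Third Year": 48, "Fourth Year": 24}
total = sum(students.values())  # 157 students in all

# Share of each class as a percentage, and the matching sector angle in degrees.
percentages = {level: 100 * f / total for level, f in students.items()}
angles = {level: 360 * f / total for level, f in students.items()}

for level in students:
    print(f"{level}: {percentages[level]:.1f}%  {angles[level]:.1f} degrees")
```

The angles always add up to 360 degrees and the percentages to 100.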

Grouped Data vs. Ungrouped Data

Ungrouped Data - data are in their original form and structure.

Grouped Data - data are organized systematically into classes. The table that results from this procedure of organizing data is called a frequency distribution table (FDT).

MEASURES OF CENTRAL TENDENCY FOR UNGROUPED DATA

1. Arithmetic Mean - the sum of all the observations divided by the total number of observations.

2. Median - is the middlemost value when the observations are arranged in either ascending or descending order.

3. Mode - is the value in the data set that occurs most frequently.

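MEAN

For Population: If $x_1, x_2, \ldots, x_N$ are the individual scores in a population of size N, then the population mean is defined as:

$\mu = \frac{x_1 + x_2 + \cdots + x_N}{N} = \frac{1}{N}\sum_{i=1}^{N} x_i$

For Sample: If $x_1, x_2, \ldots, x_n$ are the individual scores in a sample of size n, then the sample mean is defined as:

$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$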

Example:

1.) Find the mean of the population

7 2 3 7 6 9 10 8 9 9 10

2.) Find the mean of the sample

0.14 0.16 0.12 0.17 0.21 0.17
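The two mean examples can be checked with a few lines of Python. The population and the sample use the same arithmetic (sum divided by count); only the symbol differs. The helper name `mean` and the variable names are illustrative.

```python
population = [7, 2, 3, 7, 6, 9, 10, 8, 9, 9, 10]
sample = [0.14, 0.16, 0.12, 0.17, 0.21, 0.17]

def mean(values):
    """Sum of all observations divided by the number of observations."""
    return sum(values) / len(values)

mu = mean(population)  # 80 / 11, about 7.27
x_bar = mean(sample)   # 0.97 / 6, about 0.162

print(round(mu, 2), round(x_bar, 3))
```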

MEDIAN

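Population median and Sample median

With the n observations arranged in order as $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$:

Median = $x_{((n+1)/2)}$, if n is odd

Median = $\frac{x_{(n/2)} + x_{(n/2+1)}}{2}$, if n is even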

Example:

1.) 7 2 3 7 6 9 10 8 9 9 10

2.) 0.14 0.16 0.14 0.17 0.21 0.18
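A small sketch of the median rule applied to the two examples, reading the second value of example 2 as 0.16. The function sorts the data, then takes the middle value when n is odd or the average of the two middle values when n is even. The function name is illustrative.

```python
def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # the ((n+1)/2)-th value, counting from 1
    return (ordered[mid - 1] + ordered[mid]) / 2  # mean of the (n/2)-th and (n/2+1)-th values

m1 = median([7, 2, 3, 7, 6, 9, 10, 8, 9, 9, 10])        # n = 11 (odd): 6th ordered value = 8
m2 = median([0.14, 0.16, 0.14, 0.17, 0.21, 0.18])       # n = 6 (even): (0.16 + 0.17) / 2, about 0.165
print(m1, round(m2, 3))
```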

MODE

Example:

1.) 9, 10, 10, 11, 12, 14, 16

2.) 51, 51, 54, 54, 57, 57, 58, 58, 59, 59, 60, 60

3.) 6.1, 6.2, 6.2, 6.2, 6.3, 6.4, 6.5, 6.6, 6.6, 6.6, 6.7, 6.8
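The three mode examples behave differently, which a short sketch makes visible. It counts each value with `collections.Counter` and returns every value that reaches the highest count. One convention is assumed here: when all values occur equally often (as in example 2), the data set is reported as having no mode. The function name `modes` is illustrative.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); an empty list means no mode."""
    counts = Counter(values)
    highest = max(counts.values())
    # Assumed convention: if every distinct value is equally frequent, there is no mode.
    if len(counts) > 1 and len(set(counts.values())) == 1:
        return []
    return sorted(v for v, count in counts.items() if count == highest)

ex1 = modes([9, 10, 10, 11, 12, 14, 16])
ex2 = modes([51, 51, 54, 54, 57, 57, 58, 58, 59, 59, 60, 60])
ex3 = modes([6.1, 6.2, 6.2, 6.2, 6.3, 6.4, 6.5, 6.6, 6.6, 6.6, 6.7, 6.8])
print(ex1, ex2, ex3)  # [10] [] [6.2, 6.6]
```

Example 1 has one mode (unimodal), example 2 has none under the assumed convention, and example 3 has two modes (bimodal).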

Thank You
