statistical concepts: introduction
DESCRIPTION
This deck contains some important concepts in statistics and methods of research. It is good material for beginners who plan to explore or write a thesis or dissertation.
TRANSCRIPT
IN STATISTICS
04/10/23 [email protected]
Think of these…
• Crime rate
• Unemployment figures
• 2010 BAR passing rate
• Mortality rates
• Gasoline prices
• Proportion of voters favoring a candidate
• Enrolment trend
• Drop-out rate
• Number of accidents per year
• Annual growth rate
• Monthly income
• Annual budget
• Shooting average
• Registered vehicles annually
• Ratio of male teachers to female teachers
• Average life span
Numerical descriptions…
Statistics
ESTIMATES
PREDICTIONS
DECISIONS
Statistics is a branch of mathematics that deals
with the methods of collection, presentation,
analysis and interpretation of data.
NATURE OF STATISTICS
Descriptive Statistics – It is concerned with the gathering, classification, and presentation of data, and with summarizing the values to describe the characteristics of the group.
Inferential Statistics – It pertains to the methods for making inferences, estimates, or predictions about a large set of data (the population) using information gathered from a sample.
• Population refers to a group or aggregate of people, animals, subjects, materials, events, or things of any form.
• Samples are elements of the population selected through a process. They have the same characteristics as the population.
• Parameter – It is a descriptive measure of the population. Greek letters are used to represent parameters, e.g. population mean μ, population standard deviation σ, etc.
• Statistic – It is a descriptive measure of the sample. Roman letters are used for statistics, e.g. sample mean x̄, sample standard deviation s, etc.
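The difference between a parameter and a statistic can be sketched in a few lines of Python. The population below is made up for illustration (randomly generated ages), and the names mu, sigma, x_bar and s are chosen here only to mirror the symbols above.

```python
import random
import statistics

# Hypothetical population: ages of 1,000 people (illustrative values only).
random.seed(1)
population = [random.randint(15, 60) for _ in range(1000)]

# Parameters describe the whole population (Greek letters).
mu = statistics.mean(population)        # population mean
sigma = statistics.pstdev(population)   # population standard deviation

# Statistics describe a sample drawn from the population (Roman letters).
sample = random.sample(population, 50)
x_bar = statistics.mean(sample)         # sample mean
s = statistics.stdev(sample)            # sample standard deviation (n - 1 divisor)

print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")
print(f"x_bar = {x_bar:.2f}, s = {s:.2f}")
```

The sample values will usually be close to, but not exactly equal to, the population values, which is why a statistic is treated as an estimate of a parameter.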
Data
Data are any bits or collection of information, ideas, figures or concepts.
• Raw Data
• Grouped Data
• Primary Data
• Secondary Data
Try asking a Fourth Year student to give you his or her age, date of birth, ethnic group, religion, birth order, father's occupation, mother's occupation, parents' educational background, place of birth, ambition, favorite subject, most liked Grade school teacher and hobbies. Any information the student gives you is basically RAW DATA.
Grouped Data – data placed in tabular form, characterized by categories or class intervals with the corresponding frequencies.

Ethnic Groups    Frequency
Ilongo           24
Ilocano          56
Cebuano          78
Tagalog          52
Bicolano         9
Maguindanaon     23
Maranao          21
Total            263

English Grades (class intervals)    Frequency
75 – 79                             4
80 – 84                             16
85 – 89                             27
90 – 94                             5
95 – 99                             2
Total                               54

Age Bracket (class intervals)    Frequency
10 – 19                          40
20 – 29                          26
30 – 39                          17
40 – 49                          52
50 – 59                          20
Total                            155
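As a rough sketch of how raw data become grouped data, the Python snippet below counts raw values into class intervals of equal width. The function name group_into_classes and the raw grades are hypothetical and are not taken from the tables above; the sketch also assumes whole-number values.

```python
from collections import Counter

def group_into_classes(values, start, width):
    """Count how many whole-number values fall in each class interval."""
    # Class index 0 covers start .. start + width - 1, index 1 the next interval, etc.
    counts = Counter((v - start) // width for v in values)
    table = []
    for k in sorted(counts):
        lower = start + k * width
        upper = lower + width - 1
        table.append((f"{lower} - {upper}", counts[k]))
    return table

# Hypothetical raw grades (illustrative only).
raw_grades = [75, 78, 81, 83, 84, 86, 88, 89, 91, 95]
grouped = group_into_classes(raw_grades, start=75, width=5)

for interval, freq in grouped:
    print(f"{interval:10} {freq}")
print(f"{'Total':10} {sum(freq for _, freq in grouped)}")
```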
Primary Data – data measured and gathered by the researcher who published them.
Example: You submit statistical data to your professor on the educational profile of the teachers in your school, which you yourself gathered through interviews.

Table 1. Educational Profile of Teachers in Balintong Elementary School, SY 2012-2013
Educational Attainment      Percentage
BSED                        13%
BEED                        26%
AB w/ Educ Units            10%
BEED w/ MA units            45%
Master's Degree Holder      3%
MA w/ doctoral units        3%
Total                       100%
Secondary Data – data republished by another researcher or agency.

Table 4. Personnel Capability of the Philippine Navy Affiliated Reserve Units (PNARUs)
PNARUs           Officers and EPs    Percentage
NARF             5,622               53%
4TH NCRes Bn     268                 3%
30TH NARG        1,107               10%
502ND NRS        199                 2%
503RD NRS        125                 1%
705TH NRS        1,667               16%
706TH NRS        1,561               15%
Total            10,549              100%
Source: NAVRESCOM, 2010

These data were lifted from an original source by Col Robles (2011) and aptly included in his study on PNARUs.
Table 6. Monthly Income of the Parents of the Senior High School Students in Arakan Valley, Division of Cotabato, SY 2010-2011
Monthly Income             Percent
below 7,500.00             47.60
7,501.00 - 10,000.00       18.80
10,001.00 - 12,500.00      14.70
12,501.00 - 15,000.00      5.80
above 15,000.00            13.10
Total                      100.00
Source: Alpajando, 2011

Secondary Data
If these data are used in another study, they become secondary data.
• A variable is a characteristic or attribute of the experimental unit (persons, units or objects) which assumes different values or labels.
• The process of assigning a value or label to a particular experimental unit is called measurement.
Quantitative Variables – When measured from the experimental units, they yield numerical responses.
Examples: height, age, income, family size
Age – 15, 18, 29, 45, 54, 60
Family size – 2, 4, 5, 8
Height – 150 cm, 164 cm
• Discrete Variables
• Continuous Variables
Discrete variables assume a finite or countably infinite number of values, such as 0, 1, 2, 3, etc.
Ex: number of students
population of teachers
score in a test
female Senators
Continuous variables cannot be counted one by one. Their values correspond to points on an interval of the real line, so between any two values there is always another possible value.
Ex: Height – 23.3 cm, 23.456 m, 123.8 ft
Mass – 28.56 kg, 8.36 lb
• Nominal
• Ordinal
• Interval
• Ratio
Nominal Level is the crudest form of measurement. The numbers or symbols are used only for the purpose of categorizing items into groups. The categories are mutually exclusive, that is, being in one category automatically excludes being in another.
Ex: Gender (F – Female; M – Male)
Faculty (1 – Tenured; 0 – Non-tenured)
Response (1 – Yes; 0 – No)
Ordinal Level is an improvement on the nominal level because data are ranked from "bottom to top" or from "low to high". Statements such as "greater than" or "less than" may be used at this level.
Examples:
Student Attitude
1 – Strongly Disagree
2 – Slightly Disagree
3 – Disagree
4 – Moderately Agree
5 – Strongly Agree
Administrative Performance
• Excellent - 1
• Very Satisfactory - 2
• Good - 3
• Fair - 4
• Poor - 5
Interval Level possesses the properties of both the nominal and ordinal levels. The distances between any two numbers on the scale are known, but the scale does not have a stable starting point (an absolute zero).
Ex: temperature
Ratio Level possesses all the properties of the nominal, ordinal and interval levels. In addition, it has an absolute zero point, and data can be classified and placed in proper order to compare their magnitudes. Zero stands for the absence of something, or absolutely nothing.
Ex: grades, income, tuition fees
Sampling techniques are used to economize, on the part of the researcher, the following:
• Time
• Effort
• Money
Sampling techniques are classified into:
• probability sampling
• non-probability sampling
PROBABILITY SAMPLING
It is a method of selecting a sample (n) from a universe (N) such that each member of the population has an equal chance of being included in the sample and all possible combinations of size (n) have an equal chance of being chosen as the sample.
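The equal-chance idea can be pictured as a lottery draw. The sketch below uses Python's standard random.sample, which draws without replacement so that every subset of size n is equally likely. The numbered population of 100 members and the sample size of 25 are assumptions made for illustration.

```python
import random

# Hypothetical population: N = 100 members numbered 1 to 100.
population = list(range(1, 101))

# Draw n = 25 members without replacement, like picking names from a box.
sample = random.sample(population, 25)

print(sorted(sample))
```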
NON-PROBABILITY SAMPLING
It is a method wherein the manner of selecting a sample (n) from a universe (N) depends on some inclusion rule specified by the researcher.
PROBABILITY SAMPLING TECHNIQUES
• Simple Random (Lottery) Sampling
• Systematic Sampling
• Stratified Sampling
• Cluster or Area Sampling
• Multi-stage Sampling
Systematic Sampling
This method still uses the concept of random sampling and involves the selection of every kth element of a series representing the population, where k = N/n.
Ex: N = 100, n = 25
k = N/n = 100/25 = 4
• This means every 4th element in the series should be taken as part of the sample.
1 11 21 31 41 51 61 71 81 91
2 12 22 32 42 52 62 72 82 92
3 13 23 33 43 53 63 73 83 93
4 14 24 34 44 54 64 74 84 94
5 15 25 35 45 55 65 75 85 95
6 16 26 36 46 56 66 76 86 96
7 17 27 37 47 57 67 77 87 97
8 18 28 38 48 58 68 78 88 98
9 19 29 39 49 59 69 79 89 99
10 20 30 40 50 60 70 80 90 100
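The selection from the numbered list above can be sketched in Python as follows. The function name systematic_sample is hypothetical, and choosing a random starting point within the first k elements is an assumption about how the random element enters the method.

```python
import random

def systematic_sample(population, n, rng=random):
    """Take every k-th element, with k = N // n, from a random start."""
    k = len(population) // n
    start = rng.randrange(k)            # random start among the first k positions
    return population[start::k][:n]

population = list(range(1, 101))        # elements numbered 1 to 100
sample = systematic_sample(population, n=25)

print(sample)
```

With N = 100 and n = 25 the interval is 4, so a start at element 3, for instance, yields 3, 7, 11, and so on up to 99.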
Stratified Sampling
This is a random sampling technique in which the population is divided into non-overlapping subpopulations called strata.

STRATIFIED SAMPLES
Respondents       n
Administrators    10
Teachers          50
Students          100
Parents           50

Gender    n
Female    170
Male      250

Schools                   n
Public                    20
Private non-sectarian     10
Private sectarian         10
Cluster or Area Sampling
This is a random sampling technique in which the population is divided into non-overlapping clusters or areas.
Ex: barangays in a municipality, municipalities in a province
MULTI-STAGE SAMPLING
A technique that considers different stages or phases in sampling.
Ex:
Region – 1st level
Province – 2nd level
City – 3rd level
Barangay – 4th level
NON-PROBABILITY SAMPLING TECHNIQUES
• Purposive Sampling
It is based on criteria or qualifications given by the researcher. Those who satisfy the criteria are included.
• Quota Sampling
It is quick and cheap since the interviewer is given definite instructions and a quota for the section of the population he is to work on. The final choice of the actual persons is left to his preference.
• Convenience Sampling
It uses an instrument or piece of equipment that provides convenience, such as a telephone or handset, to pick the sample units. This means that people with no telephones are given no chance at all of being selected.
How many samples do we need for our study?
Is this number enough for the study?
Will it give a valid result for the study?
This equation is commonly used by statisticians to determine the sample size when the population is 500 or more.
\[ n = \frac{N}{1 + N e^{2}} \]
where
n = the desired number of samples
N = total population
e = sampling error
e = 0.05, 0.02 or 0.01 (arbitrary)
Case 1:
A study is to be conducted in a big School Division of 25,000 students. Determine the appropriate sample size using a 5% sampling error.
Solution:
\[ n = \frac{N}{1 + N e^{2}} = \frac{25{,}000}{1 + (0.05)^{2}(25{,}000)} = \frac{25{,}000}{63.5} = 393.7 \approx 394 \text{ students} \]
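Case 1 can be checked with a short Python function. The function name sample_size is hypothetical, and rounding up to the next whole respondent is an assumption that matches the 394 obtained above.

```python
import math

def sample_size(N, e):
    """Sample size n = N / (1 + N * e**2), rounded up to a whole respondent."""
    return math.ceil(N / (1 + N * e ** 2))

# Case 1: N = 25,000 students, 5% sampling error.
n = sample_size(25_000, 0.05)
print(n)
```

Trying smaller values of e with the same N shows how quickly the required sample grows as the allowed sampling error shrinks.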
Descriptive Research – 10% of the population (20% for smaller N)
Correlational Research – 30 subjects
Ex-post Facto Research – 15 per group
Experimental Research – 15 subjects per group
\[ n = \frac{N Z_{\alpha/2}^{2}\, p(1-p)}{N e^{2} + Z_{\alpha/2}^{2}\, p(1-p)} \]
where
N = population
n = desired sample size
p = largest possible proportion (0.50)
e = sampling error
e = 0.01 for 99% confidence level
e = 0.05 for 95% confidence level
e = 0.10 for 90% confidence level
Zα/2 is the confidence level value
At 99% confidence level, Zα/2 = 2.58
At 95% confidence level, Zα/2 = 1.96
At 90% confidence level, Zα/2 = 1.65
Ex: N = 1000 at the 95% confidence level (e = 0.05)
\[ n = \frac{1000\,(1.96)^{2}\,[0.50(1-0.50)]}{1000\,(0.05)^{2} + (1.96)^{2}\,[0.50(1-0.50)]} = \frac{960.4}{3.4604} = 277.54 \approx 278 \]
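The worked example with N = 1000 can be reproduced with a small Python sketch. The function name sample_size_proportion and its default arguments are hypothetical; the arithmetic follows the numbers in the example (Z = 1.96, p = 0.50, e = 0.05).

```python
import math

def sample_size_proportion(N, z, p=0.50, e=0.05):
    """n = N z^2 p(1-p) / (N e^2 + z^2 p(1-p)) for a known population N."""
    spread = p * (1 - p)                 # largest when p = 0.50
    return (N * z ** 2 * spread) / (N * e ** 2 + z ** 2 * spread)

n = sample_size_proportion(N=1000, z=1.96)
print(round(n, 2))      # unrounded result of the formula
print(math.ceil(n))     # rounded up to a whole respondent
```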
SAMPLE SIZE FROM THE ESTIMATION OF μ
This can be used when the population is not known.
where
Zα/2 is the confidence level value
At 99% confidence level, Zα/2 = 2.58
At 95% confidence level, Zα/2 = 1.96
At 90% confidence level, Zα/2 = 1.65
E = allowable error (±E) in the estimate of the true value of μ
n = desired sample size
PROPORTIONAL ALLOCATION
\[ n_i = \frac{N_i}{N} \times n, \quad i = 1, 2, 3, \ldots \]
where
n = the total size of the stratified random sample
N = total population
N1 = number of the 1st stratum elements
N2 = number of the 2nd stratum elements
N3 = number of the 3rd stratum elements

Ex: n = 286 (desired samples)
Strata        Population (N)
Seniors       119
Juniors       210
Sophomores    325
Freshmen      346
Total         1000

n1 = [119/1000](286) = 34 (seniors)
n2 = [210/1000](286) = 60 (juniors)
And so on for n3 and n4.
PROPORTIONAL ALLOCATION
Strata        Population (N)    Sample (n)
Seniors       119               34
Juniors       210               60
Sophomores    325               93
Freshmen      346               99
Total         1000              286
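The allocation in the table can be reproduced with a short Python sketch. The function name proportional_allocation is hypothetical, and rounding each share to the nearest whole number is an assumption; with other data the rounded shares may not add up exactly to n and may need a small manual adjustment.

```python
def proportional_allocation(strata, n):
    """Give each stratum the share n_i = (N_i / N) * n, rounded to a whole number."""
    N = sum(strata.values())
    return {name: round(N_i / N * n) for name, N_i in strata.items()}

strata = {"Seniors": 119, "Juniors": 210, "Sophomores": 325, "Freshmen": 346}
allocation = proportional_allocation(strata, n=286)

for name, n_i in allocation.items():
    print(f"{name:12} {strata[name]:>5} {n_i:>5}")
print(f"{'Total':12} {sum(strata.values()):>5} {sum(allocation.values()):>5}")
```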
The choice of the appropriate method for gathering data depends mainly on several factors. These include:
• the nature of the problem
• the population under investigation
• the time
• the material factors
• Direct or Interview Method
• Indirect or Questionnaire Method
• Registration Method
Other Methods
• Observation
• Phone interview
• Experiments
Direct or Interview Method
Indirect or Questionnaire Method
It is one of the easiest methods of data gathering.
It takes time to prepare because questionnaires need to be attractive.
The content of a typical questionnaire, directions included, must be precise, clear and self-explanatory.
Registration Method
Examples:
• Marriage registration
• Birth certificates
• Vehicle registrations
• Firearms licenses, etc.
Observation
It is utilized to gather data regarding attitudes, behavior, values, and cultural patterns of the samples under investigation.
Phone Interview
It is employed if the questions to be asked are brief and few.
Experiments
It is applied to collect or gather data if the investigator wants to control the factors affecting the variable being studied.
Data need to be organized to show important properties that may help in the analysis and interpretation.
TEXTUAL PRESENTATION
• In this form, the presentation is in narrative or paragraph mode.
• The data are within the text of the paragraph.
• In most cases, it may not capture the immediate interest of the reader, but it can present a more comprehensive picture of the data because of its written explanation.
• The data show the grades of a student in the First Quarter. As indicated, he got an excellent grade in Values Education (94). He achieved the same level of performance in both Filipino and English (90). As also shown, he gained fair performance in Science and Social Studies, where he got 89 and 86, respectively. His grade of 80 in Math suggests that he finds it a difficult subject.
TABULAR PRESENTATION
• In this form, the presentation makes use of rows and columns, like a frequency table or distribution.
• The data are presented in a systematic and orderly manner, which catches one's attention and may facilitate the comprehension and analysis of the data presented.

ILLUSTRATIVE EXAMPLE
Subject Areas        First Quarter Grades
Math                 80
English              90
Science              89
Social Studies       86
Filipino             90
Values Education     94
Average              88.17

Gender    Frequency    Percent
Male      20           40%
Female    30           60%
Total     50           100%
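The summary rows of the two tables (the average and the percent column) can be computed as in the Python sketch below. The variable names are chosen for illustration; the values are the ones shown in the illustrative example.

```python
# First Quarter grades from the illustrative example.
grades = {"Math": 80, "English": 90, "Science": 89,
          "Social Studies": 86, "Filipino": 90, "Values Education": 94}
average = round(sum(grades.values()) / len(grades), 2)

# Gender frequencies from the second table.
frequency = {"Male": 20, "Female": 30}
total = sum(frequency.values())
percent = {label: 100 * count / total for label, count in frequency.items()}

print(f"Average grade: {average}")
print(f"{'Gender':8}{'Frequency':>10}{'Percent':>9}")
for label, count in frequency.items():
    print(f"{label:8}{count:>10}{percent[label]:>8.0f}%")
print(f"{'Total':8}{total:>10}{100:>8}%")
```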
GRAPHICAL PRESENTATION
• In this form, the numerical data in a frequency distribution can be made more interesting and easier to understand when presented in pictures or geometrical representations.
GRAPHICAL PRESENTATION (Pie Graph)
GRAPHICAL PRESENTATION (Cylindrical Graph)
Figure 1. The Ethnic Profile of PhD Students in SKSU Graduate Studies Program i
Table 1. The Ethnic Profile of PhD Students at SKSU Graduate Extension Program in Iloilo City
Ethnic Groups (category or label)    Frequency
Ilongo                               20
Bicolano                             5
Tagalog                              2
Ilocano                              3
Total                                30