correlation formula

Upload: athan114

Post on 16-Nov-2014

Page 1: Correlation Formula

Correlation Formula

CORRELATION is a measure of the relationship between two variables.

The coefficient of correlation is used to determine the validity, reliability and objectivity of a prepared examination. It also indicates the amount of agreement or disagreement between groups of scores, measurements, or individuals.

Interpretation of Ranges

+0.00 to +0.20 – Slight correlation, almost negligible relationship
+0.21 to +0.40 – Slight correlation, definite but small relationship
+0.41 to +0.70 – Moderate correlation, substantial relationship
+0.71 to +0.90 – High correlation, marked relationship
+0.91 to +1.00 – Very high correlation, very dependable relationship

Coefficient of correlation by Spearman’s Formula:

$R = 1 - \frac{6 \sum G}{N^2 - 1}$

Where:
G = difference of the two ranked scores
N = number of scores

Procedure:
1. Write the scores or measures of the two variables under column x and column y.
2. Rank the scores under column x, with the highest score as rank 1 and the lowest score as rank N. Write the ranks of the scores under column Rx, which means rank of x.
3. Rank the scores under column y, with the highest score as rank 1 and the lowest score as rank N. Write the ranks under column Ry.
4. Subtract the Ry values from the Rx values. Write the difference under column G, which means gain. Consider only the positive values.

Coefficient of correlation by the Rank-Difference Method:

$\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$

Procedure:
1. Follow steps 1 to 3 of Spearman’s Formula.
2. Find the difference between the two sets of ranks under columns Rx and Ry. The sign of the difference does not matter, because it is squared in step 4.
3. Write the difference of Rx and Ry under column D, which means difference.
4. Square the difference D and write it under column D2.
5. Get the sum of the values under D2.

Coefficient of correlation by the Product-Moment Method:

$r_{xy} = \frac{\sum d_x d_y}{\sqrt{(\sum d_x^2)(\sum d_y^2)}}$

Procedure:
1. Get the total of the data under test x and test y and find the mean of x and the mean of y.
2. Get the deviations dx and dy by getting the difference between the mean and the scores.
3. Square dx to obtain d2x and dy to obtain d2y.
4. Get the summation of each.
5. Get the product of dx and dy to have dxdy.
6. Get the summation of dxdy.

Posted by Statistics_Lecture at 9:10 AM 0 comments Labels: correlation
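The three procedures above can be sketched in Python roughly as follows. This is only an illustration: the function names and the sample scores are made up for this sketch, and giving tied scores the average of their ranks is an assumption that the text does not discuss.

```python
from math import sqrt


def ranks(scores):
    """Rank scores with the highest score as rank 1.

    Assumption (not in the text): tied scores share the average of their ranks.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with the score at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_gain_r(x, y):
    """R = 1 - 6*sum(G) / (N^2 - 1), where G keeps only the positive Rx - Ry."""
    n = len(x)
    gains = [rx - ry for rx, ry in zip(ranks(x), ranks(y)) if rx - ry > 0]
    return 1 - 6 * sum(gains) / (n * n - 1)


def rank_difference_rho(x, y):
    """rho = 1 - 6*sum(D^2) / (N(N^2 - 1)), where D = Rx - Ry."""
    n = len(x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


def product_moment_r(x, y):
    """r_xy = sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2))."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [v - mean_x for v in x]  # deviations of x from its mean
    dy = [v - mean_y for v in y]  # deviations of y from its mean
    numerator = sum(a * b for a, b in zip(dx, dy))
    return numerator / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Hypothetical scores of five students on two tests.
x = [10, 20, 30, 40, 50]
y = [12, 25, 29, 48, 45]
print(spearman_gain_r(x, y))
print(rank_difference_rho(x, y))
print(product_moment_r(x, y))
```

For these sample scores only the last two ranks are swapped, so ΣG = 1 and ΣD2 = 2.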

Monday, February 2, 2009

Set Operations

1. With any two sets “A” and “B” there is associated a third set “C” satisfying the property that C = { x / x ∈ A ∨ x ∈ B }.

In words: “C” is the set of all x such that x belongs to “A” or x is an element of B.

“C” is called the union of “A” and “B”. We denote the set C symbolically as C = A ∪ B.

Example:
A = {3, 4, 5, 6, 7}
B = {2, 4, 6, 8, 10}
A ∪ B = {2, 3, 4, 5, 6, 7, 8, 10}

2. With any two sets A and B there is associated a third set “D” satisfying the property that D = { x / x ∈ A ∧ x ∈ B }.

In words: D is the set of all x such that x is an element of set “A” and x is a member of B.

“D” is called the intersection of sets A and B, and we denote the set D symbolically as D = A ∩ B.

3. With any two sets A and B there is associated a third set “C” satisfying the property that C = { x / x ∈ A ∧ x ∉ B }. We denote the set symbolically as C = A – B, and call C the relative complement or difference of A and B.

Example:
A = {a, b, c, d, e, f}
B = {a, e, i, o, u}
A – B = {b, c, d, f} and B – A = {i, o, u}

4. If A is a subset of U, then the set of all elements contained in U that are not elements of A is called the complement of A in U and is designated by Ă. Then Ă = { x / x ∈ U ∧ x ∉ A }.

Example: Consider the universal set of all counting numbers and the set A of counting numbers less than 100. Then

U = {1, 2, 3, 4, …}
A = {1, 2, 3, …, 99}

Ă = {100, 101, 102, …}

5. The set product or Cartesian product of two sets A and B is the set of all possible ordered pairs (a, b) where a is in A and b is in B. We symbolize this set of ordered pairs by A × B and write A × B = { (a, b) / a ∈ A ∧ b ∈ B }.

Example:

If A = {1, 2} and B = {x, y}, then A × B = {(1, x), (1, y), (2, x), (2, y)} and B × A = {(x, 1), (x, 2), (y, 1), (y, 2)}.
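The five set operations map directly onto Python's built-in set type, so a short sketch may help make them concrete. The finite universal set U used for the complement is an assumption of this example (the text's universal set of all counting numbers is infinite), and the variable names are illustrative only.

```python
from itertools import product

A = {3, 4, 5, 6, 7}
B = {2, 4, 6, 8, 10}

union = A | B          # x is in A or x is in B
intersection = A & B   # x is in A and x is in B
difference = A - B     # x is in A and x is not in B

# The complement needs a universal set. U is a hypothetical finite
# universal set chosen only for this sketch.
U = set(range(1, 11))
complement_of_A = U - A  # x is in U and x is not in A

# Cartesian product: all ordered pairs (a, b) with a in the first set
# and b in the second set.
pairs = set(product({1, 2}, {"x", "y"}))

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
print(sorted(complement_of_A))
print(sorted(pairs))
```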

Tuesday, January 27, 2009

Kinds of set

1. Finite set – a set whose elements can be counted, with the counting coming to an end.
Example: Sets A, B, C, D are finite sets

2. Infinite set – a set whose elements cannot all be counted, because the counting never ends.
Example: Set E is an infinite set

3. Empty or null set – has no element.
Example: A = { }

4. Equal sets – set A and set B are equal sets if the elements of set A are exactly the elements of set B.
Example:
A = {set of all even counting numbers of one digit} = {2, 4, 6, 8}
B = {set of all integral multiples of two having one digit} = {2, 4, 6, 8}

5. Equivalent sets – two sets are equivalent if there exists a one-to-one correspondence between the elements of the two sets.
Example:
A = {1, 2, 3, 4, 5} – x coordinate
B = {6, 7, 8, 9, 10} – y coordinate

Then “A” is equivalent to B. We can construct the relation of set A and set B:

{ (1, 6), (2, 7), (3, 8), (4, 9), (5, 10) }

6. Subset – a set whose elements are all members of a given set.
Example: A = {1, 2, 3, 4, 5, 8}, B = {2, 4, 8}

7. Universal Set – the totality of the elements under consideration. The set from which we select elements to form a given set is called the universal set.
Example:
Set A = {1, 2, 3, 4, 5, 8} is a universal set
Set B = {2, 4, 8} is a subset of set A

8. Disjoint Sets – if two sets have no element in common, the sets are called disjoint sets.
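Several of these kinds of sets can be checked mechanically. The sketch below uses hypothetical helper names. For finite sets it treats "equivalent" as "having the same number of elements", which is when a one-to-one correspondence exists.

```python
def are_equal(a, b):
    """Equal sets: exactly the same elements."""
    return a == b


def are_equivalent(a, b):
    """Equivalent finite sets: a one-to-one correspondence exists,
    which for finite sets means the same number of elements."""
    return len(a) == len(b)


def is_subset(b, a):
    """True when every element of b is also a member of a."""
    return b <= a


def are_disjoint(a, b):
    """Disjoint sets: no element in common."""
    return a.isdisjoint(b)


A = {1, 2, 3, 4, 5, 8}
B = {2, 4, 8}
print(is_subset(B, A))
print(are_equivalent({1, 2, 3, 4, 5}, {6, 7, 8, 9, 10}))
print(are_disjoint({1, 2}, {3, 4}))
```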

Friday, January 23, 2009

Methods of Writing Set

Example: Roster Method and Rule Method

1. Roster or tabular method

The elements of the set are enumerated and separated by commas.

2. Rule method or set builder

A descriptive phrase is used to describe the elements of the set.
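The two methods have close analogues in Python: a set literal lists the elements one by one (roster), while a set comprehension states the condition that the elements must satisfy (rule). A small sketch, with variable names chosen only for illustration:

```python
# Roster or tabular method: enumerate the elements, separated by commas.
roster = {2, 4, 6, 8}

# Rule method or set builder: describe the elements by a condition,
# here "even counting numbers of one digit".
rule = {n for n in range(1, 10) if n % 2 == 0}

print(roster == rule)
```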

Monday, December 29, 2008

Sets Definition and Examples

Set

Definition: A set is a well-defined collection of things or objects.

Note: Sets may be denoted by capital letters such as A, B, X, Y.

An element or member of a set is a thing that belongs to the set and may be denoted by small letters such as a, b, c, …, x, y.

The members of the set are enclosed in braces { }, with a comma separating the members.

Example:

For the set “A” whose members are ETHEL, CYNTHIA, and CHELO, we usually use the symbol

A = {ETHEL, CYNTHIA, CHELO}

ETHEL ∈ A
- Read as ETHEL is an element of set A
- Read as ETHEL belongs to set A
- Read as ETHEL is a member of set A

Wednesday, December 17, 2008

Empty Set and Set

- A set is a collection of things.
- An element or member of a set is a thing that belongs to the set.

* There are many words which we use in everyday language that have the same meaning as the word set.

Example:

1. A herd of cattle is a set of cattle
2. A flock of birds is a set of birds
3. A squadron of planes is a set of planes
4. A school of fish is a set of fish
5. A regiment of soldiers is a set of soldiers

* The members of the set are enclosed in braces, { }, with a comma separating the members, and to identify sets we often name them by capital letters.

Example:

1. For the set “A” whose members are Ethel, Emerson and Merecel, we usually use the symbol

A = {Ethel, Emerson, Merecel}

2. The Set “B” of days of the week

B = {Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday}

3. The set “C” of words that distinguish the two faces of a coin

C = {Heads, Tails}

* The set that has no elements is called the empty set. We use the symbol ∅ to indicate the empty set.

Example of Empty set:

1. the set of whole numbers between 9 and 10
2. the set of four-sided triangles
3. the set of astronauts who have landed on the planet Pluto
4. the set of icebergs in the Sahara desert
5. the set of people with two heads
6. the set of pink elephants
7. the set of purple cows

Tuesday, December 16, 2008

Permutation Formula and Example

Permutation

Each different arrangement or ordered set of objects is called a permutation of those objects.

- If A = {a1, a2, a3, …, an} is any set of n elements, then any arrangement of the elements of “A” along a line is called a permutation of the elements of A.

The number of all permutations of the elements of the set is given by the formula:

P = n!, where n = number of elements

Problem:

How many permutations can be made from the word “PINOY”?

Solution:

PINOY consists of 5 letters

P = 5! = 120 permutations

The total number of permutations of n objects taken r at a time, P(n, r), is given by the expression

P(n, r) = nPr = n!/(n-r)!

Problem:

Find the no. of permutations of the four integers 1,2,3,4 taken two at a time.

Solution:

n = 4, r = 2

4P2 = n!/(n-r)! = 4!/(4-2)! = 4!/2! = (4 · 3 · 2 · 1)/(2 · 1) = 12
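Both permutation formulas can be checked by brute force. The sketch below computes n!/(n-r)! directly and also counts the arrangements produced by `itertools.permutations`. The helper name `n_p_r` is made up for this example.

```python
from itertools import permutations
from math import factorial


def n_p_r(n, r):
    """Number of permutations of n objects taken r at a time: n!/(n-r)!."""
    return factorial(n) // factorial(n - r)


# All arrangements of the 5 distinct letters of PINOY.
pinoy_count = len(list(permutations("PINOY")))

# Arrangements of the integers 1, 2, 3, 4 taken two at a time.
two_at_a_time = list(permutations([1, 2, 3, 4], 2))

print(factorial(5), pinoy_count)
print(n_p_r(4, 2), len(two_at_a_time))
```

Listing `two_at_a_time` shows why order matters: (1, 2) and (2, 1) are counted as different permutations.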

Statistics Probability Sample Problems

1. At a certain canteen, Doris can choose a merienda from three drinks (Coke, Pepsi, Gulaman) and four sandwiches (bacon, chicken, tuna, egg). In how many ways can she choose one drink and one sandwich?

Solution:

D = {Coke, Pepsi, Gulaman}
N(D) = 3

S = {Bacon, Chicken, Tuna, Egg}
N(S) = 4

N1 . N2 = 3 x 4 = 12 ways

2. Two dice are rolled. In how many ways can they fall? What if 3 dice are rolled? And if 4 dice are rolled?

For two dice
N1 = 6
N2 = 6
N1 . N2 = 6 x 6 = 36 ways

For three dice
N1 . N2 . N3 = 6 x 6 x 6 = 216 ways

For four dice
N1 . N2 . N3 . N4 = 6 x 6 x 6 x 6 = 1296 ways

3. Using the digits 1, 2, 3, 4, 5, 6, how many two-digit numbers can be formed if a) repetition is allowed, b) repetition is not allowed? There are 6 digits to choose from in the given set.

Solution:

a) Repetition is allowed
6 x 6 = 36 ways

b) Repetition is not allowed
6 x 5 = 30 ways

Sunday, December 14, 2008

Statistics Probability: Definitions, Principles and Samples

Probability, which connotes the “chance” or the “likelihood” that something will happen or occur, is an interesting and fascinating area of mathematics.

Probability – the part of mathematics that deals with the question “how likely” is called probability or the theory of probability.

Probability – is a measure of certainty; its scale is from 0 to 1. A probability of zero indicates that there is no chance at all that an event will happen or occur. A probability of one (1) indicates absolute certainty that an event will happen. Absolute certainty rarely happens in life.

1. Experiment

Activity that can be done repeatedly.

Examples:
1. Tossing a coin
2. Rolling a die

2. Sample Space – set of all possible outcomes in an experiment (S)

Examples:
a.) S = {H, T}, n(S) = 2
b.) S = {1, 2, 3, 4, 5, 6}, n(S) = 6
c.) S = {Rod, Ed, Emer}, n(S) = 3

3. Sample Point – an element in the sample space

Examples:
a.) H is a sample point
T is a sample point

4. Event – is a subset of the sample space

Example:
Getting an even number when you roll a die is an event

S = {1, 2, 3, 4, 5, 6}
E = {2, 4, 6}
n(E) = 3
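Since a sample space and an event are both sets, the die example can be written as a short sketch. The names `S` and `E` follow the text; everything else is illustrative.

```python
# Sample space for rolling one die.
S = {1, 2, 3, 4, 5, 6}

# Event: getting an even number. It is built from the sample points
# of S, so it is a subset of S by construction.
E = {point for point in S if point % 2 == 0}

print(sorted(E), len(E))
print(E <= S)  # an event is a subset of the sample space
```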

Counting Techniques

N1 . N2 . N3 . N4 ….. Nn (where each N is the number of ways one event can occur)

Fundamental Principles

If one thing can be done independently in N1 different ways, a second thing can be done independently in N2 different ways, and so on, then the total number of ways in which all the things may be done in the stated order is N1 . N2 . N3 . N4 ……

Saturday, May 31, 2008

Mode

Mode is the single measure or score which occurs most frequently. When data are grouped into a frequency distribution, the crude mode is usually taken to be the midpoint of the interval which contains the largest frequency.

When to use the mode:

1. When a quick and approximate measure of central tendency is all that is wanted.
2. When the measure of central tendency should be the most typical value.

Finding the mode from ungrouped data:
Example:

1. A set of numbers 11, 12, 13, 16, 16, 16, 19, 20 has 16 as the mode.
2. A set of numbers 45, 49, 52, 55, 58 has no mode.
3. A set of numbers 4, 4, 6, 8, 8, 8, 9, 9, 9, 10 has modes of 8 and 9 and is called bimodal.
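The three ungrouped examples can be reproduced with a frequency count. In this sketch the function name `modes` is hypothetical, and returning an empty list when every value occurs only once is a convention chosen to match example 2. The text does not say what to report in other tie situations.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), sorted.

    Convention assumed here: if every value occurs exactly once,
    the data has no mode and an empty list is returned.
    """
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([11, 12, 13, 16, 16, 16, 19, 20]))
print(modes([45, 49, 52, 55, 58]))
print(modes([4, 4, 6, 8, 8, 8, 9, 9, 9, 10]))
```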

Mode of grouped data

To determine the mode of grouped data, we first have to find the modal class. In a frequency distribution, the modal class can easily be determined by inspection, as it is the class with the highest frequency.

$Mo = L_{mo} + \left[ \frac{d_1}{d_1 + d_2} \right] c$

Where:
Lmo = lower boundary of the modal class
d1 = difference between the frequency of the modal class and the frequency of the class next lower in value
d2 = difference between the frequency of the modal class and the frequency of the class next higher in value
c = class size

Find the mode of example 3.8, table 3.1.

Weekly wage (in peso)    f     Lower class boundary
P 870-899                4     869.5
900-929                  6     899.5
930-959                  10    929.5
960-989                  13    959.5
990-1019                 8     989.5
1020-1049                7     1019.5
1050-1079                2     1049.5

Mo = 959.5 + [3/(3 + 5)] 30 = P 970.75
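The grouped-mode formula can be sketched as a small function applied to the weekly wage table. The function name is hypothetical, the classes are assumed to be listed from lowest to highest, and treating a missing neighbouring class as frequency 0 is an assumption of this sketch.

```python
def grouped_mode(lower_boundaries, frequencies, class_size):
    """Mo = Lmo + [d1 / (d1 + d2)] * c for the class with the highest frequency.

    Classes are listed from lowest to highest. Assumption: a missing
    neighbouring class counts as frequency 0.
    """
    i = frequencies.index(max(frequencies))  # position of the modal class
    below = frequencies[i - 1] if i > 0 else 0
    above = frequencies[i + 1] if i + 1 < len(frequencies) else 0
    d1 = frequencies[i] - below  # modal class vs. class next lower in value
    d2 = frequencies[i] - above  # modal class vs. class next higher in value
    return lower_boundaries[i] + d1 / (d1 + d2) * class_size


boundaries = [869.5, 899.5, 929.5, 959.5, 989.5, 1019.5, 1049.5]
freqs = [4, 6, 10, 13, 8, 7, 2]
print(grouped_mode(boundaries, freqs, 30))
```

Here the modal class is 960-989, so d1 = 13 - 10 = 3 and d2 = 13 - 8 = 5.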

Median

b.) Median – is the midpoint of the distribution. Half of the values in a distribution fall below the median and the other half fall above it.

When to use the median:
1. When the exact midpoint of the distribution is wanted, the 50% point.
2. When there are extreme scores which would markedly affect the mean. Extreme scores do not disturb the median.
3. When it is desired that certain scores should influence the central tendency, but all that is known about them is that they are above or below the median.
- To determine whether the cases fall within the lower half or the upper half of a distribution (appropriate locator of central tendency).

Finding the median from ungrouped data:
1. When N is odd, the median is the middle score.
Ex. 20 15 13 11 8 7 6
There are 7 scores and the median is 11

2. When N is even, the median is the average of the two middle scores.
Ex. 21 18 15 14 11 8 8 7
There are 8 scores and the median is (14+11)/2 = 12.5

3. When several scores have the same value as the midscore.
Ex. 15 15 14 11 9 9 9 6 5
Median is 9

Ex. 1. Find the median of the following set of observations.
8, 4, 1, 3 & 7
Sol. Array the set of observations and find the median
1 3 4 7 8
4 is the middle item

2. Compute the median from the following set of data
12, 9, 6, 10, 7 & 14
Array the data and compute the median
6 7 9 10 12 14
Median = (9 + 10)/2 = 9.5
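The odd and even cases can be handled by one short function that first arrays the data. This is a minimal sketch with a made-up function name. Python's standard library offers `statistics.median` for the same purpose.

```python
def median(values):
    """Array the data, then take the middle score (N odd) or the
    average of the two middle scores (N even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([8, 4, 1, 3, 7]))
print(median([12, 9, 6, 10, 7, 14]))
print(median([15, 15, 14, 11, 9, 9, 9, 6, 5]))
```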

Finding the Median from the Grouped Data

$Md = L + \left[ \frac{N/2 - F_2}{f_2} \right] C$

Where:
L = lower class boundary of the interval where the median lies
N = number of scores or sum of the frequencies
F2 = cumulative frequency less than, up to the class immediately preceding the median class (F<)
f2 = frequency of the median class
C = class size

Steps:
1. Prepare 3 columns (class intervals, class frequency and cumulative frequency less than).
2. Determine the median class. The median class is the class interval where N/2 lies.
3. Substitute the data into the formula.

Ex. Find the median of the frequency distribution

Weekly wages (in peso)    No. of workers (f)    F< (cumulative frequency less than)
870 – 899                 4                     4
900 – 929                 6                     10
930 – 959                 10                    20    (F<)
960 – 989                 13                    33    (median class)
990 – 1019                8                     41
1020 – 1049               7                     48
1050 – 1079               2                     50
N = 50

To determine the median class:

Solve for N/2 = 50/2 = 25

The 25th item falls in the 960 – 989 class interval; therefore it is the median class.

Md = L + [(N/2 – F2)/f2] C

= 959.5 + [(50/2 – 20)/13] 30

Md = Php 971.04
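The steps for the grouped median can be sketched as a loop that accumulates the "less than" cumulative frequency until it reaches N/2. The function name and the error handling are assumptions of this sketch, and the classes are assumed to be listed from lowest to highest.

```python
def grouped_median(lower_boundaries, frequencies, class_size):
    """Md = L + [(N/2 - F2) / f2] * C, classes listed from lowest to highest."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("the distribution has no scores")
    cumulative = 0  # F<, cumulative frequency below the current class
    for lower, f in zip(lower_boundaries, frequencies):
        if cumulative + f >= n / 2:
            # The N/2-th item falls in this class: it is the median class.
            return lower + (n / 2 - cumulative) / f * class_size
        cumulative += f


boundaries = [869.5, 899.5, 929.5, 959.5, 989.5, 1019.5, 1049.5]
freqs = [4, 6, 10, 13, 8, 7, 2]
print(round(grouped_median(boundaries, freqs, 30), 2))
```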

Thursday, May 15, 2008

Measures of Central Tendency - Mean

Central Tendency is the point about which the scores tend to cluster, a sort of average in the series. It is the center of concentration of scores in any set of data. It is a single number which represents the general level of performance of a group.

Three (3) measures of Central Tendency

a.) Mean – The mean, or arithmetic mean, or arithmetic average, is defined as the sum of the values in the data group divided by the number of values.

When to use the mean
1. When the scores are distributed symmetrically around a central point.
2. When the measure of central tendency having the greatest stability is wanted.
3. When other statistics like standard deviation, coefficient of correlation, etc. are to be computed later, since these statistics are based upon the mean.

Finding the Mean from Ungrouped Data

$\bar{X} = \frac{\sum X}{N}$

Where:
X = score or measure
N = number of scores or measures
∑ = summation of

Example:
1.) Last year the five sales counselors of Pacific Plans Inc. sold the following numbers of educational plans: 24, 16, 35, 13, 25. Find the mean.

Solution:
$\bar{X}$ = (24 + 16 + 35 + 13 + 25)/5 = 22.6

Finding the mean from Grouped Data:

Long method:

$\bar{X} = \frac{\sum fM}{N}$

Where:
f = class frequency
M = class midpoint
N = sum of the frequencies

By the “Assumed mean” or short method:

$\bar{X} = AM + \left( \frac{\sum fx}{N} \right) c$

Where:
AM = assumed mean
c = class size
x = deviation from the assumed mean, in class-interval steps

Example: Scores of 50 students on a college algebra test

Class interval (scores)    Midpoint M    f     x     fx     fM
45-47                      46            3     6     18     138
42-44                      43            4     5     20     172
39-41                      40            4     4     16     160
36-38                      37            4     3     12     148
33-35                      34            2     2     4      68
30-32                      31            3     1     3      93
27-29                      28            13    0     0      364
24-26                      25            8     -1    -8     200
21-23                      22            3     -2    -6     66
18-20                      19            3     -3    -9     57
15-17                      16            0     -4    0      0
12-14                      13            2     -5    -10    26
9-11                       10            1     -6    -6     10
Total                                    50          34     1502

Long method:

$\bar{X} = \frac{\sum fM}{N}$ = 1502/50 = 30.04

Assumed mean or short method: Steps:

1. Prepare 4 columns (class interval, f, x, fx).
2. Select the interval to contain the assumed mean (AM). For the assumed mean, one may take the midpoint of the interval near the center of the distribution, or the midpoint of the interval with the highest frequency.
3. Determine the x column starting with 0; number each class interval positive up to the highest class interval and negative down to the lowest class interval.
4. Multiply f by x to determine the fx column.
5. Find the algebraic sum of the positive fx’s and the negative fx’s to get ∑fx.

Short method:

$\bar{X} = AM + \left( \frac{\sum fx}{N} \right) c$ = 28 + (34/50)3 = 30.04
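Both grouped-data methods can be sketched in a few lines and checked against each other on the college algebra scores. The function names are made up for this example. The deviation x is computed here as (M - AM)/c, which reproduces the numbering of the x column whenever AM is one of the class midpoints.

```python
def mean_long(midpoints, frequencies):
    """Long method: sum(f * M) / N."""
    return sum(f * m for f, m in zip(frequencies, midpoints)) / sum(frequencies)


def mean_short(midpoints, frequencies, assumed_mean, class_size):
    """Short method: AM + (sum(f * x) / N) * c, with x in class-interval steps."""
    deviations = [(m - assumed_mean) / class_size for m in midpoints]
    n = sum(frequencies)
    sum_fx = sum(f * x for f, x in zip(frequencies, deviations))
    return assumed_mean + sum_fx / n * class_size


midpoints = [46, 43, 40, 37, 34, 31, 28, 25, 22, 19, 16, 13, 10]
freqs = [3, 4, 4, 4, 2, 3, 13, 8, 3, 3, 0, 2, 1]
print(mean_long(midpoints, freqs))
print(mean_short(midpoints, freqs, assumed_mean=28, class_size=3))
```

The two methods agree because the short method only shifts and rescales the midpoints before averaging.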

Weighted Arithmetic mean or Combined mean

$\bar{X}_w = \frac{\sum wx}{\sum w}$

Where:
w = weight of x
∑wx = sum of the products of each x and its weight
∑w = sum of the weights of x

Example: The same test was administered to fourth year high school students in 3 schools. Each school had computed its own mean using an interval width of 3, as shown below.

School A

Class interval (scores)    f     x     fx
39-41                      1     4     4
36-38                      2     3     6
33-35                      4     2     8
30-32                      4     1     4
27-29                      2     0     0
24-26                      3     -1    -3
21-23                      4     -2    -8
18-20                      2     -3    -6
Total                      22          5

School B

Class interval (scores)    f     x     fx
42-44                      1     6     6
39-41                      0     5     0
36-38                      2     4     8
33-35                      5     3     15
30-32                      6     2     12
27-29                      7     1     7
24-26                      3     0     0
21-23                      4     -1    -4
18-20                      2     -2    -4
15-17                      2     -3    -6
12-14                      1     -4    -4
9-11                       2     -5    -10
Total                      35          20

School C

Class interval (scores)    f     x     fx
39-41                      1     3     3
36-38                      2     2     4
33-35                      10    1     10
30-32                      6     0     0
27-29                      7     -1    -7
24-26                      2     -2    -4
21-23                      1     -3    -3
18-20                      0     -4    0
15-17                      1     -5    -5
Total                      30          -2

To find the weighted mean of the 3 schools, follow the procedures below:

1. Find the highest and the lowest scores of the schools.
2. Prepare the step intervals column for the combined distribution.
3. Write the frequencies for each step interval for the three schools.
4. Find the total frequency for each step interval for the total combined distribution.
5. Compute the mean from this distribution.

Class interval (scores)    Sch. A (f)    Sch. B (f)    Sch. C (f)    Total f    x     fx
42-44                      0             1             0             1          5     5
39-41                      1             0             1             2          4     8
36-38                      2             2             2             6          3     18
33-35                      4             5             10            19         2     38
30-32                      4             6             6             16         1     16
27-29                      2             7             7             16         0     0
24-26                      3             3             2             8          -1    -8
21-23                      4             4             1             9          -2    -18
18-20                      2             2             0             4          -3    -12
15-17                      0             2             1             3          -4    -12
12-14                      0             1             0             1          -5    -5
9-11                       0             2             0             2          -6    -12
Total                      22            35            30            87               18

$\bar{X} = AM + \left( \frac{\sum fx}{N} \right) c$ = 28 + (18/87)3 = 28.62

$\bar{X}_w = \frac{\sum wx}{\sum w}$ = [22(28.69) + 35(26.71) + 30(30.8)]/87 = 28.62
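The weighted mean of the three school means can be sketched as follows, using each school's number of students as the weight. The function name is illustrative, and the school means are the values quoted in the text.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w * x) / sum(w)."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


school_means = [28.69, 26.71, 30.8]  # means of schools A, B, C as quoted
school_sizes = [22, 35, 30]          # number of students (weights)
print(round(weighted_mean(school_means, school_sizes), 2))
```

Weighting by school size gives the same result, up to rounding, as computing the mean from the combined distribution of all 87 students.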

Saturday, May 3, 2008

Graphical Method of Presenting Data and Frequency

1. Histogram. Class boundaries (x) vs. class frequency (y)
2. Frequency Polygon. Class mark (x) vs. class frequency (y)
3. Less than ogive. Upper class limit (x) vs. less than cumulative frequency (y)
4. Greater than ogive. Lower class limit (x) vs. greater than cumulative frequency (y)

Friday, May 2, 2008

Other Definition of Terms

Array – This is the arrangement of data from the highest to lowest or from lowest to highest.

Range, R - is the difference between the highest and the lowest number.

Number of classes – depends on the size and nature of the distribution. It is the number of classes into which the range will be divided. Usually, an effective number of classes is somewhere between 4 and 20.

No. of classes = (range / class size or class width) + 1

Note:
a.) If the series contains fewer than 50 cases, 10 classes or fewer are just enough
b.) If the series contains 50 to 100 cases, 10 to 15 classes are recommended
c.) If there are more than 100 cases, 15 or more classes are good

Class Limit – the end number of a class. It is the highest or the lowest value that can go into each class.

Class Size – the width of each class interval

Class Boundaries – are the “true” class limits defined by lower and upper boundaries. A boundary can be determined by getting the average of the upper limit of a class and the lower limit of the next class. Boundaries can also be obtained by simply adding half a unit (0.5) to the upper limit and subtracting the same from the lower limit of each class.

Class Mark, M – also known as class midpoint. It is the average of the lower and upper limits or boundaries of each class.

Class Interval – The range of values used in defining a class. It is simply the length of a class. It is the difference or distance between the upper and lower class boundaries of each class and is affected by the nature of the data and by the number of classes. It is good practice to set up uniform class intervals whenever possible for easier computation and interpretation.
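The definitions of class boundaries, class mark and class size can be tied together in a small sketch. The function name and the `unit` parameter are assumptions of this example. `unit` is the smallest step between recorded values (1 for whole-number data, so half a unit is 0.5).

```python
def class_details(lower_limit, upper_limit, unit=1):
    """Return (lower boundary, upper boundary, class mark, class size)
    for one class, given its class limits."""
    half_unit = unit / 2
    lower_boundary = lower_limit - half_unit  # subtract half a unit
    upper_boundary = upper_limit + half_unit  # add half a unit
    class_mark = (lower_limit + upper_limit) / 2
    class_size = upper_boundary - lower_boundary
    return lower_boundary, upper_boundary, class_mark, class_size


# A class with limits 870 and 899, for whole-number data.
print(class_details(870, 899))
```

The class mark comes out the same whether it is averaged from the limits or from the boundaries, since both pairs are centered on the same point.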