descriptive statistics ….to describe the data set summarizing, organizing, simplifying and...

Post on 12-Jan-2016


DESCRIPTIVE STATISTICS

…to describe the data set

Summarizing, organizing, simplifying and communicating the nature of a data set in numerical terms. These numerical accounts are intended to describe the data set without inferring causal factors (what caused the data).

3 primary concerns

• CENTRALITY

• VARIABILITY

• RELATEDNESS

CONSIDERATION:

• SCALES OF DATA

Nominal scales of data

Continuous data scales: interval and ratio

MEASURES OF CENTRALITY

• MEAN

• MEDIAN

• MODE

MEAN (which scales of data can be represented in this way?)

• Median: the most central value

• Mode: the most frequently occurring data point
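As a minimal sketch, the three measures of centrality can be computed with Python's standard `statistics` module. The values are the nine numbers from the standard deviation exercise in this transcript; the variable names are chosen here for illustration only.

```python
import statistics

# Values taken from the standard deviation exercise in this transcript.
scores = [4, 9, 11, 12, 17, 5, 8, 12, 14]

mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle value of the sorted data
mode_score = statistics.mode(scores)      # most frequently occurring value

print(f"mean={mean_score:.3f}, median={median_score}, mode={mode_score}")
```

The mean uses every value, so it presumes at least interval-level data; the median needs only an ordering, and the mode needs only categories.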

Centrality

• Can you find the mean hair color in the class?

• MEDIAN?

• MODE?
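For nominal data such as hair color, only the mode is meaningful, because categories cannot be averaged or ranked. The sketch below counts categories to find the mode; the class data are invented purely for illustration.

```python
from collections import Counter

# Hypothetical hair colors for a small class (invented for illustration).
hair_colors = ["brown", "black", "brown", "blonde", "red", "brown", "black"]

# Count how often each category occurs, then take the most common one.
counts = Counter(hair_colors)
mode_color, mode_count = counts.most_common(1)[0]

print(dict(counts))
print(f"mode: {mode_color} ({mode_count} students)")
```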

Centrality-for ordinal data makes little sense

• Can you find the mean?

• Median?

• Mode?

• TRANSFORM YOUR DATA!

Measures of Variability for Continuous data

• Range

• Variance

• Standard deviation

Range

• Highest score minus lowest score

– How accurately will the “range” describe dispersion of the data?
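One way to explore that question is to see how a single extreme score changes the range. The sketch below uses the nine values from the standard deviation exercise in this transcript; the added score of 60 is a hypothetical outlier, and `data_range` is a helper name made up for this example.

```python
def data_range(values):
    """Return the highest score minus the lowest score."""
    return max(values) - min(values)

# Values taken from the standard deviation exercise in this transcript.
scores = [4, 9, 11, 12, 17, 5, 8, 12, 14]
range_original = data_range(scores)

# Hypothetical: one extreme score is added to the same data set.
scores_with_outlier = scores + [60]
range_with_outlier = data_range(scores_with_outlier)

print(f"range without outlier: {range_original}")
print(f"range with outlier:    {range_with_outlier}")
```

Because the range depends on only two data points, one unusual score can dominate it, which is why variance and standard deviation are usually preferred.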

Variance

Don’t be confused by different formulations: this is also the formula for variance

Standard Deviation (the square root of the variance)
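For reference, the standard textbook forms are sketched below. They are assumed here rather than copied from the slides: x̄ is the mean, n the number of values, and the n-1 version is the usual sample estimate. The first two lines are algebraically equivalent ways of writing the variance with n in the denominator.

```latex
\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}
\qquad\text{(definitional form)}

\sigma^2 = \frac{\sum_{i=1}^{n} x_i^2}{n} - \bar{x}^2
\qquad\text{(computational form)}

s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
\qquad\text{(sample variance)}

\sigma = \sqrt{\sigma^2}, \qquad s = \sqrt{s^2}
```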

Why “n” or “n-1?”

STD Example

• Find the standard deviation of 4, 9, 11, 12, 17, 5, 8, 12, 14

• First work out the mean: 10.222

• Now subtract the mean from each of the given numbers and square the result. This is the (x - x̄)² step, where x refers to the values given in the question.

x            4      9      11     12     17     5      8      12     14
(x - x̄)²    38.7   1.49   0.60   3.16   45.9   27.3   4.94   3.16   14.3

• Now add up these results (this is the 'sigma' in the formula): 139.55

• Divide by n, the number of values, which in this case is 9. This gives us the variance: 15.51

• And finally, square root this: 3.94
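The steps above can be sketched in Python as a check on the arithmetic. The code divides by n, as the worked example does, and also shows the n-1 version for comparison; the variable names are illustrative, and `statistics.pstdev` and `statistics.stdev` are the standard-library equivalents.

```python
import math
import statistics

values = [4, 9, 11, 12, 17, 5, 8, 12, 14]
n = len(values)

# Step 1: the mean.
mean = sum(values) / n

# Step 2: squared deviation of each value from the mean.
squared_deviations = [(x - mean) ** 2 for x in values]

# Step 3: the sum of squared deviations (the "sigma" in the formula).
sum_of_squares = sum(squared_deviations)

# Step 4: divide by n for the variance, then take the square root.
variance_n = sum_of_squares / n
sd_n = math.sqrt(variance_n)

# For comparison: dividing by n - 1 instead of n.
sd_n_minus_1 = math.sqrt(sum_of_squares / (n - 1))

print(f"mean={mean:.3f}, sum of squares={sum_of_squares:.2f}")
print(f"variance (n)={variance_n:.2f}, sd (n)={sd_n:.2f}")
print(f"sd (n-1)={sd_n_minus_1:.2f}")

# Cross-check against the standard library.
print(f"pstdev={statistics.pstdev(values):.2f}, stdev={statistics.stdev(values):.2f}")
```

Dividing by n-1 gives a slightly larger value; the gap shrinks as n grows.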

Example

• Mean scores for a class on two different tests were:

Test 1- 75.5%

Test 2- 75.5%

Did an average student do better, worse or the same on test 2 vs test 1?

[Figure: mean % performance (cell mean) for test 1 and test 2; y-axis from 0 to 80]

Test 1

[Figure: histogram of Exam 1 scores; x-axis from 65 to 100, y-axis count from 0 to 8]

Test 2

[Figure: histogram of Q1 scores; x-axis from 40 to 110, y-axis count from 0 to 9]

NOTE***

• When you present a mean, it should always be accompanied by a measure of variability!!
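A small sketch of why this matters: two sets of scores can share a mean and still differ greatly in spread. Both score lists below are invented for illustration (they are not the class data behind the histograms) and were chosen so that each has a mean of 75.5.

```python
import statistics

# Hypothetical scores, invented so that both tests have the same mean.
test_1 = [70, 72, 75, 76, 78, 82]
test_2 = [45, 60, 75, 76, 97, 100]

mean_1 = statistics.mean(test_1)
mean_2 = statistics.mean(test_2)

# Sample standard deviation (n - 1 in the denominator).
sd_1 = statistics.stdev(test_1)
sd_2 = statistics.stdev(test_2)

print(f"test 1: mean={mean_1:.1f}, sd={sd_1:.1f}")
print(f"test 2: mean={mean_2:.1f}, sd={sd_2:.1f}")
```

Reporting only the two means would suggest identical performance, while the standard deviations show that the second set of scores is far more dispersed.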

[Figure: mean % performance (cell mean) for test 1 and test 2; y-axis from 0 to 120]

Relatedness = correlation

– Correlations yield coefficient values

– Between +1.0 and -1.0

Correlational outcomes can be visualized

Criteria For Evaluating Correlation Coefficients

There are no widely accepted criteria for defining a strong, moderate or weak association. However, there are suggestions for health science studies:

Correlation coefficient, r    Interpretation
0.00 - 0.25                   no or weak relationship
0.25 - 0.50                   fair degree of relationship
0.50 - 0.75                   moderate to good relationship
0.75 - 1.00                   good to strong relationship
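As a rough sketch, the suggested bands can be written as a lookup function. Two assumptions are made here that the table does not state: the sign of r is ignored (only its magnitude is labeled), and a value exactly on a boundary such as 0.25 is assigned to the higher band. The function name `interpret_r` is made up for this example.

```python
def interpret_r(r):
    """Label the magnitude of a correlation coefficient using the suggested bands."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1.0 and +1.0")
    size = abs(r)  # assumption: direction is ignored, only strength is labeled
    if size < 0.25:
        return "no or weak relationship"
    if size < 0.50:
        return "fair degree of relationship"
    if size < 0.75:
        return "moderate to good relationship"
    return "good to strong relationship"


for r in (0.10, -0.40, 0.60, -0.90):
    print(f"r = {r:+.2f}: {interpret_r(r)}")
```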

Correlation coefficients are not proportional. That is, the difference between 0.5 and 0.6 is not the same as the difference between 0.8 and 0.9. Also, a 0.4 correlation for one set of data is not the same as a 0.4 correlation for another set of data. Each correlation coefficient must be interpreted with respect to its own data.

Relatedness…correlation

• Consider scale of data for each variable

Var X         Var Y         Technique
Nominal       Nominal       Chi-square
Nominal       Continuous    Rpbs (point-biserial correlation)
Ordinal                     Spearman's rho
Continuous    Continuous    Pearson's r

Pearson's product-moment correlation
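A minimal sketch of the calculation, using the standard definition of r as the sum of cross-products of deviations divided by the square root of the product of the two sums of squares. The function name `pearson_r` and the paired data (hours studied and test score) are hypothetical, invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product-moment correlation for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from each mean.
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variability")
    return cross_products / math.sqrt(ss_x * ss_y)


# Hypothetical paired data, invented for illustration.
hours_studied = [1, 2, 3, 4, 5]
test_scores = [52, 60, 61, 70, 79]

r = pearson_r(hours_studied, test_scores)
print(f"r = {r:.3f}")
```

The result always falls between -1.0 and +1.0; a perfectly linear increasing relationship gives +1.0 and a perfectly linear decreasing one gives -1.0.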

Further considerations when performing correlation analysis

• Third variables

• Directionality

• Correlation coefficients

• Positive, negative and zero correlations

• Graphing and curvilinearity

Graphing results

Bar graphs vs line graphs

Label axes

Use error bars

Provide a figure legend!

Rough methods section for worry survey

• Methods

• Materials

• The subjects were randomly chosen people with no preference for race, socioeconomic status, appearance, etc. Researchers attempted to avoid recruitment of subjects under the age of eight years old. The subjects were separated into two different groups, child (8 years to 12 years) and adult. Otherwise, a matched stratified random sampling procedure was used across age groups. Adult subjects were classified in one of four subsets: teen (13 years to 19 years), adult (20 years to 40 years), middle-aged adult (41 years to 60 years), and old persons (61 years plus). The subjects were chosen with the intent of achieving equal representation among all four subsets within the adult group. To maintain the objective nature of the experiment, the subjects were recruited in a city in southeastern North Carolina at a variety of places including, but not limited to: local restaurants, the local college campus, the local beach, and local parks.

• Materials

• A team of twenty-four researchers devised a survey to examine age, relative anxiety, and risk perception. A survey was created through a collaboration of the twenty-four members of the research team. Members submitted a list of items/events that would cause them to worry. This list was compiled and examined for relevance to adults and children, redundancy, and effectiveness. Items deemed unnecessary or irrelevant were excluded from the survey. The team then decided which of the events/items were inappropriate for children and were thus discarded.

• There were two different forms of the survey, one for the adult set of subjects and one for the child group of subjects. The reason for this was that some items on the adult survey were not appropriate for children. Examples of this would include worry involved with “drunk driving”, “being drugged”, and “getting sexually assaulted”. The survey required some general demographic information including gender, age, highest level of education, and race. The survey then asked the subject to indicate on a scale of 1-7, 1 being not worried at all and 7 being extremely worried, if he or she was a worrier in general. A listing of potentially risky events followed. The adult survey contained 54 items and the child survey contained 42 items. Excluding the examples depicted above, the items on the adult and child surveys were the same. These items included events/activities that would generally cause one anxiety such as “skydiving”, “holding a snake”, “being outside during a lightning storm”, “swimming in the ocean after reports of a shark attack”, etc. Items were also included to act as controls for biased or inaccurate responding such as “playing putt-putt golf” or “taking a walk”. The survey was also designed to control for those not paying attention / not taking the survey seriously by placing “being lost” on the survey in two separate places. Subjects were to indicate their level of worry if they were to participate in or encounter each of these items on the same 1-7 scale of worry. The survey concluded with a question regarding the interference of worry with the subject's normal routines, work, school, and/or social activities on which the subject gave a score of 1-7, 1 being no interference and 7 being extremely interfering.

• Procedure

• Adult subjects were randomly recruited by walking up to a person and asking if he or she would like to participate in a quick (about five minutes) survey. If the person obliged, he or she was informed that this was a survey inquiring about typical worries. The subject was also informed that partaking in the survey posed no mental or physical risk, and was assured of the anonymity of his or her responses to the items. Following administration of the survey, the subject was informed of the intent of the survey.

• Potential subjects were excluded if they looked to be in either too much of a hurry or too busy. These potential subjects were excluded on the basis of possible contamination of the results. Subjects traveling in a group were not approached, as group bias would possibly skew their responses. For the same reason, when surveys were administered to more than one subject at once, the subjects were asked not to speak with one another about the survey until all had completed it.

• Because only two of the members had ready access to administration of the survey to children, some of the elimination criteria used for adults could not apply. There were two main places children were recruited: at a soccer practice and at a children's museum. At the soccer field, the children arrived in groups of three to five. These subjects were instructed to take the survey quietly without discussion of the survey during administration, but because they were children this was quite difficult to control. The children recruited at the children's museum were directed to a table where the survey was administered on a one-by-one basis.

For the statistics we will talk about, these are the general assumptions:

• ASSUMPTIONS

– Linearity

– Normal distributions

• CONSIDERATIONS

– SCALES OF DATA
