008 revised notes intro to datapresentation

Post on 08-Jul-2016


Biological Weapons Proliferation Prevention Program
Biological Threat Reduction Program

Introduction to Data Presentation

TRNEPI-00152


Learning objectives

Define different types of variables
Create and interpret one- and two-variable tables
Create and interpret a line graph
Create and interpret one- and two-variable bar charts
Describe when to use each type of table, graph, and chart

Why organize data?

Many records
Look for trends and relationships
Get familiar with data before analysis
Catch errors
Communicate findings to others

How to organize data

Identify what type of data you have
Determine what you need to communicate with the data
Summarize using tables, graphs, and/or charts

Variable: definition

What is observed or measured; the ways in which people differ
Examples: age, height, hair color, smoking

Types of Variables

Variable
  Quantitative (measurement)
    Continuous (real-valued), e.g. height
    Discrete (count data), e.g. number of admissions
  Qualitative or categorical
    Ordinal (ordered), e.g. response to treatment
    Nominal (not ordered), e.g. ethnic group

Types of Variables

Categorical

Nominal   Nominal       Ordinal
Sex       Nationality   Status
M         Yemen         Mild
M         Jordan        Moderate
F         Yemen         Severe
M         Jordan        Mild
F         Sudan         Moderate
F         Yemen         Mild
M         Sudan         Moderate
M         Iran          Severe
F         Jordan        Severe
M         Iran          Mild
F         Yemen         Moderate
F         Sudan         Moderate
M         Iran          Mild
M         Yemen         Severe

Quantitative

Discrete   Continuous
Children   Weight
1          56.4
1          47.8
2          59.9
3          13.1
1          25.7
1          23.0
2          30.0
3          13.7
2          15.4
2          52.5
1          26.6
1          38.2
1          59.0
2          57.9
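The difference between variable types shows up as soon as the data are summarized. The sketch below uses the Status, Children, and Weight columns from the example tables and only the Python standard library. The variable names are illustrative, and the choice of summary statistics is an assumption, not something the slides prescribe.

```python
from collections import Counter
from statistics import mean, median

# Values transcribed from the example tables (14 records each).
status = ["Mild", "Moderate", "Severe", "Mild", "Moderate", "Mild", "Moderate",
          "Severe", "Severe", "Mild", "Moderate", "Moderate", "Mild", "Severe"]
children = [1, 1, 2, 3, 1, 1, 2, 3, 2, 2, 1, 1, 1, 2]
weight = [56.4, 47.8, 59.9, 13.1, 25.7, 23.0, 30.0,
          13.7, 15.4, 52.5, 26.6, 38.2, 59.0, 57.9]

# Categorical (ordinal) variable: count how often each category occurs.
status_counts = Counter(status)

# Discrete variable: a count per distinct value is still meaningful.
children_counts = Counter(children)

# Continuous variable: nearly every value is unique, so use numeric summaries.
weight_summary = {
    "min": min(weight),
    "median": median(weight),
    "mean": mean(weight),
    "max": max(weight),
}

print(status_counts)
print(children_counts)
print(weight_summary)
```

Counting categories suits a chart or a frequency table, while the numeric summaries point toward a graph of a quantitative variable.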


Why Does it Matter?

Categorical and quantitative variables are statistically summarized and presented in different ways

Variable Type    Data Presentation
Quantitative     Graphs, Tables
Categorical      Charts, Tables


Tables


Tables: Characteristics

Data is arranged in rows and columns
Presentation is simple and self-explanatory
  Title
  Label each row and column
  Show totals for rows and columns
  Include units of measure (yrs, mg/dl)
  Explain codes in footnote

Simple Frequency Distribution

Age group (years)    Number of Cases
≤14                      230
15-19                  4,378
20-24                 10,405
25-29                  9,610
30-34                  8,648
35-44                  6,901
45-54                  2,631
≥55                    1,278
Total                 44,081

Primary and secondary syphilis morbidity by age, United States, 1989
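A quick way to check a frequency table is to recompute its total and express each row as a share of that total. The sketch below does this for the syphilis counts. The percentage column is an addition for illustration and does not appear on the slide.

```python
# Counts transcribed from the table; the first and last groups are open-ended.
cases_by_age = {
    "≤14": 230,
    "15-19": 4378,
    "20-24": 10405,
    "25-29": 9610,
    "30-34": 8648,
    "35-44": 6901,
    "45-54": 2631,
    "≥55": 1278,
}

# The recomputed total should match the Total row of the table.
total = sum(cases_by_age.values())

# Relative frequency of each age group, as a percentage of all cases.
percent = {group: round(100 * n / total, 1) for group, n in cases_by_age.items()}

for group, n in cases_by_age.items():
    print(f"{group:>6}  {n:>6}  {percent[group]:>5}%")
print(f"{'Total':>6}  {total:>6}")
```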


Determining Class Intervals

The intervals must be mutually exclusive and encompass all data.

For preliminary analysis a large number of intervals (4-8) is used. These intervals can then be consolidated.

Use standard or frequently applied intervals (for instance, up to the age of 19, 20-24 years, 25-29 years, etc.).

A category must be provided to accommodate unknown values (for instance, “age unknown”).
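The rules above can be sketched as a small lookup that assigns each age to exactly one interval and sends missing values to an "age unknown" category. The interval boundaries, the `age_group` function, and the sample ages are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical class intervals as (label, lowest age, highest age), both ends
# inclusive. They do not overlap, and together they cover every age from 0 up.
INTERVALS = [
    ("0-19", 0, 19),
    ("20-24", 20, 24),
    ("25-29", 25, 29),
    ("30-34", 30, 34),
    ("35+", 35, None),  # open-ended last interval
]


def age_group(age):
    """Return the label of the one interval containing age, or "age unknown"."""
    if age is None:
        return "age unknown"
    for label, low, high in INTERVALS:
        if age >= low and (high is None or age <= high):
            return label
    # Reaching this point means the intervals do not encompass all data.
    raise ValueError(f"no interval covers age {age!r}")


ages = [17, 22, 24, 25, None, 41, 30, 19, None]  # made-up ages in whole years
frequency = Counter(age_group(a) for a in ages)
print(frequency)
```

Because every record lands in exactly one category, the category counts always add up to the number of records.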


Two Variable Table


Format for 2 X 2 Table

             Ill    Well   Total
Exposed      a      b
Unexposed    c      d
Total

Format for 2 X 2 Table

                Dead    Alive    Total
Diabetic         100       89      189
Non-diabetic     811    2,340    3,151
Total            911    2,429    3,340

Follow-up status among diabetic and nondiabetic white men, NHANES, 1982-1984
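The margins of a 2 x 2 table follow directly from its four cells. The sketch below rebuilds the row, column, and grand totals from the NHANES cell counts. The row percentages at the end are an addition for illustration and are not shown on the slide.

```python
# Cell counts from the NHANES table: rows are the two groups,
# columns are follow-up status.
table = {
    "Diabetic": {"Dead": 100, "Alive": 89},
    "Non-diabetic": {"Dead": 811, "Alive": 2340},
}

# Row totals: sum across the columns of each row.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}

# Column totals: sum the same column over every row.
column_totals = {
    col: sum(cells[col] for cells in table.values())
    for col in ("Dead", "Alive")
}

# The grand total is the same whether rows or columns are summed.
grand_total = sum(row_totals.values())

# Percentage of each group that died during follow-up (illustrative addition).
percent_dead = {
    row: round(100 * cells["Dead"] / row_totals[row], 1)
    for row, cells in table.items()
}

print(row_totals, column_totals, grand_total, percent_dead)
```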


Graphs and Charts


Charts and Graphs: Advantages

Easier to understand and interpret
Get a good feel for the data before formal analysis
Reveal patterns in data
Used to generate hypotheses

Graphs: Types

Arithmetic-scale line graphs
Inset graphs
Histograms
Frequency polygons
Cumulative frequency curve
Scatter diagram

Graphs

[Figure: template graph with a title. x-axis: Independent Variable; y-axis: Dependent Variable]

Types of Variables

Dependent
  Describes the outcome of interest
  Examples: dead, cancer, ill

Independent
  May cause or contribute to variation of the dependent variable
  Not influenced by the dependent variable
  Examples: time, age, packs of cigarettes, cholesterol levels

Arithmetic-Scale Line Graph

[Figure: line graph. Incidence of Hepatitis A, United States, 1952-1993. x-axis: Year (1950 to 1990); y-axis: Rate /100,000 (0 to 40)]

Source: CDC, National Notifiable Diseases Surveillance System

Arithmetic-Scale Line Graph: Characteristics

Method of choice for plotting rates over time
A set distance on the graph represents the same quantity anywhere on the axis
Horizontal graph x:y ratio is 5:3
Y-axis should start with 0
Determine largest value of Y needed to plot
Round off that number and divide into intervals
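The last three points describe a small procedure: start the y-axis at 0, take the largest value to be plotted, round it up, and divide the range into equal intervals. The sketch below is one possible way to do that. The `y_axis_ticks` function and its rule of rounding the step to 1, 2, or 5 times a power of ten are assumptions, not a rule stated in the slides.

```python
import math


def y_axis_ticks(largest_value, intervals=5):
    """Return evenly spaced tick values from 0 up to a rounded-up maximum."""
    if largest_value <= 0 or intervals < 1:
        raise ValueError("need a positive largest value and at least one interval")
    # Smallest step that would cover the data in the requested number of intervals.
    raw_step = largest_value / intervals
    # Round the step up to 1, 2, 5, or 10 times a power of ten (assumed convention).
    magnitude = 10 ** math.floor(math.log10(raw_step))
    for multiple in (1, 2, 5, 10):
        step = multiple * magnitude
        if step >= raw_step:
            break
    # Round the largest value up to the next whole step, then list every tick.
    top_index = math.ceil(largest_value / step)
    return [i * step for i in range(top_index + 1)]


# A largest rate of 37 per 100,000 with four intervals gives ticks at 0, 10, 20, 30, 40.
print(y_axis_ticks(37, intervals=4))
```

Because every tick is the same step apart, a set distance on the axis represents the same quantity everywhere, which is what makes the scale arithmetic.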


Arithmetic-Scale Line Graph

[Figure: line graph. Registered Death Rates by Age and Year, 1996-2000. x-axis: Age Categories (Years), from <1 to 65+; y-axis: Rate per 1000 population (0.0 to 50.0); one line per year, 1996 to 2000]

Inset Graph

[Figure: inset graph. Registered Death Rates by Age and Year, 1996-2000. Host graph: x-axis Age Categories (Years), from <1 to 65+; y-axis Rate per 1000 population (0.0 to 50.0); one line per year, 1996 to 2000. Inset: age categories 1-4 to 45-49 on a y-axis of 0.0 to 3.5]

Inset Graph: Characteristics

A magnified portion of the larger, or host, graph

Can see data in better detail
Smaller graph is “inset” into the larger graph
Variables remain the same

Independent data points do not change (e.g. age categories will remain in 5-year segments)


Histograms

[Figure: histogram. Frequency of measles by week of onset, Dec 6, 2000 to May 16, 2001]

Histograms: Characteristics

Graph of the frequency distribution of a continuous variable

Columns are adjoining

Area of each column is proportional to number of observations in that interval
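When class intervals have different widths, plotting raw counts as heights would break the area rule. A minimal sketch of the usual fix, dividing each count by its interval width, is shown below with the bounded age groups from the syphilis table. The `column_heights` function is illustrative and not part of the original material.

```python
# (lower bound, upper bound, cases). The upper bound is exclusive, so the
# 15-19 age group is written as 15 to 20. The open-ended youngest and oldest
# groups are left out because they have no fixed width.
intervals = [
    (15, 20, 4378),
    (20, 25, 10405),
    (25, 30, 9610),
    (30, 35, 8648),
    (35, 45, 6901),
    (45, 55, 2631),
]


def column_heights(intervals):
    """Return cases per year of age, so that width * height equals the count."""
    return [cases / (upper - lower) for lower, upper, cases in intervals]


heights = column_heights(intervals)

# Check the area rule: width times height recovers the original counts.
areas = [(upper - lower) * h for (lower, upper, _), h in zip(intervals, heights)]

for (lower, upper, cases), h in zip(intervals, heights):
    print(f"{lower}-{upper - 1}: {cases} cases, height {h:.1f} per year of age")
```

The 35-44 group has more cases than a column of the same height over a 5-year interval would suggest, because its column is twice as wide.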

[Figure: histogram. x-axis: Onset (3-hour periods), March 13 to March 14]

Histograms using continuous data


Frequency Polygon

[Figure: Example of a Frequency Polygon. x-axis: Week (1 to 16); y-axis: Cases (0 to 80); series: Cases, Cases-FP]

Frequency Polygon: Characteristics

Graph of entire frequency distribution of a continuous variable

Number of events in interval plotted at midpoint of interval

Straight line connects points
Useful to compare two or more distributions on the same axis
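The construction described above can be sketched as a function that turns interval counts into the points of the polygon. Closing the polygon at zero, half an interval beyond each end, is a common convention assumed here rather than taken from the slides. The `polygon_points` function and the weekly counts are hypothetical.

```python
def polygon_points(intervals):
    """Return (midpoint, count) pairs, closed to zero at both ends."""
    # Plot the number of events at the midpoint of each interval.
    points = [((lower + upper) / 2, count) for lower, upper, count in intervals]
    # Assumed convention: anchor the polygon on the axis at the midpoints of
    # the empty intervals just before the first and just after the last one.
    first_lower, first_upper, _ = intervals[0]
    last_lower, last_upper, _ = intervals[-1]
    start = (first_lower - (first_upper - first_lower) / 2, 0)
    end = (last_upper + (last_upper - last_lower) / 2, 0)
    return [start] + points + [end]


# Made-up weekly case counts as (week start, week end, cases).
weekly_cases = [(0, 1, 4), (1, 2, 10), (2, 3, 7), (3, 4, 2)]

# Connecting consecutive points with straight lines draws the polygon.
for x, y in polygon_points(weekly_cases):
    print(x, y)
```

Because the result is just a list of points, two or more distributions can be overlaid on the same axes by computing one list per distribution.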


Frequency Polygon

Relative frequency of serum cholesterol level by age


Cumulative Frequency

[Figure: cumulative frequency curves. Cumulative incidence of hepatitis B virus infection by duration of high-risk behavior. x-axis: Years at Risk (0 to 12); y-axis: Percent infected with HBV (0 to 100); series: IV Drug Users, Homosexual Men, Heterosexuals - multiple partners]

Scatter Diagram

Relationship between age in years and heavy metal X exposure


Charts: Types

Appropriate for categorical data
Bar charts
  Simple
  Grouped
  Stacked
Pie charts

Simple Bar Chart

[Figure: horizontal bar chart. Annual Death Rates by Governorate, 1996-2000. x-axis: Rate per 100,000 population (0 to 400); y-axis: Governorate (AQABA, ZARQA, AMMAN, JARAS, MAFRQ, MADAB, TAFEL, IRBID, BALQA, KARAK, AJLON, MAANN)]

Bar Charts: Characteristics

Display data from a one-variable table
Each category of the variable is represented by a bar
Bar lengths are proportional to the number of events
Can be presented vertically or horizontally
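The proportionality rule can be seen in a plain-text bar chart, where each bar is drawn with one symbol per event. The sketch below uses the Status counts from the categorical example table. The `text_bar_chart` function is illustrative only, and a plotting library would normally be used instead.

```python
def text_bar_chart(counts, symbol="#"):
    """Return one line per category; each bar has one symbol per event."""
    # Pad labels to a common width so that all bars start in the same column.
    width = max(len(label) for label in counts)
    return [f"{label:<{width}} | {symbol * n} {n}" for label, n in counts.items()]


# Counts of the Status column in the categorical example table.
status_counts = {"Mild": 5, "Moderate": 5, "Severe": 4}

for line in text_bar_chart(status_counts):
    print(line)
```

Printing the bars line by line gives a horizontal chart; the same counts could equally be drawn as vertical bars.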


Vertical Bar Chart: Qualitative Ordinal Variable

[Figure: vertical bar chart. Distribution of Cases by Clinical Status. x-axis: Clinical Status (Mild, Moderate, Severe); y-axis: Cases (0 to 15)]

Grouped Bar Chart

[Figure: grouped bar chart. Treatment completion and cure of disease X in various racial groups, 1994-2000. x-axis: Race (Race A, Race B, Race C, Race D); y-axis: Frequency (0 to 1600); series: Cases, Completion, Cure]

Grouped Bar Chart: Characteristics

Illustrate data from two-variable or three-variable tables
Bars within groups are usually adjoining
Groups are separated by a space
Limit the number of bars within a group to fewer than four

Stacked Bar Chart

[Figure: stacked bar chart. Cases of malaria in a region, 1992-1996. x-axis: Time (1992 to 1996); y-axis: Cases (0 to 600); series: Others, Falciparum]

Pie Chart

[Figure: map. Geographic Distribution of Hepatitis A Virus Infection. Legend: Anti-HAV Prevalence (High, Intermediate, Low, Very Low)]

Selecting the Right Presentation Method (1)

Type of Graph or Diagram: Application

Arithmetic Scale Graph: Data or indicator trends over time.
Inset Graph: View a larger image of a portion of the host graph.
Histogram:
  1. Frequency distribution for a continuous variable.
  2. Number of cases during an epidemic (epidemic curve) or over time.

Selecting the Right Presentation Method (2)

Type of Graph or Diagram: Application

Frequency Polygons: Frequency distribution of a continuous variable for displaying components.
Cumulative Frequency Curve: Display cumulative frequency of a quantitative variable.
Scatter Plot: Plot the relationship between 2 variables, looking for any correlation.
Simple Bar Charts: Compare the size or frequency of different categories of the same variable.

Selecting the Right Presentation Method (3)

Type of Graph or Diagram: Application

Grouped Bar Chart: Compare the sizes or frequencies of different categories across 2-4 data sets.
Stacked Bar Chart: Compare totals and display component parts for several data groups.
Pie Chart: Display parts of a whole.

Selecting the Right Presentation Method (4)

Type of Graph or Diagram: Application

Spot Map: Display locations of cases or occurrences.
Area Map: Display occurrences or indicators as they correspond to geographic divisions.

Summary

Tables, charts, and graphs are effective tools for organizing, summarizing, and communicating data

In order to effectively communicate data, the correct presentation method must be selected
