Page 1: data dan pengolahan data.pdf

DATA AND DATA PROCESSING

Dr. Pudji Lestari, dr Mkes

Public Health Dept

Faculty of Medicine Airlangga University

Page 2: data dan pengolahan data.pdf
Page 3: data dan pengolahan data.pdf

Data

Data are pure and simple facts without any particular structure or organization: the basic atoms of information.

Page 4: data dan pengolahan data.pdf

Information

Information is structured data: the structure adds meaning to the data and gives it context and significance.

Page 5: data dan pengolahan data.pdf

Knowledge

Knowledge is what we know.

Example: a map. Like a physical map, knowledge helps us know where things are.

Knowledge is the ability to use information strategically to achieve one's objectives.

Page 6: data dan pengolahan data.pdf

Wisdom

Wisdom is the capacity to choose objectives consistent with one's values within a larger social context.

Page 7: data dan pengolahan data.pdf

Data to Information

Page 8: data dan pengolahan data.pdf

The statistical process

Collect

Data are obtained from research, surveys, observation, and so on

Raw data

The purpose, amount, and type of data determine the tools used

Process

Present

Page 9: data dan pengolahan data.pdf

Amount and type of data

Result

Interpretation

< 10 observations: simple tools, by hand

> 30 or > 100 observations: with software

Purpose

Data processing

Page 10: data dan pengolahan data.pdf

Data processing

Quality also depends on the earlier steps: data collection, coding, transfer/entry, and data cleaning

Processing depends on the purpose: what information is wanted

It is matched to the type of data, the number of variables, and the relationships between variables

The processing tool follows the amount of data

Page 11: data dan pengolahan data.pdf

Processing data

The simplest approach:

Sort the values (an array) from smallest to largest

For > 30 or > 100 observations: a frequency distribution table
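
The two simplest steps named on this slide, sorting into an array and building a frequency distribution table, might be sketched in Python as follows. The sample values, the class width of 10, and the function names are assumptions made for illustration only.

```python
from collections import Counter

def sorted_array(values):
    """Return the values ordered from smallest to largest."""
    return sorted(values)

def frequency_table(values, width):
    """Count integer values per class interval of the given width.

    Returns (lower, upper, count) rows ordered by lower bound.
    """
    counts = Counter((v // width) * width for v in values)
    return [(lower, lower + width - 1, counts[lower]) for lower in sorted(counts)]

# Illustrative integer sample (an assumption for this sketch).
sample = [54, 35, 49, 61, 58, 64, 32, 57, 43, 42]

print(sorted_array(sample))
for lower, upper, n in frequency_table(sample, width=10):
    print(f"{lower}-{upper}: {n}")
```

With only ten values the sorted array is already readable; the grouped table becomes the more useful view as the number of observations grows.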

Page 12: data dan pengolahan data.pdf

Processing data

This includes:

Calculating measures of central tendency (mean, median, mode)

Calculating measures of dispersion and shape (standard deviation, variance, quartiles, percentiles, skewness, kurtosis)

Performing statistical tests
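
As a rough illustration of the measures listed on this slide, the sketch below uses Python's standard `statistics` module. The sample is hypothetical, and the skewness shown is the simple moment-based coefficient, which is only one of several definitions in use.

```python
import statistics

# Hypothetical sample, chosen so that the mode is unique.
values = [4, 5, 5, 6, 7, 7, 7, 8, 9, 12]

# Measures of central tendency.
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)

# Measures of dispersion (sample versions, n - 1 in the denominator).
variance = statistics.variance(values)
std_dev = statistics.stdev(values)

# Moment-based skewness: third central moment divided by the
# second central moment raised to the power 1.5.
n = len(values)
m2 = sum((x - mean) ** 2 for x in values) / n
m3 = sum((x - mean) ** 3 for x in values) / n
skewness = m3 / m2 ** 1.5

print(mean, median, mode)
print(round(variance, 2), round(std_dev, 2), round(skewness, 2))
```

A positive skewness here reflects the single large value (12) pulling the right tail out.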

Page 13: data dan pengolahan data.pdf

Presenting data

Presentation is a combination of text and charts/tables, on the principle of being as clear and informative as possible.

If the data are simple enough, present them in text; if not, supplement the text with charts/tables.

Page 14: data dan pengolahan data.pdf

Presenting data

How data are presented depends on:

Purpose

Description only, or some further analytic purpose

Audience

Academic setting, newspapers, the general public

Page 15: data dan pengolahan data.pdf

Guidelines for tables

Besides absolute numbers, use rates/ratios to give a clearer picture

The numerator, denominator, and constant must be clearly stated

A table must be self-explanatory

Meaningful, unambiguous, and efficient
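
To make the roles of numerator, denominator, and constant concrete, here is a minimal sketch. The function name `rate` and the figures in the example are hypothetical and do not come from the slides.

```python
def rate(numerator, denominator, constant=1000):
    """Events per `constant` units of the population at risk."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator * constant / denominator

# Hypothetical figures: 12 cases among 4,800 children, per 1,000 children.
cases_per_1000 = rate(12, 4800, constant=1000)
print(f"{cases_per_1000:.1f} cases per 1,000 children")
```

Stating all three parts in a table title or footnote (what was counted, in whom, and per how many) is what lets a reader interpret the figure without the accompanying text.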

Page 16: data dan pengolahan data.pdf

Type of data

Classification by measurement scale:

Qualitative
  Binary (dichotomous)
  Ordinal
  Nominal

Quantitative
  Discrete
  Continuous

Page 17: data dan pengolahan data.pdf

Level of Measurement / Skala Data

Nominal Level of Measurement

Numbers or other symbols are assigned to a set of categories for the purpose of naming, labeling, or classifying the observations. They do not imply anything about the magnitude of, or quantitative difference between, the categories.

Example: gender.

Page 18: data dan pengolahan data.pdf

Ordinal Level of Measurement

Rank-ordered categories ranging from low to high.

Example: social class status as "upper class", "middle class", or "working class". A person in the "upper class" category has a higher class position than a person in the "middle class" category, but we do not know the magnitude of the differences between categories.

Page 19: data dan pengolahan data.pdf

Interval/Ratio Level of Measurement

If the categories (or values) of a variable can be rank-ordered, and the measurements for all the cases are expressed in the same units, then the variable is at the interval-ratio level of measurement.

Examples: age, income, and SAT scores. We can say how much larger or smaller one value is compared with another.

Page 20: data dan pengolahan data.pdf

Scale            Distinguishes categories   Order   Distance
Nominal          +
Ordinal          +                          +
Ratio/Interval   +                          +       +

Page 21: data dan pengolahan data.pdf

Summary of categorical data

We can obtain the frequencies of categorical data and summarize them in a table or chart.

Example: we have 21 agents of parasitic diseases isolated from children.

Giardia lamblia

Entamoeba histolytica

Ascaris lumbricoides

Enterobius vermicularis

Ascaris lumbricoides

Enterobius vermicularis

Giardia lamblia

Giardia lamblia

Entamoeba histolytica

Ascaris lumbricoides

Enterobius vermicularis

Ascaris lumbricoides

Enterobius vermicularis

Giardia lamblia

Giardia lamblia

Entamoeba histolytica

Ascaris lumbricoides

Enterobius vermicularis

Ascaris lumbricoides

Enterobius vermicularis

Giardia lamblia

Page 22: data dan pengolahan data.pdf

Summary of categorical data

The list of parasites detected gives us an idea of the frequency of each parasite, but not a clear one.

If we order the list, the picture becomes clearer.

Giardia lamblia

Giardia lamblia

Giardia lamblia

Giardia lamblia

Giardia lamblia

Giardia lamblia

Ascaris lumbricoides

Ascaris lumbricoides

Ascaris lumbricoides

Ascaris lumbricoides

Ascaris lumbricoides

Ascaris lumbricoides

Enterobius vermicularis

Enterobius vermicularis

Enterobius vermicularis

Enterobius vermicularis

Enterobius vermicularis

Enterobius vermicularis

Entamoeba histolytica

Entamoeba histolytica

Entamoeba histolytica

Page 23: data dan pengolahan data.pdf

Summary of categorical data

We can show the results in a frequency distribution.

Frequency distribution of intestinal parasites detected in children from CAISES Celaya, n=21

Parasite                   n
Giardia lamblia            6
Ascaris lumbricoides       6
Enterobius vermicularis    6
Entamoeba histolytica      3
Total                     21

Source: Laboratory report
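
The tallying step that turns the raw list of 21 isolates into this frequency distribution can be sketched with `collections.Counter`. The variable names are assumptions; the counts are those of the example.

```python
from collections import Counter

# The 21 isolates from the example, written compactly.
isolates = (
    ["Giardia lamblia"] * 6
    + ["Ascaris lumbricoides"] * 6
    + ["Enterobius vermicularis"] * 6
    + ["Entamoeba histolytica"] * 3
)

counts = Counter(isolates)

# Print one row per category, most frequent first, then the total.
for parasite, n in counts.most_common():
    print(f"{parasite:<25}{n:>3}")
print(f"{'Total':<25}{sum(counts.values()):>3}")
```

The order of the raw list does not matter to `Counter`, so the unsorted list from the laboratory report would give the same table.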

Page 24: data dan pengolahan data.pdf

Summary of categorical data

It is useful to show the frequency of each category expressed as a percentage of the total frequency.

This is called the distribution of relative frequencies.

Frequency distribution of intestinal parasites detected in children from CAISES Celaya, n=21

Parasite                   n        %
Giardia lamblia            6    28.57
Ascaris lumbricoides       6    28.57
Enterobius vermicularis    6    28.57
Entamoeba histolytica      3    14.29
Total                     21   100.00

Source: Laboratory report
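
The percentages in the relative frequency table follow from dividing each count by the total and multiplying by 100. A minimal sketch, in which the function name `relative_frequencies` is an assumption:

```python
def relative_frequencies(counts):
    """Map each category to its share of the total, in percent (2 decimals)."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 2) for category, n in counts.items()}

# Counts from the parasite example (n = 21).
counts = {
    "Giardia lamblia": 6,
    "Ascaris lumbricoides": 6,
    "Enterobius vermicularis": 6,
    "Entamoeba histolytica": 3,
}

for parasite, pct in relative_frequencies(counts).items():
    print(f"{parasite:<25}{counts[parasite]:>3}{pct:>8.2f}")
```

Rounded percentages do not always sum to exactly 100.00; when they do not, it is common to say so in a footnote rather than adjust a row.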

Page 25: data dan pengolahan data.pdf

Summary of categorical data

Sometimes the number of categories is large and should be reduced, for example by grouping the less frequent categories under "Other".

Distribution by cause of death in Celaya, Gto, during 2007

Cause of death                      n        %
Cardiovascular disease         12,525    21.96
Cancer                         10,321    18.10
Lower respiratory infections    8,745    15.34
Other                          25,435    44.60
Total                          57,026   100.00

Source: Certification of deaths

Page 26: data dan pengolahan data.pdf

Frequency distributions for quantitative data

With quantitative data, we need to group the data before showing them in a frequency or relative frequency table.

Frequency distribution of age among students of FEOC who have smoked at least once, n=354

Age (years)     n        %
19             52    14.70
20             32     9.00
21             46    12.99
22             67    18.94
23             26     7.35
24             77    21.76
25             54    15.26
Total         354   100.00

Source: Health survey

Note: the slide gives the total as 534, but the row counts sum to 354, and the percentages shown are consistent with 354.

Page 27: data dan pengolahan data.pdf

Frequency distributions for quantitative data

With quantitative data, it is useful to calculate the cumulative frequency.

Frequency distribution of age among students of FEOC who have smoked at least once, n=354

Age (years)     n        %   Cumulative %
19             52    14.70          14.70
20             32     9.00          23.70
21             46    12.99          36.69
22             67    18.94          55.63
23             26     7.35          62.98
24             77    21.76          84.74
25             54    15.26         100.00
Total         354   100.00

Source: Health survey

Note: the slide gives the total as 534, but the row counts sum to 354, and the percentages shown are consistent with 354.
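
The cumulative column can be derived from the counts with a running sum. The sketch below assumes the row counts of the age table, which sum to 354, and accumulates the counts before converting to percentages, so its results can differ from the slide in the second decimal (the slide appears to add up already rounded percentages).

```python
from itertools import accumulate

ages = [19, 20, 21, 22, 23, 24, 25]
counts = [52, 32, 46, 67, 26, 77, 54]  # row counts from the age table

total = sum(counts)

# Running total of the counts, then each running total as a percentage.
cumulative_counts = list(accumulate(counts))
cumulative_pct = [round(100 * c / total, 2) for c in cumulative_counts]

for age, n, cum in zip(ages, counts, cumulative_pct):
    print(f"{age:>3}{n:>6}{cum:>9.2f}")
```

Accumulating counts first avoids carrying rounding error down the column, and the last entry is exactly 100.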

Page 28: data dan pengolahan data.pdf

Bar chart

The frequency or relative frequency of a categorical variable can be shown easily in a bar chart.

It is used with categorical or discrete numerical data.

Each bar represents one category, and its height is the frequency or relative frequency.

Bars should be separated.

It is very important that the Y axis begins at 0.
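
These rules can be illustrated without a plotting library by a text-only bar chart: one separate bar per category, every bar starting at zero, and length proportional to frequency. This is only a sketch; the function name `text_bar_chart` is an assumption, and the counts are reused from the parasite example.

```python
def text_bar_chart(counts, symbol="#"):
    """Render one horizontal bar per category; bar length equals the count."""
    width = max(len(label) for label in counts)
    lines = []
    for label, n in counts.items():
        # Every bar starts at the same baseline (zero), right after the "|".
        lines.append(f"{label.ljust(width)} | {symbol * n} {n}")
    return "\n".join(lines)

counts = {
    "Giardia lamblia": 6,
    "Ascaris lumbricoides": 6,
    "Enterobius vermicularis": 6,
    "Entamoeba histolytica": 3,
}

chart = text_bar_chart(counts)
print(chart)
```

Because all bars share the zero baseline, a bar half as long really does mean half the frequency, which is the reason for the rule about the Y axis.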

Page 29: data dan pengolahan data.pdf

Bar chart

[Figure: bar chart "Gastrointestinal infections". X axis: Agents (Cryptos., E. histolyt., E. coli, Giardia, Rotavirus, Shigella). Y axis: Frequency, 0 to 7.]

Page 30: data dan pengolahan data.pdf

Grouped bar chart

If a nominal categorical variable is further divided into two groups (for example, males and females), we can show the data with a grouped bar chart.

It allows easy comparison between the groups.

Page 31: data dan pengolahan data.pdf

Grouped bar chart

[Figure: grouped bar chart "Gastrointestinal infections". X axis: Agents (Crypt., E. histolyt., E. coli, Giardia, Rotavirus, Shigella). Y axis: Frequency, 0 to 5. Legend: Males, Females.]

Page 32: data dan pengolahan data.pdf

Pie chart

It is an alternative way to show a categorical variable.

Each slice of the pie corresponds to the frequency or relative frequency of one category of the variable.

A pie chart shows only one variable.

If we want to make comparisons, we need to build two pie charts.

Page 33: data dan pengolahan data.pdf

Pie chart

[Figure: pie chart "Civil status of women in a community"]

Single       28%
Married      44%
Divorced     11%
Widowed       8%
Free union    9%

Page 34: data dan pengolahan data.pdf

Distribution of frequency charts: histograms

It is useful for quantitative variables.

There are no spaces between the bars.

The area of a bar, not its height, represents its frequency.

The X axis should be continuous.

The Y axis should begin at 0.

The width of a bar represents the interval of its group.
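
The statement that area, not height, carries the frequency matters when class intervals have unequal widths: the bar height is then the frequency divided by the interval width. A minimal sketch with hypothetical class intervals follows; the function name `bar_heights` is an assumption.

```python
def bar_heights(classes):
    """Return the height of each histogram bar as count / interval width.

    `classes` is a list of (lower, upper, count) tuples.
    """
    return [count / (upper - lower) for lower, upper, count in classes]

# Hypothetical classes: the last interval is twice as wide as the others.
classes = [(0, 5, 10), (5, 10, 20), (10, 20, 20)]

for (lower, upper, count), height in zip(classes, bar_heights(classes)):
    area = height * (upper - lower)
    print(f"{lower}-{upper}: height {height}, area {area}")
```

The second and third classes hold the same number of observations, but the wider class gets a bar half as tall, so the two areas stay equal. When all intervals have the same width, height and area are proportional and the distinction disappears.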

Page 35: data dan pengolahan data.pdf

Distribution of frequency charts: histograms

[Figure: histogram "Number of sons in women from Celaya". X axis: Number of sons (1, 2, 3, 4, 5, 6, 7, 8+). Y axis: Number of women, 0 to 700.]

Page 36: data dan pengolahan data.pdf

Distribution of frequency charts: frequency polygon

It is another way to show the frequency distribution of a numerical variable.

It is built by joining the midpoints of the tops of the histogram bars.

We should take into account the width of each bar.

We can plot more than one polygon in the same chart to make comparisons.

Page 37: data dan pengolahan data.pdf

Distribution of frequency charts: frequency polygon

[Figure: frequency polygon "Number of sons of women from Celaya". X axis: Number of sons (1, 2, 3, 4, 5, 6, 7, 8+). Y axis: Number of women, 0 to 700.]

Page 38: data dan pengolahan data.pdf

Distribution of frequencies: cumulative histogram

We can plot it directly from a cumulative frequency table.

It is not necessary to adjust the height of the bars, because each cumulative frequency represents the total frequency up to and including the upper limit of the interval.

Page 39: data dan pengolahan data.pdf

Distribution of frequencies: cumulative histogram

[Figure: cumulative histogram "Cumulative frequency of birthweight". X axis: Weight (classes 501-, 1001-, 1501-, 2001-, 2501-, 3001-, 3501-, 4001-, 4501-, 5000+). Y axis: Cumulative frequency (%), 0 to 120. Series: Newborns.]

Page 40: data dan pengolahan data.pdf

Distribution of frequencies: cumulative frequency polygon

We use it to see the proportions below or above a point on the curve.

We can read the median and percentiles directly.

If the distribution is symmetrical, the curve has a symmetrical S shape.

If the distribution is skewed to the right or to the left, the curve is flattened on that side.
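
Reading the median or a percentile "directly" amounts to finding where the cumulative curve first reaches the chosen percentage. The sketch below does this lookup on the cumulative percentages of the age table from the earlier slide; the function name is an assumption, and for discrete values it returns the first value whose cumulative percentage reaches the target rather than interpolating.

```python
from bisect import bisect_left

def value_at_percentile(values, cumulative_pct, p):
    """Return the first value whose cumulative percentage is at least p."""
    index = bisect_left(cumulative_pct, p)
    return values[index]

ages = [19, 20, 21, 22, 23, 24, 25]
cumulative_pct = [14.70, 23.70, 36.69, 55.63, 62.98, 84.74, 100.00]

median_age = value_at_percentile(ages, cumulative_pct, 50)
q1_age = value_at_percentile(ages, cumulative_pct, 25)
q3_age = value_at_percentile(ages, cumulative_pct, 75)
print(q1_age, median_age, q3_age)
```

On a plotted polygon the same reading is done by drawing a horizontal line at 50% and dropping down to the X axis where it meets the curve.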

Page 41: data dan pengolahan data.pdf

Distribution of frequencies: cumulative frequency polygon

[Figure: cumulative frequency polygon "Cumulative frequencies of birthweight". X axis: Weight (classes 501-, 1001-, 1501-, 2001-, 2501-, 3001-, 3501-, 4001-, 4501-, 5000+). Y axis: Cumulative frequency (%), 0 to 120. Series: Newborns.]

Page 42: data dan pengolahan data.pdf

Other charts: stem-and-leaf plot

We use it to show quantitative data directly, or as a preliminary step in building a frequency distribution.

We organize the data by deciding on the number of divisions (5 to 15).

We draw a vertical line and put the first digit of each value to the left of the line (the stem) and the second digit to the right of the line (the leaves).

Page 43: data dan pengolahan data.pdf

Other charts: stem-and-leaf plot

Patient   Age
1         54
2         35
3         49
4         61
5         58
6         64
7         32
8         57
9         43
10        42

3 | 5 2
4 | 9 3 2
5 | 4 8 7
6 | 1 4
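
The display for the ten patient ages can be reproduced with a few lines of Python. This is a sketch for two-digit values only; the function name `stem_and_leaf` is an assumption, and leaves are kept in order of appearance, as on the slide, rather than sorted.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group two-digit values by tens digit (stem); the units digit is the leaf."""
    stems = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)
        stems[stem].append(leaf)
    # Order the stems from smallest to largest; leaves keep input order.
    return dict(sorted(stems.items()))

ages = [54, 35, 49, 61, 58, 64, 32, 57, 43, 42]

plot = stem_and_leaf(ages)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Many textbooks sort the leaves within each stem; replacing `leaves` with `sorted(leaves)` in the print loop would give that variant.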

Page 44: data dan pengolahan data.pdf

Other charts: box plot

We draw a vertical line that represents the range of the distribution.

We draw a horizontal line at the third quartile and another at the first quartile; together they form the box.

The middle point of the distribution (the median) is shown as a horizontal line inside the box.
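
The values a box plot is drawn from (minimum, first quartile, median, third quartile, maximum) can be computed as below. This is a sketch using `statistics.quantiles` with its default method; other software may use a different quartile convention and give slightly different Q1 and Q3. The sample reuses the ten patient ages from the stem-and-leaf example, and the function name is an assumption.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)

ages = [54, 35, 49, 61, 58, 64, 32, 57, 43, 42]

minimum, q1, median, q3, maximum = five_number_summary(ages)
print(f"range line: {minimum} to {maximum}")
print(f"box: {q1} to {q3}, median line at {median}")
```

The median line sits wherever the median falls between Q1 and Q3, so an off-center line inside the box is itself a sign of skewness.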

Page 45: data dan pengolahan data.pdf

Other charts: box plot

[Figure: box plot. Vertical axis from 500 to 5500 in steps of 500.]

Page 46: data dan pengolahan data.pdf

Box plot

Page 47: data dan pengolahan data.pdf
Page 49: data dan pengolahan data.pdf
Page 50: data dan pengolahan data.pdf
Page 51: data dan pengolahan data.pdf
Page 52: data dan pengolahan data.pdf
Page 53: data dan pengolahan data.pdf