missing data & how to handle it.pptx
TRANSCRIPT
-
7/24/2019 Missing data & how to handle it.pptx
Missing data & how to handle it
Arooj Arshad
PhD Scholar
-
Goals
Discuss ways to evaluate and understand missing data
Discuss common missing data methods
Know the advantages and disadvantages of common methods
Treatment of the missing data
-
Missing data can occur for many reasons:
participants can fail to respond to questions (legitimately or illegitimately; more on that later),
equipment and data collecting or recording mechanisms can malfunction,
subjects can withdraw from studies before they are completed,
data entry errors can occur.
-
Difference between missing and legitimate missing data
-
Methods for analyzing missing data require assumptions about the nature of the data and about the reasons for the missing observations that are often not acknowledged.
When researchers use missing data methods without carefully considering the assumptions required of that method, they run the risk of obtaining biased and misleading results. Reviewing the stages of data collection, data preparation, data analysis, and interpretation of results will highlight the issues that researchers must consider in making a decision about how to handle missing data in their work.
-
Point to be remembered
All researchers should examine their data for missingness, and researchers wanting the best (i.e., the most replicable and generalizable) results from their research need to be prepared to deal with missing data in the most appropriate and desirable way possible.
-
Missing Data Mechanisms
Missing Completely at Random (MCAR)
Probability of missing data on Y is unrelated to Y and X.
Example: the reporting of income by the respondents.
Checked with the help of Little's MCAR test.
Missing at Random (MAR)
Probability of missing data on Y is related to X, but not to Y itself once X is taken into account.
Example: for really sick patients, clinicians may not draw blood for routine labs.
Not Missing at Random (NMAR)
Probability of missing data on Y depends on the value of Y.
Example: respondents with high income are less likely to report income.
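The three mechanisms can be made concrete with a small simulation. The sketch below is only an illustration: the variables x and y, the logistic missingness probabilities, and the helper observed_mean are assumptions made for this example, not something taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data: X is fully observed, Y (say, income) depends on X.
x = rng.normal(size=n)
y = 2.0 + x + rng.normal(size=n)

# MCAR: every Y value has the same 30% chance of being missing.
miss_mcar = rng.random(n) < 0.3

# MAR: the chance of a missing Y rises with the observed X only.
miss_mar = rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * x))

# NMAR: the chance of a missing Y rises with the value of Y itself.
miss_nmar = rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * (y - 2.0)))


def observed_mean(values, missing):
    """Mean of the values that were not flagged as missing."""
    return values[~missing].mean()


means = {
    "full": y.mean(),
    "MCAR": observed_mean(y, miss_mcar),
    "MAR": observed_mean(y, miss_mar),
    "NMAR": observed_mean(y, miss_nmar),
}
for name, value in means.items():
    print(f"{name:5s} mean of Y: {value:.3f}")
```

In this setup the MCAR mean should stay close to the full-data mean, while the MAR and NMAR means of the observed values drift downward, because large values of Y are the ones most likely to be lost.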
-
Missing Data Consequences
Bias
Estimate systematically deviates from the quantity of interest.
No bias if the data is MCAR, but bias can occur when the data is not MCAR.
Variance
Missing data can sometimes lead to wrong standard errors.
Wrong study conclusions about the relationship of variables to outcomes.
-
Commonly-Used Missing Data Handling Methods
-
Commonly-Used Missing Data Methods
Deletion Methods: listwise/complete case deletion, pairwise deletion
Single Imputation Methods: mean/mode substitution, dummy variable method, single regression
Model-Based Methods: Maximum Likelihood, Multiple imputation
-
Deletion Method
-
Listwise Deletion (Complete Case Analysis)
Only analyze cases with complete data, dropping every case that has a missing value on any variable in the analysis.
When a researcher is estimating a model, such as a linear regression, most statistical packages use listwise deletion by default.
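A minimal sketch of listwise deletion with NumPy follows. The small data set (columns standing for age, income and score) is invented for illustration; NaN marks a missing answer.

```python
import numpy as np

# Hypothetical survey data; columns are age, income, score.
data = np.array([
    [25.0, 30.0, 3.1],
    [31.0, np.nan, 3.8],
    [np.nan, 41.0, 2.9],
    [47.0, 58.0, np.nan],
    [52.0, np.nan, 4.4],
    [38.0, 45.0, 3.6],
])

# Count missing values per variable before deciding how to handle them.
missing_per_column = np.isnan(data).sum(axis=0)

# Listwise deletion: keep only the rows with no missing value in any column.
complete_mask = ~np.isnan(data).any(axis=1)
complete_cases = data[complete_mask]

print("missing per column:", missing_per_column)
print(f"cases before: {len(data)}, after listwise deletion: {len(complete_cases)}")
```

Only two of the six hypothetical cases survive here, even though no single variable has more than two missing values, which shows how quickly n can shrink.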
-
Listwise Deletion (Complete Case Analysis)
Advantages
Ease of implementation.
Comparability across analyses.
Disadvantages
Reduces statistical power (because it lowers n, a researcher cannot anticipate whether an adequate amount of data will remain for the analysis).
Doesn't use all information.
Estimates may be biased if data isn't MCAR (complete case analysis assumes that the observed complete cases are a random sample of the originally targeted sample, or in Rubin's (?) terminology, that the missing data are MCAR).
-
Pairwise deletion (Available Case Analysis)
Analysis with all cases in which the variables of interest are present.
Advantage:
Keeps as many cases as possible for each analysis.
Uses all information possible with each analysis.
Disadvantage:
Can't compare analyses because the sample is different each time.
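The changing sample can be seen by computing each correlation on the cases available for that pair. This is a sketch on invented data; the helper pairwise_corr and the column names are assumptions made for the example.

```python
import numpy as np

# Hypothetical survey data; columns are age, income, score; NaN marks a missing answer.
data = np.array([
    [25.0, 30.0, 3.1],
    [31.0, np.nan, 3.8],
    [np.nan, 41.0, 2.9],
    [47.0, 58.0, np.nan],
    [52.0, np.nan, 4.4],
    [38.0, 45.0, 3.6],
])
names = ["age", "income", "score"]


def pairwise_corr(a, b):
    """Correlation and case count using every row where both a and b are observed."""
    both = ~np.isnan(a) & ~np.isnan(b)
    r = np.corrcoef(a[both], b[both])[0, 1]
    return r, int(both.sum())


pair_n = {}
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        r, n_used = pairwise_corr(data[:, i], data[:, j])
        pair_n[(names[i], names[j])] = n_used
        print(f"{names[i]} vs {names[j]}: r = {r:.2f}, n = {n_used}")
```

Each pair keeps more cases than listwise deletion would on the same data, but the correlations rest on different subsets of respondents, which is why they are hard to compare.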
-
Single Imputation Methods
-
Single Imputation Methods
Mean/Mode substitution
Dummy variable control
Conditional mean substitution
-
Mean/Mode Substitution
Replace missing value with sample mean or mode
Run analyses as if complete cases analysis
Advantages
Can use complete case analysis methods
Disadvantages
Reduces variability
Weakens covariance and correlation estimates in the data (because it ignores the relationship between variables)
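Both disadvantages show up in a small simulation. This is a sketch under assumed conditions (simulated x and y, 40% of Y removed completely at random), not an analysis from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical data in which Y is correlated with X.
x = rng.normal(size=n)
y = x + rng.normal(size=n)

# Remove 40% of the Y values completely at random.
y_obs = y.copy()
y_obs[rng.random(n) < 0.4] = np.nan

# Mean substitution: every missing Y gets the mean of the observed Y values.
y_imputed = np.where(np.isnan(y_obs), np.nanmean(y_obs), y_obs)

var_full = y.var(ddof=1)
var_imputed = y_imputed.var(ddof=1)
corr_full = np.corrcoef(x, y)[0, 1]
corr_imputed = np.corrcoef(x, y_imputed)[0, 1]

print(f"variance of Y:   full {var_full:.2f}, mean-imputed {var_imputed:.2f}")
print(f"corr(X, Y):      full {corr_full:.2f}, mean-imputed {corr_imputed:.2f}")
```

The imputed values all sit at one point, so the variance of Y shrinks and its correlation with X is pulled toward zero.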
-
Dummy Variable Adjustment
Create an indicator for missing value (
-
Regression Imputation
Replaces missing values with predicted score from a regression equation.
Advantage:
Uses information from observed data
Disadvantages:
Overestimates model fit and correlation estimates
Weakens variance
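A minimal sketch of regression imputation, assuming a single fully observed predictor x and simulated data. The straight-line fit with np.polyfit is one simple choice made for this example.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000

# Hypothetical data: X is fully observed, Y depends on X.
x = rng.normal(size=n)
y = x + rng.normal(size=n)

# 40% of Y is missing completely at random.
missing = rng.random(n) < 0.4

# Fit Y = intercept + slope * X on the complete cases only.
slope, intercept = np.polyfit(x[~missing], y[~missing], deg=1)

# Replace each missing Y with its predicted score from the regression equation.
y_imputed = y.copy()
y_imputed[missing] = intercept + slope * x[missing]

var_full = y.var(ddof=1)
var_imputed = y_imputed.var(ddof=1)
corr_full = np.corrcoef(x, y)[0, 1]
corr_imputed = np.corrcoef(x, y_imputed)[0, 1]

print(f"variance of Y:   full {var_full:.2f}, regression-imputed {var_imputed:.2f}")
print(f"corr(X, Y):      full {corr_full:.2f}, regression-imputed {corr_imputed:.2f}")
```

Because the imputed points fall exactly on the fitted line, the correlation comes out too high and the variance of Y too low, matching the two disadvantages listed above.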
-
Model Based Methods
-
Model Based Methods
Maximum Likelihood Using EM algorithm
Multiple imputation
These methods share two assumptions: that the joint distribution of the data is multivariate normal, and that the missing data mechanism is ignorable.
-
Identifies the set of parameter values that produces the highest log-likelihood.
ML estimate: value that is most likely to have resulted in the observed data
Conceptually, the process is the same with or without missing data
Advantages:
Uses full information (both complete cases and incomplete cases) to calculate log likelihood
Unbiased parameter estimates with MCAR/MAR data
Disadvantages:
SEs biased downward; can be adjusted by using the observed information matrix
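As a rough illustration of maximum likelihood via EM, the sketch below estimates the mean of Y for a bivariate normal pair (X, Y) where X is fully observed and Y is missing at random. The simulated data, the MAR rule, and the fixed count of 100 iterations are assumptions made for this example; it is not the procedure of any particular statistical package.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000

# Hypothetical bivariate normal data; the true mean of Y is 1.0.
x = rng.normal(size=n)
y = 1.0 + x + rng.normal(size=n)

# MAR: Y is more likely to be missing when the observed X is large.
missing = rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * x))
obs = ~missing
y_obs = np.where(obs, y, np.nan)

# X is complete, so its mean and variance are estimated once from all cases.
mu_x, var_x = x.mean(), x.var()

# Start the Y parameters from the complete cases.
mu_y, var_y = y_obs[obs].mean(), y_obs[obs].var()
cov_xy = np.mean((x[obs] - x[obs].mean()) * (y_obs[obs] - mu_y))

for _ in range(100):
    # E-step: expected Y and Y^2 for missing cases, given X and current parameters.
    slope = cov_xy / var_x
    resid_var = var_y - slope * cov_xy
    ey = np.where(obs, y_obs, mu_y + slope * (x - mu_x))
    ey2 = np.where(obs, y_obs**2, ey**2 + resid_var)
    # M-step: update the parameters from the expected sufficient statistics.
    mu_y = ey.mean()
    var_y = ey2.mean() - mu_y**2
    cov_xy = np.mean(x * ey) - mu_x * mu_y

cc_mean = y_obs[obs].mean()
print(f"complete-case mean of Y: {cc_mean:.3f}")
print(f"EM (ML) estimate of mean of Y: {mu_y:.3f}")
```

Under this MAR setup the complete-case mean should come out clearly too low, while the EM estimate should land close to the true value of 1.0, because the incomplete cases still contribute their X values to the likelihood.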
-
We can base estimation on the likelihood of the observed data.
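One common way to write that observed-data likelihood, assuming independent cases and an ignorable missing data mechanism, is shown below. The notation is introduced here and is not taken from the slides: \theta is the parameter vector, y_{i,obs} and y_{i,mis} are the observed and missing parts of case i, and f is the joint density of the complete data.

```latex
L(\theta \mid Y_{\mathrm{obs}})
  = \prod_{i=1}^{n} \int f\left(y_{i,\mathrm{obs}},\, y_{i,\mathrm{mis}} \mid \theta\right) \, dy_{i,\mathrm{mis}}
```

For a case with no missing values the integral disappears and the term is just the usual density, so complete and incomplete cases enter the same likelihood.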
-
Multiple Imputation
-
References
Allison, Paul D. CC