Estimating the correlation coefficient with censored data
Yanming Li (1), Brenda W. Gillespie (1), Kerby Shedden (1), John A. Gillespie (2)
1. University of Michigan 2. University of Michigan Dearborn
2013 ASA-CSP
Motivation
• A study of Belgian barn owls* investigated how concentrations of perfluoroalkyl substances (PFASs), including perfluorooctane sulfonate (PFOS), in tail feathers and in soft tissues are correlated.
• Statistical methods for censored data, with user-friendly interfaces, are needed to handle concentrations below the limit of detection (LOD).
Examples of applying our method and R package to the Belgian barn owl data:
• Left panel: scatter plot showing fully observed data, censored (left- and interval-censored) data, and partially missing data.
• Right panel: profile likelihood function for the correlation coefficient, showing the point estimate and the 95% confidence interval.
* Perfluoroalkyl substances in soft tissues and tail feathers of Belgian barn owls using statistical methods for left-censored data to handle non-detects. Veerle J. et al., Environment International 53 (2013) 9-16.
Outline
• Estimating the correlation coefficient for bivariate Gaussian data with censored and/or missing values.
• Using parametric likelihood-based inference.
• Presenting an R package that handles different types of censoring (left, right, interval, and mixtures of these).
• Presenting ways to draw scatterplots of censored bivariate data and to graph the profile likelihood function.
Censored Data: Their Likelihood, Maximum-likelihood Estimation and Confidence Interval Estimation
• Likelihood (for left-censored data only; o = observed, c = censored)*
[Equation not preserved in the transcript. Its four factors are labeled: complete data; x censored, y observed; x observed, y censored; both x and y censored.]
• Construct the confidence interval via likelihood ratio tests
[Equation not preserved in the transcript. Its terms are labeled: log profile likelihood; fixed at marginal maxima; chi-square critical value.]
A confidence interval with a given coverage probability is the set of correlation values that the likelihood ratio test does not reject at the corresponding level.
* Assumes that data are missing completely at random and that censoring is random.
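The formulas on this slide were images and are missing from the transcript. As a hedged reconstruction, a standard form that matches the surviving labels is sketched below. The notation is an assumption introduced here, not taken from the slide: f is the bivariate normal density with parameters θ = (μ_x, μ_y, σ_x, σ_y, ρ); d_{x,i} and d_{y,i} are the detection limits for observation i; the index sets oo, co, oc, cc collect the observations by whether x and y are observed (o) or left-censored (c); and ℓ_p(ρ) is the log profile likelihood, that is, the log-likelihood evaluated at ρ with the remaining parameters set to the values the method uses for that ρ.

```latex
% Likelihood for left-censored bivariate normal data (assumed notation).
\[
L(\theta) =
  \prod_{i \in oo} f(x_i, y_i; \theta)
  \prod_{i \in co} \int_{-\infty}^{d_{x,i}} f(u, y_i; \theta)\, du
  \prod_{i \in oc} \int_{-\infty}^{d_{y,i}} f(x_i, v; \theta)\, dv
  \prod_{i \in cc} \int_{-\infty}^{d_{x,i}} \int_{-\infty}^{d_{y,i}} f(u, v; \theta)\, dv\, du
\]

% Likelihood-ratio confidence set with coverage probability 1 - \alpha,
% where \chi^2_{1,\,1-\alpha} is the (1 - \alpha) quantile of the
% chi-square distribution with one degree of freedom.
\[
\left\{ \rho :\; 2\left[ \ell_p(\hat{\rho}) - \ell_p(\rho) \right] \le \chi^2_{1,\,1-\alpha} \right\}
\]
```

Each factor replaces the density by its integral over the region in which a censored coordinate could lie, so a fully observed pair contributes a density value and a doubly censored pair contributes a bivariate normal probability.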
The R package: Clikcorr
Censored data likelihood based correlation inference
• Input data format (one variable)

| | Lower bound | Upper bound |
|---|---|---|
| Observed | 10.9 | 10.9 |
| Left censored | NA | 3.6 |
| Right censored | 16.7 | NA |
| Interval censored | 7.8 | 13.4 |
| Missing | NA | NA |
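As an illustration of this encoding, the sketch below builds the five example rows in base R and labels each one from its pattern of bounds. The data frame `obs` and the helper `censor_type` are hypothetical names introduced here; they are not part of the Clikcorr package.

```r
# Hypothetical example: one measured variable stored as a pair of bounds.
obs <- data.frame(
  L1 = c(10.9, NA, 16.7, 7.8, NA),   # lower bounds
  U1 = c(10.9, 3.6, NA, 13.4, NA)    # upper bounds
)

# Classify each row from which bounds are NA and whether they coincide.
censor_type <- function(lower, upper) {
  ifelse(is.na(lower) & is.na(upper), "missing",
    ifelse(is.na(lower), "left censored",
      ifelse(is.na(upper), "right censored",
        ifelse(lower == upper, "observed", "interval censored"))))
}

obs$type <- censor_type(obs$L1, obs$U1)
print(obs)
```

A second variable would add its own pair of columns (for example `L2` and `U2`), giving one row per subject and two columns per variable.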
• Syntax to run the main estimating function
Clikcorr(Data, "L1", "U1", "L2", "U2", cp=.95)
L1, U1: lower and upper bounds for the 1st variable
L2, U2: lower and upper bounds for the 2nd variable
cp: coverage probability of the confidence interval
• Output
  • Maximum likelihood estimate of the correlation coefficient
  • Estimated bivariate variance-covariance matrix
  • Estimated means
  • p-value for the likelihood ratio test with null hypothesis r=0
  • Lower bound of the CI
  • Upper bound of the CI
  • Log-likelihood value at the MLE
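To make the output quantities concrete, here is a minimal base R sketch for the special case with no censoring and no missing values, where the profile log-likelihood of the correlation has a closed form up to an additive constant. It is not the Clikcorr implementation, which must also handle the censored likelihood terms. The helpers `profile_loglik` and `profile_ci` are hypothetical names, and the data are simulated.

```r
# Profile log-likelihood of rho for complete bivariate normal data,
# up to an additive constant; r is the sample correlation.
profile_loglik <- function(rho, r, n) {
  (n / 2) * log(1 - rho^2) - n * log(1 - rho * r)
}

# Likelihood-ratio confidence interval and test of rho = 0 (complete data only).
profile_ci <- function(x, y, cp = 0.95) {
  n <- length(x)
  r <- cor(x, y)                    # MLE of rho when nothing is censored
  crit <- qchisq(cp, df = 1)        # chi-square critical value
  ll_max <- profile_loglik(r, r, n)
  # g is positive inside the confidence set and negative outside it.
  g <- function(rho) crit - 2 * (ll_max - profile_loglik(rho, r, n))
  eps <- 1e-8
  lower <- uniroot(g, c(-1 + eps, r), tol = 1e-8)$root
  upper <- uniroot(g, c(r, 1 - eps), tol = 1e-8)$root
  lr0 <- 2 * (ll_max - profile_loglik(0, r, n))
  list(estimate = r, lower = lower, upper = upper,
       p.value = pchisq(lr0, df = 1, lower.tail = FALSE))
}

# Simulated data with true correlation 0.5.
set.seed(1)
x <- rnorm(200)
y <- 0.5 * x + sqrt(1 - 0.5^2) * rnorm(200)
res <- profile_ci(x, y, cp = 0.95)
print(res)
```

With censored observations the profile log-likelihood has no closed form, so each evaluation needs a numerical maximization over the means and variances.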
The R package: Graphics
Clikcorr.profilePlot(Data, "L1", "U1", "L2", "U2", cp=0.95)
Clikcorr.scatterPlot(Data, c("L1","L2","L3"), c("U1","U2","U3"))
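One way to picture a censored scatterplot is to draw fully observed pairs as points and singly censored pairs as line segments spanning the range in which the unobserved coordinate could lie. The base R sketch below follows that idea. It is an illustration only and does not reproduce how `Clikcorr.scatterPlot` renders its output; the data frame `d`, the helper `clip_bounds`, and the axis limits are all hypothetical.

```r
# Hypothetical bounds for two variables in lower/upper format.
d <- data.frame(
  L1 = c(10.9, NA, 7.8, 12.1), U1 = c(10.9, 3.6, 13.4, 12.1),
  L2 = c(5.2, 4.1, 6.0, NA),   U2 = c(5.2, 4.1, 6.0, 2.5)
)

# Replace an open-ended (NA) bound by the plotting limit, so each coordinate
# becomes either a single value (lo == hi) or a finite range (lo < hi).
clip_bounds <- function(lower, upper, lim) {
  data.frame(lo = ifelse(is.na(lower), lim[1], lower),
             hi = ifelse(is.na(upper), lim[2], upper))
}

xlim <- c(0, 15)
ylim <- c(0, 8)
bx <- clip_bounds(d$L1, d$U1, xlim)
by <- clip_bounds(d$L2, d$U2, ylim)

# Empty frame, then segments for censored pairs and dots for observed pairs.
# Pairs censored in both variables would need rect() rather than segments();
# this example has none.
plot(NA, xlim = xlim, ylim = ylim, xlab = "Variable 1", ylab = "Variable 2")
segments(bx$lo, by$lo, bx$hi, by$hi, col = "grey40")
exact <- bx$lo == bx$hi & by$lo == by$hi
points(bx$lo[exact], by$lo[exact], pch = 19)
```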
Results From Simulated Data

| | 0% censored* | | | 25% censored | | | 75% censored | | |
|---|---|---|---|---|---|---|---|---|---|
| | r=0.00 | r=0.50 | r=0.95 | r=0.00 | r=0.50 | r=0.95 | r=0.00 | r=0.50 | r=0.95 |
| n=50 | 0.938 (0.956) | 0.962 (0.968) | 0.946 (0.954) | 0.938 | 0.946 | 0.968 | 0.958 | 0.954 | 0.956 |
| n=200 | 0.944 (0.948) | 0.948 (0.954) | 0.942 (0.950) | 0.954 | 0.960 | 0.948 | 0.972 | 0.964 | 0.960 |
| n=500 | 0.932 (0.938) | 0.934 (0.948) | 0.952 (0.954) | 0.946 | 0.948 | 0.964 | 0.952 | 0.944 | 0.964 |

Table 1: 95% confidence interval coverage probabilities for different censoring proportions
Coverage probabilities are estimated from 500 replications.
* Coverage probabilities in parentheses are calculated from the Fisher transformation in the case of no censoring.
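The Fisher-transformation coverage figures can be approximated with a short simulation. The base R sketch below estimates the coverage of the interval returned by `cor.test` for complete bivariate normal data. The function name `fisher_coverage`, the seed, and the data-generating recipe are assumptions made for illustration, so the result will not match the table entries exactly.

```r
# Estimate coverage of the Fisher-z confidence interval by simulation.
fisher_coverage <- function(n, rho, reps = 500, cp = 0.95) {
  hits <- replicate(reps, {
    x <- rnorm(n)
    y <- rho * x + sqrt(1 - rho^2) * rnorm(n)   # correlation rho with x
    ci <- cor.test(x, y, conf.level = cp)$conf.int
    ci[1] <= rho && rho <= ci[2]                # does the interval cover rho?
  })
  mean(hits)
}

set.seed(2013)
cov50 <- fisher_coverage(n = 50, rho = 0.5)
print(cov50)
```

With 500 replications the Monte Carlo standard error of an estimated 95% coverage is roughly 0.01, which is worth keeping in mind when comparing entries across the table.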
| Censoring % | X 0%; XY 30%; Y 0% | | | X 15%; XY 15%; Y 15% | | | X 30%; XY 0%; Y 30% | | |
|---|---|---|---|---|---|---|---|---|---|
| | r=0.00 | r=0.50 | r=0.95 | r=0.00 | r=0.50 | r=0.95 | r=0.00 | r=0.50 | r=0.95 |
| n=50 | 36.15 | 50.46 | 58.65 | 22.02 | 25.39 | 31.98 | 1.16 | 1.41 | 1.78 |
| n=200 | 166.26 | 206.09 | 233.09 | 88.98 | 102.72 | 110.46 | 1.51 | 2.10 | 2.19 |
| n=500 | 267.11 | 514.87 | 727.94 | 175.78 | 275.37 | 374.27 | 1.78 | 2.38 | 3.50 |

Table 2: Run time (seconds) for different settings of r, n and censoring percentages
Results From Simulated Data
Table 3: Bias (MSE) for normally distributed detection limits, where data are simulated from an independent N(0,1) distribution

| Detection limits | N(-2,1); avg. 25% censored | | | N(0,1); avg. 50% censored | | | N(2,1); avg. 75% censored | | |
|---|---|---|---|---|---|---|---|---|---|
| | r=0.00 | r=0.50 | r=0.95 | r=0.00 | r=0.50 | r=0.95 | r=0.00 | r=0.50 | r=0.95 |
| n=50 | 0.006 (0.019) | -0.021 (0.030) | -0.018 (0.053) | 0.031 (0.035) | 0.007 (0.064) | 0.008 (0.076) | -- (--) | -- (--) | -- (--) |
| n=200 | -0.005 (0.008) | -0.005 (0.012) | -0.026 (0.014) | 0.020 (0.012) | 0.003 (0.010) | -0.037 (0.014) | -0.017 (0.024) | 0.053 (0.034) | 0.021 (0.029) |
| n=500 | 0.005 (0.002) | -0.008 (0.003) | -0.002 (0.005) | 0.008 (0.003) | -0.001 (0.006) | -0.010 (0.007) | -0.006 (0.010) | -0.018 (0.009) | -0.010 (0.013) |

Bias and MSE are estimated from 50 replications.
Sensitivity to Misspecification

| | n=50 | | | n=200 | | | n=500 | | |
|---|---|---|---|---|---|---|---|---|---|
| | r=0.00 | r=0.50 | r=0.95 | r=0.00 | r=0.50 | r=0.95 | r=0.00 | r=0.50 | r=0.95 |
| df=3 | 0.954 | 0.844 | 0.694 | 0.962 | 0.802 | 0.578 | 0.956 | 0.744 | 0.542 |
| df=5 | 0.946 | 0.914 | 0.828 | 0.936 | 0.912 | 0.818 | 0.958 | 0.884 | 0.792 |
| df=10 | 0.944 | 0.930 | 0.882 | 0.928 | 0.956 | 0.892 | 0.946 | 0.942 | 0.920 |
| df=20 | 0.948 | 0.916 | 0.930 | 0.944 | 0.944 | 0.922 | 0.964 | 0.938 | 0.944 |

Table 4: 95% confidence interval coverage probabilities of bivariate normal estimates for bivariate t generated data
• Coverage probabilities are estimated from 500 replications.
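For readers who want to repeat a misspecification check of this kind, bivariate t data can be generated in base R by dividing correlated normal draws by a shared chi-square scale. The sketch below shows one common construction. The function name `rbivt`, the seed, and the chosen settings are hypothetical, and the slides do not say how the authors generated their data.

```r
# Draw n pairs from a bivariate t distribution with df degrees of freedom
# and correlation parameter rho (scale mixture of normals).
rbivt <- function(n, rho, df) {
  z1 <- rnorm(n)
  z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)   # normal pair with correlation rho
  w <- sqrt(rchisq(n, df) / df)                 # one shared scale per pair
  cbind(x = z1 / w, y = z2 / w)
}

set.seed(1)
xy <- rbivt(20000, rho = 0.5, df = 10)
print(cor(xy[, "x"], xy[, "y"]))
```

Smaller values of `df` give heavier tails, which is the direction in which Table 4 shows the coverage of the normal-theory interval deteriorating for r=0.50 and r=0.95.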
CSCAR at the University of Michigan
The Center for Statistical Consultation and Research (CSCAR) provides support and training to University of Michigan researchers in a variety of areas relating to the management, collection, and analysis of data. CSCAR also supports the use of technical software and advanced computing in research.
Find us at: http://www.cscar.research.umich.edu/about/
• Yanming Li, Graduate Student Research Assistant.
• Kerby Shedden, CSCAR Director.
• Brenda W. Gillespie, CSCAR Associate Director.
• John A. Gillespie, Professor of Mathematics and Statistics, University of Michigan Dearborn.