Analysis of Death Causes in the STULONG Data Set
Jan Burian, Jan Rauch
EuroMISE – Cardio
University of Economics Prague
Discovery Challenge 2003
| DEATH CAUSE | PATIENTS | % |
|---|---|---|
| myocardial infarction | 80 | 20.6 |
| coronary heart disease | 33 | 8.5 |
| stroke | 30 | 7.7 |
| other causes | 79 | 20.3 |
| sudden death | 23 | 5.9 |
| unknown | 8 | 2.0 |
| tumorous disease | 114 | 29.3 |
| general atherosclerosis | 22 | 5.7 |
| TOTAL | 389 | 100.0 |

Data matrix ENTRY
General characteristics: Marital status, Transport to a job, Physical activity in a job, Activity after a job, Education, Responsibility, Age, Weight, Height
Examinations: Chest pain, Breathlessness, Cholesterol, Urine, Subscapular, Triceps
Vices: Alcohol, Liquors, Beer 10, Beer 12, Wine, Smoking, Former smoker, Duration of smoking, Tea, Sugar, Coffee
Analytic questions
Are there strong relations concerning death cause?
General characteristics (?) → Death cause (?)
Examinations (?) → Death cause (?)
Vices (?) → Death cause (?)
Combinations (?) → Death cause (?)
Example of relation: founded implication
A: Cholesterol<250;273> & Coffee(3 and more cups)
S: Death cause (tumorous disease)
$A \Rightarrow_{0.63;\,15} S$

| | S | ¬S | total |
|---|---|---|---|
| A | 15 | 9 | 24 |
| ¬A | 99 | 266 | 365 |
| total | 114 | 275 | 389 |

63% of patients satisfying A also satisfy S (15/24)
there are 15 patients satisfying both A and S
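
The check behind this example can be sketched in a few lines of Python. This is only an illustration, assuming the usual reading of founded implication with parameters p and Base: a/(a+b) ≥ p and a ≥ Base, where a, b, c, d are the four cells of the table. The names `FourFoldTable` and `founded_implication` are hypothetical and are not part of 4ft-Miner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FourFoldTable:
    """Four-fold contingency table of antecedent A and succedent S."""
    a: int  # patients satisfying A and S
    b: int  # patients satisfying A but not S
    c: int  # patients satisfying S but not A
    d: int  # patients satisfying neither A nor S


def founded_implication(t: FourFoldTable, p: float, base: int) -> bool:
    """Assumed definition: a / (a + b) >= p and a >= base."""
    return t.a >= base and t.a / (t.a + t.b) >= p


# Cell counts taken from the table on the slide.
table = FourFoldTable(a=15, b=9, c=99, d=266)
confidence = table.a / (table.a + table.b)
print(confidence)  # 0.625
print(founded_implication(table, p=0.6, base=15))  # True
```

The slide's 0.63 appears to be the rounded value of 15/24 = 0.625, so under the assumed definition a strict check against p = 0.63 would fail; the sketch therefore uses p = 0.6.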
Example of relation: above average
A: Age( 65)
S: Death cause (general atherosclerosis)
$A \Rightarrow^{+}_{0.76;\,15} S$
$A \Rightarrow_{0.1;\,15} S$

| | S | ¬S | total |
|---|---|---|---|
| A | 15 | 136 | 151 |
| ¬A | 7 | 231 | 238 |
| total | 22 | 367 | 389 |

relative frequency of S: 22/389 = 0.057
relative frequency of S if A: 15/151 = 0.099
the relative frequency of S if A is 76 per cent higher than the relative frequency of S
there are 15 patients satisfying both A and S
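
The arithmetic above can be reproduced with a short Python sketch. It assumes the common reading of the above-average relation with parameters p and Base: a/(a+b) ≥ (1+p)·(a+c)/n and a ≥ Base, with n = a+b+c+d. The function names are hypothetical, chosen only for this illustration.

```python
def above_average_gain(a: int, b: int, c: int, d: int) -> float:
    """Relative increase of freq(S | A) over freq(S), e.g. 0.5 means 50 % higher."""
    n = a + b + c + d
    freq_s_given_a = a / (a + b)
    freq_s = (a + c) / n
    return freq_s_given_a / freq_s - 1.0


def above_average(a: int, b: int, c: int, d: int, p: float, base: int) -> bool:
    """Assumed definition: gain >= p and a >= base."""
    return a >= base and above_average_gain(a, b, c, d) >= p


# Cell counts from the slide: A = Age condition, S = general atherosclerosis.
gain = above_average_gain(15, 136, 7, 231)
print(f"{gain:.2f}")  # 0.76
print(above_average(15, 136, 7, 231, p=0.75, base=15))  # True
```

The unrounded gain is slightly below 0.76, so the sketch checks against p = 0.75; the slide's +0.76 is the rounded value.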
Example of task
Vices(?) $\Rightarrow^{+}_{0.55;\,15}$ Death cause (?)
For which combinations of vices is the relative frequency of some death cause at least 55 per cent higher than the relative frequency of the same death cause among all patients?
We require at least 15 patients who both satisfy the particular condition and have the particular death cause.
Liquors(?) & Smoking(?) $\Rightarrow^{+}_{0.55;\,15}$ Death cause(?)
Alcohol(?) & Tea(?) $\Rightarrow^{+}_{0.55;\,15}$ Death cause(?)
Beer 12(?) & Wine(?) $\Rightarrow^{+}_{0.55;\,15}$ Death cause(?)
Liquors(?) & Smoking(?) & Coffee(?) & Beer 12(?) $\Rightarrow^{+}_{0.55;\,15}$ Death cause(?)
????? $\Rightarrow^{+}_{0.55;\,15}$ Death cause(?)
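
A task of this shape can be illustrated with a brute-force search in Python. This is a sketch under stated assumptions, not 4ft-Miner's algorithm: it enumerates conjunctions of single attribute values, uses the above-average condition a/(a+b) ≥ (1+p)·(a+c)/n with a ≥ Base, and runs on ten invented rows (the real STULONG data are not reproduced here). The function `mine_above_average` and the toy rows are hypothetical.

```python
from itertools import combinations, product


def mine_above_average(rows, antecedent_attrs, target, p, base, max_len=2):
    """Return (condition, target value, a) triples passing the assumed test."""
    n = len(rows)
    target_values = sorted({r[target] for r in rows})
    results = []
    for k in range(1, max_len + 1):
        for attrs in combinations(antecedent_attrs, k):
            value_sets = [sorted({r[a] for r in rows}) for a in attrs]
            for values in product(*value_sets):
                cond = dict(zip(attrs, values))
                matching = [r for r in rows
                            if all(r[a] == v for a, v in cond.items())]
                if not matching:
                    continue
                for s in target_values:
                    a_count = sum(1 for r in matching if r[target] == s)
                    s_count = sum(1 for r in rows if r[target] == s)
                    if a_count < base:
                        continue  # too few patients satisfy both sides
                    if a_count / len(matching) >= (1 + p) * s_count / n:
                        results.append((cond, s, a_count))
    return results


def row(beer12, wine, cause):
    return {"Beer 12": beer12, "Wine": wine, "Death cause": cause}


# Invented toy data, for illustration only.
rows = (
    [row("yes", "yes", "tumorous disease")] * 3
    + [row("yes", "yes", "stroke")]
    + [row("yes", "no", "tumorous disease")]
    + [row("no", "no", "stroke")] * 5
)

found = mine_above_average(rows, ["Beer 12", "Wine"], "Death cause",
                           p=0.55, base=3)
for cond, cause, support in found:
    print(cond, "->", cause, "with", support, "patients")
```

Base is lowered to 3 here only because the toy data set has ten rows; the slide's task uses Base 15 on 389 patients.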
4ft-Miner application
Vices(?) $\Rightarrow^{+}_{0.55;\,15}$ Death cause (?)
[Screenshot labels: Vices(?) = Antecedent, +0.75;15, Death cause(?)]
Dealing with attributes
An example – Age
Predefined intervals of length 10: Age<40,50), Age<50,60), …, Age<70,80)
Predefined intervals of length 5: Age<40,45), Age<45,50), …, Age<70,75)
Sliding window of length 10
Sliding window of length 5
Sliding window of length 2
Sliding window length 5
44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, ....., 67, 68, 69, 70, 71, 72, 73, 74
[Figure: the same row of ages is repeated on the slide, once per window position]
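
The two ways of coarsening Age can be sketched in Python as follows. The sketch assumes that predefined intervals are left-closed and right-open, as the notation Age<40,50) suggests, and that a sliding window of length k is every run of k consecutive values. The helper names are hypothetical, and the exact behaviour of 4ft-Miner may differ.

```python
def predefined_intervals(low: int, high: int, length: int) -> list[tuple[int, int]]:
    """Left-closed, right-open intervals <low, low+length), ... below high."""
    return [(start, start + length) for start in range(low, high, length)]


def sliding_windows(values: list[int], length: int) -> list[list[int]]:
    """Every run of `length` consecutive values, shifted by one each time."""
    return [values[i:i + length] for i in range(len(values) - length + 1)]


ages = list(range(44, 75))  # 44, 45, ..., 74 as on the slide

print(predefined_intervals(40, 80, 10))  # [(40, 50), (50, 60), (60, 70), (70, 80)]
print(predefined_intervals(40, 75, 5)[-1])  # (70, 75)

windows = sliding_windows(ages, 5)
print(windows[0])   # [44, 45, 46, 47, 48]
print(windows[-1])  # [70, 71, 72, 73, 74]
print(len(windows))  # 27
```

Sliding windows produce many more candidate conditions than predefined intervals (27 versus 7 for length 5 here), which is the trade-off between finer tuning and a larger search space.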
Dealing with attributes
Another example – Marital status
Marital status(divorced) – 39 patients (10.0 %)
Marital status(single) – 28 patients (7.2 %)
[Pie chart of Marital status shares: 81.5 %, 10.0 %, 7.2 %, 1.3 %]
Dealing with attributes
Some further examples
Predefined intervals, sliding windows: Cholesterol, Subscapular, Height, Weight, …
Particular values: Activity after job, Physical activity in a job, Education, Transport, Responsibility, …
4ft-Miner result example
Beer 12(yes) & Wine(yes) $\Rightarrow^{+}_{0.55;\,15}$ Death cause (tumorous disease)
Tasks: Antecedent → Death cause (?)

| Antecedent | Parameters | Rules | Verifications |
|---|---|---|---|
| General characteristics (9 attributes) | 0.5;15 | 6 | 70 422 |
| | +0.75;15 | 3 | 58 685 |
| Examinations (6 attributes) | 0.5;15 | 1 | 5 754 |
| | +0.5;15 | 5 | 16 836 |
| Vices (5 attributes) | 0.5;15 | 0 | 22 755 |
| | +0.55;15 | 9 | 20 610 |
| Combinations (1 general + 1 other) | 0.5;15 | 11 | 186 690 |
| | +0.75;15 | 22 | 294 288 |

Solution time in all cases ≤ 8 s on an Intel Pentium at 3 GHz with 512 MB RAM
Conclusions
Only 389 patients with death code
Some potentially interesting rules
Fast work with 4ft-Miner
Possibility of tuning work with attributes
predefined intervals,
sliding windows
…